* thermal_zone trip_point_0_temp 200°C
@ 2012-06-01 19:31 Mark B
  2012-06-02  8:34 ` Clemens Ladisch
  0 siblings, 1 reply; 9+ messages in thread
From: Mark B @ 2012-06-01 19:31 UTC
  To: linux-kernel

Hi,

Laptop overheating with kernel 3.3.7-1 on Fedora 17, bizarre temperature
limit readings; possibly the k10temp module?

My Acer Aspire 5552-7260 (AMD Phenom II N970 CPU) is giving me very
bizarre temperature limit readings. The main reason I noticed is that
it heats up to 70°C without much of a load, and sits at 63°C at startup
under virtually no load. It works well in Windows, which shows 48°C where
Linux shows 63°C, so my gradual conclusion is that some kernel-level
code needs changing?

I can't use fancontrol/pwmconfig because apparently there are no
PWM-capable fans (as far as my limited knowledge lets me double-check;
I've looked in all the relevant /sys nodes). lm_sensors, however, reports
the 200°C temperature limit, as does acpiclient. I notice that there are
some kernel patches affecting this area, although it is unclear to me how
far that makes it the kernel's responsibility.

I've tried all of the acpi_osi=Linux, acpi_osi="Linux",
acpi_osi=\\\"Linux\\\", acpi_enforce_resources=lax and
acpi.power_nocheck=1 alternatives without any sign of change. I recalled
from my eeepc that this was the way of preventing the newer,
not-fully-functional ACPI kernel module from loading, forcing legacy
ACPI/PWM support. In fact the overheating eeepc is one more reason I'm
writing to the kernel list now: I'm seeing a kind of pattern of computers
overheating in Linux when in principle all they should need is faster fan
speeds or lower soft limits.

Looking carefully at the module list, I see k10temp as the most
obvious sensor module, though I'm unsure how relevant that is.

My understanding is that the kernel or the BIOS manages the fan
speed in 'automatic' mode, as distinguished from fancontrol-type
'manual' mode; given that Windows manages it properly, I'd have to
conclude it wouldn't normally be the BIOS's fault?

$ cat /proc/version
Linux version 3.3.7-1.fc17.x86_64
(mockbuild@x86-11.phx2.fedoraproject.org) (gcc version 4.7.0 20120507
(Red Hat 4.7.0-5) (GCC) ) #1 SMP Mon May 21 22:32:19 UTC 2012

$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 5
model name : AMD Phenom(tm) II N970 Quad-Core Processor
stepping : 3
microcode : 0x10000c8
cpu MHz : 800.000
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl
nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm
extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit
wdt nodeid_msr npt lbrv svm_lock nrip_save
bogomips : 4388.95
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
[etc]

$ grep -r . /sys/class/thermal/thermal_zone*
/sys/class/thermal/thermal_zone0/power/async:disabled
/sys/class/thermal/thermal_zone0/power/runtime_status:unsupported
/sys/class/thermal/thermal_zone0/power/runtime_usage:0
/sys/class/thermal/thermal_zone0/power/runtime_active_kids:0
/sys/class/thermal/thermal_zone0/power/runtime_enabled:disabled
/sys/class/thermal/thermal_zone0/power/control:auto
/sys/class/thermal/thermal_zone0/power/runtime_suspended_time:0
/sys/class/thermal/thermal_zone0/power/runtime_active_time:0
grep: /sys/class/thermal/thermal_zone0/power/autosuspend_delay_ms:
Input/output error
/sys/class/thermal/thermal_zone0/type:acpitz
/sys/class/thermal/thermal_zone0/temp:62000
/sys/class/thermal/thermal_zone0/mode:enabled
/sys/class/thermal/thermal_zone0/trip_point_0_type:critical
/sys/class/thermal/thermal_zone0/trip_point_0_temp:200000
/sys/class/thermal/thermal_zone0/trip_point_1_type:passive
/sys/class/thermal/thermal_zone0/trip_point_1_temp:90000
/sys/class/thermal/thermal_zone0/cdev0_trip_point:1
/sys/class/thermal/thermal_zone0/cdev1_trip_point:1
/sys/class/thermal/thermal_zone0/cdev2_trip_point:1
/sys/class/thermal/thermal_zone0/cdev3_trip_point:1
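
The `temp` and `trip_point_*_temp` values in this listing are in millidegrees Celsius, so 200000 is the 200°C of the subject line. A minimal sketch, assuming output in the `path:value` form that `grep -r` prints above, of how the trip points could be pulled out and converted; the name `parse_trip_points` is hypothetical and not part of any existing tool:

```python
import re

# Sample lines copied from the grep output above (path:value).
SAMPLE = """\
/sys/class/thermal/thermal_zone0/temp:62000
/sys/class/thermal/thermal_zone0/trip_point_0_type:critical
/sys/class/thermal/thermal_zone0/trip_point_0_temp:200000
/sys/class/thermal/thermal_zone0/trip_point_1_type:passive
/sys/class/thermal/thermal_zone0/trip_point_1_temp:90000
"""

# Matches only the trip_point_N_type / trip_point_N_temp attributes.
TRIP_RE = re.compile(r"/trip_point_(\d+)_(type|temp):(.*)$")


def parse_trip_points(text):
    """Return {trip_index: {"type": str, "temp_c": float}} from grep -r output."""
    trips = {}
    for line in text.splitlines():
        match = TRIP_RE.search(line)
        if not match:
            continue  # not a trip point attribute
        index = int(match.group(1))
        attr = match.group(2)
        value = match.group(3).strip()
        entry = trips.setdefault(index, {})
        if attr == "type":
            entry["type"] = value
        else:
            # sysfs reports millidegrees Celsius; convert to degrees.
            entry["temp_c"] = int(value) / 1000.0
    return trips


if __name__ == "__main__":
    for index, trip in sorted(parse_trip_points(SAMPLE).items()):
        print(index, trip["type"], trip["temp_c"])
```

Keying on the trip index keeps each type paired with its temperature even if the attribute lines arrive in a different order.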


$ sh ~/ver_linux
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux MYCOMPUTER 3.3.7-1.fc17.x86_64 #1 SMP Mon May 21 22:32:19 UTC
2012 x86_64 x86_64 x86_64 GNU/Linux

Gnu C                  4.7.0
Gnu make               3.82
binutils               2.22.52.0.1
util-linux             2.21.2
mount                  debug
module-init-tools      7
e2fsprogs              1.42
xfsprogs               3.1.8
pcmciautils            018
PPP                    2.4.5
Linux C Library        2.15
Dynamic linker (ldd)   2.15
Procps                 3.2.8
Net-tools              1.60
Kbd                    1.15.3wip
Sh-utils               8.15
wireless-tools         29
Modules Loaded         ip6table_filter ip6_tables ip6t_REJECT
nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_netbios_ns
nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
nf_conntrack fuse bnep bluetooth uvcvideo videobuf2_vmalloc
videobuf2_memops videobuf2_core videodev media snd_hda_codec_hdmi arc4
broadcom snd_hda_codec_realtek ath9k ath9k_common ath9k_hw ath
mac80211 cfg80211 snd_hda_intel snd_hda_codec tg3 acer_wmi
sparse_keymap snd_hwdep snd_pcm rfkill snd_page_alloc shpchp
sp5100_tco edac_core edac_mce_amd snd_timer snd k10temp microcode
soundcore i2c_piix4 uinput ums_realtek usb_storage video wmi radeon
i2c_algo_bit drm_kms_helper ttm drm i2c_core

I'll be happy to provide the results from

dmesg | grep -i acpi
lspci -vvv

etc should you need it

please cc me directly when responding

thanks for reading

Best regards

Mark


* Re: thermal_zone trip_point_0_temp 200°C
  2012-06-01 19:31 thermal_zone trip_point_0_temp 200°C Mark B
@ 2012-06-02  8:34 ` Clemens Ladisch
  2012-06-03 18:22   ` Mark
  2012-06-04  2:56   ` Zhang Rui
  0 siblings, 2 replies; 9+ messages in thread
From: Clemens Ladisch @ 2012-06-02  8:34 UTC
  To: Mark B, Zhang Rui; +Cc: linux-kernel

Mark B wrote:
> My Acer Aspire 5552-7260, AMD phenom II N970 cpu, is giving me very
> bizarre temp limit readings; the main reason I'm noticing it is that
> it is heating up to 70° without much of a load, 63° at startup under
> virtually no load; works well in windows, 48° as equivalent to the
> linux 63°, so my gradual conclusion is that it's some kernel-level
> code that needs changing?
>
> Can't fancontrol/pwmconfig as apparently — from my limited knowledge
> of how to double-check, I've tried looking in all the relevant /sys
> nodes — there are no pwm-capable fans; lm_sensors, however, gives the
> 200°C temp limit as does acpiclient; I'm noticing that there are some
> kernel patches affecting the area, although it is unclear to me how
> far that would even make it the kernel's responsibility; I've tried
> all the acpi_osi=Linux, acpi_osi="Linux", acpi_osi=\\\"Linux\\\",
> acpi_enforce_resources=lax, acpi.power_nocheck=1 alternatives without
> sign of change; as I recalled from my eeepc that that was the way of
> preventing the newer, non-fully-functional, acpi kernel module, from
> loading, forcing legacy acpi/pwm support; in fact the overheating
> eeepc is one more reason I'm writing to the kernel list now, as I'm
> seeing a kind of pattern of computers overheating in linux when in
> principle all that it should need would be faster fan speeds / lower
> soft limits;
>
> Looking carefully at the modules list, I see k10temp as the most
> obvious sensor module, unsure how relevant that is

The k10temp module provides nothing but a sensor for monitoring
applications.  Your thermal zones are managed by the acpitz driver,
which is compiled into the kernel and uses its own sensor.

> my understanding of it is that the Kernel/the Bios manages the fan
> speed in 'automatic' mode, as distinguished from fancontrol-type
> 'manual' mode; given that windows manages it properly, I'd have to
> conclude it wouldn't normally be the Bios's fault?

The ACPI tables are provided by the BIOS.

For documentation about the thermal zone files, see
<Documentation/thermal/sysfs-api.txt>.

> $ cat /proc/version
> Linux version 3.3.7-1.fc17.x86_64
> (mockbuild@x86-11.phx2.fedoraproject.org) (gcc version 4.7.0 20120507
> (Red Hat 4.7.0-5) (GCC) ) #1 SMP Mon May 21 22:32:19 UTC 2012
>
> $ grep -r . /sys/class/thermal/thermal_zone*
> /sys/class/thermal/thermal_zone0/power/async:disabled
> /sys/class/thermal/thermal_zone0/power/runtime_status:unsupported
> /sys/class/thermal/thermal_zone0/power/runtime_usage:0
> /sys/class/thermal/thermal_zone0/power/runtime_active_kids:0
> /sys/class/thermal/thermal_zone0/power/runtime_enabled:disabled
> /sys/class/thermal/thermal_zone0/power/control:auto
> /sys/class/thermal/thermal_zone0/power/runtime_suspended_time:0
> /sys/class/thermal/thermal_zone0/power/runtime_active_time:0
> grep: /sys/class/thermal/thermal_zone0/power/autosuspend_delay_ms:
> Input/output error
> /sys/class/thermal/thermal_zone0/type:acpitz
> /sys/class/thermal/thermal_zone0/temp:62000
> /sys/class/thermal/thermal_zone0/mode:enabled
> /sys/class/thermal/thermal_zone0/trip_point_0_type:critical
> /sys/class/thermal/thermal_zone0/trip_point_0_temp:200000
> /sys/class/thermal/thermal_zone0/trip_point_1_type:passive
> /sys/class/thermal/thermal_zone0/trip_point_1_temp:90000
> /sys/class/thermal/thermal_zone0/cdev0_trip_point:1
> /sys/class/thermal/thermal_zone0/cdev1_trip_point:1
> /sys/class/thermal/thermal_zone0/cdev2_trip_point:1
> /sys/class/thermal/thermal_zone0/cdev3_trip_point:1


Regards,
Clemens


* Re: thermal_zone trip_point_0_temp 200°C
  2012-06-02  8:34 ` Clemens Ladisch
@ 2012-06-03 18:22   ` Mark
  2012-06-04  3:21     ` Zhang Rui
  2012-06-04  2:56   ` Zhang Rui
  1 sibling, 1 reply; 9+ messages in thread
From: Mark @ 2012-06-03 18:22 UTC
  To: Clemens Ladisch; +Cc: Zhang Rui, linux-kernel

Hi Clemens, thanks for writing;

On 06/02/2012 04:34 AM, Clemens Ladisch wrote:
> The k10temp module provides nothing but a sensor for monitoring
> applications.  Your thermal zones are managed by the acpitz driver,
> which is compiled into the kernel and uses its own sensor.
Thanks for confirming that. I've noticed that k10temp, at least as ksensors
interprets it, seems to report a more sensible limit of 80°C. As you say,
it's a reporting module that won't affect actual management, so I suppose it's
neither here nor there, unless it's a sign that the kernel code is faulty while
the k10temp code is correct, inasmuch as it interprets the readings in
accordance with the BIOS's directions.
>> my understanding of it is that the Kernel/the Bios manages the fan
>> speed in 'automatic' mode, as distinguished from fancontrol-type
>> 'manual' mode; given that windows manages it properly, I'd have to
>> conclude it wouldn't normally be the Bios's fault?
> The ACPI tables are provided by the BIOS.
>
> For documentation about the thermal zone files, see
> <Documentation/thermal/sysfs-api.txt>.
Yet the documentation for 3.3.7 says that 'mode', for instance, should be one of
[kernel, user], while the node is reporting 'enabled'.
>> $ grep -r . /sys/class/thermal/thermal_zone*
>> /sys/class/thermal/thermal_zone0/power/async:disabled
>> /sys/class/thermal/thermal_zone0/power/runtime_status:unsupported
>> /sys/class/thermal/thermal_zone0/power/runtime_usage:0
>> /sys/class/thermal/thermal_zone0/power/runtime_active_kids:0
>> /sys/class/thermal/thermal_zone0/power/runtime_enabled:disabled
>> /sys/class/thermal/thermal_zone0/power/control:auto
>> /sys/class/thermal/thermal_zone0/power/runtime_suspended_time:0
>> /sys/class/thermal/thermal_zone0/power/runtime_active_time:0
>> grep: /sys/class/thermal/thermal_zone0/power/autosuspend_delay_ms:
>> Input/output error
>> /sys/class/thermal/thermal_zone0/type:acpitz
>> /sys/class/thermal/thermal_zone0/temp:62000
>> /sys/class/thermal/thermal_zone0/mode:enabled
>> /sys/class/thermal/thermal_zone0/trip_point_0_type:critical
>> /sys/class/thermal/thermal_zone0/trip_point_0_temp:200000
>> /sys/class/thermal/thermal_zone0/trip_point_1_type:passive
>> /sys/class/thermal/thermal_zone0/trip_point_1_temp:90000
>> /sys/class/thermal/thermal_zone0/cdev0_trip_point:1
>> /sys/class/thermal/thermal_zone0/cdev1_trip_point:1
>> /sys/class/thermal/thermal_zone0/cdev2_trip_point:1
>> /sys/class/thermal/thermal_zone0/cdev3_trip_point:1
Given that the 90°C trip point is 'passive', it seems natural that the cdev[0-3]
linked to it are internal CPU cooling mechanisms rather than fans:

$ grep -r . /sys/class/thermal/cool*
/sys/class/thermal/cooling_device0/power/async:disabled
/sys/class/thermal/cooling_device0/power/runtime_status:unsupported
/sys/class/thermal/cooling_device0/power/runtime_usage:0
/sys/class/thermal/cooling_device0/power/runtime_active_kids:0
/sys/class/thermal/cooling_device0/power/runtime_enabled:disabled
/sys/class/thermal/cooling_device0/power/control:auto
/sys/class/thermal/cooling_device0/power/runtime_suspended_time:0
/sys/class/thermal/cooling_device0/power/runtime_active_time:0
grep: /sys/class/thermal/cooling_device0/power/autosuspend_delay_ms: Input/output
error
/sys/class/thermal/cooling_device0/type:Processor
/sys/class/thermal/cooling_device0/max_state:10
/sys/class/thermal/cooling_device0/cur_state:0
/sys/class/thermal/cooling_device1/power/async:disabled
/sys/class/thermal/cooling_device1/power/runtime_status:unsupported
/sys/class/thermal/cooling_device1/power/runtime_usage:0
/sys/class/thermal/cooling_device1/power/runtime_active_kids:0
/sys/class/thermal/cooling_device1/power/runtime_enabled:disabled
/sys/class/thermal/cooling_device1/power/control:auto
/sys/class/thermal/cooling_device1/power/runtime_suspended_time:0
/sys/class/thermal/cooling_device1/power/runtime_active_time:0
grep: /sys/class/thermal/cooling_device1/power/autosuspend_delay_ms: Input/output
error
/sys/class/thermal/cooling_device1/type:Processor
/sys/class/thermal/cooling_device1/max_state:3
/sys/class/thermal/cooling_device1/cur_state:0
/sys/class/thermal/cooling_device2/power/async:disabled
/sys/class/thermal/cooling_device2/power/runtime_status:unsupported
/sys/class/thermal/cooling_device2/power/runtime_usage:0
/sys/class/thermal/cooling_device2/power/runtime_active_kids:0
/sys/class/thermal/cooling_device2/power/runtime_enabled:disabled
/sys/class/thermal/cooling_device2/power/control:auto
/sys/class/thermal/cooling_device2/power/runtime_suspended_time:0
/sys/class/thermal/cooling_device2/power/runtime_active_time:0
grep: /sys/class/thermal/cooling_device2/power/autosuspend_delay_ms: Input/output
error
/sys/class/thermal/cooling_device2/type:Processor
/sys/class/thermal/cooling_device2/max_state:3
/sys/class/thermal/cooling_device2/cur_state:0
/sys/class/thermal/cooling_device3/power/async:disabled
/sys/class/thermal/cooling_device3/power/runtime_status:unsupported
/sys/class/thermal/cooling_device3/power/runtime_usage:0
/sys/class/thermal/cooling_device3/power/runtime_active_kids:0
/sys/class/thermal/cooling_device3/power/runtime_enabled:disabled
/sys/class/thermal/cooling_device3/power/control:auto
/sys/class/thermal/cooling_device3/power/runtime_suspended_time:0
/sys/class/thermal/cooling_device3/power/runtime_active_time:0
grep: /sys/class/thermal/cooling_device3/power/autosuspend_delay_ms: Input/output
error
/sys/class/thermal/cooling_device3/type:Processor
/sys/class/thermal/cooling_device3/max_state:3
/sys/class/thermal/cooling_device3/cur_state:0
/sys/class/thermal/cooling_device4/power/async:disabled
/sys/class/thermal/cooling_device4/power/runtime_status:unsupported
/sys/class/thermal/cooling_device4/power/runtime_usage:0
/sys/class/thermal/cooling_device4/power/runtime_active_kids:0
/sys/class/thermal/cooling_device4/power/runtime_enabled:disabled
/sys/class/thermal/cooling_device4/power/control:auto
/sys/class/thermal/cooling_device4/power/runtime_suspended_time:0
/sys/class/thermal/cooling_device4/power/runtime_active_time:0
grep: /sys/class/thermal/cooling_device4/power/autosuspend_delay_ms: Input/output
error
/sys/class/thermal/cooling_device4/type:LCD
/sys/class/thermal/cooling_device4/max_state:9
/sys/class/thermal/cooling_device4/cur_state:0

It's surprising that the fan seems to work at all, given that there is no
kernel-registered cooling fan; yet it definitely goes faster the hotter the CPU
is. The trouble is that by then it is already too hot.
> Regards,
> Clemens
Thanks for helping me make some progress. It seems I'll need to look more
carefully at the BIOS too. From my understanding, Acer has non-PWM ways of
setting hardware registers to manage fan speed; that'll be one angle to look into,
as long as they'll provide some documentation.

I'm hoping to avoid compiling custom kernels, given that I've already emerged
bruised a few times from the myriad of compile options :)

Best regards

Mark


* Re: thermal_zone trip_point_0_temp 200°C
  2012-06-02  8:34 ` Clemens Ladisch
  2012-06-03 18:22   ` Mark
@ 2012-06-04  2:56   ` Zhang Rui
  2012-06-04 14:41     ` Mark
  1 sibling, 1 reply; 9+ messages in thread
From: Zhang Rui @ 2012-06-04  2:56 UTC
  To: Clemens Ladisch; +Cc: Mark B, linux-kernel

On Sat, 2012-06-02 at 10:34 +0200, Clemens Ladisch wrote:
> Mark B wrote:
> > My Acer Aspire 5552-7260, AMD phenom II N970 cpu, is giving me very
> > bizarre temp limit readings; the main reason I'm noticing it is that
> > it is heating up to 70° without much of a load, 63° at startup under
> > virtually no load; works well in windows, 48° as equivalent to the
> > linux 63°, so my gradual conclusion is that it's some kernel-level
> > code that needs changing?
> >
> > Can't fancontrol/pwmconfig as apparently — from my limited knowledge
> > of how to double-check, I've tried looking in all the relevant /sys
> > nodes — there are no pwm-capable fans; lm_sensors, however, gives the
> > 200°C temp limit as does acpiclient; I'm noticing that there are some
> > kernel patches affecting the area, although it is unclear to me how
> > far that would even make it the kernel's responsibility; I've tried
> > all the acpi_osi=Linux, acpi_osi="Linux", acpi_osi=\\\"Linux\\\",
> > acpi_enforce_resources=lax, acpi.power_nocheck=1 alternatives without
> > sign of change; as I recalled from my eeepc that that was the way of
> > preventing the newer, non-fully-functional, acpi kernel module, from
> > loading, forcing legacy acpi/pwm support; in fact the overheating
> > eeepc is one more reason I'm writing to the kernel list now, as I'm
> > seeing a kind of pattern of computers overheating in linux when in
> > principle all that it should need would be faster fan speeds / lower
> > soft limits;
> >
> > Looking carefully at the modules list, I see k10temp as the most
> > obvious sensor module, unsure how relevant that is
> 
> The k10temp module provides nothing but a sensor for monitoring
> applications.  Your thermal zones are managed by the acpitz driver,
> which is compiled into the kernel and uses its own sensor.
> 
> > my understanding of it is that the Kernel/the Bios manages the fan
> > speed in 'automatic' mode, as distinguished from fancontrol-type
> > 'manual' mode; given that windows manages it properly, I'd have to
> > conclude it wouldn't normally be the Bios's fault?
> 
> The ACPI tables are provided by the BIOS.
> 
Yes.
Usually, the critical trip point value is a hard-coded number provided
by the BIOS.
As for fan control, it seems that there is no ACPI fan on this machine,
so the fan may be controlled either by firmware or by some
platform-specific driver.
To double-check, it would be great if you could use the tools at
http://www.lesswatts.org/projects/acpi/utilities.php
to get the acpidump output of this machine.

thanks,
rui

> For documentation about the thermal zone files, see
> <Documentation/thermal/sysfs-api.txt>.
> 
> > $ cat /proc/version
> > Linux version 3.3.7-1.fc17.x86_64
> > (mockbuild@x86-11.phx2.fedoraproject.org) (gcc version 4.7.0 20120507
> > (Red Hat 4.7.0-5) (GCC) ) #1 SMP Mon May 21 22:32:19 UTC 2012
> >
> > $ grep -r . /sys/class/thermal/thermal_zone*
> > /sys/class/thermal/thermal_zone0/power/async:disabled
> > /sys/class/thermal/thermal_zone0/power/runtime_status:unsupported
> > /sys/class/thermal/thermal_zone0/power/runtime_usage:0
> > /sys/class/thermal/thermal_zone0/power/runtime_active_kids:0
> > /sys/class/thermal/thermal_zone0/power/runtime_enabled:disabled
> > /sys/class/thermal/thermal_zone0/power/control:auto
> > /sys/class/thermal/thermal_zone0/power/runtime_suspended_time:0
> > /sys/class/thermal/thermal_zone0/power/runtime_active_time:0
> > grep: /sys/class/thermal/thermal_zone0/power/autosuspend_delay_ms:
> > Input/output error
> > /sys/class/thermal/thermal_zone0/type:acpitz
> > /sys/class/thermal/thermal_zone0/temp:62000
> > /sys/class/thermal/thermal_zone0/mode:enabled
> > /sys/class/thermal/thermal_zone0/trip_point_0_type:critical
> > /sys/class/thermal/thermal_zone0/trip_point_0_temp:200000
> > /sys/class/thermal/thermal_zone0/trip_point_1_type:passive
> > /sys/class/thermal/thermal_zone0/trip_point_1_temp:90000
> > /sys/class/thermal/thermal_zone0/cdev0_trip_point:1
> > /sys/class/thermal/thermal_zone0/cdev1_trip_point:1
> > /sys/class/thermal/thermal_zone0/cdev2_trip_point:1
> > /sys/class/thermal/thermal_zone0/cdev3_trip_point:1
> 
> 
> Regards,
> Clemens




* Re: thermal_zone trip_point_0_temp 200°C
  2012-06-03 18:22   ` Mark
@ 2012-06-04  3:21     ` Zhang Rui
  0 siblings, 0 replies; 9+ messages in thread
From: Zhang Rui @ 2012-06-04  3:21 UTC
  To: Mark; +Cc: Clemens Ladisch, linux-kernel, Matthew Garrett

On Sun, 2012-06-03 at 14:22 -0400, Mark wrote:
> Hi Clemens, thanks for writing;
> 
> On 06/02/2012 04:34 AM, Clemens Ladisch wrote:
> > The k10temp module provides nothing but a sensor for monitoring
> > applications.  Your thermal zones are managed by the acpitz driver,
> > which is compiled into the kernel and uses its own sensor.
> thanks for confirming that, I've noticed that the k10temp, at least as ksensors
> interpret it, seems to report a more sensible limit of 80°, although as you say
> that's a reporting module that won't affect actual management, I suppose it's
> neither here nor there, unless that's a sign that the kernel code is faulty while
> the k10temp code is correct — inasmuch as it interprets the readings in
> accordance with the BIOS's directions
> >> my understanding of it is that the Kernel/the Bios manages the fan
> >> speed in 'automatic' mode, as distinguished from fancontrol-type
> >> 'manual' mode; given that windows manages it properly, I'd have to
> >> conclude it wouldn't normally be the Bios's fault?
> > The ACPI tables are provided by the BIOS.
> >
> > For documentation about the thermal zone files, see
> > <Documentation/thermal/sysfs-api.txt>.
> yet the documentation for 3.3.7 says 'mode' for instance should be \in [kernel,
> user], while the node is reporting 'enabled'

Yes, this is a problem introduced by commit 6503e5df.
We should update Documentation/thermal/sysfs-api.txt as well;
patch attached.

Subject: Fix Documentation for thermal code changes

Fix the documentation about predefined values of "mode" file.

With commit 6503e5df, we use "enabled/disabled" as the predefined values
for enabling/disabling kernel thermal management, instead of "kernel/user".

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
 Documentation/thermal/sysfs-api.txt |   20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

Index: rtd3/Documentation/thermal/sysfs-api.txt
===================================================================
--- rtd3.orig/Documentation/thermal/sysfs-api.txt
+++ rtd3/Documentation/thermal/sysfs-api.txt
@@ -45,11 +45,11 @@ temperature) and throttle appropriate de
 	.bind: bind the thermal zone device with a thermal cooling device.
 	.unbind: unbind the thermal zone device with a thermal cooling device.
 	.get_temp: get the current temperature of the thermal zone.
-	.get_mode: get the current mode (user/kernel) of the thermal zone.
-	    - "kernel" means thermal management is done in kernel.
-	    - "user" will prevent kernel thermal driver actions upon trip points
+	.get_mode: get the current mode (enabled/disabled) of the thermal zone.
+	    - "enabled" means the kernel thermal management is enabled.
+	    - "disabled" will prevent kernel thermal driver actions upon trip points
 	      so that user applications can take charge of thermal management.
-	.set_mode: set the mode (user/kernel) of the thermal zone.
+	.set_mode: set the mode (enabled/disabled) of the thermal zone.
 	.get_trip_type: get the type of certain trip point.
 	.get_trip_temp: get the temperature above which the certain trip point
 			will be fired.
@@ -167,14 +167,14 @@ temp
 	RO, Required
 
 mode
-	One of the predefined values in [kernel, user].
+	One of the predefined values in [enabled, disabled].
 	This file gives information about the algorithm that is currently
 	managing the thermal zone. It can be either default kernel based
 	algorithm or user space application.
-	kernel	= Thermal management in kernel thermal zone driver.
-	user	= Preventing kernel thermal zone driver actions upon
-		  trip points so that user application can take full
-		  charge of the thermal management.
+	enabled		= enable Kernel Thermal management.
+	disabled	= Preventing kernel thermal zone driver actions upon
+			  trip points so that user application can take full
+			  charge of the thermal management.
 	RW, Optional
 
 trip_point_[0-*]_temp
@@ -248,7 +248,7 @@ method, the sys I/F structure will be bu
 |thermal_zone1:
     |---type:			acpitz
     |---temp:			37000
-    |---mode:			kernel
+    |---mode:			enabled
     |---trip_point_0_temp:	100000
     |---trip_point_0_type:	critical
     |---trip_point_1_temp:	80000
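
For anyone scripting against the `mode` file across kernels from both sides of commit 6503e5df, one option is to accept either vocabulary. This is only a sketch: the mapping is taken from the documentation text in the patch above (old `kernel`/`user`, new `enabled`/`disabled`), and the function name `kernel_manages_zone` is hypothetical:

```python
# Map both the old (kernel/user) and new (enabled/disabled) vocabulary of
# the thermal zone "mode" file onto one boolean: is kernel thermal
# management active?
_MODE_TO_KERNEL_MANAGED = {
    "enabled": True,
    "kernel": True,    # older spelling, per the pre-patch documentation
    "disabled": False,
    "user": False,     # older spelling, per the pre-patch documentation
}


def kernel_manages_zone(mode_text):
    """Interpret the contents of a thermal zone 'mode' file."""
    mode = mode_text.strip().lower()
    try:
        return _MODE_TO_KERNEL_MANAGED[mode]
    except KeyError:
        # Fail loudly on anything unexpected rather than guessing.
        raise ValueError(f"unrecognised thermal zone mode: {mode_text!r}") from None


if __name__ == "__main__":
    # The value reported for thermal_zone0 in this thread.
    print(kernel_manages_zone("enabled\n"))
```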


> >> $ grep -r . /sys/class/thermal/thermal_zone*
> >> /sys/class/thermal/thermal_zone0/power/async:disabled
> >> /sys/class/thermal/thermal_zone0/power/runtime_status:unsupported
> >> /sys/class/thermal/thermal_zone0/power/runtime_usage:0
> >> /sys/class/thermal/thermal_zone0/power/runtime_active_kids:0
> >> /sys/class/thermal/thermal_zone0/power/runtime_enabled:disabled
> >> /sys/class/thermal/thermal_zone0/power/control:auto
> >> /sys/class/thermal/thermal_zone0/power/runtime_suspended_time:0
> >> /sys/class/thermal/thermal_zone0/power/runtime_active_time:0
> >> grep: /sys/class/thermal/thermal_zone0/power/autosuspend_delay_ms:
> >> Input/output error
> >> /sys/class/thermal/thermal_zone0/type:acpitz
> >> /sys/class/thermal/thermal_zone0/temp:62000
> >> /sys/class/thermal/thermal_zone0/mode:enabled
> >> /sys/class/thermal/thermal_zone0/trip_point_0_type:critical
> >> /sys/class/thermal/thermal_zone0/trip_point_0_temp:200000
> >> /sys/class/thermal/thermal_zone0/trip_point_1_type:passive
> >> /sys/class/thermal/thermal_zone0/trip_point_1_temp:90000
> >> /sys/class/thermal/thermal_zone0/cdev0_trip_point:1
> >> /sys/class/thermal/thermal_zone0/cdev1_trip_point:1
> >> /sys/class/thermal/thermal_zone0/cdev2_trip_point:1
> >> /sys/class/thermal/thermal_zone0/cdev3_trip_point:1
> given that the 90° is a 'passive' trip_point it seems natural that the cdev[0-3]
> linked to it, are apparently internal cpu cooling mechanisms rather than fans
> 
> $ grep -r . /sys/class/thermal/cool*
> /sys/class/thermal/cooling_device0/power/async:disabled
> /sys/class/thermal/cooling_device0/power/runtime_status:unsupported
> /sys/class/thermal/cooling_device0/power/runtime_usage:0
> /sys/class/thermal/cooling_device0/power/runtime_active_kids:0
> /sys/class/thermal/cooling_device0/power/runtime_enabled:disabled
> /sys/class/thermal/cooling_device0/power/control:auto
> /sys/class/thermal/cooling_device0/power/runtime_suspended_time:0
> /sys/class/thermal/cooling_device0/power/runtime_active_time:0
> grep: /sys/class/thermal/cooling_device0/power/autosuspend_delay_ms: Input/output
> error
> /sys/class/thermal/cooling_device0/type:Processor
> /sys/class/thermal/cooling_device0/max_state:10
> /sys/class/thermal/cooling_device0/cur_state:0
> /sys/class/thermal/cooling_device1/power/async:disabled
> /sys/class/thermal/cooling_device1/power/runtime_status:unsupported
> /sys/class/thermal/cooling_device1/power/runtime_usage:0
> /sys/class/thermal/cooling_device1/power/runtime_active_kids:0
> /sys/class/thermal/cooling_device1/power/runtime_enabled:disabled
> /sys/class/thermal/cooling_device1/power/control:auto
> /sys/class/thermal/cooling_device1/power/runtime_suspended_time:0
> /sys/class/thermal/cooling_device1/power/runtime_active_time:0
> grep: /sys/class/thermal/cooling_device1/power/autosuspend_delay_ms: Input/output
> error
> /sys/class/thermal/cooling_device1/type:Processor
> /sys/class/thermal/cooling_device1/max_state:3
> /sys/class/thermal/cooling_device1/cur_state:0
> /sys/class/thermal/cooling_device2/power/async:disabled
> /sys/class/thermal/cooling_device2/power/runtime_status:unsupported
> /sys/class/thermal/cooling_device2/power/runtime_usage:0
> /sys/class/thermal/cooling_device2/power/runtime_active_kids:0
> /sys/class/thermal/cooling_device2/power/runtime_enabled:disabled
> /sys/class/thermal/cooling_device2/power/control:auto
> /sys/class/thermal/cooling_device2/power/runtime_suspended_time:0
> /sys/class/thermal/cooling_device2/power/runtime_active_time:0
> grep: /sys/class/thermal/cooling_device2/power/autosuspend_delay_ms: Input/output
> error
> /sys/class/thermal/cooling_device2/type:Processor
> /sys/class/thermal/cooling_device2/max_state:3
> /sys/class/thermal/cooling_device2/cur_state:0
> /sys/class/thermal/cooling_device3/power/async:disabled
> /sys/class/thermal/cooling_device3/power/runtime_status:unsupported
> /sys/class/thermal/cooling_device3/power/runtime_usage:0
> /sys/class/thermal/cooling_device3/power/runtime_active_kids:0
> /sys/class/thermal/cooling_device3/power/runtime_enabled:disabled
> /sys/class/thermal/cooling_device3/power/control:auto
> /sys/class/thermal/cooling_device3/power/runtime_suspended_time:0
> /sys/class/thermal/cooling_device3/power/runtime_active_time:0
> grep: /sys/class/thermal/cooling_device3/power/autosuspend_delay_ms: Input/output
> error
> /sys/class/thermal/cooling_device3/type:Processor
> /sys/class/thermal/cooling_device3/max_state:3
> /sys/class/thermal/cooling_device3/cur_state:0
> /sys/class/thermal/cooling_device4/power/async:disabled
> /sys/class/thermal/cooling_device4/power/runtime_status:unsupported
> /sys/class/thermal/cooling_device4/power/runtime_usage:0
> /sys/class/thermal/cooling_device4/power/runtime_active_kids:0
> /sys/class/thermal/cooling_device4/power/runtime_enabled:disabled
> /sys/class/thermal/cooling_device4/power/control:auto
> /sys/class/thermal/cooling_device4/power/runtime_suspended_time:0
> /sys/class/thermal/cooling_device4/power/runtime_active_time:0
> grep: /sys/class/thermal/cooling_device4/power/autosuspend_delay_ms: Input/output
> error
> /sys/class/thermal/cooling_device4/type:LCD
> /sys/class/thermal/cooling_device4/max_state:9
> /sys/class/thermal/cooling_device4/cur_state:0
> 
> surprising that the fan seems to work at all, given that there is no
> kernel-registered cooling fan;

This only means that there is no ACPI-controlled fan on this platform.
As we can see, there are only two trip points, critical and passive.
This suggests that ACPI thermal management can only use the processors for
passive cooling.
The fan may be controlled either by some platform code or by firmware.
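
The same check can be made mechanically. A minimal sketch, assuming an ACPI-controlled fan would appear as a cooling device whose `type` file reads `Fan` (the listing quoted above shows only `Processor` and `LCD`); both function names are hypothetical:

```python
import glob
from collections import Counter

# "type" values copied from the cooling_device listing quoted above.
SAMPLE_TYPES = ["Processor", "Processor", "Processor", "Processor", "LCD"]


def read_cooling_device_types(pattern="/sys/class/thermal/cooling_device*/type"):
    """Read the type of every cooling device found under sysfs."""
    types = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            types.append(handle.read().strip())
    return types


def summarize_cooling_devices(types):
    """Count cooling devices per type and report whether any fan is present."""
    counts = Counter(t.strip() for t in types)
    # Assumption: an ACPI-controlled fan would show up with type "Fan".
    has_fan = counts.get("Fan", 0) > 0
    return counts, has_fan


if __name__ == "__main__":
    counts, has_fan = summarize_cooling_devices(SAMPLE_TYPES)
    print(dict(counts), "fan registered:", has_fan)
```

On a live system, `summarize_cooling_devices(read_cooling_device_types())` would run the same check against the real sysfs entries.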

thanks,
rui
>  yet it definitely goes faster the hotter the cpu
> is, the trouble is that that is already too hot
> > Regards,
> > Clemens
> thanks for allowing me to make some progress, seems I'll need to look more
> carefully at the BIOS too; from my understanding of it Acer has non-pwm ways of
> setting hardware registers to manage fan speed that'll be one angle to look into,
> as long as they'll provide some documentation
> 
> hoping to avoid needing to compile custom kernels given that I've already emerged
> bruised a few times from the myriad of compile options :)
> 
> Best regards
> 
> Mark




* Re: thermal_zone trip_point_0_temp 200°C
  2012-06-04  2:56   ` Zhang Rui
@ 2012-06-04 14:41     ` Mark
  2012-06-06  1:54       ` Zhang Rui
  0 siblings, 1 reply; 9+ messages in thread
From: Mark @ 2012-06-04 14:41 UTC (permalink / raw)
  To: Zhang Rui; +Cc: Clemens Ladisch, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1176 bytes --]


Hi Rui,

Thanks for writing;

On 06/03/2012 10:56 PM, Zhang Rui wrote:
> Usually, the critical trip point value is a hard coded number provided
> by the BIOS.
200° though; possibly the kernel should set a default upper limit, unless there's
a rational reason for such a value
> About fan control, it seems that there is no ACPI FAN on this machine,
> so the fan may be controlled either by firmware or by some platform
> specific driver.
> To make a double check, it would be great if you can refer to
> http://www.lesswatts.org/projects/acpi/utilities.php
> to get the acpidump output of this machine.
>
> thanks,
> rui

I'm coming to a similar conclusion; that seems to be the case with many
Aspires. There is a script for poking the hardware registers,
http://code.google.com/p/aceracpi/source/browse/trunk/acer_ec/acer_ec.pl, referred
to at http://www.linuxquestions.org/questions/showthread.php?p=4657451, though it
sounds like a somewhat risky, hit-and-miss business without the manufacturer's
specifications at hand.

There are 2-3 symbols in the ACPI tables that may be relevant: FANG, FANW, possibly
FANU. I attach the relevant files.

thanks for helping me :)

Best regards

Mark

[-- Attachment #2: Acer_ACPI.tar.bz2 --]
[-- Type: application/x-bzip, Size: 83611 bytes --]


* Re: thermal_zone trip_point_0_temp 200°C
  2012-06-04 14:41     ` Mark
@ 2012-06-06  1:54       ` Zhang Rui
  2012-06-15 14:42         ` Mark
  0 siblings, 1 reply; 9+ messages in thread
From: Zhang Rui @ 2012-06-06  1:54 UTC (permalink / raw)
  To: Mark; +Cc: Clemens Ladisch, linux-kernel

On Mon, 2012-06-04 at 10:41 -0400, Mark wrote:
> Hi Rui,
> 
> Thanks for writing;
> 
> On 06/03/2012 10:56 PM, Zhang Rui wrote:
> > Usually, the critical trip point value is a hard coded number provided
> > by the BIOS.

This is the ASL code for the critical trip point in the ACPI table:

            Name (DCRT, 0x127C)

            Method (_CRT, 0, NotSerialized)
            {
                Return (DCRT)
            }

_CRT, the control method which the OS evaluates to get the critical
trip point, returns DCRT, which is the hard-coded value 0x127C. ACPI
reports temperatures in tenths of a kelvin, so 0x127C = 4732 means
473.2 K, or 200 C.

So I do not see ACPI thermal doing anything wrong here.
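
The arithmetic can be checked with a few lines of Python. This is only a
sketch: the function name is made up for illustration, and the offset of
2732 (0 C expressed in tenths of a kelvin, rounded) is an assumption chosen
because it reproduces the 200 C figure; using 273.15 K exactly would give
200.05 C.

```python
def acpi_decikelvin_to_celsius(raw, offset=2732):
    """Convert an ACPI temperature in tenths of a kelvin to degrees Celsius.

    offset is 0 C in tenths of a kelvin; 2732 is an assumed rounding that
    matches the 200 C value discussed in this thread.
    """
    return (raw - offset) / 10.0


dcrt = 0x127C  # value returned by _CRT in the ASL above
print(dcrt, acpi_decikelvin_to_celsius(dcrt))  # prints: 4732 200.0
```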

> 200° though, possibly the kernel should set a default upper limit, unless there's
> a rational reason

You can use the module parameter thermal.crt= to override the critical
trip point.
But I'm not sure whether the kernel should set a default upper limit.
Maybe we need another entry for this laptop in thermal_dmi_table.
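
For reference, the option goes on the kernel command line: thermal.crt=-1
disables the critical trip points, and a positive number overrides them with
that temperature in degrees Celsius. The small helper below only builds the
option string; the helper name and the value 100 are hypothetical and are not
a recommended limit for this CPU, whose thermal specification is not known
here.

```python
def thermal_crt_option(celsius):
    """Build a thermal.crt= boot option.

    -1 disables the critical trip points; a positive value is the
    override temperature in degrees Celsius.
    """
    if celsius != -1 and celsius <= 0:
        raise ValueError("use -1 to disable, or a positive temperature in C")
    return f"thermal.crt={int(celsius)}"


# 100 is an illustrative value only, not a tested limit for this laptop.
print(thermal_crt_option(100))  # prints: thermal.crt=100
```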

> > About fan control, it seems that there is no ACPI FAN on this machine,
> > so the fan may be controlled either by firmware or by some platform
> > specific driver.
> > To make a double check, it would be great if you can refer to
> > http://www.lesswatts.org/projects/acpi/utilities.php
> > to get the acpidump output of this machine.
> >
> > thanks,
> > rui
> 
> I'm tending to a similar conclusion, it looks as though that's generally the case
> with many Aspires; there is a script for poking the hardware registers
> http://code.google.com/p/aceracpi/source/browse/trunk/acer_ec/acer_ec.pl referred
> to at http://www.linuxquestions.org/questions/showthread.php?p=4657451 though it
> sounds a somewhat risky/random business without the manufacturers' specifications
> at the ready
> 
> there are 2-3 symbols in the ACPI that may be relevant, FANG, FANW, possibly
> FANU; I attach the relevant files
> 
Can FANG/FANW/FANU be used for fan control?
I do not know what these mean, as they are not ACPI pre-defined control
methods.
But if all Aspire machines follow the same rule, then maybe we need
a kernel Acer platform driver that handles this.

thanks,
rui




* Re: thermal_zone trip_point_0_temp 200°C
  2012-06-06  1:54       ` Zhang Rui
@ 2012-06-15 14:42         ` Mark
  2012-06-20  6:54           ` Zhang Rui
  0 siblings, 1 reply; 9+ messages in thread
From: Mark @ 2012-06-15 14:42 UTC (permalink / raw)
  To: Zhang Rui, Clemens Ladisch, linux-kernel

Hi Rui,

Thanks for writing; apologies for the delay. I've been coping with some bizarre
communications from AMD, who won't even state thermal specifications for the CPU,
while 'passing the buck' to Acer, who are slippery customers too, with various
methods of avoiding giving technical support worthy of the name :-)

On 06/05/2012 09:54 PM, Zhang Rui wrote:
> you can use module parameter thermal.crt= to override the critical trip
> point.
> But I'm not sure if kernel should set a default upper limit or not.
> Maybe we need another entry for this laptop in thermal_dmi_table.
thanks, sounds worth a try; at least to prevent a fire :-D
>
>> there are 2-3 symbols in the ACPI that may be relevant, FANG, FANW, possibly
>> FANU; I attach the relevant files
> FANG/FANW/FANU can be used for fan control?
> I do not know what these mean as they are not ACPI pre-defined control
> method.
> But if all the Aspires machines follow the same rule, then maybe we need
> a kernel Acer platform driver that handles this.
I'm simply conjecturing for now, that the letters 'FAN' may be relevant, you're
the specialist though :-)
> thanks,
> rui
AMD suggested a channel that would seem designed more for people whose Email
address is '@acer.com' / '@kernel.org', possibly '@intel.com' than people such as
me '@gmail.com' :-D

http://support.amd.com/us/contacts/Pages/EmbeddedTechnicalSupport.aspx

My question would be: what *is* the thermal range for the Phenom II N970 Mobile
CPU, reference HMN970DCR42GM? It is conspicuously absent from
http://products.amd.com/en-us/NotebookCPUDetail.aspx?id=733

That is aside from the question of whether they've got some technical contact at
Acer who could verify how Windows manages the fan and keeps the CPU relatively
cool in the Aspire 5552-7260, presumably through some hardware registers that are
somehow accessible to the OS.

Best regards

Mark



* Re: thermal_zone trip_point_0_temp 200°C
  2012-06-15 14:42         ` Mark
@ 2012-06-20  6:54           ` Zhang Rui
  0 siblings, 0 replies; 9+ messages in thread
From: Zhang Rui @ 2012-06-20  6:54 UTC (permalink / raw)
  To: Mark; +Cc: Clemens Ladisch, linux-kernel

On Fri, 2012-06-15 at 10:42 -0400, Mark wrote:
> Hi Rui,
> 
> thanks for writing; apologies for the delay I've been coping with some bizarre
> communications from AMD who won't even state thermal specifications for the cpu,
> while 'passing the buck' to Acer who are slippery customers — various methods of
> avoiding giving technical support worthy of the name — too :-)
> 
> On 06/05/2012 09:54 PM, Zhang Rui wrote:
> > you can use module parameter thermal.crt= to override the critical trip
> > point.
> > But I'm not sure if kernel should set a default upper limit or not.
> > Maybe we need another entry for this laptop in thermal_dmi_table.
> thanks, sounds worth a try; at least to prevent a fire :-D
> >
> >> there are 2-3 symbols in the ACPI that may be relevant, FANG, FANW, possibly
> >> FANU; I attach the relevant files
> > FANG/FANW/FANU can be used for fan control?
> > I do not know what these mean as they are not ACPI pre-defined control
> > method.
> > But if all the Aspires machines follow the same rule, then maybe we need
> > a kernel Acer platform driver that handles this.
> I'm simply conjecturing for now, that the letters 'FAN' may be relevant, you're
> the specialist though :-)

No, I only care about the ACPI pre-defined control methods, i.e. those
whose behavior is defined in the ACPI spec.
Platforms can write whatever AML code they like, under whatever names,
for their own purposes, so I know no more than you about this.

> > thanks,
> > rui
> AMD suggested a channel that would seem designed more for people whose Email
> address is '@acer.com' / '@kernel.org', possibly '@intel.com' than people such as
> me '@gmail.com' :-D
> 
> http://support.amd.com/us/contacts/Pages/EmbeddedTechnicalSupport.aspx
> 
> My question would be, what *is* the thermal range for the Phenom II N970 Mobile
> CPU, reference HMN970DCR42GM ?
> conspicuously absent from http://products.amd.com/en-us/NotebookCPUDetail.aspx?id=733
> 
Sorry, I do not know. I cannot help you with this. :(

thanks,
rui

> Aside from the question of whether they've got some technical contact at Acer who
> could verify how — presumably some hardware registers that are somehow accessible
> to the OS — windows seems to manage to keep the CPU relatively cool / fan
> management in the Aspire 5552-7260
> 
> Best regards
> 
> Mark
> 




Thread overview: 9+ messages
2012-06-01 19:31 thermal_zone trip_point_0_temp 200°C Mark B
2012-06-02  8:34 ` Clemens Ladisch
2012-06-03 18:22   ` Mark
2012-06-04  3:21     ` Zhang Rui
2012-06-04  2:56   ` Zhang Rui
2012-06-04 14:41     ` Mark
2012-06-06  1:54       ` Zhang Rui
2012-06-15 14:42         ` Mark
2012-06-20  6:54           ` Zhang Rui
