* AMD Bulldozer FX-8150 Powers off during kernel build
From: Sid Boyce @ 2012-09-13 1:30 UTC (permalink / raw)
To: LKML Mailing List
I have a huge heatsink and large CPU fan plus lots of cooling fans in
the case and nothing gets hot.
If I build e.g. 3.6-rc5 using 8 or 6 cores, the machine suddenly powers
off part way through.
I checked hwmon/k10temp.c to see where the sensor values shown below
are defined.
k10temp.h is 0 bytes.
-rw-r--r-- 1 root root 0 Sep 9 01:59
/usr/src/linux-3.6.0-rc5/include/config/sensors/k10temp.h
Currently I build with "make -j 1" and temperature and power values are
around those below.
# sensors
k10temp-pci-00c3
Adapter: PCI adapter
temp1:        +60.4°C  (high = +70.0°C)
                       (crit = +90.0°C, hyst = +87.0°C)

fam15h_power-pci-00c4
Adapter: PCI adapter
power1:      127.49 W  (crit = 124.77 W)
# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 21
model : 1
model name : AMD FX(tm)-8150 Eight-Core Processor
stepping : 2
microcode : 0x6000626
cpu MHz : 3600.000
cache size : 2048 KB
from .config:-
# grep HWMON .config
CONFIG_IXGBE_HWMON=y
CONFIG_HWMON=y
CONFIG_HWMON_VID=m
# CONFIG_HWMON_DEBUG_CHIP is not set
CONFIG_THERMAL_HWMON=y
# grep POWERSAVE .config
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
# CONFIG_PCIEASPM_POWERSAVE is not set
CONFIG_DEVFREQ_GOV_POWERSAVE=y
On another 6-core box I can build kernels with "make -j 6" without problems.
# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 21
model : 1
model name : AMD FX(tm)-6100 Six-Core Processor
stepping : 2
microcode : 0x6000623
cpu MHz : 3300.000
cache size : 2048 KB
With a kernel build running on the six-core box, temperature and power
hover around the values below.
sabre:~ # sensors
k10temp-pci-00c3
Adapter: PCI adapter
temp1:        +50.2°C  (high = +70.0°C)
                       (crit = +90.0°C, hyst = +87.0°C)

fam15h_power-pci-00c4
Adapter: PCI adapter
power1:       94.40 W  (crit = 95.01 W)
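
To make the comparison between the two boxes explicit, here is a minimal
Python sketch that pulls the power1 reading and its crit limit out of
"sensors" output and reports whether the reading is above the limit. It
assumes the output format shown above; the function name power_over_crit
is made up for illustration and is not part of lm-sensors.

```python
import re

# Matches lines such as "power1: 127.49 W (crit = 124.77 W)".
POWER_RE = re.compile(r"power1:\s*([\d.]+)\s*W\s*\(crit\s*=\s*([\d.]+)\s*W\)")

def power_over_crit(sensors_text):
    """Return (reading, crit, exceeded) tuples for every power1 line found."""
    results = []
    for match in POWER_RE.finditer(sensors_text):
        reading = float(match.group(1))
        crit = float(match.group(2))
        results.append((reading, crit, reading > crit))
    return results

# Readings copied from the two boxes in this message.
fx8150 = "power1: 127.49 W (crit = 124.77 W)"
fx6100 = "power1: 94.40 W (crit = 95.01 W)"

for name, text in (("FX-8150", fx8150), ("FX-6100", fx6100)):
    for reading, crit, exceeded in power_over_crit(text):
        state = "above" if exceeded else "below"
        print(f"{name}: {reading:.2f} W is {state} crit ({crit:.2f} W)")
```

With the numbers above, the FX-8150 reading is over its crit value even
with "make -j 1", while the FX-6100 stays just under its limit.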
73 ... Sid.
--
Sid Boyce ... Hamradio License G3VBV, Licensed Private Pilot
Emeritus IBM/Amdahl Mainframes and Sun/Fujitsu Servers Tech Support
Senior Staff Specialist, Cricket Coach
Microsoft Windows Free Zone - Linux used for all Computing Tasks
* Re: AMD Bulldozer FX-8150 Powers off during kernel build
From: Borislav Petkov @ 2012-09-13 9:44 UTC (permalink / raw)
To: Sid Boyce; +Cc: LKML Mailing List, Andreas Herrmann
On Thu, Sep 13, 2012 at 02:30:27AM +0100, Sid Boyce wrote:
> I have a huge heatsink and large CPU fan plus lots of cooling fans
> in the case and nothing gets hot.
> If I build e.g 3.6-rc5 with 8 or 6 cores, part way through it
> suddenly powers off.
Ok, can you catch the whole dmesg when you boot the machine _after_ the
sudden poweroff? You can send it to me and Andreas (on CC) privately if
you prefer.
Important: make sure the kernel has CONFIG_X86_MCE and
CONFIG_EDAC_DECODE_MCE built-in.
Please make sure to use a recent kernel; 3.4 or 3.5 is fine.
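
A quick way to confirm that both options are built in (=y) rather than
modular or unset is to scan the .config text. The following is only a
rough Python sketch; the helper name builtin_status and the sample
config fragment are hypothetical and were made up for illustration.

```python
def builtin_status(config_text, options):
    """Map each option name to True if the .config text sets it to 'y'."""
    values = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        values[name] = value
    # Modular (=m) and unset options both count as "not built in".
    return {opt: values.get(opt) == "y" for opt in options}

REQUIRED = ("CONFIG_X86_MCE", "CONFIG_EDAC_DECODE_MCE")

# Hypothetical fragment: one option built in, the other only a module.
sample = """\
CONFIG_X86_MCE=y
# CONFIG_X86_MCE_INTEL is not set
CONFIG_X86_MCE_AMD=y
CONFIG_EDAC_DECODE_MCE=m
"""

for option, ok in builtin_status(sample, REQUIRED).items():
    print(option, "built-in" if ok else "NOT built-in")
```

To check a real tree, pass the contents of its .config file instead of
the sample string.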
Thanks.
--
Regards/Gruss,
Boris.
* Re: AMD Bulldozer FX-8150 Powers off during kernel build
From: Sid Boyce @ 2012-09-13 21:58 UTC (permalink / raw)
To: Borislav Petkov, LKML Mailing List, Andreas Herrmann
# uname -r
3.6.0-rc5-u1-smp+
I built a new 3.6-rc5 kernel (3.6.0-rc5-u2) while running 3.6.0-rc5-u1,
using all 8 cores, and the power-off didn't occur.
slipstream:/usr/src/linux-3.6.0-rc5-u1 # grep POWER .config
# CONFIG_ACPI_PROCFS_POWER is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_X86_POWERNOW_K8=m
# CONFIG_PCIEASPM_POWERSAVE is not set
CONFIG_INPUT_POWERMATE=m
CONFIG_IPMI_POWEROFF=m
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
CONFIG_PDA_POWER=m
CONFIG_TEST_POWER=m
CONFIG_POWER_AVS=y
CONFIG_SENSORS_FAM15H_POWER=m
CONFIG_SENSORS_ACPI_POWER=m
CONFIG_SND_AC97_POWER_SAVE=y
CONFIG_SND_AC97_POWER_SAVE_DEFAULT=0
# CONFIG_SND_HDA_POWER_SAVE is not set
# CONFIG_HID_LCPOWER is not set
CONFIG_DEVFREQ_GOV_POWERSAVE=y
CONFIG_EVENT_POWER_TRACING_DEPRECATED=y
# CONFIG_XZ_DEC_POWERPC is not set
When it was powering off, "CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y"
was set.
slipstream:/usr/src/linux-3.6.0-rc5-u1 # grep PERFORMANCE .config
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_PCIEASPM_PERFORMANCE=y
CONFIG_DEVFREQ_GOV_PERFORMANCE=y
slipstream:/usr/src/linux-3.6.0-rc5-u1 # grep MCE .config
CONFIG_X86_MCE=y
# CONFIG_X86_MCE_INTEL is not set
CONFIG_X86_MCE_AMD=y
CONFIG_X86_MCE_THRESHOLD=y
# CONFIG_X86_MCE_INJECT is not set
CONFIG_EDAC_DECODE_MCE=y
# CONFIG_EDAC_MCE_INJ is not set
During the build, temperature and power were around these values:
----------------------------------------
fam15h_power-pci-00c4
Adapter: PCI adapter
power1:      133.30 W  (crit = 124.77 W)

k10temp-pci-00c3
Adapter: PCI adapter
temp1:        +61.9°C  (high = +70.0°C)
                       (crit = +90.0°C, hyst = +87.0°C)
Immediately after the build, the values are much lower than they were
with the kernel and config that caused the power-off.
----------------------------------------
fam15h_power-pci-00c4
Adapter: PCI adapter
power1:       31.10 W  (crit = 124.77 W)

k10temp-pci-00c3
Adapter: PCI adapter
temp1:        +33.2°C  (high = +70.0°C)
                       (crit = +90.0°C, hyst = +87.0°C)
------------------------------------------
If needed I can go back to the earlier 3.6.0-rc5 kernel and config to
recreate the power off situation.
With the kernel that powered off, MCE was not set and
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y was set.
For the 3.6.0-rc5-u1 kernel, only those two options were changed.
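
For anyone who wants to double-check which options differ between two
configs, a rough Python sketch follows. The two config fragments are
illustrative only: they assume that "MCE" means CONFIG_X86_MCE, and the
helper names parse_config and config_diff are made up for this example.

```python
NOT_SET = " is not set"

def parse_config(text):
    """Return {option: value}; options marked 'is not set' map to None."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET):
            # "# CONFIG_FOO is not set" -> "CONFIG_FOO"
            opts[line[2:-len(NOT_SET)]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            opts[name] = value
    return opts

def config_diff(old_text, new_text):
    """Return {option: (old, new)} for options whose value differs.

    An option that is absent is treated the same as one that is not set.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    return {name: (old.get(name), new.get(name))
            for name in sorted(set(old) | set(new))
            if old.get(name) != new.get(name)}

# Illustrative fragments, not the real .config files.
old_cfg = """\
# CONFIG_X86_MCE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
CONFIG_HWMON=y
"""
new_cfg = """\
CONFIG_X86_MCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
CONFIG_HWMON=y
"""

for name, (before, after) in config_diff(old_cfg, new_cfg).items():
    print(f"{name}: {before} -> {after}")
```

Options that are identical in both fragments, such as CONFIG_HWMON here,
are left out of the result.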
Regards
Sid.
* Re: AMD Bulldozer FX-8150 Powers off during kernel build
From: Borislav Petkov @ 2012-09-13 22:28 UTC (permalink / raw)
To: Sid Boyce; +Cc: LKML Mailing List, Andreas Herrmann
On Thu, Sep 13, 2012 at 10:58:49PM +0100, Sid Boyce wrote:
> If needed I can go back to the earlier 3.6.0-rc5 kernel and config to
> recreate the power off situation. With the kernel that powered off,
> MCE was not set and CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
Yes, as I suggested earlier, enable CONFIG_X86_MCE and
CONFIG_EDAC_DECODE_MCE and *then* try recreating the power-off.
After you boot the machine again, catch the whole dmesg and send it to me.
Thanks.
--
Regards/Gruss,
Boris.