* [REGRESSION] NMI received for unknown reason 3c on CPU 0, strange powersaving mode?
From: Martin Steigerwald @ 2012-03-30 11:04 UTC
  To: linux-kernel, linux-pm

Hi!

For some time now I have been seeing messages like

Message from syslogd@merkaba at Mar 30 00:29:30 ...
 kernel:[49074.294260] Uhhuh. NMI received for unknown reason 3c on CPU 0.

Message from syslogd@merkaba at Mar 30 00:29:30 ...
 kernel:[49074.294263] Do you have a strange power saving mode enabled?

Message from syslogd@merkaba at Mar 30 00:29:30 ...
 kernel:[49074.294264] Dazed and confused, but trying to continue

on resume after in-kernel hibernation.

I do not see any trace of it in syslog, kern.log or dmesg.
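
A quick way to double-check the log files could be a small search helper. This is only a sketch: the function name find_nmi_reports is made up for illustration, and the paths in the usage line are the usual Debian defaults, which may differ elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: print every line that mentions the unknown-NMI
# report from the files given as arguments.
find_nmi_reports() {
    # -H prints the file name, -n the line number. The extra /dev/null
    # keeps the output format stable when only one file is passed.
    grep -Hn 'NMI received for unknown reason' "$@" /dev/null
}
```

Usage would be something like `find_nmi_reports /var/log/syslog /var/log/kern.log`; for the kernel ring buffer, `dmesg | grep 'NMI received'` does the same job.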

From the timestamp it seems that these messages were issued shortly before 
I sent the laptop into hibernation last night.


I am using a ThinkPad T520 with Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz 
and Sandybridge graphics.

I am not sure exactly when it started, because I basically ignored it 
for quite some time. It might have started with some 3.2 kernel, maybe even 
the first 3.2 kernel I had. Currently I am using:

martin@merkaba:~> cat /proc/version
Linux version 3.3.0-trunk-amd64 (Debian 3.3-1~experimental.1) (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-1) ) #1 SMP Thu Mar 22 18:02:10 UTC 2012

Since I am quite sure I didn't see this with the first kernel I used on 
this machine, which was a 2.6.39 if I remember correctly, I consider this 
to be a regression for now.


I did not see any other strange effects, only this message.


When searching for it I found quite a few references [1]. But the reports 
I looked at seemed to be either quite old or different in that the machine 
froze.


There seem to be some hints that it is related to USB power management.

Here is what powertop says about the autosuspend settings - I did not 
change anything in there:

   Bad           Wireless Power Saving for interface wlan0            
   Bad           Enable SATA link power management for /dev/sda
   Bad           Power Aware CPU scheduler
   Bad           VM writeback timeout
   Bad           Enable Audio codec power management
   Bad           Autosuspend for USB device Biometric Coprocessor (UPE
   Bad           Autosuspend for USB device Integrated Smart Card Read
   Bad           Autosuspend for USB device USB-PS/2 Optical Mouse (Lo
   Bad           Runtime PM for PCI Device Ricoh Co Ltd MMC/SD Host Co
   Bad           Runtime PM for PCI Device Intel Corporation 2nd Gener
   Bad           Runtime PM for PCI Device Intel Corporation 2nd Gener
   Bad           Runtime PM for PCI Device Intel Corporation 82579LM G
   Bad           Runtime PM for PCI Device Intel Corporation 6 Series/
   Bad           Runtime PM for PCI Device Intel Corporation 6 Series/
   Bad           Runtime PM for PCI Device Ricoh Co Ltd FireWire Host
   Bad           Runtime PM for PCI Device Intel Corporation 6 Series/
   Bad           Runtime PM for PCI Device Intel Corporation 6 Series/
   Bad           Runtime PM for PCI Device Silicon Image, Inc. SiI 353
   Bad           Runtime PM for PCI Device Intel Corporation 6 Series/
   Bad           Runtime PM for PCI Device Intel Corporation 6 Series/
   Bad           Runtime PM for PCI Device Intel Corporation 6 Series/
   Bad           Runtime PM for PCI Device Intel Corporation 6 Series/
   Bad           Runtime PM for PCI Device Intel Corporation 6 Series/
   Bad           Runtime PM for PCI Device Intel Corporation Centrino
   Good          NMI watchdog should be turned off
   Good          Autosuspend for unknown USB device 1-1.5 (17ef:100a)
   Good          Autosuspend for unknown USB device 1-1 (8087:0024)
   Good          Autosuspend for unknown USB device 2-1 (8087:0024)
   Good          Autosuspend for USB device EHCI Host Controller [usb1
   Good          Autosuspend for USB device EHCI Host Controller [usb2
   Good          Wake-on-lan status for device eth0
   Good          Wake-on-lan status for device wlan0
   Good          Using 'ondemand' cpufreq governor

merkaba:~> lsusb 
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 002 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 001 Device 003: ID 147e:2016 Upek Biometric Touchchip/Touchstrip Fingerprint Sensor
Bus 001 Device 004: ID 17ef:100a Lenovo ThinkPad Mini Dock Plus Series 3
Bus 002 Device 003: ID 17ef:1003 Lenovo Integrated Smart Card Reader
Bus 001 Device 005: ID 046d:c00e Logitech, Inc. M-BJ58/M-BJ69 Optical Wheel Mouse


But I think I have seen it at work as well, where I use different USB 
devices (except for the built-in ones) and no minidock for now.
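
To rule USB autosuspend in or out, the per-device runtime PM setting could be inspected and temporarily switched off through sysfs. The following is a rough sketch under assumptions: the function names and the USB_SYSFS variable are made up for illustration (USB_SYSFS defaults to the real sysfs path), and writing to the real files needs root.

```shell
#!/bin/sh
# USB_SYSFS is a hypothetical override for dry runs.
USB_SYSFS="${USB_SYSFS:-/sys/bus/usb/devices}"

# Show the runtime PM setting of every USB device:
# "auto" = autosuspend allowed, "on" = device is kept awake.
usb_pm_show() {
    for ctl in "$USB_SYSFS"/*/power/control; do
        [ -r "$ctl" ] || continue
        dev=${ctl%/power/control}
        printf '%s %s\n' "${dev##*/}" "$(cat "$ctl")"
    done
}

# Keep every USB device awake by writing "on" to its control file.
usb_pm_disable_autosuspend() {
    for ctl in "$USB_SYSFS"/*/power/control; do
        [ -w "$ctl" ] || continue
        echo on > "$ctl"
    done
}
```

If the NMI message no longer shows up after a few hibernation cycles with autosuspend disabled, that would point towards USB power management; if it still appears, USB is probably not the culprit.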


As for other settings that might be related:

merkaba:~> cat /etc/modprobe.d/i915-kms.conf 
# Thorsten Leemhuis, Die Woche: Ungenutztes Stromsparpotenzial
# http://www.heise.de/open/artikel/Die-Woche-Ungenutztes-Stromsparpotenzial-1361381.html
# Eugeni Dodonov, Intel Linux Graphics
# Following the open source road from Kernel to UI toolkits
# http://www.scribd.com/doc/73071712/Intel-Linux-Graphics
# i915_enable_fbc disabled again, because:
# Enabling FBC is causing the BLT ring to run between 10-100x slower than
# normal and frequently lockup. The interim solution is disable FBC once
# more until we know why.
# http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d56d8b28e9247e7e35e02fbb12b12239a2c33ad1
options i915 modeset=1 i915_enable_rc6=1 semaphores=1


/etc/sysfs.conf:
# Werner Fischer, ADMIN 03/2011
# Schnelligkeit ist keine Hexerei
# http://www.admin-magazin.de/Das-Heft/2011/03/SSD-Performance-optimieren
class/scsi_host/host1/link_power_management_policy = min_power
class/scsi_host/host2/link_power_management_policy = min_power
# eSATA port
class/scsi_host/host3/link_power_management_policy = medium_power
class/scsi_host/host4/link_power_management_policy = min_power
class/scsi_host/host5/link_power_management_policy = min_power
class/scsi_host/host6/link_power_management_policy = min_power

# c't kompakt Linux 1/2012
# Thorsten Leemhuis, Notebooks unter Linux, pp. 38ff
# p. 42, box "Handoptimiert"
devices/system/cpu/sched_mc_power_savings = 1
# modprobe/kmod currently does not apply this from
# /etc/modprobe.d/snd-hda-intel.conf.
module/snd_hda_intel/parameters/power_save = 1

# By setting this to '1', under light load scenarios, the process load is
# distributed such that all the threads in a core and all the cores in a
# processor package are busy before distributing the process load to
# threads and cores, in other processor packages.
# http://lesswatts.org/tips/cpu.php#smpsched
devices/system/cpu/sched_smt_power_savings = 1
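
For readers not familiar with sysfs.conf: each "attribute = value" line corresponds to writing the value into the file of that name below /sys. A minimal sketch of that mapping follows; the function name and the SYSFS_ROOT variable are made up for illustration, and this is not the actual sysfsutils code.

```shell
#!/bin/sh
# SYSFS_ROOT is a hypothetical override for dry runs; the real root is /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Apply one "attribute = value" line by writing the value to
# $SYSFS_ROOT/attribute. Comments, blank lines and lines without "="
# are silently skipped.
sysfs_apply_line() {
    line=$1
    case "$line" in
        '#'*|'') return 0 ;;
        *=*) ;;
        *) return 0 ;;
    esac
    # Everything before the first "=" (minus trailing blanks) is the path.
    attr=$(printf '%s\n' "$line" | sed 's/[[:space:]]*=.*$//')
    # Everything after the first "=" (minus leading blanks) is the value.
    value=$(printf '%s\n' "$line" | sed 's/^[^=]*=[[:space:]]*//')
    echo "$value" > "$SYSFS_ROOT/$attr"
}
```

So `sysfs_apply_line 'devices/system/cpu/sched_mc_power_savings = 1'` would amount to `echo 1 > /sys/devices/system/cpu/sched_mc_power_savings`, which makes it easy to revert single settings by hand while testing.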


/etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="threadirqsi init=/bin/systemd"

The threadirqs option is currently not in effect because of my Vim typo (the stray "i") in there.
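
Whether a parameter really reached the kernel can be checked against /proc/cmdline. The helper below is a sketch with a made-up name (cmdline_has); it matches whole words only, so a misspelled parameter is not mistaken for the real one.

```shell
#!/bin/sh
# Succeed if the kernel parameter $1 appears as a complete word in the
# command line string $2 (default: the running kernel's /proc/cmdline).
cmdline_has() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    for word in $cmdline; do
        # Match "param" or "param=value", but not a longer name.
        case "$word" in
            "$param"|"$param="*) return 0 ;;
        esac
    done
    return 1
}
```

For example, `cmdline_has threadirqs "threadirqsi init=/bin/systemd"` fails, while `cmdline_has init "threadirqsi init=/bin/systemd"` succeeds.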

I have been using systemd only since last week, and I think I saw the 
message before that.


Anyway, if you suggest changing some settings, please tell me and I will 
try it.

If you need additional info such as dmidecode output, please tell me as 
well.


[1] https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/116752
and quite a few others

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


* Re: [REGRESSION] NMI received for unknown reason 3c on CPU 0, strange powersaving mode?
From: Srivatsa S. Bhat @ 2012-04-02 11:04 UTC
  To: Martin Steigerwald
  Cc: linux-kernel, linux-pm, a.p.zijlstra, stern, rjw, pavel

On 03/30/2012 04:34 PM, Martin Steigerwald wrote:

> Hi!
> 
> Since some time I am seeing things like
> 
> Message from syslogd@merkaba at Mar 30 00:29:30 ...
>  kernel:[49074.294260] Uhhuh. NMI received for unknown reason 3c on CPU 0.
> 
> Message from syslogd@merkaba at Mar 30 00:29:30 ...
>  kernel:[49074.294263] Do you have a strange power saving mode enabled?
> 
> Message from syslogd@merkaba at Mar 30 00:29:30 ...
>  kernel:[49074.294264] Dazed and confused, but trying to continue
> 
> on resume after in-kernel hibernation.
> 


Do you see this after suspend-to-ram too?

> When searching for it I found quite a few references [1]. But the reports 
> I looked at seemed to be either quite old or different in that the machine 
> froze.
> 


There was once a similar bug report, and commit 144060fee (perf: Add PM notifiers
to fix CPU hotplug races) tried to fix it; however, it didn't work out IIRC.

Can you please try out the pm-test framework and let us know in which phase
this message is encountered?
Documentation/power/basic-pm-debugging.txt

1. Recompile the kernel with CONFIG_PM_DEBUG=y
2. # cat /sys/power/pm_test
3. # echo <value> > /sys/power/pm_test
   Use the values from the list given in step 2.
   From freezer to core, the depth of the suspend phase increases.
4. # echo mem > /sys/power/state  (for suspend-to-ram)
   or echo disk > /sys/power/state  (for suspend-to-disk)

It would be great if you could tell which of the phases (freezer to core)
fails.
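
The steps above could be scripted roughly as follows. This is only a sketch: the function names and the PM_DIR variable are made up for illustration (PM_DIR defaults to the real /sys/power), and it assumes a kernel built with CONFIG_PM_DEBUG=y so that pm_test exists.

```shell
#!/bin/sh
# PM_DIR is a hypothetical override for dry runs.
PM_DIR="${PM_DIR:-/sys/power}"

# Print the available pm_test levels, one per line, without the
# brackets that mark the active level and without "none".
pm_test_levels() {
    sed 's/\[//; s/]//' < "$PM_DIR/pm_test" | tr ' ' '\n' |
        grep -v -e '^none$' -e '^$'
}

# Run one test cycle: select the level, then trigger the sleep state
# ("mem" for suspend-to-RAM, "disk" for hibernation).
pm_test_cycle() {
    level=$1
    state=$2
    echo "$level" > "$PM_DIR/pm_test" || return 1
    echo "$state" > "$PM_DIR/state" || return 1
}

# Walk all levels for one sleep state in the order the kernel lists
# them, then switch test mode off again.
pm_test_all() {
    state=$1
    for level in $(pm_test_levels); do
        echo "testing pm_test=$level state=$state"
        pm_test_cycle "$level" "$state" || return 1
    done
    echo none > "$PM_DIR/pm_test"
}
```

Run as root, `pm_test_all disk` would go through every test level for hibernation; the console needs to be watched for the NMI message after each cycle.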

> 
> There seem to be some hints that it is related to USB power management.
> 


Adding Alan Stern to CC.


* Re: [REGRESSION] NMI received for unknown reason 3c on CPU 0, strange powersaving mode?
From: Martin Steigerwald @ 2012-04-03  7:27 UTC
  To: Srivatsa S. Bhat; +Cc: linux-kernel, linux-pm, a.p.zijlstra, stern, rjw, pavel

On Monday, 2 April 2012, Srivatsa S. Bhat wrote:
> 
> Do you see this after suspend-to-ram too?

No.

> 
> There was once such a bug report and commit 144060fee (perf: Add PM
> notifiers to fix CPU hotplug races) tried to fix it, however it didn't
> work out IIRC.
> 
> Can you please try out the pm-test framework and let us know in which
> phase this message is encountered?
> Documentation/power/basic-pm-debugging.txt
> 
> 1. Recompile the kernel with CONFIG_PM_DEBUG=y

Luckily I have this already.

martin@merkaba:~> grep   CONFIG_PM_DEBUG /boot/config-3.3.0-trunk-amd64
CONFIG_PM_DEBUG=y

> 2. # cat /sys/power/pm_test
> 3. # echo <value> > /sys/power/pm_test
>    Use the values from the list given in step 2.
>    From freezer to core, it is increasing depth of suspend phase.
> 4. # echo mem > /sys/power/state  (for suspend-to-ram)
>    or echo disk > /sys/power/state  (for suspend-to-disk)

As I understand it, you want me to do step 4 for each of the values from 
step 3. If not, please tell me.

I am sending this out now, before I start my tests. ;)

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


* Re: [REGRESSION] NMI received for unknown reason 3c on CPU 0, strange powersaving mode?
From: Martin Steigerwald @ 2012-04-03  7:50 UTC
  To: Srivatsa S. Bhat; +Cc: linux-kernel, linux-pm, a.p.zijlstra, stern, rjw, pavel

On Monday, 2 April 2012, Srivatsa S. Bhat wrote:
> Can you please try out the pm-test framework and let us know in which
> phase this message is encountered?
> Documentation/power/basic-pm-debugging.txt
> 
> 1. Recompile the kernel with CONFIG_PM_DEBUG=y
> 2. # cat /sys/power/pm_test
> 3. # echo <value> > /sys/power/pm_test
>    Use the values from the list given in step 2.
>    From freezer to core, it is increasing depth of suspend phase.
> 4. # echo mem > /sys/power/state  (for suspend-to-ram)
>    or echo disk > /sys/power/state  (for suspend-to-disk)
> 
> It would be great if you could tell which of the phases (freezer to
> core) fails.

Here is the one from this morning. This time it appeared at resume time:

martin@merkaba:~> 
Message from syslogd@merkaba at Apr  3 09:10:15 ...
 kernel:[ 3755.145282] Uhhuh. NMI received for unknown reason 3c on CPU 0.

Message from syslogd@merkaba at Apr  3 09:10:15 ...
 kernel:[ 3755.145285] Do you have a strange power saving mode enabled?

Message from syslogd@merkaba at Apr  3 09:10:15 ...
 kernel:[ 3755.145286] Dazed and confused, but trying to continue


And here are the tests. Short summary: I was not able to reproduce the 
issue. "nothing" means that no NMI message appeared in the Konsole 
window where it usually shows up:

merkaba:~> cat /sys/power/pm_test 
[none] core processors platform devices freezer
merkaba:~> echo "core" > /sys/power/pm_test
merkaba:~> echo "mem" > /sys/power/state  
merkaba:~> echo nothing               
nothing
merkaba:~> echo "disk" > /sys/power/state
merkaba:~> echo nothing
nothing
merkaba:~> echo "processors" > /sys/power/pm_test
merkaba:~> echo "mem" > /sys/power/state         
merkaba:~> echo nothing                          
nothing
merkaba:~> cat /sys/power/pm_test        
none core [processors] platform devices freezer
merkaba:~> echo "disk" > /sys/power/state
merkaba:~> echo nothing                  
nothing
merkaba:~> echo "platform" > /sys/power/pm_test
merkaba:~> echo "mem" > /sys/power/state         
merkaba:~> echo nothing                 
nothing
merkaba:~> echo "disk" > /sys/power/state
merkaba:~> echo nothing                        
nothing
merkaba:~> echo "devices" > /sys/power/pm_test
merkaba:~> echo "mem" > /sys/power/state      
merkaba:~> echo nothing                       
nothing
merkaba:~> echo "disk" > /sys/power/state  
merkaba:~> echo nothing                  
nothing
merkaba:~> echo "freezer" > /sys/power/pm_test
merkaba:~> echo "mem" > /sys/power/state      
merkaba:~> echo nothing                       
nothing
merkaba:~> echo "disk" > /sys/power/state
merkaba:~> echo nothing                  
nothing


Now trying a regular hibernation:

merkaba:~> echo "none" > /sys/power/pm_test
merkaba:~> cat /sys/power/pm_test
[none] core processors platform devices freezer


Nothing.


Now trying a regular hibernation with some minutes downtime and
unplugging the power from the laptop.


Nothing as well.


Now I am puzzled.


Maybe it's the switch from minidock to no dock and vice versa?

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


* Re: [REGRESSION] NMI received for unknown reason 3c on CPU 0, strange powersaving mode?
From: Srivatsa S. Bhat @ 2012-04-03  9:45 UTC
  To: Martin Steigerwald
  Cc: linux-kernel, linux-pm, a.p.zijlstra, stern, rjw, pavel

On 04/03/2012 12:57 PM, Martin Steigerwald wrote:

> On Monday, 2 April 2012, Srivatsa S. Bhat wrote:
>>
>> Do you see this after suspend-to-ram too?
> 
> No.


Ok..

> 
>>
>> There was once such a bug report and commit 144060fee (perf: Add PM
>> notifiers to fix CPU hotplug races) tried to fix it, however it didn't
>> work out IIRC.
>>
>> Can you please try out the pm-test framework and let us know in which
>> phase this message is encountered?
>> Documentation/power/basic-pm-debugging.txt
>>
>> 1. Recompile the kernel with CONFIG_PM_DEBUG=y
> 
> Luckily I have this already.
> 
> martin@merkaba:~> grep   CONFIG_PM_DEBUG /boot/config-3.3.0-trunk-amd64
> CONFIG_PM_DEBUG=y
> 
>> 2. # cat /sys/power/pm_test
>> 3. # echo <value> > /sys/power/pm_test
>>    Use the values from the list given in step 2.
>>    From freezer to core, it is increasing depth of suspend phase.
>> 4. # echo mem > /sys/power/state  (for suspend-to-ram)
>>    or echo disk > /sys/power/state  (for suspend-to-disk)
> 
> I understand it that you want me to do step 4 for each of the values from 
> step 3. If not so, please tell me.
>


Yes, that's right. Moreover, the values in step 3 are in increasing order
from freezer to core, which means the core level is a superset of everything
before it. (So if you don't hit the problem at the core level, you won't hit
it at any earlier level.)
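
To make the ordering concrete, here is a small sketch that turns the kernel's listing into a shallowest-to-deepest sequence. The function name is made up for illustration, and it assumes the listing runs from deepest to shallowest, as in the pm_test output seen in this thread.

```shell
#!/bin/sh
# Print the pm_test levels from shallowest (freezer) to deepest (core),
# given the contents of /sys/power/pm_test as the first argument.
pm_levels_shallow_first() {
    listing=$1
    out=""
    for level in $(printf '%s\n' "$listing" | sed 's/\[//; s/]//'); do
        [ "$level" = none ] && continue
        # Prepend each level so the order ends up reversed.
        out="$level${out:+ $out}"
    done
    printf '%s\n' "$out"
}
```

Since the last entry covers all the earlier ones, testing it first is the quickest check: a clean run there makes the shallower levels unnecessary, and only a failure calls for narrowing down from the front of the list.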

Regards,
Srivatsa S. Bhat



* Re: [REGRESSION] NMI received for unknown reason 3c on CPU 0, strange powersaving mode?
From: Srivatsa S. Bhat @ 2012-04-03  9:50 UTC
  To: Martin Steigerwald
  Cc: linux-kernel, linux-pm, a.p.zijlstra, stern, rjw, pavel

On 04/03/2012 01:20 PM, Martin Steigerwald wrote:

> Now I am puzzled.
> 
> 
> Maybe it's the switch from minidock to no dock and vice versa?
> 


Oh.. so you couldn't reproduce the problem.. 
Can you try with the original setup (minidock?) with which you found the
issue during hibernation and see what pm_test has to say in that case?

Regards,
Srivatsa S. Bhat

