linux-kernel.vger.kernel.org archive mirror
* 2.6.22-rc3 hibernate(?) disables skge wol
@ 2007-06-01 21:23 David Greaves
  2007-06-01 21:42 ` Rafael J. Wysocki
  0 siblings, 1 reply; 48+ messages in thread
From: David Greaves @ 2007-06-01 21:23 UTC (permalink / raw)
  To: 'linux-kernel@vger.kernel.org', netdev

Not a regression though, it does it in 2.6.21

If I cause the system to save state to disk then whilst off it no longer
responds to g-wol.

bug report:
# wakeonlan cu
Sending magic packet to 255.255.255.255:9 with 00:0C:6E:F6:47:EE

Nothing happens <grin>
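
For reference, a Wake-on-LAN magic packet is six 0xFF bytes followed by the target MAC address repeated 16 times, normally sent as a UDP broadcast. The following is a minimal sketch of what a tool like wakeonlan puts on the wire, not its actual implementation; the function names are hypothetical, and the broadcast address and port simply mirror the log line above.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN payload for a MAC like '00:0C:6E:F6:47:EE'."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    # Six 0xFF synchronisation bytes, then the target MAC repeated 16 times.
    return b"\xff" * 6 + octets * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the payload over UDP (address and port taken from the log above)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```

Whether the NIC reacts to this packet depends on the wake-up state the driver leaves it in when the machine powers off, which is what this report is about.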


cu:~# ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: pg
        Wake-on: pg
        Current message level: 0x00000037 (55)
        Link detected: yes
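
The "Wake-on: pg" line uses ethtool's single-letter wake-up flags. As a rough sketch (the table and function names below are illustrative, with flag meanings as documented in the ethtool man page), the string can be decoded like this:

```python
from typing import List

# Wake-on-LAN option letters as described in the ethtool man page.
WAKE_ON_FLAGS = {
    "p": "PHY activity",
    "u": "unicast messages",
    "m": "multicast messages",
    "b": "broadcast messages",
    "a": "ARP",
    "g": "MagicPacket",
    "s": "SecureOn password for MagicPacket",
    "d": "disabled",
}


def decode_wake_on(flags: str) -> List[str]:
    """Translate a flag string such as 'pg' into readable wake-up sources."""
    return [WAKE_ON_FLAGS.get(ch, f"unknown flag {ch!r}") for ch in flags.strip()]


print(decode_wake_on("pg"))
```

So "pg" means wake on PHY activity and on MagicPacket: the setting is enabled on the running system, which says nothing yet about the state the NIC is left in after hibernation.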

What more can I say?

David


* Re: 2.6.22-rc3 hibernate(?) disables skge wol
  2007-06-01 21:23 2.6.22-rc3 hibernate(?) disables skge wol David Greaves
@ 2007-06-01 21:42 ` Rafael J. Wysocki
  2007-06-01 22:37   ` 2.6.22-rc3 hibernate(?) fails totally - regression David Greaves
  0 siblings, 1 reply; 48+ messages in thread
From: Rafael J. Wysocki @ 2007-06-01 21:42 UTC (permalink / raw)
  To: David Greaves; +Cc: 'linux-kernel@vger.kernel.org', netdev, linux-pm

On Friday, 1 June 2007 23:23, David Greaves wrote:
> Not a regression though, it does it in 2.6.21
> 
> If I cause the system to save state to disk then whilst off it no longer
> responds to g-wol.

Can you please try with the hibernation and suspend patch series from

http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.22-rc3/patches/

applied?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression
  2007-06-01 21:42 ` Rafael J. Wysocki
@ 2007-06-01 22:37   ` David Greaves
  2007-06-01 23:22     ` Rafael J. Wysocki
  0 siblings, 1 reply; 48+ messages in thread
From: David Greaves @ 2007-06-01 22:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: 'linux-kernel@vger.kernel.org', netdev, linux-pm

Rafael J. Wysocki wrote:
> On Friday, 1 June 2007 23:23, David Greaves wrote:
>> Not a regression though, it does it in 2.6.21
>>
>> If I cause the system to save state to disk then whilst off it no longer
>> responds to g-wol.
> 
> Can you please try with the hibernation and suspend patch series from
> 
> http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.22-rc3/patches/
> 
> applied?
> 
> Greetings,
> Rafael
> 

Sorry I made a mistake in the report.
I was still booting 2.6.21.1 - very sorry :(

The real situation is worse :(

2.6.22-rc3 (no patches) just hangs on suspend at:
Suspending consoles

console switching works but needs a hard reset to reboot.

2.6.22-rc3-skge (with Rafael's patches)
suspends to disk and powers off
wol doesn't work incidentally
resume resumes to the exact same place that 2.6.22-rc3 hangs at...
ie a non-responsive system saying
Suspending consoles

Note, in both cases I can switch VTs, the caps/numlock lights respond.

David






* Re: 2.6.22-rc3 hibernate(?) fails totally - regression
  2007-06-01 22:37   ` 2.6.22-rc3 hibernate(?) fails totally - regression David Greaves
@ 2007-06-01 23:22     ` Rafael J. Wysocki
  2007-06-02 22:31       ` David Greaves
  0 siblings, 1 reply; 48+ messages in thread
From: Rafael J. Wysocki @ 2007-06-01 23:22 UTC (permalink / raw)
  To: David Greaves; +Cc: 'linux-kernel@vger.kernel.org', netdev, linux-pm

On Saturday, 2 June 2007 00:37, David Greaves wrote:
> Rafael J. Wysocki wrote:
> > On Friday, 1 June 2007 23:23, David Greaves wrote:
> >> Not a regression though, it does it in 2.6.21
> >>
> >> If I cause the system to save state to disk then whilst off it no longer
> >> responds to g-wol.
> > 
> > Can you please try with the hibernation and suspend patch series from
> > 
> > http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.22-rc3/patches/
> > 
> > applied?
> > 
> > Greetings,
> > Rafael
> > 
> 
> Sorry I made a mistake in the report.
> I was still booting 2.6.21.1 - very sorry :(
> 
> The real situation is worse :(

Ouch.
 
> 2.6.22-rc3 (no patches) just hangs on suspend at:
> Suspending consoles
> 
> console switching works but needs a hard reset to reboot.
> 
> 2.6.22-rc3-skge (with Rafael's patches)
> suspends to disk and powers off
> wol doesn't work incidentally
> resume resumes to the exact same place that 2.6.22-rc3 hangs at...
> ie a non-responsive system saying
> Suspending consoles
> 
> Note, in both cases I can switch VTs, the caps/numlock lights respond.

Can you set CONFIG_DISABLE_CONSOLE_SUSPEND in .config and see where exactly it
fails?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression
  2007-06-01 23:22     ` Rafael J. Wysocki
@ 2007-06-02 22:31       ` David Greaves
  2007-06-02 22:46         ` Linus Torvalds
  0 siblings, 1 reply; 48+ messages in thread
From: David Greaves @ 2007-06-02 22:31 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linus Torvalds, xfs
  Cc: 'linux-kernel@vger.kernel.org', netdev, linux-pm

This started as a non-regression bug report about wakeonlan.
During testing I found a real regression, and this email is only about 2.6.22-rc3
without Rafael's patches - which I'll happily come back to later :)

Rafael J. Wysocki wrote:
> On Saturday, 2 June 2007 00:37, David Greaves wrote:
>> Rafael J. Wysocki wrote:
>>> On Friday, 1 June 2007 23:23, David Greaves wrote:
>> The real situation is worse :(
> 
> Ouch.
>  
>> 2.6.22-rc3 (no patches) just hangs on suspend at:
>> Suspending consoles
>>
>> console switching works but needs a hard reset to reboot.
>>
>> 2.6.22-rc3-skge (with Rafael's patches)
> Can you set CONFIG_DISABLE_CONSOLE_SUSPEND in .config and see where exactly it
> fails?

Given that I expected it to fail, I unmounted my 1TB data array before suspending.
It succeeded... so I started digging...

So I tried again.
2.6.22-rc3 (vanilla) suspended OK this time but on resume hung at:
Stopping tasks done
Shrinking memory done (0 pages freed)
Freed 0 kb in 0.0 secs (0.0MB/s)
Suspending console(s)

Then 2.6.22-rc3 again but CONFIG_DISABLE_CONSOLE_SUSPEND=y
It suspended again.
Froze on restore.
Screen photo here:
http://www.dgreaves.com/pub/2.6.21-rc3-resume-failure.jpg

Then 2.6.22-rc3 again but CONFIG_DISABLE_CONSOLE_SUSPEND=y
This time, before suspending I unmounted my xfs/lvm/raid6 filesystem.
Just a umount, I left the devices/array up.
It suspended again.
This time it resumed without fault.

The machine has another xfs filesystem :
/dev/hdb2 on /scratch type xfs (rw)

I've started bisecting... any other info needed?

David

Added Linus as it's a regression
Added xfs as unmounting an xfs filesystem 'fixes' it.



* Re: 2.6.22-rc3 hibernate(?) fails totally - regression
  2007-06-02 22:31       ` David Greaves
@ 2007-06-02 22:46         ` Linus Torvalds
  2007-06-03 15:03           ` David Greaves
  0 siblings, 1 reply; 48+ messages in thread
From: Linus Torvalds @ 2007-06-02 22:46 UTC (permalink / raw)
  To: David Greaves
  Cc: Rafael J. Wysocki, xfs, 'linux-kernel@vger.kernel.org',
	netdev, linux-pm



On Sat, 2 Jun 2007, David Greaves wrote:
> 
> Then 2.6.22-rc3 again but CONFIG_DISABLE_CONSOLE_SUSPEND=y
> It suspended again.
> Froze on restore.
> Screen photo here:
> http://www.dgreaves.com/pub/2.6.21-rc3-resume-failure.jpg

Ok, it wasn't a hidden oops. The DISABLE_CONSOLE_SUSPEND=y thing sometimes 
shows oopses that are otherwise hidden, but at other times it just causes 
more problems (hard hangs when trying to display something on a device 
that is suspended, or behind a bridge that got suspended).

In your case, the screen output just shows normal resume output, and it 
apparently just hung for some unknown reason. It *may* be worth trying to 
do a SysRQ + 't' thing to see what tasks are running (or rather, not 
running), but since you won't be able to capture it, it's probably not 
going to be useful.

> Then 2.6.22-rc3 again but CONFIG_DISABLE_CONSOLE_SUSPEND=y
> This time, before suspending I unmounted my xfs/lvm/raid6 filesystem.
> Just a umount, I left the devices/array up.
> It suspended again.
> This time it resumed without fault.

It would be interesting to see what triggered it, since it apparently 
worked before. So yes, a bisection would be great.

			Linus


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression
  2007-06-02 22:46         ` Linus Torvalds
@ 2007-06-03 15:03           ` David Greaves
  2007-06-06  8:33             ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: David Greaves @ 2007-06-03 15:03 UTC (permalink / raw)
  To: Linus Torvalds, Tejun Heo
  Cc: Rafael J. Wysocki, xfs, 'linux-kernel@vger.kernel.org',
	netdev, linux-pm, Neil Brown

Linus Torvalds wrote:
> It would be interesting to see what triggered it, since it apparently 
> worked before. So yes, a bisection would be great.

Tejun, all the problematic patches are yours - so adding you.
Neil, since the problem only occurs whilst an xfs filesystem is mounted on a
raid6 array, I've cc'ed you too...

OK
Got as far as I could...
I've run 9 or 10 kernels/bisects and got to a point with 8 of Tejun's changesets
where it wouldn't compile:

  CC      drivers/ata/sata_via.o
drivers/ata/sata_via.c:120: error: `ata_scsi_device_suspend' undeclared here (not in a function)
drivers/ata/sata_via.c:120: error: initializer element is not constant
drivers/ata/sata_via.c:120: error: (near initialization for `svia_sht.suspend')
drivers/ata/sata_via.c:121: error: `ata_scsi_device_resume' undeclared here (not in a function)
drivers/ata/sata_via.c:121: error: initializer element is not constant
drivers/ata/sata_via.c:121: error: (near initialization for `svia_sht.resume')
make[2]: *** [drivers/ata/sata_via.o] Error 1

git bisect visualize gave:

bad: 48aaae7a2fa46e1ed0d0b7677fde79ccfcb8c963
bisect: 54936f8b099325992f0f212a5e366fd5257c6c9c
good: 0a3fd051c7036ef71b58863f8e5da7c3dabd9d3f

I used:
git reset --hard 8575b814097af648dad284bd3087875a11b13d18
git reset --hard e92351bb53c0849fabfa80be53cbf3b0aa166e54
git reset --hard 3a32a8e96694a243ec7e7feb6d76dfc4b1fe90c1
git reset --hard 9666f4009c22f6520ac3fb8a19c9e32ab973e828
to step through - none compiled
git reset --hard 1d30c33d8d07868199560b24f10ed6280e78a89c
compiled and hung on resume.

given the first patch identified is
9666f4009c22f6520ac3fb8a19c9e32ab973e828: "libata: reimplement suspend/resume
support using sdev->manage_start_stop"
That seems a good candidate...
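
The situation described here, where a run of untestable commits sits between the last good and the first testable bad one, can be modelled with a small sketch. It assumes a linear history and a hypothetical test callback returning "good", "bad" or "skip" (current git offers `git bisect skip` for the same purpose); it is not how git itself is implemented.

```python
from typing import Callable, List


def bisect_first_bad(commits: List[str], test: Callable[[str], str]) -> List[str]:
    """Narrow down the first bad commit in a linear history.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. test() returns 'good', 'bad' or 'skip'
    (e.g. the commit does not compile). The result lists every commit that
    could still be the first bad one.
    """
    lo, hi = 0, len(commits) - 1  # lo: newest known good, hi: oldest known bad
    skipped = set()
    while True:
        untested = [i for i in range(lo + 1, hi) if i not in skipped]
        if not untested:
            break
        mid = (lo + hi) // 2
        # Test the untested commit closest to the midpoint.
        pick = min(untested, key=lambda i: abs(i - mid))
        verdict = test(commits[pick])
        if verdict == "good":
            lo = pick
        elif verdict == "bad":
            hi = pick
        else:
            skipped.add(pick)
    return commits[lo + 1 : hi + 1]
```

When every remaining commit is skipped, the function returns the whole skipped run plus the first testable bad commit, so the oldest commit in that run is a candidate rather than a proven culprit.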

Incidentally, when I compile 1d30c33d8d07868199560b24f10ed6280e78a89c (the far
side of the implicated changesets), the resume succeeds if I umount my xfs over
raid6 filesystem (there is no lvm, contrary to what I said in the OP).


David
PS I hope I've interpreted bisect correctly - first use and all that...


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression
  2007-06-03 15:03           ` David Greaves
@ 2007-06-06  8:33             ` Tejun Heo
  2007-06-06 10:18               ` [PATCH] sata_promise: use TF interface for polling NODATA commands Tejun Heo
  2007-06-06 10:39               ` 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6) David Greaves
  0 siblings, 2 replies; 48+ messages in thread
From: Tejun Heo @ 2007-06-06  8:33 UTC (permalink / raw)
  To: David Greaves
  Cc: Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	netdev, linux-pm, Neil Brown

Hello,

David Greaves wrote:
> Linus Torvalds wrote:
>> It would be interesting to see what triggered it, since it apparently 
>> worked before. So yes, a bisection would be great.
> 
> Tejun, all the problematic patches are yours - so adding you.

Ouch....

> given the first patch identified is
> 9666f4009c22f6520ac3fb8a19c9e32ab973e828: "libata: reimplement suspend/resume
> support using sdev->manage_start_stop"
> That seems a good candidate...

9ce3075c20d458040138690edfdf6446664ec3ee works, right?  Can you test
9666f4009c22f6520ac3fb8a19c9e32ab973e828 by removing
ata_scsi_device_suspend/resume callbacks from sata_via.c?  Just delete
all lines referencing those two functions.  There were one or two
fallouts from the conversion.

How many drives do you have?  Behavior difference introduced by the
reimplementation is serialization of resume sequence, so it takes more
time.  My test machine had problems resuming if resume took too long
even with the previous implementation.  It didn't matter whether the
long resuming sequence is caused by too many controllers or explicit
ssleep().  If time needed for resume sequence is over certain threshold,
machine hangs while resuming.  I thought it was a BIOS glitch and didn't
dig into it but you might be seeing the same issue.

Please post dmesg too.  Thanks.

-- 
tejun


* [PATCH] sata_promise: use TF interface for polling NODATA commands
  2007-06-06  8:33             ` Tejun Heo
@ 2007-06-06 10:18               ` Tejun Heo
  2007-06-06 10:19                 ` Tejun Heo
  2007-06-06 10:39               ` 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6) David Greaves
  1 sibling, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2007-06-06 10:18 UTC (permalink / raw)
  To: David Greaves
  Cc: Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	netdev, linux-pm, Neil Brown, mikpe

sata_promise uses two different command modes - packet and TF.  Packet
mode is intelligent low-overhead mode while TF is the same old
taskfile interface.  As with other advanced interfaces (ahci/sil24),
ATA_TFLAG_POLLING has no effect in packet mode.  However, PIO commands
are issued using TF interface in polling mode, so pdc_interrupt()
considers interrupts spurious if ATA_TFLAG_POLLING is set.

This is broken for polling NODATA commands because the command is issued
using packet mode but the interrupt handler ignores it due to
ATA_TFLAG_POLLING.  Fix pdc_qc_issue_prot() such that ATA/ATAPI NODATA
commands are issued using TF interface if ATA_TFLAG_POLLING is set.

This patch fixes detection failure introduced by polling SETXFERMODE.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
David, please verify this patch.  Mikael, does this look okay?  Please
push this upstream after David and Mikael's ack.

Thanks.

 drivers/ata/sata_promise.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/ata/sata_promise.c b/drivers/ata/sata_promise.c
index 2b924a6..6dc0b01 100644
--- a/drivers/ata/sata_promise.c
+++ b/drivers/ata/sata_promise.c
@@ -784,9 +784,12 @@ static unsigned int pdc_qc_issue_prot(struct ata_queued_cmd *qc)
 		if (qc->dev->flags & ATA_DFLAG_CDB_INTR)
 			break;
 		/*FALLTHROUGH*/
+	case ATA_PROT_NODATA:
+		if (qc->tf.flags & ATA_TFLAG_POLLING)
+			break;
+		/*FALLTHROUGH*/
 	case ATA_PROT_ATAPI_DMA:
 	case ATA_PROT_DMA:
-	case ATA_PROT_NODATA:
 		pdc_packet_start(qc);
 		return 0;
 
@@ -800,7 +803,7 @@ static unsigned int pdc_qc_issue_prot(struct ata_queued_cmd *qc)
 static void pdc_tf_load_mmio(struct ata_port *ap, const struct ata_taskfile *tf)
 {
 	WARN_ON (tf->protocol == ATA_PROT_DMA ||
-		 tf->protocol == ATA_PROT_NODATA);
+		 tf->protocol == ATA_PROT_ATAPI_DMA);
 	ata_tf_load(ap, tf);
 }
 
@@ -808,7 +811,7 @@ static void pdc_tf_load_mmio(struct ata_port *ap, const struct ata_taskfile *tf)
 static void pdc_exec_command_mmio(struct ata_port *ap, const struct ata_taskfile *tf)
 {
 	WARN_ON (tf->protocol == ATA_PROT_DMA ||
-		 tf->protocol == ATA_PROT_NODATA);
+		 tf->protocol == ATA_PROT_ATAPI_DMA);
 	ata_exec_command(ap, tf);
 }
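
The dispatch change in pdc_qc_issue_prot() can be summarised with a small model. This is an illustrative sketch only: the function names and string constants are made up for the example, the ATAPI case that precedes the first hunk is left out, and the fall-back to the taskfile path for other protocols is an assumption based on the description above.

```python
def issue_path(protocol: str, polling: bool, patched: bool = True) -> str:
    """Return 'packet' or 'taskfile' for a command, following the switch in the diff."""
    if protocol == "NODATA":
        # Patched: polled NODATA commands break out of the switch to the TF path.
        if patched and polling:
            return "taskfile"
        return "packet"
    if protocol in ("DMA", "ATAPI_DMA"):
        return "packet"
    # Assumption: other protocols (e.g. PIO) leave the switch and use TF.
    return "taskfile"


def completion_seen(protocol: str, polling: bool, patched: bool = True) -> bool:
    """Packet-mode commands complete via an interrupt, which the handler
    treats as spurious when the polling flag is set."""
    if issue_path(protocol, polling, patched) == "packet":
        return not polling
    return True  # TF path: completion is picked up by the generic code


print(completion_seen("NODATA", polling=True, patched=False))
print(completion_seen("NODATA", polling=True, patched=True))
```

In this model the unpatched polled NODATA case is the only one where a command is issued in packet mode and its completion is then ignored, which matches the failure the patch description attributes to polling SETXFERMODE.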
 


* Re: [PATCH] sata_promise: use TF interface for polling NODATA commands
  2007-06-06 10:18               ` [PATCH] sata_promise: use TF interface for polling NODATA commands Tejun Heo
@ 2007-06-06 10:19                 ` Tejun Heo
  0 siblings, 0 replies; 48+ messages in thread
From: Tejun Heo @ 2007-06-06 10:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David Greaves, Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	netdev, linux-pm, Neil Brown, mikpe

Tejun Heo wrote:
> sata_promise uses two different command modes - packet and TF.  Packet
> mode is intelligent low-overhead mode while TF is the same old
> taskfile interface.  As with other advanced interfaces (ahci/sil24),
> ATA_TFLAG_POLLING has no effect in packet mode.  However, PIO commands
> are issued using TF interface in polling mode, so pdc_interrupt()
> considers interrupts spurious if ATA_TFLAG_POLLING is set.
> 
> This is broken for polling NODATA commands because the command is issued
> using packet mode but the interrupt handler ignores it due to
> ATA_TFLAG_POLLING.  Fix pdc_qc_issue_prot() such that ATA/ATAPI NODATA
> commands are issued using TF interface if ATA_TFLAG_POLLING is set.
> 
> This patch fixes detection failure introduced by polling SETXFERMODE.
> 
> Signed-off-by: Tejun Heo <htejun@gmail.com>

Eeeek... Wrong thread.  Please ignore this posting.  Will repost.  Sorry.

-- 
tejun


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-06  8:33             ` Tejun Heo
  2007-06-06 10:18               ` [PATCH] sata_promise: use TF interface for polling NODATA commands Tejun Heo
@ 2007-06-06 10:39               ` David Greaves
  2007-06-07  5:53                 ` Tejun Heo
  1 sibling, 1 reply; 48+ messages in thread
From: David Greaves @ 2007-06-06 10:39 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

Tejun Heo wrote:
> Hello,
> 
> David Greaves wrote:
>> Linus Torvalds wrote:
>>> It would be interesting to see what triggered it, since it apparently 
>>> worked before. So yes, a bisection would be great.
>> Tejun, all the problematic patches are yours - so adding you.
> 
> Ouch....
<grin> that's what everyone says!

Just to be clear. This problem is where my system won't resume after s2d unless 
I umount my xfs over raid6 filesystem.

>> given the first patch identified is
>> 9666f4009c22f6520ac3fb8a19c9e32ab973e828: "libata: reimplement suspend/resume
>> support using sdev->manage_start_stop"
>> That seems a good candidate...
> 
> 9ce3075c20d458040138690edfdf6446664ec3ee works, right?
Yes
git reset --hard ec4883b015c3212f6f6d04fb2ff45f528492f598
vi Makefile
make oldconfig
make && make install && make modules_install && update-grub
init 6

>  Can you test
> 9666f4009c22f6520ac3fb8a19c9e32ab973e828 by removing
> ata_scsi_device_suspend/resume callbacks from sata_via.c?   Just delete
> all lines referencing those two functions.  There were one or two
> fallouts from the conversion.

Yes, after I posted I realised that Andrew's patch fixed the compile failure :)

git reset --hard 9666f4009c22f6520ac3fb8a19c9e32ab973e828

diff --git a/drivers/ata/sata_via.c b/drivers/ata/sata_via.c
index 939c924..bad87b5 100644
--- a/drivers/ata/sata_via.c
+++ b/drivers/ata/sata_via.c
@@ -117,8 +117,6 @@ static struct scsi_host_template svia_sht = {
         .slave_destroy          = ata_scsi_slave_destroy,
         .bios_param             = ata_std_bios_param,
  #ifdef CONFIG_PM
-       .suspend                = ata_scsi_device_suspend,
-       .resume                 = ata_scsi_device_resume,
  #endif
  };

So now this compiles but it does cause the problem:

umount /huge
echo platform > /sys/power/disk
echo disk > /sys/power/state
# resumes fine

mount /huge
echo platform > /sys/power/disk
echo disk > /sys/power/state
# won't resume

FWIW, /huge is:
/dev/md0 on /huge type xfs (rw)
cu:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4]
md0 : active raid6 sdf1[0] sde1[1] sdd1[2] sdc1[3] sdb1[4] sda1[5] hdb1[6]
       1225557760 blocks level 6, 256k chunk, algorithm 2 [7/7] [UUUUUUU]
       bitmap: 0/234 pages [0KB], 512KB chunk

unused devices: <none>

> 
> How many drives do you have?
8 in total
2 pata : VIA vt8237
2 sata on sata_via
4 sata on sata_promise
+1 pata cdrom

> Behavior difference introduced by the
> reimplementation is serialization of resume sequence, so it takes more
> time.  My test machine had problems resuming if resume took too long
> even with the previous implementation.  It didn't matter whether the
> long resuming sequence is caused by too many controllers or explicit
> ssleep().  If time needed for resume sequence is over certain threshold,
> machine hangs while resuming.  I thought it was a BIOS glitch and didn't
> dig into it but you might be seeing the same issue.
given the mount/umount thing this sounds unlikely... but what do I know?

resume does throw up:
ATA: abnormal status 0x7F on port 0x0001b007
ATA: abnormal status 0x7F on port 0x0001b007
ATA: abnormal status 0x7F on port 0x0001a407
ATA: abnormal status 0x7F on port 0x0001a407

which I've not noticed before... oh, alright, I'll check...
reboots to 2.6.21, suspend, resume...
nope, no such output on resume in 2.6.21


> Please post dmesg too.  Thanks.
> 

Here is:
  dmesg from 2.6.22-9666f4009c22f6520ac3fb8a19c9e32ab973e828 (ie with sata_via fix)
  dmesg from resume of above when /huge is unmounted
  dmesg from resume of 2.6.21

Linux version 2.6.21-TejunTst2-g9666f400-dirty (root@cu.dgreaves.com) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #13 Wed Jun 6 10:16:03 BST 2007
BIOS-provided physical RAM map:
  BIOS-e820: 0000000000000000 - 000000000009c400 (usable)
  BIOS-e820: 000000000009c400 - 00000000000a0000 (reserved)
  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
  BIOS-e820: 0000000000100000 - 000000003fffc000 (usable)
  BIOS-e820: 000000003fffc000 - 000000003ffff000 (ACPI data)
  BIOS-e820: 000000003ffff000 - 0000000040000000 (ACPI NVS)
  BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
  BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
Entering add_active_range(0, 0, 262140) 0 entries of 256 used
Zone PFN ranges:
   DMA             0 ->     4096
   Normal       4096 ->   229376
   HighMem    229376 ->   262140
early_node_map[1] active PFN ranges
     0:        0 ->   262140
On node 0 totalpages: 262140
   DMA zone: 32 pages used for memmap
   DMA zone: 0 pages reserved
   DMA zone: 4064 pages, LIFO batch:0
   Normal zone: 1760 pages used for memmap
   Normal zone: 223520 pages, LIFO batch:31
   HighMem zone: 255 pages used for memmap
   HighMem zone: 32509 pages, LIFO batch:7
DMI 2.3 present.
ACPI: RSDP 000F62A0, 0014 (r0 ASUS  )
ACPI: RSDT 3FFFC000, 0030 (r1 ASUS   A7V600   42302E31 MSFT 31313031)
ACPI: FACP 3FFFC0B2, 0074 (r1 ASUS   A7V600   42302E31 MSFT 31313031)
ACPI: DSDT 3FFFC126, 2C4F (r1   ASUS A7V600       1000 MSFT  100000B)
ACPI: FACS 3FFFF000, 0040
ACPI: BOOT 3FFFC030, 0028 (r1 ASUS   A7V600   42302E31 MSFT 31313031)
ACPI: APIC 3FFFC058, 005A (r1 ASUS   A7V600   42302E31 MSFT 31313031)
ACPI: PM-Timer IO Port: 0xe408
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:10 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000)
Built 1 zonelists.  Total pages: 260093
Kernel command line: root=/dev/hda2 ro log_buf_len=128k
log_buf_len: 131072
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 16384 bytes)
Detected 1999.872 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1034940k/1048560k available (2426k kernel code, 12968k reserved, 888k data, 188k init, 131056k highmem)
virtual kernel memory layout:
     fixmap  : 0xfffaa000 - 0xfffff000   ( 340 kB)
     pkmap   : 0xff800000 - 0xffc00000   (4096 kB)
     vmalloc : 0xf8800000 - 0xff7fe000   ( 111 MB)
     lowmem  : 0xc0000000 - 0xf8000000   ( 896 MB)
       .init : 0xc0440000 - 0xc046f000   ( 188 kB)
       .data : 0xc035e9ef - 0xc043cb90   ( 888 kB)
       .text : 0xc0100000 - 0xc035e9ef   (2426 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 4003.08 BogoMIPS (lpj=8006169)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0383fbff c1cbfbff 00000000 00000000 00000000 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU: After all inits, caps: 0383fbff c1cbfbff 00000000 00000420 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Compat vDSO mapped to ffffe000.
CPU: AMD Athlon(TM) MP stepping 00
Checking 'hlt' instruction... OK.
ACPI: Core revision 20070126
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xf1970, last bus=1
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter enabled
ACPI: (supports S0 S1 S4 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
PCI: enabled onboard AC97/MC97 devices
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 *5 6 7 9 10 11 12)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 *5 6 7 9 10 11 12)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 *6 7 9 10 11 12)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 9 10 11 12) *15, disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: ACPI device : hid PNP0C01
pnp: ACPI device : hid PNP0A03
pnp: ACPI device : hid PNP0C02
pnp: ACPI device : hid PNP0C02
pnp: ACPI device : hid PNP0200
pnp: ACPI device : hid PNP0B00
pnp: ACPI device : hid PNP0800
pnp: ACPI device : hid PNP0C04
pnp: ACPI device : hid PNP0501
pnp: ACPI device : hid PNP0501
pnp: ACPI device : hid PNP0303
pnp: ACPI device : hid PNP0F03
pnp: ACPI device : hid PNPB02F
pnp: ACPI device : hid PNP0C02
pnp: PnP ACPI: found 14 devices
ACPI: ACPI bus type pnp unregistered
SCSI subsystem initialized
libata version 2.20 loaded.
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
pnp: the driver 'system' has been registered
pnp: match found with the PnP device '00:00' and the driver 'system'
pnp: 00:00: iomem range 0x0-0x9ffff could not be reserved
pnp: 00:00: iomem range 0xf0000-0xfffff could not be reserved
pnp: 00:00: iomem range 0x100000-0x3fffffff could not be reserved
pnp: 00:00: iomem range 0xfec00000-0xfec000ff could not be reserved
pnp: match found with the PnP device '00:02' and the driver 'system'
pnp: 00:02: ioport range 0xe400-0xe47f has been reserved
pnp: 00:02: ioport range 0xe800-0xe81f has been reserved
pnp: 00:02: iomem range 0xfff80000-0xffffffff could not be reserved
pnp: 00:02: iomem range 0xffb80000-0xffbfffff has been reserved
pnp: match found with the PnP device '00:03' and the driver 'system'
pnp: match found with the PnP device '00:0d' and the driver 'system'
pnp: 00:0d: ioport range 0x290-0x297 has been reserved
pnp: 00:0d: ioport range 0x370-0x375 has been reserved
Time: tsc clocksource has been installed.
PCI: Bridge: 0000:00:01.0
   IO window: disabled.
   MEM window: disabled.
   PREFETCH window: disabled.
PCI: Setting latency timer of device 0000:00:01.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
TCP bind hash table entries: 65536 (order: 6, 262144 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
Simple Boot Flag at 0x3a set to 0x1
Machine check exception polling timer started.
highmem bounce pool size: 64 pages
SGI XFS with ACLs, no debug enabled
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
Boot video device is 0000:00:0a.0
PCI: Bypassing VIA 8237 APIC De-Assert Message
atyfb: using auxiliary register aperture
atyfb: 3D RAGE II+ (Mach64 GU) [0x4755 rev 0x9a]
atyfb: Mach64 BIOS is located at c0000, mapped at c00c0000.
atyfb: BIOS frequency table:
atyfb: PCLK_min_freq 926, PCLK_max_freq 22216, ref_freq 1432, ref_divider 33
atyfb: MCLK_pwd 4200, MCLK_max_freq 6000, XCLK_max_freq 6000, SCLK_freq 5000
atyfb: 4M EDO, 14.31818 MHz XTAL, 222 MHz PLL, 60 Mhz MCLK, 60 MHz XCLK
Console: switching to colour frame buffer device 80x30
atyfb: fb0: ATY Mach64 frame buffer device on PCI
input: Power Button (FF) as /class/input/input0
ACPI: Power Button (FF) [PWRF]
input: Power Button (CM) as /class/input/input1
ACPI: Power Button (CM) [PWRB]
netconsole: not configured, aborting
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 0000:00:0f.1
ACPI: PCI Interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 16
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1
     ide0: BM-DMA at 0x9000-0x9007, BIOS settings: hda:DMA, hdb:DMA
     ide1: BM-DMA at 0x9008-0x900f, BIOS settings: hdc:pio, hdd:DMA
Probing IDE interface ide0...
Switched to high resolution mode on CPU 0
hda: ST320420A, ATA DISK drive
hdb: Maxtor 5A300J0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdd: PLEXTOR CD-R PX-W2410A, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: 39851760 sectors (20404 MB) w/2048KiB Cache, CHS=39535/16/63, UDMA(66)
hda: cache flushes not supported
  hda: hda1 hda2 hda3
hdb: max request size: 512KiB
hdb: 585940320 sectors (300001 MB) w/2048KiB Cache, CHS=36473/255/63, UDMA(133)
hdb: cache flushes supported
  hdb: hdb1 hdb2
hdd: ATAPI 40X CD-ROM CD-R/RW drive, 4096kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
sata_promise 0000:00:0d.0: version 2.07
ACPI: PCI Interrupt 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 17
scsi0 : sata_promise
scsi1 : sata_promise
scsi2 : sata_promise
scsi3 : sata_promise
ata1: SATA max UDMA/133 cmd 0xf880a200 ctl 0xf880a238 bmdma 0x00000000 irq 0
ata2: SATA max UDMA/133 cmd 0xf880a280 ctl 0xf880a2b8 bmdma 0x00000000 irq 0
ata3: SATA max UDMA/133 cmd 0xf880a300 ctl 0xf880a338 bmdma 0x00000000 irq 0
ata4: SATA max UDMA/133 cmd 0xf880a380 ctl 0xf880a3b8 bmdma 0x00000000 irq 0
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata1.00: ATA-7: Maxtor 6B250S0, BANC19J0, max UDMA/133
ata1.00: 490234752 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata1.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata1.00: configured for UDMA/133
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata2.00: ATA-7: Maxtor 7Y250M0, YAR51EW0, max UDMA/133
ata2.00: 490234752 sectors, multi 0: LBA48
ata2.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata2.00: configured for UDMA/133
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata3.00: ATA-7: Maxtor 7Y250M0, YAR51EW0, max UDMA/133
ata3.00: 490234752 sectors, multi 0: LBA48
ata3.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata3.00: configured for UDMA/133
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata4.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata4.00: ATA-7: Maxtor 6B250S0, BANC1980, max UDMA/133
ata4.00: 490234752 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata4.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata4.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access     ATA      Maxtor 6B250S0   BANC PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 490234752 512-byte hardware sectors (251000 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 490234752 512-byte hardware sectors (251000 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  sda: sda1
sd 0:0:0:0: [sda] Attached SCSI disk
scsi 1:0:0:0: Direct-Access     ATA      Maxtor 7Y250M0   YAR5 PQ: 0 ANSI: 5
sd 1:0:0:0: [sdb] 490234752 512-byte hardware sectors (251000 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:0:0: [sdb] 490234752 512-byte hardware sectors (251000 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  sdb: sdb1
sd 1:0:0:0: [sdb] Attached SCSI disk
scsi 2:0:0:0: Direct-Access     ATA      Maxtor 7Y250M0   YAR5 PQ: 0 ANSI: 5
sd 2:0:0:0: [sdc] 490234752 512-byte hardware sectors (251000 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 2:0:0:0: [sdc] 490234752 512-byte hardware sectors (251000 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  sdc: sdc1
sd 2:0:0:0: [sdc] Attached SCSI disk
scsi 3:0:0:0: Direct-Access     ATA      Maxtor 6B250S0   BANC PQ: 0 ANSI: 5
sd 3:0:0:0: [sdd] 490234752 512-byte hardware sectors (251000 MB)
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 3:0:0:0: [sdd] 490234752 512-byte hardware sectors (251000 MB)
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  sdd: sdd1
sd 3:0:0:0: [sdd] Attached SCSI disk
sata_via 0000:00:0f.0: version 2.1
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 16
sata_via 0000:00:0f.0: routed to hard irq line 0
scsi4 : sata_via
scsi5 : sata_via
ata5: SATA max UDMA/133 cmd 0x0001b000 ctl 0x0001a802 bmdma 0x00019800 irq 0
ata6: SATA max UDMA/133 cmd 0x0001a400 ctl 0x0001a002 bmdma 0x00019808 irq 0
ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ATA: abnormal status 0x7F on port 0x0001b007
ATA: abnormal status 0x7F on port 0x0001b007
ata5.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata5.00: ATA-7: Maxtor 7B250S0, BANC1980, max UDMA/133
ata5.00: 490234752 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata5.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata5.00: configured for UDMA/133
ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ATA: abnormal status 0x7F on port 0x0001a407
ATA: abnormal status 0x7F on port 0x0001a407
ata6.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
ata6.00: ATA-7: ST3400620AS, 3.AAK, max UDMA/133
ata6.00: 781422768 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata6.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
ata6.00: configured for UDMA/133
scsi 4:0:0:0: Direct-Access     ATA      Maxtor 7B250S0   BANC PQ: 0 ANSI: 5
sd 4:0:0:0: [sde] 490234752 512-byte hardware sectors (251000 MB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 4:0:0:0: [sde] 490234752 512-byte hardware sectors (251000 MB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  sde: sde1
sd 4:0:0:0: [sde] Attached SCSI disk
scsi 5:0:0:0: Direct-Access     ATA      ST3400620AS      3.AA PQ: 0 ANSI: 5
sd 5:0:0:0: [sdf] 781422768 512-byte hardware sectors (400088 MB)
sd 5:0:0:0: [sdf] Write Protect is off
sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 5:0:0:0: [sdf] 781422768 512-byte hardware sectors (400088 MB)
sd 5:0:0:0: [sdf] Write Protect is off
sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  sdf: sdf1 sdf2
sd 5:0:0:0: [sdf] Attached SCSI disk
pnp: the driver 'i8042 kbd' has been registered
pnp: match found with the PnP device '00:0a' and the driver 'i8042 kbd'
pnp: the driver 'i8042 aux' has been registered
pnp: match found with the PnP device '00:0b' and the driver 'i8042 aux'
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard as /class/input/input2
md: linear personality registered for level -1
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
raid6: int32x1    806 MB/s
raid6: int32x2   1097 MB/s
raid6: int32x4    694 MB/s
raid6: int32x8    648 MB/s
raid6: mmxx1     1683 MB/s
raid6: mmxx2     3028 MB/s
raid6: sse1x1    1622 MB/s
raid6: sse1x2    2655 MB/s
raid6: using algorithm sse1x2 (2655 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
raid5: automatically using best checksumming function: pIII_sse
    pIII_sse  :  4201.000 MB/sec
raid5: using function: pIII_sse (4201.000 MB/sec)
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel@redhat.com
TCP cubic registered
Using IPI Shortcut mode
input: ImPS/2 Logitech Wheel Mouse as /class/input/input3
md: Autodetecting RAID arrays.
md: autorun ...
md: considering sdf1 ...
md:  adding sdf1 ...
md:  adding sde1 ...
md:  adding sdd1 ...
md:  adding sdc1 ...
md:  adding sdb1 ...
md:  adding sda1 ...
md:  adding hdb1 ...
md: created md0
md: bind<hdb1>
md: bind<sda1>
md: bind<sdb1>
md: bind<sdc1>
md: bind<sdd1>
md: bind<sde1>
md: bind<sdf1>
md: running: <sdf1><sde1><sdd1><sdc1><sdb1><sda1><hdb1>
raid5: device sdf1 operational as raid disk 0
raid5: device sde1 operational as raid disk 1
raid5: device sdd1 operational as raid disk 2
raid5: device sdc1 operational as raid disk 3
raid5: device sdb1 operational as raid disk 4
raid5: device sda1 operational as raid disk 5
raid5: device hdb1 operational as raid disk 6
raid5: allocated 7316kB for md0
raid5: raid level 6 set md0 active with 7 out of 7 devices, algorithm 2
RAID5 conf printout:
  --- rd:7 wd:7
  disk 0, o:1, dev:sdf1
  disk 1, o:1, dev:sde1
  disk 2, o:1, dev:sdd1
  disk 3, o:1, dev:sdc1
  disk 4, o:1, dev:sdb1
  disk 5, o:1, dev:sda1
  disk 6, o:1, dev:hdb1
md0: bitmap initialized from disk: read 15/15 pages, set 2 bits, status: 0
created bitmap (234 pages) for device md0
md: ... autorun DONE.
Filesystem "hda2": Disabling barriers, not supported by the underlying device
XFS mounting filesystem hda2
Ending clean XFS mount for filesystem: hda2
VFS: Mounted root (xfs filesystem) readonly.
Freeing unused kernel memory: 188k freed
NET: Registered protocol family 1
PCI: Enabling device 0000:00:09.0 (0014 -> 0017)
ACPI: PCI Interrupt 0000:00:09.0[A] -> GSI 18 (level, low) -> IRQ 18
skge 1.11 addr 0xf6000000 irq 18 chip Yukon rev 1
skge eth0: addr 00:0c:6e:f6:47:ee
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 19
uhci_hcd 0000:00:10.0: UHCI Host Controller
uhci_hcd 0000:00:10.0: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:10.0: irq 19, io base 0x00008800
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 19
uhci_hcd 0000:00:10.1: UHCI Host Controller
uhci_hcd 0000:00:10.1: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:10.1: irq 19, io base 0x00008400
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
sk98lin: driver has been replaced by the skge driver and is scheduled for removal
ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 19
uhci_hcd 0000:00:10.2: UHCI Host Controller
uhci_hcd 0000:00:10.2: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:10.2: irq 19, io base 0x00008000
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 19
uhci_hcd 0000:00:10.3: UHCI Host Controller
uhci_hcd 0000:00:10.3: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:10.3: irq 19, io base 0x00007800
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
vt596_smbus 0000:00:11.0: VT596_smba = 0xE800
i2c-adapter i2c-0: adapter [SMBus Via Pro adapter at e800] registered
ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 19
ehci_hcd 0000:00:10.4: EHCI Host Controller
ehci_hcd 0000:00:10.4: new USB bus registered, assigned bus number 5
ehci_hcd 0000:00:10.4: irq 19, io mem 0xf4000000
ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 8 ports detected
ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 20
PCI: Setting latency timer of device 0000:00:11.5 to 64
codec_read: codec 0 is not valid [0xfe0000]
codec_read: codec 0 is not valid [0xfe0000]
codec_read: codec 0 is not valid [0xfe0000]
codec_read: codec 0 is not valid [0xfe0000]
Adding 522100k swap on /dev/hda3.  Priority:-1 extents:1 across:522100k
Filesystem "hda2": Disabling barriers, not supported by the underlying device
i2c-core: driver [eeprom] registered
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x50
i2c-adapter i2c-0: Transaction (pre): STS=42 CNT=14 CMD=00 ADD=a0 DAT=11,00
i2c-adapter i2c-0: SMBus busy (0x42). Resetting...
i2c-adapter i2c-0: Transaction (post): STS=00 CNT=00 CMD=00 ADD=a0 DAT=11,00
i2c-adapter i2c-0: Transaction (pre): STS=40 CNT=00 CMD=00 ADD=a0 DAT=11,00
i2c-adapter i2c-0: Transaction (post): STS=00 CNT=00 CMD=00 ADD=a0 DAT=11,00
i2c-adapter i2c-0: client [eeprom] registered with bus id 0-0050
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x51
i2c-adapter i2c-0: Transaction (pre): STS=40 CNT=00 CMD=00 ADD=a2 DAT=11,00
i2c-adapter i2c-0: Transaction (post): STS=00 CNT=00 CMD=00 ADD=a2 DAT=11,00
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x52
i2c-adapter i2c-0: Transaction (pre): STS=40 CNT=00 CMD=00 ADD=a4 DAT=11,00
i2c-adapter i2c-0: Transaction (post): STS=00 CNT=00 CMD=00 ADD=a4 DAT=11,00
i2c-adapter i2c-0: Transaction (pre): STS=40 CNT=00 CMD=00 ADD=a4 DAT=11,00
i2c-adapter i2c-0: Transaction (post): STS=00 CNT=00 CMD=00 ADD=a4 DAT=11,00
i2c-adapter i2c-0: client [eeprom] registered with bus id 0-0052
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x53
i2c-adapter i2c-0: Transaction (pre): STS=40 CNT=00 CMD=00 ADD=a6 DAT=11,00
i2c-adapter i2c-0: Transaction (post): STS=00 CNT=00 CMD=00 ADD=a6 DAT=11,00
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x54
i2c-adapter i2c-0: Transaction (pre): STS=40 CNT=00 CMD=00 ADD=a8 DAT=11,00
i2c-adapter i2c-0: Transaction (post): STS=00 CNT=00 CMD=00 ADD=a8 DAT=11,00
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x55
i2c-adapter i2c-0: Transaction (pre): STS=40 CNT=00 CMD=00 ADD=aa DAT=11,00
i2c-adapter i2c-0: Transaction (post): STS=00 CNT=00 CMD=00 ADD=aa DAT=11,00
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x56
i2c-adapter i2c-0: Transaction (pre): STS=40 CNT=00 CMD=00 ADD=ac DAT=11,00
i2c-adapter i2c-0: Transaction (post): STS=00 CNT=00 CMD=00 ADD=ac DAT=11,00
i2c-adapter i2c-0: found normal entry for adapter 0, addr 0x57
i2c-adapter i2c-0: Transaction (pre): STS=40 CNT=00 CMD=00 ADD=ae DAT=11,00
i2c-adapter i2c-0: Transaction (post): STS=00 CNT=00 CMD=00 ADD=ae DAT=11,00
i2c-adapter i2c-9191: ISA main adapter registered
it87: Found IT8712F chip at 0x290, revision 5
i2c-adapter i2c-9191: Driver it87-isa registered
i2c-adapter i2c-9191: client [it8712] registered with bus id 9191-0290
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
kjournald starting.  Commit interval 5 seconds
EXT3 FS on hda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Filesystem "md0": Disabling barriers, not supported by the underlying device
XFS mounting filesystem md0
Ending clean XFS mount for filesystem: md0
XFS mounting filesystem hdb2
Ending clean XFS mount for filesystem: hdb2
skge eth0: enabling interface
skge eth0: Link is up at 1000 Mbps, full duplex, flow control both

[I umount /dev/md0 here and then suspend2disk/resume]


swsusp: Basic memory bitmaps created
Stopping tasks ... done.
Shrinking memory... done (0 pages freed)
Freed 0 kbytes in 0.03 seconds (0.00 MB/s)
Suspending console(s)
sd 5:0:0:0: [sdf] Synchronizing SCSI cache
sd 4:0:0:0: [sde] Synchronizing SCSI cache
sd 3:0:0:0: [sdd] Synchronizing SCSI cache
sd 2:0:0:0: [sdc] Synchronizing SCSI cache
sd 1:0:0:0: [sdb] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Synchronizing SCSI cache
ACPI: PCI interrupt for device 0000:00:11.5 disabled
ACPI: PCI interrupt for device 0000:00:10.4 disabled
ACPI: PCI interrupt for device 0000:00:10.3 disabled
ACPI: PCI interrupt for device 0000:00:10.2 disabled
ACPI: PCI interrupt for device 0000:00:10.1 disabled
ACPI: PCI interrupt for device 0000:00:10.0 disabled
ACPI: PCI interrupt for device 0000:00:0f.0 disabled
skge eth0: disabling interface
swsusp: critical section:
swsusp: Need to copy 34443 pages
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
PCI: Setting latency timer of device 0000:00:01.0 to 64
PM: Writing back config space on device 0000:00:09.0 at offset 1 (was 2b00014, writing 2b00017)
skge eth0: enabling interface
Clocksource tsc unstable (delta = 4327744420428 ns)
ACPI: PCI Interrupt 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 17
PM: Writing back config space on device 0000:00:0f.0 at offset 1 (was 2900003, writing 2900007)
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 16
ACPI: PCI Interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 16
Time: acpi_pm clocksource has been installed.
ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 19
usb usb1: root hub lost power or was reset
ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 19
usb usb2: root hub lost power or was reset
ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 19
usb usb3: root hub lost power or was reset
ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 19
usb usb4: root hub lost power or was reset
ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 19
PM: Writing back config space on device 0000:00:10.4 at offset 3 (was 802008, writing 802010)
usb usb5: root hub lost power or was reset
ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 20
PCI: Setting latency timer of device 0000:00:11.5 to 64
pnp: Res cnt 3
pnp: res cnt 3
pnp: Encode io
pnp: Encode io
pnp: Encode irq
pnp: Failed to activate device 00:0a.
pnp: Res cnt 1
pnp: res cnt 1
pnp: Encode irq
pnp: Failed to activate device 00:0b.
sd 0:0:0:0: [sda] Starting disk
sd 1:0:0:0: [sdb] Starting disk
sd 2:0:0:0: [sdc] Starting disk
sd 3:0:0:0: [sdd] Starting disk
sd 4:0:0:0: [sde] Starting disk
ATA: abnormal status 0x7F on port 0x0001b007
ATA: abnormal status 0x7F on port 0x0001b007
ATA: abnormal status 0x7F on port 0x0001a407
ATA: abnormal status 0x7F on port 0x0001a407
ata5.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata5.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata5.00: configured for UDMA/133
sd 4:0:0:0: [sde] 490234752 512-byte hardware sectors (251000 MB)
sd 5:0:0:0: [sdf] Starting disk
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata6.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
ata6.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
ata6.00: configured for UDMA/133
sd 5:0:0:0: [sdf] 781422768 512-byte hardware sectors (400088 MB)
sd 5:0:0:0: [sdf] Write Protect is off
sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Restarting tasks ... done.
swsusp: Basic memory bitmaps freed
skge eth0: Link is up at 1000 Mbps, full duplex, flow control both


[following is the dmesg of 2.6.21 post resume]

Stopping tasks ... done.
Shrinking memory... done (0 pages freed)
Freed 0 kbytes in 0.03 seconds (0.00 MB/s)
Suspending console(s)
ACPI: PCI interrupt for device 0000:00:11.5 disabled
ACPI: PCI interrupt for device 0000:00:10.4 disabled
ACPI: PCI interrupt for device 0000:00:10.3 disabled
ACPI: PCI interrupt for device 0000:00:10.2 disabled
ACPI: PCI interrupt for device 0000:00:10.1 disabled
ACPI: PCI interrupt for device 0000:00:10.0 disabled
skge eth0: disabling interface
swsusp: critical section:
swsusp: Need to copy 34078 pages
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
PCI: Setting latency timer of device 0000:00:01.0 to 64
Clocksource tsc unstable (delta = 4337118157984 ns)
Time: acpi_pm clocksource has been installed.
PM: Writing back config space on device 0000:00:09.0 at offset 1 (was 2b00014, writing 2b00017)
skge eth0: enabling interface
ACPI: PCI Interrupt 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 17
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 16
ACPI: PCI Interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 16
ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 19
usb usb1: root hub lost power or was reset
ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 19
usb usb2: root hub lost power or was reset
ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 19
usb usb3: root hub lost power or was reset
ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 19
usb usb4: root hub lost power or was reset
ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 19
PM: Writing back config space on device 0000:00:10.4 at offset 3 (was 802008, writing 802010)
usb usb5: root hub lost power or was reset
ACPI: PCI Interrupt 0000:00:11.5[C] -> GSI 22 (level, low) -> IRQ 20
PCI: Setting latency timer of device 0000:00:11.5 to 64
pnp: Res cnt 3
pnp: res cnt 3
pnp: Encode io
pnp: Encode io
pnp: Encode irq
pnp: Failed to activate device 00:0a.
pnp: Res cnt 1
pnp: res cnt 1
pnp: Encode irq
pnp: Failed to activate device 00:0b.
logips2pp: Detected unknown logitech mouse model 1
Restarting tasks ... done.
skge eth0: Link is up at 1000 Mbps, full duplex, flow control both
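
One visible difference between the two resume logs above is the
"ATA: abnormal status 0x7F" lines, which show up only in the newer kernel's
log. If each dmesg were saved to a file, a small filter along these lines
could confirm that. This is only a sketch: the helper name count_abnormal
and the idea of saving the logs to files are assumptions for illustration,
not something used in this thread.

```shell
#!/bin/sh
# Count "ATA: abnormal status" lines in a dmesg dump read from stdin.
# count_abnormal is a hypothetical helper name.
count_abnormal() {
    # grep -c prints the count but exits non-zero when the count is 0,
    # so mask the exit status for callers running under "set -e".
    grep -c 'ATA: abnormal status' || true
}

# Example with two lines in the format shown in the log above.
printf '%s\n' \
    'ATA: abnormal status 0x7F on port 0x0001b007' \
    'ata5.00: configured for UDMA/133' | count_abnormal
```

With saved logs it would be run as "count_abnormal < saved-dmesg-file" for
each kernel, and the two counts compared.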



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-06 10:39               ` 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6) David Greaves
@ 2007-06-07  5:53                 ` Tejun Heo
  2007-06-07 10:30                   ` David Greaves
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2007-06-07  5:53 UTC (permalink / raw)
  To: David Greaves
  Cc: Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

Hello,

David Greaves wrote:
> Just to be clear. This problem is where my system won't resume after s2d
> unless I umount my xfs over raid6 filesystem.

This is really weird.  I don't see how xfs mount can affect this at all.

[--snip--]
> So now this compiles but it does cause the problem:
> 
> umount /huge
> echo platform > /sys/power/disk
> echo disk > /sys/power/state
> # resumes fine
> 
> mount /huge
> echo platform > /sys/power/disk
> echo disk > /sys/power/state
> # won't resume

How hard does the machine freeze?  Can you use sysrq?  If so, please
dump sysrq-t.

>> Behavior difference introduced by the
>> reimplementation is serialization of resume sequence, so it takes more
>> time.  My test machine had problems resuming if resume took too long
>> even with the previous implementation.  It didn't matter whether the
>> long resuming sequence is caused by too many controllers or explicit
>> ssleep().  If time needed for resume sequence is over certain threshold,
>> machine hangs while resuming.  I thought it was a BIOS glitch and didn't
>> dig into it but you might be seeing the same issue.
> given the mount/umount thing this sounds unlikely... but what do I know?

No I don't think this is the same problem either.  The problem I
described happened during resume from s2ram.

> resume does throw up:
> ATA: abnormal status 0x7F on port 0x0001b007
> ATA: abnormal status 0x7F on port 0x0001b007
> ATA: abnormal status 0x7F on port 0x0001a407
> ATA: abnormal status 0x7F on port 0x0001a407
> 
> which I've not noticed before... oh, alright, I'll check...
> reboots to 2.6.21, suspend, resume...
> nope, not output on resume in 2.6.21

The messages don't really matter.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07  5:53                 ` Tejun Heo
@ 2007-06-07 10:30                   ` David Greaves
  2007-06-07 11:07                     ` David Chinner
                                       ` (2 more replies)
  0 siblings, 3 replies; 48+ messages in thread
From: David Greaves @ 2007-06-07 10:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

Tejun Heo wrote:
> Hello,
> 
> David Greaves wrote:
>> Just to be clear. This problem is where my system won't resume after s2d
>> unless I umount my xfs over raid6 filesystem.
> 
> This is really weird.  I don't see how xfs mount can affect this at all.
Indeed.
It does :)

> How hard does the machine freeze?  Can you use sysrq?  If so, please
> dump sysrq-t.
I suspect there is a problem writing to the consoles...

I recompiled (rc4+patch) with sysrq support, suspended, resumed and tried 
sysrq-t but got no output.

I *can* change VTs and see the various login prompts, bitmap messages and the 
console messages. Caps/Num lock lights work.

Fearing incompetence I tried sysrq-s sysrq-u sysrq-b and got a reboot so sysrq 
is OK.

Any suggestions on how to see more? Or what to try next?

Any other kernel debug options to set?

David
PS Back in a couple of hours...



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 10:30                   ` David Greaves
@ 2007-06-07 11:07                     ` David Chinner
  2007-06-07 13:59                       ` David Greaves
  2007-06-07 13:45                     ` Duane Griffin
  2007-06-07 20:12                     ` Pavel Machek
  2 siblings, 1 reply; 48+ messages in thread
From: David Chinner @ 2007-06-07 11:07 UTC (permalink / raw)
  To: David Greaves
  Cc: Tejun Heo, Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

On Thu, Jun 07, 2007 at 11:30:05AM +0100, David Greaves wrote:
> Tejun Heo wrote:
> >Hello,
> >
> >David Greaves wrote:
> >>Just to be clear. This problem is where my system won't resume after s2d
> >>unless I umount my xfs over raid6 filesystem.
> >
> >This is really weird.  I don't see how xfs mount can affect this at all.
> Indeed.
> It does :)

Ok, so let's determine if it really is XFS.  Does the lockup happen with a
different filesystem on the md device? Or if you can't test that, does
any other XFS filesystem you have show the same problem?

If it is xfs that is causing the problem, what happens if you
remount read-only instead of unmounting before shutting down?
What about freezing the filesystem?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 10:30                   ` David Greaves
  2007-06-07 11:07                     ` David Chinner
@ 2007-06-07 13:45                     ` Duane Griffin
  2007-06-07 14:00                       ` David Greaves
  2007-06-07 20:12                     ` Pavel Machek
  2 siblings, 1 reply; 48+ messages in thread
From: Duane Griffin @ 2007-06-07 13:45 UTC (permalink / raw)
  To: David Greaves
  Cc: Tejun Heo, Linus Torvalds, Rafael J. Wysocki, xfs, linux-kernel,
	linux-pm, Neil Brown

On 07/06/07, David Greaves <david@dgreaves.com> wrote:
> > How hard does the machine freeze?  Can you use sysrq?  If so, please
> > dump sysrq-t.
> I suspect there is a problem writing to the consoles...
>
> I recompiled (rc4+patch) with sysrq support, suspended, resumed and tried
> sysrq-t but got no output.
>
> I *can* change VTs and see the various login prompts, bitmap messages and the
> console messages. Caps/Num lock lights work.
>
> Fearing incompetence I tried sysrq-s sysrq-u sysrq-b and got a reboot so sysrq
> is OK.

Try sysrq-9 before the sysrq-t. Probably the messages are not being
printed to console with your default output level.
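
Since the keyboard is the only way in once the machine has hung, the console
loglevel could also be checked and raised from a shell before hibernating.
A rough sketch, assuming /proc/sys/kernel/printk is available and that its
first field is the current console loglevel; the helper names
console_loglevel and would_print are made up for illustration:

```shell
#!/bin/sh
# Print the console loglevel: the first field of a
# /proc/sys/kernel/printk-style line (e.g. "4 4 1 7").
console_loglevel() {
    # $1: contents of /proc/sys/kernel/printk
    set -- $1          # split on whitespace (spaces or tabs)
    echo "$1"
}

# Succeed if a message of the given priority would reach the console.
# A message is shown only when its priority number is lower than the
# console loglevel.
would_print() {
    # $1: message priority (0-7), $2: console loglevel
    [ "$1" -lt "$2" ]
}

if [ -r /proc/sys/kernel/printk ]; then
    level=$(console_loglevel "$(cat /proc/sys/kernel/printk)")
    echo "console loglevel: $level"
fi

# As root, before hibernating, the level could be raised with either of:
#   echo 9 > /proc/sysrq-trigger    # same effect as pressing sysrq-9
#   dmesg -n 8
```

If the reported level is already high, the missing sysrq-t output is
probably not a loglevel problem.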

Cheers,
Duane Griffin.

-- 
"I never could learn to drink that blood and call it wine" - Bob Dylan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 11:07                     ` David Chinner
@ 2007-06-07 13:59                       ` David Greaves
  2007-06-07 22:28                         ` David Chinner
  2007-06-10 18:43                         ` Pavel Machek
  0 siblings, 2 replies; 48+ messages in thread
From: David Greaves @ 2007-06-07 13:59 UTC (permalink / raw)
  To: David Chinner
  Cc: Tejun Heo, Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

David Chinner wrote:
> On Thu, Jun 07, 2007 at 11:30:05AM +0100, David Greaves wrote:
>> Tejun Heo wrote:
>>> Hello,
>>>
>>> David Greaves wrote:
>>>> Just to be clear. This problem is where my system won't resume after s2d
>>>> unless I umount my xfs over raid6 filesystem.
>>> This is really weird.  I don't see how xfs mount can affect this at all.
>> Indeed.
>> It does :)
> 
> Ok, so let's determine if it really is XFS.
Seems like a good next step...

> Does the lockup happen with a
> different filesystem on the md device? Or if you can't test that, does
> any other XFS filesystem you have show the same problem?
It's a rather full 1.2TB raid6 array - can't reformat it - sorry :)
I only noticed the problem when I umounted the fs during tests to prevent 
corruption - and it worked. I'm doing a sync each time it hibernates (see below) 
and a couple of paranoia xfs_repairs haven't shown any problems.

I do have another xfs filesystem on /dev/hdb2 (mentioned when I noticed the 
md/XFS correlation). It doesn't seem to have/cause any problems.

> If it is xfs that is causing the problem, what happens if you
> remount read-only instead of unmounting before shutting down?
Yes, I'm happy to try these tests.
nb, the hibernate script is:
ethtool -s eth0 wol g
sync
echo platform > /sys/power/disk
echo disk > /sys/power/state

So there has always been a sync before any hibernate.
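
Given that the thread started with wake-on-LAN not surviving hibernation, it
may be worth confirming that the "ethtool -s eth0 wol g" line took effect
before the state is written. A minimal sketch, assuming the "Wake-on:" line
format that "ethtool eth0" prints; wol_setting is a hypothetical helper name:

```shell
#!/bin/sh
# Extract the current Wake-on setting from "ethtool <iface>" output on stdin.
# The anchored pattern skips the "Supports Wake-on:" line.
wol_setting() {
    awk '/^[[:space:]]*Wake-on:/ { print $2 }'
}

# Example using two lines in the format ethtool prints.
printf '        Supports Wake-on: pg\n        Wake-on: g\n' | wol_setting
```

In the hibernate script this would be used as "ethtool eth0 | wol_setting",
checking for "g" after the "ethtool -s" call.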


cu:~# mount -oremount,ro /huge
cu:~# mount
/dev/hda2 on / type xfs (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
usbfs on /proc/bus/usb type usbfs (rw)
tmpfs on /dev/shm type tmpfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
nfsd on /proc/fs/nfsd type nfsd (rw)
/dev/hda1 on /boot type ext3 (rw)
/dev/md0 on /huge type xfs (ro)
/dev/hdb2 on /scratch type xfs (rw)
tmpfs on /dev type tmpfs (rw,size=10M,mode=0755)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
cu:(pid2862,port1022) on /net type nfs (intr,rw,port=1022,toplvl,map=/usr/share/am-utils/amd.net,noac)
elm:/space on /amd/elm/root/space type nfs (rw,vers=3,proto=tcp)
elm:/space-backup on /amd/elm/root/space-backup type nfs (rw,vers=3,proto=tcp)
elm:/usr/src on /amd/elm/root/usr/src type nfs (rw,vers=3,proto=tcp)
cu:~# /usr/net/bin/hibernate
[this works and resumes]

cu:~# mount -oremount,rw /huge
cu:~# /usr/net/bin/hibernate
[this works and resumes too !]

cu:~# touch /huge/tst
cu:~# /usr/net/bin/hibernate
[but this doesn't even hibernate]




> What about freezing the filesystem?
cu:~# xfs_freeze -f /huge
cu:~# /usr/net/bin/hibernate
[but this doesn't even hibernate - same as the 'touch']

Nb the screen looks like this:
http://www.dgreaves.com/pub/2.6.21-rc4-ptched-suspend-failure.jpg
whether it hangs on suspend or resume.

So I wouldn't say it *is* XFS at fault - but there certainly seems to be an 
interaction...
At least it's easily reproducible :) Shame about the sysrq

I can think of other permutations of freeze/ro/writing tests but I'm just 
thrashing really. Happy for you to tell me what to try next ...
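
To keep the permutations straight, the preparation step for each case tried
so far could be written down as a small dry-run helper. This is only a
sketch: prep_cmd and its mode names are invented for illustration, and it
prints the commands rather than running them.

```shell
#!/bin/sh
# Print the preparation command for one test case.
# $1: mode (umount, ro, freeze or none); $2: mount point, default /huge.
# prep_cmd and the mode names are hypothetical.
prep_cmd() {
    mode=$1
    mnt=${2:-/huge}
    case "$mode" in
        umount) echo "umount $mnt" ;;
        ro)     echo "mount -oremount,ro $mnt" ;;
        freeze) echo "xfs_freeze -f $mnt" ;;
        none)   echo ":" ;;   # leave the filesystem mounted read-write
        *)      echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# Dry run: list what each case would do before the hibernate script is called.
for mode in umount ro freeze none; do
    printf '%-7s %s\n' "$mode" "$(prep_cmd "$mode")"
done
```

Each printed command would be run by hand, followed by
/usr/net/bin/hibernate, noting whether the machine suspends and resumes.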


David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 13:45                     ` Duane Griffin
@ 2007-06-07 14:00                       ` David Greaves
  2007-06-07 14:05                         ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: David Greaves @ 2007-06-07 14:00 UTC (permalink / raw)
  To: Duane Griffin
  Cc: Tejun Heo, Linus Torvalds, Rafael J. Wysocki, xfs, linux-kernel,
	linux-pm, Neil Brown

Duane Griffin wrote:
> On 07/06/07, David Greaves <david@dgreaves.com> wrote:
>> > How hard does the machine freeze?  Can you use sysrq?  If so, please
>> > dump sysrq-t.
>> I suspect there is a problem writing to the consoles...
>>
>> I recompiled (rc4+patch) with sysrq support, suspended, resumed and tried
>> sysrq-t but got no output.
>>
>> I *can* change VTs and see the various login prompts, bitmap messages 
>> and the
>> console messages. Caps/Num lock lights work.
>>
>> Fearing incompetence I tried sysrq-s sysrq-u sysrq-b and got a reboot 
>> so sysrq
>> is OK.
> 
> Try sysrq-9 before the sysrq-t. Probably the messages are not being
> printed to console with your default output level.

Good idea :)
Didn't work :(

Cheers

David


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 14:00                       ` David Greaves
@ 2007-06-07 14:05                         ` Tejun Heo
  2007-06-07 14:36                           ` Mark Lord
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2007-06-07 14:05 UTC (permalink / raw)
  To: David Greaves
  Cc: Duane Griffin, Linus Torvalds, Rafael J. Wysocki, xfs,
	linux-kernel, linux-pm, Neil Brown

David Greaves wrote:
> Duane Griffin wrote:
>> On 07/06/07, David Greaves <david@dgreaves.com> wrote:
>>> > How hard does the machine freeze?  Can you use sysrq?  If so, please
>>> > dump sysrq-t.
>>> I suspect there is a problem writing to the consoles...
>>>
>>> I recompiled (rc4+patch) with sysrq support, suspended, resumed and
>>> tried
>>> sysrq-t but got no output.
>>>
>>> I *can* change VTs and see the various login prompts, bitmap messages
>>> and the
>>> console messages. Caps/Num lock lights work.
>>>
>>> Fearing incompetence I tried sysrq-s sysrq-u sysrq-b and got a reboot
>>> so sysrq
>>> is OK.
>>
>> Try sysrq-9 before the sysrq-t. Probably the messages are not being
>> printed to console with your default output level.
> 
> Good idea :)
> Didn't work :(

Can you set up a serial console and/or netconsole (not sure whether this
would work, though)?
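
For reference, a minimal sketch of what a netconsole setup could look like.
Every address, port, interface name and MAC below is a placeholder, and the
helper name is made up for illustration. The parameter format follows the
kernel's netconsole documentation; the same string is passed as
netconsole=... on the kernel command line when the driver is built in.

```shell
#!/bin/sh
# Sketch only: all addresses, ports and the interface name are placeholders.
# Parameter format (kernel netconsole documentation):
#   netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
netconsole_param() {
    # $1 source port, $2 source IP, $3 device,
    # $4 target port, $5 target IP, $6 target MAC
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Print (do not run) the modprobe line for the machine under test.
echo "modprobe netconsole $(netconsole_param 6665 192.168.0.10 eth0 6666 192.168.0.20 00:11:22:33:44:55)"

# On the receiving host a UDP listener collects the messages, for example
# 'nc -u -l -p 6666' with traditional netcat (option syntax varies by variant).
```

Whether the network driver stays usable across suspend is a separate
question, so this only helps for messages emitted before the device is
suspended.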

-- 
tejun


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 14:05                         ` Tejun Heo
@ 2007-06-07 14:36                           ` Mark Lord
  2007-06-07 15:20                             ` David Greaves
  0 siblings, 1 reply; 48+ messages in thread
From: Mark Lord @ 2007-06-07 14:36 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David Greaves, Duane Griffin, Linus Torvalds, Rafael J. Wysocki,
	xfs, linux-kernel, linux-pm, Neil Brown

Tejun Heo wrote:
>
> Can you setup serial console and/or netconsole (not sure whether this
> would work tho)?

Since he has good console output already, capturable by digicam,
I think a better approach might be to provide a patch with extra instrumentation..
You know.. progress messages and the like, so we can see at what step
things stop working.  Or would that not help ?

David, does scrollback work on your dead console?

Cheers


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 14:36                           ` Mark Lord
@ 2007-06-07 15:20                             ` David Greaves
  2007-06-07 16:58                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 48+ messages in thread
From: David Greaves @ 2007-06-07 15:20 UTC (permalink / raw)
  To: Mark Lord
  Cc: Tejun Heo, Duane Griffin, Linus Torvalds, Rafael J. Wysocki, xfs,
	linux-kernel, linux-pm, Neil Brown

Mark Lord wrote:
> Tejun Heo wrote:
>>
>> Can you setup serial console and/or netconsole (not sure whether this
>> would work tho)?
> 
> Since he has good console output already, capturable by digicam,
> I think a better approach might be to provide a patch with extra 
> instrumentation..
> You know.. progress messages and the like, so we can see at what step
> things stop working.  Or would that not help ?
> 
> David, does scrollback work on your dead console?

hmmmm, scrollback doesn't currently _do_ anything.

But the messages didn't scroll there, they just appear (as the memory is 
restored I assume). The same messages appear during the fail-to-suspend case too.

Linus said at one point:
 > Ok, it wasn't a hidden oops. The DISABLE_CONSOLE_SUSPEND=y thing sometimes
 > shows oopses that are otherwise hidden, but at other times it just causes
 > more problems (hard hangs when trying to display something on a device
 > that is suspended, or behind a bridge that got suspended).

 > In your case, the screen output just shows normal resume output, and it
 > apparently just hung for some unknown reason. It *may* be worth trying to
 > do a SysRQ + 't' thing to see what tasks are running (or rather, not
 > running), but since you won't be able to capture it, it's probably not
 > going to be useful.

So I've since removed DISABLE_CONSOLE_SUSPEND=y
Should I put it back?

I was actually doing the netconsole anyway - but skge is currently a module - 
I've avoided making any changes to the config during all these tests but what 
the heck...

And wouldn't you know it.
Get netconsole working (ie new kernel with skge builtin) and I get the hang on 
suspend. Here's the netconsole output...

swsusp: Basic memory bitmaps created
Stopping tasks ... done.
Shrinking memory... done (0 pages freed)
Freed 0 kbytes in 0.03 seconds (0.00 MB/s)
Suspending console(s)


Given that moving something from module to builtin changes the behaviour I 
thought I'd bring these warnings up again (Andrew or Alan mentioned similar 
warnings being problems in another thread...)
Now, I have mentioned these before but there's been a lot going on so here you go:

   MODPOST vmlinux
WARNING: arch/i386/kernel/built-in.o(.text+0x968f): Section mismatch: reference to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9781): Section mismatch: reference to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9786): Section mismatch: reference to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0xa25c): Section mismatch: reference to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa303): Section mismatch: reference to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa31b): Section mismatch: reference to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa344): Section mismatch: reference to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.exit.text+0x19): Section mismatch: reference to .init.text: (between 'cache_remove_dev' and 'powernow_k6_exit')
WARNING: arch/i386/kernel/built-in.o(.data+0x2160): Section mismatch: reference to .init.text: (between 'thermal_throttle_cpu_notifier' and 'mce_work')
WARNING: kernel/built-in.o(.text+0x14502): Section mismatch: reference to .init.text: (between 'kthreadd' and 'init_waitqueue_head')


David
PS Gotta go - back in a couple of hours - let me know if there are any more 
tests to try.


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 15:20                             ` David Greaves
@ 2007-06-07 16:58                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 48+ messages in thread
From: Rafael J. Wysocki @ 2007-06-07 16:58 UTC (permalink / raw)
  To: David Greaves
  Cc: Mark Lord, Tejun Heo, Duane Griffin, Linus Torvalds, xfs,
	linux-kernel, linux-pm, Neil Brown

On Thursday, 7 June 2007 17:20, David Greaves wrote:
> Mark Lord wrote:
> > Tejun Heo wrote:
> >>
> >> Can you setup serial console and/or netconsole (not sure whether this
> >> would work tho)?
> > 
> > Since he has good console output already, capturable by digicam,
> > I think a better approach might be to provide a patch with extra 
> > instrumentation..
> > You know.. progress messages and the like, so we can see at what step
> > things stop working.  Or would that not help ?
> > 
> > David, does scrollback work on your dead console?
> 
> hmmmm, scrollback doesn't currently _do_ anything.
> 
> But the messages didn't scroll there, they just appear (as the memory is 
> restored I assume). The same messages appear during the fail-to-suspend case too.
> 
> Linus said at one point:
>  > Ok, it wasn't a hidden oops. The DISABLE_CONSOLE_SUSPEND=y thing sometimes
>  > shows oopses that are otherwise hidden, but at other times it just causes
>  > more problems (hard hangs when trying to display something on a device
>  > that is suspended, or behind a bridge that got suspended).
> 
>  > In your case, the screen output just shows normal resume output, and it
>  > apparently just hung for some unknown reason. It *may* be worth trying to
>  > do a SysRQ + 't' thing to see what tasks are running (or rather, not
>  > running), but since you won't be able to capture it, it's probably not
>  > going to be useful.
> 
> So I've since removed DISABLE_CONSOLE_SUSPEND=y
> Should I put it back?

I would do that.

Apart from this, your observations don't directly imply that XFS is to blame
here.  It might be involved somehow, but it's also possible that RAID6 alone
would suffice to trigger the problem.

Greetings,
Rafael


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 10:30                   ` David Greaves
  2007-06-07 11:07                     ` David Chinner
  2007-06-07 13:45                     ` Duane Griffin
@ 2007-06-07 20:12                     ` Pavel Machek
  2 siblings, 0 replies; 48+ messages in thread
From: Pavel Machek @ 2007-06-07 20:12 UTC (permalink / raw)
  To: David Greaves
  Cc: Tejun Heo, Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

Hi!

> >How hard does the machine freeze?  Can you use sysrq?  
> >If so, please
> >dump sysrq-t.
> I suspect there is a problem writing to the consoles...
> 
> I recompiled (rc4+patch) with sysrq support, suspended, 
> resumed and tried sysrq-t but got no output.
> 
> I *can* change VTs and see the various login prompts, 
> bitmap messages and the console messages. Caps/Num lock 
> lights work.
> 
> Fearing incompetence I tried sysrq-s sysrq-u sysrq-b and 
> got a reboot so sysrq is OK.

Increase console loglevel by killing klogd/sysrq-9?
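
As a rough sketch of the loglevel part: SysRq followed by a digit sets the
console loglevel, and the current value is the first field of
/proc/sys/kernel/printk. The helper name below is made up for illustration,
and the commands in the comments need root.

```shell
#!/bin/sh
# Hypothetical helper: print the console loglevel, i.e. the first of the
# four fields in /proc/sys/kernel/printk (or in a file of the same format).
console_loglevel() {
    read -r level _rest < "${1:-/proc/sys/kernel/printk}"
    echo "$level"
}

# Before suspending, the level could be raised with either of:
#   echo 9 > /proc/sysrq-trigger    # same effect as pressing SysRq-9
#   dmesg -n 8                      # util-linux dmesg: log everything to the console
# and checked afterwards with:
#   console_loglevel
```

If the reported level is already high and SysRq-t still prints nothing,
the loglevel is probably not what is hiding the output.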

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 13:59                       ` David Greaves
@ 2007-06-07 22:28                         ` David Chinner
  2007-06-08 19:09                           ` David Greaves
  2007-06-12 12:31                           ` David Greaves
  2007-06-10 18:43                         ` Pavel Machek
  1 sibling, 2 replies; 48+ messages in thread
From: David Chinner @ 2007-06-07 22:28 UTC (permalink / raw)
  To: David Greaves
  Cc: David Chinner, Tejun Heo, Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

On Thu, Jun 07, 2007 at 02:59:58PM +0100, David Greaves wrote:
> David Chinner wrote:
> >On Thu, Jun 07, 2007 at 11:30:05AM +0100, David Greaves wrote:
> >>Tejun Heo wrote:
> >>>Hello,
> >>>
> >>>David Greaves wrote:
> >>>>Just to be clear. This problem is where my system won't resume after s2d
> >>>>unless I umount my xfs over raid6 filesystem.
> >>>This is really weird.  I don't see how xfs mount can affect this at all.
> >>Indeed.
> >>It does :)
> >
> >Ok, so lets determine if it really is XFS.
> Seems like a good next step...
> 
> >Does the lockup happen with a
> >different filesystem on the md device? Or if you can't test that, does
> >any other XFS filesystem you have show the same problem?
> It's a rather full 1.2Tb raid6 array - can't reformat it - sorry :)

I suspected as much :/

> I only noticed the problem when I umounted the fs during tests to prevent 
> corruption - and it worked. I'm doing a sync each time it hibernates (see 
> below) and a couple of paranoia xfs_repairs haven't shown any problems.

sync just guarantees that metadata changes are logged and data is
on disk - it doesn't stop the filesystem from doing anything after
the sync...

> I do have another xfs filesystem on /dev/hdb2 (mentioned when I noticed the 
> md/XFS correlation). It doesn't seem to have/cause any problems.

Ok, so it's not an obvious XFS problem...

> >If it is xfs that is causing the problem, what happens if you
> >remount read-only instead of unmounting before shutting down?
> Yes, I'm happy to try these tests.
> nb, the hibernate script is:
> ethtool -s eth0 wol g
> sync
> echo platform > /sys/power/disk
> echo disk > /sys/power/state
> 
> So there has always been a sync before any hibernate.
> 
> 
> cu:~# mount -oremount,ro /huge
.....
> [this works and resumes]

Ok.

> cu:~# mount -oremount,rw /huge
> cu:~# /usr/net/bin/hibernate
> [this works and resumes too !]

Interesting. That means something in the generic remount code
is affecting this.

> cu:~# touch /huge/tst
> cu:~# /usr/net/bin/hibernate
> [but this doesn't even hibernate]

Ok, so a clean inode is sufficient to prevent hibernate from working.

So, what's different between a sync and a remount?

do_remount_sb() does:

    599         shrink_dcache_sb(sb);
    600         fsync_super(sb);

of which a sync does neither. sync does what fsync_super() does in a
different sort of way, but does not call sync_blockdev() on each
block device. It looks like those are the two main differences between
sync and remount - remount trims the dentry cache and syncs the blockdev,
sync doesn't.

> > What about freezing the filesystem?
> cu:~# xfs_freeze -f /huge
> cu:~# /usr/net/bin/hibernate
> [but this doesn't even hibernate - same as the 'touch']

I suspect that the frozen filesystem might cause other problems
in the hibernate process. However, while a freeze calls sync_blockdev()
it does not trim the dentry cache.....

So, rather than a remount before hibernate, let's see if we can
remove the dentries some other way to determine if removing excess
dentries/inodes from the caches makes a difference. Can you do:

# touch /huge/foo
# sync
# echo 1 > /proc/sys/vm/drop_caches
# hibernate

# touch /huge/bar
# sync
# echo 2 > /proc/sys/vm/drop_caches
# hibernate

# touch /huge/baz
# sync
# echo 3 > /proc/sys/vm/drop_caches
# hibernate

And see if any of those survive the suspend/resume?
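
The three rounds above could be wrapped in a small script along these
lines. This is only a sketch: the run/test_round helpers and the DRY_RUN
switch are made up for illustration, the hibernate path is the one used
earlier in the thread, and by default it just prints the commands rather
than executing them.

```shell
#!/bin/sh
# One test round: dirty the filesystem, sync, drop caches, hibernate.
# drop_caches values: 1 = page cache, 2 = dentries and inodes, 3 = both.
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 runs it (root).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

test_round() {
    mode=$1    # 1, 2 or 3
    file=$2    # scratch file to touch, e.g. /huge/foo
    run touch "$file"
    run sync
    # The redirection has to happen inside the executed command, hence sh -c.
    run sh -c "echo $mode > /proc/sys/vm/drop_caches"
    run /usr/net/bin/hibernate
}

test_round 1 /huge/foo
test_round 2 /huge/bar
test_round 3 /huge/baz
```

Running each round from a clean boot instead of back to back keeps an
earlier round from influencing the next one.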

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 22:28                         ` David Chinner
@ 2007-06-08 19:09                           ` David Greaves
  2007-06-12 18:43                             ` Linus Torvalds
  2007-06-12 12:31                           ` David Greaves
  1 sibling, 1 reply; 48+ messages in thread
From: David Greaves @ 2007-06-08 19:09 UTC (permalink / raw)
  To: David Chinner
  Cc: Tejun Heo, Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

I had this as a PS, then I thought, we could all be wasting our time...

I don't like these "Section mismatch" warnings but that's because I'm paranoid 
rather than because I know what they mean. I'll be happier when someone says 
"That's OK, I know about them, they're not the problem"

WARNING: arch/i386/kernel/built-in.o(.text+0x968f): Section mismatch: reference to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9781): Section mismatch: reference to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9786): Section mismatch: reference to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0xa25c): Section mismatch: reference to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa303): Section mismatch: reference to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa31b): Section mismatch: reference to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa344): Section mismatch: reference to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.exit.text+0x19): Section mismatch: reference to .init.text: (between 'cache_remove_dev' and 'powernow_k6_exit')
WARNING: arch/i386/kernel/built-in.o(.data+0x2160): Section mismatch: reference to .init.text: (between 'thermal_throttle_cpu_notifier' and 'mce_work')
WARNING: kernel/built-in.o(.text+0x14502): Section mismatch: reference to .init.text: (between 'kthreadd' and 'init_waitqueue_head')


Andrew Morton said a couple of weeks ago:
 > Could the people who write these bugs, please, like, fix them?
 > It's not trivial noise.  These things lead to kernel crashes.

Anyhow...

David Chinner wrote:
> sync just guarantees that metadata changes are logged and data is
> on disk - it doesn't stop the filesystem from doing anything after
> the sync...
No, but there are no apps accessing the filesystem. It's just available for NFS 
serving. Seems safer before potentially hanging the machine?


Also I made these changes to the kernel:
cu:/boot# diff config-2.6.22-rc4-TejuTst-dbg3-dirty config-2.6.22-rc4-TejuTst-dbg1-dirty
3,4c3,4
< # Linux kernel version: 2.6.22-rc4-TejuTst-dbg3
< # Thu Jun  7 20:00:34 2007
---
 > # Linux kernel version: 2.6.22-rc4-TejuTst3
 > # Thu Jun  7 10:59:21 2007
242,244c242
< CONFIG_PM_DEBUG=y
< CONFIG_DISABLE_CONSOLE_SUSPEND=y
< # CONFIG_PM_TRACE is not set
---
 > # CONFIG_PM_DEBUG is not set

positive: I can now get sysrq-t :)
negative: if I build skge into the kernel the behaviour changes so I can't run 
netconsole

Just to be sure I tested and this kernel suspends/restores with /huge unmounted.
It also hangs without an umount so the behaviour is the same.

> Ok, so a clean inode is sufficient to prevent hibernate from working.
> 
> So, what's different between a sync and a remount?
> 
> do_remount_sb() does:
> 
>     599         shrink_dcache_sb(sb);
>     600         fsync_super(sb);
> 
> of which a sync does neither. sync does what fsync_super() does in a
> different sort of way, but does not call sync_blockdev() on each
> block device. It looks like those are the two main differences between
> sync and remount - remount trims the dentry cache and syncs the blockdev,
> sync doesn't.
> 
>>> What about freezing the filesystem?
>> cu:~# xfs_freeze -f /huge
>> cu:~# /usr/net/bin/hibernate
>> [but this doesn't even hibernate - same as the 'touch']
> 
> I suspect that the frozen filesystem might cause other problems
> in the hibernate process. However, while a freeze calls sync_blockdev()
> it does not trim the dentry cache.....
> 
> So, rather than a remount before hibernate, let's see if we can
> remove the dentries some other way to determine if removing excess
> dentries/inodes from the caches makes a difference. Can you do:
> 
> # touch /huge/foo
> # sync
> # echo 1 > /proc/sys/vm/drop_caches
> # hibernate
success
> 
> # touch /huge/bar
> # sync
> # echo 2 > /proc/sys/vm/drop_caches
> # hibernate
success
> 
> # touch /huge/baz
> # sync
> # echo 3 > /proc/sys/vm/drop_caches
> # hibernate
success

So I added
# touch /huge/bork
# sync
# hibernate

And it still succeeded - sigh.

So I thought a bit and did:
rm /huge/b* /huge/foo

 > Clean boot
 > # touch /huge/bar
 > # sync
 > # echo 2 > /proc/sys/vm/drop_caches
 > # hibernate
hangs on suspend (sysrq-b doesn't work)

 > Clean boot
 > # touch /huge/baz
 > # sync
 > # echo 3 > /proc/sys/vm/drop_caches
 > # hibernate
hangs on suspend (sysrq-b doesn't work)

So I rebooted and hibernated to make sure I'm not having random behaviour - yep, 
hang on resume (as per usual).

Now I wonder if any other mounts have an effect...
Reboot and umount the /dev/hdb2 xfs fs - hang on hibernate.


I'm confused. I'm going to order Chinese takeaway and then find a serial cable...

David
PS 2.6.21.1 works fine.





* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 13:59                       ` David Greaves
  2007-06-07 22:28                         ` David Chinner
@ 2007-06-10 18:43                         ` Pavel Machek
  2007-06-12 18:00                           ` David Greaves
  1 sibling, 1 reply; 48+ messages in thread
From: Pavel Machek @ 2007-06-10 18:43 UTC (permalink / raw)
  To: David Greaves
  Cc: David Chinner, Tejun Heo, Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

Hi!

> >On Thu, Jun 07, 2007 at 11:30:05AM +0100, David Greaves 
> >wrote:
> >>Tejun Heo wrote:
> >>>Hello,
> >>>
> >>>David Greaves wrote:
> >>>>Just to be clear. This problem is where my system 
> >>>>won't resume after s2d
> >>>>unless I umount my xfs over raid6 filesystem.
> >>>This is really weird.  I don't see how xfs mount can 
> >>>affect this at all.
> >>Indeed.
> >>It does :)
> >
> >Ok, so lets determine if it really is XFS.
> Seems like a good next step...
> 
> >Does the lockup happen with a
> >different filesystem on the md device? Or if you can't 
> >test that, does
> >any other XFS filesystem you have show the same problem?
> It's a rather full 1.2Tb raid6 array - can't reformat it 
> - sorry :)
> I only noticed the problem when I umounted the fs during 
> tests to prevent corruption - and it worked. I'm doing a 
> sync each time it hibernates (see below) and a couple of 
> paranoia xfs_repairs haven't shown any problems.
> 
> I do have another xfs filesystem on /dev/hdb2 (mentioned 
> when I noticed the md/XFS correlation). It doesn't seem 
> to have/cause any problems.
> 
> >If it is xfs that is causing the problem, what happens 
> >if you
> >remount read-only instead of unmounting before shutting 
> >down?
> Yes, I'm happy to try these tests.
> nb, the hibernate script is:
> ethtool -s eth0 wol g
> sync
> echo platform > /sys/power/disk
> echo disk > /sys/power/state
> 
> So there has always been a sync before any hibernate.
> 
> 
> cu:~# mount -oremount,ro /huge
> cu:~# mount
> /dev/hda2 on / type xfs (rw)
> proc on /proc type proc (rw)
> sysfs on /sys type sysfs (rw)
> usbfs on /proc/bus/usb type usbfs (rw)
> tmpfs on /dev/shm type tmpfs (rw)
> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> nfsd on /proc/fs/nfsd type nfsd (rw)
> /dev/hda1 on /boot type ext3 (rw)
> /dev/md0 on /huge type xfs (ro)
> /dev/hdb2 on /scratch type xfs (rw)
> tmpfs on /dev type tmpfs (rw,size=10M,mode=0755)
> rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs 
> (rw)
> cu:(pid2862,port1022) on /net type nfs 
> (intr,rw,port=1022,toplvl,map=/usr/share/am-utils/amd.net,noac)
> elm:/space on /amd/elm/root/space type nfs 
> (rw,vers=3,proto=tcp)
> elm:/space-backup on /amd/elm/root/space-backup type nfs 
> (rw,vers=3,proto=tcp)
> elm:/usr/src on /amd/elm/root/usr/src type nfs 
> (rw,vers=3,proto=tcp)
> cu:~# /usr/net/bin/hibernate
> [this works and resumes]
> 
> cu:~# mount -oremount,rw /huge
> cu:~# /usr/net/bin/hibernate
> [this works and resumes too !]
> 
> cu:~# touch /huge/tst
> cu:~# /usr/net/bin/hibernate
> [but this doesn't even hibernate]

This is very probably separate problem... and you should have enough
data in dmesg to do something with it.

							Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-07 22:28                         ` David Chinner
  2007-06-08 19:09                           ` David Greaves
@ 2007-06-12 12:31                           ` David Greaves
  1 sibling, 0 replies; 48+ messages in thread
From: David Greaves @ 2007-06-12 12:31 UTC (permalink / raw)
  To: David Chinner
  Cc: Tejun Heo, Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

[RESEND since I sent this late last Friday and it's probably been buried by now.]

I had this as a PS, then I thought, we could all be wasting our time...

I don't like these "Section mismatch" warnings but that's because I'm paranoid
rather than because I know what they mean. I'll be happier when someone says
"That's OK, I know about them, they're not the problem"

WARNING: arch/i386/kernel/built-in.o(.text+0x968f): Section mismatch: reference to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9781): Section mismatch: reference to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9786): Section mismatch: reference to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0xa25c): Section mismatch: reference to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa303): Section mismatch: reference to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa31b): Section mismatch: reference to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa344): Section mismatch: reference to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.exit.text+0x19): Section mismatch: reference to .init.text: (between 'cache_remove_dev' and 'powernow_k6_exit')
WARNING: arch/i386/kernel/built-in.o(.data+0x2160): Section mismatch: reference to .init.text: (between 'thermal_throttle_cpu_notifier' and 'mce_work')
WARNING: kernel/built-in.o(.text+0x14502): Section mismatch: reference to .init.text: (between 'kthreadd' and 'init_waitqueue_head')

I'm paranoid because Andrew Morton said a couple of weeks ago:
> Could the people who write these bugs, please, like, fix them?
> It's not trivial noise.  These things lead to kernel crashes.

Anyhow...

David Chinner wrote:
> sync just guarantees that metadata changes are logged and data is
> on disk - it doesn't stop the filesystem from doing anything after
> the sync...
No, but there are no apps accessing the filesystem. It's just available for NFS
serving. Seems safer before potentially hanging the machine?


Also I made these changes to the kernel:
cu:/boot# diff config-2.6.22-rc4-TejuTst-dbg3-dirty config-2.6.22-rc4-TejuTst-dbg1-dirty
3,4c3,4
< # Linux kernel version: 2.6.22-rc4-TejuTst-dbg3
< # Thu Jun  7 20:00:34 2007
---
> # Linux kernel version: 2.6.22-rc4-TejuTst3
> # Thu Jun  7 10:59:21 2007
242,244c242
< CONFIG_PM_DEBUG=y
< CONFIG_DISABLE_CONSOLE_SUSPEND=y
< # CONFIG_PM_TRACE is not set
---
> # CONFIG_PM_DEBUG is not set

positive: I can now get sysrq-t :)
negative: if I build skge into the kernel the behaviour changes so I can't run
netconsole

Just to be sure I tested and this kernel suspends/restores with /huge unmounted.
It also hangs without an umount so the behaviour is the same.

> Ok, so a clean inode is sufficient to prevent hibernate from working.
> 
> So, what's different between a sync and a remount?
> 
> do_remount_sb() does:
> 
>     599         shrink_dcache_sb(sb);
>     600         fsync_super(sb);
> 
> of which a sync does neither. sync does what fsync_super() does in a
> different sort of way, but does not call sync_blockdev() on each
> block device. It looks like those are the two main differences between
> sync and remount - remount trims the dentry cache and syncs the blockdev,
> sync doesn't.
> 
>>> What about freezing the filesystem?
>> cu:~# xfs_freeze -f /huge
>> cu:~# /usr/net/bin/hibernate
>> [but this doesn't even hibernate - same as the 'touch']
> 
> I suspect that the frozen filesystem might cause other problems
> in the hibernate process. However, while a freeze calls sync_blockdev()
> it does not trim the dentry cache.....
> 
> So, rather than a remount before hibernate, let's see if we can
> remove the dentries some other way to determine if removing excess
> dentries/inodes from the caches makes a difference. Can you do:
> 
> # touch /huge/foo
> # sync
> # echo 1 > /proc/sys/vm/drop_caches
> # hibernate
success
> 
> # touch /huge/bar
> # sync
> # echo 2 > /proc/sys/vm/drop_caches
> # hibernate
success
> 
> # touch /huge/baz
> # sync
> # echo 3 > /proc/sys/vm/drop_caches
> # hibernate
success

So I added
# touch /huge/bork
# sync
# hibernate

And it still succeeded - sigh.

So I thought a bit and did:
rm /huge/b* /huge/foo

> Clean boot
> # touch /huge/bar
> # sync
> # echo 2 > /proc/sys/vm/drop_caches
> # hibernate
hangs on suspend (sysrq-b doesn't work)

> Clean boot
> # touch /huge/baz
> # sync
> # echo 3 > /proc/sys/vm/drop_caches
> # hibernate
hangs on suspend (sysrq-b doesn't work)

So I rebooted and hibernated to make sure I'm not having random behaviour - yep,
hang on resume (as per usual).

Now I wonder if any other mounts have an effect...
Reboot and umount the /dev/hdb2 xfs fs - hang on hibernate.


I'm confused. I'm going to order Chinese takeaway and then find a serial cable...

David
PS 2.6.21.1 works fine.
PPS the takeaway was nice.





* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-10 18:43                         ` Pavel Machek
@ 2007-06-12 18:00                           ` David Greaves
  2007-06-12 21:31                             ` Pavel Machek
  0 siblings, 1 reply; 48+ messages in thread
From: David Greaves @ 2007-06-12 18:00 UTC (permalink / raw)
  To: Pavel Machek
  Cc: David Chinner, Tejun Heo, Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

Pavel Machek wrote:
> Hi!
>> cu:~# mount -oremount,ro /huge
>> cu:~# /usr/net/bin/hibernate
>> [this works and resumes]
>>
>> cu:~# mount -oremount,rw /huge
>> cu:~# /usr/net/bin/hibernate
>> [this works and resumes too !]
>>
>> cu:~# touch /huge/tst
>> cu:~# /usr/net/bin/hibernate
>> [but this doesn't even hibernate]
> 
> This is very probably separate problem... and you should have enough
> data in dmesg to do something with it.

What makes you say it's a different problem? It's hanging at the same point
visually - it's just that one is pre-suspend, one is post-suspend.

It all feels very related to me - the behaviour all hinges around the same patch 
too.

I'll take a look in dmesg though...

David

PS, looks like some mail holdups somewhere...
Received: from spitz.ucw.cz (gprs189-60.eurotel.cz [160.218.189.60])
	by mail.ukfsn.org (Postfix) with ESMTP id A9125E6AE9
	for <david@dgreaves.com>; Tue, 12 Jun 2007 15:41:23 +0100 (BST)
Received: by spitz.ucw.cz (Postfix, from userid 0)
	id E05FC279F2; Sun, 10 Jun 2007 18:43:48 +0000 (UTC)
Date: Sun, 10 Jun 2007 18:43:48 +0000


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-08 19:09                           ` David Greaves
@ 2007-06-12 18:43                             ` Linus Torvalds
  2007-06-13 11:16                               ` David Greaves
  0 siblings, 1 reply; 48+ messages in thread
From: Linus Torvalds @ 2007-06-12 18:43 UTC (permalink / raw)
  To: David Greaves
  Cc: David Chinner, Tejun Heo, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik



On Fri, 8 Jun 2007, David Greaves wrote:
> 
> positive: I can now get sysrq-t :)

Ok, so color me confused, and maybe I have missed some of the emails or 
skimmed over them too fast (there's been too many of them ;), but

 - I haven't actually seen any traces for this (netconsole apparently 
   doesn't work for you, and I'm not surprised: it never really worked 
   well for me over suspend/resume either, but I think I saw a mention of 
   serial console?)

 - You apparently bisected it down to the range

	0a3fd051c7036ef71b58863f8e5da7c3dabd9d3f <- works
	1d30c33d8d07868199560b24f10ed6280e78a89c <- breaks

   but some of the intermediates in that range didn't compile. Correct?

Can you try to bisect down a bit more, despite the compile error? Just do

	git bisect start
	git bisect good 0a3fd051c7036ef71b58863f8e5da7c3dabd9d3f
	git bisect bad 1d30c33d8d07868199560b24f10ed6280e78a89c

and it should pick 

	f4d6d004: libata: ignore EH scheduling during initialization

for you to test. It will apparently break on the fact that "sata_via.c" 
wants "ata_scsi_device_resume/suspend" for the initialization of the
resume/suspend things in the scsi_host_template, but you should just 
remove those lines, and the compile hopefully completes cleanly after 
that.

IOW, it *should* be easy enough to pinpoint this from 9 changes down to 
just one.
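
As a rough illustration of why nine candidate commits need only a handful
of test builds, the sketch below mimics the halving that git bisect
performs. The labels c1..c9 and the choice of c6 as culprit are
hypothetical placeholders, not the real commits in this range, and real
git bisect walks a commit graph rather than a flat list.

```python
def bisect_first_bad(commits, is_bad):
    """Return (first_bad_commit, tests_run) for a linear history.

    Assumes every commit before the culprit is good, every commit from
    the culprit onwards is bad, and the last commit is already known bad.
    """
    lo, hi = 0, len(commits) - 1   # commits[hi] is known bad
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1                 # one build-and-hibernate cycle
        if is_bad(commits[mid]):
            hi = mid               # culprit is mid or earlier
        else:
            lo = mid + 1           # culprit is after mid
    return commits[lo], tests


# Hypothetical stand-ins for the nine commits between good and bad.
commits = [f"c{i}" for i in range(1, 10)]
culprit = "c6"                     # assumption, for illustration only


def is_bad(commit):
    return commits.index(commit) >= commits.index(culprit)


first_bad, tests = bisect_first_bad(commits, is_bad)
print(first_bad, tests)            # prints: c6 3
```

With nine candidates the loop never needs more than four tests, whichever
commit turns out to be the culprit.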

Jeff added to the Cc, since he may not have noticed that one of the most 
long-running issues is apparently sata-related.

(Jeff: David Greaves _also_ had issues with -rc4 due to the SETFXSR 
change, but that should hopefully be resolved and is presumably an 
independent bug. Apart from the fact that "sata_via.c" seems problematic)

		Linus


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-12 18:00                           ` David Greaves
@ 2007-06-12 21:31                             ` Pavel Machek
  0 siblings, 0 replies; 48+ messages in thread
From: Pavel Machek @ 2007-06-12 21:31 UTC (permalink / raw)
  To: David Greaves
  Cc: David Chinner, Tejun Heo, Linus Torvalds, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

Hi!

> >>cu:~# mount -oremount,ro /huge
> >>cu:~# /usr/net/bin/hibernate
> >>[this works and resumes]
> >>
> >>cu:~# mount -oremount,rw /huge
> >>cu:~# /usr/net/bin/hibernate
> >>[this works and resumes too !]
> >>
> >>cu:~# touch /huge/tst
> >>cu:~# /usr/net/bin/hibernate
> >>[but this doesn't even hibernate]
> >
> >This is very probably separate problem... and you should have enough
> >data in dmesg to do something with it.
> 
> What makes you say it's a different problem - it's hanging at the same 
> point visually - it's just that one is pre suspend, one is post suspend.

Ok, I did not see the visuals.

> It all feels very related to me - the behaviour all hinges around the same 
> patch too.
> 
> I'll take a look in dmesg though...
> 
> David
> 
> PS, looks like some mail holdups somewhere...
> Received: from spitz.ucw.cz (gprs189-60.eurotel.cz [160.218.189.60])
> 	by mail.ukfsn.org (Postfix) with ESMTP id A9125E6AE9
> 	for <david@dgreaves.com>; Tue, 12 Jun 2007 15:41:23 +0100 (BST)
> Received: by spitz.ucw.cz (Postfix, from userid 0)
> 	id E05FC279F2; Sun, 10 Jun 2007 18:43:48 +0000 (UTC)
> Date: Sun, 10 Jun 2007 18:43:48 +0000

Yep, that's normal; spitz is a 0.3kg machine connected over GPRS. Okay,
I should probably try to sync it more often than once every two days.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-12 18:43                             ` Linus Torvalds
@ 2007-06-13 11:16                               ` David Greaves
  2007-06-13 21:04                                 ` Linus Torvalds
  2007-06-14  0:28                                 ` David Chinner
  0 siblings, 2 replies; 48+ messages in thread
From: David Greaves @ 2007-06-13 11:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Chinner, Tejun Heo, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

Linus Torvalds wrote:
> 
> On Fri, 8 Jun 2007, David Greaves wrote:
>> positive: I can now get sysrq-t :)
> 
> Ok, so color me confused,
So what do you think that makes me? <grin>

> and maybe I have missed some of the emails or 
> skimmed over them too fast (there's been too many of them ;),

You may have missed these 'tests' with rc4+Tejun's fix:
* clean boot, unmounting the xfs fs : normal hibernate/resume
* clean boot, remount ro xfs fs : normal hibernate/resume
* clean boot, touch; sync; echo 1 > /proc/sys/vm/drop_caches: normal 
hibernate/resume
* clean boot, touch; sync; echo 2 > /proc/sys/vm/drop_caches: hang hibernating
* clean boot, touch; sync; echo 3 > /proc/sys/vm/drop_caches: hang hibernating

Dave asked me to do them but hasn't responded yet.
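
For readers unfamiliar with the drop_caches values: per the kernel's
sysctl documentation, 1 frees the page cache, 2 frees dentries and
inodes, and 3 frees both. The sketch below only tabulates the three
outcomes listed above against those bits; the constant and function names
are made up for illustration, and the correlation it prints is an
observation about these three runs, not an established cause.

```python
# Bit meanings of /proc/sys/vm/drop_caches (kernel sysctl documentation).
PAGECACHE = 1   # free page cache
SLAB = 2        # free dentries and inodes

# Outcomes reported in the list above (rc4 + Tejun's fix, after touch; sync).
results = {1: "normal", 2: "hang", 3: "hang"}


def describe(value):
    """Return the names of the cache types a drop_caches value frees."""
    parts = []
    if value & PAGECACHE:
        parts.append("pagecache")
    if value & SLAB:
        parts.append("dentries+inodes")
    return parts


for value, outcome in sorted(results.items()):
    print(value, ",".join(describe(value)), outcome)

# In these three runs the hang appears exactly when dentries and inodes
# were dropped, with or without the page cache.
hang_tracks_slab = all(
    bool(value & SLAB) == (outcome == "hang")
    for value, outcome in results.items()
)
print(hang_tracks_slab)
```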

> but
> 
>  - I haven't actually seen any traces for this (netconsole apparently 
>    doesn't work for you, and I'm not surprised: it never really worked 
>    well for me over suspend/resume either, but I think I saw a mention of 
>    serial console?)
Well, I got netconsole working, but I needed to build skge into the kernel and
that changed the behaviour a bit... let me know if that's interesting. I've left
skge as a module, as originally reported.
I have, however, configured serial and serial console and plugged in a cable so
I can capture data there. The sysrq-t output is at the end.


>  - You apparently bisected it down to the range
> 
> 	0a3fd051c7036ef71b58863f8e5da7c3dabd9d3f <- works
> 	1d30c33d8d07868199560b24f10ed6280e78a89c <- breaks
> 
>    but some of the intermediates in that range didn't compile. Correct?
Yes. Then, with the sata_via patch applied, testing confirmed that
9666f4009c22f6520ac3fb8a19c9e32ab973e828  libata: reimplement suspend/resume 
support using sdev->manage_start_stop
was the first commit to cause the problems.

However....
> Can you try to bisect down a bit more, despite the compile error? Just do
> 
> 	git bisect start
> 	git bisect good 0a3fd051c7036ef71b58863f8e5da7c3dabd9d3f
> 	git bisect bad 1d30c33d8d07868199560b24f10ed6280e78a89c
> 
> and it should pick 
> 
> 	f4d6d004: libata: ignore EH scheduling during initialization
> 
> for you to test. It will apparently break on the fact that "sata_via.c" 
> wants "ata_scsi_device_resume/suspend" for the initialization of the
> resume/suspend things in the scsi_host_template, but you should just 
> remove those lines, and the compile hopefully completes cleanly after 
> that.
> 
> IOW, it *should* be easy enough to pinpoint this from 9 changes down to 
> just one.
... let me reconfirm - there's been a lot of testing and I don't want my 
recollection causing problems...

These tests have had the config changed to include serial+console; I also
configured CONFIG_DISABLE_CONSOLE_SUSPEND=y

2.6.21-gf4d6d004-dirty : bad
2.6.21-g920a4b10-dirty : bad
2.6.21-g9666f400-dirty : bad

git-bisect bad
9666f4009c22f6520ac3fb8a19c9e32ab973e828 is first bad commit
commit 9666f4009c22f6520ac3fb8a19c9e32ab973e828
Author: Tejun Heo <htejun@gmail.com>
Date:   Fri May 4 21:27:47 2007 +0200

     libata: reimplement suspend/resume support using sdev->manage_start_stop

Good.

> Jeff added to the Cc, since he may not have noticed that one of the most 
> long-running issues is apparently sata-related.
> 
> (Jeff: David Greaves _also_ had issues with -rc4 due to the SETFXSR 
> change, but that should hopefully be resolved and is presumably an 
> independent bug. Apart from the fact that "sata_via.c" seems problematic)
> 
> 		Linus

So here's a sysrq-t from a failed resume. Ask if you'd like anything else...
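
A dump of this size is easier to triage mechanically. The sketch below is
a hypothetical helper, not an existing tool: it lists tasks in
uninterruptible sleep (state D) whose call trace does not pass through
refrigerator(), on the assumption that frozen tasks are expected during
hibernation and the remaining D-state tasks are the ones worth reading
first. SAMPLE is an abbreviated excerpt of the dump that follows, and the
regular expressions assume this 2.6.22-era output layout.

```python
import re

# Task header, e.g. "ata_aux       D F65F09A4     0   122      2 (L-TLB)"
TASK_RE = re.compile(r"^(\S+)\s+([RSDTZX])\s")
# Stack frame, e.g. "  [<c013b27c>] refrigerator+0x3c/0x50"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x")


def blocked_outside_freezer(dump):
    """Return names of D-state tasks with no refrigerator frame."""
    tasks = []
    current = None
    for line in dump.splitlines():
        header = TASK_RE.match(line)
        if header:
            current = {"name": header.group(1),
                       "state": header.group(2),
                       "frames": []}
            tasks.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current["frames"].append(frame.group(1))
    return [t["name"] for t in tasks
            if t["state"] == "D" and "refrigerator" not in t["frames"]]


SAMPLE = """\
init          D 00000001     0     1      0 (NOTLB)
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  =======================
ata/0         S 0000000A     0   121      2 (L-TLB)
Call Trace:
  [<c02c9acb>] ata_pio_task+0x5b/0xe0
  =======================
ata_aux       D F65F09A4     0   122      2 (L-TLB)
Call Trace:
  [<c0369a24>] wait_for_completion+0x64/0xa0
  [<c02207a7>] blk_execute_rq+0xa7/0xe0
  [<c02c22f6>] sd_spinup_disk+0x76/0x440
  =======================
md0_raid5     D 1D3836C7     0   836      2 (L-TLB)
Call Trace:
  [<c02f639e>] md_super_wait+0x7e/0xc0
  [<c02f83e8>] md_update_sb+0x138/0x2c0
  =======================
xfsbufd       D F7D6117C     0   839      2 (L-TLB)
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c020fd80>] xfsbufd+0xf0/0x100
  =======================
"""

print(blocked_outside_freezer(SAMPLE))  # prints: ['ata_aux', 'md0_raid5']
```

On this excerpt the filter leaves ata_aux (waiting in sd_spinup_disk via
blk_execute_rq) and md0_raid5 (waiting in md_super_wait), which narrows
the search but is not a diagnosis on its own.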


SysRq : Show State

                          free                        sibling
   task             PC    stack   pid father child younger older
init          D 00000001     0     1      0 (NOTLB)
        c1941ea0 00000082 00000000 00000001 00000001 00000000 466fc747 28d2d41a
        466fc747 28d2d41a 9120edc4 000001b3 000943a3 00000000 c192e030 c192eb3c
        00000086 00001182 9136dece 000001b3 00000000 00000000 00000000 c1941f08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c022f282>] copy_to_user+0x32/0x50
  [<c016a6f9>] sys_select+0x149/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
kthreadd      S F6BA3F04     0     2      0 (L-TLB)
        c1943fd0 00000046 00000001 f6ba3f04 c1943fb0 c0117627 00000000 00000000
        ffffffff c0104a04 f6ba3f04 00000000 00000003 00000296 f72e15b0 c192e63c
        c1943fd0 00000048 6e2deb31 0000000c 00000a74 c0431298 00000000 00000000
Call Trace:
  [<c0117627>] __wake_up_common+0x37/0x60
  [<c0104a04>] kernel_thread_helper+0x0/0x3c
  [<c012b411>] kthreadd+0x71/0xa0
  [<c012b3a0>] kthreadd+0x0/0xa0
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
ksoftirqd/0   S 000001B3     0     3      2 (L-TLB)
        c1945fc0 00000046 68967778 000001b3 000000fc 00000000 f72eaad0 c192e140
        00000073 c192ea30 68967778 000001b3 c1945f90 00000000 f73a5a50 c192e13c
        c1945fb0 000000a5 9136e544 000001b3 c1945fc0 00000000 c011ef70 fffffffc
Call Trace:
  [<c011ef70>] ksoftirqd+0x0/0x90
  [<c011efeb>] ksoftirqd+0x7b/0x90
  [<c011ef70>] ksoftirqd+0x0/0x90
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
watchdog/0    S C0369725     0     4      2 (L-TLB)
        c1947fc0 00000046 c1943f70 c0369725 c1947fd0 00000046 c048e0e0 c1932a50
        c1947f90 00000296 2dbc9200 000012ae 5716de49 00000009 c1932550 c1932b5c
        fffffffc 00000ab5 ab8b6f40 00000004 c1947fd0 00000000 c0140290 fffffffc
Call Trace:
  [<c0369725>] schedule+0x2e5/0x580
  [<c0140290>] watchdog+0x0/0x70
  [<c01402de>] watchdog+0x4e/0x70
  [<c0140290>] watchdog+0x0/0x70
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
events/0      R running     0     5      2 (L-TLB)
khelper       S 00000286     0     6      2 (L-TLB)
        c194bf80 00000046 00000060 00000286 00000000 00000000 c1914c20 c0127a80
        c194bf60 00000001 f66ebb60 c0127a80 00000000 c0127aa5 f73a5a50 c193215c
        c1914c20 000008c7 b1f02986 0000000f c194bfd0 c1914c20 c194bfa8 c1914c28
Call Trace:
  [<c0127a80>] __call_usermodehelper+0x0/0x70
  [<c0127a80>] __call_usermodehelper+0x0/0x70
  [<c0127aa5>] __call_usermodehelper+0x25/0x70
  [<c01280f4>] worker_thread+0xe4/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
kblockd/0     S C021F890     0    35      2 (L-TLB)
        c19a7f80 00000046 c19146e0 c021f890 00000000 c021f85b c19146e0 c0225790
        f7ea2438 c021f869 5fe478a8 00000009 028a6b4f c021f89b c19b7a70 c19616dc
        0000006e 0000003c 8c1ad4be 00000013 ffffff10 c19146e0 c19a7fa8 c19146e8
Call Trace:
  [<c021f890>] blk_unplug_work+0x0/0x10
  [<c021f85b>] __generic_unplug_device+0x2b/0x30
  [<c0225790>] as_work_handler+0x0/0x20
  [<c021f869>] generic_unplug_device+0x9/0x10
  [<c021f89b>] blk_unplug_work+0xb/0x10
  [<c01280f4>] worker_thread+0xe4/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
kacpid        S 000012B0     0    36      2 (L-TLB)
        c19a9f80 00000046 8e56e600 000012b0 58472b73 00000009 c192e530 ac2395b9
        c19a9f60 c0116ce7 ac239cdd 00000004 00000077 00000000 c192e530 c19611dc
        00000078 00000117 ac239df7 00000004 c19a9fd0 c19145e0 c19a9fa8 c19145e8
Call Trace:
  [<c0116ce7>] activate_task+0x37/0xb0
  [<c01280f4>] worker_thread+0xe4/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
kacpi_notify  S 000012B0     0    37      2 (L-TLB)
        c19abf80 00000046 8ef1f200 000012b0 584778f9 00000009 c192e530 ac23bc7c
        c19abf60 c0116ce7 ac6306e3 00000004 00000154 00000000 c1932050 c1969b3c
        00000078 000002f9 ac630a09 00000004 c19abfd0 c19145a0 c19abfa8 c19145a8
Call Trace:
  [<c0116ce7>] activate_task+0x37/0xb0
  [<c01280f4>] worker_thread+0xe4/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
ata/0         S 0000000A     0   121      2 (L-TLB)
        c19ddf80 00000046 00000060 0000000a 00000000 00000009 f7f40000 f7f41e90
        00000000 c02c9acb 64c028bc 00000009 00023d66 00000000 c19615d0 c19b3b5c
        0000006e 00003906 64c028bc 00000009 c19ddfd0 c19989e0 c19ddfa8 c19989e8
Call Trace:
  [<c02c9acb>] ata_pio_task+0x5b/0xe0
  [<c01280f4>] worker_thread+0xe4/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
ata_aux       D F65F09A4     0   122      2 (L-TLB)
        c19dfd20 00000046 f786c800 f65f09a4 f7ea23b8 c02bcdef ce2e3b9f 4d64703f
        ffffcfff f7ea23b8 00000082 f7ea2418 0005eb1d 00000082 f7c9f0b0 c196913c
        f786c800 000003d4 605db977 00000009 f7ea23b8 c19dfe08 f65f09a4 c19dfd3c
Call Trace:
  [<c02bcdef>] scsi_prep_fn+0x8f/0x130
  [<c0369a24>] wait_for_completion+0x64/0xa0
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c021f85b>] __generic_unplug_device+0x2b/0x30
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c02207a7>] blk_execute_rq+0xa7/0xe0
  [<c0220970>] blk_end_sync_rq+0x0/0x30
  [<c0146a8f>] buffered_rmqueue+0x9f/0x100
  [<c0146c50>] get_page_from_freelist+0x80/0xc0
  [<c02bbd48>] scsi_execute+0xb8/0x110
  [<c02bbe0b>] scsi_execute_req+0x6b/0x90
  [<c02c22f6>] sd_spinup_disk+0x76/0x440
  [<c02c36fe>] sd_revalidate_disk+0x6e/0x160
  [<c02c17a4>] __scsi_disk_get+0x34/0x40
  [<c02c200d>] sd_rescan+0x1d/0x40
  [<c02bf350>] scsi_rescan_device+0x40/0x50
  [<c02ce39c>] ata_scsi_dev_rescan+0x5c/0x70
  [<c02ce340>] ata_scsi_dev_rescan+0x0/0x70
  [<c0127f2a>] run_workqueue+0x4a/0xf0
  [<c01280dd>] worker_thread+0xcd/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
kseriod       D C0369725     0   123      2 (L-TLB)
        c1971f60 00000046 f6d0ff90 c0369725 c1971f70 00000046 c0443c08 c1971fa8
        00000000 c0444de0 91c4e988 000001b3 00002028 00000000 c19b3050 c196963c
        00000073 00000120 91c535b2 000001b3 00000000 00000000 c02da2d0 c1971fa8
Call Trace:
  [<c0369725>] schedule+0x2e5/0x580
  [<c02da2d0>] serio_thread+0x0/0x100
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c02da3c8>] serio_thread+0xf8/0x100
  [<c0117627>] __wake_up_common+0x37/0x60
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c02da2d0>] serio_thread+0x0/0x100
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
pdflush       D C0369725     0   145      2 (L-TLB)
        c19cdf70 00000046 c1971f60 c0369725 c19cdf80 00000046 ef006e00 000012ca
        65778037 00000009 91c4eb5e 000001b3 00002124 00000000 c19b7a70 c19b315c
        00000073 0000006a 91c539de 000001b3 00000000 00000000 c19cdfa8 fffffffc
Call Trace:
  [<c0369725>] schedule+0x2e5/0x580
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01487c5>] __pdflush+0x145/0x150
  [<c01487d0>] pdflush+0x0/0x30
  [<c01487d0>] pdflush+0x0/0x30
  [<c01487f5>] pdflush+0x25/0x30
  [<c01487d0>] pdflush+0x0/0x30
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
pdflush       D C0369725     0   146      2 (L-TLB)
        c19cff70 00000046 c19cdf70 c0369725 c19cff80 00000046 00000001 c19cff48
        c19cff74 c19615d0 91c4f11f 000001b3 00002016 00000000 f7d5f550 c19b7b7c
        00000076 00000067 4270ffb6 0000000a 00000000 00000000 c19cffa8 fffffffc
Call Trace:
  [<c0369725>] schedule+0x2e5/0x580
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01487c5>] __pdflush+0x145/0x150
  [<c01487d0>] pdflush+0x0/0x30
  [<c01487d0>] pdflush+0x0/0x30
  [<c01487f5>] pdflush+0x25/0x30
  [<c0147c30>] wb_kupdate+0x0/0xf0
  [<c01487d0>] pdflush+0x0/0x30
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
kswapd0       D C0369725     0   147      2 (L-TLB)
        c1981f50 00000046 f67a5e90 c0369725 c1981f60 00000046 c1955550 00000000
        c1981f78 c018dcac 91c4dbf7 000001b3 00000c71 00000000 f7d61070 c19b767c
        0000006e 000000c5 91c4f96f 000001b3 00000000 00000000 c0434f44 00000000
Call Trace:
  [<c0369725>] schedule+0x2e5/0x580
  [<c018dcac>] proc_flush_task+0x4c/0x1a0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c014b4d1>] kswapd+0xe1/0x110
  [<c0369725>] schedule+0x2e5/0x580
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c014b3f0>] kswapd+0x0/0x110
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
aio/0         S 000012CA     0   148      2 (L-TLB)
        c1a71f80 00000046 f91aea00 000012ca 657c8d75 00000009 c192e530 b2be46ba
        c1a71f60 c0116ce7 b2c07abe 00000004 000000b0 00000000 c192e530 c19b717c
        00000078 000001a2 b2c07c5f 00000004 c1a71fd0 c199be20 c1a71fa8 c199be28
Call Trace:
  [<c0116ce7>] activate_task+0x37/0xb0
  [<c01280f4>] worker_thread+0xe4/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
xfslogd/0     S F74D6EE0     0   149      2 (L-TLB)
        c1a73f80 00000046 f74d2500 f74d6ee0 f78a64e0 c01f3421 00000000 b2c09d9a
        c1a73f60 00000246 f6f22840 c199b9e0 f6f22cc0 f6f22d1c c199b9e0 c19bbb9c
        00000000 0000004a 91980244 000001b3 c1a73fd0 c199b9e0 c1a73fa8 c199b9e8
Call Trace:
  [<c01f3421>] xlog_iodone+0x51/0xd0
  [<c01280f4>] worker_thread+0xe4/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
xfsdatad/0    S 00000000     0   150      2 (L-TLB)
        c1a75f80 00000046 00000000 00000000 00000000 00000246 00000000 c19e17f0
        00000000 c199e820 c199b9a0 c020c690 00000000 c014557a c19b7a70 c19bb69c
        c199b9a0 00000049 91b5c447 000001b3 c1a75fd0 c199b9a0 c1a75fa8 c199b9a8
Call Trace:
  [<c020c690>] xfs_end_bio_read+0x0/0x10
  [<c014557a>] mempool_free+0x2a/0x60
  [<c01280f4>] worker_thread+0xe4/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
scsi_eh_0     S F7C98084     0   774      2 (L-TLB)
        f7fa1fc0 00000046 00000246 f7c98084 f7c98000 00000246 c1a6f044 00000001
        00000003 00000000 a35b398a 00000005 00000e17 00000000 c1932550 f7cbab5c
        0000006e 0019a81a a35b398a 00000005 00000000 c1a6f000 c02bb740 fffffffc
Call Trace:
  [<c02bb740>] scsi_error_handler+0x0/0xa0
  [<c02bb781>] scsi_error_handler+0x41/0xa0
  [<c02bb740>] scsi_error_handler+0x0/0xa0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
scsi_eh_1     S F7C64084     0   776      2 (L-TLB)
        c1ad3fc0 00000046 00000246 f7c64084 f7c64000 00000246 f78b3c44 00000001
        00000003 00000000 00000292 fffffffc c1ad3fb0 00000246 c192ea30 f7d6167c
        00000000 0019d570 c305aee5 00000005 00000000 f78b3c00 c02bb740 fffffffc
Call Trace:
  [<c02bb740>] scsi_error_handler+0x0/0xa0
  [<c02bb781>] scsi_error_handler+0x41/0xa0
  [<c02bb740>] scsi_error_handler+0x0/0xa0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
scsi_eh_2     S C1AD4084     0   778      2 (L-TLB)
        f7d5dfc0 00000046 00000246 c1ad4084 c1ad4000 00000246 f78b3844 00000001
        00000003 00000000 00000292 fffffffc f7d5dfb0 00000246 c192ea30 f7efd65c
        00000000 001a0997 e2bc559e 00000005 00000000 f78b3800 c02bb740 fffffffc
Call Trace:
  [<c02bb740>] scsi_error_handler+0x0/0xa0
  [<c02bb781>] scsi_error_handler+0x41/0xa0
  [<c02bb740>] scsi_error_handler+0x0/0xa0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
scsi_eh_3     S F7FC4084     0   780      2 (L-TLB)
        c1a77fc0 00000046 00000246 f7fc4084 f7fc4000 00000246 f78b3444 00000001
        00000003 00000000 02d2bfed 00000006 0000067f 00000000 c1932550 f7d61b7c
        0000006e 001c7d7b 02d2bfed 00000006 00000000 f78b3400 c02bb740 fffffffc
Call Trace:
  [<c02bb740>] scsi_error_handler+0x0/0xa0
  [<c02bb781>] scsi_error_handler+0x41/0xa0
  [<c02bb740>] scsi_error_handler+0x0/0xa0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
scsi_eh_4     S F786C800     0   803      2 (L-TLB)
        f7fc3fc0 00000046 f786c800 f786c800 f7ea23b8 c02bd166 f7964044 f7ea23b8
        00000292 c021f7f2 6154d436 00000009 002f304e 00000000 c1932550 f7d0915c
        0000006e 000ec1d6 6154d436 00000009 00000000 f7964000 c02bb740 fffffffc
Call Trace:
  [<c02bd166>] scsi_request_fn+0x196/0x280
  [<c021f7f2>] blk_remove_plug+0x32/0x70
  [<c02bb740>] scsi_error_handler+0x0/0xa0
  [<c02bb781>] scsi_error_handler+0x41/0xa0
  [<c02bb740>] scsi_error_handler+0x0/0xa0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
scsi_eh_5     S F786C400     0   805      2 (L-TLB)
        f7e3ffc0 00000046 f786c400 f786c400 f7ea2730 c02bd166 f786cc44 f7ea2730
        00000292 c021f7f2 67583eed 00000009 002f0e42 00000000 c19615d0 f7c9f1bc
        0000006e 000ec010 67583eed 00000009 00000000 f786cc00 c02bb740 fffffffc
Call Trace:
  [<c02bd166>] scsi_request_fn+0x196/0x280
  [<c021f7f2>] blk_remove_plug+0x32/0x70
  [<c02bb740>] scsi_error_handler+0x0/0xa0
  [<c02bb781>] scsi_error_handler+0x41/0xa0
  [<c02bb740>] scsi_error_handler+0x0/0xa0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
kpsmoused     S 000019B6     0   826      2 (L-TLB)
        f7e81f80 00000046 b68df400 000019b6 db5b46fa 0000000c c192e530 6dada37d
        f7e81f60 c0116ce7 6dadbfd3 00000006 000000a8 00000000 c1932050 f7d5fb5c
        00000078 00000029 6dadc163 00000006 f7e81fd0 f7fa9c20 f7e81fa8 f7fa9c28
Call Trace:
  [<c0116ce7>] activate_task+0x37/0xb0
  [<c01280f4>] worker_thread+0xe4/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
md0_raid5     D 1D3836C7     0   836      2 (L-TLB)
        f7cbfdf0 00000046 f7ea3888 1d3836c7 00000008 c0221036 06a47600 00011210
        c191ede0 c01454ba 5324bddc 00000009 0002fb08 00000000 f7d09050 f7e2c6bc
        0000006e 0000320f 532bcd0b 00000009 f7ea3888 c1a4f000 f7cbfe18 c1a4f13c
Call Trace:
  [<c0221036>] generic_make_request+0x146/0x1d0
  [<c01454ba>] mempool_alloc+0x2a/0xc0
  [<c02f639e>] md_super_wait+0x7e/0xc0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c017eff1>] bio_clone+0x31/0x40
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c02fe880>] write_sb_page+0x50/0x80
  [<c02fe9c2>] write_page+0x112/0x120
  [<c02f8247>] sync_sbs+0x77/0xe0
  [<c02fed09>] bitmap_update_sb+0x69/0xa0
  [<c02f83e8>] md_update_sb+0x138/0x2c0
  [<c0369725>] schedule+0x2e5/0x580
  [<c02fe2dd>] md_check_recovery+0x2dd/0x360
  [<c02f11e0>] raid5d+0x10/0xe0
  [<c02fc7c5>] md_thread+0x55/0x110
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c02fc770>] md_thread+0x0/0x110
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
xfsbufd       D F7D6117C     0   839      2 (L-TLB)
        f74dffa0 00000046 00000282 f7d6117c 0005eadc 000000de 00000282 f74dff88
        00000000 00000282 91c4e16c 000001b3 00000eb5 00000000 f7c51a30 f7d6117c
        0000006e 00000115 91c50443 000001b3 00000000 00000000 00000000 f74ccfa0
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c020fd80>] xfsbufd+0xf0/0x100
  [<c020fc90>] xfsbufd+0x0/0x100
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
xfssyncd      D F7D5F65C     0   840      2 (L-TLB)
        f775ff90 00000046 00000282 f7d5f65c 00060a54 00001d76 00000282 f775ff78
        00000000 00000282 91c4fd9e 000001b3 00001d19 00000000 f7348090 f7d5f65c
        00000078 000002b5 91c5428a 000001b3 00000000 00000000 f775ffb8 f78a93dc
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c0215aab>] xfssyncd+0x15b/0x160
  [<c0215950>] xfssyncd+0x0/0x160
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
udevd         D 00000010     0   921      1 (NOTLB)
        f7043ea0 00000082 f66ec590 00000010 c1913ba0 f67ba22c 407ef268 00000010
        c1913ba0 f7fa5cb4 9133dae2 000001b3 00006df1 00000000 f7348590 f7cba65c
        00000075 00000567 9134df46 000001b3 00000000 00000000 00000000 f7043f08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c011d7bb>] do_wait+0x1bb/0x3b0
  [<c016e8bd>] shrink_dcache_parent+0xd/0x30
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c016a659>] sys_select+0xa9/0x170
  [<c011da71>] sys_wait4+0x31/0x40
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
ksuspend_usbd S 00002061     0  1822      2 (L-TLB)
        f6c81f80 00000046 f9766400 00002061 30fcbb32 00000010 c192e530 187e5d99
        f6c81f60 c0116ce7 c048e0e0 c192e530 f6c81f60 c0116be1 f7c51a30 f735867c
        f7358570 000003e1 19639b4d 00000008 f6c81fd0 f71611a0 f6c81fa8 f71611a8
Call Trace:
  [<c0116ce7>] activate_task+0x37/0xb0
  [<c0116be1>] __activate_task+0x21/0x40
  [<c01280f4>] worker_thread+0xe4/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
khubd         D C0369725     0  1825      2 (L-TLB)
        f75adf70 00000046 f74dffa0 c0369725 f75adf80 00000046 00000023 00000011
        00000002 f882a3a5 91c4e5a2 000001b3 000010e3 00000000 f73a25d0 f7c51b3c
        0000006e 000000f0 91c50da4 000001b3 00000000 00000000 f882d180 f75adfa8
Call Trace:
  [<c0369725>] schedule+0x2e5/0x580
  [<f882a3a5>] usb_get_intf+0x15/0x20 [usbcore]
  [<f882d180>] hub_thread+0x0/0xf0 [usbcore]
  [<c013b27c>] refrigerator+0x3c/0x50
  [<f882d1d5>] hub_thread+0x55/0xf0 [usbcore]
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<f882d180>] hub_thread+0x0/0xf0 [usbcore]
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
ksnapd        S 000025EF     0  2132      2 (L-TLB)
        f7237f80 00000046 6f8a7a00 000025ef f7b7c53d 00000012 c192e530 7bdbe29e
        f7237f60 c0116ce7 c048e0e0 c192e530 f7237f60 c0116be1 c19bb090 f730e67c
        f730e570 0000007c 7bea9794 00000009 f7237fd0 f71c9360 f7237fa8 f71c9368
Call Trace:
  [<c0116ce7>] activate_task+0x37/0xb0
  [<c0116be1>] __activate_task+0x21/0x40
  [<c01280f4>] worker_thread+0xe4/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
kjournald     D C0369725     0  2178      2 (L-TLB)
        f6d71f40 00000046 f75adf70 c0369725 f6d71f50 00000046 00000000 f6d0dd38
        00000000 c0117627 91c4e939 000001b3 000012d7 00000000 c19bb090 f73a26dc
        0000006e 000000d2 91c515db 000001b3 00000000 00000000 f73a25d0 00000001
Call Trace:
  [<c0369725>] schedule+0x2e5/0x580
  [<c0117627>] __wake_up_common+0x37/0x60
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01ac221>] kjournald+0xd1/0x1d0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c01ac150>] kjournald+0x0/0x1d0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
xfsbufd       D C19BB19C     0  2179      2 (L-TLB)
        f6d29fa0 00000046 00000282 c19bb19c 0005eb4b 00000085 00000282 f6d29f88
        00000000 00000282 91c4ec56 000001b3 0000145d 00000000 c1961ad0 c19bb19c
        0000006e 000000ab 91c51c91 000001b3 00000000 00000000 00000000 f6d44aa0
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c020fd80>] xfsbufd+0xf0/0x100
  [<c020fc90>] xfsbufd+0x0/0x100
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
xfssyncd      D C1961BDC     0  2183      2 (L-TLB)
        f6881f90 00000046 00000282 c1961bdc 0005f177 00000269 00000282 f6881f78
        00000000 00000282 91c4efba 000001b3 000015d2 00000000 f7cb10b0 c1961bdc
        0000006e 000000af 91c52369 000001b3 00000000 00000000 f6881fb8 f6d45c3c
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c0215aab>] xfssyncd+0x15b/0x160
  [<c0215950>] xfssyncd+0x0/0x160
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
xfsbufd       D F7CB11BC     0  2184      2 (L-TLB)
        f6a33fa0 00000046 00000282 f7cb11bc 0005eb9d 00000067 00000282 f6a33f88
        00000000 00000282 91c4f278 000001b3 0000171e 00000000 f73945b0 f7cb11bc
        0000006e 00000094 91c52939 000001b3 00000000 00000000 00000000 f74cc720
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c020fd80>] xfsbufd+0xf0/0x100
  [<c020fc90>] xfsbufd+0x0/0x100
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
xfssyncd      D F73946BC     0  2185      2 (L-TLB)
        f6d0ff90 00000046 00000282 f73946bc 0005f197 00000153 00000282 f6d0ff78
        00000000 00000282 91c4e688 000001b3 00001e93 00000000 c1969530 f73946bc
        00000072 00000092 91c52ef2 000001b3 00000000 00000000 f6d0ffb8 f6d4573c
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c0215aab>] xfssyncd+0x15b/0x160
  [<c0215950>] xfssyncd+0x0/0x160
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
portmap       D F6BA6684     0  2341      1 (NOTLB)
        f6d0dea0 00000086 00000282 f6ba6684 f6ba64a0 c033b7e6 00000282 f6ba64a0
        f6ba64a0 c034d144 913389cf 000001b3 00002e10 00000000 f7358070 f730cb9c
        00000073 000002ae 9133f6e9 000001b3 00000000 00000000 0804ff78 f6d0df08
Call Trace:
  [<c033b7e6>] inet_csk_clear_xmit_timers+0x36/0x50
  [<c034d144>] tcp_v4_destroy_sock+0x14/0x150
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c017c44d>] invalidate_inode_buffers+0xd/0xa0
  [<c016dfb0>] d_kill+0x40/0x60
  [<c016dfec>] dput+0x1c/0xe0
  [<c015fa94>] __fput+0xf4/0x160
  [<c017272c>] mntput_no_expire+0x1c/0x70
  [<c015e2d3>] filp_close+0x43/0x70
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
syslogd       D F77A1F28     0  2478      1 (NOTLB)
        f77a1ea0 00000086 f7fbe804 f77a1f28 00000000 00000001 ffffffff f7c20940
        00000000 00000000 91338d55 000001b3 00002e6f 00000000 f7358a70 f735817c
        00000073 00000070 9133fb50 000001b3 00000000 00000000 00000000 f77a1f08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c015f20a>] do_readv_writev+0xaa/0x190
  [<c0163fb0>] pipe_write+0x0/0x400
  [<c016a659>] sys_select+0xa9/0x170
  [<c0124675>] sigprocmask+0x45/0xb0
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
klogd         D C04953A0     0  2484      1 (NOTLB)
        f6809ea0 00000086 f68d0a20 c04953a0 f6809dc8 f6809e58 00000000 00000000
        f6809eb0 00000001 913380eb 000001b3 0000267d 00000000 f730ca90 f7c9e69c
        00000073 000000b5 9133dc16 000001b3 00000000 00000000 00000000 f6809f08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c0111b80>] smp_apic_timer_interrupt+0x30/0x40
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c011a8b2>] do_syslog+0xf2/0x350
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c015ea84>] vfs_read+0xe4/0x110
  [<c015ed07>] sys_read+0x47/0x80
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
sshd          D F7215BF8     0  2506      1 (NOTLB)
        f6a2dea0 00200082 f687e360 f7215bf8 00000000 f6a2de58 00000000 00000000
        f6a2deb0 00000001 91339d89 000001b3 00002dd0 00000000 f730c090 f7358b7c
        00000073 00000179 91340a0c 000001b3 00000000 00000000 00000000 f6a2df08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014eb79>] do_wp_page+0x299/0x3b0
  [<c017c44d>] invalidate_inode_buffers+0xd/0xa0
  [<c016dfb0>] d_kill+0x40/0x60
  [<c016a659>] sys_select+0xa9/0x170
  [<c017272c>] mntput_no_expire+0x1c/0x70
  [<c015e2d3>] filp_close+0x43/0x70
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
exim4         D 00000000     0  2546      1 (NOTLB)
        f6835ea0 00000086 f6520a30 00000000 00000000 f6520a30 f6520a30 00000000
        bf896808 c018dcac f6835e88 0000000d c03f1d32 00000f42 f72e1ab0 f7e2cbbc
        f6835e88 00000235 9134f609 000001b3 00000000 00000000 00000000 f6835f08
Call Trace:
  [<c018dcac>] proc_flush_task+0x4c/0x1a0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c011d7bb>] do_wait+0x1bb/0x3b0
  [<c011dea1>] do_setitimer+0x1f1/0x280
  [<c016a659>] sys_select+0xa9/0x170
  [<c011da71>] sys_wait4+0x31/0x40
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
inetd         D 00000000     0  2552      1 (NOTLB)
        f6a29ea0 00000082 00000000 00000000 662f6984 c0437200 f6845bc0 f6a29eb8
        c195f8c0 c02101b8 00000001 00000000 00000001 00000000 f7394ab0 f72e1bbc
        f6a29eb0 000004f5 913518bf 000001b3 00000000 00000000 00000000 f6a29f08
Call Trace:
  [<c02101b8>] xfs_file_aio_read+0x78/0x90
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f682>] do_no_page+0x162/0x280
  [<c014f9da>] __handle_mm_fault+0xfa/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c016a659>] sys_select+0xa9/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
lpd           D 00000001     0  2556      1 (NOTLB)
        f6a8fea0 00000086 f7ddeb40 00000001 c1913ba0 f6870114 fcd011e4 00000007
        f7312005 c03121f7 00000005 bf9e2660 00000007 00000044 f73a20d0 f7394bbc
        f780c800 00000267 91352994 000001b3 00000000 00000000 00000000 f6a8ff08
Call Trace:
  [<c03121f7>] sys_socketcall+0x87/0x250
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c015db55>] chown_common+0xa5/0xd0
  [<c016dfec>] dput+0x1c/0xe0
  [<c03115c2>] sys_listen+0x42/0x70
  [<c016a659>] sys_select+0xa9/0x170
  [<c0124675>] sigprocmask+0x45/0xb0
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
mysqld_safe   D F721BBF8     0  2571      1 (NOTLB)
        f6acdea0 00000086 00000010 f721bbf8 f722f8c0 f71fb700 00000000 00000004
        c0434f00 f7240bf8 f721bbf8 bf953000 bf953000 c014d94d f72ea5d0 f73a21dc
        f73a20d0 00000148 9135328e 000001b3 00000000 00000000 081041c8 f6acdf08
Call Trace:
  [<c014d94d>] copy_page_range+0x9d/0xd0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c011d7bb>] do_wait+0x1bb/0x3b0
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c0124e76>] do_sigaction+0x116/0x150
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c012513d>] sys_rt_sigaction+0x8d/0xa0
  [<c011da71>] sys_wait4+0x31/0x40
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
mysqld        D 00000000     0  2608   2571 (NOTLB)
        f685fea0 00000086 f77b87bc 00000000 00000001 c0434c64 c0434f04 00000044
        c0434f00 c0146c50 9133fd28 000001b3 00008a1b 00000000 f72e10b0 f72ea6dc
        00000076 00000287 91354440 000001b3 00000000 00000000 00000000 f685ff08
Call Trace:
  [<c0146c50>] get_page_from_freelist+0x80/0xc0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c011952d>] copy_process+0x55d/0xb30
  [<c01527e8>] change_protection+0x78/0xc0
  [<c0116be1>] __activate_task+0x21/0x40
  [<c016a659>] sys_select+0xa9/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
mysqld        D F7127D94     0  2610   2571 (NOTLB)
        f6a25ea0 00000086 08a45aa0 f7127d94 08a45aa0 c0134f9c 00000246 c17a5ca0
        c0434da8 c0146a8f 00000000 00000000 f7348590 00000001 f7e2cab0 f734869c
        c0434f14 000000e9 9134e692 000001b3 00000000 00000000 00000000 f6a25f08
Call Trace:
  [<c0134f9c>] futex_wait+0x2fc/0x3c0
  [<c0146a8f>] buffered_rmqueue+0x9f/0x100
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f4a5>] do_anonymous_page+0xb5/0x130
  [<c014f98f>] __handle_mm_fault+0xaf/0x1e0
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c0135cf3>] do_futex+0x53/0x130
  [<c0135e35>] sys_futex+0x65/0xf0
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
mysqld        D F7127D94     0  2611   2571 (NOTLB)
        f6a03ea0 00000086 08a45b10 f7127d94 08a45b10 c0134f9c 00000000 f6a03ee0
        f6a03e70 c0369f41 9133a6b3 000001b3 000030e1 00000000 f730c590 f730c19c
        00000073 000001a4 91341a7a 000001b3 00000000 00000000 00000000 f6a03f08
Call Trace:
  [<c0134f9c>] futex_wait+0x2fc/0x3c0
  [<c0369f41>] io_schedule+0x11/0x20
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014975a>] pagevec_lookup_tag+0x2a/0x50
  [<c014951b>] __pagevec_release+0x1b/0x30
  [<c01423ea>] wait_on_page_writeback_range+0x6a/0x110
  [<c020128c>] xfs_trans_ijoin+0x2c/0x80
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c0206555>] xfs_fsync+0x1b5/0x1d0
  [<c0135cf3>] do_futex+0x53/0x130
  [<c0135e35>] sys_futex+0x65/0xf0
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
mysqld        D F7127D94     0  2612   2571 (NOTLB)
        f6a09ea0 00000086 08a45b80 f7127d94 08a45b80 c0134f9c 00000246 c1760960
        c0434da8 c0146a8f 00000000 00000000 f72e10b0 0000001b f7d5f050 f72e11bc
        c0434f14 0000011d 91354c0e 000001b3 00000000 00000000 00000000 f6a09f08
Call Trace:
  [<c0134f9c>] futex_wait+0x2fc/0x3c0
  [<c0146a8f>] buffered_rmqueue+0x9f/0x100
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f4a5>] do_anonymous_page+0xb5/0x130
  [<c014f98f>] __handle_mm_fault+0xaf/0x1e0
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c0135cf3>] do_futex+0x53/0x130
  [<c0135e35>] sys_futex+0x65/0xf0
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
mysqld        D F7127D94     0  2613   2571 (NOTLB)
        f6a2bea0 00000086 08a45bf0 f7127d94 08a45bf0 c0134f9c c048e0e0 f7cb15b0
        f6a2be70 c0116be1 9133b006 000001b3 00002fcc 00000000 f7d09a50 f730c69c
        00000073 000000ac 9134213a 000001b3 00000000 00000000 00000000 f6a2bf08
Call Trace:
  [<c0134f9c>] futex_wait+0x2fc/0x3c0
  [<c0116be1>] __activate_task+0x21/0x40
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c0133d5e>] wake_futex+0x2e/0x50
  [<c01348a7>] futex_requeue+0xb7/0x1d0
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c0135cf3>] do_futex+0x53/0x130
  [<c0135e35>] sys_futex+0x65/0xf0
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
mysqld        D 00000001     0  2616   2571 (NOTLB)
        f6ff7ea0 00000086 f72944a0 00000001 00000000 c014557a f6ff7e68 c1965e94
        f72944a0 c017eef6 c176b8e0 00000000 00000086 00000048 f7c9f5b0 f7d5f15c
        00000001 000001a6 9135579c 000001b3 00000000 00000000 00000000 f6ff7f08
Call Trace:
  [<c014557a>] mempool_free+0x2a/0x60
  [<c017eef6>] bio_put+0x26/0x40
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c02a70c7>] __ide_end_request+0x87/0xe0
  [<c021d404>] elv_queue_empty+0x24/0x30
  [<c02a8417>] ide_do_request+0x67/0x330
  [<c02aeafd>] ide_dma_intr+0x7d/0xc0
  [<c022f282>] copy_to_user+0x32/0x50
  [<c016a6f9>] sys_select+0x149/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
mysqld        D 00000000     0  2617   2571 (NOTLB)
        f6ff9ea0 00000086 00000000 00000000 00000000 00000000 00000000 00000000
        00000000 00000000 00000000 00000000 00000000 00000000 f72ea0d0 f7c9f6bc
        00000000 000001a8 91356336 000001b3 00000000 00000000 00000000 f6ff9f08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f682>] do_no_page+0x162/0x280
  [<c014f9da>] __handle_mm_fault+0xfa/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c022f282>] copy_to_user+0x32/0x50
  [<c016a6f9>] sys_select+0x149/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
mysqld        D F7127D94     0  2618   2571 (NOTLB)
        f6ffbea0 00000086 08700fe0 f7127d94 08700fe0 c0134f9c 80b65200 00003dd4
        ea405b29 0000001e 913481e3 000001b3 0000ea28 00000000 f72e15b0 f7cb16bc
        0000007a 0000019b 9136ac79 000001b3 00000000 00000000 00000000 f6ffbf08
Call Trace:
  [<c0134f9c>] futex_wait+0x2fc/0x3c0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c01348a7>] futex_requeue+0xb7/0x1d0
  [<c0133d5e>] wake_futex+0x2e/0x50
  [<c0133fc6>] futex_wake+0x76/0xb0
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c0135cf3>] do_futex+0x53/0x130
  [<c0135e35>] sys_futex+0x65/0xf0
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
mysqld        D F730C590     0  2619   2571 (NOTLB)
        f6ffdea0 00000086 f730c740 f730c590 f7d09a50 f7127d60 f6a2bea0 c0369725
        f6ffdec0 00000086 9133b619 000001b3 000030c3 00000000 f72eaad0 f7d09b5c
        00000073 000000d6 91342999 000001b3 00000000 00000000 00000008 f6ffdf08
Call Trace:
  [<c0369725>] schedule+0x2e5/0x580
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c0122ea1>] dequeue_signal+0x31/0x130
  [<c0124b12>] sys_rt_sigtimedwait+0x172/0x1d0
  [<c0135d12>] do_futex+0x72/0x130
  [<c0124675>] sigprocmask+0x45/0xb0
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
logger        D 000081A4     0  2609   2571 (NOTLB)
        f6aa9ea0 00000086 04003fff 000081a4 00000001 00000000 00000000 00800081
        00000000 00000000 f6f50cd8 00000400 fffffe00 00000000 f7c53050 f72ea1dc
        f6aa9eb0 000001fd 91357122 000001b3 00000000 00000000 b7f69420 f6aa9f08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0369725>] schedule+0x2e5/0x580
  [<c015ea26>] vfs_read+0x86/0x110
  [<c015ed07>] sys_read+0x47/0x80
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
nrpe          D F6673904     0  2653      1 (NOTLB)
        f6b25ea0 00200086 f6673904 f6673904 f6673904 f67af43c c1910e28 f6673904
        f67af43c c016dfb0 9133cde8 000001b3 00002e52 00000000 f7d09550 f72eabdc
        00000073 000001cd 91343b9d 000001b3 00000000 00000000 bf995dec f6b25f08
Call Trace:
  [<c016dfb0>] d_kill+0x40/0x60
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f682>] do_no_page+0x162/0x280
  [<c014f9da>] __handle_mm_fault+0xfa/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c022f282>] copy_to_user+0x32/0x50
  [<c016a6f9>] sys_select+0x149/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
nagios-statd  D F6B1FD44     0  2661      1 (NOTLB)
        f6a19ea0 00200086 f6b1fd44 f6b1fd44 f6b1fd44 f6b689ec c1910e28 f6b1fd44
        f6b689ec c016dfb0 f7c20b20 f6b689ec f7c20b20 c016dfec f73940b0 f7c5315c
        00000008 00000654 91359d74 000001b3 00000000 00000000 bf9c0860 f6a19f08
Call Trace:
  [<c016dfb0>] d_kill+0x40/0x60
  [<c016dfec>] dput+0x1c/0xe0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f682>] do_no_page+0x162/0x280
  [<c014f9da>] __handle_mm_fault+0xfa/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c0312231>] sys_socketcall+0xc1/0x250
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
netserver     D F6B1FA84     0  2664      1 (NOTLB)
        f6b49ea0 00000086 f6b1fa84 f6b1fa84 f6b1fa84 f6b68134 c1910e28 f6b1fa84
        f6b68134 c016dfb0 f7c20da0 f6b68134 f7c20da0 c016dfec f7efda50 f73941bc
        00000008 000002be 9135b0ab 000001b3 00000000 00000000 bfdb1550 f6b49f08
Call Trace:
  [<c016dfb0>] d_kill+0x40/0x60
  [<c016dfec>] dput+0x1c/0xe0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f682>] do_no_page+0x162/0x280
  [<c014f9da>] __handle_mm_fault+0xfa/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c0312231>] sys_socketcall+0xc1/0x250
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c01158e0>] do_page_fault+0x0/0x590
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
lockd         D F734819C     0  2675      2 (L-TLB)
        f6ba3ed0 00000046 f7e2c0b0 f734819c f727c6c0 000012a8 6e33830e 0000000c
        f69d69a0 7fffffff 91c5013c 000001b3 00001ea8 00000000 f7e2c0b0 f734819c
        00000078 00000176 91c549d9 000001b3 00000000 00000000 f6f40000 00000003
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<f918fc82>] svc_recv+0x392/0x420 [sunrpc]
  [<c0116ce7>] activate_task+0x37/0xb0
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c01175e0>] default_wake_function+0x0/0x10
  [<f91b34de>] lockd+0x10e/0x240 [lockd]
  [<c0103dc6>] ret_from_fork+0x6/0x1c
  [<f91b33d0>] lockd+0x0/0x240 [lockd]
  [<f91b33d0>] lockd+0x0/0x240 [lockd]
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
rpciod/0      S F65780A0     0  2676      2 (L-TLB)
        f6ba5f80 00000046 f69d63a0 f65780a0 00000000 f918574f 00000000 f657810c
        f65d506c f65780a4 f65780a0 f65780a4 00000286 f918ade2 f73a5a50 f7cb1bbc
        f918afd0 000002c5 8bb74456 000001b3 f6ba5fd0 f69f88a0 f6ba5fa8 f69f88a8
Call Trace:
  [<f918574f>] rpc_release_client+0x3f/0x70 [sunrpc]
  [<f918ade2>] rpc_release_calldata+0x12/0x20 [sunrpc]
  [<f918afd0>] rpc_async_schedule+0x0/0x10 [sunrpc]
  [<c01280f4>] worker_thread+0xe4/0xf0
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c0128010>] worker_thread+0x0/0xf0
  [<c012b20a>] kthread+0x6a/0x70
  [<c012b1a0>] kthread+0x0/0x70
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
nfsd          D F7E2C1BC     0  2677      2 (L-TLB)
        f6bc7f10 00000046 f73a5050 f7e2c1bc 000cb759 00001972 00000282 f6bc7ef8
        00000000 00000282 91c503f8 000001b3 00001ff2 00000000 f73a5050 f7e2c1bc
        00000078 00000128 91c54fa4 000001b3 00000000 00000000 f6b87000 00000022
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<f918fc82>] svc_recv+0x392/0x420 [sunrpc]
  [<c015e2d3>] filp_close+0x43/0x70
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c01175e0>] default_wake_function+0x0/0x10
  [<f91c456e>] nfsd+0xae/0x240 [nfsd]
  [<f91c44c0>] nfsd+0x0/0x240 [nfsd]
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
nfsd          D F73A515C     0  2678      2 (L-TLB)
        f6be9f10 00000046 f73a5550 f73a515c 000cb759 00000a68 00000282 f6be9ef8
        00000000 00000282 91c506b7 000001b3 0000211f 00000000 f73a5550 f73a515c
        00000078 0000011b 91c5552c 000001b3 00000000 00000000 f6bb0000 00000022
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<f918fc82>] svc_recv+0x392/0x420 [sunrpc]
  [<c015e2d3>] filp_close+0x43/0x70
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c01175e0>] default_wake_function+0x0/0x10
  [<f91c456e>] nfsd+0xae/0x240 [nfsd]
  [<f91c44c0>] nfsd+0x0/0x240 [nfsd]
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
nfsd          D F73A565C     0  2679      2 (L-TLB)
        f6409f10 00000046 f7c53a50 f73a565c 000cb759 00000a75 00000282 f6409ef8
        00000000 00000282 91c50942 000001b3 00002251 00000000 f7c53a50 f73a565c
        00000078 00000112 91c55a89 000001b3 00000000 00000000 f6bd5000 00000022
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<f918fc82>] svc_recv+0x392/0x420 [sunrpc]
  [<c015e2d3>] filp_close+0x43/0x70
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c01175e0>] default_wake_function+0x0/0x10
  [<f91c456e>] nfsd+0xae/0x240 [nfsd]
  [<f91c44c0>] nfsd+0x0/0x240 [nfsd]
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
nfsd          D F7C53B5C     0  2680      2 (L-TLB)
        f642bf10 00000046 f730e070 f7c53b5c 000cb759 0000098a 00000282 f642bef8
        00000000 00000282 91c50bda 000001b3 00002387 00000000 f730e070 f7c53b5c
        00000078 00000117 91c56000 000001b3 00000000 00000000 f6bfa000 00000022
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<f918fc82>] svc_recv+0x392/0x420 [sunrpc]
  [<c015e2d3>] filp_close+0x43/0x70
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c01175e0>] default_wake_function+0x0/0x10
  [<f91c456e>] nfsd+0xae/0x240 [nfsd]
  [<f91c44c0>] nfsd+0x0/0x240 [nfsd]
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
nfsd          D F730E17C     0  2681      2 (L-TLB)
        f644bf10 00000046 f730ea70 f730e17c 000cb759 00000848 00000282 f644bef8
        00000000 00000282 91c50ed9 000001b3 000024a6 00000000 f730ea70 f730e17c
        00000078 00000121 91c565a6 000001b3 00000000 00000000 f641f000 00000022
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<f918fc82>] svc_recv+0x392/0x420 [sunrpc]
  [<c0150701>] remove_vma+0x31/0x50
  [<c015e2d3>] filp_close+0x43/0x70
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c01175e0>] default_wake_function+0x0/0x10
  [<f91c456e>] nfsd+0xae/0x240 [nfsd]
  [<f91c44c0>] nfsd+0x0/0x240 [nfsd]
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
nfsd          D F730EB7C     0  2682      2 (L-TLB)
        f646df10 00000046 f7c9e090 f730eb7c 000cb759 000007b0 00000282 f646def8
        00000000 00000282 91c511a3 000001b3 000025c6 00000000 f7c9e090 f730eb7c
        00000078 00000117 91c56b1c 000001b3 00000000 00000000 f6444000 00000022
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<f918fc82>] svc_recv+0x392/0x420 [sunrpc]
  [<c015e2d3>] filp_close+0x43/0x70
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c01175e0>] default_wake_function+0x0/0x10
  [<f91c456e>] nfsd+0xae/0x240 [nfsd]
  [<f91c44c0>] nfsd+0x0/0x240 [nfsd]
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
nfsd          D F7C9E19C     0  2683      2 (L-TLB)
        f64adf10 00000046 f72e15b0 f7c9e19c 0000007c 00000d7d 00000282 f64adef8
        00000000 00000282 91c51404 000001b3 000026f5 00000000 f73a2ad0 f7c9e19c
        00000078 00000109 91c57049 000001b3 00000000 00000000 f6469000 00000022
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<f918fc82>] svc_recv+0x392/0x420 [sunrpc]
  [<c015e2d3>] filp_close+0x43/0x70
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c01175e0>] default_wake_function+0x0/0x10
  [<f91c456e>] nfsd+0xae/0x240 [nfsd]
  [<f91c44c0>] nfsd+0x0/0x240 [nfsd]
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
nfsd          D F73A2BDC     0  2684      2 (L-TLB)
        f64cdf10 00000046 f7c9e090 f73a2bdc 0000007c 00006c5a 00000282 f64cdef8
        00000000 00000282 00000282 000cb77f 000cb77f c0369fe4 f73a5a50 f73a2bdc
        00000008 000000f6 91c57517 000001b3 00000000 00000000 f648e000 00000022
Call Trace:
  [<c0369fe4>] schedule_timeout+0x54/0xa0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<f918fc82>] svc_recv+0x392/0x420 [sunrpc]
  [<f918fde2>] svc_send+0x92/0x100 [sunrpc]
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c01175e0>] default_wake_function+0x0/0x10
  [<f91c456e>] nfsd+0xae/0x240 [nfsd]
  [<f91c44c0>] nfsd+0x0/0x240 [nfsd]
  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
  =======================
rpc.mountd    D C177B160     0  2688      1 (NOTLB)
        f6b85ea0 00000082 00000246 c177b160 c0434da8 c0146a8f f6b6352c 00000000
        00000000 c0434da8 91344a31 000001b3 00010f88 00000000 c192ea30 f72e16bc
        0000007d 00000af1 9136cd4c 000001b3 00000000 00000000 00000000 f6b85f08
Call Trace:
  [<c0146a8f>] buffered_rmqueue+0x9f/0x100
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f682>] do_no_page+0x162/0x280
  [<c014f9da>] __handle_mm_fault+0xfa/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c016a659>] sys_select+0xa9/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
rsync         D 6C720929     0  2694      1 (NOTLB)
        f64d5ea0 00000086 ab800883 6c720929 b43a31c6 81773793 3b011242 3456459e
        3fbc385d af7f93c4 00000000 f69f87e0 f64d5eb8 0000000a f7c51530 f7efdb5c
        f727c650 0000024b 9135c0b9 000001b3 00000000 00000000 00000000 f64d5f08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c034a919>] tcp_v4_get_port+0x19/0x20
  [<c031442e>] release_sock+0xe/0x60
  [<c0358094>] inet_listen+0x34/0x80
  [<c03115c2>] sys_listen+0x42/0x70
  [<c016a659>] sys_select+0xa9/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
smartd        D 00000000     0  2703      1 (NOTLB)
        f6a0dea0 00000082 00000005 00000000 00000001 00000000 00000200 f65f9400
        bfc7b1b0 f786c400 00004000 f6673e60 f651d960 01d501b0 f65245d0 f7c5163c
        004f0001 00000187 9135cb70 000001b3 00000000 00000000 bfc9abc4 f6a0df08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c012d9f9>] ktime_get_ts+0x19/0x50
  [<c022f282>] copy_to_user+0x32/0x50
  [<c012e740>] hrtimer_nanosleep+0x90/0xe0
  [<c012e670>] hrtimer_wakeup+0x0/0x20
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
sshd          D 00000001     0  2705   2506 (NOTLB)
        f6b0dea0 00000082 f687e4c0 00000001 00000000 f6b0de58 00000000 00000000
        f6b0deb0 00000001 9133dac5 000001b3 0000389d 00000000 f65240d0 f7d0965c
        00000073 000003b9 913460db 000001b3 00000000 00000000 00000000 f6b0df08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c012b5e0>] autoremove_wake_function+0x0/0x50
  [<c027fcb4>] tty_write+0x94/0x1f0
  [<c016a659>] sys_select+0xa9/0x170
  [<c015ed87>] sys_write+0x47/0x80
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
bash          D C1733820     0  2761   2705 (NOTLB)
        f6b27ea0 00000086 00000246 c1733820 c0434da8 c0146a8f 00000000 00000000
        00000000 c0434da8 9133e08b 000001b3 00003a41 00000000 f6524ad0 f65241dc
        00000073 000000f7 91346a84 000001b3 00000000 00000000 080ffac8 f6b27f08
Call Trace:
  [<c0146a8f>] buffered_rmqueue+0x9f/0x100
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c011d7bb>] do_wait+0x1bb/0x3b0
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c0281589>] tiocspgrp+0xc9/0xe0
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c011da71>] sys_wait4+0x31/0x40
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
rpc.statd     D F6541EA8     0  2768      1 (NOTLB)
        f6541ea0 00000082 00000000 f6541ea8 00000000 c022f282 bffdbb44 00000010
        bffdbb20 c0310235 00000000 f6541ec8 08056a80 f64d3a00 f64c6ab0 f65246dc
        00000190 000002fa 9135e04b 000001b3 00000000 00000000 00000000 f6541f08
Call Trace:
  [<c022f282>] copy_to_user+0x32/0x50
  [<c0310235>] move_addr_to_user+0x65/0x70
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c017c44d>] invalidate_inode_buffers+0xd/0xa0
  [<c016dfb0>] d_kill+0x40/0x60
  [<c016a659>] sys_select+0xa9/0x170
  [<c017272c>] mntput_no_expire+0x1c/0x70
  [<c015e2d3>] filp_close+0x43/0x70
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
rpc.idmapd    D 00008003     0  2776      1 (NOTLB)
        f659fea0 00000082 f6089000 00008003 00000000 f6056c20 00000000 f659ff28
        00000001 c193df40 9133e5ad 000001b3 00003cf9 00000000 f7c53550 f6524bdc
        00000073 00000128 91347618 000001b3 00000000 00000000 00001388 f659ff08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c0369fe4>] schedule_timeout+0x54/0xa0
  [<c0185b19>] ep_events_transfer+0x69/0x80
  [<c0122420>] process_timeout+0x0/0x10
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
ntpd          D F6511E88     0  2791      1 (NOTLB)
        f6511ea0 00200082 00000000 f6511e88 f7fbe7d8 f7c53550 bfeb98fc bfeb98fc
        f7c53760 c0108f0c 9133efd6 000001b3 000042b3 00000000 f64c65b0 f7c5365c
        00000073 0000025f 91348dcf 000001b3 00000000 00000000 00000000 f6511f08
Call Trace:
  [<c0108f0c>] save_i387_fxsave+0x8c/0xb0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c016a659>] sys_select+0xa9/0x170
  [<c010339d>] restore_sigcontext+0x10d/0x160
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
mdadm         D BF9FDFB4     0  2803      1 (NOTLB)
        f6513ea0 00000086 00000000 bf9fdfb4 c1a4f000 c02fc8a5 00000000 80480911
        c1a4f000 c02fc36d c1a4f000 c1a4f10c c190d8c8 c0180d82 f7c9ea90 f64c6bbc
        c1a4f10c 00000276 9135f186 000001b3 00000000 00000000 bf9fcc58 f6513f08
Call Trace:
  [<c02fc8a5>] md_wakeup_thread+0x25/0x30
  [<c02fc36d>] md_ioctl+0xcd/0x410
  [<c0180d82>] check_disk_change+0x32/0x80
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c02fc2a0>] md_ioctl+0x0/0x410
  [<c0222566>] blkdev_driver_ioctl+0x36/0x60
  [<c015dfa8>] nameidata_to_filp+0x28/0x40
  [<c022261f>] blkdev_ioctl+0x8f/0x1d0
  [<c02f5d79>] mddev_put+0x19/0x80
  [<c0181242>] __blkdev_put+0x62/0x120
  [<c022f282>] copy_to_user+0x32/0x50
  [<c016a6f9>] sys_select+0x149/0x170
  [<c017272c>] mntput_no_expire+0x1c/0x70
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
amd           D 00000010     0  2815      1 (NOTLB)
        f65c3ea0 00000082 f65c3ec8 00000010 f65c3e88 00000001 f65c3e68 00000018
        00000000 00000000 9133fbce 000001b3 000049bf 00000000 f7efd050 f64c66bc
        00000073 000002dd 9134aa7a 000001b3 00000000 00000000 00000000 f65c3f08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014eb79>] do_wp_page+0x299/0x3b0
  [<c014fa96>] __handle_mm_fault+0x1b6/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c022f282>] copy_to_user+0x32/0x50
  [<c016a6f9>] sys_select+0x149/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
atd           D C01D83C0     0  2844      1 (NOTLB)
        f6539ea0 00000082 00000000 c01d83c0 00000000 c01d83c0 00000000 00000000
        f65deb00 c01e8434 f6539f08 00000008 00000000 c0209053 c19b3550 f7c9eb9c
        c0434f14 0000015c 9135fb0a 000001b3 00000000 00000000 bfe5d324 f6539f08
Call Trace:
  [<c01d83c0>] xfs_dir2_put_dirent64_direct+0x0/0xb0
  [<c01d83c0>] xfs_dir2_put_dirent64_direct+0x0/0xb0
  [<c01e8434>] xfs_iunlock+0x84/0x90
  [<c0209053>] xfs_readdir+0x53/0x70
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c012d9f9>] ktime_get_ts+0x19/0x50
  [<c022f282>] copy_to_user+0x32/0x50
  [<c012e740>] hrtimer_nanosleep+0x90/0xe0
  [<c012e670>] hrtimer_wakeup+0x0/0x20
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
cron          D 000081A4     0  2847      1 (NOTLB)
        f661bea0 00000082 04003fff 000081a4 00000001 00000000 00000000 060243d5
        00000000 00000274 91340354 000001b3 00004c36 00000000 f64c60b0 f7efd15c
        00000073 00000155 9134b7d5 000001b3 00000000 00000000 bfb8e8d4 f661bf08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c012d9f9>] ktime_get_ts+0x19/0x50
  [<c022f282>] copy_to_user+0x32/0x50
  [<c012e740>] hrtimer_nanosleep+0x90/0xe0
  [<c012e670>] hrtimer_wakeup+0x0/0x20
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
miniserv.pl   D F6F02CA0     0  2864      1 (NOTLB)
        f6631ea0 00000082 f6631eb8 f6f02ca0 f6631e70 c0437200 f667cc00 f6631eb8
        f66647c0 c02101b8 00000001 f6631f00 00000001 00000000 f6520530 c19b365c
        f6631eb0 0000037c 91361374 000001b3 00000000 00000000 00000000 f6631f08
Call Trace:
  [<c02101b8>] xfs_file_aio_read+0x78/0x90
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c011d7bb>] do_wait+0x1bb/0x3b0
  [<c022f282>] copy_to_user+0x32/0x50
  [<c016a6f9>] sys_select+0x149/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
apache        D C173CD20     0  2951      1 (NOTLB)
        f65e7ea0 00000082 00000246 c173cd20 c0434da8 c0146a8f c0434dcc 00000000
        00000000 c0434da8 9133bf89 000001b3 00006e15 00000000 f7cba550 f64c61bc
        00000074 0000013e 9134c442 000001b3 00000000 00000000 00000000 f65e7f08
Call Trace:
  [<c0146a8f>] buffered_rmqueue+0x9f/0x100
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014eb79>] do_wp_page+0x299/0x3b0
  [<c011d7bb>] do_wait+0x1bb/0x3b0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c022f282>] copy_to_user+0x32/0x50
  [<c016a6f9>] sys_select+0x149/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
apache        D C1746100     0  2952   2951 (NOTLB)
        f66a5ea0 00000086 00000246 c1746100 c0434da8 c0146a8f c0434dcc 00000000
        000000b4 c0142b08 c0434f18 00000044 c0434f14 c014290b f7c9fab0 f652063c
        00000000 00000290 91362568 000001b3 00000000 00000000 00000000 f66a5f08
Call Trace:
  [<c0146a8f>] buffered_rmqueue+0x9f/0x100
  [<c0142b08>] find_lock_page+0x18/0x70
  [<c014290b>] unlock_page+0x1b/0x30
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f682>] do_no_page+0x162/0x280
  [<c014f9da>] __handle_mm_fault+0xfa/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c016a659>] sys_select+0xa9/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
apache        D C1746480     0  2953   2951 (NOTLB)
        f66d5ea0 00000082 00000246 c1746480 c0434da8 c0146a8f c0434dcc 00000000
        000000b4 c0142b08 c0434f18 00000044 c0434f14 c014290b f64fea50 f7c9fbbc
        00000000 00000211 913633e0 000001b3 00000000 00000000 00000000 f66d5f08
Call Trace:
  [<c0146a8f>] buffered_rmqueue+0x9f/0x100
  [<c0142b08>] find_lock_page+0x18/0x70
  [<c014290b>] unlock_page+0x1b/0x30
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f682>] do_no_page+0x162/0x280
  [<c014f9da>] __handle_mm_fault+0xfa/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c022f282>] copy_to_user+0x32/0x50
  [<c0108389>] sys_ipc+0x49/0x280
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c01158e0>] do_page_fault+0x0/0x590
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
apache        D C1746800     0  2954   2951 (NOTLB)
        f66d7ea0 00000086 00000246 c1746800 c0434da8 c0146a8f c0434dcc 00000000
        000000b4 c0142b08 c0434f18 00000044 c0434f14 c014290b f6520030 f64feb5c
        00000000 0000012f 91363c2b 000001b3 00000000 00000000 00000000 f66d7f08
Call Trace:
  [<c0146a8f>] buffered_rmqueue+0x9f/0x100
  [<c0142b08>] find_lock_page+0x18/0x70
  [<c014290b>] unlock_page+0x1b/0x30
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f682>] do_no_page+0x162/0x280
  [<c014f9da>] __handle_mm_fault+0xfa/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c022f282>] copy_to_user+0x32/0x50
  [<c0108389>] sys_ipc+0x49/0x280
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c01158e0>] do_page_fault+0x0/0x590
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
apache        D C1741AC0     0  2955   2951 (NOTLB)
        f66d9ea0 00000086 00000246 c1741ac0 c0434da8 c0146a8f c0434dcc 00000000
        000000b4 c0142b08 c0434f18 00000044 c0434f14 c014290b f64fe050 f652013c
        00000000 0000013c 913644d2 000001b3 00000000 00000000 00000000 f66d9f08
Call Trace:
  [<c0146a8f>] buffered_rmqueue+0x9f/0x100
  [<c0142b08>] find_lock_page+0x18/0x70
  [<c014290b>] unlock_page+0x1b/0x30
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f682>] do_no_page+0x162/0x280
  [<c014f9da>] __handle_mm_fault+0xfa/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c022f282>] copy_to_user+0x32/0x50
  [<c0108389>] sys_ipc+0x49/0x280
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c01158e0>] do_page_fault+0x0/0x590
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
apache        D C174FBE0     0  2960   2951 (NOTLB)
        f66e3ea0 00000082 00000246 c174fbe0 c0434da8 c0146a8f 00000246 00000000
        000000b4 c0142b08 c0434f18 00000044 c0434f14 c014290b f64fe550 f64fe15c
        00000000 0000012b 91364cff 000001b3 00000000 00000000 00000000 f66e3f08
Call Trace:
  [<c0146a8f>] buffered_rmqueue+0x9f/0x100
  [<c0142b08>] find_lock_page+0x18/0x70
  [<c014290b>] unlock_page+0x1b/0x30
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014f682>] do_no_page+0x162/0x280
  [<c014f9da>] __handle_mm_fault+0xfa/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c022f282>] copy_to_user+0x32/0x50
  [<c0108389>] sys_ipc+0x49/0x280
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c01158e0>] do_page_fault+0x0/0x590
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
munin-node    D 00000000     0  3075      1 (NOTLB)
        f670bea0 00000086 00000000 00000000 00000000 00000000 00000000 00000000
        00000000 00000000 00000000 00000000 00000000 00000000 f64ff090 f64fe65c
        00000000 000002b1 91365fd8 000001b3 00000000 00000000 00000000 f670bf08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c014eb79>] do_wp_page+0x299/0x3b0
  [<c014fa96>] __handle_mm_fault+0x1b6/0x1e0
  [<c0115c7b>] do_page_fault+0x39b/0x590
  [<c022f282>] copy_to_user+0x32/0x50
  [<c016a6f9>] sys_select+0x149/0x170
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
getty         D F64FF19C     0  3095      1 (NOTLB)
        f677dea0 00000086 f66f55b0 f64ff19c 00000000 001179b7 abc5f29a 0000000f
        c1a5e008 7fffffff f77ed800 f677defc bfec340b c036a025 f66f55b0 f64ff19c
        f77ed800 000002b0 913672ae 000001b3 00000000 00000000 0804b214 f677df08
Call Trace:
  [<c036a025>] schedule_timeout+0x95/0xa0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c027fcb4>] tty_write+0x94/0x1f0
  [<c027fbe2>] tty_read+0x82/0xc0
  [<c015ea26>] vfs_read+0x86/0x110
  [<c015ed07>] sys_read+0x47/0x80
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
getty         D F66F56BC     0  3096      1 (NOTLB)
        f677fea0 00000086 f66df570 f66f56bc f66ef400 00002ba6 abc72425 0000000f
        f66ef400 7fffffff f66ef400 f677fefc bf98d6cb c036a025 f66df570 f66f56bc
        f665e008 0000017c 91367d18 000001b3 00000000 00000000 0804b214 f677ff08
Call Trace:
  [<c036a025>] schedule_timeout+0x95/0xa0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c027fcb4>] tty_write+0x94/0x1f0
  [<c027fbe2>] tty_read+0x82/0xc0
  [<c015ea26>] vfs_read+0x86/0x110
  [<c015ed07>] sys_read+0x47/0x80
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
getty         D F66DF67C     0  3097      1 (NOTLB)
        f6781ea0 00000082 f66e8090 f66df67c f651f000 0000232c abc81a59 0000000f
        f651f000 7fffffff f651f000 f6781efc bfb5989b c036a025 f66e8090 f66df67c
        f651f408 00000171 91368732 000001b3 00000000 00000000 0804b214 f6781f08
Call Trace:
  [<c036a025>] schedule_timeout+0x95/0xa0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c027fcb4>] tty_write+0x94/0x1f0
  [<c027fbe2>] tty_read+0x82/0xc0
  [<c015ea26>] vfs_read+0x86/0x110
  [<c015ed07>] sys_read+0x47/0x80
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
getty         D F66E819C     0  3098      1 (NOTLB)
        f6783ea0 00000086 f66e7ad0 f66e819c f6f06c00 00002bab abc94c07 0000000f
        f6f06c00 7fffffff f6f06c00 f6783efc bfffad3b c036a025 f66e7ad0 f66e819c
        f66eac08 0000016f 9136913e 000001b3 00000000 00000000 0804b214 f6783f08
Call Trace:
  [<c036a025>] schedule_timeout+0x95/0xa0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c027fcb4>] tty_write+0x94/0x1f0
  [<c027fbe2>] tty_read+0x82/0xc0
  [<c015ea26>] vfs_read+0x86/0x110
  [<c015ed07>] sys_read+0x47/0x80
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
getty         D F66E7BDC     0  3102      1 (NOTLB)
        f67a9ea0 00000082 f66f50b0 f66e7bdc f66f2000 00002c4c abca821e 0000000f
        f66f2000 7fffffff f66f2000 f67a9efc bf878dbb c036a025 f66f50b0 f66e7bdc
        f668b008 0000015c 91369ac3 000001b3 00000000 00000000 0804b214 f67a9f08
Call Trace:
  [<c036a025>] schedule_timeout+0x95/0xa0
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c027fcb4>] tty_write+0x94/0x1f0
  [<c027fbe2>] tty_read+0x82/0xc0
  [<c015ea26>] vfs_read+0x86/0x110
  [<c015ed07>] sys_read+0x47/0x80
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
getty         D F66F51BC     0  3103      1 (NOTLB)
        f67abea0 00000082 00000000 f66f51bc c195b800 00002659 abcb8e91 0000000f
        c195b800 7fffffff 91346dc8 000001b3 0000efed 00000000 f7cb15b0 f66f51bc
        00000079 0000019c 9136a60a 000001b3 00000000 00000000 0804b214 f67abf08
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
  [<c0103ca5>] do_signal+0x65/0x140
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c027fcb4>] tty_write+0x94/0x1f0
  [<c027fbe2>] tty_read+0x82/0xc0
  [<c015ea26>] vfs_read+0x86/0x110
  [<c015ed07>] sys_read+0x47/0x80
  [<c0103dbc>] do_notify_resume+0x3c/0x40
  [<c0103f7e>] work_notifysig+0x13/0x19
  =======================
hibernate     D F65F0C64     0  3907   2761 (NOTLB)
        f67a5cd0 00000082 f786c800 f65f0c64 f7ea23b8 c02bcdef f7ea2040 c1ad1180
        f78b3400 f7ea23b8 5b7f815f 00000009 00003442 00000000 c19615d0 f73a5b5c
        0000006e 0005cef4 5b7f815f 00000009 f7ea23b8 f67a5db8 f65f0c64 f67a5cec
Call Trace:
  [<c02bcdef>] scsi_prep_fn+0x8f/0x130
  [<c0369a24>] wait_for_completion+0x64/0xa0
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c021f85b>] __generic_unplug_device+0x2b/0x30
  [<c01175e0>] default_wake_function+0x0/0x10
  [<c02207a7>] blk_execute_rq+0xa7/0xe0
  [<c0220970>] blk_end_sync_rq+0x0/0x30
  [<c011ece2>] irq_exit+0x42/0x70
  [<c0111b80>] smp_apic_timer_interrupt+0x30/0x40
  [<c01048b8>] apic_timer_interrupt+0x28/0x30
  [<c02bbd48>] scsi_execute+0xb8/0x110
  [<c02bbe0b>] scsi_execute_req+0x6b/0x90
  [<c02c3cc0>] sd_start_stop_device+0x70/0x120
  [<c011ae17>] printk+0x17/0x20
  [<c02c4045>] sd_resume+0x55/0xa0
  [<c02c022f>] scsi_bus_resume+0x6f/0x80
  [<c029da36>] resume_device+0x136/0x190
  [<c022aaf5>] kobject_get+0x15/0x20
  [<c0297ad1>] get_device+0x11/0x20
  [<c029dbad>] dpm_resume+0xbd/0xc0
  [<c029dbcb>] device_resume+0x1b/0x40
  [<c013c043>] hibernate+0x103/0x1a0
  [<c013b205>] state_store+0xc5/0x100
  [<c013b140>] state_store+0x0/0x100
  [<c0194080>] sysfs_write_file+0x0/0x80
  [<c0193e7f>] subsys_attr_store+0x3f/0x50
  [<c019406e>] flush_write_buffer+0x2e/0x40
  [<c01940e5>] sysfs_write_file+0x65/0x80
  [<c015ec39>] vfs_write+0x89/0x110
  [<c015ed87>] sys_write+0x47/0x80
  [<c0168c9b>] sys_dup2+0x9b/0xd0
  [<c0103ef0>] syscall_call+0x7/0xb
  =======================



* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-13 11:16                               ` David Greaves
@ 2007-06-13 21:04                                 ` Linus Torvalds
  2007-06-13 21:22                                   ` Jeff Garzik
  2007-06-13 22:02                                   ` David Greaves
  2007-06-14  0:28                                 ` David Chinner
  1 sibling, 2 replies; 48+ messages in thread
From: Linus Torvalds @ 2007-06-13 21:04 UTC (permalink / raw)
  To: David Greaves
  Cc: David Chinner, Tejun Heo, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik



On Wed, 13 Jun 2007, David Greaves wrote:
> 
> git-bisect bad
> 9666f4009c22f6520ac3fb8a19c9e32ab973e828 is first bad commit
> commit 9666f4009c22f6520ac3fb8a19c9e32ab973e828
> Author: Tejun Heo <htejun@gmail.com>
> Date:   Fri May 4 21:27:47 2007 +0200
> 
>     libata: reimplement suspend/resume support using sdev->manage_start_stop
> 
> Good.

Ok, good. So the bug is apparently in the generic SCSI layer start/stop
handling. I'm not entirely surprised, most people would never have 
triggered it (I _think_ it's disabled by default for all devices, and that 
the libata-scsi.c change was literally the first thing to ever enable it 
by default for anything!)

> So here's a sysrq-t from a failed resume. Ask if you'd like anything else...

I'm not seeing anything really obvious. The traces would probably look 
better if you enabled CONFIG_FRAME_POINTER, though. That should cut down 
on some of the noise and make the traces a bit more readable.

"hibernate" is definitely stuck on the new code: it's in the 
"sd_start_stop_device()" call-chain, but I note that ata_aux at the same 
time is also doing some sd_spinup_disk logic as part of rescanning. Maybe 
that's part of the confusion: trying to rescan the bus at the same time 
upper layers (who already *know* the disks that are there) are trying to 
spin up the devices.

Tejun? Jeff?

		Linus

--- some per-thread commentary ---

There's the worrisome thing we've seen before, with the "events" thread 
apparently busy-looping:

> events/0      R running     0     5      2 (L-TLB)

but the more fundamental problem would seem to be ata_aux being in some 
long (infinite?) disk wait:

> ata_aux       D F65F09A4     0   122      2 (L-TLB)
>        c19dfd20 00000046 f786c800 f65f09a4 f7ea23b8 c02bcdef ce2e3b9f 4d64703f
>        ffffcfff f7ea23b8 00000082 f7ea2418 0005eb1d 00000082 f7c9f0b0 c196913c
>        f786c800 000003d4 605db977 00000009 f7ea23b8 c19dfe08 f65f09a4 c19dfd3c
> Call Trace:
>  [<c02bcdef>] scsi_prep_fn+0x8f/0x130
>  [<c0369a24>] wait_for_completion+0x64/0xa0
>  [<c01175e0>] default_wake_function+0x0/0x10
>  [<c021f85b>] __generic_unplug_device+0x2b/0x30
>  [<c01175e0>] default_wake_function+0x0/0x10
>  [<c02207a7>] blk_execute_rq+0xa7/0xe0
>  [<c0220970>] blk_end_sync_rq+0x0/0x30
>  [<c0146a8f>] buffered_rmqueue+0x9f/0x100
>  [<c0146c50>] get_page_from_freelist+0x80/0xc0
>  [<c02bbd48>] scsi_execute+0xb8/0x110
>  [<c02bbe0b>] scsi_execute_req+0x6b/0x90
>  [<c02c22f6>] sd_spinup_disk+0x76/0x440
>  [<c02c36fe>] sd_revalidate_disk+0x6e/0x160
>  [<c02c17a4>] __scsi_disk_get+0x34/0x40
>  [<c02c200d>] sd_rescan+0x1d/0x40
>  [<c02bf350>] scsi_rescan_device+0x40/0x50
>  [<c02ce39c>] ata_scsi_dev_rescan+0x5c/0x70
>  [<c02ce340>] ata_scsi_dev_rescan+0x0/0x70
>  [<c0127f2a>] run_workqueue+0x4a/0xf0
>  [<c01280dd>] worker_thread+0xcd/0xf0
>  [<c012b5e0>] autoremove_wake_function+0x0/0x50
>  [<c012b5e0>] autoremove_wake_function+0x0/0x50
>  [<c0128010>] worker_thread+0x0/0xf0
>  [<c012b20a>] kthread+0x6a/0x70
>  [<c012b1a0>] kthread+0x0/0x70
>  [<c0104a0b>] kernel_thread_helper+0x7/0x3c


And scsi_eh_4/5 seem to be potentially doing something too:

> scsi_eh_4     S F786C800     0   803      2 (L-TLB)
>        f7fc3fc0 00000046 f786c800 f786c800 f7ea23b8 c02bd166 f7964044 f7ea23b8
>        00000292 c021f7f2 6154d436 00000009 002f304e 00000000 c1932550 f7d0915c
>        0000006e 000ec1d6 6154d436 00000009 00000000 f7964000 c02bb740 fffffffc
> Call Trace:
>  [<c02bd166>] scsi_request_fn+0x196/0x280
>  [<c021f7f2>] blk_remove_plug+0x32/0x70
>  [<c02bb740>] scsi_error_handler+0x0/0xa0
>  [<c02bb781>] scsi_error_handler+0x41/0xa0
>  [<c02bb740>] scsi_error_handler+0x0/0xa0
>  [<c012b20a>] kthread+0x6a/0x70
>  [<c012b1a0>] kthread+0x0/0x70
>  [<c0104a0b>] kernel_thread_helper+0x7/0x3c
>  =======================
> scsi_eh_5     S F786C400     0   805      2 (L-TLB)
>        f7e3ffc0 00000046 f786c400 f786c400 f7ea2730 c02bd166 f786cc44 f7ea2730
>        00000292 c021f7f2 67583eed 00000009 002f0e42 00000000 c19615d0 f7c9f1bc
>        0000006e 000ec010 67583eed 00000009 00000000 f786cc00 c02bb740 fffffffc
> Call Trace:
>  [<c02bd166>] scsi_request_fn+0x196/0x280
>  [<c021f7f2>] blk_remove_plug+0x32/0x70
>  [<c02bb740>] scsi_error_handler+0x0/0xa0
>  [<c02bb781>] scsi_error_handler+0x41/0xa0
>  [<c02bb740>] scsi_error_handler+0x0/0xa0
>  [<c012b20a>] kthread+0x6a/0x70
>  [<c012b1a0>] kthread+0x0/0x70
>  [<c0104a0b>] kernel_thread_helper+0x7/0x3c

.. and here it's starting to get interesting: what is md0_raid5 hung on?

> md0_raid5     D 1D3836C7     0   836      2 (L-TLB)
>        f7cbfdf0 00000046 f7ea3888 1d3836c7 00000008 c0221036 06a47600 00011210
>        c191ede0 c01454ba 5324bddc 00000009 0002fb08 00000000 f7d09050 f7e2c6bc
>        0000006e 0000320f 532bcd0b 00000009 f7ea3888 c1a4f000 f7cbfe18 c1a4f13c
> Call Trace:
>  [<c0221036>] generic_make_request+0x146/0x1d0
>  [<c01454ba>] mempool_alloc+0x2a/0xc0
>  [<c02f639e>] md_super_wait+0x7e/0xc0
>  [<c012b5e0>] autoremove_wake_function+0x0/0x50
>  [<c017eff1>] bio_clone+0x31/0x40
>  [<c012b5e0>] autoremove_wake_function+0x0/0x50
>  [<c02fe880>] write_sb_page+0x50/0x80
>  [<c02fe9c2>] write_page+0x112/0x120
>  [<c02f8247>] sync_sbs+0x77/0xe0
>  [<c02fed09>] bitmap_update_sb+0x69/0xa0
>  [<c02f83e8>] md_update_sb+0x138/0x2c0
>  [<c0369725>] schedule+0x2e5/0x580
>  [<c02fe2dd>] md_check_recovery+0x2dd/0x360
>  [<c02f11e0>] raid5d+0x10/0xe0
>  [<c02fc7c5>] md_thread+0x55/0x110
>  [<c012b5e0>] autoremove_wake_function+0x0/0x50
>  [<c012b5e0>] autoremove_wake_function+0x0/0x50
>  [<c02fc770>] md_thread+0x0/0x110
>  [<c012b20a>] kthread+0x6a/0x70
>  [<c012b1a0>] kthread+0x0/0x70
>  [<c0104a0b>] kernel_thread_helper+0x7/0x3c

and here's the hibernate daemon itself, doing the "sd_start_stop_device()" 
that is supposed to get the ball rolling, but it seems to be another 
infinite wait:

> hibernate     D F65F0C64     0  3907   2761 (NOTLB)
>        f67a5cd0 00000082 f786c800 f65f0c64 f7ea23b8 c02bcdef f7ea2040 c1ad1180
>        f78b3400 f7ea23b8 5b7f815f 00000009 00003442 00000000 c19615d0 f73a5b5c
>        0000006e 0005cef4 5b7f815f 00000009 f7ea23b8 f67a5db8 f65f0c64 f67a5cec
> Call Trace:
>  [<c02bcdef>] scsi_prep_fn+0x8f/0x130
>  [<c0369a24>] wait_for_completion+0x64/0xa0
>  [<c01175e0>] default_wake_function+0x0/0x10
>  [<c021f85b>] __generic_unplug_device+0x2b/0x30
>  [<c01175e0>] default_wake_function+0x0/0x10
>  [<c02207a7>] blk_execute_rq+0xa7/0xe0
>  [<c0220970>] blk_end_sync_rq+0x0/0x30
>  [<c011ece2>] irq_exit+0x42/0x70
>  [<c0111b80>] smp_apic_timer_interrupt+0x30/0x40
>  [<c01048b8>] apic_timer_interrupt+0x28/0x30
>  [<c02bbd48>] scsi_execute+0xb8/0x110
>  [<c02bbe0b>] scsi_execute_req+0x6b/0x90
>  [<c02c3cc0>] sd_start_stop_device+0x70/0x120
>  [<c011ae17>] printk+0x17/0x20
>  [<c02c4045>] sd_resume+0x55/0xa0
>  [<c02c022f>] scsi_bus_resume+0x6f/0x80
>  [<c029da36>] resume_device+0x136/0x190
>  [<c022aaf5>] kobject_get+0x15/0x20
>  [<c0297ad1>] get_device+0x11/0x20
>  [<c029dbad>] dpm_resume+0xbd/0xc0
>  [<c029dbcb>] device_resume+0x1b/0x40
>  [<c013c043>] hibernate+0x103/0x1a0
>  [<c013b205>] state_store+0xc5/0x100
>  [<c013b140>] state_store+0x0/0x100
>  [<c0194080>] sysfs_write_file+0x0/0x80
>  [<c0193e7f>] subsys_attr_store+0x3f/0x50
>  [<c019406e>] flush_write_buffer+0x2e/0x40
>  [<c01940e5>] sysfs_write_file+0x65/0x80
>  [<c015ec39>] vfs_write+0x89/0x110
>  [<c015ed87>] sys_write+0x47/0x80
>  [<c0168c9b>] sys_dup2+0x9b/0xd0
>  [<c0103ef0>] syscall_call+0x7/0xb
>  =======================
> 
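
The triage done by hand in the message above (find the tasks in D state that are
not simply parked in refrigerator()) can be sketched as a small script. This is
only an illustration, not something posted in the thread: the names
parse_sysrq_t, blocked_outside_freezer and SAMPLE are hypothetical, and the
regular expressions assume the 2.6.22-era "Show State" layout seen in this dump.

```python
import re

# Task header, e.g. "hibernate     D F65F0C64     0  3907   2761 (NOTLB)".
# Leading "> " quote markers are tolerated so mail text can be pasted as-is.
TASK_RE = re.compile(
    r"^(?:>\s?)*\s*(?P<comm>\S+)\s+(?P<state>[RSDTZX])\s+(?P<pc>\S+)"
    r"\s+(?P<free>\d+)\s+(?P<pid>\d+)\s+(?P<parent>\d+)\s+\((?P<kind>[^)]+)\)"
)
# Stack frame, e.g. "[<c02c3cc0>] sd_start_stop_device+0x70/0x120".
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(?P<func>[\w.]+)\+0x")


def parse_sysrq_t(text):
    """Split a sysrq-t dump into tasks, each with its list of frame symbols."""
    tasks = []
    current = None
    for line in text.splitlines():
        header = TASK_RE.match(line)
        if header:
            current = {
                "comm": header["comm"],
                "state": header["state"],
                "pid": int(header["pid"]),
                "frames": [],
            }
            tasks.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current["frames"].append(frame["func"])
    return tasks


def blocked_outside_freezer(tasks):
    """Tasks in uninterruptible sleep (D) whose trace never hits refrigerator()."""
    return [
        t for t in tasks
        if t["state"] == "D" and "refrigerator" not in t["frames"]
    ]


# Hypothetical excerpt, cut down from the dump in this thread.
SAMPLE = """\
cron          D 000081A4     0  2847      1 (NOTLB)
Call Trace:
  [<c013b27c>] refrigerator+0x3c/0x50
  [<c01245f7>] get_signal_to_deliver+0x227/0x230
hibernate     D F65F0C64     0  3907   2761 (NOTLB)
Call Trace:
  [<c0369a24>] wait_for_completion+0x64/0xa0
  [<c02c3cc0>] sd_start_stop_device+0x70/0x120
  [<c02c4045>] sd_resume+0x55/0xa0
events/0      R running     0     5      2 (L-TLB)
"""

if __name__ == "__main__":
    for task in blocked_outside_freezer(parse_sysrq_t(SAMPLE)):
        print(task["pid"], task["comm"], " <- ".join(task["frames"]))
```

Without CONFIG_FRAME_POINTER the traces contain stale stack entries, so a
filter like this only narrows the list of suspects; the surviving traces still
have to be read by hand.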


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-13 21:04                                 ` Linus Torvalds
@ 2007-06-13 21:22                                   ` Jeff Garzik
  2007-06-13 22:02                                   ` David Greaves
  1 sibling, 0 replies; 48+ messages in thread
From: Jeff Garzik @ 2007-06-13 21:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Greaves, David Chinner, Tejun Heo, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

Linus Torvalds wrote:
> Ok, good. So the bug is apparently in the generic SCSI layer start/stop
> handling. I'm not entirely surprised, most people would never have 
> triggered it (I _think_ it's disabled by default for all devices, and that 
> the libata-scsi.c change was literally the first thing to ever enable it 
> by default for anything!)


I haven't looked at this yet, but wanted to confirm your assessment 
here:  libata was indeed the first (and still only?) user of this code path.

Since some SCSI devices may not be owned by the host computer, in the 
power management sense, we don't want to turn that on for all SCSI 
devices.  Otherwise you wind up powering off a device in another building :)

This is basically the libata suspend/resume path, even though bits touch 
generic SCSI.

	Jeff
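
To see which disks currently have start/stop management turned on, something
like the following sketch may help. It assumes the sd driver exposes the flag
as a per-disk sysfs attribute at /sys/class/scsi_disk/<H:C:T:L>/manage_start_stop;
that path is an assumption here and is not stated in this thread, and the
function name read_manage_start_stop is made up for the example.

```python
from pathlib import Path


def read_manage_start_stop(sysfs_root="/sys/class/scsi_disk"):
    """Return {device_id: value} for each disk exposing manage_start_stop.

    The sysfs location is an assumption; disks without a readable
    attribute are skipped rather than reported as errors.
    """
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return result
    for dev in sorted(root.iterdir()):
        attr = dev / "manage_start_stop"
        try:
            result[dev.name] = attr.read_text().strip()
        except OSError:
            # Attribute missing or unreadable on this kernel: skip it.
            continue
    return result


if __name__ == "__main__":
    for dev, value in read_manage_start_stop().items():
        print(dev, value)
```

A value of 1 would mean the SCSI disk layer issues START/STOP for that device
across suspend and resume, which is the path libata switched on in the
bisected commit.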




* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-13 21:04                                 ` Linus Torvalds
  2007-06-13 21:22                                   ` Jeff Garzik
@ 2007-06-13 22:02                                   ` David Greaves
  2007-06-13 22:12                                     ` Linus Torvalds
  1 sibling, 1 reply; 48+ messages in thread
From: David Greaves @ 2007-06-13 22:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Chinner, Tejun Heo, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

Linus Torvalds wrote:
> 
> On Wed, 13 Jun 2007, David Greaves wrote:
>> git-bisect bad
>> 9666f4009c22f6520ac3fb8a19c9e32ab973e828 is first bad commit
>> commit 9666f4009c22f6520ac3fb8a19c9e32ab973e828
>> Author: Tejun Heo <htejun@gmail.com>
>> Date:   Fri May 4 21:27:47 2007 +0200
>>
>>     libata: reimplement suspend/resume support using sdev->manage_start_stop
>>
>> Good.
> 
> Ok, good. So the bug is apparently in the generic SCSI layer start/stop
> handling. I'm not entirely surprised, most people would never have 
> triggered it (I _think_ it's disabled by default for all devices, and that 
> the libata-scsi.c change was literally the first thing to ever enable it 
> by default for anything!)
> 
>> So here's a sysrq-t from a failed resume. Ask if you'd like anything else...
> 
> I'm not seeing anything really obvious. The traces would probably look 
> better if you enabled CONFIG_FRAME_POINTER, though. That should cut down 
> on some of the noise and make the traces a bit more readable.

I can do that...

> "hibernate" is definitely stuck on the new code: it's in the 
> "sd_start_stop_device()" call-chain, but I note that ata_aux at the same 
> time is also doing some sd_spinup_disk logic as part of rescanning. Maybe 
> that's part of the confusion: trying to rescan the bus at the same time 
> upper layers (who already *know* the disks that are there) are trying to 
> spin up the devices.
> 
> Tejun? Jeff?

SysRq : Show State

                          free                        sibling
   task             PC    stack   pid father child younger older
init          D 28D10C50     0     1      0 (NOTLB)
        c1941ea0 00000082 46706775 28d10c50 46706775 28d10c50 00001000 f64b4000
        c1941e80 c04250e0 d5151017 00000018 00002e2e 00000000 f7cb15b0 c192eb3c
        00000073 00000207 d5157d78 00000018 c1941ea0 00000000 00000000 c1941f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
kthreadd      S C192E530     0     2      0 (L-TLB)
        c1943fd0 00000046 00000000 c192e530 c01175f0 00000000 f6967ed4 00000000
        00000003 00000292 f6967eb8 00000000 c1943fc0 00000292 f7122090 c192e63c
        c1943fc0 00000060 d3a629f3 0000000b c1943fd0 c042d298 00000000 00000000
Call Trace:
  [<c012b454>] kthreadd+0x74/0xa0
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
ksoftirqd/0   S C04903C8     0     3      2 (L-TLB)
        c1945fb0 00000046 00000000 c04903c8 c1945f70 00000046 c1932550 c192e140
        c1945f80 c011ee7c c6c1e024 00000000 c1945fa0 c011ec21 f6ba0ad0 c192e13c
        c1945fa0 00000db7 1ba6eda3 00000009 c1945fb0 00000000 c011ef80 fffffffc
Call Trace:
  [<c011effb>] ksoftirqd+0x7b/0x90
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
watchdog/0    S C0364CA5     0     4      2 (L-TLB)
        c1947fb0 00000046 c1943f70 c0364ca5 c1947f70 00000292 c048b0e0 c1932a50
        ac6b4e00 000011ef f7d635a7 00000008 c1947fb0 00000000 c1932550 c1932b5c
        c1947fa0 00000a3e 7beb1af2 00000004 c1947fb0 00000000 c0140310 fffffffc
Call Trace:
  [<c014035e>] watchdog+0x4e/0x70
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
events/0      R running     0     5      2 (L-TLB)
khelper       S 00000000     0     6      2 (L-TLB)
        c194bf60 00000046 00000000 00000000 c194bf20 00000001 f6bf8160 c0127a40
        c194bf30 c0127a67 f6bf8160 c1914c20 c194bf60 c0127ebd f65890b0 c193215c
        ad284200 000008d8 0b3c4782 0000000f 00000246 c1914c20 c194bf88 c1914c28
Call Trace:
  [<c012808c>] worker_thread+0xec/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
kblockd/0     S F7EA2438     0    35      2 (L-TLB)
        c19a7f60 00000046 c19146e0 f7ea2438 c19a7f20 c021ac9c c19146e0 c021acc0
        c19a7f30 c021acce 2bfa8b70 00000009 c19a7f60 c0127ebd c19b3a50 c19616dc
        0000006e 00000047 e0f8a696 00000010 00000246 c19146e0 c19a7f88 c19146e8
Call Trace:
  [<c012808c>] worker_thread+0xec/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
kacpid        S C192E530     0    36      2 (L-TLB)
        c19a9f60 00000046 c048b0e0 c192e530 c19a9f20 c0116be1 c192e530 7c841b53
        c19a9f40 c0116d2b 7c842264 00000004 00000087 00000000 c192e530 c19611dc
        00000078 0000013f 7c8423a7 00000004 00000246 c19145e0 c19a9f88 c19145e8
Call Trace:
  [<c012808c>] worker_thread+0xec/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
kacpi_notify  S C192E530     0    37      2 (L-TLB)
        c19abf60 00000046 c048b0e0 c192e530 c19abf20 c0116be1 c192e530 7c844295
        c19abf40 c0116d2b 7cc769c0 00000004 0000015e 00000000 c1932050 c1969b3c
        00000078 00000328 7cc76cfd 00000004 00000246 c19145a0 c19abf88 c19145a8
Call Trace:
  [<c012808c>] worker_thread+0xec/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
ata/0         S C192E530     0   121      2 (L-TLB)
        c19f5f60 00000046 00000001 c192e530 c19f5f20 c0116be1 c192e530 7dd68bdb
        c19f5f40 f7839f60 31d38c75 00000009 00023cf6 00000000 c19615d0 c1a021bc
        0000006e 0000391c 31d38c75 00000009 00000246 c1a1a9e0 c19f5f88 c1a1a9e8
Call Trace:
  [<c012808c>] worker_thread+0xec/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
ata_aux       D F7945000     0   122      2 (L-TLB)
        c19f7ce0 00000046 f7ea23b8 f7945000 c19f7ca0 c0121d67 c198d800 f7ea23b8
        c19f7cb0 c02261b7 f6af7964 f7ea23b8 c19f7cc0 c0293942 f7e865d0 c1a026bc
        c19f7cf0 000003ae 2dae0756 00000009 c19f7cf0 c19f7dc8 f6af7964 c19f7cfc
Call Trace:
  [<c0364fa4>] wait_for_completion+0x64/0xa0
  [<c021bb7d>] blk_execute_rq+0x8d/0xb0
  [<c02b7a88>] scsi_execute+0xb8/0x110
  [<c02b7b48>] scsi_execute_req+0x68/0x90
  [<c02bdefd>] sd_spinup_disk+0x6d/0x400
  [<c02bf25b>] sd_revalidate_disk+0x6b/0x160
  [<c02bdc1f>] sd_rescan+0x1f/0x30
  [<c02bb002>] scsi_rescan_device+0x42/0x50
  [<c02c9e10>] ata_scsi_dev_rescan+0x60/0x70
  [<c0127ebd>] run_workqueue+0x4d/0xf0
  [<c012806d>] worker_thread+0xcd/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
kseriod       D C0364CA5     0   123      2 (L-TLB)
        c19e9f50 00000046 f6723e40 c0364ca5 c19e9f60 00000046 c043fc08 00000000
        c19e9f30 c0440de0 d568e80c 00000018 00000db6 00000000 c19b3550 c1a02bbc
        0000006e 000000d3 d5690886 00000018 c19e9f50 00000000 c02d57a0 c19e9f98
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c02d5889>] serio_thread+0xe9/0x100
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
pdflush       D C0364CA5     0   145      2 (L-TLB)
        c1a49f50 00000046 f6a95f00 c0364ca5 c1a49f60 00000046 c048b0e0 c192e530
        c1a49f20 fffee524 d5691002 00000018 000021c7 00000000 f7cfda70 c196913c
        0000007d 00000088 d5696002 00000018 c1a49f50 00000000 c1a49f98 fffffffc
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c01487c6>] __pdflush+0x146/0x150
  [<c01487f5>] pdflush+0x25/0x30
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
pdflush       D C0364CA5     0   146      2 (L-TLB)
        c1a4bf50 00000046 f6811f80 c0364ca5 c1a4bf60 00000046 c048b0e0 c192e530
        00002afd c19615d0 d569120c 00000018 000018e4 00000000 f7cafa90 c19b3b5c
        00000073 00000129 b6997622 00000009 c1a4bf50 00000000 c1a4bf98 fffffffc
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c01487c6>] __pdflush+0x146/0x150
  [<c01487f5>] pdflush+0x25/0x30
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
kswapd0       D C0364CA5     0   147      2 (L-TLB)
        c1999f40 00000046 c19e9f50 c0364ca5 c1999f50 00000046 c1999f28 0000000d
        7054e600 0000120c d568ec56 00000018 00000ecb 00000000 c19b7070 c19b365c
        0000006e 000000af d5690f62 00000018 c1999f40 00000000 c0430f44 00000000
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c014b451>] kswapd+0xd1/0x100
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
aio/0         S C192E530     0   148      2 (L-TLB)
        c19d1f60 00000046 c048b0e0 c192e530 c19d1f20 c0116be1 c192e530 831e4802
        c19d1f40 c0116d2b 8320a84a 00000004 000000e6 00000000 c192e530 c19b315c
        00000078 0000025c 8320aa6d 00000004 00000246 c194de20 c19d1f88 c194de28
Call Trace:
  [<c012808c>] worker_thread+0xec/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
xfslogd/0     S C194D9E0     0   149      2 (L-TLB)
        c19d3f60 00000046 f74c8140 c194d9e0 c195f4e0 f74c8140 c194d9e0 c020a5f0
        c19d3f30 c020a629 fe7b2b4b f74c819c c19d3f60 c0127ebd f6ba0ad0 c19b7b7c
        00000073 00000486 d8c04420 00000018 00000246 c194d9e0 c19d3f88 c194d9e8
Call Trace:
  [<c012808c>] worker_thread+0xec/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
xfsdatad/0    S F6E1F240     0   150      2 (L-TLB)
        c19d5f60 00000046 c048b0e0 f6e1f240 c19d5f20 c1a31be0 c1a31968 c194d9a0
        c19d5f30 c0207f8e 832a8b34 c1a31c08 c19d5f60 c0127ebd c19b3a50 c19b767c
        00000078 00000077 d5327a77 00000018 00000246 c194d9a0 c19d5f88 c194d9a8
Call Trace:
  [<c012808c>] worker_thread+0xec/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
scsi_eh_0     S 00000001     0   774      2 (L-TLB)
        f7e1dfb0 00000046 c199d044 00000001 00000003 00000246 00000296 00000246
        00000000 00000246 c199d000 fffffffc f7e1df90 c02b822e c192ea30 f7d0915c
        f7e1dfb0 0019aedd 7711c10e 00000005 c199d000 c199d000 c02b7440 fffffffc
Call Trace:
  [<c02b7481>] scsi_error_handler+0x41/0xb0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
scsi_eh_1     S 00000001     0   776      2 (L-TLB)
        f7f63fb0 00000046 f7ff3c44 00000001 00000003 00000246 00000296 00000246
        00000000 00000246 f7ff3c00 fffffffc f7f63f90 c02b822e c192ea30 f7d76b9c
        f7f63fb0 0019dca3 96bbf4e7 00000005 f7ff3c00 f7ff3c00 c02b7440 fffffffc
Call Trace:
  [<c02b7481>] scsi_error_handler+0x41/0xb0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
scsi_eh_2     S 00000001     0   778      2 (L-TLB)
        f7d7ffb0 00000046 f7ff3844 00000001 00000003 00000246 00000296 00000246
        00000000 00000246 f7ff3800 fffffffc f7d7ff90 c02b822e c192ea30 f7f56b7c
        f7d7ffb0 001a10a7 b67304f7 00000005 f7ff3800 f7ff3800 c02b7440 fffffffc
Call Trace:
  [<c02b7481>] scsi_error_handler+0x41/0xb0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
scsi_eh_3     S 00000001     0   780      2 (L-TLB)
        c1973fb0 00000046 f7ff3444 00000001 00000003 00000246 00000296 00000246
        00000000 00000246 f7ff3400 fffffffc c1973f90 c02b822e c192ea30 f7d7217c
        c1973fb0 001c81ca d688c25d 00000005 f7ff3400 f7ff3400 c02b7440 fffffffc
Call Trace:
  [<c02b7481>] scsi_error_handler+0x41/0xb0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
scsi_eh_4     S F7945000     0   803      2 (L-TLB)
        c19a1fb0 00000046 c198d808 f7945000 c19a1f80 c02b44b2 00000296 00000246
        c198d800 c198d800 2ea53f8d 00000009 002f2c50 00000000 c19615d0 f7cb4bdc
        0000006e 000ec5ac 2ea53f8d 00000009 f7945000 f7945000 c02b7440 fffffffc
Call Trace:
  [<c02b7481>] scsi_error_handler+0x41/0xb0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
scsi_eh_5     S C198DC00     0   805      2 (L-TLB)
        f7957fb0 00000046 c198d408 c198dc00 f7957f80 c02b44b2 00000296 00000246
        c198d400 c198d400 34285c1c 00000009 002f23b7 00000000 c19615d0 f7e866dc
        0000006e 000ec3bc 34285c1c 00000009 c198dc00 c198dc00 c02b7440 fffffffc
Call Trace:
  [<c02b7481>] scsi_error_handler+0x41/0xb0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
kpsmoused     S C192E530     0   826      2 (L-TLB)
        f7f65f60 00000046 c048b0e0 c192e530 f7f65f20 c0116be1 c192e530 40af11dc
        f7f65f40 c0116d2b 40af2e63 00000006 000000b8 00000000 c1932050 c196963c
        00000078 0000002d 40af301a 00000006 00000246 f7898c20 f7f65f88 f7898c28
Call Trace:
  [<c012808c>] worker_thread+0xec/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
md0_raid5     D F7F4D000     0   836      2 (L-TLB)
        f78a3da0 00000046 f7f4d000 f7f4d000 f78a3da0 c01454b6 c048b0e0 f6ba0ad0
        f78a3d70 c0116be1 00010ad0 d5658227 f78a3d90 f7853000 f70c4ae0 f7f5617c
        f78a3de0 00003440 1be4a921 00000009 00000246 f7853000 f78a3dc8 f785313c
Call Trace:
  [<c02f1287>] md_super_wait+0x77/0xc0
  [<c02f993f>] write_sb_page+0x4f/0x80
  [<c02f9a72>] write_page+0x102/0x110
  [<c02f9db9>] bitmap_update_sb+0x89/0x90
  [<c02f3353>] md_update_sb+0x123/0x2a0
  [<c02f93c2>] md_check_recovery+0x302/0x340
  [<c02ec1f2>] raid5d+0x12/0xf0
  [<c02f7746>] md_thread+0x56/0x110
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
xfsbufd       D 00000018     0   838      2 (L-TLB)
        f74dff90 00000046 be2cd3b0 00000018 00000282 00000282 ffff3153 ffff3153
        f74dff90 c0365572 d568effd 00000018 00001133 00000000 f7e86ad0 c19b717c
        0000006e 000000ef d56918bc 00000018 f74dff90 00000000 00000000 f78981a0
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c020b453>] xfsbufd+0xf3/0x100
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
xfssyncd      D 00000017     0   839      2 (L-TLB)
        f75a5f80 00000046 6b45182c 00000017 00000282 00000282 ffff4dc1 ffff4dc1
        f75a5f80 c0365572 d568f322 00000018 00001289 00000000 f7cfd570 f7e86bdc
        0000006e 000000a1 d5691f0a 00000018 f75a5f80 00000000 f75a5fa8 c19c83dc
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0211088>] xfssyncd+0x158/0x160
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
udevd         D 00000005     0   920      1 (NOTLB)
        f7657ea0 00000082 08dd4c82 00000005 f7029005 c017bfc0 ffffffff ffffffff
        f7657e70 c017223e d5159415 00000018 00007ccf 00000000 f7e860d0 f7d72b7c
        00000074 0000052e d516bbb0 00000018 f7657ea0 00000000 00000000 f7657f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
ksuspend_usbd S C192E530     0  1727      2 (L-TLB)
        f7025f60 00000046 c048b0e0 c192e530 f7025f20 c0116be1 c192e530 5513327e
        f7025f40 c0116d2b 00000002 f70fc5c0 c048b0e0 00000003 f7cfd570 f7cac67c
        4ccd1400 0000006f 55e6d417 00000007 00000246 f77644a0 f7025f88 f77644a8
Call Trace:
  [<c012808c>] worker_thread+0xec/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
khubd         D C0364CA5     0  1728      2 (L-TLB)
        f71edf60 00000046 f75a5f80 c0364ca5 f71edf70 00000046 00000000 000003e8
        f71edf30 f881d3e2 d568f7b8 00000018 00001510 00000000 c1961ad0 f7cfd67c
        0000006e 0000010e d569299c 00000018 f71edf60 00000000 f88201c0 f71edf98
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<f8820216>] hub_thread+0x56/0xf0 [usbcore]
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
ksnapd        S C192E530     0  2131      2 (L-TLB)
        f7191f60 00000046 c048b0e0 c192e530 f7191f20 c0116be1 c192e530 d38fa60e
        f7191f40 c0116d2b cec67144 00000008 c048b0e0 00000003 f7d76090 f7122b9c
        3e9b5600 000000a6 d39e34de 00000008 00000246 f77ceda0 f7191f88 f77ceda8
Call Trace:
  [<c012808c>] worker_thread+0xec/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
kjournald     D C0364CA5     0  2180      2 (L-TLB)
        f7391f30 00000046 f71edf60 c0364ca5 f7391f40 00000046 f7355ce8 00000000
        f7391f20 c0117637 d568fb3b 00000018 00001738 00000000 f7d76090 c1961bdc
        0000006e 000000dc d569323c 00000018 f7391f30 00000000 c1961ad0 00000001
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c01ab19f>] kjournald+0xcf/0x1c0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
xfsbufd       D 00000018     0  2181      2 (L-TLB)
        f7327f90 00000046 a59ec715 00000018 00000282 00000282 ffff30ec ffff30ec
        f7327f90 c0365572 d568fe77 00000018 000018ed 00000000 f7caf090 f7d7619c
        0000006e 000000ba d5693980 00000018 f7327f90 00000000 00000000 f735eae0
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c020b453>] xfsbufd+0xf3/0x100
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
xfssyncd      D 00000011     0  2182      2 (L-TLB)
        f6853f80 00000046 756f731d 00000011 00000282 00000282 ffff34c1 ffff34c1
        f6853f80 c0365572 d569014d 00000018 00001a21 00000000 f7d61070 f7caf19c
        0000006e 00000091 d5693f32 00000018 f6853f80 00000000 f6853fa8 f735fc3c
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0211088>] xfssyncd+0x158/0x160
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
xfsbufd       D 00000018     0  2183      2 (L-TLB)
        f6fd1f90 00000046 b9683ccc 00000018 00000282 00000282 ffff313f ffff313f
        f6fd1f90 c0365572 d569044b 00000018 00001b3d 00000000 f70fc050 f7d6117c
        0000006e 0000008f d56944ce 00000018 f6fd1f90 00000000 00000000 f7764da0
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c020b453>] xfsbufd+0xf3/0x100
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
xfssyncd      D 00000011     0  2184      2 (L-TLB)
        f6811f80 00000046 7d105920 00000011 00000282 00000282 ffff34e1 ffff34e1
        f6811f80 c0365572 d568fa72 00000018 00002175 00000000 c19b3a50 f70fc15c
        00000073 0000007d d56949b2 00000018 f6811f80 00000000 f6811fa8 f735f73c
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0211088>] xfssyncd+0x158/0x160
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
portmap       D F64600A0     0  2344      1 (NOTLB)
        f6f8fea0 00000086 00000000 f64600a0 f6f8fe60 c0341701 f649f960 f64600a0
        f6f8fea0 c0343533 d5152861 00000018 00002e3f 00000000 f771a5d0 f7cb16bc
        00000073 00000271 d51595e9 00000018 f6f8fea0 00000000 0804ff78 f6f8ff08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
syslogd       D F77C12C0     0  2477      1 (NOTLB)
        f6e1bea0 00000086 ffffffff f77c12c0 00000000 00000000 00000000 00000000
        f7cb5a30 00000000 d515b595 00000018 0000799b 00000000 f6ac8090 f7cb5b3c
        00000075 00000093 d516d59a 00000018 f6e1bea0 00000000 00000000 f6e1bf08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
klogd         D 00000016     0  2483      1 (NOTLB)
        f75a7ea0 00000086 f7022a20 00000016 e4078600 00006353 f6ba0c80 f6ba0ad0
        f70b5550 f77e6700 d514f394 00000018 000031a7 00000000 c192ea30 f70b565c
        00000073 000000de d515692f 00000018 f75a7ea0 00000000 00000000 f75a7f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
sshd          D BFBEE000     0  2505      1 (NOTLB)
        f6849ea0 00200086 f68bd4c0 bfbee000 00000000 f6849e58 00000000 00000000
        f6fab910 f70b7a40 d515f375 00000018 000103ef 00000000 f7cb5530 f7cb513c
        00000077 000002e7 d5185b19 00000018 f6849ea0 00000000 00000000 f6849f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
exim4         D 0000000D     0  2545      1 (NOTLB)
        f68c9ea0 00000086 f68c9e78 0000000d c03edd32 000009f3 f7d79384 f7d61a70
        f70edd40 f70edd40 00c65dd8 00000004 f68c9e78 c0118fd2 f7d61a70 f70fcb5c
        00000000 000001f6 d516f1e6 00000018 f68c9ea0 00000000 00000000 f68c9f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
inetd         D F6885EB8     0  2551      1 (NOTLB)
        f6885ea0 00000082 f77e0340 f6885eb8 f6885e70 c01490aa 00000001 f6885f00
        f6885e70 c17c0d80 b7f30000 c17c0d80 f6885e80 c0149122 f7d09550 f7d61b7c
        00000000 000004a7 d5171278 00000018 f6885ea0 00000000 00000000 f6885f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
lpd           D 00000000     0  2555      1 (NOTLB)
        f68cbea0 00000086 00001844 00000000 f68cbe70 f68d8f14 00000001 00000000
        00000004 00000004 00000000 00000000 f68cbed0 c0171008 f7095590 f7d0965c
        00000000 000002b1 d5172550 00000018 f68cbea0 00000000 00000000 f68cbf08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
mysqld_safe   D BFD66000     0  2570      1 (NOTLB)
        f68f1ea0 00000082 f702abfc bfd66000 f68f1e90 c014d81a f7723bfc f75ea224
        bfd51000 bfd66000 d515fa2f 00000018 000105af 00000000 f7cb40d0 f7cb563c
        0000007a 000001cf d51865f6 00000018 f68f1ea0 00000000 081041c8 f68f1f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
mysqld        D 00000044     0  2607   2570 (NOTLB)
        f6883ea0 00000082 000200d0 00000044 00000001 c16de6c0 c0430f00 00000001
        000200d0 c0430f04 000000d0 f7095590 f6883eb0 c0146ced f7d72570 f709569c
        00000000 0000026e d5173654 00000018 f6883ea0 00000000 00000000 f6883f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
mysqld        D C17A5D40     0  2609   2570 (NOTLB)
        f6c99ea0 00000082 00000246 c17a5d40 f6c99e80 c0146aac 00000000 00000092
        f6c99eb0 c0493030 d5022392 00000018 000969ea 00000000 c192e030 f7cb41dc
        00000086 0000031a d5186f45 00000018 f6c99ea0 00000000 00000000 f6c99f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
mysqld        D C0142280     0  2610   2570 (NOTLB)
        f6c9bea0 00000082 00000000 c0142280 c18023d0 f7d72570 c17a5ea0 f6c9be94
        00000001 c0492be0 00000000 00000000 f7d72570 0000000b f70b5050 f7d7267c
        00000000 00000147 d5173f48 00000018 f6c9bea0 00000000 00000000 f6c9bf08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
mysqld        D C17DB760     0  2611   2570 (NOTLB)
        f6c9dea0 00000082 00000246 c17db760 f6c9de80 c0146aac 00000001 f6c9df00
        00000001 c0493180 d515301d 00000018 0000313f 00000000 f771aad0 f771a6dc
        00000073 0000017b d515a4c0 00000018 f6c9dea0 00000000 00000000 f6c9df08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
mysqld        D F7CAC070     0  2612   2570 (NOTLB)
        f6c9fea0 00000082 c048b0e0 f7cac070 f6c9fe60 c0116be1 f7cac070 deff0bf0
        f6c9fe80 c0492c90 00000000 00000000 f70b5050 00000005 f7ca9550 f70b515c
        00000000 000000fd d5174638 00000018 f6c9fea0 00000000 00000000 f6c9ff08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
mysqld        D 00000020     0  2619   2570 (NOTLB)
        f68f3ea0 00000082 c042cf00 00000020 f68f3e80 c0117637 00000000 f68f3ea8
        f68f3e80 c01232e0 d515379d 00000018 000032b5 00000000 f7cac070 f771abdc
        00000073 00000118 d515afb8 00000018 f68f3ea0 00000000 00000000 f68f3f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
mysqld        D 00000001     0  2620   2570 (NOTLB)
        f6cc9ea0 00000082 f7128e74 00000001 f6cc9e80 c0220656 00000100 00000000
        00000001 f7128e74 00000001 00000000 f6cc9e80 c0229da6 f7ca9050 f7ca965c
        00000000 00000183 d51750ce 00000018 f6cc9ea0 00000000 00000000 f6cc9f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
mysqld        D 00000005     0  2621   2570 (NOTLB)
        f6ca3ea0 00000082 00000034 00000005 c048b0e0 00000003 f7d72570 c048b0e0
        f6ca3e80 c0492750 d5153bbe 00000018 0000345c 00000000 f7134030 f7cac17c
        00000073 000000cd d515b7c2 00000018 f6ca3ea0 00000000 00000000 f6ca3f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
mysqld        D C0364CA5     0  2623   2570 (NOTLB)
        f6cebea0 00000082 f6ca3ea0 c0364ca5 f6cebeb0 00000082 f7095590 c048b0e0
        f6cebe80 00000046 d5154280 00000018 00003562 00000000 f70b60b0 f713413c
        00000073 000000eb d515c0f0 00000018 f6cebea0 00000000 00000008 f6cebf08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
logger        D 2A7A7965     0  2608   2570 (NOTLB)
        f68ebea0 00000086 4661e75f 2a7a7965 4661e75f 2a7a7965 00000000 00000000
        00000008 00000000 f6fabcd8 00000400 fffffe00 00000000 f70fc550 f7ca915c
        00000000 00000237 d5176054 00000018 f68ebea0 00000000 b7f11420 f68ebf08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
sshd          D F715C00C     0  2613   2505 (NOTLB)
        f6ca1ea0 00000082 f68bd620 f715c00c 00000000 f6ca1e58 00000000 00000000
        f6ca1eb0 00000001 d5154f02 00000018 00003edf 00000000 f7ca9a50 f70b61bc
        00000073 0000037f d515e3ea 00000018 f6ca1ea0 00000000 00000000 f6ca1f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
bash          D CE5E96DF     0  2622   2613 (NOTLB)
        f6ce9ea0 00000086 f6ba0ad0 ce5e96df f6ce9e70 c0116d2b 00000000 00000000
        c0430da8 c0430da8 d5155369 00000018 000040c2 00000000 f70b6ab0 f7ca9b5c
        00000073 000000e3 d515eccc 00000018 f6ce9ea0 00000000 080fff48 f6ce9f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
nrpe          D F64E9A00     0  2660      1 (NOTLB)
        f6f71ea0 00200082 00000008 f64e9a00 f6f71e80 c015f602 00000000 00000000
        f6f71e90 c0353d53 d5155acb 00000018 000045dc 00000000 f7f56570 f70b6bbc
        00000073 000001f2 d5160042 00000018 f6f71ea0 00000000 bfbe883c f6f71f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
nagios-statd  D F6CB6460     0  2668      1 (NOTLB)
        f6947ea0 00200086 00000008 f6cb6460 f6947e80 c015f602 00000000 00000000
        f6947e90 c0353d53 d51572ff 00000018 00004dce 00000000 f7122090 f7f5667c
        00000073 0000044d d5162b48 00000018 f6947ea0 00000000 bf800770 f6947f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
netserver     D F772B780     0  2671      1 (NOTLB)
        f6db3ea0 00000082 00000008 f772b780 f6db3e80 c015f602 00000000 00000000
        f6db3e90 c0353d53 d51575a0 00000018 000092ab 00000000 f7cb5a30 f7e861dc
        00000075 0000025e d516d100 00000018 f6db3ea0 00000000 bfd06990 f6db3f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
lockd         D F7CFDB7C     0  2682      2 (L-TLB)
        f6967eb0 00000046 f7caca70 f7cfdb7c f6967ec0 0000141f d3abfb17 0000000b
        f70796c0 7fffffff d5691332 00000018 000022a5 00000000 f7caca70 f7cfdb7c
        0000007d 00000540 d5696542 00000018 f6967eb0 00000000 f7557000 00000003
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<f94edd05>] svc_recv+0x385/0x410 [sunrpc]
  [<f913f461>] lockd+0x111/0x240 [lockd]
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
rpciod/0      S C192E530     0  2683      2 (L-TLB)
        f696df60 00000046 c048b0e0 c192e530 f696df20 f6bfc114 f6f897a0 f94e8fb0
        f696df30 f94e8fbe 00000002 f7cacae0 f696df60 c0127ebd f6ba0ad0 f7cb46dc
        989a3800 0000047b d07704eb 00000018 00000246 f6f897a0 f696df88 f6f897a8
Call Trace:
  [<c012808c>] worker_thread+0xec/0xf0
  [<c012b247>] kthread+0x67/0x70
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
nfsd          D 00000000     0  2684      2 (L-TLB)
        f69adf00 00000046 f69adee8 00000000 f69aded0 00000282 d3ac18fa 0000000b
        00000282 00000282 d56915fd 00000018 00002413 00000000 f7122590 f7cacb7c
        0000007d 0000062d d5696b6f 00000018 f69adf00 00000000 f7283000 00000022
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<f94edd05>] svc_recv+0x385/0x410 [sunrpc]
  [<f93b85ac>] nfsd+0xac/0x240 [nfsd]
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
nfsd          D 00000000     0  2685      2 (L-TLB)
        f69cff00 00000046 f69cfee8 00000000 f69cfed0 00000282 d3ac25f1 0000000b
        00000282 00000282 d569187c 00000018 00002518 00000000 f7d09a50 f712269c
        0000007d 000004e9 d5697058 00000018 f69cff00 00000000 f698f000 00000022
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<f94edd05>] svc_recv+0x385/0x410 [sunrpc]
  [<f93b85ac>] nfsd+0xac/0x240 [nfsd]
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
nfsd          D 00000000     0  2686      2 (L-TLB)
        f69eff00 00000046 f69efee8 00000000 f69efed0 00000282 d3ac30e9 0000000b
        00000282 00000282 d5691ae8 00000018 0000263a 00000000 f7cb1ab0 f7d09b5c
        0000007d 0000051c d5697574 00000018 f69eff00 00000000 f69b4000 00000022
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<f94edd05>] svc_recv+0x385/0x410 [sunrpc]
  [<f93b85ac>] nfsd+0xac/0x240 [nfsd]
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
nfsd          D 00000000     0  2687      2 (L-TLB)
        f6a11f00 00000046 f6a11ee8 00000000 f6a11ed0 00000282 d3ac3dcb 0000000b
        00000282 00000282 d5691df7 00000018 0000276f 00000000 f70b5a50 f7cb1bbc
        0000007d 000005e9 d5697b5d 00000018 f6a11f00 00000000 f69d9000 00000022
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<f94edd05>] svc_recv+0x385/0x410 [sunrpc]
  [<f93b85ac>] nfsd+0xac/0x240 [nfsd]
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
nfsd          D 00000000     0  2688      2 (L-TLB)
        f6a31f00 00000046 f6a31ee8 00000000 f6a31ed0 00000282 d3ac48a4 0000000b
        00000282 00000282 d569203a 00000018 0000287a 00000000 f7caf590 f70b5b5c
        0000007d 000004bb d5698018 00000018 f6a31f00 00000000 f69fe000 00000022
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<f94edd05>] svc_recv+0x385/0x410 [sunrpc]
  [<f93b85ac>] nfsd+0xac/0x240 [nfsd]
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
nfsd          D 00000000     0  2689      2 (L-TLB)
        f6a53f00 00000046 f6a53ee8 00000000 f6a53ed0 00000282 d3ac53f7 0000000b
        00000282 00000282 000cb599 000cb599 f6a53f10 c0365572 f6ba0ad0 f7caf69c
        00000000 000004bc d56984d4 00000018 f6a53f00 00000000 f6a23000 00000022
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<f94edd05>] svc_recv+0x385/0x410 [sunrpc]
  [<f93b85ac>] nfsd+0xac/0x240 [nfsd]
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
nfsd          D 00000000     0  2690      2 (L-TLB)
        f6a73f00 00000046 f6a73ee8 00000000 f6a73ed0 00000282 b6d79c59 00000018
        00000282 00000282 d56915b7 00000018 00001ae5 00000000 f7d61570 f7cafb9c
        00000073 000000d7 d569556a 00000018 f6a73f00 00000000 f6a49000 00000022
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<f94edd05>] svc_recv+0x385/0x410 [sunrpc]
  [<f93b85ac>] nfsd+0xac/0x240 [nfsd]
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
nfsd          D 00000000     0  2691      2 (L-TLB)
        f6a95f00 00000046 f6a95ee8 00000000 f6a95ed0 00000282 b6d79006 00000018
        00000282 00000282 d568ff0f 00000018 000026e8 00000000 c1969030 f7d6167c
        00000074 00000094 d5695b36 00000018 f6a95f00 00000000 f6a6e000 00000022
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<f94edd05>] svc_recv+0x385/0x410 [sunrpc]
  [<f93b85ac>] nfsd+0xac/0x240 [nfsd]
  [<c01049fb>] kernel_thread_helper+0x7/0x4c
  =======================
rpc.mountd    D F64604C0     0  2695      1 (NOTLB)
        f6ab5ea0 00000082 00000000 f64604c0 f6ab5e60 c0341701 f649f960 f64604c0
        f6ab5ea0 c0343533 d5158432 00000018 00005425 00000000 f7095090 f712219c
        00000073 00000338 d5164b7e 00000018 f6ab5ea0 00000000 00000000 f6ab5f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
rsync         D 00000002     0  2701      1 (NOTLB)
        f6ab7ea0 00000082 00000000 00000002 00000283 0000000a 00000010 f6968e94
        f6ab7eb0 c0279a20 00000000 f6ab7eb8 c0408d6f 00000246 f6ac8a90 f70fc65c
        00000000 00000233 d5176fba 00000018 f6ab7ea0 00000000 00000000 f6ab7f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
smartd        D 2A7A7965     0  2710      1 (NOTLB)
        f6badea0 00000086 4661e75f 2a7a7965 4661e75f 2a7a7965 00000000 00000000
        f6a9eb84 f78f5f40 f6a9eb60 f78f5f40 f6badea0 c012f070 f6ba8a50 f6ac8b9c
        00000000 00000183 d5177a53 00000018 f6badea0 00000000 bfeea9e4 f6badf08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
rpc.statd     D 00000000     0  2766      1 (NOTLB)
        f6bc9ea0 00000082 00000000 00000000 00000000 bfe8a9f4 00000010 bfe8a9d0
        f6a9e604 c193fda0 00000002 fffffff3 f6bc9ea8 f6a9e5e0 f7d02ab0 f6ba8b5c
        00000000 000002fe d5178f48 00000018 f6bc9ea0 00000000 00000000 f6bc9f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
rpc.idmapd    D 00000001     0  2774      1 (NOTLB)
        f642dea0 00000082 0000012b 00000001 c04903b8 00000000 f642de78 c011ec21
        00000000 00000010 d5158962 00000018 00005704 00000000 f6b9e5b0 f709519c
        00000073 00000133 d516577c 00000018 f642dea0 00000000 00001388 f642df08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
ntpd          D BFA422F4     0  2789      1 (NOTLB)
        f6bbfea0 00200086 00000000 bfa422f4 f6bbfe60 c0108f08 3b9aca00 f6b9e5b0
        f6bbfe80 c010364b d5159373 00000018 00005c9b 00000000 f6ba05d0 f6b9e6bc
        00000073 00000254 d5166ec8 00000018 f6bbfea0 00000000 00000000 f6bbff08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
mdadm         D C190D860     0  2802      1 (NOTLB)
        f6f1dea0 00000082 f7d6aea0 c190d860 c190d8c8 c190d860 c195f7e0 f785310c
        f710bca0 f7d6aea0 f6f1df08 00000001 c190d86c 00000000 f7cfd070 f7d02bbc
        00000000 00000291 d517a142 00000018 f6f1dea0 00000000 bfbf3f78 f6f1df08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
amd           D 00000010     0  2814      1 (NOTLB)
        f6f39ea0 00000086 f6f39ec8 00000010 f6f39e88 00000001 f6f39e68 00000018
        00000000 00000000 d515a1d0 00000018 000062c5 00000000 f7134530 f6ba06dc
        00000073 000002e5 d5168bc0 00000018 f6f39ea0 00000000 00000000 f6f39f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
atd           D C01D5920     0  2843      1 (NOTLB)
        f6465ea0 00000082 00000000 c01d5920 f6465e60 c01e4ce8 00000008 00000000
        f646f2f8 f6aa2660 f6465f08 f6470060 f6465ea0 c012f070 f7cb10b0 f7cfd17c
        00000000 0000016b d517ab31 00000018 f6465ea0 00000000 bfa33f04 f6465f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
cron          D 0000000D     0  2846      1 (NOTLB)
        f6485ea0 00000086 f6485e78 0000000d c03edd32 00000f20 f7ffbda4 f6b9fa70
        f7086460 f7086460 00cafdb8 00000004 f6485ea0 f7004c80 f70fca50 f6ac819c
        00000000 000001d2 d516e42a 00000018 f6485ea0 00000000 bf7ff544 f6485f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
miniserv.pl   D F64EDEB8     0  2869      1 (NOTLB)
        f64edea0 00000086 f77ad3a0 f64edeb8 f64ede90 c020b998 00000001 f64edf00
        00000001 00000000 d5156000 00000018 000087dd 00000000 f7d72a70 f713463c
        00000074 00000234 d516a1ca 00000018 f64edea0 00000000 00000000 f64edf08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
apache        D F6463EB8     0  2898      1 (NOTLB)
        f6463ea0 00000082 c1a110e0 f6463eb8 f6463e90 c020b998 00000000 00000000
        c0430da8 c0430da8 c0430f18 00000044 f6463ea0 c0146c5d f6ba6a30 f7cb11bc
        00000000 000001b1 d517b708 00000018 f6463ea0 00000000 00000000 f6463f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
apache        D 00000000     0  2910   2898 (NOTLB)
        f64f1ea0 00000082 00000000 00000000 f64f1e90 c0158d44 00000000 00000000
        c0430da8 c0430da8 c0430f18 00000044 f68d8064 f68d8154 f7d76590 f6ba6b3c
        00000000 00000265 d517c7cc 00000018 f64f1ea0 00000000 00000000 f64f1f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
apache        D 00000000     0  2911   2898 (NOTLB)
        f6555ea0 00000082 00000000 00000000 f6555e90 c0158d44 00000000 00000000
        c0430da8 c0430da8 c0430f18 00000044 f68d8064 f68d8154 f6ba6530 f7d7669c
        00000000 0000020d d517d62c 00000018 f6555ea0 00000000 00000000 f6555f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
apache        D 00000000     0  2912   2898 (NOTLB)
        f6559ea0 00000086 00000000 00000000 f6559e90 c0158d44 00000000 00000000
        c0430da8 c0430da8 c0430f18 00000044 f68d8064 f68d8154 f7134a30 f6ba663c
        00000000 0000011c d517ddf6 00000018 f6559ea0 00000000 00000000 f6559f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
apache        D 00000000     0  2913   2898 (NOTLB)
        f64a5ea0 00000086 00000000 00000000 f64a5e90 c0158d44 00000000 00000000
        c0430da8 c0430da8 c0430f18 00000044 f68d8064 f68d8154 f6b9f570 f7134b3c
        00000000 00000153 d517e73e 00000018 f64a5ea0 00000000 00000000 f64a5f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
apache        D 00000000     0  2914   2898 (NOTLB)
        f6573ea0 00000086 00000000 00000000 f6573e90 c0158d44 00000000 00000000
        c0430da8 c0430da8 c0430f18 00000044 f68d8064 f68d8154 f7d025b0 f6b9f67c
        00000000 0000013d d517efea 00000018 f6573ea0 00000000 00000000 f6573f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
munin-node    D 00000000     0  3040      1 (NOTLB)
        f6831ea0 00000086 00000000 00000000 00000000 00000000 00000000 00000000
        f714a8ac f66b5800 00000000 00000000 00000000 00000000 f70b65b0 f7d026bc
        00000000 000002c2 d518033a 00000018 f6831ea0 00000000 00000000 f6831f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
getty         D F65F9EEC     0  3066      1 (NOTLB)
        f65f9ea0 00000082 f6d39400 f65f9eec f65f9e90 c03655af 00000000 c1a38800
        00000001 c1846900 f6d39400 00000000 f65f9e80 00000246 f658a5d0 f70b66bc
        00000000 000002b1 d5181617 00000018 f65f9ea0 00000000 0804b214 f65f9f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
getty         D F65FBEEC     0  3067      1 (NOTLB)
        f65fbea0 00000086 f6587800 f65fbeec f65fbe90 c03655af 7f1c0300 f65fc20e
        00000000 ffffffff 00000008 00000000 f6584c08 00000246 f658aad0 f658a6dc
        00000000 0000018a d51820de 00000018 f65fbea0 00000000 0804b214 f65fbf08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
getty         D F65D1EEC     0  3068      1 (NOTLB)
        f65d1ea0 00000082 f6556c00 f65d1eec f65d1e90 c03655af 7f1c0300 f664020e
        00000000 ffffffff 00000008 00000000 f6587408 00000246 f6582030 f658abdc
        00000000 0000015e d5182a74 00000018 f65d1ea0 00000000 0804b214 f65d1f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
getty         D F65D3EEC     0  3069      1 (NOTLB)
        f65d3ea0 00000086 f6556400 f65d3eec f65d3e90 c03655af 7f1c0300 f664620e
        00000000 ffffffff 00000008 00000000 f6556808 00000246 f6589ab0 f658213c
        00000000 00000164 d5183434 00000018 f65d3ea0 00000000 0804b214 f65d3f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
getty         D F6621EEC     0  3070      1 (NOTLB)
        f6621ea0 00000082 f75e5c00 f6621eec f6621e90 c03655af 7f1c0300 f664c20e
        00000000 ffffffff 00000008 00000000 f64de008 00000246 f6585090 f6589bbc
        00000000 00000183 d5183ece 00000018 f6621ea0 00000000 0804b214 f6621f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
getty         D F6623EEC     0  3071      1 (NOTLB)
        f6623ea0 00000086 f7843800 f6623eec f6623e90 c03655af 7f1c0300 f667220e
        00000000 ffffffff d515e7c5 00000018 00010183 00000000 f7cb5030 f658519c
        00000077 0000018d d51849ac 00000018 f6623ea0 00000000 0804b214 f6623f08
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  [<c0103cab>] do_signal+0x5b/0x120
  [<c0103dad>] do_notify_resume+0x3d/0x40
  [<c0103f6e>] work_notifysig+0x13/0x19
  =======================
hibernate     D F7945000     0  3874   2622 (NOTLB)
        f6723c60 00000086 f7ea23b8 f7945000 f6723c20 c0121d67 c198d800 f7ea23b8
        f6723c30 c02261b7 28ccc170 00000009 00003a03 00000000 c19615d0 f6ba0bdc
        0000006e 0006a677 28ccc170 00000009 f6723c70 f6723d48 f6af7804 f6723c7c
Call Trace:
  [<c0364fa4>] wait_for_completion+0x64/0xa0
  [<c021bb7d>] blk_execute_rq+0x8d/0xb0
  [<c02b7a88>] scsi_execute+0xb8/0x110
  [<c02b7b48>] scsi_execute_req+0x68/0x90
  [<c02bf80f>] sd_start_stop_device+0x6f/0x120
  [<c02bfb9a>] sd_resume+0x6a/0xa0
  [<c02bbe59>] scsi_bus_resume+0x69/0x80
  [<c0299932>] resume_device+0x132/0x190
  [<c0299aab>] dpm_resume+0xbb/0xc0
  [<c0299ace>] device_resume+0x1e/0x40
  [<c013bfd6>] hibernate+0x106/0x1a0
  [<c013b193>] state_store+0xc3/0xf0
  [<c019353b>] subsys_attr_store+0x3b/0x40
  [<c019370e>] flush_write_buffer+0x2e/0x40
  [<c0193781>] sysfs_write_file+0x61/0x70
  [<c015e858>] vfs_write+0x88/0x110
  [<c015e991>] sys_write+0x41/0x70
  [<c0103ee0>] syscall_call+0x7/0xb
  =======================
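
A note on reading the dump above: nearly every task in it is parked in
refrigerator(), which is what the freezer is expected to do during
hibernation, so those entries carry little information. What follows is a
minimal Python sketch, assuming the sysrq-t layout shown above (a task header
line, raw stack words, then "[<address>] symbol+offset" frames), that keeps
only the tasks whose innermost frame is something other than refrigerator.
The names parse_tasks and outside_refrigerator and both regular expressions
are illustrative assumptions, not part of any kernel tool.

```python
import re

# Task header, e.g. "hibernate     D F7945000     0  3874   2622 (NOTLB)"
HEADER = re.compile(r"^(\S+)\s+[A-Z]\s+[0-9A-F]{8}\s+\d+\s+(\d+)\s+\d+\s+\(")
# Stack frame, e.g. "[<c0364fa4>] wait_for_completion+0x64/0xa0"
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x")

# Excerpt of the dump above, shortened to two tasks for illustration.
SAMPLE = """\
cron          D 0000000D     0  2846      1 (NOTLB)
        f6485ea0 00000086 f6485e78 0000000d c03edd32 00000f20 f7ffbda4 f6b9fa70
Call Trace:
  [<c013b1ff>] refrigerator+0x3f/0x50
  [<c0124646>] get_signal_to_deliver+0x226/0x230
  =======================
hibernate     D F7945000     0  3874   2622 (NOTLB)
        f6723c60 00000086 f7ea23b8 f7945000 f6723c20 c0121d67 c198d800 f7ea23b8
Call Trace:
  [<c0364fa4>] wait_for_completion+0x64/0xa0
  [<c021bb7d>] blk_execute_rq+0x8d/0xb0
  =======================
"""


def parse_tasks(dump):
    """Return a list of {"name", "pid", "frames"} dicts, one per task."""
    tasks = []
    for raw in dump.splitlines():
        # Drop mail quote markers ("> ") and indentation before matching.
        line = raw.lstrip("> ").strip()
        header = HEADER.match(line)
        if header:
            tasks.append({"name": header.group(1),
                          "pid": int(header.group(2)),
                          "frames": []})
            continue
        frame = FRAME.search(line)
        if frame and tasks:
            tasks[-1]["frames"].append(frame.group(1))
    return tasks


def outside_refrigerator(tasks):
    """Keep (name, pid, innermost frame) for tasks not frozen in refrigerator."""
    return [(t["name"], t["pid"], t["frames"][0])
            for t in tasks
            if t["frames"] and t["frames"][0] != "refrigerator"]


if __name__ == "__main__":
    for name, pid, top in outside_refrigerator(parse_tasks(SAMPLE)):
        print(f"{name} (pid {pid}) is blocked in {top}")
```

Fed the complete serial capture instead of SAMPLE, the same two functions
would leave only the tasks that are blocked somewhere other than the freezer.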


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-13 22:02                                   ` David Greaves
@ 2007-06-13 22:12                                     ` Linus Torvalds
  2007-06-13 23:15                                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 48+ messages in thread
From: Linus Torvalds @ 2007-06-13 22:12 UTC (permalink / raw)
  To: David Greaves
  Cc: David Chinner, Tejun Heo, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik



On Wed, 13 Jun 2007, David Greaves wrote:
>
> > I'm not seeing anything really obvious. The traces would probably look
> > better if you enabled CONFIG_FRAME_POINTER, though. That should cut down on
> > some of the noise and make the traces a bit more readable.
> 
> I can do that...

Thanks. That makes a big difference to the readability of the traces.

That said, I'm so used to reading even the messy ones that this didn't 
actually tell me anything new (it made it clear that the SCSI error 
handler noise was just noise), but for people who aren't quite as used to 
seeing crap backtraces, your new trace might hopefully put them on the 
right track.

I threw out the parts that didn't look all that relevant, and left the
ata_aux/md0_raid5/hibernate traces here for others to look at without all 
the other noise. Those _seem_ to be the primary suspects in this saga.

		Linus
---
> ata_aux       D F7945000     0   122      2 (L-TLB)
>        c19f7ce0 00000046 f7ea23b8 f7945000 c19f7ca0 c0121d67 c198d800 f7ea23b8
>        c19f7cb0 c02261b7 f6af7964 f7ea23b8 c19f7cc0 c0293942 f7e865d0 c1a026bc
>        c19f7cf0 000003ae 2dae0756 00000009 c19f7cf0 c19f7dc8 f6af7964 c19f7cfc
> Call Trace:
>  [<c0364fa4>] wait_for_completion+0x64/0xa0
>  [<c021bb7d>] blk_execute_rq+0x8d/0xb0
>  [<c02b7a88>] scsi_execute+0xb8/0x110
>  [<c02b7b48>] scsi_execute_req+0x68/0x90
>  [<c02bdefd>] sd_spinup_disk+0x6d/0x400
>  [<c02bf25b>] sd_revalidate_disk+0x6b/0x160
>  [<c02bdc1f>] sd_rescan+0x1f/0x30
>  [<c02bb002>] scsi_rescan_device+0x42/0x50
>  [<c02c9e10>] ata_scsi_dev_rescan+0x60/0x70
>  [<c0127ebd>] run_workqueue+0x4d/0xf0
>  [<c012806d>] worker_thread+0xcd/0xf0
>  [<c012b247>] kthread+0x67/0x70
>  [<c01049fb>] kernel_thread_helper+0x7/0x4c

> md0_raid5     D F7F4D000     0   836      2 (L-TLB)
>        f78a3da0 00000046 f7f4d000 f7f4d000 f78a3da0 c01454b6 c048b0e0 f6ba0ad0
>        f78a3d70 c0116be1 00010ad0 d5658227 f78a3d90 f7853000 f70c4ae0 f7f5617c
>        f78a3de0 00003440 1be4a921 00000009 00000246 f7853000 f78a3dc8 f785313c
> Call Trace:
>  [<c02f1287>] md_super_wait+0x77/0xc0
>  [<c02f993f>] write_sb_page+0x4f/0x80
>  [<c02f9a72>] write_page+0x102/0x110
>  [<c02f9db9>] bitmap_update_sb+0x89/0x90
>  [<c02f3353>] md_update_sb+0x123/0x2a0
>  [<c02f93c2>] md_check_recovery+0x302/0x340
>  [<c02ec1f2>] raid5d+0x12/0xf0
>  [<c02f7746>] md_thread+0x56/0x110
>  [<c012b247>] kthread+0x67/0x70
>  [<c01049fb>] kernel_thread_helper+0x7/0x4c

> hibernate     D F7945000     0  3874   2622 (NOTLB)
>        f6723c60 00000086 f7ea23b8 f7945000 f6723c20 c0121d67 c198d800 f7ea23b8
>        f6723c30 c02261b7 28ccc170 00000009 00003a03 00000000 c19615d0 f6ba0bdc
>        0000006e 0006a677 28ccc170 00000009 f6723c70 f6723d48 f6af7804 f6723c7c
> Call Trace:
>  [<c0364fa4>] wait_for_completion+0x64/0xa0
>  [<c021bb7d>] blk_execute_rq+0x8d/0xb0
>  [<c02b7a88>] scsi_execute+0xb8/0x110
>  [<c02b7b48>] scsi_execute_req+0x68/0x90
>  [<c02bf80f>] sd_start_stop_device+0x6f/0x120
>  [<c02bfb9a>] sd_resume+0x6a/0xa0
>  [<c02bbe59>] scsi_bus_resume+0x69/0x80
>  [<c0299932>] resume_device+0x132/0x190
>  [<c0299aab>] dpm_resume+0xbb/0xc0
>  [<c0299ace>] device_resume+0x1e/0x40
>  [<c013bfd6>] hibernate+0x106/0x1a0
>  [<c013b193>] state_store+0xc3/0xf0
>  [<c019353b>] subsys_attr_store+0x3b/0x40
>  [<c019370e>] flush_write_buffer+0x2e/0x40
>  [<c0193781>] sysfs_write_file+0x61/0x70
>  [<c015e858>] vfs_write+0x88/0x110
>  [<c015e991>] sys_write+0x41/0x70
>  [<c0103ee0>] syscall_call+0x7/0xb
>  =======================
> 


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-13 22:12                                     ` Linus Torvalds
@ 2007-06-13 23:15                                       ` Rafael J. Wysocki
  2007-06-14 14:21                                         ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: Rafael J. Wysocki @ 2007-06-13 23:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Greaves, David Chinner, Tejun Heo, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

On Thursday, 14 June 2007 00:12, Linus Torvalds wrote:
> 
> On Wed, 13 Jun 2007, David Greaves wrote:
> >
> > > I'm not seeing anything really obvious. The traces would probably look
> > > better if you enabled CONFIG_FRAME_POINTER, though. That should cut down on
> > > some of the noise and make the traces a bit more readable.
> > 
> > I can do that...
> 
> Thanks. That makes a big difference to the readability of the traces.
> 
> That said, I'm so used to reading even the messy ones that this didn't 
> actually tell me anything new (it made it clear that the SCSI error 
> handler noise was just noise), but for people who aren't quite as used to 
> seeing crap backtraces, your new trace might hopefully put them on the 
> right track.
> 
> I threw out the parts that didn't look all that relevant, and left the
> ata_aux/md0_raid5/hibernate traces here for others to look at without all 
> the other noise. Those _seem_ to be the primary suspects in this saga.

Hmm, it looks like both hibernate and ata_aux are waiting for the same
completion.  I wonder who's supposed to complete it.

Greetings,
Rafael


> ---
> > ata_aux       D F7945000     0   122      2 (L-TLB)
> >        c19f7ce0 00000046 f7ea23b8 f7945000 c19f7ca0 c0121d67 c198d800 f7ea23b8
> >        c19f7cb0 c02261b7 f6af7964 f7ea23b8 c19f7cc0 c0293942 f7e865d0 c1a026bc
> >        c19f7cf0 000003ae 2dae0756 00000009 c19f7cf0 c19f7dc8 f6af7964 c19f7cfc
> > Call Trace:
> >  [<c0364fa4>] wait_for_completion+0x64/0xa0
> >  [<c021bb7d>] blk_execute_rq+0x8d/0xb0
> >  [<c02b7a88>] scsi_execute+0xb8/0x110
> >  [<c02b7b48>] scsi_execute_req+0x68/0x90
> >  [<c02bdefd>] sd_spinup_disk+0x6d/0x400
> >  [<c02bf25b>] sd_revalidate_disk+0x6b/0x160
> >  [<c02bdc1f>] sd_rescan+0x1f/0x30
> >  [<c02bb002>] scsi_rescan_device+0x42/0x50
> >  [<c02c9e10>] ata_scsi_dev_rescan+0x60/0x70
> >  [<c0127ebd>] run_workqueue+0x4d/0xf0
> >  [<c012806d>] worker_thread+0xcd/0xf0
> >  [<c012b247>] kthread+0x67/0x70
> >  [<c01049fb>] kernel_thread_helper+0x7/0x4c
> 
> > md0_raid5     D F7F4D000     0   836      2 (L-TLB)
> >        f78a3da0 00000046 f7f4d000 f7f4d000 f78a3da0 c01454b6 c048b0e0 f6ba0ad0
> >        f78a3d70 c0116be1 00010ad0 d5658227 f78a3d90 f7853000 f70c4ae0 f7f5617c
> >        f78a3de0 00003440 1be4a921 00000009 00000246 f7853000 f78a3dc8 f785313c
> > Call Trace:
> >  [<c02f1287>] md_super_wait+0x77/0xc0
> >  [<c02f993f>] write_sb_page+0x4f/0x80
> >  [<c02f9a72>] write_page+0x102/0x110
> >  [<c02f9db9>] bitmap_update_sb+0x89/0x90
> >  [<c02f3353>] md_update_sb+0x123/0x2a0
> >  [<c02f93c2>] md_check_recovery+0x302/0x340
> >  [<c02ec1f2>] raid5d+0x12/0xf0
> >  [<c02f7746>] md_thread+0x56/0x110
> >  [<c012b247>] kthread+0x67/0x70
> >  [<c01049fb>] kernel_thread_helper+0x7/0x4c
> 
> > hibernate     D F7945000     0  3874   2622 (NOTLB)
> >        f6723c60 00000086 f7ea23b8 f7945000 f6723c20 c0121d67 c198d800 f7ea23b8
> >        f6723c30 c02261b7 28ccc170 00000009 00003a03 00000000 c19615d0 f6ba0bdc
> >        0000006e 0006a677 28ccc170 00000009 f6723c70 f6723d48 f6af7804 f6723c7c
> > Call Trace:
> >  [<c0364fa4>] wait_for_completion+0x64/0xa0
> >  [<c021bb7d>] blk_execute_rq+0x8d/0xb0
> >  [<c02b7a88>] scsi_execute+0xb8/0x110
> >  [<c02b7b48>] scsi_execute_req+0x68/0x90
> >  [<c02bf80f>] sd_start_stop_device+0x6f/0x120
> >  [<c02bfb9a>] sd_resume+0x6a/0xa0
> >  [<c02bbe59>] scsi_bus_resume+0x69/0x80
> >  [<c0299932>] resume_device+0x132/0x190
> >  [<c0299aab>] dpm_resume+0xbb/0xc0
> >  [<c0299ace>] device_resume+0x1e/0x40
> >  [<c013bfd6>] hibernate+0x106/0x1a0
> >  [<c013b193>] state_store+0xc3/0xf0
> >  [<c019353b>] subsys_attr_store+0x3b/0x40
> >  [<c019370e>] flush_write_buffer+0x2e/0x40
> >  [<c0193781>] sysfs_write_file+0x61/0x70
> >  [<c015e858>] vfs_write+0x88/0x110
> >  [<c015e991>] sys_write+0x41/0x70
> >  [<c0103ee0>] syscall_call+0x7/0xb
> >  =======================
> > 
> 
> 

-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-13 11:16                               ` David Greaves
  2007-06-13 21:04                                 ` Linus Torvalds
@ 2007-06-14  0:28                                 ` David Chinner
  1 sibling, 0 replies; 48+ messages in thread
From: David Chinner @ 2007-06-14  0:28 UTC (permalink / raw)
  To: David Greaves
  Cc: Linus Torvalds, David Chinner, Tejun Heo, Rafael J. Wysocki, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

On Wed, Jun 13, 2007 at 12:16:36PM +0100, David Greaves wrote:
> Linus Torvalds wrote:
> >
> >On Fri, 8 Jun 2007, David Greaves wrote:
> >>positive: I can now get sysrq-t :)
> >
> >Ok, so color me confused,
> So what do you think that makes me <grin>
> 
> >and maybe I have missed some of the emails or 
> >skimmed over them too fast (there's been too many of them ;),
> 
> You may have missed these 'tests' with rc4+Tejun's fix:
> * clean boot, unmounting the xfs fs : normal hibernate/resume
> * clean boot, remount ro xfs fs : normal hibernate/resume
> * clean boot, touch; sync; echo 1 > /proc/sys/vm/drop_caches: normal hibernate/resume
> * clean boot, touch; sync; echo 2 > /proc/sys/vm/drop_caches: hang hibernating
> * clean boot, touch; sync; echo 3 > /proc/sys/vm/drop_caches: hang hibernating
> 
> Dave asked me to do them but hasn't responded yet.

Sorry 'bout that. Bit busy ATM.

What I was effectively looking for was whether it was data or metadata
that was causing the problems. From the above, it would appear that
dropping the page cache (echo 1 > /proc/sys/vm/drop_caches) allows a successful
hibernate/resume. Next step would have been to isolate which cache
being dropped made the difference (e.g. a file or a bdev cache?).

However, it is clear from the back traces that there is something
unwell with md/sata code, so I don't think this needs to be tracked
any further from the filesystem perspective.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-13 23:15                                       ` Rafael J. Wysocki
@ 2007-06-14 14:21                                         ` Tejun Heo
  2007-06-14 15:10                                           ` Tejun Heo
  2007-06-14 15:19                                           ` 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6) David Greaves
  0 siblings, 2 replies; 48+ messages in thread
From: Tejun Heo @ 2007-06-14 14:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, David Greaves, David Chinner, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

Hello,

Rafael J. Wysocki wrote:
> On Thursday, 14 June 2007 00:12, Linus Torvalds wrote:
>> On Wed, 13 Jun 2007, David Greaves wrote:
>>>> I'm not seeing anything really obvious. The traces would probably look
>>>> better if you enabled CONFIG_FRAME_POINTER, though. That should cut down on
>>>> some of the noise and make the traces a bit more readable.
>>> I can do that...
>> Thanks. That makes a big difference to the readability of the traces.
>>
>> That said, I'm so used to reading even the messy ones that this didn't 
>> actually tell me anything new (it made it clear that the SCSI error 
>> handler noise was just noise), but for people who aren't quite as used to 
>> seeing crap backtraces, your new trace might hopefully put them on the 
>> right track.
>>
>> I threw out the parts that didn't look all that relevant, and left the
>> ata_aux/md0_raid5/hibernate traces here for others to look at without all 
>> the other noise. Those _seem_ to be the primary suspects in this saga.
> 
> Hmm, it looks like both hibernate and ata_aux are waiting for the same
> completion.  I wonder who's supposed to complete it.

They're waiting for the commands they issued to complete.  ata_aux is
trying to revalidate the scsi device after libata EH finished waking up
the port and hibernate is trying to resume scsi disk device.  ata_aux is
issuing either TEST UNIT READY or START STOP.  hibernate is issuing
START STOP.

This can be caused by one of the following.

1. SCSI EH thread (ATA EH runs off it) for the SCSI device hasn't
finished yet.  All commands are deferred while EH is in progress.

2. request_queue is stuck - somehow somebody forgot to kick the queue at
some point.

3. command is stuck somewhere in SCSI/ATA land.

#1 doesn't seem to be the case as all scsi_eh threads seem idle.  I'm
looking at the code but can't find anything which could cause #2 or #3.
Also, these code paths are traveled really frequently.
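
Possibility #2 above can be pictured with a toy model. This is a sketch in
Python, not kernel code, and it does not claim to show the actual cause of
this hang: ToyQueue and execute_and_wait are hypothetical names, and
threading.Event merely stands in for a struct completion. It only shows that
a request which is queued but never dispatched leaves its issuer blocked,
much as hibernate and ata_aux are blocked in wait_for_completion in the
traces.

```python
import threading
from collections import deque


class ToyQueue:
    """Hypothetical stand-in for a block request queue (not kernel code)."""

    def __init__(self):
        self.pending = deque()

    def insert(self, name):
        """Queue a request and return the event its issuer will wait on."""
        done = threading.Event()      # plays the role of a struct completion
        self.pending.append((name, done))
        return done

    def run(self):
        """'Kick' the queue: dispatch and complete everything pending."""
        while self.pending:
            _name, done = self.pending.popleft()
            done.set()                # analogous to complete()


def execute_and_wait(queue, name, kick, timeout=0.2):
    """Issue a request and wait for it; True if it completed in time."""
    done = queue.insert(name)
    if kick:
        queue.run()
    # A real wait_for_completion() has no timeout; one is used here only so
    # that the stuck case returns instead of blocking forever.
    return done.wait(timeout)


if __name__ == "__main__":
    print("queue kicked:    ", execute_and_wait(ToyQueue(), "START STOP", True))
    print("queue not kicked:", execute_and_wait(ToyQueue(), "START STOP", False))
```

In the second call the request stays in pending and the wait only ends
because of the artificial timeout.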

I'm also trying to reproduce the problem here with xfs over RAID-6 array
but haven't been successful yet.

David, do you store the hibernation image on the RAID-6 array?  Can you
post the captured kernel log when it locks up?

-- 
tejun


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-14 14:21                                         ` Tejun Heo
@ 2007-06-14 15:10                                           ` Tejun Heo
  2007-06-15  9:42                                             ` [PATCH] block: always requeue !fs requests at the front Tejun Heo
  2007-06-14 15:19                                           ` 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6) David Greaves
  1 sibling, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2007-06-14 15:10 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, Linus Torvalds, David Greaves, David Chinner,
	xfs, 'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

Tejun Heo wrote:
> David, do you store the hibernation image on the RAID-6 array?  Can you
> post the captured kernel log when it locks up?

Never mind.  I just managed to reproduce it here.  It definitely has
something to do with the raid code.  ext3 on raid6 is showing the same
problem.  I'll report back as soon as I find out more.

-- 
tejun


* Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)
  2007-06-14 14:21                                         ` Tejun Heo
  2007-06-14 15:10                                           ` Tejun Heo
@ 2007-06-14 15:19                                           ` David Greaves
  1 sibling, 0 replies; 48+ messages in thread
From: David Greaves @ 2007-06-14 15:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, Linus Torvalds, David Chinner, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

Tejun Heo wrote:
> They're waiting for the commands they issued to complete.  ata_aux is
> trying to revalidate the scsi device after libata EH finished waking up
> the port and hibernate is trying to resume scsi disk device.  ata_aux is
> issuing either TEST UNIT READY or START STOP.  hibernate is issuing
> START STOP.
> 
> This can be caused by one of the following.
> 
> 1. SCSI EH thread (ATA EH runs off it) for the SCSI device hasn't
> finished yet.  All commands are deferred while EH is in progress.
> 
> 2. request_queue is stuck - somehow somebody forgot to kick the queue at
> some point.
> 
> 3. command is stuck somewhere in SCSI/ATA land.
> 
> #1 doesn't seem to be the case as all scsi_eh threads seem idle.  I'm
> looking at the code but can't find anything which could cause #2 or #3.
> Also, these code paths are traveled really frequently.
> 
> I'm also trying to reproduce the problem here with xfs over RAID-6 array
> but haven't been successful yet.
> 
> David, do you store the hibernation image on the RAID-6 array?
No, swap is on a pata disk.

>  Can you post the captured kernel log when it locks up?

Sure... this was still on the serial terminal screen from the sysrq-t trace
from this morning:

[run hibernate script here]

swsusp: Basic memory bitmaps created
Stopping tasks ... done.
Shrinking memory... done (0 pages freed)
Freed 0 kbytes in 0.04 seconds (0.00 MB/s)
sd 5:0:0:0: [sdf] Synchronizing SCSI cache
sd 4:0:0:0: [sde] Synchronizing SCSI cache
sd 3:0:0:0: [sdd] Synchronizing SCSI cache
sd 2:0:0:0: [sdc] Synchronizing SCSI cache
sd 1:0:0:0: [sdb] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Synchronizing SCSI cache
pnp: Device 00:09 disabled.
pnp: Device 00:08 activated.
pnp: Device 00:09 activated.
pnp: Failed to activate device 00:0a.
pnp: Failed to activate device 00:0b.
ATA: abnormal status 0x7F on port 0x0001a407
ATA: abnormal status 0x7F on port 0x0001a407
ATA: abnormal status 0x7F on port 0x0001b007
ATA: abnormal status 0x7F on port 0x0001b007
ata5.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata5.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata5.00: configured for UDMA/133
sd 0:0:0:0: [sda] Starting disk
ata6.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
sd 1:0:0:0: [sdb] Starting disk
sd 2:0:0:0: [sdc] Starting disk
ata6.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
ata6.00: configured for UDMA/133
sd 3:0:0:0: [sdd] Starting disk
sd 4:0:0:0: [sde] Starting disk
sd 5:0:0:0: [sdf] Starting disk
sd 4:0:0:0: [sde] 490234752 512-byte hardware sectors (251000 MB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 5:0:0:0: [sdf] 781422768 512-byte hardware sectors (400088 MB)
sd 5:0:0:0: [sdf] Write Protect is off
sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Saving image data pages (36338 pages) ...  19%<6>skge eth0: Link is up at 1000 Mbps, full duplex, flow control both
done
Wrote 145352 kbytes in 8.49 seconds (17.12 MB/s)
S|
md: stopping all md devices.
md: md0 still in use.
sd 5:0:0:0: [sdf] Synchronizing SCSI cache
sd 5:0:0:0: [sdf] Stopping disk
sd 4:0:0:0: [sde] Synchronizing SCSI cache
sd 4:0:0:0: [sde] Stopping disk
sd 3:0:0:0: [sdd] Synchronizing SCSI cache
sd 3:0:0:0: [sdd] Stopping disk
sd 2:0:0:0: [sdc] Synchronizing SCSI cache
sd 2:0:0:0: [sdc] Stopping disk
sd 1:0:0:0: [sdb] Synchronizing SCSI cache
sd 1:0:0:0: [sdb] Stopping disk
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
Shutdown: hdb
Shutdown: hda
ACPI: PCI interrupt for device 0000:00:09.0 disabled

[power off/on]

Linux version 2.6.21-g9666f400-dirty (root@cu.dgreaves.com) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #23 Wed Jun 13 22:51:26 BST 2007
BIOS-provided physical RAM map:
  BIOS-e820: 0000000000000000 - 000000000009c400 (usable)
  BIOS-e820: 000000000009c400 - 00000000000a0000 (reserved)
  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
  BIOS-e820: 0000000000100000 - 000000003fffc000 (usable)
  BIOS-e820: 000000003fffc000 - 000000003ffff000 (ACPI data)
  BIOS-e820: 000000003ffff000 - 0000000040000000 (ACPI NVS)
  BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
  BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
Zone PFN ranges:
   DMA             0 ->     4096
   Normal       4096 ->   229376
   HighMem    229376 ->   262140
early_node_map[1] active PFN ranges
     0:        0 ->   262140
DMI 2.3 present.
ACPI: RSDP 000F62A0, 0014 (r0 ASUS  )
ACPI: RSDT 3FFFC000, 0030 (r1 ASUS   A7V600   42302E31 MSFT 31313031)
ACPI: FACP 3FFFC0B2, 0074 (r1 ASUS   A7V600   42302E31 MSFT 31313031)
ACPI: DSDT 3FFFC126, 2C4F (r1   ASUS A7V600       1000 MSFT  100000B)
ACPI: FACS 3FFFF000, 0040
ACPI: BOOT 3FFFC030, 0028 (r1 ASUS   A7V600   42302E31 MSFT 31313031)
ACPI: APIC 3FFFC058, 005A (r1 ASUS   A7V600   42302E31 MSFT 31313031)
ACPI: PM-Timer IO Port: 0xe408
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:10 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000)
Built 1 zonelists.  Total pages: 260093
Kernel command line: root=/dev/hda2 ro log_buf_len=128k console=tty0 console=ttyS0,115200
log_buf_len: 131072
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 16384 bytes)
Detected 1999.872 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1034872k/1048560k available (2459k kernel code, 13036k reserved, 915k data, 196k init, 131056k highmem)
virtual kernel memory layout:
     fixmap  : 0xfffaa000 - 0xfffff000   ( 340 kB)
     pkmap   : 0xff800000 - 0xffc00000   (4096 kB)
     vmalloc : 0xf8800000 - 0xff7fe000   ( 111 MB)
     lowmem  : 0xc0000000 - 0xf8000000   ( 896 MB)
       .init : 0xc044e000 - 0xc047f000   ( 196 kB)
       .data : 0xc0366dc7 - 0xc044bb90   ( 915 kB)
       .text : 0xc0100000 - 0xc0366dc7   (2459 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 4003.08 BogoMIPS (lpj=8006174)
Mount-cache hash table entries: 512
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Compat vDSO mapped to ffffe000.
CPU: AMD Athlon(TM) MP stepping 00
Checking 'hlt' instruction... OK.
ACPI: Core revision 20070126
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xf1970, last bus=1
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter enabled
ACPI: (supports S0 S1 S4 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: enabled onboard AC97/MC97 devices
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 *5 6 7 9 10 11 12)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 *5 6 7 9 10 11 12)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 *6 7 9 10 11 12)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 9 10 11 12) *15, disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 14 devices
ACPI: ACPI bus type pnp unregistered
SCSI subsystem initialized
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
pnp: 00:00: iomem range 0x0-0x9ffff could not be reserved
pnp: 00:00: iomem range 0xf0000-0xfffff could not be reserved
pnp: 00:00: iomem range 0x100000-0x3fffffff could not be reserved
pnp: 00:00: iomem range 0xfec00000-0xfec000ff could not be reserved
pnp: 00:02: ioport range 0xe400-0xe47f has been reserved
pnp: 00:02: ioport range 0xe800-0xe81f has been reserved
pnp: 00:02: iomem range 0xfff80000-0xffffffff could not be reserved
pnp: 00:02: iomem range 0xffb80000-0xffbfffff has been reserved
pnp: 00:0d: ioport range 0x290-0x297 has been reserved
pnp: 00:0d: ioport range 0x370-0x375 has been reserved
Time: tsc clocksource has been installed.
PCI: Bridge: 0000:00:01.0
   IO window: disabled.
   MEM window: disabled.
   PREFETCH window: disabled.
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
TCP bind hash table entries: 65536 (order: 6, 262144 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
Simple Boot Flag at 0x3a set to 0x1
Machine check exception polling timer started.
highmem bounce pool size: 64 pages
SGI XFS with ACLs, no debug enabled
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
PCI: Bypassing VIA 8237 APIC De-Assert Message
atyfb: using auxiliary register aperture
atyfb: 3D RAGE II+ (Mach64 GU) [0x4755 rev 0x9a]
atyfb: Mach64 BIOS is located at c0000, mapped at c00c0000.
atyfb: BIOS frequency table:
atyfb: PCLK_min_freq 926, PCLK_max_freq 22216, ref_freq 1432, ref_divider 33
atyfb: MCLK_pwd 4200, MCLK_max_freq 6000, XCLK_max_freq 6000, SCLK_freq 5000
atyfb: 4M EDO, 14.31818 MHz XTAL, 222 MHz PLL, 60 Mhz MCLK, 60 MHz XCLK
Console: switching to colour frame buffer device 80x30
atyfb: fb0: ATY Mach64 frame buffer device on PCI
input: Power Button (FF) as /class/input/input0
ACPI: Power Button (FF) [PWRF]
input: Power Button (CM) as /class/input/input1
ACPI: Power Button (CM) [PWRB]
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
00:08: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:09: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
PCI: Enabling device 0000:00:09.0 (0014 -> 0017)
ACPI: PCI Interrupt 0000:00:09.0[A] -> GSI 18 (level, low) -> IRQ 16
skge 1.11 addr 0xf6000000 irq 16 chip Yukon rev 1
skge eth0: addr 00:0c:6e:f6:47:ee
netconsole: not configured, aborting
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 0000:00:0f.1
ACPI: PCI Interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 17
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1
     ide0: BM-DMA at 0x9000-0x9007, BIOS settings: hda:DMA, hdb:DMA
     ide1: BM-DMA at 0x9008-0x900f, BIOS settings: hdc:pio, hdd:DMA
Switched to high resolution mode on CPU 0
hda: ST320420A, ATA DISK drive
hdb: Maxtor 5A300J0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdd: PLEXTOR CD-R PX-W2410A, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: 39851760 sectors (20404 MB) w/2048KiB Cache, CHS=39535/16/63, UDMA(66)
hda: cache flushes not supported
  hda: hda1 hda2 hda3
hdb: max request size: 512KiB
hdb: 585940320 sectors (300001 MB) w/2048KiB Cache, CHS=36473/255/63, UDMA(133)
hdb: cache flushes supported
  hdb: hdb1 hdb2
hdd: ATAPI 40X CD-ROM CD-R/RW drive, 4096kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
ACPI: PCI Interrupt 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 18
scsi0 : sata_promise
scsi1 : sata_promise
scsi2 : sata_promise
scsi3 : sata_promise
ata1: SATA max UDMA/133 cmd 0xf880a200 ctl 0xf880a238 bmdma 0x00000000 irq 0
ata2: SATA max UDMA/133 cmd 0xf880a280 ctl 0xf880a2b8 bmdma 0x00000000 irq 0
ata3: SATA max UDMA/133 cmd 0xf880a300 ctl 0xf880a338 bmdma 0x00000000 irq 0
ata4: SATA max UDMA/133 cmd 0xf880a380 ctl 0xf880a3b8 bmdma 0x00000000 irq 0
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata1.00: ATA-7: Maxtor 6B250S0, BANC19J0, max UDMA/133
ata1.00: 490234752 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata1.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata1.00: configured for UDMA/133
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata2.00: ATA-7: Maxtor 7Y250M0, YAR51EW0, max UDMA/133
ata2.00: 490234752 sectors, multi 0: LBA48
ata2.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata2.00: configured for UDMA/133
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata3.00: ATA-7: Maxtor 7Y250M0, YAR51EW0, max UDMA/133
ata3.00: 490234752 sectors, multi 0: LBA48
ata3.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata3.00: configured for UDMA/133
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata4.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata4.00: ATA-7: Maxtor 6B250S0, BANC1980, max UDMA/133
ata4.00: 490234752 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata4.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata4.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access     ATA      Maxtor 6B250S0   BANC PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 490234752 512-byte hardware sectors (251000 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 490234752 512-byte hardware sectors (251000 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  sda: sda1
sd 0:0:0:0: [sda] Attached SCSI disk
scsi 1:0:0:0: Direct-Access     ATA      Maxtor 7Y250M0   YAR5 PQ: 0 ANSI: 5
sd 1:0:0:0: [sdb] 490234752 512-byte hardware sectors (251000 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:0:0: [sdb] 490234752 512-byte hardware sectors (251000 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  sdb: sdb1
sd 1:0:0:0: [sdb] Attached SCSI disk
scsi 2:0:0:0: Direct-Access     ATA      Maxtor 7Y250M0   YAR5 PQ: 0 ANSI: 5
sd 2:0:0:0: [sdc] 490234752 512-byte hardware sectors (251000 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 2:0:0:0: [sdc] 490234752 512-byte hardware sectors (251000 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  sdc: sdc1
sd 2:0:0:0: [sdc] Attached SCSI disk
scsi 3:0:0:0: Direct-Access     ATA      Maxtor 6B250S0   BANC PQ: 0 ANSI: 5
sd 3:0:0:0: [sdd] 490234752 512-byte hardware sectors (251000 MB)
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 3:0:0:0: [sdd] 490234752 512-byte hardware sectors (251000 MB)
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  sdd: sdd1
sd 3:0:0:0: [sdd] Attached SCSI disk
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 17
sata_via 0000:00:0f.0: routed to hard irq line 0
scsi4 : sata_via
scsi5 : sata_via
ata5: SATA max UDMA/133 cmd 0x0001b000 ctl 0x0001a802 bmdma 0x00019800 irq 0
ata6: SATA max UDMA/133 cmd 0x0001a400 ctl 0x0001a002 bmdma 0x00019808 irq 0
ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ATA: abnormal status 0x7F on port 0x0001b007
ATA: abnormal status 0x7F on port 0x0001b007
ata5.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata5.00: ATA-7: Maxtor 7B250S0, BANC1980, max UDMA/133
ata5.00: 490234752 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata5.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata5.00: configured for UDMA/133
ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ATA: abnormal status 0x7F on port 0x0001a407
ATA: abnormal status 0x7F on port 0x0001a407
ata6.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
ata6.00: ATA-7: ST3400620AS, 3.AAK, max UDMA/133
ata6.00: 781422768 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata6.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
ata6.00: configured for UDMA/133
scsi 4:0:0:0: Direct-Access     ATA      Maxtor 7B250S0   BANC PQ: 0 ANSI: 5
sd 4:0:0:0: [sde] 490234752 512-byte hardware sectors (251000 MB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 4:0:0:0: [sde] 490234752 512-byte hardware sectors (251000 MB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  sde: sde1
sd 4:0:0:0: [sde] Attached SCSI disk
scsi 5:0:0:0: Direct-Access     ATA      ST3400620AS      3.AA PQ: 0 ANSI: 5
sd 5:0:0:0: [sdf] 781422768 512-byte hardware sectors (400088 MB)
sd 5:0:0:0: [sdf] Write Protect is off
sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 5:0:0:0: [sdf] 781422768 512-byte hardware sectors (400088 MB)
sd 5:0:0:0: [sdf] Write Protect is off
sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
  sdf: sdf1 sdf2
sd 5:0:0:0: [sdf] Attached SCSI disk
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard as /class/input/input2
md: linear personality registered for level -1
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
raid6: int32x1    636 MB/s
raid6: int32x2    787 MB/s
raid6: int32x4    627 MB/s
raid6: int32x8    606 MB/s
raid6: mmxx1     1583 MB/s
raid6: mmxx2     2557 MB/s
raid6: sse1x1    1592 MB/s
raid6: sse1x2    2631 MB/s
raid6: using algorithm sse1x2 (2631 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
raid5: automatically using best checksumming function: pIII_sse
    pIII_sse  :  4289.000 MB/sec
raid5: using function: pIII_sse (4289.000 MB/sec)
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel@redhat.com
TCP cubic registered
Using IPI Shortcut mode
swsusp: Basic memory bitmaps created
Stopping tasks ... <6>input: ImPS/2 Logitech Wheel Mouse as /class/input/input3
done.
Loading image data pages (36338 pages) ... done
Read 145352 kbytes in 8.34 seconds (17.42 MB/s)
sd 5:0:0:0: [sdf] Synchronizing SCSI cache
sd 4:0:0:0: [sde] Synchronizing SCSI cache
sd 3:0:0:0: [sdd] Synchronizing SCSI cache
sd 2:0:0:0: [sdc] Synchronizing SCSI cache
sd 1:0:0:0: [sdb] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Synchronizing SCSI cache
pnp: Device 00:09 disabled.
pnp: Device 00:08 activated.
pnp: Device 00:09 activated.
pnp: Failed to activate device 00:0a.
pnp: Failed to activate device 00:0b.
sd 0:0:0:0: [sda] Starting disk
sd 1:0:0:0: [sdb] Starting disk
sd 2:0:0:0: [sdc] Starting disk
sd 3:0:0:0: [sdd] Starting disk
sd 4:0:0:0: [sde] Starting disk
ATA: abnormal status 0x7F on port 0x0001b007
ATA: abnormal status 0x7F on port 0x0001b007
ATA: abnormal status 0x7F on port 0x0001a407
ATA: abnormal status 0x7F on port 0x0001a407
ata5.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata5.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata5.00: configured for UDMA/133
ata6.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
ata6.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
ata6.00: configured for UDMA/133
Clocksource tsc unstable (delta = 4327743507262 ns)
Time: acpi_pm clocksource has been installed.
skge eth0: Link is up at 1000 Mbps, full duplex, flow control both


* [PATCH] block: always requeue !fs requests at the front
  2007-06-14 15:10                                           ` Tejun Heo
@ 2007-06-15  9:42                                             ` Tejun Heo
  2007-06-15 11:05                                               ` Jens Axboe
  2007-06-15 13:58                                               ` David Greaves
  0 siblings, 2 replies; 48+ messages in thread
From: Tejun Heo @ 2007-06-15  9:42 UTC (permalink / raw)
  To: David Greaves
  Cc: Rafael J. Wysocki, Linus Torvalds, David Chinner, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik, jens.axboe

SCSI marks internal commands with REQ_PREEMPT and pushes them at the front
of the request queue using blk_execute_rq().  When entering suspended
or frozen state, SCSI devices are quiesced using
scsi_device_quiesce().  In quiesced state, only REQ_PREEMPT requests
are processed.  This is how SCSI blocks other requests out while
suspending and resuming.  As all internal commands are pushed at the
front of the queue, this usually works.

Unfortunately, this interacts badly with ordered requeueing.  To
preserve request order on requeueing (due to busy device, active EH or
other failures), requests are sorted according to ordered sequence on
requeue if IO barrier is in progress.

The following sequence deadlocks.

1. IO barrier sequence issues.

2. Suspend requested.  Queue is quiesced with part or all of IO
   barrier sequence at the front.

3. During suspending or resuming, SCSI issues internal command which
   gets deferred and requeued for some reason.  As the command is
   issued after the IO barrier in #1, ordered requeueing code puts the
   request after IO barrier sequence.

4. The device is ready to process requests again but still is in
   quiesced state and the first request of the queue isn't
   REQ_PREEMPT, so command processing is deadlocked -
   suspending/resuming waits for the issued request to complete while
   the request can't be processed till device is put back into
   running state by resuming.

This can be fixed by always putting !fs requests at the front when
requeueing.

The following thread reports this deadlock.

  http://thread.gmane.org/gmane.linux.kernel/537473

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: David Greaves <david@dgreaves.com>
---
Okay, it took a lot of hours of debugging but boiled down to a two-liner
fix.  I feel so empty. :-) RAID6 triggers this reliably because it
uses BIO_BARRIER heavily to update its superblock.  The recent ATA
suspend/resume rewrite is hit by this because it uses SCSI internal
commands to spin down and up the drives for suspending and resuming.

David, please test this.  Jens, does it look okay?

Thanks.
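
For readers following the four-step sequence in the changelog above,
here is a small user-space C sketch of it.  It is a toy model only and
not kernel code: toy_queue, toy_requeue(), toy_quiesced_can_dispatch()
and toy_suspend_makes_progress() are hypothetical names made up for
illustration, and "ordered after the barrier sequence" is simplified to
"appended at the tail".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical, none is a kernel API. */
enum toy_kind {
	TOY_FS_BARRIER,	/* part of an fs IO barrier sequence */
	TOY_INTERNAL	/* SCSI internal command, i.e. REQ_PREEMPT and !fs */
};

#define TOY_QUEUE_MAX 8

struct toy_queue {
	enum toy_kind slot[TOY_QUEUE_MAX];	/* slot[0] is the queue head */
	size_t len;
};

/*
 * Requeue a request while a barrier is in progress.  Without the fix
 * the request is ordered after the barrier sequence (modelled here as
 * the tail).  With the fix, !fs requests always go to the front.
 */
static void toy_requeue(struct toy_queue *q, enum toy_kind kind, bool fixed)
{
	size_t pos = (fixed && kind == TOY_INTERNAL) ? 0 : q->len;
	size_t i;

	assert(q->len < TOY_QUEUE_MAX);
	for (i = q->len; i > pos; i--)
		q->slot[i] = q->slot[i - 1];
	q->slot[pos] = kind;
	q->len++;
}

/* A quiesced device only processes a REQ_PREEMPT request at the head. */
static bool toy_quiesced_can_dispatch(const struct toy_queue *q)
{
	return q->len > 0 && q->slot[0] == TOY_INTERNAL;
}

/* Walk steps 1-4; true means the internal command can run. */
bool toy_suspend_makes_progress(bool fixed)
{
	struct toy_queue q = { .len = 0 };

	/* 1, 2: barrier sequence sits at the front when the queue is quiesced. */
	toy_requeue(&q, TOY_FS_BARRIER, fixed);
	toy_requeue(&q, TOY_FS_BARRIER, fixed);
	/* 3: internal command gets deferred and requeued. */
	toy_requeue(&q, TOY_INTERNAL, fixed);
	/* 4: can the quiesced device process the head of the queue? */
	return toy_quiesced_can_dispatch(&q);
}
```

In this model toy_suspend_makes_progress(false) returns false, because
a barrier request stays at the head and the quiesced device never
reaches the internal command.  toy_suspend_makes_progress(true) returns
true, which is the ordering the patch below aims for.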

 block/ll_rw_blk.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 6b5173a..a2fe2e5 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -340,6 +340,14 @@ unsigned blk_ordered_req_seq(struct request *rq)
 	if (rq == &q->post_flush_rq)
 		return QUEUE_ORDSEQ_POSTFLUSH;
 
+	/* !fs requests don't need to follow barrier ordering.  Always
+	 * put them at the front.  This fixes the following deadlock.
+	 *
+	 * http://thread.gmane.org/gmane.linux.kernel/537473
+	 */
+	if (!blk_fs_request(rq))
+		return QUEUE_ORDSEQ_DRAIN;
+
 	if ((rq->cmd_flags & REQ_ORDERED_COLOR) ==
 	    (q->orig_bar_rq->cmd_flags & REQ_ORDERED_COLOR))
 		return QUEUE_ORDSEQ_DRAIN;


* Re: [PATCH] block: always requeue !fs requests at the front
  2007-06-15  9:42                                             ` [PATCH] block: always requeue !fs requests at the front Tejun Heo
@ 2007-06-15 11:05                                               ` Jens Axboe
  2007-06-15 11:17                                                 ` Tejun Heo
  2007-06-16 19:54                                                 ` Christoph Hellwig
  2007-06-15 13:58                                               ` David Greaves
  1 sibling, 2 replies; 48+ messages in thread
From: Jens Axboe @ 2007-06-15 11:05 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David Greaves, Rafael J. Wysocki, Linus Torvalds, David Chinner,
	xfs, 'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

On Fri, Jun 15 2007, Tejun Heo wrote:
> SCSI marks internal commands with REQ_PREEMPT and pushes them at the front
> of the request queue using blk_execute_rq().  When entering suspended
> or frozen state, SCSI devices are quiesced using
> scsi_device_quiesce().  In quiesced state, only REQ_PREEMPT requests
> are processed.  This is how SCSI blocks other requests out while
> suspending and resuming.  As all internal commands are pushed at the
> front of the queue, this usually works.
> 
> Unfortunately, this interacts badly with ordered requeueing.  To
> preserve request order on requeueing (due to busy device, active EH or
> other failures), requests are sorted according to ordered sequence on
> requeue if IO barrier is in progress.
> 
> The following sequence deadlocks.
> 
> 1. IO barrier sequence issues.
> 
> 2. Suspend requested.  Queue is quiesced with part or all of IO
>    barrier sequence at the front.
> 
> 3. During suspending or resuming, SCSI issues internal command which
>    gets deferred and requeued for some reason.  As the command is
>    issued after the IO barrier in #1, ordered requeueing code puts the
>    request after IO barrier sequence.
> 
> 4. The device is ready to process requests again but still is in
>    quiesced state and the first request of the queue isn't
>    REQ_PREEMPT, so command processing is deadlocked -
>    suspending/resuming waits for the issued request to complete while
>    the request can't be processed till device is put back into
>    running state by resuming.
> 
> This can be fixed by always putting !fs requests at the front when
> requeueing.
> 
> The following thread reports this deadlock.
> 
>   http://thread.gmane.org/gmane.linux.kernel/537473
> 
> Signed-off-by: Tejun Heo <htejun@gmail.com>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> Cc: David Greaves <david@dgreaves.com>
> ---
> Okay, it took a lot of hours of debugging but boiled down to a two-liner
> fix.  I feel so empty. :-) RAID6 triggers this reliably because it
> uses BIO_BARRIER heavily to update its superblock.  The recent ATA
> suspend/resume rewrite is hit by this because it uses SCSI internal
> commands to spin down and up the drives for suspending and resuming.
> 
> David, please test this.  Jens, does it look okay?

Yep looks good, except for the bad multi-line comment style, but that's
minor stuff ;-)

Acked-by: Jens Axboe <jens.axboe@oracle.com>

-- 
Jens Axboe



* [PATCH] block: always requeue !fs requests at the front
  2007-06-15 11:05                                               ` Jens Axboe
@ 2007-06-15 11:17                                                 ` Tejun Heo
  2007-06-15 11:21                                                   ` Jens Axboe
  2007-06-15 15:08                                                   ` Jeff Garzik
  2007-06-16 19:54                                                 ` Christoph Hellwig
  1 sibling, 2 replies; 48+ messages in thread
From: Tejun Heo @ 2007-06-15 11:17 UTC (permalink / raw)
  To: Jens Axboe
  Cc: David Greaves, Rafael J. Wysocki, Linus Torvalds, David Chinner,
	xfs, 'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

SCSI marks internal commands with REQ_PREEMPT and pushes them at the front
of the request queue using blk_execute_rq().  When entering suspended
or frozen state, SCSI devices are quiesced using
scsi_device_quiesce().  In quiesced state, only REQ_PREEMPT requests
are processed.  This is how SCSI blocks other requests out while
suspending and resuming.  As all internal commands are pushed at the
front of the queue, this usually works.

Unfortunately, this interacts badly with ordered requeueing.  To
preserve request order on requeueing (due to busy device, active EH or
other failures), requests are sorted according to ordered sequence on
requeue if IO barrier is in progress.

The following sequence deadlocks.

1. IO barrier sequence issues.

2. Suspend requested.  Queue is quiesced with part or all of IO
   barrier sequence at the front.

3. During suspending or resuming, SCSI issues internal command which
   gets deferred and requeued for some reason.  As the command is
   issued after the IO barrier in #1, ordered requeueing code puts the
   request after IO barrier sequence.

4. The device is ready to process requests again but still is in
   quiesced state and the first request of the queue isn't
   REQ_PREEMPT, so command processing is deadlocked -
   suspending/resuming waits for the issued request to complete while
   the request can't be processed till device is put back into
   running state by resuming.

This can be fixed by always putting !fs requests at the front when
requeueing.

The following thread reports this deadlock.

  http://thread.gmane.org/gmane.linux.kernel/537473

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: David Greaves <david@dgreaves.com>
---
> Yep looks good, except for the bad multi-line comment style, but that's
> minor stuff ;-)

That's how Jeff likes it in libata and my fingers are hardcoded to it,
but I do appreciate the paramount importance of each maintainer's
right to his/her own comment style, so here's the respun patch.  :-)

 block/ll_rw_blk.c |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 6b5173a..c99b463 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -340,6 +340,15 @@ unsigned blk_ordered_req_seq(struct request *rq)
 	if (rq == &q->post_flush_rq)
 		return QUEUE_ORDSEQ_POSTFLUSH;
 
+	/*
+	 * !fs requests don't need to follow barrier ordering.  Always
+	 * put them at the front.  This fixes the following deadlock.
+	 *
+	 * http://thread.gmane.org/gmane.linux.kernel/537473
+	 */
+	if (!blk_fs_request(rq))
+		return QUEUE_ORDSEQ_DRAIN;
+
 	if ((rq->cmd_flags & REQ_ORDERED_COLOR) ==
 	    (q->orig_bar_rq->cmd_flags & REQ_ORDERED_COLOR))
 		return QUEUE_ORDSEQ_DRAIN;


* Re: [PATCH] block: always requeue !fs requests at the front
  2007-06-15 11:17                                                 ` Tejun Heo
@ 2007-06-15 11:21                                                   ` Jens Axboe
  2007-06-15 15:08                                                   ` Jeff Garzik
  1 sibling, 0 replies; 48+ messages in thread
From: Jens Axboe @ 2007-06-15 11:21 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David Greaves, Rafael J. Wysocki, Linus Torvalds, David Chinner,
	xfs, 'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

On Fri, Jun 15 2007, Tejun Heo wrote:
> SCSI marks internal commands with REQ_PREEMPT and pushes them at the front
> of the request queue using blk_execute_rq().  When entering suspended
> or frozen state, SCSI devices are quiesced using
> scsi_device_quiesce().  In quiesced state, only REQ_PREEMPT requests
> are processed.  This is how SCSI blocks other requests out while
> suspending and resuming.  As all internal commands are pushed at the
> front of the queue, this usually works.
> 
> Unfortunately, this interacts badly with ordered requeueing.  To
> preserve request order on requeueing (due to busy device, active EH or
> other failures), requests are sorted according to ordered sequence on
> requeue if IO barrier is in progress.
> 
> The following sequence deadlocks.
> 
> 1. IO barrier sequence issues.
> 
> 2. Suspend requested.  Queue is quiesced with part or all of IO
>    barrier sequence at the front.
> 
> 3. During suspending or resuming, SCSI issues internal command which
>    gets deferred and requeued for some reason.  As the command is
>    issued after the IO barrier in #1, ordered requeueing code puts the
>    request after IO barrier sequence.
> 
> 4. The device is ready to process requests again but still is in
>    quiesced state and the first request of the queue isn't
>    REQ_PREEMPT, so command processing is deadlocked -
>    suspending/resuming waits for the issued request to complete while
>    the request can't be processed till device is put back into
>    running state by resuming.
> 
> This can be fixed by always putting !fs requests at the front when
> requeueing.
> 
> The following thread reports this deadlock.
> 
>   http://thread.gmane.org/gmane.linux.kernel/537473
> 
> Signed-off-by: Tejun Heo <htejun@gmail.com>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> Cc: David Greaves <david@dgreaves.com>
> ---
> > Yep looks good, except for the bad multi-line comment style, but that's
> > minor stuff ;-)
> 
> That's how Jeff likes it in libata and my fingers are hardcoded to it,
> but I do appreciate the paramount importance of each maintainer's
> right to his/her own comment style, so here's the respun patch.  :-)

Thanks a lot! I'll pass it right on for 2.6.22.

-- 
Jens Axboe



* Re: [PATCH] block: always requeue !fs requests at the front
  2007-06-15  9:42                                             ` [PATCH] block: always requeue !fs requests at the front Tejun Heo
  2007-06-15 11:05                                               ` Jens Axboe
@ 2007-06-15 13:58                                               ` David Greaves
  1 sibling, 0 replies; 48+ messages in thread
From: David Greaves @ 2007-06-15 13:58 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, Linus Torvalds, David Chinner, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik, jens.axboe

> David, please test this.  Jens, does it look okay?

Phew!

Works for me.

I applied it to 2.6.22-rc4 (along with
sata_promise_use_TF_interface_for_polling_NODATA_commands.patch), and
hibernate and resume worked.

Thanks for digging it out Tejun (and everyone else!) :)

David


* Re: [PATCH] block: always requeue !fs requests at the front
  2007-06-15 11:17                                                 ` Tejun Heo
  2007-06-15 11:21                                                   ` Jens Axboe
@ 2007-06-15 15:08                                                   ` Jeff Garzik
  1 sibling, 0 replies; 48+ messages in thread
From: Jeff Garzik @ 2007-06-15 15:08 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, David Greaves, Rafael J. Wysocki, Linus Torvalds,
	David Chinner, xfs, 'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown

Tejun Heo wrote:
> SCSI marks internal commands with REQ_PREEMPT and pushes them at the front
> of the request queue using blk_execute_rq().  When entering suspended
> or frozen state, SCSI devices are quiesced using
> scsi_device_quiesce().  In quiesced state, only REQ_PREEMPT requests
> are processed.  This is how SCSI blocks other requests out while
> suspending and resuming.  As all internal commands are pushed at the
> front of the queue, this usually works.
> 
> Unfortunately, this interacts badly with ordered requeueing.  To
> preserve request order on requeueing (due to busy device, active EH or
> other failures), requests are sorted according to ordered sequence on
> requeue if IO barrier is in progress.
> 
> The following sequence deadlocks.
> 
> 1. IO barrier sequence issues.
> 
> 2. Suspend requested.  Queue is quiesced with part or all of IO
>    barrier sequence at the front.
> 
> 3. During suspending or resuming, SCSI issues internal command which
>    gets deferred and requeued for some reason.  As the command is
>    issued after the IO barrier in #1, ordered requeueing code puts the
>    request after IO barrier sequence.
> 
> 4. The device is ready to process requests again but still is in
>    quiesced state and the first request of the queue isn't
>    REQ_PREEMPT, so command processing is deadlocked -
>    suspending/resuming waits for the issued request to complete while
>    the request can't be processed till device is put back into
>    running state by resuming.
> 
> This can be fixed by always putting !fs requests at the front when
> requeueing.
> 
> The following thread reports this deadlock.
> 
>   http://thread.gmane.org/gmane.linux.kernel/537473
> 
> Signed-off-by: Tejun Heo <htejun@gmail.com>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> Cc: David Greaves <david@dgreaves.com>

Acked-by: Jeff Garzik <jeff@garzik.org>

Thanks Tejun, you kick ass as usual.

	Jeff





* Re: [PATCH] block: always requeue !fs requests at the front
  2007-06-15 11:05                                               ` Jens Axboe
  2007-06-15 11:17                                                 ` Tejun Heo
@ 2007-06-16 19:54                                                 ` Christoph Hellwig
  2007-06-17  7:29                                                   ` Jens Axboe
  1 sibling, 1 reply; 48+ messages in thread
From: Christoph Hellwig @ 2007-06-16 19:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Tejun Heo, David Greaves, Rafael J. Wysocki, Linus Torvalds,
	David Chinner, xfs, 'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

On Fri, Jun 15, 2007 at 01:05:44PM +0200, Jens Axboe wrote:
> On Fri, Jun 15 2007, Tejun Heo wrote:
> > SCSI marks internal commands with REQ_PREEMPT and pushes them to the front
> > of the request queue using blk_execute_rq().  When entering suspended
> > or frozen state, SCSI devices are quiesced using
> > scsi_device_quiesce().  In quiesced state, only REQ_PREEMPT requests
> > are processed.  This is how SCSI blocks other requests out while
> > suspending and resuming.  As all internal commands are pushed at the
> > front of the queue, this usually works.
> > 
> > Unfortunately, this interacts badly with ordered requeueing.  To
> > preserve request order on requeueing (due to busy device, active EH or
> > other failures), requests are sorted according to ordered sequence on
> > requeue if IO barrier is in progress.
> > 
> > The following sequence deadlocks.
> > 
> > 1. IO barrier sequence issues.
> > 
> > 2. Suspend requested.  Queue is quiesced with part or all of IO
> >    barrier sequence at the front.
> > 
> > 3. During suspending or resuming, SCSI issues internal command which
> >    gets deferred and requeued for some reason.  As the command is
> >    issued after the IO barrier in #1, ordered requeueing code puts the
> >    request after IO barrier sequence.
> > 
> > 4. The device is ready to process requests again but still is in
> >    quiesced state and the first request of the queue isn't
> >    REQ_PREEMPT, so command processing is deadlocked -
> >    suspending/resuming waits for the issued request to complete while
> >    the request can't be processed till device is put back into
> >    running state by resuming.
> > 
> > This can be fixed by always putting !fs requests at the front when
> > requeueing.
> > 
> > The following thread reports this deadlock.
> > 
> >   http://thread.gmane.org/gmane.linux.kernel/537473
> > 
> > Signed-off-by: Tejun Heo <htejun@gmail.com>
> > Cc: Jens Axboe <jens.axboe@oracle.com>
> > Cc: David Greaves <david@dgreaves.com>
> > ---
> > Okay, it took a lot of hours of debugging but boiled down to a two-line
> > fix.  I feel so empty. :-) RAID6 triggers this reliably because it
> > uses BIO_BARRIER heavily to update its superblock.  The recent ATA
> > suspend/resume rewrite is hit by this because it uses SCSI internal
> > commands to spin down and up the drives for suspending and resuming.
> > 
> > David, please test this.  Jens, does it look okay?
> 
> Yep looks good, except for the bad multi-line comment style, but that's
> minor stuff ;-)
> 
> Acked-by: Jens Axboe <jens.axboe@oracle.com>

I'd much much prefer having a description of the problem in the actual
comment than a hyperlink.  There's just too much chance of the latter
breaking over time, and it's impossible to update it when things change
that should be reflected in the comment.
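
For reference, the deadlock described in the quoted commit text can be
sketched as a toy user-space model.  This is only an illustration under
stated assumptions, not the kernel code: the Request class, requeue(),
run_quiesced() and simulate() are hypothetical names, and the barrier
sequence is assumed to be three fs requests sitting at the head of the
queue.  The model only shows where a requeued internal (REQ_PREEMPT,
!fs) command lands, and whether a quiesced device can still reach it.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    is_fs: bool     # True for filesystem requests (the barrier sequence)
    preempt: bool   # models the REQ_PREEMPT flag on SCSI internal commands


def requeue(queue, rq, barrier_len, front_for_non_fs):
    """Put a deferred request back on the queue.

    barrier_len is the number of in-progress barrier requests at the head.
    Without the fix, ordered requeueing places rq after that sequence.
    With the fix, !fs requests always go to the front.
    """
    if front_for_non_fs and not rq.is_fs:
        queue.appendleft(rq)
    else:
        queue.insert(barrier_len, rq)


def run_quiesced(queue):
    """Dispatch from the head while quiesced: only preempt requests run."""
    done = []
    while queue and queue[0].preempt:
        done.append(queue.popleft().name)
    return done


def simulate(front_for_non_fs):
    # Step 1: a barrier sequence has been issued (assumed three fs requests).
    barrier = [
        Request("preflush", True, False),
        Request("barrier-write", True, False),
        Request("postflush", True, False),
    ]
    # Step 2: the queue is quiesced with the barrier sequence at the front.
    queue = deque(barrier)
    # Step 3: an internal command is deferred and requeued.
    internal = Request("spin-down", False, True)
    requeue(queue, internal, len(barrier), front_for_non_fs)
    # Step 4: the device is ready again but still quiesced.
    return run_quiesced(queue)


if __name__ == "__main__":
    print("ordered requeue:", simulate(front_for_non_fs=False))
    print("!fs at front:   ", simulate(front_for_non_fs=True))
```

With ordered requeueing the internal command ends up behind the barrier
requests and is never dispatched; with !fs requests requeued at the front
it is the first request seen and completes.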


* Re: [PATCH] block: always requeue !fs requests at the front
  2007-06-16 19:54                                                 ` Christoph Hellwig
@ 2007-06-17  7:29                                                   ` Jens Axboe
  2007-06-17  8:03                                                     ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: Jens Axboe @ 2007-06-17  7:29 UTC (permalink / raw)
  To: Christoph Hellwig, Tejun Heo, David Greaves, Rafael J. Wysocki,
	Linus Torvalds, David Chinner, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

On Sat, Jun 16 2007, Christoph Hellwig wrote:
> On Fri, Jun 15, 2007 at 01:05:44PM +0200, Jens Axboe wrote:
> > On Fri, Jun 15 2007, Tejun Heo wrote:
> > > SCSI marks internal commands with REQ_PREEMPT and pushes them to the front
> > > of the request queue using blk_execute_rq().  When entering suspended
> > > or frozen state, SCSI devices are quiesced using
> > > scsi_device_quiesce().  In quiesced state, only REQ_PREEMPT requests
> > > are processed.  This is how SCSI blocks other requests out while
> > > suspending and resuming.  As all internal commands are pushed at the
> > > front of the queue, this usually works.
> > > 
> > > Unfortunately, this interacts badly with ordered requeueing.  To
> > > preserve request order on requeueing (due to busy device, active EH or
> > > other failures), requests are sorted according to ordered sequence on
> > > requeue if IO barrier is in progress.
> > > 
> > > The following sequence deadlocks.
> > > 
> > > 1. IO barrier sequence issues.
> > > 
> > > 2. Suspend requested.  Queue is quiesced with part or all of IO
> > >    barrier sequence at the front.
> > > 
> > > 3. During suspending or resuming, SCSI issues internal command which
> > >    gets deferred and requeued for some reason.  As the command is
> > >    issued after the IO barrier in #1, ordered requeueing code puts the
> > >    request after IO barrier sequence.
> > > 
> > > 4. The device is ready to process requests again but still is in
> > >    quiesced state and the first request of the queue isn't
> > >    REQ_PREEMPT, so command processing is deadlocked -
> > >    suspending/resuming waits for the issued request to complete while
> > >    the request can't be processed till device is put back into
> > >    running state by resuming.
> > > 
> > > This can be fixed by always putting !fs requests at the front when
> > > requeueing.
> > > 
> > > The following thread reports this deadlock.
> > > 
> > >   http://thread.gmane.org/gmane.linux.kernel/537473
> > > 
> > > Signed-off-by: Tejun Heo <htejun@gmail.com>
> > > Cc: Jens Axboe <jens.axboe@oracle.com>
> > > Cc: David Greaves <david@dgreaves.com>
> > > ---
> > > Okay, it took a lot of hours of debugging but boiled down to a two-line
> > > fix.  I feel so empty. :-) RAID6 triggers this reliably because it
> > > uses BIO_BARRIER heavily to update its superblock.  The recent ATA
> > > suspend/resume rewrite is hit by this because it uses SCSI internal
> > > commands to spin down and up the drives for suspending and resuming.
> > > 
> > > David, please test this.  Jens, does it look okay?
> > 
> > Yep looks good, except for the bad multi-line comment style, but that's
> > minor stuff ;-)
> > 
> > Acked-by: Jens Axboe <jens.axboe@oracle.com>
> 
> I'd much much prefer having a description of the problem in the actual
> comment than a hyperlink.  There's just too much chance of the latter
> breaking over time, and it's impossible to update it when things change
> that should be reflected in the comment.

The actual commit text is very good though, but I agree - I don't think
the url comment is worth anything. I did consider just killing it.
However, the comment does describe the problem, so I think it's still
ok.

-- 
Jens Axboe



* Re: [PATCH] block: always requeue !fs requests at the front
  2007-06-17  7:29                                                   ` Jens Axboe
@ 2007-06-17  8:03                                                     ` Tejun Heo
  0 siblings, 0 replies; 48+ messages in thread
From: Tejun Heo @ 2007-06-17  8:03 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Christoph Hellwig, David Greaves, Rafael J. Wysocki,
	Linus Torvalds, David Chinner, xfs,
	'linux-kernel@vger.kernel.org',
	linux-pm, Neil Brown, Jeff Garzik

Jens Axboe wrote:
>> I'd much much prefer having a description of the problem in the actual
>> comment than a hyperlink.  There's just too much chance of the latter
>> breaking over time, and it's impossible to update it when things change
>> that should be reflected in the comment.
> 
> The actual commit text is very good though, but I agree - I don't think
> the url comment is worth anything. I did consider just killing it.
> However, the comment does describe the problem, so I think it's still
> ok.

I thought the whole story was too long for the comment.  Feel free to
kill or edit it.

-- 
tejun


end of thread, other threads:[~2007-06-17  8:04 UTC | newest]

Thread overview: 48+ messages
2007-06-01 21:23 2.6.22-rc3 hibernate(?) disables skge wol David Greaves
2007-06-01 21:42 ` Rafael J. Wysocki
2007-06-01 22:37   ` 2.6.22-rc3 hibernate(?) fails totally - regression David Greaves
2007-06-01 23:22     ` Rafael J. Wysocki
2007-06-02 22:31       ` David Greaves
2007-06-02 22:46         ` Linus Torvalds
2007-06-03 15:03           ` David Greaves
2007-06-06  8:33             ` Tejun Heo
2007-06-06 10:18               ` [PATCH] sata_promise: use TF interface for polling NODATA commands Tejun Heo
2007-06-06 10:19                 ` Tejun Heo
2007-06-06 10:39               ` 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6) David Greaves
2007-06-07  5:53                 ` Tejun Heo
2007-06-07 10:30                   ` David Greaves
2007-06-07 11:07                     ` David Chinner
2007-06-07 13:59                       ` David Greaves
2007-06-07 22:28                         ` David Chinner
2007-06-08 19:09                           ` David Greaves
2007-06-12 18:43                             ` Linus Torvalds
2007-06-13 11:16                               ` David Greaves
2007-06-13 21:04                                 ` Linus Torvalds
2007-06-13 21:22                                   ` Jeff Garzik
2007-06-13 22:02                                   ` David Greaves
2007-06-13 22:12                                     ` Linus Torvalds
2007-06-13 23:15                                       ` Rafael J. Wysocki
2007-06-14 14:21                                         ` Tejun Heo
2007-06-14 15:10                                           ` Tejun Heo
2007-06-15  9:42                                             ` [PATCH] block: always requeue !fs requests at the front Tejun Heo
2007-06-15 11:05                                               ` Jens Axboe
2007-06-15 11:17                                                 ` Tejun Heo
2007-06-15 11:21                                                   ` Jens Axboe
2007-06-15 15:08                                                   ` Jeff Garzik
2007-06-16 19:54                                                 ` Christoph Hellwig
2007-06-17  7:29                                                   ` Jens Axboe
2007-06-17  8:03                                                     ` Tejun Heo
2007-06-15 13:58                                               ` David Greaves
2007-06-14 15:19                                           ` 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6) David Greaves
2007-06-14  0:28                                 ` David Chinner
2007-06-12 12:31                           ` David Greaves
2007-06-10 18:43                         ` Pavel Machek
2007-06-12 18:00                           ` David Greaves
2007-06-12 21:31                             ` Pavel Machek
2007-06-07 13:45                     ` Duane Griffin
2007-06-07 14:00                       ` David Greaves
2007-06-07 14:05                         ` Tejun Heo
2007-06-07 14:36                           ` Mark Lord
2007-06-07 15:20                             ` David Greaves
2007-06-07 16:58                               ` Rafael J. Wysocki
2007-06-07 20:12                     ` Pavel Machek
