* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
@ 2010-08-28 20:28 Siddhartha Jain
  2010-08-30 12:55 ` Tejun Heo
  0 siblings, 1 reply; 46+ messages in thread
From: Siddhartha Jain @ 2010-08-28 20:28 UTC
  To: linux-ide

Hi,

I have the same symptom as Stephan and came across this thread while
googling for more info.

Hardware: Macbook Pro 5,4 (Firmware version 1.7)

OS: Fedora 13 x64 
Kernel: 2.6.33.8-149.fc13.x86_64 (stock fedora kernel)
Modifications: Video uses the proprietary Nvidia driver, and root and
swap are encrypted with LUKS. Kernel boot parameters are:

kernel /vmlinuz-2.6.33.8-149.fc13.x86_64 ro
root=/dev/mapper/luks-f01b2a77-921b-45c1-8f28-6095ab3a56f1
rd_LUKS_UUID=luks-f01b2a77-921b-45c1-8f28-6095ab3a56f1
rd_LUKS_UUID=luks-91e5037b-267c-4572-9c99-baaa5ca41600
rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8
SYSFONT=latarcyrheb-sun16 KEYTABLE=us rhgb rdblacklist=nouveau vga=792
resume=UUID=91e5037b-267c-4572-9c99-baaa5ca41600 

The system appears to suspend correctly, but on resume the hard disk
does not seem to wake up. After resume, in X, the login screen
background shows up but the login prompt box does not. If I am logged in
before suspend, then after resume every command I run, including
reboot and shutdown, returns an Input/Output error. If I try to ssh in
after resume, I get the password prompt but authentication fails.

A bit more HW/SW info:

1. From lspci -vv
00:0b.0 IDE interface: nVidia Corporation MCP79 SATA Controller (rev b1)
(prog-if 85 [Master SecO PriO])
	Subsystem: nVidia Corporation Device cb79
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
	Latency: 0 (750ns min, 250ns max)
	Interrupt: pin A routed to IRQ 27
	Region 0: I/O ports at 21d8 [size=8]
	Region 1: I/O ports at 21ec [size=4]
	Region 2: I/O ports at 21d0 [size=8]
	Region 3: I/O ports at 21e8 [size=4]
	Region 4: I/O ports at 21c0 [size=16]
	Region 5: Memory at d3584000 (32-bit, non-prefetchable) [size=8K]
	Capabilities: [44] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [8c] SATA HBA v1.0 InCfgSpace
	Capabilities: [b0] MSI: Enable+ Count=1/8 Maskable- 64bit+
		Address: 00000000fee0200c  Data: 4181
	Kernel driver in use: ahci
	Kernel modules: ata_generic, pata_acpi


2. HDD Info from smartctl
# smartctl -a /dev/sda
smartctl 5.39.1 2010-01-28 r3054 [x86_64-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen,
http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     FUJITSU MJA2250BH FFS G1
Serial Number:    K94DTA22N7C1
Firmware Version: 00810020
User Capacity:    250,059,350,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 3f
Local Time is:    Sat Aug 28 13:09:06 2010 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


3. ATA related messages from bootup (dmesg | grep ATA)
ata1: SATA max UDMA/133 irq_stat 0x00400000, PHY RDY changed irq 27
ata2: SATA max UDMA/133 irq_stat 0x00400000, PHY RDY changed irq 27
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-8: FUJITSU MJA2250BH FFS G1, 00810020, max UDMA/100
ata2.00: ATAPI: HL-DT-ST DVDRW  GS23N, SB03, max UDMA/133
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
scsi 0:0:0:0: Direct-Access     ATA      FUJITSU MJA2250B 0081 PQ: 0
ANSI: 5
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)


4. # cat /sys/class/scsi_host/host0/link_power_management_policy 
max_performance


5. Some odd errors at boot-up related to 'ata' (captured by directing
*.* to a log file in rsyslog)
Aug 28 11:49:39 myfedorahat kernel: ACPI: SSDT 00000000bfee4000 000A5
(v01 SataRe  SataPri 00001000 INTL 20061109)
Aug 28 11:49:39 myfedorahat kernel: ACPI: SSDT 00000000bfee3000 0009F
(v01 SataRe  SataSec 00001000 INTL 20061109)
Aug 28 11:49:39 myfedorahat kernel:  #0 [0000000000 - 0000001000]   BIOS
data page ==> [0000000000 - 0000001000]
Aug 28 11:49:39 myfedorahat kernel: Kernel command line: ro
root=/dev/mapper/luks-f01b2a77-921b-45c1-8f28-6095ab3a56f1
rd_LUKS_UUID=luks-f01b2a77-921b-45c1-8f28-6095ab3a56f1
rd_LUKS_UUID=luks-91e5037b-267c-4572-9c99-baaa5ca41600
rd_NO_LVM rd_NO_MD rd_NO_DM LANG=en_US.UTF-8
SYSFONT=latarcyrheb-sun16 KEYTABLE=us rhgb rdblacklist=nouveau vga=792
resume=UUID=91e5037b-267c-4572-9c99-baaa5ca41600 
Aug 28 11:49:39 myfedorahat kernel: Memory: 3758324k/5242880k available
(4289k kernel code, 1336536k absent, 148020k reserved, 7539k data, 756k
init)
Aug 28 11:49:39 myfedorahat kernel: ACPI: EC: GPE = 0x3f, I/O:
command/status = 0x66, data = 0x62
Aug 28 11:49:39 myfedorahat kernel: libata version 3.00 loaded.
Aug 28 11:49:39 myfedorahat kernel: ata1: SATA max UDMA/133 irq_stat
0x00400000, PHY RDY changed irq 27
Aug 28 11:49:39 myfedorahat kernel: ata2: SATA max UDMA/133 irq_stat
0x00400000, PHY RDY changed irq 27
Aug 28 11:49:39 myfedorahat kernel: ata3: DUMMY
Aug 28 11:49:39 myfedorahat kernel: ata4: DUMMY
Aug 28 11:49:39 myfedorahat kernel: ata5: DUMMY
Aug 28 11:49:39 myfedorahat kernel: ata6: DUMMY
Aug 28 11:49:39 myfedorahat kernel: ata1: SATA link up 1.5 Gbps (SStatus
113 SControl 300)
Aug 28 11:49:39 myfedorahat kernel: ata2: SATA link up 1.5 Gbps (SStatus
113 SControl 300)
Aug 28 11:49:39 myfedorahat kernel: ata1.00: ATA-8: FUJITSU MJA2250BH
FFS G1, 00810020, max UDMA/100
Aug 28 11:49:39 myfedorahat kernel: ata1.00: 488397168 sectors, multi
16: LBA48 NCQ (depth 31/32)
Aug 28 11:49:39 myfedorahat kernel: ata1.00: configured for UDMA/100
Aug 28 11:49:39 myfedorahat kernel: ata1: exception Emask 0x10 SAct 0x0
SErr 0x5850000 action 0xe frozen
Aug 28 11:49:39 myfedorahat kernel: ata1: irq_stat 0x00000040,
connection status changed
Aug 28 11:49:39 myfedorahat kernel: ata1: SError: { PHYRdyChg CommWake
LinkSeq TrStaTrns DevExch }
Aug 28 11:49:39 myfedorahat kernel: ata2.00: ATAPI: HL-DT-ST DVDRW
GS23N, SB03, max UDMA/133
Aug 28 11:49:39 myfedorahat kernel: ata1: hard resetting link
Aug 28 11:49:39 myfedorahat kernel: ata2.00: configured for UDMA/133
Aug 28 11:49:39 myfedorahat kernel: ata2: exception Emask 0x10 SAct 0x0
SErr 0x5950000 action 0xe frozen
Aug 28 11:49:39 myfedorahat kernel: ata2: irq_stat 0x00400040,
connection status changed
Aug 28 11:49:39 myfedorahat kernel: ata2: SError: { PHYRdyChg CommWake
Dispar LinkSeq TrStaTrns DevExch }
Aug 28 11:49:39 myfedorahat kernel: ata2: hard resetting link
Aug 28 11:49:39 myfedorahat kernel: ata1: SATA link up 1.5 Gbps (SStatus
113 SControl 300)
Aug 28 11:49:39 myfedorahat kernel: ata1.00: configured for UDMA/100
Aug 28 11:49:39 myfedorahat kernel: ata1: EH complete
Aug 28 11:49:39 myfedorahat kernel: ata2: SATA link up 1.5 Gbps (SStatus
113 SControl 300)
Aug 28 11:49:39 myfedorahat kernel: ata2.00: configured for UDMA/133
Aug 28 11:49:39 myfedorahat kernel: ata2: EH complete
Aug 28 11:49:39 myfedorahat kernel: Write protecting the kernel
read-only data: 10240k
Aug 28 11:49:39 myfedorahat kernel: EXT4-fs (dm-0): mounted filesystem
with ordered data mode
Aug 28 11:49:39 myfedorahat kernel: EXT4-fs (sda1): mounted filesystem
with ordered data mode
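
For reference, the SErr values in the exception lines above can be
decoded by hand. The following is a minimal sketch in Python, assuming
the bit positions of the SATA SError register (ERR in bits 0-15, DIAG in
bits 16-31) and the short names libata prints in its "SError: { ... }"
line. The table and function names are made up for illustration and are
not taken from the kernel source.

```python
# Bit position -> short name, following the strings libata prints.
# Positions assume the standard SATA SError register layout.
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all bits set in an SError value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # The two SErr values reported for ata1 and ata2 in the boot log.
    for serr in (0x5850000, 0x5950000):
        print(hex(serr), decode_serror(serr))
```

Both values decode to the same names the kernel printed, so the flags
describe a link state change (PHY ready change, COMWAKE, device
exchange) rather than a data error on an established link.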


I am going to try pointing rsyslog at a remote syslog server to see if I
can capture what happens at resume-time but any other pointers to
debug/troubleshoot this issue will be helpful. I tried booting with
"pci=nomsi", but that makes the box hang at boot-up. I also tried
disabling ACPI and switching to APM, but that seems risky because, after
suspend, APM does not seem to have control over the CPU and fans. With
APM enabled, I suspended the MacBook and put it in my bag. Four hours
later, when I pulled it out, the fan was spinning at its highest RPM and
the MacBook was running really hot. With ACPI, at least, the only damage
is to the filesystem when I have to force a reboot on resume :)

Thanks,

- Siddhartha



* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-28 20:28 [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM Siddhartha Jain
@ 2010-08-30 12:55 ` Tejun Heo
  2010-08-30 15:25   ` Siddhartha Jain
  0 siblings, 1 reply; 46+ messages in thread
From: Tejun Heo @ 2010-08-30 12:55 UTC
  To: Siddhartha Jain; +Cc: linux-ide

On 08/28/2010 10:28 PM, Siddhartha Jain wrote:
> I am going to try pointing rsyslog at a remote syslog server to see if I
> can capture what happens at resume-time but any other pointers to
> debug/troubleshoot this issue will be helpful.

Setting up netconsole (see Documentation/networking/netconsole.txt in
the kernel source tree) would be better.
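
A minimal sketch of the parameter string described in that file, for
anyone following along. The documented format is
netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr].
The helper name and the addresses below are placeholders chosen for
illustration, not values from this report; fields left empty fall back
to the kernel defaults.

```python
def netconsole_param(tgt_ip, tgt_port="", tgt_mac="", src_ip="", src_port="", dev=""):
    """Build a netconsole= string; empty fields are left for the kernel to default."""
    if not tgt_ip:
        raise ValueError("the target IP address is the only mandatory field")
    return f"netconsole={src_port}@{src_ip}/{dev},{tgt_port}@{tgt_ip}/{tgt_mac}"


if __name__ == "__main__":
    # Placeholder address: stands in for the machine that collects the log.
    print(netconsole_param("192.168.1.100", tgt_port=6666, dev="eth0"))
```

The resulting string can go on the kernel command line or be passed as
a module option when loading netconsole, with a UDP listener (netcat,
for example) running on the target host.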

Thanks.

-- 
tejun


* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-30 12:55 ` Tejun Heo
@ 2010-08-30 15:25   ` Siddhartha Jain
  2010-08-30 22:26     ` Siddhartha Jain
  0 siblings, 1 reply; 46+ messages in thread
From: Siddhartha Jain @ 2010-08-30 15:25 UTC
  To: linux-ide

Hi Tejun,

I will set up netconsole to get more debugging info. Meanwhile, the
remote syslog did capture something important, I think. It looks like,
for some reason, the device name for the hdd changes from sda to sdc
after resume. Please see the logs below. For ease of troubleshooting, I
re-installed the box with stock Fedora 13 x64 with no modifications
except that dm-crypt is turned on for root and swap (so it is using the
Nouveau display driver instead of Nvidia).

Aug 29 19:13:45 192.168.1.199 kernel: PM: Syncing filesystems ... done.
Aug 29 19:13:45 192.168.1.199 kernel: PM: Preparing system for mem sleep
Aug 29 19:18:57 192.168.1.199 kernel: scsi 0:0:0:0: Direct-Access
ATA      FUJITSU MJA2250B 0081 PQ: 0 ANSI: 5
Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: Attached scsi
generic sg0 type 0
Aug 29 19:18:57 192.168.1.199 kernel: ata2.00: detaching (SCSI 1:0:0:0)
Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: [sdc] 488397168
512-byte logical blocks: (250 GB/232 GiB)
Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: [sdc] Write Protect is off
Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: [sdc] Mode Sense: 00 3a 00 00
Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: [sdc] Write cache:
enabled, read cache: enabled, doesn't support DPO or FUA
Aug 29 19:18:57 192.168.1.199 kernel: sdc:
Aug 29 19:18:57 192.168.1.199 kernel: scsi 1:0:0:0: CD-ROM
HL-DT-ST DVDRW  GS23N     SB03 PQ: 0 ANSI: 5
Aug 29 19:18:57 192.168.1.199 kernel: sr0: scsi3-mmc drive: 24x/24x
writer cd/rw xa/form2 cdda caddy
Aug 29 19:18:57 192.168.1.199 kernel: sr 1:0:0:0: Attached scsi CD-ROM sr0
Aug 29 19:18:57 192.168.1.199 kernel: sr 1:0:0:0: Attached scsi
generic sg1 type 5
Aug 29 19:18:58 192.168.1.199 kernel: sdc1 sdc2 sdc3
Aug 29 19:18:58 192.168.1.199 kernel: sd 0:0:0:0: [sdc] Attached SCSI disk
Aug 29 19:20:20 192.168.1.199 ntpd[1466]: 0.0.0.0 c61c 0c clock_step -0.389692 s
Aug 29 19:20:20 192.168.1.199 ntpd[1466]: 0.0.0.0 c614 04 freq_mode
Aug 29 19:20:21 192.168.1.199 ntpd[1466]: 0.0.0.0 c618 08 no_sys_peer
Aug 29 19:22:29 192.168.1.199 sshd[2425]: WARNING: no suitable primes
in /etc/ssh/primes
Aug 29 19:22:29 192.168.1.199 rsyslogd: /var/log/secure
Aug 29 19:22:37 192.168.1.199 auditd[1734]: Record was not written to
disk (Read-only file system)
Aug 29 19:22:37 192.168.1.199 auditd[1734]: write: Audit daemon
detected an error writing an event to disk (Connection refused)
Aug 29 19:22:37 192.168.1.199 sshd[2425]: Failed password for sjain
from 192.168.1.100 port 49902 ssh2
Aug 29 19:22:37 192.168.1.199 auditd[2428]: Audit daemon failed to exec (null)
Aug 29 19:22:37 192.168.1.199 auditd[2428]: The audit daemon is exiting.
Aug 29 19:22:37 192.168.1.199 auditd[1734]: Record was not written to
disk (Read-only file system)
Aug 29 19:22:37 192.168.1.199 auditd[1734]: write: Audit daemon
detected an error writing an event to disk (Read-only file system)
Aug 29 19:22:37 192.168.1.199 auditd[2429]: Audit daemon failed to exec (null)
Aug 29 19:22:37 192.168.1.199 auditd[2429]: The audit daemon is exiting.
Aug 29 19:22:37 192.168.1.199 kernel: type=1305
audit(1283134957.511:19078): audit_pid=0 old=1734 auid=4294967295
ses=4294967295 res=1
Aug 29 19:22:37 192.168.1.199 kernel: type=1305
audit(1283134957.511:19079): audit_pid=0 old=0 auid=4294967295
ses=4294967295 res=1
Aug 29 19:22:54 192.168.1.199 kernel: EXT3-fs error (device dm-0):
ext3_find_entry: reading directory #10445218 offset 0
Aug 29 19:22:54 192.168.1.199 sshd[2426]: Connection closed by 192.168.1.100
Aug 29 19:23:17 192.168.1.199 auditd[2428]: Error setting audit daemon
pid (Resource temporarily unavailable)
Aug 29 19:23:18 192.168.1.199 auditd[2429]: Error setting audit daemon
pid (Resource temporarily unavailable)
Aug 29 19:34:26 192.168.1.199 kernel: CE: hpet increasing min_delta_ns
to 15000 nsec
Aug 29 19:35:46 192.168.1.199 ntpd[1466]: 0.0.0.0 c612 02 freq_set
kernel -65.666 PPM
Aug 29 19:35:46 192.168.1.199 ntpd[1466]: 0.0.0.0 c615 05 clock_sync
Aug 29 19:38:26 192.168.1.199 kernel: EXT3-fs error (device dm-0):
ext3_find_entry: reading directory #1803721 offset 0

Odd?
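
A small sketch for spotting this kind of rename in a captured log,
assuming the usual sd driver line format "sd H:C:I:L: [sdX] ...". The
regular expression, the function names, and the sample input are
illustrative only.

```python
import re

# Matches sd driver lines such as "sd 0:0:0:0: [sdc] Attached SCSI disk".
SD_LINE = re.compile(r"\bsd (\d+:\d+:\d+:\d+): \[(sd[a-z]+)\]")


def disk_names_by_address(lines):
    """Map each SCSI address (host:channel:id:lun) to the disk names seen, in order."""
    seen = {}
    for line in lines:
        match = SD_LINE.search(line)
        if not match:
            continue
        address, name = match.groups()
        names = seen.setdefault(address, [])
        # Record a name only when it differs from the previous one.
        if not names or names[-1] != name:
            names.append(name)
    return seen


def renamed_disks(lines):
    """Return only the addresses whose disk name changed during the log."""
    return {addr: names for addr, names in disk_names_by_address(lines).items()
            if len(names) > 1}


if __name__ == "__main__":
    sample = [
        "sd 0:0:0:0: [sda] Stopping disk",
        "kernel: sd 0:0:0:0: [sdc] Write Protect is off",
        "kernel: sd 0:0:0:0: [sdc] Attached SCSI disk",
    ]
    print(renamed_disks(sample))
```

Feeding it a full resume log lists every SCSI address whose block
device name changed, which makes a rename like sda to sdc easy to spot.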

Thanks,

- Siddhartha






-- 
- Siddhartha
  WV6U


* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-30 15:25   ` Siddhartha Jain
@ 2010-08-30 22:26     ` Siddhartha Jain
  2010-08-31  0:59       ` Siddhartha Jain
  0 siblings, 1 reply; 46+ messages in thread
From: Siddhartha Jain @ 2010-08-30 22:26 UTC
  To: linux-ide

I booted into runlevel 3 and manually invoked suspend via "pm-suspend".
On resume, I got these logs via netconsole. For some reason, sda becomes
sdc and a stream of I/O and filesystem errors follows.


PM: Syncing filesystems ... done.
PM: Preparing system for mem sleep
Freezing user space processes ... (elapsed 0.01 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
PM: Entering mem sleep
Suspending console(s) (use no_console_suspend to debug)
eth0: link up.
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata2.00: configured for UDMA/133
ata2: EH complete
Buffer I/O error on device dm-0, logical block 41791664
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 41791665
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 41791666
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 41800003
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 41963521
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 41789440
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 41789449
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 41791667
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 41791667
lost page write due to I/O error on dm-0
EXT3-fs: ext3_journal_dirty_data: aborting transaction: IO failure in
ext3_journal_dirty_data
EXT3-fs (dm-0): error in ext3_orphan_add: Readonly filesystem
EXT3-fs (dm-0): error in ext3_ordered_write_end: IO failure
------------[ cut here ]------------
WARNING: at fs/buffer.c:1159 mark_buffer_dirty+0x2b/0x86()
Hardware name: MacBookPro5,4
Modules linked in: rfcomm
sco bridge stp llc bnep l2cap sbs sbshc coretemp sunrpc
cpufreq_ondemand acpi_cpufreq freq_table ip6t_REJECT nf_conntrack_ipv6
ip6table_filter ip6_tables ipv6 netconsole configfs uinput b43
snd_hda_codec_cirrus snd_hda_intel mac80211 snd_hda_codec cfg80211
snd_hwdep snd_seq uvcvideo snd_seq_device ssb snd_pcm applesmc
snd_timer videodev snd forcedeth v4l1_compat mmc_core shpchp
v4l2_compat_ioctl32 btusb soundcore snd_page_alloc bluetooth
i2c_nforce2 joydev input_polldev bcm5974 appleir rfkill mbp_nvidia_bl
microcode aes_x86_64 aes_generic xts gf128mul dm_crypt firewire_ohci
usb_storage firewire_core crc_itu_t nouveau ttm drm_kms_helper drm
i2c_algo_bit video output i2c_core [last unloaded: scsi_wait_scan]
Pid: 1989, comm: rsyslogd Not tainted 2.6.33.8-149.fc13.x86_64 #1
Call Trace:
 [<ffffffff81049ecc>] warn_slowpath_common+0x77/0x8f
 [<ffffffff81049ef3>] warn_slowpath_null+0xf/0x11
 [<ffffffff81120247>] mark_buffer_dirty+0x2b/0x86
 [<ffffffff81162b24>] ext3_commit_super.clone.0+0x54/0x64
 [<ffffffff81162bc4>] ext3_handle_error+0x90/0xb7
 [<ffffffff81162c5d>] __ext3_std_error+0x72/0x8f
 [<ffffffff81162cb9>] __ext3_journal_stop+0x3f/0x49
 [<ffffffff8115cf9d>] ext3_ordered_write_end+0xd7/0x11b
 [<ffffffff810bf335>] generic_file_buffered_write+0x165/0x25f
 [<ffffffff8115ae9d>] ? ext3_dirty_inode+0x7b/0x84
 [<ffffffff810c0479>] __generic_file_aio_write+0x247/0x27c
 [<ffffffff810c0506>] generic_file_aio_write+0x58/0xa2
 [<ffffffff811000cd>] do_sync_write+0xbf/0xfc
 [<ffffffff8103b7d4>] ? pick_next_task+0x22/0x41
 [<ffffffff81427c78>] ? schedule+0x850/0x8e6
 [<ffffffff8100f02f>] ? native_sched_clock+0x2d/0x5f
 [<ffffffff811bcb9f>] ? security_file_permission+0x11/0x13
 [<ffffffff8110062b>] vfs_write+0xa9/0x106
 [<ffffffff8110073e>] sys_write+0x45/0x69
 [<ffffffff81008b02>] system_call_fastpath+0x16/0x1b
---[ end trace 65d387dd571aab2f ]---
JBD: Detected IO errors while flushing file data on dm-0
Aborting journal on device dm-0.
EXT3-fs (dm-0): error in ext3_orphan_add: Journal has aborted
EXT3-fs (dm-0): error in ext3_truncate: Journal has aborted
JBD: Detected IO errors while flushing file data on dm-0
EXT3-fs (dm-0): error: ext3_journal_start_sb: Detected aborted journal
EXT3-fs (dm-0): error: ext3_journal_start_sb: Detected aborted journal
EXT3-fs (dm-0): error: remounting filesystem read-only
INFO: task ata_aux:27 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ata_aux       D ffff8800057157c0     0    27      2 0x00000000
 ffff88013b441950 0000000000000046 ffff8801382a0000 0000000000000000
 ffff88013b4418d0 ffffffff812bb5f8 ffff88013b441960 ffff88013b441fd8
 ffff88013b441fd8 000000000000f9b0 00000000000157c0 ffff88013b43b268
Call Trace:
 [<ffffffff812bb5f8>] ? __scsi_get_command+0x15/0x90
 [<ffffffff81428079>] schedule_timeout+0x31/0xde
 [<ffffffff811fb10c>] ? kobject_put+0x47/0x4b
 [<ffffffff812a8a74>] ? put_device+0x12/0x14
 [<ffffffff81427f0b>] wait_for_common+0xd1/0x12c
 [<ffffffff81044f21>] ? default_wake_function+0x0/0xf
 [<ffffffff811e980c>] ? __generic_unplug_device+0x2d/0x32
 [<ffffffff81427ff0>] wait_for_completion+0x18/0x1a
 [<ffffffff811ed7a1>] blk_execute_rq+0xa1/0xd0
 [<ffffffffa014126c>] ? write_msg+0xe5/0xf4 [netconsole]
 [<ffffffff811ea366>] ? blk_get_request+0x3c/0x72
 [<ffffffff812c1ca5>] scsi_execute+0xf1/0x143
 [<ffffffff812c1d9b>] scsi_execute_req+0xa4/0xd6
 [<ffffffff812c8ef8>] sd_sync_cache+0x95/0xf3
 [<ffffffff81427274>] ? printk+0x3c/0x40
 [<ffffffff812c9115>] sd_shutdown+0x93/0x116
 [<ffffffff812c92da>] sd_remove+0x56/0x8b
 [<ffffffff812ac36a>] __device_release_driver+0x70/0xc3
 [<ffffffff812ac481>] device_release_driver+0x1e/0x2b
 [<ffffffff812ab6aa>] bus_remove_device+0xb1/0xe0
 [<ffffffff812a9953>] device_del+0x135/0x1a1
 [<ffffffff812c5592>] __scsi_remove_device+0x4d/0x98
 [<ffffffff812c55fe>] scsi_remove_device+0x21/0x2e
 [<ffffffff812db944>] ata_scsi_handle_link_detach+0xfe/0x131
 [<ffffffff812dbb29>] ata_scsi_hotplug+0x26/0x62
 [<ffffffff8105f6e9>] worker_thread+0x1a4/0x232
 [<ffffffff812dbb03>] ? ata_scsi_hotplug+0x0/0x62
 [<ffffffff810631bf>] ? autoremove_wake_function+0x0/0x34
 [<ffffffff8105f545>] ? worker_thread+0x0/0x232
 [<ffffffff81062d6f>] kthread+0x7a/0x82
 [<ffffffff81009924>] kernel_thread_helper+0x4/0x10
 [<ffffffff81062cf5>] ? kthread+0x0/0x82
 [<ffffffff81009920>] ? kernel_thread_helper+0x0/0x10
sd 0:0:0:0: timing out command, waited 180s
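
For reference, the resume log shows "SControl 310" where the boot log
showed "SControl 300". Below is a minimal sketch for splitting these
values into their fields, assuming the standard SATA register layout
(DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8) and that libata
prints the values in hex without a 0x prefix. The function and table
names are made up for illustration.

```python
def decode_fields(hex_text):
    """Split an SStatus/SControl value, given as hex text, into DET, SPD and IPM."""
    value = int(hex_text, 16)
    return {
        "DET": value & 0xF,          # device detection
        "SPD": (value >> 4) & 0xF,   # current speed (SStatus) or speed limit (SControl)
        "IPM": (value >> 8) & 0xF,   # interface power management
    }


# Meaning of the SPD field when it appears in SControl.
SCONTROL_SPD = {
    0: "no speed restriction",
    1: "limit to Gen1 (1.5 Gbps)",
    2: "limit to Gen2 (3.0 Gbps)",
}


if __name__ == "__main__":
    samples = (
        ("SStatus", "113"),
        ("SControl at boot", "300"),
        ("SControl after resume", "310"),
    )
    for label, text in samples:
        print(label, text, decode_fields(text))
```

Read this way, the only difference between the two SControl values is
the SPD field going from 0 to 1, a speed limit of Gen1. The logs here
do not show whether that is related to the failure.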


Thanks,

- Siddhartha





-- 
- Siddhartha
  WV6U


* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-30 22:26     ` Siddhartha Jain
@ 2010-08-31  0:59       ` Siddhartha Jain
  2010-08-31  2:03         ` Siddhartha Jain
  0 siblings, 1 reply; 46+ messages in thread
From: Siddhartha Jain @ 2010-08-31  0:59 UTC
  To: linux-ide

Re-installed with encryption off and netconsole logging on. Same
result, that is, sda switches to sdc and everything goes haywire.

eth0: link up.
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: configured for UDMA/100
ata1: EH complete
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#2884877 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#2884877 offset 0
sd 0:0:0:0: [sda] Stopping disk
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#13631490 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670077 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670077 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3677072 offset 0
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456489 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670077 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456489 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456450 offset 0
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=4456517, block=17825828
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3691827 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3812945 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3681172 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670113 offset 0
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#2884877 offset 0
scsi 0:0:0:0: Direct-Access     ATA      FUJITSU MJA2250B 0081 PQ: 0 ANSI: 5
sd 0:0:0:0: Attached scsi generic sg0 type 0
ata2.00: detaching (SCSI 1:0:0:0)
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#2884877 offset 0
sd 0:0:0:0: [sdc] 488397168 512-byte logical blocks: (250 GB/232 GiB)
sd 0:0:0:0: [sdc] Write Protect is off
sd 0:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
 sdc:
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456450 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456489 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670077 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670017 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670068 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#10485761 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670068 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456450 offset 0
scsi 1:0:0:0: CD-ROM            HL-DT-ST DVDRW  GS23N     SB03 PQ: 0 ANSI: 5
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456486 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456450 offset 0
sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda caddy
sr 1:0:0:0: Attached scsi CD-ROM sr0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#2883586 offset 0
sr 1:0:0:0: Attached scsi generic sg1 type 5
EXT4-fs error (device sda2): ext4_find_entry: reading directory #2 offset 0
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
 sdc1 sdc2 sdc3
sd 0:0:0:0: [sdc] Attached SCSI disk
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670077 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456450 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#3670026 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456450 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456450 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456450 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456450 offset 0
EXT4-fs error (device sda2): ext4_find_entry: reading directory
#4456450 offset 0
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=2885332, block=11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=13631490, block=54525984
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=13631490, block=54525984
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=13631490, block=54525984
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=13631490, block=54525984
EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
inode block - inode=13631490, block=54525984
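
The rename is easy to miss among the EXT4 errors, so here is a rough sketch of a script that pulls it out of a saved netconsole or syslog capture. It assumes the sd driver lines keep the "sd H:C:T:L: [sdX]" shape seen above. The function names (disk_names_by_address, renamed_disks) are made up for this example and are not part of any kernel tool.

```python
import re
from collections import OrderedDict

# Matches sd-driver lines such as "sd 0:0:0:0: [sdc] Attached SCSI disk".
# The pattern is an assumption based on the log lines quoted in this thread.
SD_LINE = re.compile(r"\bsd (\d+:\d+:\d+:\d+): \[(sd[a-z]+)\]")


def disk_names_by_address(log_lines):
    """Map each SCSI address (host:channel:target:lun) to the disk names
    it was seen under, in order of first appearance."""
    seen = OrderedDict()
    for line in log_lines:
        match = SD_LINE.search(line)
        if match is None:
            continue  # not an sd-driver line (e.g. sr, ata, EXT4-fs)
        address, name = match.groups()
        names = seen.setdefault(address, [])
        if name not in names:
            names.append(name)
    return seen


def renamed_disks(log_lines):
    """Return only the addresses that appeared under more than one name."""
    return {addr: names
            for addr, names in disk_names_by_address(log_lines).items()
            if len(names) > 1}


if __name__ == "__main__":
    # Sample lines copied from the logs in this thread.
    sample = [
        "sd 0:0:0:0: [sda] Stopping disk",
        "sd 0:0:0:0: [sdc] Write Protect is off",
        "sd 0:0:0:0: [sdc] Attached SCSI disk",
        "sr 1:0:0:0: Attached scsi CD-ROM sr0",
    ]
    for address, names in renamed_disks(sample).items():
        print(address, " -> ".join(names))
```

Keying on the SCSI address rather than the disk name is deliberate: the address 0:0:0:0 stays the same across the resume while the name moves from sda to sdc, which is exactly the symptom reported here.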



On Mon, Aug 30, 2010 at 3:26 PM, Siddhartha Jain
<siddhartha@siddharthajain.net> wrote:
> Got into runlevel 3 and manually invoked suspend via "pm-suspend". On
> resume, got these logs via netconsole. For some reason, sda becomes
> sdc and all sorts of panicking ensues.
>
>
> PM: Syncing filesystems ... done.
> PM: Preparing system for mem sleep
> Freezing user space processes ... (elapsed 0.01 seconds) done.
> Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> PM: Entering mem sleep
> Suspending console(s) (use no_console_suspend to debug)
> eth0: link up.
> ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> ata2.00: configured for UDMA/133
> ata2: EH complete
> Buffer I/O error on device dm-0, logical block 41791664
> lost page write due to I/O error on dm-0
> Buffer I/O error on device dm-0, logical block 41791665
> lost page write due to I/O error on dm-0
> Buffer I/O error on device dm-0, logical block 41791666
> lost page write due to I/O error on dm-0
> Buffer I/O error on device dm-0, logical block 41800003
> lost page write due to I/O error on dm-0
> Buffer I/O error on device dm-0, logical block 41963521
> lost page write due to I/O error on dm-0
> Buffer I/O error on device dm-0, logical block 41789440
> lost page write due to I/O error on dm-0
> Buffer I/O error on device dm-0, logical block 41789449
> lost page write due to I/O error on dm-0
> Buffer I/O error on device dm-0, logical block 41791667
> lost page write due to I/O error on dm-0
> Buffer I/O error on device dm-0, logical block 41791667
> lost page write due to I/O error on dm-0
> EXT3-fs: ext3_journal_dirty_data: aborting transaction: IO failure in
> ext3_journal_dirty_data
> EXT3-fs (dm-0): error in ext3_orphan_add: Readonly filesystem
> EXT3-fs (dm-0): error in ext3_ordered_write_end: IO failure
> ------------[ cut here ]------------
> WARNING: at fs/buffer.c:1159 mark_buffer_dirty+0x2b/0x86()
> Hardware name: MacBookPro5,4
> Modules linked in: rfcomm
> sco bridge stp llc bnep l2cap sbs sbshc coretemp sunrpc
> cpufreq_ondemand acpi_cpufreq freq_table ip6t_REJECT nf_conntrack_ipv6
> ip6table_filter ip6_tables ipv6 netconsole configfs uinput b43
> snd_hda_codec_cirrus snd_hda_intel mac80211 snd_hda_codec cfg80211
> snd_hwdep snd_seq uvcvideo snd_seq_device ssb snd_pcm applesmc
> snd_timer videodev snd forcedeth v4l1_compat mmc_core shpchp
> v4l2_compat_ioctl32 btusb soundcore snd_page_alloc bluetooth
> i2c_nforce2 joydev input_polldev bcm5974 appleir rfkill mbp_nvidia_bl
> microcode aes_x86_64 aes_generic xts gf128mul dm_crypt firewire_ohci
> usb_storage firewire_core crc_itu_t nouveau ttm drm_kms_helper drm
> i2c_algo_bit video output i2c_core [last unloaded: scsi_wait_scan]
> Pid: 1989, comm: rsyslogd Not tainted 2.6.33.8-149.fc13.x86_64 #1
> Call Trace:
>  [<ffffffff81049ecc>] warn_slowpath_common+0x77/0x8f
>  [<ffffffff81049ef3>] warn_slowpath_null+0xf/0x11
>  [<ffffffff81120247>] mark_buffer_dirty+0x2b/0x86
>  [<ffffffff81162b24>] ext3_commit_super.clone.0+0x54/0x64
>  [<ffffffff81162bc4>] ext3_handle_error+0x90/0xb7
>  [<ffffffff81162c5d>] __ext3_std_error+0x72/0x8f
>  [<ffffffff81162cb9>] __ext3_journal_stop+0x3f/0x49
>  [<ffffffff8115cf9d>] ext3_ordered_write_end+0xd7/0x11b
>  [<ffffffff810bf335>] generic_file_buffered_write+0x165/0x25f
>  [<ffffffff8115ae9d>] ? ext3_dirty_inode+0x7b/0x84
>  [<ffffffff810c0479>] __generic_file_aio_write+0x247/0x27c
>  [<ffffffff810c0506>] generic_file_aio_write+0x58/0xa2
>  [<ffffffff811000cd>] do_sync_write+0xbf/0xfc
>  [<ffffffff8103b7d4>] ? pick_next_task+0x22/0x41
>  [<ffffffff81427c78>] ? schedule+0x850/0x8e6
>  [<ffffffff8100f02f>] ? native_sched_clock+0x2d/0x5f
>  [<ffffffff811bcb9f>] ? security_file_permission+0x11/0x13
>  [<ffffffff8110062b>] vfs_write+0xa9/0x106
>  [<ffffffff8110073e>] sys_write+0x45/0x69
>  [<ffffffff81008b02>] system_call_fastpath+0x16/0x1b
> ---[ end trace 65d387dd571aab2f ]---
> JBD: Detected IO errors while flushing file data on dm-0
> Aborting journal on device dm-0.
> EXT3-fs (dm-0): error in ext3_orphan_add: Journal has aborted
> EXT3-fs (dm-0): error in ext3_truncate: Journal has aborted
> JBD: Detected IO errors while flushing file data on dm-0
> EXT3-fs (dm-0): error: ext3_journal_start_sb: Detected aborted journal
> EXT3-fs (dm-0): error: ext3_journal_start_sb: Detected aborted journal
> EXT3-fs (dm-0): error: remounting filesystem read-only
> INFO: task ata_aux:27 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> ata_aux       D ffff8800057157c0     0    27      2 0x00000000
>  ffff88013b441950 0000000000000046 ffff8801382a0000 0000000000000000
>  ffff88013b4418d0 ffffffff812bb5f8 ffff88013b441960 ffff88013b441fd8
>  ffff88013b441fd8 000000000000f9b0 00000000000157c0 ffff88013b43b268
> Call Trace:
>  [<ffffffff812bb5f8>] ? __scsi_get_command+0x15/0x90
>  [<ffffffff81428079>] schedule_timeout+0x31/0xde
>  [<ffffffff811fb10c>] ? kobject_put+0x47/0x4b
>  [<ffffffff812a8a74>] ? put_device+0x12/0x14
>  [<ffffffff81427f0b>] wait_for_common+0xd1/0x12c
>  [<ffffffff81044f21>] ? default_wake_function+0x0/0xf
>  [<ffffffff811e980c>] ? __generic_unplug_device+0x2d/0x32
>  [<ffffffff81427ff0>] wait_for_completion+0x18/0x1a
>  [<ffffffff811ed7a1>] blk_execute_rq+0xa1/0xd0
>  [<ffffffffa014126c>] ? write_msg+0xe5/0xf4 [netconsole]
>  [<ffffffff811ea366>] ? blk_get_request+0x3c/0x72
>  [<ffffffff812c1ca5>] scsi_execute+0xf1/0x143
>  [<ffffffff812c1d9b>] scsi_execute_req+0xa4/0xd6
>  [<ffffffff812c8ef8>] sd_sync_cache+0x95/0xf3
>  [<ffffffff81427274>] ? printk+0x3c/0x40
>  [<ffffffff812c9115>] sd_shutdown+0x93/0x116
>  [<ffffffff812c92da>] sd_remove+0x56/0x8b
>  [<ffffffff812ac36a>] __device_release_driver+0x70/0xc3
>  [<ffffffff812ac481>] device_release_driver+0x1e/0x2b
>  [<ffffffff812ab6aa>] bus_remove_device+0xb1/0xe0
>  [<ffffffff812a9953>] device_del+0x135/0x1a1
>  [<ffffffff812c5592>] __scsi_remove_device+0x4d/0x98
>  [<ffffffff812c55fe>] scsi_remove_device+0x21/0x2e
>  [<ffffffff812db944>] ata_scsi_handle_link_detach+0xfe/0x131
>  [<ffffffff812dbb29>] ata_scsi_hotplug+0x26/0x62
>  [<ffffffff8105f6e9>] worker_thread+0x1a4/0x232
>  [<ffffffff812dbb03>] ? ata_scsi_hotplug+0x0/0x62
>  [<ffffffff810631bf>] ? autoremove_wake_function+0x0/0x34
>  [<ffffffff8105f545>] ? worker_thread+0x0/0x232
>  [<ffffffff81062d6f>] kthread+0x7a/0x82
>  [<ffffffff81009924>] kernel_thread_helper+0x4/0x10
>  [<ffffffff81062cf5>] ? kthread+0x0/0x82
>  [<ffffffff81009920>] ? kernel_thread_helper+0x0/0x10
> sd 0:0:0:0: timing out command, waited 180s
>
>
> Thanks,
>
> - Siddhartha
>
>
> On Mon, Aug 30, 2010 at 8:25 AM, Siddhartha Jain
> <siddhartha@siddharthajain.net> wrote:
>> Hi Tejun,
>>
>> Will get more debugging info by setting up netconsole. Meanwhile,
>> syslog did glean some important info, I think. Looks like, for some
>> reason, after resume, device mapping for the hdd changes from sda to
>> sdc. Please see logs below. For ease of troubleshooting, I
>> re-installed the box with stock Fedora 13 x64 with no modifications
>> except that dmcrypt is turned on for root and swap (so using Nouveau
>> display drivers instead of Nvidia).
>>
>> Aug 29 19:13:45 192.168.1.199 kernel: PM: Syncing filesystems ... done.
>> Aug 29 19:13:45 192.168.1.199 kernel: PM: Preparing system for mem sleep
>> Aug 29 19:18:57 192.168.1.199 kernel: scsi 0:0:0:0: Direct-Access
>> ATA      FUJITSU MJA2250B 0081 PQ: 0 ANSI: 5
>> Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: Attached scsi
>> generic sg0 type 0
>> Aug 29 19:18:57 192.168.1.199 kernel: ata2.00: detaching (SCSI 1:0:0:0)
>> Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: [sdc] 488397168
>> 512-byte logical blocks: (250 GB/232 GiB)
>> Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: [sdc] Write Protect is off
>> Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: [sdc] Mode Sense: 00 3a 00 00
>> Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: [sdc] Write cache:
>> enabled, read cache: enabled, doesn't support DPO or FUA
>> Aug 29 19:18:57 192.168.1.199 kernel: sdc:
>> Aug 29 19:18:57 192.168.1.199 kernel: scsi 1:0:0:0: CD-ROM
>> HL-DT-ST DVDRW  GS23N     SB03 PQ: 0 ANSI: 5
>> Aug 29 19:18:57 192.168.1.199 kernel: sr0: scsi3-mmc drive: 24x/24x
>> writer cd/rw xa/form2 cdda caddy
>> Aug 29 19:18:57 192.168.1.199 kernel: sr 1:0:0:0: Attached scsi CD-ROM sr0
>> Aug 29 19:18:57 192.168.1.199 kernel: sr 1:0:0:0: Attached scsi
>> generic sg1 type 5
>> Aug 29 19:18:58 192.168.1.199 kernel: sdc1 sdc2 sdc3
>> Aug 29 19:18:58 192.168.1.199 kernel: sd 0:0:0:0: [sdc] Attached SCSI disk
>> Aug 29 19:20:20 192.168.1.199 ntpd[1466]: 0.0.0.0 c61c 0c clock_step -0.389692 s
>> Aug 29 19:20:20 192.168.1.199 ntpd[1466]: 0.0.0.0 c614 04 freq_mode
>> Aug 29 19:20:21 192.168.1.199 ntpd[1466]: 0.0.0.0 c618 08 no_sys_peer
>> Aug 29 19:22:29 192.168.1.199 sshd[2425]: WARNING: no suitable primes
>> in /etc/ssh/primes
>> Aug 29 19:22:29 192.168.1.199 rsyslogd: /var/log/secure
>> Aug 29 19:22:37 192.168.1.199 auditd[1734]: Record was not written to
>> disk (Read-only file system)
>> Aug 29 19:22:37 192.168.1.199 auditd[1734]: write: Audit daemon
>> detected an error writing an event to disk (Connection refused)
>> Aug 29 19:22:37 192.168.1.199 sshd[2425]: Failed password for sjain
>> from 192.168.1.100 port 49902 ssh2
>> Aug 29 19:22:37 192.168.1.199 auditd[2428]: Audit daemon failed to exec (null)
>> Aug 29 19:22:37 192.168.1.199 auditd[2428]: The audit daemon is exiting.
>> Aug 29 19:22:37 192.168.1.199 auditd[1734]: Record was not written to
>> disk (Read-only file system)
>> Aug 29 19:22:37 192.168.1.199 auditd[1734]: write: Audit daemon
>> detected an error writing an event to disk (Read-only file system)
>> Aug 29 19:22:37 192.168.1.199 auditd[2429]: Audit daemon failed to exec (null)
>> Aug 29 19:22:37 192.168.1.199 auditd[2429]: The audit daemon is exiting.
>> Aug 29 19:22:37 192.168.1.199 kernel: type=1305
>> audit(1283134957.511:19078): audit_pid=0 old=1734 auid=4294967295
>> ses=4294967295 res=1
>> Aug 29 19:22:37 192.168.1.199 kernel: type=1305
>> audit(1283134957.511:19079): audit_pid=0 old=0 auid=4294967295
>> ses=4294967295 res=1
>> Aug 29 19:22:54 192.168.1.199 kernel: EXT3-fs error (device dm-0):
>> ext3_find_entry: reading directory #10445218 offset 0
>> Aug 29 19:22:54 192.168.1.199 sshd[2426]: Connection closed by 192.168.1.100
>> Aug 29 19:23:17 192.168.1.199 auditd[2428]: Error setting audit daemon
>> pid (Resource temporarily unavailable)
>> Aug 29 19:23:18 192.168.1.199 auditd[2429]: Error setting audit daemon
>> pid (Resource temporarily unavailable)
>> Aug 29 19:34:26 192.168.1.199 kernel: CE: hpet increasing min_delta_ns
>> to 15000 nsec
>> Aug 29 19:35:46 192.168.1.199 ntpd[1466]: 0.0.0.0 c612 02 freq_set
>> kernel -65.666 PPM
>> Aug 29 19:35:46 192.168.1.199 ntpd[1466]: 0.0.0.0 c615 05 clock_sync
>> Aug 29 19:38:26 192.168.1.199 kernel: EXT3-fs error (device dm-0):
>> ext3_find_entry: reading directory #1803721 offset 0
>>
>> Odd?
>>
>> Thanks,
>>
>> - Siddhartha
>>
>>
>>
>> On Mon, Aug 30, 2010 at 5:55 AM, Tejun Heo <tj@kernel.org> wrote:
>>> On 08/28/2010 10:28 PM, Siddhartha Jain wrote:
>>>> I am going to try pointing rsyslog at a remote syslog server to see if I
>>>> can capture what happens at resume-time but any other pointers to
>>>> debug/troubleshoot this issue will be helpful.
>>>
>>> Setting up netconsole (see Documentation/networking/netconsole.txt in
>>> the kernel source tree) would be better.
>>>
>>> Thanks.
>>>
>>> --
>>> tejun
>>>


* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-31  0:59       ` Siddhartha Jain
@ 2010-08-31  2:03         ` Siddhartha Jain
  2010-09-01 19:47           ` Siddhartha Jain
  0 siblings, 1 reply; 46+ messages in thread
From: Siddhartha Jain @ 2010-08-31  2:03 UTC (permalink / raw)
  To: linux-ide

Upgraded to the latest stable release, 2.6.35.4. Same errors:

PM: Syncing filesystems ... done.
PM: Preparing system for mem sleep
Freezing user space processes ... (elapsed 0.01 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
PM: Entering mem sleep
Suspending console(s) (use no_console_suspend to debug)
eth0: link up.
scsi 0:0:0:0: Direct-Access     ATA      FUJITSU MJA2250B 0081 PQ: 0 ANSI: 5
sd 0:0:0:0: Attached scsi generic sg0 type 0
ata2.00: detaching (SCSI 1:0:0:0)
EXT4-fs error (device sda2): __ext4_get_inode_loc: inode #2885332:
(comm udevd) unable to read inode block 11534477
sd 0:0:0:0: [sdc] 488397168 512-byte logical blocks: (250 GB/232 GiB)
EXT4-fs error (device sda2): ext4_find_entry: inode #2884877: (comm
udevd) reading directory lblock 0
sd 0:0:0:0: [sdc] Write Protect is off
sd 0:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
 sdc:
scsi 1:0:0:0: CD-ROM            HL-DT-ST DVDRW  GS23N     SB03 PQ: 0 ANSI: 5
EXT4-fs error (device sda2): ext4_find_entry: inode #4456450: (comm
gnome-panel) reading directory lblock 0
EXT4-fs error (device sda2): ext4_find_entry: inode #4456489: (comm
gnome-panel) reading directory lblock 0
EXT4-fs error (device sda2): ext4_find_entry: inode #3670077: (comm
gnome-panel) reading directory lblock 0
EXT4-fs error (device sda2): ext4_find_entry: inode #3670017: (comm
gnome-panel) reading directory lblock 0
EXT4-fs error (device sda2): ext4_find_entry: inode #3670068: (comm
gnome-panel) reading directory lblock 0
EXT4-fs error (device sda2): ext4_find_entry: inode #10485761: (comm
gnome-panel) reading directory lblock 0
EXT4-fs error (device sda2): ext4_find_entry: inode #3670068: (comm
gnome-panel) reading directory lblock 0
EXT4-fs error (device sda2): ext4_find_entry: inode #4456450: (comm
gnome-panel) reading directory lblock 0
sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda caddy
sr 1:0:0:0: Attached scsi CD-ROM sr0
sr 1:0:0:0: Attached scsi generic sg1 type 5
EXT4-fs error (device sda2): __ext4_get_inode_loc: inode #2885332:
(comm udevd) unable to read inode block 11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: inode #2885332:
(comm udevd) unable to read inode block 11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: inode #2885332:
(comm udevd) unable to read inode block 11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: inode #2885332:
(comm udevd) unable to read inode block 11534477
EXT4-fs error (device sda2): __ext4_get_inode_loc: inode #2885332:
(comm udevd) unable to read inode block 11534477
 sdc1 sdc2 sdc3
sd 0:0:0:0: [sdc] Attached SCSI disk





On Mon, Aug 30, 2010 at 5:59 PM, Siddhartha Jain
<siddhartha@siddharthajain.net> wrote:
> Re-installed with encryption off and netconsole logging on. Same
> result, that is, sda switches to sdc and everything goes haywire.
>
> eth0: link up.
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> ata1.00: configured for UDMA/100
> ata1: EH complete
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #2884877 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #2884877 offset 0
> sd 0:0:0:0: [sda] Stopping disk
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #13631490 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670077 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670077 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3677072 offset 0
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456489 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670077 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456489 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456450 offset 0
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=4456517, block=17825828
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3691827 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3812945 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3681172 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670113 offset 0
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #2884877 offset 0
> scsi 0:0:0:0: Direct-Access     ATA      FUJITSU MJA2250B 0081 PQ: 0 ANSI: 5
> sd 0:0:0:0: Attached scsi generic sg0 type 0
> ata2.00: detaching (SCSI 1:0:0:0)
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #2884877 offset 0
> sd 0:0:0:0: [sdc] 488397168 512-byte logical blocks: (250 GB/232 GiB)
> sd 0:0:0:0: [sdc] Write Protect is off
> sd 0:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
>  sdc:
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456450 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456489 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670077 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670017 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670068 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #10485761 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670068 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456450 offset 0
> scsi 1:0:0:0: CD-ROM            HL-DT-ST DVDRW  GS23N     SB03 PQ: 0 ANSI: 5
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456486 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456450 offset 0
> sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda caddy
> sr 1:0:0:0: Attached scsi CD-ROM sr0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #2883586 offset 0
> sr 1:0:0:0: Attached scsi generic sg1 type 5
> EXT4-fs error (device sda2): ext4_find_entry: reading directory #2 offset 0
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
>  sdc1 sdc2 sdc3
> sd 0:0:0:0: [sdc] Attached SCSI disk
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670077 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456450 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #3670026 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456450 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456450 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456450 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456450 offset 0
> EXT4-fs error (device sda2): ext4_find_entry: reading directory
> #4456450 offset 0
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=2885332, block=11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=13631490, block=54525984
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=13631490, block=54525984
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=13631490, block=54525984
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=13631490, block=54525984
> EXT4-fs error (device sda2): __ext4_get_inode_loc: unable to read
> inode block - inode=13631490, block=54525984
>
>
>
> On Mon, Aug 30, 2010 at 3:26 PM, Siddhartha Jain
> <siddhartha@siddharthajain.net> wrote:
>> Got into runlevel 3 and manually invoked suspend via "pm-suspend". On
>> resume, got these logs via netconsole. For some reason, sda becomes
>> sdc and all sorts of panicking ensues.
>>
>>
>> PM: Syncing filesystems ... done.
>> PM: Preparing system for mem sleep
>> Freezing user space processes ... (elapsed 0.01 seconds) done.
>> Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
>> PM: Entering mem sleep
>> Suspending console(s) (use no_console_suspend to debug)
>> eth0: link up.
>> ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
>> ata2.00: configured for UDMA/133
>> ata2: EH complete
>> Buffer I/O error on device dm-0, logical block 41791664
>> lost page write due to I/O error on dm-0
>> Buffer I/O error on device dm-0, logical block 41791665
>> lost page write due to I/O error on dm-0
>> Buffer I/O error on device dm-0, logical block 41791666
>> lost page write due to I/O error on dm-0
>> Buffer I/O error on device dm-0, logical block 41800003
>> lost page write due to I/O error on dm-0
>> Buffer I/O error on device dm-0, logical block 41963521
>> lost page write due to I/O error on dm-0
>> Buffer I/O error on device dm-0, logical block 41789440
>> lost page write due to I/O error on dm-0
>> Buffer I/O error on device dm-0, logical block 41789449
>> lost page write due to I/O error on dm-0
>> Buffer I/O error on device dm-0, logical block 41791667
>> lost page write due to I/O error on dm-0
>> Buffer I/O error on device dm-0, logical block 41791667
>> lost page write due to I/O error on dm-0
>> EXT3-fs: ext3_journal_dirty_data: aborting transaction: IO failure in
>> ext3_journal_dirty_data
>> EXT3-fs (dm-0): error in ext3_orphan_add: Readonly filesystem
>> EXT3-fs (dm-0): error in ext3_ordered_write_end: IO failure
>> ------------[ cut here ]------------
>> WARNING: at fs/buffer.c:1159 mark_buffer_dirty+0x2b/0x86()
>> Hardware name: MacBookPro5,4
>> Modules linked in: rfcomm
>> sco bridge stp llc bnep l2cap sbs sbshc coretemp sunrpc
>> cpufreq_ondemand acpi_cpufreq freq_table ip6t_REJECT nf_conntrack_ipv6
>> ip6table_filter ip6_tables ipv6 netconsole configfs uinput b43
>> snd_hda_codec_cirrus snd_hda_intel mac80211 snd_hda_codec cfg80211
>> snd_hwdep snd_seq uvcvideo snd_seq_device ssb snd_pcm applesmc
>> snd_timer videodev snd forcedeth v4l1_compat mmc_core shpchp
>> v4l2_compat_ioctl32 btusb soundcore snd_page_alloc bluetooth
>> i2c_nforce2 joydev input_polldev bcm5974 appleir rfkill mbp_nvidia_bl
>> microcode aes_x86_64 aes_generic xts gf128mul dm_crypt firewire_ohci
>> usb_storage firewire_core crc_itu_t nouveau ttm drm_kms_helper drm
>> i2c_algo_bit video output i2c_core [last unloaded: scsi_wait_scan]
>> Pid: 1989, comm: rsyslogd Not tainted 2.6.33.8-149.fc13.x86_64 #1
>> Call Trace:
>>  [<ffffffff81049ecc>] warn_slowpath_common+0x77/0x8f
>>  [<ffffffff81049ef3>] warn_slowpath_null+0xf/0x11
>>  [<ffffffff81120247>] mark_buffer_dirty+0x2b/0x86
>>  [<ffffffff81162b24>] ext3_commit_super.clone.0+0x54/0x64
>>  [<ffffffff81162bc4>] ext3_handle_error+0x90/0xb7
>>  [<ffffffff81162c5d>] __ext3_std_error+0x72/0x8f
>>  [<ffffffff81162cb9>] __ext3_journal_stop+0x3f/0x49
>>  [<ffffffff8115cf9d>] ext3_ordered_write_end+0xd7/0x11b
>>  [<ffffffff810bf335>] generic_file_buffered_write+0x165/0x25f
>>  [<ffffffff8115ae9d>] ? ext3_dirty_inode+0x7b/0x84
>>  [<ffffffff810c0479>] __generic_file_aio_write+0x247/0x27c
>>  [<ffffffff810c0506>] generic_file_aio_write+0x58/0xa2
>>  [<ffffffff811000cd>] do_sync_write+0xbf/0xfc
>>  [<ffffffff8103b7d4>] ? pick_next_task+0x22/0x41
>>  [<ffffffff81427c78>] ? schedule+0x850/0x8e6
>>  [<ffffffff8100f02f>] ? native_sched_clock+0x2d/0x5f
>>  [<ffffffff811bcb9f>] ? security_file_permission+0x11/0x13
>>  [<ffffffff8110062b>] vfs_write+0xa9/0x106
>>  [<ffffffff8110073e>] sys_write+0x45/0x69
>>  [<ffffffff81008b02>] system_call_fastpath+0x16/0x1b
>> ---[ end trace 65d387dd571aab2f ]---
>> JBD: Detected IO errors while flushing file data on dm-0
>> Aborting journal on device dm-0.
>> EXT3-fs (dm-0): error in ext3_orphan_add: Journal has aborted
>> EXT3-fs (dm-0): error in ext3_truncate: Journal has aborted
>> JBD: Detected IO errors while flushing file data on dm-0
>> EXT3-fs (dm-0): error: ext3_journal_start_sb: Detected aborted journal
>> EXT3-fs (dm-0): error: ext3_journal_start_sb: Detected aborted journal
>> EXT3-fs (dm-0): error: remounting filesystem read-only
>> INFO: task ata_aux:27 blocked for more than 120 seconds.
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> ata_aux       D ffff8800057157c0     0    27      2 0x00000000
>>  ffff88013b441950 0000000000000046 ffff8801382a0000 0000000000000000
>>  ffff88013b4418d0 ffffffff812bb5f8 ffff88013b441960 ffff88013b441fd8
>>  ffff88013b441fd8 000000000000f9b0 00000000000157c0 ffff88013b43b268
>> Call Trace:
>>  [<ffffffff812bb5f8>] ? __scsi_get_command+0x15/0x90
>>  [<ffffffff81428079>] schedule_timeout+0x31/0xde
>>  [<ffffffff811fb10c>] ? kobject_put+0x47/0x4b
>>  [<ffffffff812a8a74>] ? put_device+0x12/0x14
>>  [<ffffffff81427f0b>] wait_for_common+0xd1/0x12c
>>  [<ffffffff81044f21>] ? default_wake_function+0x0/0xf
>>  [<ffffffff811e980c>] ? __generic_unplug_device+0x2d/0x32
>>  [<ffffffff81427ff0>] wait_for_completion+0x18/0x1a
>>  [<ffffffff811ed7a1>] blk_execute_rq+0xa1/0xd0
>>  [<ffffffffa014126c>] ? write_msg+0xe5/0xf4 [netconsole]
>>  [<ffffffff811ea366>] ? blk_get_request+0x3c/0x72
>>  [<ffffffff812c1ca5>] scsi_execute+0xf1/0x143
>>  [<ffffffff812c1d9b>] scsi_execute_req+0xa4/0xd6
>>  [<ffffffff812c8ef8>] sd_sync_cache+0x95/0xf3
>>  [<ffffffff81427274>] ? printk+0x3c/0x40
>>  [<ffffffff812c9115>] sd_shutdown+0x93/0x116
>>  [<ffffffff812c92da>] sd_remove+0x56/0x8b
>>  [<ffffffff812ac36a>] __device_release_driver+0x70/0xc3
>>  [<ffffffff812ac481>] device_release_driver+0x1e/0x2b
>>  [<ffffffff812ab6aa>] bus_remove_device+0xb1/0xe0
>>  [<ffffffff812a9953>] device_del+0x135/0x1a1
>>  [<ffffffff812c5592>] __scsi_remove_device+0x4d/0x98
>>  [<ffffffff812c55fe>] scsi_remove_device+0x21/0x2e
>>  [<ffffffff812db944>] ata_scsi_handle_link_detach+0xfe/0x131
>>  [<ffffffff812dbb29>] ata_scsi_hotplug+0x26/0x62
>>  [<ffffffff8105f6e9>] worker_thread+0x1a4/0x232
>>  [<ffffffff812dbb03>] ? ata_scsi_hotplug+0x0/0x62
>>  [<ffffffff810631bf>] ? autoremove_wake_function+0x0/0x34
>>  [<ffffffff8105f545>] ? worker_thread+0x0/0x232
>>  [<ffffffff81062d6f>] kthread+0x7a/0x82
>>  [<ffffffff81009924>] kernel_thread_helper+0x4/0x10
>>  [<ffffffff81062cf5>] ? kthread+0x0/0x82
>>  [<ffffffff81009920>] ? kernel_thread_helper+0x0/0x10
>> sd 0:0:0:0: timing out command, waited 180s
>>
>>
>> Thanks,
>>
>> - Siddhartha
>>
>>
>> On Mon, Aug 30, 2010 at 8:25 AM, Siddhartha Jain
>> <siddhartha@siddharthajain.net> wrote:
>>> Hi Tejun,
>>>
>>> Will get more debugging info by setting up netconsole. Meanwhile,
>>> syslog did glean some important info, I think. Looks like, for some
>>> reason, after resume, device mapping for the hdd changes from sda to
>>> sdc. Please see logs below. For ease of troubleshooting, I
>>> re-installed the box with stock Fedora 13 x64 with no modifications
>>> except that dmcrypt is turned on for root and swap (so using Nouveau
>>> display drivers instead of Nvidia).
>>>
>>> Aug 29 19:13:45 192.168.1.199 kernel: PM: Syncing filesystems ... done.
>>> Aug 29 19:13:45 192.168.1.199 kernel: PM: Preparing system for mem sleep
>>> Aug 29 19:18:57 192.168.1.199 kernel: scsi 0:0:0:0: Direct-Access
>>> ATA      FUJITSU MJA2250B 0081 PQ: 0 ANSI: 5
>>> Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: Attached scsi
>>> generic sg0 type 0
>>> Aug 29 19:18:57 192.168.1.199 kernel: ata2.00: detaching (SCSI 1:0:0:0)
>>> Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: [sdc] 488397168
>>> 512-byte logical blocks: (250 GB/232 GiB)
>>> Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: [sdc] Write Protect is off
>>> Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: [sdc] Mode Sense: 00 3a 00 00
>>> Aug 29 19:18:57 192.168.1.199 kernel: sd 0:0:0:0: [sdc] Write cache:
>>> enabled, read cache: enabled, doesn't support DPO or FUA
>>> Aug 29 19:18:57 192.168.1.199 kernel: sdc:
>>> Aug 29 19:18:57 192.168.1.199 kernel: scsi 1:0:0:0: CD-ROM
>>> HL-DT-ST DVDRW  GS23N     SB03 PQ: 0 ANSI: 5
>>> Aug 29 19:18:57 192.168.1.199 kernel: sr0: scsi3-mmc drive: 24x/24x
>>> writer cd/rw xa/form2 cdda caddy
>>> Aug 29 19:18:57 192.168.1.199 kernel: sr 1:0:0:0: Attached scsi CD-ROM sr0
>>> Aug 29 19:18:57 192.168.1.199 kernel: sr 1:0:0:0: Attached scsi
>>> generic sg1 type 5
>>> Aug 29 19:18:58 192.168.1.199 kernel: sdc1 sdc2 sdc3
>>> Aug 29 19:18:58 192.168.1.199 kernel: sd 0:0:0:0: [sdc] Attached SCSI disk
>>> Aug 29 19:20:20 192.168.1.199 ntpd[1466]: 0.0.0.0 c61c 0c clock_step -0.389692 s
>>> Aug 29 19:20:20 192.168.1.199 ntpd[1466]: 0.0.0.0 c614 04 freq_mode
>>> Aug 29 19:20:21 192.168.1.199 ntpd[1466]: 0.0.0.0 c618 08 no_sys_peer
>>> Aug 29 19:22:29 192.168.1.199 sshd[2425]: WARNING: no suitable primes
>>> in /etc/ssh/primes
>>> Aug 29 19:22:29 192.168.1.199 rsyslogd: /var/log/secure
>>> Aug 29 19:22:37 192.168.1.199 auditd[1734]: Record was not written to
>>> disk (Read-only file system)
>>> Aug 29 19:22:37 192.168.1.199 auditd[1734]: write: Audit daemon
>>> detected an error writing an event to disk (Connection refused)
>>> Aug 29 19:22:37 192.168.1.199 sshd[2425]: Failed password for sjain
>>> from 192.168.1.100 port 49902 ssh2
>>> Aug 29 19:22:37 192.168.1.199 auditd[2428]: Audit daemon failed to exec (null)
>>> Aug 29 19:22:37 192.168.1.199 auditd[2428]: The audit daemon is exiting.
>>> Aug 29 19:22:37 192.168.1.199 auditd[1734]: Record was not written to
>>> disk (Read-only file system)
>>> Aug 29 19:22:37 192.168.1.199 auditd[1734]: write: Audit daemon
>>> detected an error writing an event to disk (Read-only file system)
>>> Aug 29 19:22:37 192.168.1.199 auditd[2429]: Audit daemon failed to exec (null)
>>> Aug 29 19:22:37 192.168.1.199 auditd[2429]: The audit daemon is exiting.
>>> Aug 29 19:22:37 192.168.1.199 kernel: type=1305
>>> audit(1283134957.511:19078): audit_pid=0 old=1734 auid=4294967295
>>> ses=4294967295 res=1
>>> Aug 29 19:22:37 192.168.1.199 kernel: type=1305
>>> audit(1283134957.511:19079): audit_pid=0 old=0 auid=4294967295
>>> ses=4294967295 res=1
>>> Aug 29 19:22:54 192.168.1.199 kernel: EXT3-fs error (device dm-0):
>>> ext3_find_entry: reading directory #10445218 offset 0
>>> Aug 29 19:22:54 192.168.1.199 sshd[2426]: Connection closed by 192.168.1.100
>>> Aug 29 19:23:17 192.168.1.199 auditd[2428]: Error setting audit daemon
>>> pid (Resource temporarily unavailable)
>>> Aug 29 19:23:18 192.168.1.199 auditd[2429]: Error setting audit daemon
>>> pid (Resource temporarily unavailable)
>>> Aug 29 19:34:26 192.168.1.199 kernel: CE: hpet increasing min_delta_ns
>>> to 15000 nsec
>>> Aug 29 19:35:46 192.168.1.199 ntpd[1466]: 0.0.0.0 c612 02 freq_set
>>> kernel -65.666 PPM
>>> Aug 29 19:35:46 192.168.1.199 ntpd[1466]: 0.0.0.0 c615 05 clock_sync
>>> Aug 29 19:38:26 192.168.1.199 kernel: EXT3-fs error (device dm-0):
>>> ext3_find_entry: reading directory #1803721 offset 0
>>>
>>> Odd?
>>>
>>> Thanks,
>>>
>>> - Siddhartha
>>>
>>>
>>>
>>> On Mon, Aug 30, 2010 at 5:55 AM, Tejun Heo <tj@kernel.org> wrote:
>>>> On 08/28/2010 10:28 PM, Siddhartha Jain wrote:
>>>>> I am going to try pointing rsyslog at a remote syslog server to see if I
>>>>> can capture what happens at resume-time but any other pointers to
>>>>> debug/troubleshoot this issue will be helpful.
>>>>
>>>> Setting up netconsole (see Documentation/networking/netconsole.txt in
>>>> the kernel source tree) would be better.
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> tejun
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-ide" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>



-- 
- Siddhartha
  WV6U

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-31  2:03         ` Siddhartha Jain
@ 2010-09-01 19:47           ` Siddhartha Jain
  2010-09-02 10:09             ` Tejun Heo
  0 siblings, 1 reply; 46+ messages in thread
From: Siddhartha Jain @ 2010-09-01 19:47 UTC (permalink / raw)
  To: linux-ide

Hi Tejun,

Was any of this debug info helpful?

Thanks,

- Siddhartha


On Mon, Aug 30, 2010 at 7:03 PM, Siddhartha Jain
<siddhartha@siddharthajain.net> wrote:
> Upgraded to latest stable release - 2.6.35.4. Same errors:
>
> PM: Syncing filesystems ... done.
> PM: Preparing system for mem sleep
> Freezing user space processes ... (elapsed 0.01 seconds) done.
> Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> PM: Entering mem sleep
> Suspending console(s) (use no_console_suspend to debug)
> eth0: link up.
> scsi 0:0:0:0: Direct-Access     ATA      FUJITSU MJA2250B 0081 PQ: 0 ANSI: 5
> sd 0:0:0:0: Attached scsi generic sg0 type 0
> ata2.00: detaching (SCSI 1:0:0:0)
> EXT4-fs error (device sda2): __ext4_get_inode_loc: inode #2885332:
> (comm udevd) unable to read inode block 11534477
> sd 0:0:0:0: [sdc] 488397168 512-byte logical blocks: (250 GB/232 GiB)
> EXT4-fs error (device sda2): ext4_find_entry: inode #2884877: (comm
> udevd) reading directory lblock 0
> sd 0:0:0:0: [sdc] Write Protect is off
> sd 0:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
>  sdc:
> scsi 1:0:0:0: CD-ROM            HL-DT-ST DVDRW  GS23N     SB03 PQ: 0 ANSI: 5
> EXT4-fs error (device sda2): ext4_find_entry: inode #4456450: (comm
> gnome-panel) reading directory lblock 0
> EXT4-fs error (device sda2): ext4_find_entry: inode #4456489: (comm
> gnome-panel) reading directory lblock 0
> EXT4-fs error (device sda2): ext4_find_entry: inode #3670077: (comm
> gnome-panel) reading directory lblock 0
> EXT4-fs error (device sda2): ext4_find_entry: inode #3670017: (comm
> gnome-panel) reading directory lblock 0
> EXT4-fs error (device sda2): ext4_find_entry: inode #3670068: (comm
> gnome-panel) reading directory lblock 0
> EXT4-fs error (device sda2): ext4_find_entry: inode #10485761: (comm
> gnome-panel) reading directory lblock 0
> EXT4-fs error (device sda2): ext4_find_entry: inode #3670068: (comm
> gnome-panel) reading directory lblock 0
> EXT4-fs error (device sda2): ext4_find_entry: inode #4456450: (comm
> gnome-panel) reading directory lblock 0
> sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda caddy
> sr 1:0:0:0: Attached scsi CD-ROM sr0
> sr 1:0:0:0: Attached scsi generic sg1 type 5
> EXT4-fs error (device sda2): __ext4_get_inode_loc: inode #2885332:
> (comm udevd) unable to read inode block 11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: inode #2885332:
> (comm udevd) unable to read inode block 11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: inode #2885332:
> (comm udevd) unable to read inode block 11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: inode #2885332:
> (comm udevd) unable to read inode block 11534477
> EXT4-fs error (device sda2): __ext4_get_inode_loc: inode #2885332:
> (comm udevd) unable to read inode block 11534477
>  sdc1 sdc2 sdc3
> sd 0:0:0:0: [sdc] Attached SCSI disk
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-09-01 19:47           ` Siddhartha Jain
@ 2010-09-02 10:09             ` Tejun Heo
  2010-09-02 19:56               ` Siddhartha Jain
  0 siblings, 1 reply; 46+ messages in thread
From: Tejun Heo @ 2010-09-02 10:09 UTC (permalink / raw)
  To: Siddhartha Jain; +Cc: linux-ide

Hello,

Can you please try the patch in the following message?

 http://article.gmane.org/gmane.linux.ide/47387/raw

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-09-02 10:09             ` Tejun Heo
@ 2010-09-02 19:56               ` Siddhartha Jain
  2010-09-03  9:04                 ` Tejun Heo
  0 siblings, 1 reply; 46+ messages in thread
From: Siddhartha Jain @ 2010-09-02 19:56 UTC (permalink / raw)
  To: linux-ide

Applied patch to 2.6.35.4. Patched cleanly and kernel compile was smooth.

Suspend works beautifully now. Not only does the system resume
properly but earlier suspend would take minutes, now it goes to
suspend within seconds (less than 10).

I will re-build my laptop with dmcrypt + ext4, apply this patch and
use the system for a while to ensure all play together nicely.

Thanks Tejun!! Really appreciate the quick turnaround on this issue. I
saved debugging info from the successful resume where I have debugging
turned on for various modules - acpi, pci etc. Let me know if you need
me to send it across.


- Siddhartha


On Thu, Sep 2, 2010 at 3:09 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> Can you please try the patch in the following message?
>
>  http://article.gmane.org/gmane.linux.ide/47387/raw
>
> Thanks.
>
> --
> tejun
>



-- 
- Siddhartha
  WV6U

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-09-02 19:56               ` Siddhartha Jain
@ 2010-09-03  9:04                 ` Tejun Heo
  2010-10-28  2:11                   ` Siddhartha Jain
  0 siblings, 1 reply; 46+ messages in thread
From: Tejun Heo @ 2010-09-03  9:04 UTC (permalink / raw)
  To: Siddhartha Jain; +Cc: linux-ide

Hello,

On 09/02/2010 09:56 PM, Siddhartha Jain wrote:
> Applied patch to 2.6.35.4. Patched cleanly and kernel compile was smooth.
> 
> Suspend works beautifully now. Not only does the system resume
> properly but earlier suspend would take minutes, now it goes to
> suspend within seconds (less than 10).
> 
> I will re-build my laptop with dmcrypt + ext4, apply this patch and
> use the system for a while to ensure all play together nicely.
> 
> Thanks Tejun!! Really appreciate the quick turnaround on this issue. I
> saved debugging info from the successful resume where I have debugging
> turned on for various modules - acpi, pci etc. Let me know if you need
> me to send it across.

Debugging was mostly done by Rafael, so he deserves most of the credit.
Can you please keep testing for a few more days and let me know the
result?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-09-03  9:04                 ` Tejun Heo
@ 2010-10-28  2:11                   ` Siddhartha Jain
  2010-10-29  5:49                     ` Jeff Garzik
  0 siblings, 1 reply; 46+ messages in thread
From: Siddhartha Jain @ 2010-10-28  2:11 UTC (permalink / raw)
  To: Tejun Heo, linux-ide

Just wanted to check in and report that so far I applied the patch to
two revisions of 2.6.34 from Fedora 13 x64 and have had no issues. My
MacBook Pro sleeps well and wakes up as normal every time.

Thanks again to all involved. When should I expect it to be updated in
the main trunk?



On 9/3/10, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On 09/02/2010 09:56 PM, Siddhartha Jain wrote:
>> Applied patch to 2.6.35.4. Patched cleanly and kernel compile was smooth.
>>
>> Suspend works beautifully now. Not only does the system resume
>> properly but earlier suspend would take minutes, now it goes to
>> suspend within seconds (less than 10).
>>
>> I will re-build my laptop with dmcrypt + ext4, apply this patch and
>> use the system for a while to ensure all play together nicely.
>>
>> Thanks Tejun!! Really appreciate the quick turnaround on this issue. I
>> saved debugging info from the successful resume where I have debugging
>> turned on for various modules - acpi, pci etc. Let me know if you need
>> me to send it across.
>
> Debugging was mostly done by Rafael, so he deserves most of the credit.
> Can you please keep testing for a few more days and let me know the
> result?
>
> Thanks.
>
> --
> tejun
>


-- 
- Siddhartha
  WV6U

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-10-28  2:11                   ` Siddhartha Jain
@ 2010-10-29  5:49                     ` Jeff Garzik
  0 siblings, 0 replies; 46+ messages in thread
From: Jeff Garzik @ 2010-10-29  5:49 UTC (permalink / raw)
  To: Siddhartha Jain; +Cc: Tejun Heo, linux-ide

On 10/27/2010 10:11 PM, Siddhartha Jain wrote:
> Just wanted to check in and report that so far I applied the patch to
> two revisions of 2.6.34 from Fedora 13 x64 and have had no issues. My
> MacBook Pro sleeps well and wakes up as normal every time.
>
> Thanks again to all involved. When should I expect it to be updated in
> the main trunk?


It's in the main trunk:

commit e2f3d75fc0e4a0d03c61872bad39ffa2e74a04ff
Author: Tejun Heo <htejun@gmail.com>
Date:   Tue Sep 7 14:05:31 2010 +0200

     libata: skip EH autopsy and recovery during suspend


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-27 23:35                                         ` Rafael J. Wysocki
@ 2010-09-02 14:31                                             ` Stephan Diestelhorst
  0 siblings, 0 replies; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-09-02 14:31 UTC (permalink / raw)
  To: Rafael J. Wysocki, Stephan Diestelhorst
  Cc: linux-ide, Tejun Heo, linux-pm, linux-kernel

On Saturday 28 August 2010, 01:35:38 Rafael J. Wysocki wrote:
> I reproduced the problem with Tejun's patch applied, so I'm now quite
> sure the problem is related to the suspend of controller ports (which is done
> by scheduling SCSI error handling on the controller).
> 
> Anyway, below is a new version of my patch that plays a bit nicer with
> the resume code.  Can you please check if it still fixes the problem for you?

Applied to 2.6.35.3 and tested. Works perfectly fine (> 10 s2ram
cycles under heavy I/O load).

Many thanks,
  Stephan

-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-26 23:09                                       ` Rafael J. Wysocki
  2010-08-26 23:46                                         ` Rafael J. Wysocki
@ 2010-09-02  9:06                                         ` Tejun Heo
  1 sibling, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2010-09-02  9:06 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Stephan Diestelhorst, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

Hello, Rafael.

On 08/27/2010 01:09 AM, Rafael J. Wysocki wrote:
> Well, no luck.  I was able to reproduce the issue on my box with this patch
> applied on top of 2.6.32-rc2.
> 
> Which probably means that the link power management is not really involved
> here and seems to turn up this statement:
> 
> rc = ata_host_request_pm(host, mesg, 0, ATA_EHI_QUIET, 1);
> 
> in ata_host_suspend() as the culprit.
> 
> Does it make sense?

So, LPM doesn't have anything to do with the problem.  The fact that
this only happens on specific machines is strange.  Maybe the BIOS is
doing something fishy to the controller on disk spindown during
suspend.  Can the BIOS tell that the system is going for suspend by
the time sd suspend is called?  I'll prep another patch which will
make EH skip certain steps and ignore failures during suspend, which
basically mimics what your patch does but in a safer way.
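
The idea described above (error handling skipping its diagnostic and recovery steps while suspending, and not propagating failures) can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: every name beginning with toy_ or TOY_ is hypothetical, nothing here is libata code, and the real patch may be structured quite differently.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_EIO 5   /* hypothetical stand-in for an I/O error code */

/* Hypothetical record of what one toy error-handling pass did. */
struct toy_eh_result {
	bool autopsy_ran;   /* did the diagnostic step run? */
	bool recovery_ran;  /* did the recovery step run? */
	int rc;             /* result reported to the caller */
};

/* Pretend hardware step that may fail, e.g. a command timing out. */
static int toy_suspend_port(bool hw_fails)
{
	return hw_fails ? -TOY_EIO : 0;
}

/* One toy EH pass; "suspending" selects the shortened path. */
static struct toy_eh_result toy_eh_run(bool suspending, bool hw_fails)
{
	struct toy_eh_result r = { false, false, 0 };

	if (!suspending) {
		/* Normal path: diagnose, recover, and report any failure. */
		r.autopsy_ran = true;
		r.recovery_ran = true;
		r.rc = hw_fails ? -TOY_EIO : 0;
		return r;
	}

	/* Suspend path: skip diagnosis and recovery, ignore the failure. */
	(void)toy_suspend_port(hw_fails);
	assert(r.rc == 0);  /* invariant: the suspend path never reports failure */
	return r;
}
```

In this model, toy_eh_run(true, true) returns rc == 0 with both steps skipped, while toy_eh_run(false, true) runs both steps and returns -TOY_EIO.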

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-26 18:24                                       ` Rafael J. Wysocki
@ 2010-08-27 23:35                                         ` Rafael J. Wysocki
  2010-09-02 14:31                                             ` Stephan Diestelhorst
  0 siblings, 1 reply; 46+ messages in thread
From: Rafael J. Wysocki @ 2010-08-27 23:35 UTC (permalink / raw)
  To: Stephan Diestelhorst
  Cc: Tejun Heo, linux-kernel, linux-ide, linux-pm, Stephan Diestelhorst

On Thursday, August 26, 2010, Rafael J. Wysocki wrote:
> On Thursday, August 26, 2010, Stephan Diestelhorst wrote:
> > On Tuesday 24 August 2010 18:11:22 Stephan Diestelhorst wrote:
> > > On Tuesday 24 August 2010 18:07:23 Stephan Diestelhorst wrote:
> > > > On Monday 23 August 2010 14:03:40 Tejun Heo wrote:
> > > > > On 08/19/2010 06:23 PM, Stephan Diestelhorst wrote:
> > > > > > It says "max_performance", I have not touched anything. So it has been
> > > > > > like that all the time. Would this explain why your patch did not show
> > > > > > the debug printout?
> > > > > 
> > > > > Hmm... okay.  Yeah, if you haven't been using IPM at all, there won't
> > > > > be any debug messages but at the same time the posted patch should
> > > > > have had the same effect as Rafael's patch as IPM path isn't traveled
> > > > > > at all.  Can you please check the following?
> > > > > 
> > [...]
> > > > > * Rafael's patch actually fixes the problem.  If you haven't been
> > > > >   using IPM at all, Rafael's patch and mine should behave exactly the
> > > > >   same (ie. no IPM operation at all during suspend/resume).  It could
> > > > >   be that you're seeing a different issue.
> > > > 
> > > > That next on my list...
> > 
> > Just did the following: Rebased Rafael's patch to 2.6.35 and tried it
> > again (with added prints to make sure I am running the right one) and
> > did >10 suspend to ram / resume cycles under I/O write load. All of
> > them worked fine (for comparison: your patch resulted in RO HDD at
> > first attempt).
> > 
> > (I had some extra prints around the suspend functions changed in
> >  Rafael's patch, tried with and without, no change--works flawlessly.)
> > 
> > What do you make of this?
> 
> I think my patch actually does more than Tejun's.  I need to have a
> deeper look at them both.
> 
> I'm still testing Tejun's patch on my system where I was able to reproduce
> the problem, but so far it's been working.

I reproduced the problem with Tejun's patch applied, so I'm now quite
sure the problem is related to the suspend of controller ports (which is done
by scheduling SCSI error handling on the controller).

Anyway, below is a new version of my patch that plays a bit nicer with
the resume code.  Can you please check if it still fixes the problem for you?

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: SATA / AHCI: Do not play with the link PM during suspend to RAM (v2)

My Acer Ferrari One occasionally loses communication with the HDD
(which in fact is an Intel SSD) during suspend to RAM.  The symptom
is that the IDENTIFY command times out during suspend and the device
is dropped by the kernel, so it is not available during resume and
the system is unusable as a result.  The failure is not readily
reproducible, although it happens once every several suspends and
it always happens after the disk has been shut down by the SCSI
layer's suspend routine.

I was able to trace this issue down to the scheduling of error
handling for all of the controller's ports carried out by
ata_host_suspend(), which indicates quirky hardware.  However, the
AHCI driver, which is used on the affected box, doesn't really need
to do anything with the controller's ports during suspend to RAM,
because the controller is going to be put into D3 immediately by
ata_pci_device_do_suspend() and it will undergo full reset during
the subsequent resume anyway.  For this reason, make the AHCI driver
avoid calling ata_host_suspend() during suspend to RAM which works
around the problem and makes sense as a general optimization.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/ata/ahci.c        |   11 ++++++++++-
 drivers/ata/libata-core.c |   20 ++++++++++++++++++++
 include/linux/libata.h    |    1 +
 3 files changed, 31 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/ata/ahci.c
===================================================================
--- linux-2.6.orig/drivers/ata/ahci.c
+++ linux-2.6/drivers/ata/ahci.c
@@ -595,6 +595,7 @@ static int ahci_pci_device_suspend(struc
 	struct ahci_host_priv *hpriv = host->private_data;
 	void __iomem *mmio = hpriv->mmio;
 	u32 ctl;
+	int rc = 0;
 
 	if (mesg.event & PM_EVENT_SUSPEND &&
 	    hpriv->flags & AHCI_HFLAG_NO_SUSPEND) {
@@ -614,7 +615,15 @@ static int ahci_pci_device_suspend(struc
 		readl(mmio + HOST_CTL); /* flush */
 	}
 
-	return ata_pci_device_suspend(pdev, mesg);
+	if (mesg.event == PM_EVENT_SUSPEND)
+		ata_fake_suspend(host);
+	else
+		rc = ata_host_suspend(host, mesg);
+
+	if (!rc)
+		ata_pci_device_do_suspend(pdev, mesg);
+
+	return rc;
 }
 
 static int ahci_pci_device_resume(struct pci_dev *pdev)
Index: linux-2.6/include/linux/libata.h
===================================================================
--- linux-2.6.orig/include/linux/libata.h
+++ linux-2.6/include/linux/libata.h
@@ -986,6 +986,7 @@ extern bool ata_link_online(struct ata_l
 extern bool ata_link_offline(struct ata_link *link);
 #ifdef CONFIG_PM
 extern int ata_host_suspend(struct ata_host *host, pm_message_t mesg);
+extern void ata_fake_suspend(struct ata_host *host);
 extern void ata_host_resume(struct ata_host *host);
 #endif
 extern int ata_ratelimit(void);
Index: linux-2.6/drivers/ata/libata-core.c
===================================================================
--- linux-2.6.orig/drivers/ata/libata-core.c
+++ linux-2.6/drivers/ata/libata-core.c
@@ -5429,6 +5429,25 @@ int ata_host_suspend(struct ata_host *ho
 	return rc;
 }
 
+void ata_fake_suspend(struct ata_host *host)
+{
+	unsigned long flags;
+	int i;
+
+	for (i = 0; i < host->n_ports; i++) {
+		struct ata_port *ap = host->ports[i];
+
+		spin_lock_irqsave(ap->lock, flags);
+
+		ap->pm_mesg = PMSG_SUSPEND;
+		ap->pflags |= ATA_PFLAG_SUSPENDED;
+
+		spin_unlock_irqrestore(ap->lock, flags);
+	}
+
+	host->dev->power.power_state = PMSG_SUSPEND;
+}
+
 /**
  *	ata_host_resume - resume host
  *	@host: host to resume
@@ -6691,6 +6710,7 @@ EXPORT_SYMBOL_GPL(ata_link_online);
 EXPORT_SYMBOL_GPL(ata_link_offline);
 #ifdef CONFIG_PM
 EXPORT_SYMBOL_GPL(ata_host_suspend);
+EXPORT_SYMBOL_GPL(ata_fake_suspend);
 EXPORT_SYMBOL_GPL(ata_host_resume);
 #endif /* CONFIG_PM */
 EXPORT_SYMBOL_GPL(ata_id_string);

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-26 23:09                                       ` Rafael J. Wysocki
@ 2010-08-26 23:46                                         ` Rafael J. Wysocki
  2010-09-02  9:06                                         ` Tejun Heo
  1 sibling, 0 replies; 46+ messages in thread
From: Rafael J. Wysocki @ 2010-08-26 23:46 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Stephan Diestelhorst, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

On Friday, August 27, 2010, Rafael J. Wysocki wrote:
> On Tuesday, August 24, 2010, Rafael J. Wysocki wrote:
> > On Tuesday, August 24, 2010, Tejun Heo wrote:
> > > On 08/23/2010 08:58 PM, Rafael J. Wysocki wrote:
> > > > On Monday, August 23, 2010, Tejun Heo wrote:
> > > >> Hello, sorry about the delay.
> > > >>
> > > >> On 08/19/2010 06:23 PM, Stephan Diestelhorst wrote:
> > > > >>> It says "max_performance", I have not touched anything. So it has been
> > > >>> like that all the time. Would this explain why your patch did not show
> > > >>> the debug printout?
> > > >>
> > > >> Hmm... okay.  Yeah, if you haven't been using IPM at all, there won't
> > > >> be any debug messages but at the same time the posted patch should
> > > >> have had the same effect as Rafael's patch as IPM path isn't traveled
> > > > >> at all.  Can you please check the following?
> > > >>
> > > >> * You're actually running the correct patched kernel and modules.  It
> > > >>   probably is a good idea to add a printk message.  ie. Apply the
> > > >>   patch and add a printk() in ata_host_request_pm() in libata-core.c
> > > > >>   and make sure the debug message appears.
> > > >>
> > > >> * Rafael's patch actually fixes the problem.  If you haven't been
> > > >>   using IPM at all, Rafael's patch and mine should behave exactly the
> > > >>   same (ie. no IPM operation at all during suspend/resume).  It could
> > > >>   be that you're seeing a different issue.
> > > >>
> > > >> Rafael, can you please test my patch and see how your case behaves?
> > > > 
> > > > This one: http://lkml.org/lkml/2010/8/5/328 ?
> > > 
> > > Yeap, that one.  I can prep a test git branch if necessary.
> > 
> > No need to, but it's going to take a few days to verify on my box.
> 
> Well, no luck.  I was able to reproduce the issue on my box with this patch
> applied on top of 2.6.32-rc2.

2.6.36-rc2 that is.

> Which probably means that the link power management is not really involved
> here and points to this statement:
> 
> rc = ata_host_request_pm(host, mesg, 0, ATA_EHI_QUIET, 1);
> 
> in ata_host_suspend() as the culprit.
> 
> Does it make sense?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-24 20:39                                     ` Rafael J. Wysocki
@ 2010-08-26 23:09                                       ` Rafael J. Wysocki
  2010-08-26 23:46                                         ` Rafael J. Wysocki
  2010-09-02  9:06                                         ` Tejun Heo
  0 siblings, 2 replies; 46+ messages in thread
From: Rafael J. Wysocki @ 2010-08-26 23:09 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Stephan Diestelhorst, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

On Tuesday, August 24, 2010, Rafael J. Wysocki wrote:
> On Tuesday, August 24, 2010, Tejun Heo wrote:
> > On 08/23/2010 08:58 PM, Rafael J. Wysocki wrote:
> > > On Monday, August 23, 2010, Tejun Heo wrote:
> > >> Hello, sorry about the delay.
> > >>
> > >> On 08/19/2010 06:23 PM, Stephan Diestelhorst wrote:
> > > >>> It says "max_performance", I have not touched anything. So it has been
> > >>> like that all the time. Would this explain why your patch did not show
> > >>> the debug printout?
> > >>
> > >> Hmm... okay.  Yeah, if you haven't been using IPM at all, there won't
> > >> be any debug messages but at the same time the posted patch should
> > >> have had the same effect as Rafael's patch as IPM path isn't traveled
> > > >> at all.  Can you please check the following?
> > >>
> > >> * You're actually running the correct patched kernel and modules.  It
> > >>   probably is a good idea to add a printk message.  ie. Apply the
> > >>   patch and add a printk() in ata_host_request_pm() in libata-core.c
> > > >>   and make sure the debug message appears.
> > >>
> > >> * Rafael's patch actually fixes the problem.  If you haven't been
> > >>   using IPM at all, Rafael's patch and mine should behave exactly the
> > >>   same (ie. no IPM operation at all during suspend/resume).  It could
> > >>   be that you're seeing a different issue.
> > >>
> > >> Rafael, can you please test my patch and see how your case behaves?
> > > 
> > > This one: http://lkml.org/lkml/2010/8/5/328 ?
> > 
> > Yeap, that one.  I can prep a test git branch if necessary.
> 
> No need to, but it's going to take a few days to verify on my box.

Well, no luck.  I was able to reproduce the issue on my box with this patch
applied on top of 2.6.32-rc2.

Which probably means that the link power management is not really involved
here and points to this statement:

rc = ata_host_request_pm(host, mesg, 0, ATA_EHI_QUIET, 1);

in ata_host_suspend() as the culprit.

Does it make sense?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-26 16:15                                     ` Stephan Diestelhorst
@ 2010-08-26 18:24                                       ` Rafael J. Wysocki
  2010-08-27 23:35                                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 46+ messages in thread
From: Rafael J. Wysocki @ 2010-08-26 18:24 UTC (permalink / raw)
  To: Stephan Diestelhorst
  Cc: Tejun Heo, linux-kernel, linux-ide, linux-pm, Stephan Diestelhorst

On Thursday, August 26, 2010, Stephan Diestelhorst wrote:
> On Tuesday 24 August 2010 18:11:22 Stephan Diestelhorst wrote:
> > On Tuesday 24 August 2010 18:07:23 Stephan Diestelhorst wrote:
> > > On Monday 23 August 2010 14:03:40 Tejun Heo wrote:
> > > > On 08/19/2010 06:23 PM, Stephan Diestelhorst wrote:
> > > > > It says "max_performance", I have not touched anything. So it has been
> > > > > like that all the time. Would this explain why your patch did not show
> > > > > the debug printout?
> > > > 
> > > > Hmm... okay.  Yeah, if you haven't been using IPM at all, there won't
> > > > be any debug messages but at the same time the posted patch should
> > > > have had the same effect as Rafael's patch as IPM path isn't traveled
> > > > > at all.  Can you please check the following?
> > > > 
> [...]
> > > > * Rafael's patch actually fixes the problem.  If you haven't been
> > > >   using IPM at all, Rafael's patch and mine should behave exactly the
> > > >   same (ie. no IPM operation at all during suspend/resume).  It could
> > > >   be that you're seeing a different issue.
> > > 
> > > That's next on my list...
> 
> Just did the following: Rebased Rafael's patch to 2.6.35 and tried it
> again (with added prints to make sure I am running the right one) and
> did >10 suspend to ram / resume cycles under I/O write load. All of
> them worked fine (for comparison: your patch resulted in RO HDD at
> first attempt).
> 
> (I had some extra prints around the suspend functions changed in
>  Rafael's patch, tried with and without, no change--works flawlessly.)
> 
> What do you make of this?

I think my patch actually does more than Tejun's one.  I need to have a
deeper look at them both.

I'm still testing Tejun's patch on my system where I was able to reproduce
the problem, but so far it's been working.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-24 16:11                                   ` Stephan Diestelhorst
@ 2010-08-26 16:15                                     ` Stephan Diestelhorst
  2010-08-26 18:24                                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-08-26 16:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

On Tuesday 24 August 2010 18:11:22 Stephan Diestelhorst wrote:
> On Tuesday 24 August 2010 18:07:23 Stephan Diestelhorst wrote:
> > On Monday 23 August 2010 14:03:40 Tejun Heo wrote:
> > > On 08/19/2010 06:23 PM, Stephan Diestelhorst wrote:
> > > > It says "max_performance", I have not touched anything. So it has been
> > > > like that all the time. Would this explain why your patch did not show
> > > > the debug printout?
> > > 
> > > Hmm... okay.  Yeah, if you haven't been using IPM at all, there won't
> > > be any debug messages but at the same time the posted patch should
> > > have had the same effect as Rafael's patch as IPM path isn't traveled
> > > > at all.  Can you please check the following?
> > > 
[...]
> > > * Rafael's patch actually fixes the problem.  If you haven't been
> > >   using IPM at all, Rafael's patch and mine should behave exactly the
> > >   same (ie. no IPM operation at all during suspend/resume).  It could
> > >   be that you're seeing a different issue.
> > 
> > That's next on my list...

Just did the following: Rebased Rafael's patch to 2.6.35 and tried it
again (with added prints to make sure I am running the right one) and
did >10 suspend to ram / resume cycles under I/O write load. All of
them worked fine (for comparison: your patch resulted in RO HDD at
first attempt).
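
A minimal sketch of how such a cycle test could be scripted, assuming
util-linux's rtcwake is available and the script runs as root.  This is not
necessarily the procedure used above; suspend_cycles, SUSPEND_CMD and the
test file path are hypothetical names chosen for illustration, and the
write load itself would have to be started separately.

```shell
#!/bin/sh
# Sketch: suspend/resume N times and verify the disk is still writable.
# SUSPEND_CMD is a hypothetical override so the loop can be dry-run.
suspend_cycles() {
    cycles="${1:-10}"
    testfile="${2:-./suspend-test.tmp}"
    # Default: suspend to RAM and let the RTC wake the machine after 30 s.
    cmd="${SUSPEND_CMD:-rtcwake -m mem -s 30}"
    i=1
    while [ "$i" -le "$cycles" ]; do
        $cmd >/dev/null 2>&1 || { echo "cycle $i: suspend failed"; return 1; }
        # A read-only remount or an I/O error after resume makes this fail.
        if ! echo "cycle $i" > "$testfile" || ! sync; then
            echo "cycle $i: write failed after resume"
            return 1
        fi
        echo "cycle $i: ok"
        i=$((i + 1))
    done
    rm -f "$testfile"
}
```

For example, "suspend_cycles 10 /var/tmp/suspend-test.tmp" would attempt ten
cycles, while setting SUSPEND_CMD=true gives a dry run that never suspends.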

(I had some extra prints around the suspend functions changed in
 Rafael's patch, tried with and without, no change--works flawlessly.)

What do you make of this?

Thanks,
  Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-24  7:37                                   ` Tejun Heo
@ 2010-08-24 20:39                                     ` Rafael J. Wysocki
  2010-08-26 23:09                                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 46+ messages in thread
From: Rafael J. Wysocki @ 2010-08-24 20:39 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Stephan Diestelhorst, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

On Tuesday, August 24, 2010, Tejun Heo wrote:
> On 08/23/2010 08:58 PM, Rafael J. Wysocki wrote:
> > On Monday, August 23, 2010, Tejun Heo wrote:
> >> Hello, sorry about the delay.
> >>
> >> On 08/19/2010 06:23 PM, Stephan Diestelhorst wrote:
> >>> It says "max_performance", I have not touched anything. So it has been
> >>> like that all the time. Would this explain why your patch did not show
> >>> the debug printout?
> >>
> >> Hmm... okay.  Yeah, if you haven't been using IPM at all, there won't
> >> be any debug messages but at the same time the posted patch should
> >> have had the same effect as Rafael's patch as IPM path isn't traveled
> >> at all.  Can you please check the following?
> >>
> >> * You're actually running the correct patched kernel and modules.  It
> >>   probably is a good idea to add a printk message.  ie. Apply the
> >>   patch and add a printk() in ata_host_request_pm() in libata-core.c
> >>   and make sure the debug message appears.
> >>
> >> * Rafael's patch actually fixes the problem.  If you haven't been
> >>   using IPM at all, Rafael's patch and mine should behave exactly the
> >>   same (ie. no IPM operation at all during suspend/resume).  It could
> >>   be that you're seeing a different issue.
> >>
> >> Rafael, can you please test my patch and see how your case behaves?
> > 
> > This one: http://lkml.org/lkml/2010/8/5/328 ?
> 
> Yeap, that one.  I can prep a test git branch if necessary.

No need to, but it's going to take a few days to verify on my box.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-24 16:07                                 ` Stephan Diestelhorst
@ 2010-08-24 16:11                                   ` Stephan Diestelhorst
  2010-08-26 16:15                                     ` Stephan Diestelhorst
  0 siblings, 1 reply; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-08-24 16:11 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm

On Tuesday 24 August 2010 18:07:23 Stephan Diestelhorst wrote:
> On Monday 23 August 2010 14:03:40 Tejun Heo wrote:
> > On 08/19/2010 06:23 PM, Stephan Diestelhorst wrote:
> > > It says "max_performance", I have not touched anything. So it has been
> > > like that all the time. Would this explain why your patch did not show
> > > the debug printout?
> > 
> > Hmm... okay.  Yeah, if you haven't been using IPM at all, there won't
> > be any debug messages but at the same time the posted patch should
> > have had the same effect as Rafael's patch as IPM path isn't traveled
> > > at all.  Can you please check the following?
> > 
> > * You're actually running the correct patched kernel and modules.  It
> >   probably is a good idea to add a printk message.  ie. Apply the
> >   patch and add a printk() in ata_host_request_pm() in libata-core.c
> > >   and make sure the debug message appears.
> 
> Did that. Actually also added some printks to the XXX function, called

I meant ahci_dev_config() in libahci.c . Darn quick trigger finger ;-)

> early during boot. Output confirms that your patch is loaded. And even
> on the first resume the machine dies.
> 
> > * Rafael's patch actually fixes the problem.  If you haven't been
> >   using IPM at all, Rafael's patch and mine should behave exactly the
> >   same (ie. no IPM operation at all during suspend/resume).  It could
> >   be that you're seeing a different issue.
> 
> That's next on my list...
> 
> Many thanks!
> 
> Stephan
> 
> 


-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-23 12:03                               ` Tejun Heo
  2010-08-23 18:58                                 ` Rafael J. Wysocki
@ 2010-08-24 16:07                                 ` Stephan Diestelhorst
  2010-08-24 16:11                                   ` Stephan Diestelhorst
  1 sibling, 1 reply; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-08-24 16:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

On Monday 23 August 2010 14:03:40 Tejun Heo wrote:
> On 08/19/2010 06:23 PM, Stephan Diestelhorst wrote:
> > It says "max_performance", I have not touched anything. So it has been
> > like that all the time. Would this explain why your patch did not show
> > the debug printout?
> 
> Hmm... okay.  Yeah, if you haven't been using IPM at all, there won't
> be any debug messages but at the same time the posted patch should
> have had the same effect as Rafael's patch as IPM path isn't traveled
> > at all.  Can you please check the following?
> 
> * You're actually running the correct patched kernel and modules.  It
>   probably is a good idea to add a printk message.  ie. Apply the
>   patch and add a printk() in ata_host_request_pm() in libata-core.c
> >   and make sure the debug message appears.

Did that. Actually also added some printks to the XXX function, called
early during boot. Output confirms that your patch is loaded. And even
on the first resume the machine dies.

> * Rafael's patch actually fixes the problem.  If you haven't been
>   using IPM at all, Rafael's patch and mine should behave exactly the
>   same (ie. no IPM operation at all during suspend/resume).  It could
>   be that you're seeing a different issue.

That's next on my list...

Many thanks!

Stephan

-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-23 18:58                                 ` Rafael J. Wysocki
@ 2010-08-24  7:37                                   ` Tejun Heo
  2010-08-24 20:39                                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 46+ messages in thread
From: Tejun Heo @ 2010-08-24  7:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Stephan Diestelhorst, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

On 08/23/2010 08:58 PM, Rafael J. Wysocki wrote:
> On Monday, August 23, 2010, Tejun Heo wrote:
>> Hello, sorry about the delay.
>>
>> On 08/19/2010 06:23 PM, Stephan Diestelhorst wrote:
>>> It says "max_performance", I have not touched anything. So it has been
>>> like that all the time. Would this explain why your patch did not show
>>> the debug printout?
>>
>> Hmm... okay.  Yeah, if you haven't been using IPM at all, there won't
>> be any debug messages but at the same time the posted patch should
>> have had the same effect as Rafael's patch as IPM path isn't traveled
>> at all.  Can you please check the following?
>>
>> * You're actually running the correct patched kernel and modules.  It
>>   probably is a good idea to add a printk message.  ie. Apply the
>>   patch and add a printk() in ata_host_request_pm() in libata-core.c
>>   and make sure the debug message appears.
>>
>> * Rafael's patch actually fixes the problem.  If you haven't been
>>   using IPM at all, Rafael's patch and mine should behave exactly the
>>   same (ie. no IPM operation at all during suspend/resume).  It could
>>   be that you're seeing a different issue.
>>
>> Rafael, can you please test my patch and see how your case behaves?
> 
> This one: http://lkml.org/lkml/2010/8/5/328 ?

Yeap, that one.  I can prep a test git branch if necessary.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-23 12:03                               ` Tejun Heo
@ 2010-08-23 18:58                                 ` Rafael J. Wysocki
  2010-08-24  7:37                                   ` Tejun Heo
  2010-08-24 16:07                                 ` Stephan Diestelhorst
  1 sibling, 1 reply; 46+ messages in thread
From: Rafael J. Wysocki @ 2010-08-23 18:58 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Stephan Diestelhorst, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

On Monday, August 23, 2010, Tejun Heo wrote:
> Hello, sorry about the delay.
> 
> On 08/19/2010 06:23 PM, Stephan Diestelhorst wrote:
> > It says "max_performance", I have not touched anything. So it has been
> > like that all the time. Would this explain why your patch did not show
> > the debug printout?
> 
> Hmm... okay.  Yeah, if you haven't been using IPM at all, there won't
> be any debug messages but at the same time the posted patch should
> have had the same effect as Rafael's patch as IPM path isn't traveled
> > at all.  Can you please check the following?
> 
> * You're actually running the correct patched kernel and modules.  It
>   probably is a good idea to add a printk message.  ie. Apply the
>   patch and add a printk() in ata_host_request_pm() in libata-core.c
> >   and make sure the debug message appears.
> 
> * Rafael's patch actually fixes the problem.  If you haven't been
>   using IPM at all, Rafael's patch and mine should behave exactly the
>   same (ie. no IPM operation at all during suspend/resume).  It could
>   be that you're seeing a different issue.
> 
> Rafael, can you please test my patch and see how your case behaves?

This one: http://lkml.org/lkml/2010/8/5/328 ?

Rafael

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-19 16:23                             ` Stephan Diestelhorst
@ 2010-08-23 12:03                               ` Tejun Heo
  2010-08-23 18:58                                 ` Rafael J. Wysocki
  2010-08-24 16:07                                 ` Stephan Diestelhorst
  0 siblings, 2 replies; 46+ messages in thread
From: Tejun Heo @ 2010-08-23 12:03 UTC (permalink / raw)
  To: Stephan Diestelhorst
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

Hello, sorry about the delay.

On 08/19/2010 06:23 PM, Stephan Diestelhorst wrote:
> It says "max_performance", I have not touched anything. So it has been
> like that all the time. Would this explain why your patch did not show
> the debug printout?

Hmm... okay.  Yeah, if you haven't been using IPM at all, there won't
be any debug messages but at the same time the posted patch should
have had the same effect as Rafael's patch as IPM path isn't traveled
at all.  Can you please check the following?

* You're actually running the correct patched kernel and modules.  It
  probably is a good idea to add a printk message.  ie. Apply the
  patch and add a printk() in ata_host_request_pm() in libata-core.c
  and make sure the debug message appears.

* Rafael's patch actually fixes the problem.  If you haven't been
  using IPM at all, Rafael's patch and mine should behave exactly the
  same (ie. no IPM operation at all during suspend/resume).  It could
  be that you're seeing a different issue.

Rafael, can you please test my patch and see how your case behaves?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-18  6:12                           ` Tejun Heo
@ 2010-08-19 16:23                             ` Stephan Diestelhorst
  2010-08-23 12:03                               ` Tejun Heo
  0 siblings, 1 reply; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-08-19 16:23 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

Hi Tejun,

On Wednesday 18 August 2010, 08:12:24 Tejun Heo wrote:
> On 08/17/2010 11:28 PM, Stephan Diestelhorst wrote:
> > 2010/8/17 Tejun Heo <htejun@gmail.com>:
> >> On 08/17/2010 12:51 PM, Stephan Diestelhorst wrote:
> >>> I've also confirmed that the "XXX ahci_set_ipm" is present in
> >>> libahci.ko. So either I've screwed up badly when compiling the initrd,
> >>> the code is not executed or the printout does not make it into any
> >>> logfile anymore.
> >>
> >> Yeah, that's weird.  You're enabling IPM, right?
>
> You can check whether it's enabled by
> 
> $ cat /sys/class/scsi_host/host0/link_power_management_policy
> 
> If it says max_performance, it's disabled.  If it says anything else,
> it's enabled.

It says "max_performance", I have not touched anything. So it has been
like that all the time. Would this explain why your patch did not show
the debug printout?

Thanks,
  Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-17 21:28                         ` Stephan Diestelhorst
@ 2010-08-18  6:12                           ` Tejun Heo
  2010-08-19 16:23                             ` Stephan Diestelhorst
  0 siblings, 1 reply; 46+ messages in thread
From: Tejun Heo @ 2010-08-18  6:12 UTC (permalink / raw)
  To: Stephan Diestelhorst
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	stephan.diestelhorst

Hello,

On 08/17/2010 11:28 PM, Stephan Diestelhorst wrote:
> 2010/8/17 Tejun Heo <htejun@gmail.com>:
>> On 08/17/2010 12:51 PM, Stephan Diestelhorst wrote:
>>> I *think* I have applied the patch correctly. Please find a copy of
>>> "git show" in the build directory attached. This should be the right
>>> thing, shouldn't it?
>>
>>> Maybe I forgot to specify a particular debug option / verbosity?
>>
>>> I've also confirmed that the "XXX ahci_set_ipm" is present in
>>> libahci.ko. So either I've screwed up badly when compiling the initrd,
>>> the code is not executed or the printout does not make it into any
>>> logfile anymore.
>>
>> Yeah, that's weird.  You're enabling IPM, right?
> 
> Erm... Honestly, I have no clue. What is IPM? How do I enable it? This is a
> Kubuntu Lucid 10.04 distribution, and I have not touched too much. In
> particular, the kernels have been from upstream git, just with the Ubuntu config
> copied over.
> 
> Maybe it is just not enabled? I am guessing that IPM might be IDE power
> management? Or intelligent, integrated? Google turns up this email thread
> as one of the first hits and nothing else conclusive.

It's interface power management, also called link power management.
You can check whether it's enabled by

$ cat /sys/class/scsi_host/host0/link_power_management_policy

If it says max_performance, it's disabled.  If it says anything else,
it's enabled.
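
As a rough sketch, the same check can be applied to every host at once.
Only the sysfs attribute name and the meaning of max_performance come from
the explanation above; the function name lpm_status and its optional
directory argument are made up for illustration.

```shell
#!/bin/sh
# Report the link power management (LPM) policy of every SCSI/ATA host.
# The optional first argument overrides the sysfs directory (for testing).
lpm_status() {
    base="${1:-/sys/class/scsi_host}"
    for f in "$base"/host*/link_power_management_policy; do
        [ -r "$f" ] || continue     # glob did not match or attribute absent
        host=$(basename "$(dirname "$f")")
        policy=$(cat "$f")
        if [ "$policy" = "max_performance" ]; then
            echo "$host: $policy (LPM disabled)"
        else
            echo "$host: $policy (LPM enabled)"
        fi
    done
}
```

Calling lpm_status with no argument reads the live sysfs tree; hosts whose
driver does not expose the attribute are silently skipped.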

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-17 15:04                       ` Tejun Heo
@ 2010-08-17 21:28                         ` Stephan Diestelhorst
  2010-08-18  6:12                           ` Tejun Heo
  0 siblings, 1 reply; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-08-17 21:28 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	stephan.diestelhorst

Hi,

2010/8/17 Tejun Heo <htejun@gmail.com>:
> On 08/17/2010 12:51 PM, Stephan Diestelhorst wrote:
>> I *think* I have applied the patch correctly. Please find a copy of
>> "git show" in the build directory attached. This should be the right
>> thing, shouldn't it?
>
>> Maybe I forgot to specify a particular debug option / verbosity?
>
>> I've also confirmed that the "XXX ahci_set_ipm" is present in
>> libahci.ko. So either I've screwed up badly when compiling the initrd,
>> the code is not executed or the printout does not make it into any
>> logfile anymore.
>
> Yeah, that's weird.  You're enabling IPM, right?

Erm... Honestly, I have no clue. What is IPM? How do I enable it? This is a
Kubuntu Lucid 10.04 distribution, and I have not touched too much. In
particular, the kernels have been from upstream git, just with the Ubuntu config
copied over.

Maybe it is just not enabled? I am guessing that IPM might be IDE power
management? Or intelligent, integrated? Google turns up this email thread
as one of the first hits and nothing else conclusive.

Thanks,
  Stephan

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-17 10:51                     ` Stephan Diestelhorst
@ 2010-08-17 15:04                       ` Tejun Heo
  2010-08-17 21:28                         ` Stephan Diestelhorst
  0 siblings, 1 reply; 46+ messages in thread
From: Tejun Heo @ 2010-08-17 15:04 UTC (permalink / raw)
  To: Stephan Diestelhorst
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

Hello,

On 08/17/2010 12:51 PM, Stephan Diestelhorst wrote:
> I *think* I have applied the patch correctly. Please find a copy of
> "git show" in the build directory attached. This should be the right
> thing, shouldn't it?

Yeah, looks right to me but if you have enabled IPM, there gotta be
at least some XXX messages in the log.  Weird.

> Maybe I forgot to specify a particular debug option / verbosity?

Hmmm... all messages are at KERN_INFO level and they don't have any
switch.

> I've also confirmed that the "XXX ahci_set_ipm" is present in
> libahci.ko. So either I've screwed up badly when compiling the initrd,
> the code is not executed or the printout does not make it into any
> logfile anymore.

Yeah, that's weird.  You're enabling IPM, right?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-17 11:29                     ` Tejun Heo
@ 2010-08-17 12:10                       ` Stephan Diestelhorst
  2010-08-17 12:09                         ` Tejun Heo
  0 siblings, 1 reply; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-08-17 12:10 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

Hi,

On Tuesday 17 August 2010, 13:29:05 Tejun Heo wrote:
> Hello,
> 
> On 08/17/2010 01:19 PM, Rafael J. Wysocki wrote:
> > Well, I wonder what the real reason for doing the link power management
> > thing at this particular point in the suspend code path is.  It just seems to
> > disable the link power management, but then the controller is put into a
> > low-power state and is reset from scratch during resume, so I'm not quite
> > sure how skipping that code could possibly lead to any problems.
> 
> > Perhaps we could move the link PM manipulation to the prepare stage
> > of suspend?
> 
> Yeah, one possibility is that the devices misbehave if they receive
> LPM commands while suspended.  Does commenting out sd_suspend resolve
> the issue too?

If you want me to test anything... let me know. Since I do not know
much about the ATA code, I do not know what to change where. (A simple
grep for sd_suspend in drivers/ata didn't turn up anything.)

Thanks,
  Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-17 12:10                       ` Stephan Diestelhorst
@ 2010-08-17 12:09                         ` Tejun Heo
  0 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2010-08-17 12:09 UTC (permalink / raw)
  To: Stephan Diestelhorst
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

Hello,

On 08/17/2010 02:10 PM, Stephan Diestelhorst wrote:
> If you want me to test anything... let me know. Since I do not know
> much about the ATA code, I do not know what to change where. (A simple
> grep for sd_suspend in drivers/ata didn't turn up anything.)

Oh, sure, the following should be enough.  Thanks.

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 8802e48..892ccc7 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2456,6 +2456,8 @@ static int sd_suspend(struct device *dev, pm_message_t mesg)
 	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
 	int ret = 0;

+	return 0;
+
 	if (!sdkp)
 		return 0;	/* this can happen */

-- 
tejun

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-17 11:19                   ` Rafael J. Wysocki
@ 2010-08-17 11:29                     ` Tejun Heo
  2010-08-17 12:10                       ` Stephan Diestelhorst
  0 siblings, 1 reply; 46+ messages in thread
From: Tejun Heo @ 2010-08-17 11:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Stephan Diestelhorst, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

Hello,

On 08/17/2010 01:19 PM, Rafael J. Wysocki wrote:
> Well, I wonder what the real reason for doing the link power management
> thing at this particular point in the suspend code path is.  It just seems to
> disable the link power management, but then the controller is put into a
> low-power state and is reset from scratch during resume, so I'm not quite
> sure how skipping that code could possibly lead to any problems.

Stranger things have happened in the ATA la-la land. :-) Also, it
makes non-lpm and lpm cases leave the controller and device in
different states when it goes to sleep, which _really_ bothers me.
Combined with the timing dependent nature of DIPM, I worry this might
lead to very obscure issues and would much prefer to make sure
everything is in fixed, known, fully powered state before committing
to any major operations.  I might be paranoid tho.  I'll think more
about it.

> Perhaps we could move the link PM manipulation to the prepare stage
> of suspend?

Yeah, one possibility is that the devices misbehave if they receive
LPM commands while suspended.  Does commenting out sd_suspend resolve
the issue too?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-17 10:15                 ` Tejun Heo
  2010-08-17 10:29                   ` Stephan Diestelhorst
@ 2010-08-17 11:19                   ` Rafael J. Wysocki
  2010-08-17 11:29                     ` Tejun Heo
  1 sibling, 1 reply; 46+ messages in thread
From: Rafael J. Wysocki @ 2010-08-17 11:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Stephan Diestelhorst, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

On Tuesday, August 17, 2010, Tejun Heo wrote:
> Hello,
> 
> On 08/17/2010 11:32 AM, Stephan Diestelhorst wrote:
> > Indeed. Like I said, I have similar issues on another Samsung HDD in
> > an AMD system. I have not yet got around to try the fix there, but I
> > suspect it is the same thing.
> > 
> > I have attached the full /var/log/messages and /var/log/kern.log with
> > multiple suspend-to-ram runs and the last one failing.
> 
> Hmm... are you sure the patch is applied?  There's no debug message
> outputs in the log which the patch added.
> 
> > Would it make sense to add Rafael's workaround upstream, maybe enabling
> > it only for particular platforms / HDDs / with a parameter?
> 
> Yeah, maybe.  The problem is that I'm a bit reluctant to do that for
> all cases as it may cause other obscure failures and we don't know
> whether the problem is controller or device specific at this point,
> so...

Well, I wonder what the real reason for doing the link power management
thing at this particular point in the suspend code path is.  It just seems to
disable the link power management, but then the controller is put into a
low-power state and is reset from scratch during resume, so I'm not quite
sure how skipping that code could possibly lead to any problems.
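
For readers following along, here is a rough userspace model of the
PORT_CMD decision made by ahci_set_ipm() in the test patch quoted
elsewhere in this thread. It is only a sketch: the function name and
the enum are made up for illustration, the bit positions are assumed
from the AHCI PxCMD register layout (ALPE = bit 26, ASP = bit 27,
ICC = bits 31:28), and the MMIO writes, the 10ms sleep and the
PHYRDY interrupt masking are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PxCMD bit positions; check against ahci.h before relying on them. */
#define PORT_CMD_ALPE       (1u << 26)   /* aggressive link PM enable */
#define PORT_CMD_ASP        (1u << 27)   /* aggressive slumber/partial */
#define PORT_CMD_ICC_ACTIVE (0x1u << 28) /* request the active link state */

/* Hypothetical stand-in for enum ata_ipm_policy from the patch. */
enum ipm_policy { IPM_MAX_POWER, IPM_MED_POWER, IPM_MIN_POWER };

/*
 * Compute the new PORT_CMD value for a policy. hipm_allowed models the
 * (hints & ATA_IPM_HIPM) test in the patch.
 */
static uint32_t port_cmd_for_policy(uint32_t cmd, enum ipm_policy policy,
				    bool hipm_allowed)
{
	if (policy == IPM_MAX_POWER || !hipm_allowed) {
		/* link PM off: clear ALPE/ASP and force the link active */
		cmd &= ~(PORT_CMD_ASP | PORT_CMD_ALPE);
		cmd |= PORT_CMD_ICC_ACTIVE;
	} else {
		/* link PM on: ALPE always, ASP only for min_power */
		cmd |= PORT_CMD_ALPE;
		if (policy == IPM_MIN_POWER)
			cmd |= PORT_CMD_ASP;
	}
	return cmd;
}
```

In this model, going back to max power from any link PM setting
always ends with ALPE and ASP cleared and ICC set to active, which is
the "fixed, known, fully powered state" being discussed.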

Perhaps we could move the link PM manipulation to the prepare stage of suspend?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-17 10:29                   ` Stephan Diestelhorst
@ 2010-08-17 10:51                     ` Stephan Diestelhorst
  2010-08-17 15:04                       ` Tejun Heo
  0 siblings, 1 reply; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-08-17 10:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

[-- Attachment #1: Type: text/plain, Size: 1401 bytes --]

On Tuesday 17 August 2010, 12:29:28 Stephan Diestelhorst wrote:
> Hi,
> 
> On Tuesday 17 August 2010 12:15:33 Tejun Heo wrote:
> > On 08/17/2010 11:32 AM, Stephan Diestelhorst wrote:
> > > Indeed. Like I said, I have similar issues on another Samsung HDD in
> > > an AMD system. I have not yet got around to try the fix there, but I
> > > suspect it is the same thing.
> > > 
> > > I have attached the full /var/log/messages and /var/log/kern.log with
> > > multiple suspend-to-ram runs and the last one failing.
> > 
> > Hmm... are you sure the patch is applied?  There's no debug message
> > outputs in the log which the patch added.
> 
> ...

I *think* I have applied the patch correctly. Please find a copy of
"git show" in the build directory attached. This should be the right
thing, shouldn't it?

Maybe I forgot to specify a particular debug option / verbosity?

I've also confirmed that the "XXX ahci_set_ipm" is present in
libahci.ko. So either I've screwed up badly when compiling the initrd,
the code is not executed or the printout does not make it into any
logfile anymore.

Stephan

[-- Attachment #2: actually_applied.git_show.bz2 --]
[-- Type: application/x-bzip, Size: 11245 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-17 10:15                 ` Tejun Heo
@ 2010-08-17 10:29                   ` Stephan Diestelhorst
  2010-08-17 10:51                     ` Stephan Diestelhorst
  2010-08-17 11:19                   ` Rafael J. Wysocki
  1 sibling, 1 reply; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-08-17 10:29 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

Hi,

On Tuesday 17 August 2010 12:15:33 Tejun Heo wrote:
> On 08/17/2010 11:32 AM, Stephan Diestelhorst wrote:
> > Indeed. Like I said, I have similar issues on another Samsung HDD in
> > an AMD system. I have not yet got around to try the fix there, but I
> > suspect it is the same thing.
> > 
> > I have attached the full /var/log/messages and /var/log/kern.log with
> > multiple suspend-to-ram runs and the last one failing.
> 
> Hmm... are you sure the patch is applied?  There's no debug message
> outputs in the log which the patch added.

...

Let me get back to you. :-/

Stephan



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-17  9:32               ` Stephan Diestelhorst
@ 2010-08-17 10:15                 ` Tejun Heo
  2010-08-17 10:29                   ` Stephan Diestelhorst
  2010-08-17 11:19                   ` Rafael J. Wysocki
  0 siblings, 2 replies; 46+ messages in thread
From: Tejun Heo @ 2010-08-17 10:15 UTC (permalink / raw)
  To: Stephan Diestelhorst
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

Hello,

On 08/17/2010 11:32 AM, Stephan Diestelhorst wrote:
> Indeed. Like I said, I have similar issues on another Samsung HDD in
> an AMD system. I have not yet got around to try the fix there, but I
> suspect it is the same thing.
> 
> I have attached the full /var/log/messages and /var/log/kern.log with
> multiple suspend-to-ram runs and the last one failing.

Hmm... are you sure the patch is applied?  There's no debug message
outputs in the log which the patch added.

> Would it make sense to add Rafael's workaround upstream, maybe enabling
> it only for particular platforms / HDDs / with a parameter?

Yeah, maybe.  The problem is that I'm a bit reluctant to do that for
all cases as it may cause other obscure failures and we don't know
whether the problem is controller or device specific at this point,
so...

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-17  8:08             ` Tejun Heo
@ 2010-08-17  9:32               ` Stephan Diestelhorst
  2010-08-17 10:15                 ` Tejun Heo
  0 siblings, 1 reply; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-08-17  9:32 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

[-- Attachment #1: Type: text/plain, Size: 1740 bytes --]

On Tuesday 17 August 2010 10:08:53 Tejun Heo wrote:
> Hello,
> 
> On 08/17/2010 09:51 AM, Stephan Diestelhorst wrote:
> > On Thursday 05 August 2010 18:08:02 Tejun Heo wrote:
> >> Can you please try the following patch and see whether the problem
> >> goes away?
> > 
> > I've finally managed to get this to compile and test. (Hit a bug with
> > Debian's make-kpkg and other nuisances...)
> > 
> > The problem is still there. On some resumes I get the dreadful dead
> > disk again:
> > 
> > end_request: I/O error , dev sda sector ...
> > sd 0:0:0:0: [sda] Unhandled error code
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> > sd 0:0:0:0: [sda] CDB: Read(10): 28 00 0e a4 77 a8 0 00 08 00
> > (many of those)
> > 
> > Can't access /var/log/messages right now, due to broken I/O. Will try
> > to trigger it again and check for the qc timeout messages..
> 
> Yeah, it would be great to have the log.  So, it seems like the hardware
> is actually buggy then.  :-(

Indeed. Like I said, I have similar issues on another Samsung HDD in
an AMD system. I have not yet got around to try the fix there, but I
suspect it is the same thing.

I have attached the full /var/log/messages and /var/log/kern.log with
multiple suspend-to-ram runs and the last one failing.

Would it make sense to add Rafael's workaround upstream, maybe enabling
it only for particular platforms / HDDs / with a parameter?

Thanks,
  Stephan

[-- Attachment #2: messages_kern.log.tar.bz2 --]
[-- Type: application/x-bzip-compressed-tar, Size: 163097 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-17  7:51           ` Stephan Diestelhorst
@ 2010-08-17  8:08             ` Tejun Heo
  2010-08-17  9:32               ` Stephan Diestelhorst
  0 siblings, 1 reply; 46+ messages in thread
From: Tejun Heo @ 2010-08-17  8:08 UTC (permalink / raw)
  To: Stephan Diestelhorst
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

Hello,

On 08/17/2010 09:51 AM, Stephan Diestelhorst wrote:
> On Thursday 05 August 2010 18:08:02 Tejun Heo wrote:
>> Can you please try the following patch and see whether the problem
>> goes away?
> 
> I've finally managed to get this to compile and test. (Hit a bug with
> Debian's make-kpkg and other nuisances...)
> 
> The problem is still there. On some resumes I get the dreadful dead
> disk again:
> 
> end_request: I/O error , dev sda sector ...
> sd 0:0:0:0: [sda] Unhandled error code
> sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> sd 0:0:0:0: [sda] CDB: Read(10): 28 00 0e a4 77 a8 0 00 08 00
> (many of those)
> 
> Can't access /var/log/messages right now, due to broken I/O. Will try
> to trigger it again and check for the qc timeout messages..

Yeah, it would be great to have the log.  So, it seems like the hardware
is actually buggy then.  :-(

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-05 16:08         ` Tejun Heo
  2010-08-05 19:58           ` Rafael J. Wysocki
  2010-08-06  6:30           ` Stephan Diestelhorst
@ 2010-08-17  7:51           ` Stephan Diestelhorst
  2010-08-17  8:08             ` Tejun Heo
  2 siblings, 1 reply; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-08-17  7:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

Hi Tejun,

On Thursday 05 August 2010 18:08:02 Tejun Heo wrote:
> Can you please try the following patch and see whether the problem
> goes away?

I've finally managed to get this to compile and test. (Hit a bug with
Debian's make-kpkg and other nuisances...)

The problem is still there. On some resumes I get the dreadful dead
disk again:

end_request: I/O error , dev sda sector ...
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 0:0:0:0: [sda] CDB: Read(10): 28 00 0e a4 77 a8 0 00 08 00
(many of those)

Can't access /var/log/messages right now, due to broken I/O. Will try
to trigger it again and check for the qc timeout messages..
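
As an aside, the failing CDB in the log above can be decoded by hand.
The following is a minimal sketch, assuming the standard READ(10)
layout (opcode 0x28, big-endian LBA in bytes 2-5, big-endian transfer
length in bytes 7-8) and assuming the lone "0" in the pasted log line
stands for a 00 byte. The struct and function names are invented for
this example.

```c
#include <stdint.h>

#define READ_10_OPCODE 0x28

/* Hypothetical holder for the two fields of interest. */
struct read10 {
	uint32_t lba;     /* starting logical block address */
	uint16_t nblocks; /* transfer length in logical blocks */
};

/* Returns 0 on success, -1 if the CDB is not a READ(10). */
static int decode_read10(const uint8_t cdb[10], struct read10 *out)
{
	if (cdb[0] != READ_10_OPCODE)
		return -1;

	/* bytes 2..5: LBA, most significant byte first */
	out->lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
		   ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];

	/* bytes 7..8: transfer length, most significant byte first */
	out->nblocks = (uint16_t)(((unsigned int)cdb[7] << 8) | cdb[8]);
	return 0;
}
```

Fed the bytes 28 00 0e a4 77 a8 00 00 08 00, this gives LBA 0x0ea477a8
and a transfer length of 8 blocks, so each of these errors appears to
be an ordinary small read rather than anything unusual.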

Stephan


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-06  7:06             ` Tejun Heo
@ 2010-08-06  9:04               ` Stephan Diestelhorst
  0 siblings, 0 replies; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-08-06  9:04 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm,
	Stephan Diestelhorst

Hi,

On Friday 06 August 2010, 09:06:26 Tejun Heo wrote:
> On 08/06/2010 08:30 AM, Stephan Diestelhorst wrote:
> > On Thursday 05 August 2010, 18:08:02 Tejun Heo wrote:
> >> Can you please try the following patch and see whether the problem
> >> goes away?
> > <snip>
> > 
> > to which revision does the patch apply? I didn't get it to apply
> > cleanly to Linus' kernel HEAD or the 2.6.34 stable tag. Maybe I am
> > missing something, since I am a git-n00b (use Mercurial all the time)?
> 
> It applies cleanly to v2.6.35.

Arrrgh. My "great" company Exchange mail server thought it was a good
idea to mess with the white-space of the mail. That's why the patch did
not apply. Compiling and testing now, sorry.

Stephan


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-06  6:30           ` Stephan Diestelhorst
@ 2010-08-06  7:06             ` Tejun Heo
  2010-08-06  9:04               ` Stephan Diestelhorst
  0 siblings, 1 reply; 46+ messages in thread
From: Tejun Heo @ 2010-08-06  7:06 UTC (permalink / raw)
  To: Stephan Diestelhorst; +Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm

Hello,

On 08/06/2010 08:30 AM, Stephan Diestelhorst wrote:
> On Thursday 05 August 2010, 18:08:02 Tejun Heo wrote:
>> Can you please try the following patch and see whether the problem
>> goes away?
> <snip>
> 
> to which revision does the patch apply? I didn't get it to apply
> cleanly to Linus' kernel HEAD or the 2.6.34 stable tag. Maybe I am
> missing something, since I am a git-n00b (use Mercurial all the time)?

It applies cleanly to v2.6.35.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-05 16:08         ` Tejun Heo
  2010-08-05 19:58           ` Rafael J. Wysocki
@ 2010-08-06  6:30           ` Stephan Diestelhorst
  2010-08-06  7:06             ` Tejun Heo
  2010-08-17  7:51           ` Stephan Diestelhorst
  2 siblings, 1 reply; 46+ messages in thread
From: Stephan Diestelhorst @ 2010-08-06  6:30 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Rafael J. Wysocki, linux-kernel, linux-ide, linux-pm

Hi Tejun,

On Thursday 05 August 2010, 18:08:02 Tejun Heo wrote:
> Can you please try the following patch and see whether the problem
> goes away?
<snip>

to which revision does the patch apply? I didn't get it to apply
cleanly to Linus' kernel HEAD or the 2.6.34 stable tag. Maybe I am
missing something, since I am a git-n00b (use Mercurial all the time)?

Thanks,
  Stephan


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-08-05 16:08         ` Tejun Heo
@ 2010-08-05 19:58           ` Rafael J. Wysocki
  2010-08-06  6:30           ` Stephan Diestelhorst
  2010-08-17  7:51           ` Stephan Diestelhorst
  2 siblings, 0 replies; 46+ messages in thread
From: Rafael J. Wysocki @ 2010-08-05 19:58 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Stephan Diestelhorst, linux-kernel, linux-ide, linux-pm,
	stephan.diestelhorst

On Thursday, August 05, 2010, Tejun Heo wrote:
> Hello, Rafael.
> 
> Can you please try the following patch and see whether the problem
> goes away?

I'm going to LinuxCon shortly and I'm afraid I won't be able to test it until
I get back home.  However, it seems that Stephan could reproduce the issue
more easily, so perhaps he'll be able to test it earlier.

Thanks,
Rafael


>  drivers/ata/ahci.c          |    3
>  drivers/ata/ahci.h          |    1
>  drivers/ata/ahci_platform.c |    3
>  drivers/ata/ata_piix.c      |   24 +++
>  drivers/ata/libahci.c       |  161 +++++++-------------------
>  drivers/ata/libata-core.c   |  269 ++++++++++----------------------------------
>  drivers/ata/libata-eh.c     |  176 +++++++++++++++++++++++++---
>  drivers/ata/libata-pmp.c    |   49 +++++++-
>  drivers/ata/libata-scsi.c   |   74 ++++--------
>  drivers/ata/libata.h        |   12 +
>  include/linux/libata.h      |   40 +++---
>  11 files changed, 393 insertions(+), 419 deletions(-)
> 
> diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
> index f252253..cfdc22b 100644
> --- a/drivers/ata/ahci.c
> +++ b/drivers/ata/ahci.c
> @@ -1190,9 +1190,6 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  		ata_port_pbar_desc(ap, AHCI_PCI_BAR,
>  				   0x100 + ap->port_no * 0x80, "port");
> 
> -		/* set initial link pm policy */
> -		ap->pm_policy = NOT_AVAILABLE;
> -
>  		/* set enclosure management message type */
>  		if (ap->flags & ATA_FLAG_EM)
>  			ap->em_message_type = hpriv->em_msg_type;
> diff --git a/drivers/ata/ahci.h b/drivers/ata/ahci.h
> index 7113c57..6d07948 100644
> --- a/drivers/ata/ahci.h
> +++ b/drivers/ata/ahci.h
> @@ -201,7 +201,6 @@ enum {
>  	AHCI_HFLAG_MV_PATA		= (1 << 4), /* PATA port */
>  	AHCI_HFLAG_NO_MSI		= (1 << 5), /* no PCI MSI */
>  	AHCI_HFLAG_NO_PMP		= (1 << 6), /* no PMP */
> -	AHCI_HFLAG_NO_HOTPLUG		= (1 << 7), /* ignore PxSERR.DIAG.N */
>  	AHCI_HFLAG_SECT255		= (1 << 8), /* max 255 sectors */
>  	AHCI_HFLAG_YES_NCQ		= (1 << 9), /* force NCQ cap on */
>  	AHCI_HFLAG_NO_SUSPEND		= (1 << 10), /* don't suspend */
> diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> index 5e11b16..0f69afe 100644
> --- a/drivers/ata/ahci_platform.c
> +++ b/drivers/ata/ahci_platform.c
> @@ -120,9 +120,6 @@ static int __init ahci_probe(struct platform_device *pdev)
>  		ata_port_desc(ap, "mmio %pR", mem);
>  		ata_port_desc(ap, "port 0x%x", 0x100 + ap->port_no * 0x80);
> 
> -		/* set initial link pm policy */
> -		ap->pm_policy = NOT_AVAILABLE;
> -
>  		/* set enclosure management message type */
>  		if (ap->flags & ATA_FLAG_EM)
>  			ap->em_message_type = hpriv->em_msg_type;
> diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c
> index 7409f98..0df0477 100644
> --- a/drivers/ata/ata_piix.c
> +++ b/drivers/ata/ata_piix.c
> @@ -174,6 +174,8 @@ static int piix_sidpr_scr_read(struct ata_link *link,
>  			       unsigned int reg, u32 *val);
>  static int piix_sidpr_scr_write(struct ata_link *link,
>  				unsigned int reg, u32 val);
> +static int piix_sidpr_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
> +			      unsigned hints);
>  static bool piix_irq_check(struct ata_port *ap);
>  #ifdef CONFIG_PM
>  static int piix_pci_device_suspend(struct pci_dev *pdev, pm_message_t mesg);
> @@ -343,11 +345,22 @@ static struct ata_port_operations ich_pata_ops = {
>  	.set_dmamode		= ich_set_dmamode,
>  };
> 
> +static struct device_attribute *piix_sidpr_shost_attrs[] = {
> +	&dev_attr_link_power_management_policy,
> +	NULL
> +};
> +
> +static struct scsi_host_template piix_sidpr_sht = {
> +	ATA_BMDMA_SHT(DRV_NAME),
> +	.shost_attrs		= piix_sidpr_shost_attrs,
> +};
> +
>  static struct ata_port_operations piix_sidpr_sata_ops = {
>  	.inherits		= &piix_sata_ops,
>  	.hardreset		= sata_std_hardreset,
>  	.scr_read		= piix_sidpr_scr_read,
>  	.scr_write		= piix_sidpr_scr_write,
> +	.set_ipm		= piix_sidpr_set_ipm,
>  };
> 
>  static const struct piix_map_db ich5_map_db = {
> @@ -973,6 +986,12 @@ static int piix_sidpr_scr_write(struct ata_link *link,
>  	return 0;
>  }
> 
> +static int piix_sidpr_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
> +			      unsigned hints)
> +{
> +	return sata_link_scr_ipm(link, policy, false);
> +}
> +
>  static bool piix_irq_check(struct ata_port *ap)
>  {
>  	if (unlikely(!ap->ioaddr.bmdma_addr))
> @@ -1532,6 +1551,7 @@ static int __devinit piix_init_one(struct pci_dev *pdev,
>  	struct device *dev = &pdev->dev;
>  	struct ata_port_info port_info[2];
>  	const struct ata_port_info *ppi[] = { &port_info[0], &port_info[1] };
> +	struct scsi_host_template *sht = &piix_sht;
>  	unsigned long port_flags;
>  	struct ata_host *host;
>  	struct piix_host_priv *hpriv;
> @@ -1600,6 +1620,8 @@ static int __devinit piix_init_one(struct pci_dev *pdev,
>  		rc = piix_init_sidpr(host);
>  		if (rc)
>  			return rc;
> +		if (host->ports[0]->ops == &piix_sidpr_sata_ops)
> +			sht = &piix_sidpr_sht;
>  	}
> 
>  	/* apply IOCFG bit18 quirk */
> @@ -1626,7 +1648,7 @@ static int __devinit piix_init_one(struct pci_dev *pdev,
>  	host->flags |= ATA_HOST_PARALLEL_SCAN;
> 
>  	pci_set_master(pdev);
> -	return ata_pci_sff_activate_host(host, ata_bmdma_interrupt, &piix_sht);
> +	return ata_pci_sff_activate_host(host, ata_bmdma_interrupt, sht);
>  }
> 
>  static void piix_remove_one(struct pci_dev *pdev)
> diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> index 81e772a..2c5f3df 100644
> --- a/drivers/ata/libahci.c
> +++ b/drivers/ata/libahci.c
> @@ -56,9 +56,8 @@ MODULE_PARM_DESC(skip_host_reset, "skip global host reset (0=don't skip, 1=skip)
>  module_param_named(ignore_sss, ahci_ignore_sss, int, 0444);
>  MODULE_PARM_DESC(ignore_sss, "Ignore staggered spinup flag (0=don't ignore, 1=ignore)");
> 
> -static int ahci_enable_alpm(struct ata_port *ap,
> -		enum link_pm policy);
> -static void ahci_disable_alpm(struct ata_port *ap);
> +static int ahci_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
> +			unsigned hints);
>  static ssize_t ahci_led_show(struct ata_port *ap, char *buf);
>  static ssize_t ahci_led_store(struct ata_port *ap, const char *buf,
>  			      size_t size);
> @@ -172,8 +171,7 @@ struct ata_port_operations ahci_ops = {
>  	.pmp_attach		= ahci_pmp_attach,
>  	.pmp_detach		= ahci_pmp_detach,
> 
> -	.enable_pm		= ahci_enable_alpm,
> -	.disable_pm		= ahci_disable_alpm,
> +	.set_ipm		= ahci_set_ipm,
>  	.em_show		= ahci_led_show,
>  	.em_store		= ahci_led_store,
>  	.sw_activity_show	= ahci_activity_show,
> @@ -644,127 +642,59 @@ static void ahci_power_up(struct ata_port *ap)
>  	writel(cmd | PORT_CMD_ICC_ACTIVE, port_mmio + PORT_CMD);
>  }
> 
> -static void ahci_disable_alpm(struct ata_port *ap)
> +static int ahci_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
> +			unsigned int hints)
>  {
> +	struct ata_port *ap = link->ap;
>  	struct ahci_host_priv *hpriv = ap->host->private_data;
> -	void __iomem *port_mmio = ahci_port_base(ap);
> -	u32 cmd;
>  	struct ahci_port_priv *pp = ap->private_data;
> -
> -	/* IPM bits should be disabled by libata-core */
> -	/* get the existing command bits */
> -	cmd = readl(port_mmio + PORT_CMD);
> -
> -	/* disable ALPM and ASP */
> -	cmd &= ~PORT_CMD_ASP;
> -	cmd &= ~PORT_CMD_ALPE;
> -
> -	/* force the interface back to active */
> -	cmd |= PORT_CMD_ICC_ACTIVE;
> -
> -	/* write out new cmd value */
> -	writel(cmd, port_mmio + PORT_CMD);
> -	cmd = readl(port_mmio + PORT_CMD);
> -
> -	/* wait 10ms to be sure we've come out of any low power state */
> -	msleep(10);
> -
> -	/* clear out any PhyRdy stuff from interrupt status */
> -	writel(PORT_IRQ_PHYRDY, port_mmio + PORT_IRQ_STAT);
> -
> -	/* go ahead and clean out PhyRdy Change from Serror too */
> -	ahci_scr_write(&ap->link, SCR_ERROR, ((1 << 16) | (1 << 18)));
> -
> -	/*
> -	 * Clear flag to indicate that we should ignore all PhyRdy
> -	 * state changes
> -	 */
> -	hpriv->flags &= ~AHCI_HFLAG_NO_HOTPLUG;
> -
> -	/*
> -	 * Enable interrupts on Phy Ready.
> -	 */
> -	pp->intr_mask |= PORT_IRQ_PHYRDY;
> -	writel(pp->intr_mask, port_mmio + PORT_IRQ_MASK);
> -
> -	/*
> -	 * don't change the link pm policy - we can be called
> -	 * just to turn of link pm temporarily
> -	 */
> -}
> -
> -static int ahci_enable_alpm(struct ata_port *ap,
> -	enum link_pm policy)
> -{
> -	struct ahci_host_priv *hpriv = ap->host->private_data;
>  	void __iomem *port_mmio = ahci_port_base(ap);
> -	u32 cmd;
> -	struct ahci_port_priv *pp = ap->private_data;
> -	u32 asp;
> 
> -	/* Make sure the host is capable of link power management */
> -	if (!(hpriv->cap & HOST_CAP_ALPM))
> -		return -EINVAL;
> +	ata_link_printk(link, KERN_INFO, "XXX ahci_set_ipm: pol=%d hints=%x\n",
> +			policy, hints);
> 
> -	switch (policy) {
> -	case MAX_PERFORMANCE:
> -	case NOT_AVAILABLE:
> +	if (policy != ATA_IPM_MAX_POWER) {
>  		/*
> -		 * if we came here with NOT_AVAILABLE,
> -		 * it just means this is the first time we
> -		 * have tried to enable - default to max performance,
> -		 * and let the user go to lower power modes on request.
> +		 * Disable interrupts on Phy Ready. This keeps us from
> +		 * getting woken up due to spurious phy ready
> +		 * interrupts.
>  		 */
> -		ahci_disable_alpm(ap);
> -		return 0;
> -	case MIN_POWER:
> -		/* configure HBA to enter SLUMBER */
> -		asp = PORT_CMD_ASP;
> -		break;
> -	case MEDIUM_POWER:
> -		/* configure HBA to enter PARTIAL */
> -		asp = 0;
> -		break;
> -	default:
> -		return -EINVAL;
> +		pp->intr_mask &= ~PORT_IRQ_PHYRDY;
> +		writel(pp->intr_mask, port_mmio + PORT_IRQ_MASK);
> +
> +		sata_link_scr_ipm(link, policy, false);
>  	}
> 
> -	/*
> -	 * Disable interrupts on Phy Ready. This keeps us from
> -	 * getting woken up due to spurious phy ready interrupts
> -	 * TBD - Hot plug should be done via polling now, is
> -	 * that even supported?
> -	 */
> -	pp->intr_mask &= ~PORT_IRQ_PHYRDY;
> -	writel(pp->intr_mask, port_mmio + PORT_IRQ_MASK);
> +	if (hpriv->cap & HOST_CAP_ALPM) {
> +		u32 cmd = readl(port_mmio + PORT_CMD);
> 
> -	/*
> -	 * Set a flag to indicate that we should ignore all PhyRdy
> -	 * state changes since these can happen now whenever we
> -	 * change link state
> -	 */
> -	hpriv->flags |= AHCI_HFLAG_NO_HOTPLUG;
> +		if (policy == ATA_IPM_MAX_POWER || !(hints & ATA_IPM_HIPM)) {
> +			cmd &= ~(PORT_CMD_ASP | PORT_CMD_ALPE);
> +			cmd |= PORT_CMD_ICC_ACTIVE;
> 
> -	/* get the existing command bits */
> -	cmd = readl(port_mmio + PORT_CMD);
> +			writel(cmd, port_mmio + PORT_CMD);
> +			readl(port_mmio + PORT_CMD);
> 
> -	/*
> -	 * Set ASP based on Policy
> -	 */
> -	cmd |= asp;
> +			/* wait 10ms to be sure we've come out of IPM state */
> +			msleep(10);
> +		} else {
> +			cmd |= PORT_CMD_ALPE;
> +			if (policy == ATA_IPM_MIN_POWER)
> +				cmd |= PORT_CMD_ASP;
> 
> -	/*
> -	 * Setting this bit will instruct the HBA to aggressively
> -	 * enter a lower power link state when it's appropriate and
> -	 * based on the value set above for ASP
> -	 */
> -	cmd |= PORT_CMD_ALPE;
> +			/* write out new cmd value */
> +			writel(cmd, port_mmio + PORT_CMD);
> +		}
> +	}
> 
> -	/* write out new cmd value */
> -	writel(cmd, port_mmio + PORT_CMD);
> -	cmd = readl(port_mmio + PORT_CMD);
> +	if (policy == ATA_IPM_MAX_POWER) {
> +		sata_link_scr_ipm(link, policy, false);
> +
> +		/* turn PHYRDY IRQ back on */
> +		pp->intr_mask |= PORT_IRQ_PHYRDY;
> +		writel(pp->intr_mask, port_mmio + PORT_IRQ_MASK);
> +	}
> 
> -	/* IPM bits should be set by libata-core */
>  	return 0;
>  }
> 
> @@ -1662,15 +1592,10 @@ static void ahci_port_intr(struct ata_port *ap)
>  	if (unlikely(resetting))
>  		status &= ~PORT_IRQ_BAD_PMP;
> 
> -	/* If we are getting PhyRdy, this is
> -	 * just a power state change, we should
> -	 * clear out this, plus the PhyRdy/Comm
> -	 * Wake bits from Serror
> -	 */
> -	if ((hpriv->flags & AHCI_HFLAG_NO_HOTPLUG) &&
> -		(status & PORT_IRQ_PHYRDY)) {
> +	/* if IPM is enabled, PHYRDY doesn't mean anything */
> +	if (ap->link.ipm_policy > ATA_IPM_MAX_POWER) {
>  		status &= ~PORT_IRQ_PHYRDY;
> -		ahci_scr_write(&ap->link, SCR_ERROR, ((1 << 16) | (1 << 18)));
> +		ahci_scr_write(&ap->link, SCR_ERROR, SERR_PHYRDY_CHG);
>  	}
> 
>  	if (unlikely(status & PORT_IRQ_ERROR)) {
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index ddf8e48..5d1eeb1 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -91,8 +91,6 @@ const struct ata_port_operations sata_port_ops = {
>  static unsigned int ata_dev_init_params(struct ata_device *dev,
>  					u16 heads, u16 sectors);
>  static unsigned int ata_dev_set_xfermode(struct ata_device *dev);
> -static unsigned int ata_dev_set_feature(struct ata_device *dev,
> -					u8 enable, u8 feature);
>  static void ata_dev_xfermask(struct ata_device *dev);
>  static unsigned long ata_dev_blacklisted(const struct ata_device *dev);
> 
> @@ -1032,182 +1030,6 @@ static const char *sata_spd_string(unsigned int spd)
>  	return spd_str[spd - 1];
>  }
> 
> -static int ata_dev_set_dipm(struct ata_device *dev, enum link_pm policy)
> -{
> -	struct ata_link *link = dev->link;
> -	struct ata_port *ap = link->ap;
> -	u32 scontrol;
> -	unsigned int err_mask;
> -	int rc;
> -
> -	/*
> -	 * disallow DIPM for drivers which haven't set
> -	 * ATA_FLAG_IPM.  This is because when DIPM is enabled,
> -	 * phy ready will be set in the interrupt status on
> -	 * state changes, which will cause some drivers to
> -	 * think there are errors - additionally drivers will
> -	 * need to disable hot plug.
> -	 */
> -	if (!(ap->flags & ATA_FLAG_IPM) || !ata_dev_enabled(dev)) {
> -		ap->pm_policy = NOT_AVAILABLE;
> -		return -EINVAL;
> -	}
> -
> -	/*
> -	 * For DIPM, we will only enable it for the
> -	 * min_power setting.
> -	 *
> -	 * Why?  Because Disks are too stupid to know that
> -	 * If the host rejects a request to go to SLUMBER
> -	 * they should retry at PARTIAL, and instead it
> -	 * just would give up.  So, for medium_power to
> -	 * work at all, we need to only allow HIPM.
> -	 */
> -	rc = sata_scr_read(link, SCR_CONTROL, &scontrol);
> -	if (rc)
> -		return rc;
> -
> -	switch (policy) {
> -	case MIN_POWER:
> -		/* no restrictions on IPM transitions */
> -		scontrol &= ~(0x3 << 8);
> -		rc = sata_scr_write(link, SCR_CONTROL, scontrol);
> -		if (rc)
> -			return rc;
> -
> -		/* enable DIPM */
> -		if (dev->flags & ATA_DFLAG_DIPM)
> -			err_mask = ata_dev_set_feature(dev,
> -					SETFEATURES_SATA_ENABLE, SATA_DIPM);
> -		break;
> -	case MEDIUM_POWER:
> -		/* allow IPM to PARTIAL */
> -		scontrol &= ~(0x1 << 8);
> -		scontrol |= (0x2 << 8);
> -		rc = sata_scr_write(link, SCR_CONTROL, scontrol);
> -		if (rc)
> -			return rc;
> -
> -		/*
> -		 * we don't have to disable DIPM since IPM flags
> -		 * disallow transitions to SLUMBER, which effectively
> -		 * disable DIPM if it does not support PARTIAL
> -		 */
> -		break;
> -	case NOT_AVAILABLE:
> -	case MAX_PERFORMANCE:
> -		/* disable all IPM transitions */
> -		scontrol |= (0x3 << 8);
> -		rc = sata_scr_write(link, SCR_CONTROL, scontrol);
> -		if (rc)
> -			return rc;
> -
> -		/*
> -		 * we don't have to disable DIPM since IPM flags
> -		 * disallow all transitions which effectively
> -		 * disable DIPM anyway.
> -		 */
> -		break;
> -	}
> -
> -	/* FIXME: handle SET FEATURES failure */
> -	(void) err_mask;
> -
> -	return 0;
> -}
> -
> -/**
> - *	ata_dev_enable_pm - enable SATA interface power management
> - *	@dev:  device to enable power management
> - *	@policy: the link power management policy
> - *
> - *	Enable SATA Interface power management.  This will enable
> - *	Device Interface Power Management (DIPM) for min_power
> - * 	policy, and then call driver specific callbacks for
> - *	enabling Host Initiated Power management.
> - *
> - *	Locking: Caller.
> - *	Returns: -EINVAL if IPM is not supported, 0 otherwise.
> - */
> -void ata_dev_enable_pm(struct ata_device *dev, enum link_pm policy)
> -{
> -	int rc = 0;
> -	struct ata_port *ap = dev->link->ap;
> -
> -	/* set HIPM first, then DIPM */
> -	if (ap->ops->enable_pm)
> -		rc = ap->ops->enable_pm(ap, policy);
> -	if (rc)
> -		goto enable_pm_out;
> -	rc = ata_dev_set_dipm(dev, policy);
> -
> -enable_pm_out:
> -	if (rc)
> -		ap->pm_policy = MAX_PERFORMANCE;
> -	else
> -		ap->pm_policy = policy;
> -	return /* rc */;	/* hopefully we can use 'rc' eventually */
> -}
> -
> -#ifdef CONFIG_PM
> -/**
> - *	ata_dev_disable_pm - disable SATA interface power management
> - *	@dev: device to disable power management
> - *
> - *	Disable SATA Interface power management.  This will disable
> - *	Device Interface Power Management (DIPM) without changing
> - * 	policy,  call driver specific callbacks for disabling Host
> - * 	Initiated Power management.
> - *
> - *	Locking: Caller.
> - *	Returns: void
> - */
> -static void ata_dev_disable_pm(struct ata_device *dev)
> -{
> -	struct ata_port *ap = dev->link->ap;
> -
> -	ata_dev_set_dipm(dev, MAX_PERFORMANCE);
> -	if (ap->ops->disable_pm)
> -		ap->ops->disable_pm(ap);
> -}
> -#endif	/* CONFIG_PM */
> -
> -void ata_lpm_schedule(struct ata_port *ap, enum link_pm policy)
> -{
> -	ap->pm_policy = policy;
> -	ap->link.eh_info.action |= ATA_EH_LPM;
> -	ap->link.eh_info.flags |= ATA_EHI_NO_AUTOPSY;
> -	ata_port_schedule_eh(ap);
> -}
> -
> -#ifdef CONFIG_PM
> -static void ata_lpm_enable(struct ata_host *host)
> -{
> -	struct ata_link *link;
> -	struct ata_port *ap;
> -	struct ata_device *dev;
> -	int i;
> -
> -	for (i = 0; i < host->n_ports; i++) {
> -		ap = host->ports[i];
> -		ata_for_each_link(link, ap, EDGE) {
> -			ata_for_each_dev(dev, link, ALL)
> -				ata_dev_disable_pm(dev);
> -		}
> -	}
> -}
> -
> -static void ata_lpm_disable(struct ata_host *host)
> -{
> -	int i;
> -
> -	for (i = 0; i < host->n_ports; i++) {
> -		struct ata_port *ap = host->ports[i];
> -		ata_lpm_schedule(ap, ap->pm_policy);
> -	}
> -}
> -#endif	/* CONFIG_PM */
> -
>  /**
>   *	ata_dev_classify - determine device type based on ATA-spec signature
>   *	@tf: ATA taskfile register set for device to be identified
> @@ -2566,13 +2388,6 @@ int ata_dev_configure(struct ata_device *dev)
>  	if (dev->flags & ATA_DFLAG_LBA48)
>  		dev->max_sectors = ATA_MAX_SECTORS_LBA48;
> 
> -	if (!(dev->horkage & ATA_HORKAGE_IPM)) {
> -		if (ata_id_has_hipm(dev->id))
> -			dev->flags |= ATA_DFLAG_HIPM;
> -		if (ata_id_has_dipm(dev->id))
> -			dev->flags |= ATA_DFLAG_DIPM;
> -	}
> -
>  	/* Limit PATA drive on SATA cable bridge transfers to udma5,
>  	   200 sectors */
>  	if (ata_dev_knobble(dev)) {
> @@ -2593,13 +2408,6 @@ int ata_dev_configure(struct ata_device *dev)
>  		dev->max_sectors = min_t(unsigned int, ATA_MAX_SECTORS_128,
>  					 dev->max_sectors);
> 
> -	if (ata_dev_blacklisted(dev) & ATA_HORKAGE_IPM) {
> -		dev->horkage |= ATA_HORKAGE_IPM;
> -
> -		/* reset link pm_policy for this port to no pm */
> -		ap->pm_policy = MAX_PERFORMANCE;
> -	}
> -
>  	if (ap->ops->dev_config)
>  		ap->ops->dev_config(dev);
> 
> @@ -3630,7 +3438,7 @@ int ata_wait_after_reset(struct ata_link *link, unsigned long deadline,
>   *	@params: timing parameters { interval, duratinon, timeout } in msec
>   *	@deadline: deadline jiffies for the operation
>   *
> -*	Make sure SStatus of @link reaches stable state, determined by
> + *	Make sure SStatus of @link reaches stable state, determined by
>   *	holding the same value where DET is not 1 for @duration polled
>   *	every @interval, before @timeout.  Timeout constraints the
>   *	beginning of the stable state.  Because DET gets stuck at 1 on
> @@ -3761,6 +3569,65 @@ int sata_link_resume(struct ata_link *link, const unsigned long *params,
>  	return rc != -EINVAL ? rc : 0;
>  }
> 
> +int sata_link_scr_ipm(struct ata_link *link, enum ata_ipm_policy policy,
> +		      bool spm_wakeup)
> +{
> +	struct ata_eh_context *ehc = &link->eh_context;
> +	bool woken_up = false;
> +	u32 scontrol;
> +	int rc;
> +
> +	ata_link_printk(link, KERN_INFO,
> +			"XXX sata_link_scr_ipm: pol=%d spm_wakeup=%d\n",
> +			policy, spm_wakeup);
> +	rc = sata_scr_read(link, SCR_CONTROL, &scontrol);
> +	if (rc)
> +		return rc;
> +
> +	switch (policy) {
> +	case ATA_IPM_MAX_POWER:
> +		/* disable all IPM transitions */
> +		scontrol |= (0x3 << 8);
> +		/* initiate transition to active state */
> +		if (spm_wakeup) {
> +			scontrol |= (0x4 << 12);
> +			woken_up = true;
> +		}
> +		break;
> +	case ATA_IPM_MED_POWER:
> +		/* allow IPM to PARTIAL */
> +		scontrol &= ~(0x1 << 8);
> +		scontrol |= (0x2 << 8);
> +		break;
> +	case ATA_IPM_MIN_POWER:
> +		/* no restrictions on IPM transitions */
> +		scontrol &= ~(0x3 << 8);
> +		break;
> +	default:
> +		WARN_ON(1);
> +	}
> +
> +	ata_link_printk(link, KERN_INFO,
> +			"XXX sata_link_scr_ipm: updating sctl to %x\n",
> +			scontrol);
> +	rc = sata_scr_write(link, SCR_CONTROL, scontrol);
> +	if (rc)
> +		return rc;
> +
> +	/* give the link time to transit out of IPM state */
> +	if (woken_up) {
> +		msleep(10);
> +		ata_link_printk(link, KERN_INFO,
> +				"XXX sata_link_scr_ipm: sleeping 10msec\n");
> +	}
> +
> +	/* clear PHYRDY_CHG from SError */
> +	ata_link_printk(link, KERN_INFO,
> +			"XXX sata_link_scr_ipm: clearing serr\n");
> +	ehc->i.serror &= ~SERR_PHYRDY_CHG;
> +	return sata_scr_write(link, SCR_ERROR, SERR_PHYRDY_CHG);
> +}
> +
>  /**
>   *	ata_std_prereset - prepare for reset
>   *	@link: ATA link to be reset
> @@ -4570,6 +4437,7 @@ static unsigned int ata_dev_set_xfermode(struct ata_device *dev)
>  	DPRINTK("EXIT, err_mask=%x\n", err_mask);
>  	return err_mask;
>  }
> +
>  /**
>   *	ata_dev_set_feature - Issue SET FEATURES - SATA FEATURES
>   *	@dev: Device to which command will be sent
> @@ -4585,8 +4453,7 @@ static unsigned int ata_dev_set_xfermode(struct ata_device *dev)
>   *	RETURNS:
>   *	0 on success, AC_ERR_* mask otherwise.
>   */
> -static unsigned int ata_dev_set_feature(struct ata_device *dev, u8 enable,
> -					u8 feature)
> +unsigned int ata_dev_set_feature(struct ata_device *dev, u8 enable, u8 feature)
>  {
>  	struct ata_taskfile tf;
>  	unsigned int err_mask;
> @@ -5436,12 +5303,6 @@ int ata_host_suspend(struct ata_host *host, pm_message_t mesg)
>  {
>  	int rc;
> 
> -	/*
> -	 * disable link pm on all ports before requesting
> -	 * any pm activity
> -	 */
> -	ata_lpm_enable(host);
> -
>  	rc = ata_host_request_pm(host, mesg, 0, ATA_EHI_QUIET, 1);
>  	if (rc == 0)
>  		host->dev->power.power_state = mesg;
> @@ -5464,9 +5325,6 @@ void ata_host_resume(struct ata_host *host)
>  	ata_host_request_pm(host, PMSG_ON, ATA_EH_RESET,
>  			    ATA_EHI_NO_AUTOPSY | ATA_EHI_QUIET, 0);
>  	host->dev->power.power_state = PMSG_ON;
> -
> -	/* reenable link pm */
> -	ata_lpm_disable(host);
>  }
>  #endif
> 
> @@ -6025,7 +5883,7 @@ static void async_port_probe(void *data, async_cookie_t cookie)
>  		spin_lock_irqsave(ap->lock, flags);
> 
>  		ehi->probe_mask |= ATA_ALL_DEVICES;
> -		ehi->action |= ATA_EH_RESET | ATA_EH_LPM;
> +		ehi->action |= ATA_EH_RESET;
>  		ehi->flags |= ATA_EHI_NO_AUTOPSY | ATA_EHI_QUIET;
> 
>  		ap->pflags &= ~ATA_PFLAG_INITIALIZING;
> @@ -6698,6 +6556,7 @@ EXPORT_SYMBOL_GPL(sata_set_spd);
>  EXPORT_SYMBOL_GPL(ata_wait_after_reset);
>  EXPORT_SYMBOL_GPL(sata_link_debounce);
>  EXPORT_SYMBOL_GPL(sata_link_resume);
> +EXPORT_SYMBOL_GPL(sata_link_scr_ipm);
>  EXPORT_SYMBOL_GPL(ata_std_prereset);
>  EXPORT_SYMBOL_GPL(sata_link_hardreset);
>  EXPORT_SYMBOL_GPL(sata_std_hardreset);
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index f77a673..bd77d94 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -1568,14 +1568,15 @@ static void ata_eh_analyze_serror(struct ata_link *link)
>  		action |= ATA_EH_RESET;
>  	}
> 
> -	/* Determine whether a hotplug event has occurred.  Both
> +	/*
> +	 * Determine whether a hotplug event has occurred.  Both
>  	 * SError.N/X are considered hotplug events for enabled or
>  	 * host links.  For disabled PMP links, only N bit is
>  	 * considered as X bit is left at 1 for link plugging.
>  	 */
> -	hotplug_mask = 0;
> -
> -	if (!(link->flags & ATA_LFLAG_DISABLED) || ata_is_host_link(link))
> +	if (link->ipm_policy != ATA_IPM_MAX_POWER)
> +		hotplug_mask = 0;	/* hotplug doesn't work w/ IPM */
> +	else if (!(link->flags & ATA_LFLAG_DISABLED) || ata_is_host_link(link))
>  		hotplug_mask = SERR_PHYRDY_CHG | SERR_DEV_XCHG;
>  	else
>  		hotplug_mask = SERR_PHYRDY_CHG;
> @@ -2776,8 +2777,9 @@ int ata_eh_reset(struct ata_link *link, int classify,
>  	ata_eh_done(link, NULL, ATA_EH_RESET);
>  	if (slave)
>  		ata_eh_done(slave, NULL, ATA_EH_RESET);
> -	ehc->last_reset = jiffies;	/* update to completion time */
> +	ehc->last_reset = jiffies;		/* update to completion time */
>  	ehc->i.action |= ATA_EH_REVALIDATE;
> +	link->ipm_policy = ATA_IPM_UNKNOWN;	/* reset IPM state */
> 
>  	rc = 0;
>   out:
> @@ -3203,6 +3205,124 @@ static int ata_eh_maybe_retry_flush(struct ata_device *dev)
>  	return rc;
>  }
> 
> +/**
> + *	ata_eh_set_ipm - configure SATA interface power management
> + *	@link: link to configure power management
> + *	@policy: the link power management policy
> + *	@r_failed_dev: out parameter for failed device
> + *
> + *	Enable SATA Interface power management.  This will enable
> + *	Device Interface Power Management (DIPM) for min_power
> + * 	policy, and then call driver specific callbacks for
> + *	enabling Host Initiated Power management.
> + *
> + *	LOCKING:
> + *	EH context.
> + *
> + *	RETURNS:
> + *	0 on success, -errno on failure.
> + */
> +static int ata_eh_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
> +			  struct ata_device **r_failed_dev)
> +{
> +	struct ata_port *ap = ata_is_host_link(link) ? link->ap : NULL;
> +	struct ata_eh_context *ehc = &link->eh_context;
> +	struct ata_device *dev, *link_dev = NULL, *ipm_dev = NULL;
> +	unsigned int hints = ATA_IPM_EMPTY | ATA_IPM_HIPM;
> +	unsigned int err_mask;
> +	int rc;
> +
> +	/* if the link or host doesn't do IPM, noop */
> +	if ((link->flags & ATA_LFLAG_NO_IPM) || (ap && !ap->ops->set_ipm))
> +		return 0;
> +
> +	/*
> +	 * DIPM is enabled only for MIN_POWER as some devices
> +	 * misbehave when the host NACKs transition to SLUMBER.  Order
> +	 * device and link configurations such that the host always
> +	 * allows DIPM requests.
> +	 */
> +	ata_for_each_dev(dev, link, ENABLED) {
> +		bool hipm = ata_id_has_hipm(dev->id);
> +		bool dipm = ata_id_has_dipm(dev->id);
> +
> +		/* find the first enabled and IPM enabled devices */
> +		if (!link_dev)
> +			link_dev = dev;
> +
> +		if (!ipm_dev && (hipm || dipm))
> +			ipm_dev = dev;
> +
> +		hints &= ~ATA_IPM_EMPTY;
> +		if (!hipm)
> +			hints &= ~ATA_IPM_HIPM;
> +
> +		/* disable DIPM before changing link config */
> +		if (policy != ATA_IPM_MIN_POWER && dipm) {
> +			ata_dev_printk(dev, KERN_INFO, "XXX ata_eh_set_ipm: disabling DIPM\n");
> +			err_mask = ata_dev_set_feature(dev,
> +					SETFEATURES_SATA_DISABLE, SATA_DIPM);
> +			if (err_mask && err_mask != AC_ERR_DEV) {
> +				ata_dev_printk(dev, KERN_WARNING,
> +					       "error while disabling DIPM\n");
> +				rc = -EIO;
> +				goto fail;
> +			}
> +		}
> +	}
> +
> +	if (ap) {
> +		rc = ap->ops->set_ipm(link, policy, hints);
> +		if (!rc && ap->slave_link)
> +			rc = ap->ops->set_ipm(ap->slave_link, policy, hints);
> +	} else
> +		rc = sata_pmp_set_ipm(link, policy, hints);
> +
> +	/*
> +	 * Attribute link config failure to the first (IPM) enabled
> +	 * device on the link.
> +	 */
> +	if (rc) {
> +		if (rc == -EOPNOTSUPP) {
> +			link->flags |= ATA_LFLAG_NO_IPM;
> +			return 0;
> +		}
> +		dev = ipm_dev ? ipm_dev : link_dev;
> +		goto fail;
> +	}
> +
> +	/* host config updated, enable DIPM if transitioning to MIN_POWER */
> +	ata_for_each_dev(dev, link, ENABLED) {
> +		if (policy == ATA_IPM_MIN_POWER && ata_id_has_dipm(dev->id)) {
> +			ata_dev_printk(dev, KERN_INFO, "XXX ata_eh_set_ipm: enabling DIPM\n");
> +			err_mask = ata_dev_set_feature(dev,
> +					SETFEATURES_SATA_ENABLE, SATA_DIPM);
> +			if (err_mask && err_mask != AC_ERR_DEV) {
> +				ata_dev_printk(dev, KERN_WARNING,
> +					       "error while enabling DIPM\n");
> +				rc = -EIO;
> +				goto fail;
> +			}
> +		}
> +	}
> +
> +	link->ipm_policy = policy;
> +	if (ap && ap->slave_link)
> +		ap->slave_link->ipm_policy = policy;
> +	return 0;
> +
> +fail:
> +	/* if no device or the last chance for the device, disable IPM */
> +	if (!dev || ehc->tries[dev->devno] == 1) {
> +		ata_link_printk(link, KERN_WARNING,
> +				"disabling IPM on the link\n");
> +		link->flags |= ATA_LFLAG_NO_IPM;
> +	}
> +	if (r_failed_dev)
> +		*r_failed_dev = dev;
> +	return rc;
> +}
> +
>  static int ata_link_nr_enabled(struct ata_link *link)
>  {
>  	struct ata_device *dev;
> @@ -3283,6 +3403,16 @@ static int ata_eh_schedule_probe(struct ata_device *dev)
>  	ehc->saved_xfer_mode[dev->devno] = 0;
>  	ehc->saved_ncq_enabled &= ~(1 << dev->devno);
> 
> +	/* the link may be in a deep sleep, wake it up */
> +	if (link->ipm_policy > ATA_IPM_MAX_POWER) {
> +		if (ata_is_host_link(link))
> +			link->ap->ops->set_ipm(link, ATA_IPM_MAX_POWER,
> +					       ATA_IPM_EMPTY);
> +		else
> +			sata_pmp_set_ipm(link, ATA_IPM_MAX_POWER,
> +					 ATA_IPM_EMPTY);
> +	}
> +
>  	/* Record and count probe trials on the ering.  The specific
>  	 * error mask used is irrelevant.  Because a successful device
>  	 * detection clears the ering, this count accumulates only if
> @@ -3384,8 +3514,7 @@ int ata_eh_recover(struct ata_port *ap, ata_prereset_fn_t prereset,
>  {
>  	struct ata_link *link;
>  	struct ata_device *dev;
> -	int nr_failed_devs;
> -	int rc;
> +	int rc, nr_fails;
>  	unsigned long flags, deadline;
> 
>  	DPRINTK("ENTER\n");
> @@ -3426,7 +3555,6 @@ int ata_eh_recover(struct ata_port *ap, ata_prereset_fn_t prereset,
> 
>   retry:
>  	rc = 0;
> -	nr_failed_devs = 0;
> 
>  	/* if UNLOADING, finish immediately */
>  	if (ap->pflags & ATA_PFLAG_UNLOADING)
> @@ -3511,13 +3639,17 @@ int ata_eh_recover(struct ata_port *ap, ata_prereset_fn_t prereset,
>  	}
> 
>  	/* the rest */
> -	ata_for_each_link(link, ap, EDGE) {
> +	nr_fails = 0;
> +	ata_for_each_link(link, ap, PMP_FIRST) {
>  		struct ata_eh_context *ehc = &link->eh_context;
> 
> +		if (sata_pmp_attached(ap) && ata_is_host_link(link))
> +			goto config_ipm;
> +
>  		/* revalidate existing devices and attach new ones */
>  		rc = ata_eh_revalidate_and_attach(link, &dev);
>  		if (rc)
> -			goto dev_fail;
> +			goto rest_fail;
> 
>  		/* if PMP got attached, return, pmp EH will take care of it */
>  		if (link->device->class == ATA_DEV_PMP) {
> @@ -3529,7 +3661,7 @@ int ata_eh_recover(struct ata_port *ap, ata_prereset_fn_t prereset,
>  		if (ehc->i.flags & ATA_EHI_SETMODE) {
>  			rc = ata_set_mode(link, &dev);
>  			if (rc)
> -				goto dev_fail;
> +				goto rest_fail;
>  			ehc->i.flags &= ~ATA_EHI_SETMODE;
>  		}
> 
> @@ -3542,7 +3674,7 @@ int ata_eh_recover(struct ata_port *ap, ata_prereset_fn_t prereset,
>  					continue;
>  				rc = atapi_eh_clear_ua(dev);
>  				if (rc)
> -					goto dev_fail;
> +					goto rest_fail;
>  			}
>  		}
> 
> @@ -3552,21 +3684,25 @@ int ata_eh_recover(struct ata_port *ap, ata_prereset_fn_t prereset,
>  				continue;
>  			rc = ata_eh_maybe_retry_flush(dev);
>  			if (rc)
> -				goto dev_fail;
> +				goto rest_fail;
>  		}
> 
> +	config_ipm:
>  		/* configure link power saving */
> -		if (ehc->i.action & ATA_EH_LPM)
> -			ata_for_each_dev(dev, link, ALL)
> -				ata_dev_enable_pm(dev, ap->pm_policy);
> +		if (link->ipm_policy != ap->target_ipm_policy) {
> +			rc = ata_eh_set_ipm(link, ap->target_ipm_policy, &dev);
> +			if (rc)
> +				goto rest_fail;
> +		}
> 
>  		/* this link is okay now */
>  		ehc->i.flags = 0;
>  		continue;
> 
> -dev_fail:
> -		nr_failed_devs++;
> -		ata_eh_handle_dev_fail(dev, rc);
> +	rest_fail:
> +		nr_fails++;
> +		if (dev)
> +			ata_eh_handle_dev_fail(dev, rc);
> 
>  		if (ap->pflags & ATA_PFLAG_FROZEN) {
>  			/* PMP reset requires working host port.
> @@ -3578,7 +3714,7 @@ dev_fail:
>  		}
>  	}
> 
> -	if (nr_failed_devs)
> +	if (nr_fails)
>  		goto retry;
> 
>   out:
> diff --git a/drivers/ata/libata-pmp.c b/drivers/ata/libata-pmp.c
> index 224faab..06a66ca 100644
> --- a/drivers/ata/libata-pmp.c
> +++ b/drivers/ata/libata-pmp.c
> @@ -185,6 +185,27 @@ int sata_pmp_scr_write(struct ata_link *link, int reg, u32 val)
>  }
> 
>  /**
> + *	sata_pmp_set_ipm - configure IPM for a PMP link
> + *	@link: PMP link to configure IPM for
> + *	@policy: target IPM policy
> + *	@hints: IPM hints
> + *
> + *	Configure IPM for @link.  This function will contain any PMP
> + *	specific workarounds if necessary.
> + *
> + *	LOCKING:
> + *	EH context.
> + *
> + *	RETURNS:
> + *	0 on success, -errno on failure.
> + */
> +int sata_pmp_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
> +		     unsigned hints)
> +{
> +	return sata_link_scr_ipm(link, policy, true);
> +}
> +
> +/**
>   *	sata_pmp_read_gscr - read GSCR block of SATA PMP
>   *	@dev: PMP device
>   *	@gscr: buffer to read GSCR block into
> @@ -351,6 +372,9 @@ static void sata_pmp_quirks(struct ata_port *ap)
>  	if (vendor == 0x1095 && devid == 0x3726) {
>  		/* sil3726 quirks */
>  		ata_for_each_link(link, ap, EDGE) {
> +			/* link reports offline after IPM */
> +			link->flags |= ATA_LFLAG_NO_IPM;
> +
>  			/* Class code report is unreliable and SRST
>  			 * times out under certain configurations.
>  			 */
> @@ -366,6 +390,9 @@ static void sata_pmp_quirks(struct ata_port *ap)
>  	} else if (vendor == 0x1095 && devid == 0x4723) {
>  		/* sil4723 quirks */
>  		ata_for_each_link(link, ap, EDGE) {
> +			/* link reports offline after IPM */
> +			link->flags |= ATA_LFLAG_NO_IPM;
> +
>  			/* class code report is unreliable */
>  			if (link->pmp < 2)
>  				link->flags |= ATA_LFLAG_ASSUME_ATA;
> @@ -378,6 +405,9 @@ static void sata_pmp_quirks(struct ata_port *ap)
>  	} else if (vendor == 0x1095 && devid == 0x4726) {
>  		/* sil4726 quirks */
>  		ata_for_each_link(link, ap, EDGE) {
> +			/* link reports offline after IPM */
> +			link->flags |= ATA_LFLAG_NO_IPM;
> +
>  			/* Class code report is unreliable and SRST
>  			 * times out under certain configurations.
>  			 * Config device can be at port 0 or 5 and
> @@ -938,15 +968,26 @@ static int sata_pmp_eh_recover(struct ata_port *ap)
>  	if (rc)
>  		goto link_fail;
> 
> -	/* Connection status might have changed while resetting other
> -	 * links, check SATA_PMP_GSCR_ERROR before returning.
> -	 */
> -
> +	
>  	/* clear SNotification */
>  	rc = sata_scr_read(&ap->link, SCR_NOTIFICATION, &sntf);
>  	if (rc == 0)
>  		sata_scr_write(&ap->link, SCR_NOTIFICATION, sntf);
> 
> +	/*
> +	 * If IPM is active on any fan-out port, hotplug wouldn't
> +	 * work.  Return w/ PHY event notification disabled.
> +	 */
> +	ata_for_each_link(link, ap, EDGE)
> +		if (link->ipm_policy > ATA_IPM_MAX_POWER)
> +			return 0;
> +
> +	/*
> +	 * Connection status might have changed while resetting other
> +	 * links, enable notification and check SATA_PMP_GSCR_ERROR
> +	 * before returning.
> +	 */
> +
>  	/* enable notification */
>  	if (pmp_dev->flags & ATA_DFLAG_AN) {
>  		gscr[SATA_PMP_GSCR_FEAT_EN] |= SATA_PMP_FEAT_NOTIFY;
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index a54273d..8801342 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -116,73 +116,55 @@ static struct scsi_transport_template ata_scsi_transport_template = {
>  	.user_scan		= ata_scsi_user_scan,
>  };
> 
> -
> -static const struct {
> -	enum link_pm	value;
> -	const char	*name;
> -} link_pm_policy[] = {
> -	{ NOT_AVAILABLE, "max_performance" },
> -	{ MIN_POWER, "min_power" },
> -	{ MAX_PERFORMANCE, "max_performance" },
> -	{ MEDIUM_POWER, "medium_power" },
> +static const char *ata_ipm_policy_names[] = {
> +	[ATA_IPM_UNKNOWN]	= "max_performance",
> +	[ATA_IPM_MAX_POWER]	= "max_performance",
> +	[ATA_IPM_MED_POWER]	= "medium_power",
> +	[ATA_IPM_MIN_POWER]	= "min_power",
>  };
> 
> -static const char *ata_scsi_lpm_get(enum link_pm policy)
> -{
> -	int i;
> -
> -	for (i = 0; i < ARRAY_SIZE(link_pm_policy); i++)
> -		if (link_pm_policy[i].value == policy)
> -			return link_pm_policy[i].name;
> -
> -	return NULL;
> -}
> -
> -static ssize_t ata_scsi_lpm_put(struct device *dev,
> -				struct device_attribute *attr,
> -				const char *buf, size_t count)
> +static ssize_t ata_scsi_ipm_store(struct device *dev,
> +				  struct device_attribute *attr,
> +				  const char *buf, size_t count)
>  {
>  	struct Scsi_Host *shost = class_to_shost(dev);
>  	struct ata_port *ap = ata_shost_to_port(shost);
> -	enum link_pm policy = 0;
> -	int i;
> +	enum ata_ipm_policy policy;
> +	unsigned long flags;
> 
> -	/*
> -	 * we are skipping array location 0 on purpose - this
> -	 * is because a value of NOT_AVAILABLE is displayed
> -	 * to the user as max_performance, but when the user
> -	 * writes "max_performance", they actually want the
> -	 * value to match MAX_PERFORMANCE.
> -	 */
> -	for (i = 1; i < ARRAY_SIZE(link_pm_policy); i++) {
> -		const int len = strlen(link_pm_policy[i].name);
> -		if (strncmp(link_pm_policy[i].name, buf, len) == 0) {
> -			policy = link_pm_policy[i].value;
> +	/* UNKNOWN is internal state, iterate from MAX_POWER */
> +	for (policy = ATA_IPM_MAX_POWER;
> +	     policy < ARRAY_SIZE(ata_ipm_policy_names); policy++) {
> +		const char *name = ata_ipm_policy_names[policy];
> +
> +		if (strncmp(name, buf, strlen(name)) == 0)
>  			break;
> -		}
>  	}
> -	if (!policy)
> +	if (policy == ARRAY_SIZE(ata_ipm_policy_names))
>  		return -EINVAL;
> 
> -	ata_lpm_schedule(ap, policy);
> +	spin_lock_irqsave(ap->lock, flags);
> +	ap->target_ipm_policy = policy;
> +	ata_port_schedule_eh(ap);
> +	spin_unlock_irqrestore(ap->lock, flags);
> +
>  	return count;
>  }
> 
> -static ssize_t
> -ata_scsi_lpm_show(struct device *dev, struct device_attribute *attr, char *buf)
> +static ssize_t ata_scsi_ipm_show(struct device *dev,
> +				 struct device_attribute *attr, char *buf)
>  {
>  	struct Scsi_Host *shost = class_to_shost(dev);
>  	struct ata_port *ap = ata_shost_to_port(shost);
> -	const char *policy =
> -		ata_scsi_lpm_get(ap->pm_policy);
> 
> -	if (!policy)
> +	if (ap->target_ipm_policy >= ARRAY_SIZE(ata_ipm_policy_names))
>  		return -EINVAL;
> 
> -	return snprintf(buf, 23, "%s\n", policy);
> +	return snprintf(buf, PAGE_SIZE, "%s\n",
> +			ata_ipm_policy_names[ap->target_ipm_policy]);
>  }
>  DEVICE_ATTR(link_power_management_policy, S_IRUGO | S_IWUSR,
> -		ata_scsi_lpm_show, ata_scsi_lpm_put);
> +	    ata_scsi_ipm_show, ata_scsi_ipm_store);
>  EXPORT_SYMBOL_GPL(dev_attr_link_power_management_policy);
> 
>  static ssize_t ata_scsi_park_show(struct device *device,
> diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
> index 4b84ed6..2dd0dfe 100644
> --- a/drivers/ata/libata.h
> +++ b/drivers/ata/libata.h
> @@ -87,6 +87,8 @@ extern int ata_dev_revalidate(struct ata_device *dev, unsigned int new_class,
>  extern int ata_dev_configure(struct ata_device *dev);
>  extern int sata_down_spd_limit(struct ata_link *link, u32 spd_limit);
>  extern int ata_down_xfermask_limit(struct ata_device *dev, unsigned int sel);
> +extern unsigned int ata_dev_set_feature(struct ata_device *dev,
> +					u8 enable, u8 feature);
>  extern void ata_sg_clean(struct ata_queued_cmd *qc);
>  extern void ata_qc_free(struct ata_queued_cmd *qc);
>  extern void ata_qc_issue(struct ata_queued_cmd *qc);
> @@ -101,8 +103,6 @@ extern int sata_link_init_spd(struct ata_link *link);
>  extern int ata_task_ioctl(struct scsi_device *scsidev, void __user *arg);
>  extern int ata_cmd_ioctl(struct scsi_device *scsidev, void __user *arg);
>  extern struct ata_port *ata_port_alloc(struct ata_host *host);
> -extern void ata_dev_enable_pm(struct ata_device *dev, enum link_pm policy);
> -extern void ata_lpm_schedule(struct ata_port *ap, enum link_pm);
> 
>  /* libata-acpi.c */
>  #ifdef CONFIG_ATA_ACPI
> @@ -170,6 +170,8 @@ extern void ata_eh_finish(struct ata_port *ap);
>  #ifdef CONFIG_SATA_PMP
>  extern int sata_pmp_scr_read(struct ata_link *link, int reg, u32 *val);
>  extern int sata_pmp_scr_write(struct ata_link *link, int reg, u32 val);
> +extern int sata_pmp_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
> +			    unsigned hints);
>  extern int sata_pmp_attach(struct ata_device *dev);
>  #else /* CONFIG_SATA_PMP */
>  static inline int sata_pmp_scr_read(struct ata_link *link, int reg, u32 *val)
> @@ -182,6 +184,12 @@ static inline int sata_pmp_scr_write(struct ata_link *link, int reg, u32 val)
>  	return -EINVAL;
>  }
> 
> +static inline int sata_pmp_set_ipm(struct ata_link *link,
> +				   enum ata_ipm_policy policy, unsigned hints)
> +{
> +	return -EINVAL;
> +}
> +
>  static inline int sata_pmp_attach(struct ata_device *dev)
>  {
>  	return -EINVAL;
> diff --git a/include/linux/libata.h b/include/linux/libata.h
> index b85f3ff..1f90dc5 100644
> --- a/include/linux/libata.h
> +++ b/include/linux/libata.h
> @@ -172,6 +172,7 @@ enum {
>  	ATA_LFLAG_NO_RETRY	= (1 << 5), /* don't retry this link */
>  	ATA_LFLAG_DISABLED	= (1 << 6), /* link is disabled */
>  	ATA_LFLAG_SW_ACTIVITY	= (1 << 7), /* keep activity stats */
> +	ATA_LFLAG_NO_IPM	= (1 << 8), /* disable IPM on this link */
> 
>  	/* struct ata_port flags */
>  	ATA_FLAG_SLAVE_POSS	= (1 << 0), /* host supports slave dev */
> @@ -324,12 +325,11 @@ enum {
>  	ATA_EH_HARDRESET	= (1 << 2), /* meaningful only in ->prereset */
>  	ATA_EH_RESET		= ATA_EH_SOFTRESET | ATA_EH_HARDRESET,
>  	ATA_EH_ENABLE_LINK	= (1 << 3),
> -	ATA_EH_LPM		= (1 << 4),  /* link power management action */
>  	ATA_EH_PARK		= (1 << 5), /* unload heads and stop I/O */
> 
>  	ATA_EH_PERDEV_MASK	= ATA_EH_REVALIDATE | ATA_EH_PARK,
>  	ATA_EH_ALL_ACTIONS	= ATA_EH_REVALIDATE | ATA_EH_RESET |
> -				  ATA_EH_ENABLE_LINK | ATA_EH_LPM,
> +				  ATA_EH_ENABLE_LINK,
> 
>  	/* ata_eh_info->flags */
>  	ATA_EHI_HOTPLUGGED	= (1 << 0),  /* could have been hotplugged */
> @@ -376,7 +376,6 @@ enum {
>  	ATA_HORKAGE_BROKEN_HPA	= (1 << 4),	/* Broken HPA */
>  	ATA_HORKAGE_DISABLE	= (1 << 5),	/* Disable it */
>  	ATA_HORKAGE_HPA_SIZE	= (1 << 6),	/* native size off by one */
> -	ATA_HORKAGE_IPM		= (1 << 7),	/* Link PM problems */
>  	ATA_HORKAGE_IVB		= (1 << 8),	/* cbl det validity bit bugs */
>  	ATA_HORKAGE_STUCK_ERR	= (1 << 9),	/* stuck ERR on next PACKET */
>  	ATA_HORKAGE_BRIDGE_OK	= (1 << 10),	/* no bridge limits */
> @@ -463,6 +462,22 @@ enum ata_completion_errors {
>  	AC_ERR_NCQ		= (1 << 10), /* marker for offending NCQ qc */
>  };
> 
> +/*
> + * Link pm policy: If you alter this, you also need to alter
> + * libata-scsi.c (for the ascii descriptions)
> + */
> +enum ata_ipm_policy {
> +	ATA_IPM_UNKNOWN,
> +	ATA_IPM_MAX_POWER,
> +	ATA_IPM_MED_POWER,
> +	ATA_IPM_MIN_POWER,
> +};
> +
> +enum ata_ipm_hints {
> +	ATA_IPM_EMPTY		= (1 << 0), /* port empty/probing */
> +	ATA_IPM_HIPM		= (1 << 1), /* may use HIPM */
> +};
> +
>  /* forward declarations */
>  struct scsi_device;
>  struct ata_port_operations;
> @@ -477,16 +492,6 @@ typedef int (*ata_reset_fn_t)(struct ata_link *link, unsigned int *classes,
>  			      unsigned long deadline);
>  typedef void (*ata_postreset_fn_t)(struct ata_link *link, unsigned int *classes);
> 
> -/*
> - * host pm policy: If you alter this, you also need to alter libata-scsi.c
> - * (for the ascii descriptions)
> - */
> -enum link_pm {
> -	NOT_AVAILABLE,
> -	MIN_POWER,
> -	MAX_PERFORMANCE,
> -	MEDIUM_POWER,
> -};
>  extern struct device_attribute dev_attr_link_power_management_policy;
>  extern struct device_attribute dev_attr_unload_heads;
>  extern struct device_attribute dev_attr_em_message_type;
> @@ -698,6 +703,7 @@ struct ata_link {
>  	unsigned int		hw_sata_spd_limit;
>  	unsigned int		sata_spd_limit;
>  	unsigned int		sata_spd;	/* current SATA PHY speed */
> +	enum ata_ipm_policy	ipm_policy;
> 
>  	/* record runtime error info, protected by host_set lock */
>  	struct ata_eh_info	eh_info;
> @@ -764,7 +770,7 @@ struct ata_port {
> 
>  	pm_message_t		pm_mesg;
>  	int			*pm_result;
> -	enum link_pm		pm_policy;
> +	enum ata_ipm_policy	target_ipm_policy;
> 
>  	struct timer_list	fastdrain_timer;
>  	unsigned long		fastdrain_cnt;
> @@ -830,8 +836,8 @@ struct ata_port_operations {
>  	int  (*scr_write)(struct ata_link *link, unsigned int sc_reg, u32 val);
>  	void (*pmp_attach)(struct ata_port *ap);
>  	void (*pmp_detach)(struct ata_port *ap);
> -	int  (*enable_pm)(struct ata_port *ap, enum link_pm policy);
> -	void (*disable_pm)(struct ata_port *ap);
> +	int  (*set_ipm)(struct ata_link *link, enum ata_ipm_policy policy,
> +			unsigned hints);
> 
>  	/*
>  	 * Start, stop, suspend and resume
> @@ -943,6 +949,8 @@ extern int sata_link_debounce(struct ata_link *link,
>  			const unsigned long *params, unsigned long deadline);
>  extern int sata_link_resume(struct ata_link *link, const unsigned long *params,
>  			    unsigned long deadline);
> +extern int sata_link_scr_ipm(struct ata_link *link, enum ata_ipm_policy policy,
> +			     bool spm_wakeup);
>  extern int sata_link_hardreset(struct ata_link *link,
>  			const unsigned long *timing, unsigned long deadline,
>  			bool *online, int (*check_ready)(struct ata_link *));
> 
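For anyone following the quoted sata_link_scr_ipm() hunk, here is a
minimal userspace sketch of just the SControl arithmetic.  It is not
kernel code: the enum and the function name scontrol_for_policy() are
made up for illustration, and only the bit operations are taken from
the patch (IPM field in bits 9:8, SPM field in bits 15:12).

```c
#include <stdint.h>

/* Hypothetical stand-ins for the patch's enum ata_ipm_policy values. */
enum ipm_policy {
	IPM_UNKNOWN,
	IPM_MAX_POWER,
	IPM_MED_POWER,
	IPM_MIN_POWER,
};

/*
 * Return the SControl value that the switch statement in the quoted
 * sata_link_scr_ipm() would write for @policy, given the current
 * register value.  Illustrative helper only, no hardware access.
 */
static uint32_t scontrol_for_policy(uint32_t scontrol,
				    enum ipm_policy policy, int spm_wakeup)
{
	switch (policy) {
	case IPM_MAX_POWER:
		/* disable all IPM transitions */
		scontrol |= (0x3u << 8);
		/* optionally request a transition to the active state */
		if (spm_wakeup)
			scontrol |= (0x4u << 12);
		break;
	case IPM_MED_POWER:
		/* allow PARTIAL, keep SLUMBER disallowed */
		scontrol &= ~(0x1u << 8);
		scontrol |= (0x2u << 8);
		break;
	case IPM_MIN_POWER:
		/* no restrictions on IPM transitions */
		scontrol &= ~(0x3u << 8);
		break;
	default:
		/* unknown policy: leave the value untouched */
		break;
	}
	return scontrol;
}
```

Starting from 0x300, this yields 0x4300 for MAX_POWER with wakeup,
0x200 for MED_POWER and 0x000 for MIN_POWER.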
> 
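The name lookup in the quoted ata_scsi_ipm_store() hunk can be tried
outside the kernel in the same way.  The sketch below assumes nothing
beyond the table and loop shown in the patch; parse_ipm_policy(), the
IPM_NR sentinel and the -1 return value are hypothetical and stand in
for the sysfs plumbing and -EINVAL.

```c
#include <string.h>

/* Hypothetical mirror of the patch's enum, plus a count sentinel. */
enum ipm_policy {
	IPM_UNKNOWN,
	IPM_MAX_POWER,
	IPM_MED_POWER,
	IPM_MIN_POWER,
	IPM_NR,
};

static const char *const ipm_policy_names[IPM_NR] = {
	[IPM_UNKNOWN]	= "max_performance",
	[IPM_MAX_POWER]	= "max_performance",
	[IPM_MED_POWER]	= "medium_power",
	[IPM_MIN_POWER]	= "min_power",
};

/*
 * Map a user-supplied string to a policy, or return -1 if nothing
 * matches.  UNKNOWN is internal state, so the scan starts at
 * MAX_POWER, as in the patch.
 */
static int parse_ipm_policy(const char *buf)
{
	int policy;

	for (policy = IPM_MAX_POWER; policy < IPM_NR; policy++) {
		const char *name = ipm_policy_names[policy];

		/* prefix match, so a trailing newline from echo is fine */
		if (strncmp(name, buf, strlen(name)) == 0)
			return policy;
	}
	return -1;
}
```

Because the comparison only covers strlen(name) bytes, any input that
merely starts with a valid name is accepted too, which matches the
behaviour of the loop in the patch.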



* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
From: Tejun Heo @ 2010-08-05 16:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Stephan Diestelhorst, linux-kernel, linux-ide, linux-pm,
	stephan.diestelhorst

Hello, Rafael.

Can you please try the following patch and see whether the problem
goes away?

Thanks.

 drivers/ata/ahci.c          |    3
 drivers/ata/ahci.h          |    1
 drivers/ata/ahci_platform.c |    3
 drivers/ata/ata_piix.c      |   24 +++
 drivers/ata/libahci.c       |  161 +++++++-------------------
 drivers/ata/libata-core.c   |  269 ++++++++++----------------------------------
 drivers/ata/libata-eh.c     |  176 +++++++++++++++++++++++++---
 drivers/ata/libata-pmp.c    |   49 +++++++-
 drivers/ata/libata-scsi.c   |   74 ++++--------
 drivers/ata/libata.h        |   12 +
 include/linux/libata.h      |   40 +++---
 11 files changed, 393 insertions(+), 419 deletions(-)
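
For orientation while reading the diff below: the SControl handling that
the patch centralizes in sata_link_scr_ipm() comes down to a few bit
operations on the IPM field (bits 11:8) and the SPM field (bits 15:12).
What follows is a minimal userspace sketch of that switch, not kernel
code.  The enum ipm_policy and the helpers scontrol_for_policy(),
ipm_field() and check_ipm_encoding() are hypothetical names made up for
illustration, and the enum ordering is assumed; only the shift values
are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for enum ata_ipm_policy; the ordering is assumed. */
enum ipm_policy { IPM_UNKNOWN, IPM_MAX_POWER, IPM_MED_POWER, IPM_MIN_POWER };

/* Extract the IPM field (bits 11:8) from an SControl value. */
static uint32_t ipm_field(uint32_t scontrol)
{
	return (scontrol >> 8) & 0xf;
}

/*
 * Compute the SControl value for a policy, using the same bit
 * operations as the switch in sata_link_scr_ipm().  Register access,
 * the 10ms wait and the SError cleanup are deliberately left out.
 */
static uint32_t scontrol_for_policy(uint32_t scontrol, enum ipm_policy policy,
				    int spm_wakeup)
{
	switch (policy) {
	case IPM_MAX_POWER:
		/* disable all IPM transitions */
		scontrol |= (0x3u << 8);
		/* optionally ask the link to transition to active via SPM */
		if (spm_wakeup)
			scontrol |= (0x4u << 12);
		break;
	case IPM_MED_POWER:
		/* allow PARTIAL, keep SLUMBER disabled */
		scontrol &= ~(0x1u << 8);
		scontrol |= (0x2u << 8);
		break;
	case IPM_MIN_POWER:
		/* no restrictions on IPM transitions */
		scontrol &= ~(0x3u << 8);
		break;
	default:
		/* unknown policy: leave the value untouched in this sketch */
		break;
	}
	return scontrol;
}

/* Self-check of the encodings for each policy. */
void check_ipm_encoding(void)
{
	assert(ipm_field(scontrol_for_policy(0x000, IPM_MAX_POWER, 0)) == 0x3);
	assert(ipm_field(scontrol_for_policy(0x300, IPM_MED_POWER, 0)) == 0x2);
	assert(ipm_field(scontrol_for_policy(0x300, IPM_MIN_POWER, 0)) == 0x0);
	assert(scontrol_for_policy(0x000, IPM_MAX_POWER, 1) == 0x4300);
}
```

Calling check_ipm_encoding() from a small test program exercises all
three policies.  Note that the SPM wakeup value is only ORed in for the
max-power case, which matches how the PMP path passes spm_wakeup=true
while the ahci and ata_piix callers in the patch pass false.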

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index f252253..cfdc22b 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -1190,9 +1190,6 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		ata_port_pbar_desc(ap, AHCI_PCI_BAR,
 				   0x100 + ap->port_no * 0x80, "port");

-		/* set initial link pm policy */
-		ap->pm_policy = NOT_AVAILABLE;
-
 		/* set enclosure management message type */
 		if (ap->flags & ATA_FLAG_EM)
 			ap->em_message_type = hpriv->em_msg_type;
diff --git a/drivers/ata/ahci.h b/drivers/ata/ahci.h
index 7113c57..6d07948 100644
--- a/drivers/ata/ahci.h
+++ b/drivers/ata/ahci.h
@@ -201,7 +201,6 @@ enum {
 	AHCI_HFLAG_MV_PATA		= (1 << 4), /* PATA port */
 	AHCI_HFLAG_NO_MSI		= (1 << 5), /* no PCI MSI */
 	AHCI_HFLAG_NO_PMP		= (1 << 6), /* no PMP */
-	AHCI_HFLAG_NO_HOTPLUG		= (1 << 7), /* ignore PxSERR.DIAG.N */
 	AHCI_HFLAG_SECT255		= (1 << 8), /* max 255 sectors */
 	AHCI_HFLAG_YES_NCQ		= (1 << 9), /* force NCQ cap on */
 	AHCI_HFLAG_NO_SUSPEND		= (1 << 10), /* don't suspend */
diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
index 5e11b16..0f69afe 100644
--- a/drivers/ata/ahci_platform.c
+++ b/drivers/ata/ahci_platform.c
@@ -120,9 +120,6 @@ static int __init ahci_probe(struct platform_device *pdev)
 		ata_port_desc(ap, "mmio %pR", mem);
 		ata_port_desc(ap, "port 0x%x", 0x100 + ap->port_no * 0x80);

-		/* set initial link pm policy */
-		ap->pm_policy = NOT_AVAILABLE;
-
 		/* set enclosure management message type */
 		if (ap->flags & ATA_FLAG_EM)
 			ap->em_message_type = hpriv->em_msg_type;
diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c
index 7409f98..0df0477 100644
--- a/drivers/ata/ata_piix.c
+++ b/drivers/ata/ata_piix.c
@@ -174,6 +174,8 @@ static int piix_sidpr_scr_read(struct ata_link *link,
 			       unsigned int reg, u32 *val);
 static int piix_sidpr_scr_write(struct ata_link *link,
 				unsigned int reg, u32 val);
+static int piix_sidpr_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
+			      unsigned hints);
 static bool piix_irq_check(struct ata_port *ap);
 #ifdef CONFIG_PM
 static int piix_pci_device_suspend(struct pci_dev *pdev, pm_message_t mesg);
@@ -343,11 +345,22 @@ static struct ata_port_operations ich_pata_ops = {
 	.set_dmamode		= ich_set_dmamode,
 };

+static struct device_attribute *piix_sidpr_shost_attrs[] = {
+	&dev_attr_link_power_management_policy,
+	NULL
+};
+
+static struct scsi_host_template piix_sidpr_sht = {
+	ATA_BMDMA_SHT(DRV_NAME),
+	.shost_attrs		= piix_sidpr_shost_attrs,
+};
+
 static struct ata_port_operations piix_sidpr_sata_ops = {
 	.inherits		= &piix_sata_ops,
 	.hardreset		= sata_std_hardreset,
 	.scr_read		= piix_sidpr_scr_read,
 	.scr_write		= piix_sidpr_scr_write,
+	.set_ipm		= piix_sidpr_set_ipm,
 };

 static const struct piix_map_db ich5_map_db = {
@@ -973,6 +986,12 @@ static int piix_sidpr_scr_write(struct ata_link *link,
 	return 0;
 }

+static int piix_sidpr_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
+			      unsigned hints)
+{
+	return sata_link_scr_ipm(link, policy, false);
+}
+
 static bool piix_irq_check(struct ata_port *ap)
 {
 	if (unlikely(!ap->ioaddr.bmdma_addr))
@@ -1532,6 +1551,7 @@ static int __devinit piix_init_one(struct pci_dev *pdev,
 	struct device *dev = &pdev->dev;
 	struct ata_port_info port_info[2];
 	const struct ata_port_info *ppi[] = { &port_info[0], &port_info[1] };
+	struct scsi_host_template *sht = &piix_sht;
 	unsigned long port_flags;
 	struct ata_host *host;
 	struct piix_host_priv *hpriv;
@@ -1600,6 +1620,8 @@ static int __devinit piix_init_one(struct pci_dev *pdev,
 		rc = piix_init_sidpr(host);
 		if (rc)
 			return rc;
+		if (host->ports[0]->ops == &piix_sidpr_sata_ops)
+			sht = &piix_sidpr_sht;
 	}

 	/* apply IOCFG bit18 quirk */
@@ -1626,7 +1648,7 @@ static int __devinit piix_init_one(struct pci_dev *pdev,
 	host->flags |= ATA_HOST_PARALLEL_SCAN;

 	pci_set_master(pdev);
-	return ata_pci_sff_activate_host(host, ata_bmdma_interrupt, &piix_sht);
+	return ata_pci_sff_activate_host(host, ata_bmdma_interrupt, sht);
 }

 static void piix_remove_one(struct pci_dev *pdev)
diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index 81e772a..2c5f3df 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -56,9 +56,8 @@ MODULE_PARM_DESC(skip_host_reset, "skip global host reset (0=don't skip, 1=skip)
 module_param_named(ignore_sss, ahci_ignore_sss, int, 0444);
 MODULE_PARM_DESC(ignore_sss, "Ignore staggered spinup flag (0=don't ignore, 1=ignore)");

-static int ahci_enable_alpm(struct ata_port *ap,
-		enum link_pm policy);
-static void ahci_disable_alpm(struct ata_port *ap);
+static int ahci_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
+			unsigned hints);
 static ssize_t ahci_led_show(struct ata_port *ap, char *buf);
 static ssize_t ahci_led_store(struct ata_port *ap, const char *buf,
 			      size_t size);
@@ -172,8 +171,7 @@ struct ata_port_operations ahci_ops = {
 	.pmp_attach		= ahci_pmp_attach,
 	.pmp_detach		= ahci_pmp_detach,

-	.enable_pm		= ahci_enable_alpm,
-	.disable_pm		= ahci_disable_alpm,
+	.set_ipm		= ahci_set_ipm,
 	.em_show		= ahci_led_show,
 	.em_store		= ahci_led_store,
 	.sw_activity_show	= ahci_activity_show,
@@ -644,127 +642,59 @@ static void ahci_power_up(struct ata_port *ap)
 	writel(cmd | PORT_CMD_ICC_ACTIVE, port_mmio + PORT_CMD);
 }

-static void ahci_disable_alpm(struct ata_port *ap)
+static int ahci_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
+			unsigned int hints)
 {
+	struct ata_port *ap = link->ap;
 	struct ahci_host_priv *hpriv = ap->host->private_data;
-	void __iomem *port_mmio = ahci_port_base(ap);
-	u32 cmd;
 	struct ahci_port_priv *pp = ap->private_data;
-
-	/* IPM bits should be disabled by libata-core */
-	/* get the existing command bits */
-	cmd = readl(port_mmio + PORT_CMD);
-
-	/* disable ALPM and ASP */
-	cmd &= ~PORT_CMD_ASP;
-	cmd &= ~PORT_CMD_ALPE;
-
-	/* force the interface back to active */
-	cmd |= PORT_CMD_ICC_ACTIVE;
-
-	/* write out new cmd value */
-	writel(cmd, port_mmio + PORT_CMD);
-	cmd = readl(port_mmio + PORT_CMD);
-
-	/* wait 10ms to be sure we've come out of any low power state */
-	msleep(10);
-
-	/* clear out any PhyRdy stuff from interrupt status */
-	writel(PORT_IRQ_PHYRDY, port_mmio + PORT_IRQ_STAT);
-
-	/* go ahead and clean out PhyRdy Change from Serror too */
-	ahci_scr_write(&ap->link, SCR_ERROR, ((1 << 16) | (1 << 18)));
-
-	/*
-	 * Clear flag to indicate that we should ignore all PhyRdy
-	 * state changes
-	 */
-	hpriv->flags &= ~AHCI_HFLAG_NO_HOTPLUG;
-
-	/*
-	 * Enable interrupts on Phy Ready.
-	 */
-	pp->intr_mask |= PORT_IRQ_PHYRDY;
-	writel(pp->intr_mask, port_mmio + PORT_IRQ_MASK);
-
-	/*
-	 * don't change the link pm policy - we can be called
-	 * just to turn of link pm temporarily
-	 */
-}
-
-static int ahci_enable_alpm(struct ata_port *ap,
-	enum link_pm policy)
-{
-	struct ahci_host_priv *hpriv = ap->host->private_data;
 	void __iomem *port_mmio = ahci_port_base(ap);
-	u32 cmd;
-	struct ahci_port_priv *pp = ap->private_data;
-	u32 asp;

-	/* Make sure the host is capable of link power management */
-	if (!(hpriv->cap & HOST_CAP_ALPM))
-		return -EINVAL;
+	ata_link_printk(link, KERN_INFO, "XXX ahci_set_ipm: pol=%d hints=%x\n",
+			policy, hints);

-	switch (policy) {
-	case MAX_PERFORMANCE:
-	case NOT_AVAILABLE:
+	if (policy != ATA_IPM_MAX_POWER) {
 		/*
-		 * if we came here with NOT_AVAILABLE,
-		 * it just means this is the first time we
-		 * have tried to enable - default to max performance,
-		 * and let the user go to lower power modes on request.
+		 * Disable interrupts on Phy Ready. This keeps us from
+		 * getting woken up due to spurious phy ready
+		 * interrupts.
 		 */
-		ahci_disable_alpm(ap);
-		return 0;
-	case MIN_POWER:
-		/* configure HBA to enter SLUMBER */
-		asp = PORT_CMD_ASP;
-		break;
-	case MEDIUM_POWER:
-		/* configure HBA to enter PARTIAL */
-		asp = 0;
-		break;
-	default:
-		return -EINVAL;
+		pp->intr_mask &= ~PORT_IRQ_PHYRDY;
+		writel(pp->intr_mask, port_mmio + PORT_IRQ_MASK);
+
+		sata_link_scr_ipm(link, policy, false);
 	}

-	/*
-	 * Disable interrupts on Phy Ready. This keeps us from
-	 * getting woken up due to spurious phy ready interrupts
-	 * TBD - Hot plug should be done via polling now, is
-	 * that even supported?
-	 */
-	pp->intr_mask &= ~PORT_IRQ_PHYRDY;
-	writel(pp->intr_mask, port_mmio + PORT_IRQ_MASK);
+	if (hpriv->cap & HOST_CAP_ALPM) {
+		u32 cmd = readl(port_mmio + PORT_CMD);

-	/*
-	 * Set a flag to indicate that we should ignore all PhyRdy
-	 * state changes since these can happen now whenever we
-	 * change link state
-	 */
-	hpriv->flags |= AHCI_HFLAG_NO_HOTPLUG;
+		if (policy == ATA_IPM_MAX_POWER || !(hints & ATA_IPM_HIPM)) {
+			cmd &= ~(PORT_CMD_ASP | PORT_CMD_ALPE);
+			cmd |= PORT_CMD_ICC_ACTIVE;

-	/* get the existing command bits */
-	cmd = readl(port_mmio + PORT_CMD);
+			writel(cmd, port_mmio + PORT_CMD);
+			readl(port_mmio + PORT_CMD);

-	/*
-	 * Set ASP based on Policy
-	 */
-	cmd |= asp;
+			/* wait 10ms to be sure we've come out of IPM state */
+			msleep(10);
+		} else {
+			cmd |= PORT_CMD_ALPE;
+			if (policy == ATA_IPM_MIN_POWER)
+				cmd |= PORT_CMD_ASP;

-	/*
-	 * Setting this bit will instruct the HBA to aggressively
-	 * enter a lower power link state when it's appropriate and
-	 * based on the value set above for ASP
-	 */
-	cmd |= PORT_CMD_ALPE;
+			/* write out new cmd value */
+			writel(cmd, port_mmio + PORT_CMD);
+		}
+	}

-	/* write out new cmd value */
-	writel(cmd, port_mmio + PORT_CMD);
-	cmd = readl(port_mmio + PORT_CMD);
+	if (policy == ATA_IPM_MAX_POWER) {
+		sata_link_scr_ipm(link, policy, false);
+
+		/* turn PHYRDY IRQ back on */
+		pp->intr_mask |= PORT_IRQ_PHYRDY;
+		writel(pp->intr_mask, port_mmio + PORT_IRQ_MASK);
+	}

-	/* IPM bits should be set by libata-core */
 	return 0;
 }

@@ -1662,15 +1592,10 @@ static void ahci_port_intr(struct ata_port *ap)
 	if (unlikely(resetting))
 		status &= ~PORT_IRQ_BAD_PMP;

-	/* If we are getting PhyRdy, this is
-	 * just a power state change, we should
-	 * clear out this, plus the PhyRdy/Comm
-	 * Wake bits from Serror
-	 */
-	if ((hpriv->flags & AHCI_HFLAG_NO_HOTPLUG) &&
-		(status & PORT_IRQ_PHYRDY)) {
+	/* if IPM is enabled, PHYRDY doesn't mean anything */
+	if (ap->link.ipm_policy > ATA_IPM_MAX_POWER) {
 		status &= ~PORT_IRQ_PHYRDY;
-		ahci_scr_write(&ap->link, SCR_ERROR, ((1 << 16) | (1 << 18)));
+		ahci_scr_write(&ap->link, SCR_ERROR, SERR_PHYRDY_CHG);
 	}

 	if (unlikely(status & PORT_IRQ_ERROR)) {
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index ddf8e48..5d1eeb1 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -91,8 +91,6 @@ const struct ata_port_operations sata_port_ops = {
 static unsigned int ata_dev_init_params(struct ata_device *dev,
 					u16 heads, u16 sectors);
 static unsigned int ata_dev_set_xfermode(struct ata_device *dev);
-static unsigned int ata_dev_set_feature(struct ata_device *dev,
-					u8 enable, u8 feature);
 static void ata_dev_xfermask(struct ata_device *dev);
 static unsigned long ata_dev_blacklisted(const struct ata_device *dev);

@@ -1032,182 +1030,6 @@ static const char *sata_spd_string(unsigned int spd)
 	return spd_str[spd - 1];
 }

-static int ata_dev_set_dipm(struct ata_device *dev, enum link_pm policy)
-{
-	struct ata_link *link = dev->link;
-	struct ata_port *ap = link->ap;
-	u32 scontrol;
-	unsigned int err_mask;
-	int rc;
-
-	/*
-	 * disallow DIPM for drivers which haven't set
-	 * ATA_FLAG_IPM.  This is because when DIPM is enabled,
-	 * phy ready will be set in the interrupt status on
-	 * state changes, which will cause some drivers to
-	 * think there are errors - additionally drivers will
-	 * need to disable hot plug.
-	 */
-	if (!(ap->flags & ATA_FLAG_IPM) || !ata_dev_enabled(dev)) {
-		ap->pm_policy = NOT_AVAILABLE;
-		return -EINVAL;
-	}
-
-	/*
-	 * For DIPM, we will only enable it for the
-	 * min_power setting.
-	 *
-	 * Why?  Because Disks are too stupid to know that
-	 * If the host rejects a request to go to SLUMBER
-	 * they should retry at PARTIAL, and instead it
-	 * just would give up.  So, for medium_power to
-	 * work at all, we need to only allow HIPM.
-	 */
-	rc = sata_scr_read(link, SCR_CONTROL, &scontrol);
-	if (rc)
-		return rc;
-
-	switch (policy) {
-	case MIN_POWER:
-		/* no restrictions on IPM transitions */
-		scontrol &= ~(0x3 << 8);
-		rc = sata_scr_write(link, SCR_CONTROL, scontrol);
-		if (rc)
-			return rc;
-
-		/* enable DIPM */
-		if (dev->flags & ATA_DFLAG_DIPM)
-			err_mask = ata_dev_set_feature(dev,
-					SETFEATURES_SATA_ENABLE, SATA_DIPM);
-		break;
-	case MEDIUM_POWER:
-		/* allow IPM to PARTIAL */
-		scontrol &= ~(0x1 << 8);
-		scontrol |= (0x2 << 8);
-		rc = sata_scr_write(link, SCR_CONTROL, scontrol);
-		if (rc)
-			return rc;
-
-		/*
-		 * we don't have to disable DIPM since IPM flags
-		 * disallow transitions to SLUMBER, which effectively
-		 * disable DIPM if it does not support PARTIAL
-		 */
-		break;
-	case NOT_AVAILABLE:
-	case MAX_PERFORMANCE:
-		/* disable all IPM transitions */
-		scontrol |= (0x3 << 8);
-		rc = sata_scr_write(link, SCR_CONTROL, scontrol);
-		if (rc)
-			return rc;
-
-		/*
-		 * we don't have to disable DIPM since IPM flags
-		 * disallow all transitions which effectively
-		 * disable DIPM anyway.
-		 */
-		break;
-	}
-
-	/* FIXME: handle SET FEATURES failure */
-	(void) err_mask;
-
-	return 0;
-}
-
-/**
- *	ata_dev_enable_pm - enable SATA interface power management
- *	@dev:  device to enable power management
- *	@policy: the link power management policy
- *
- *	Enable SATA Interface power management.  This will enable
- *	Device Interface Power Management (DIPM) for min_power
- * 	policy, and then call driver specific callbacks for
- *	enabling Host Initiated Power management.
- *
- *	Locking: Caller.
- *	Returns: -EINVAL if IPM is not supported, 0 otherwise.
- */
-void ata_dev_enable_pm(struct ata_device *dev, enum link_pm policy)
-{
-	int rc = 0;
-	struct ata_port *ap = dev->link->ap;
-
-	/* set HIPM first, then DIPM */
-	if (ap->ops->enable_pm)
-		rc = ap->ops->enable_pm(ap, policy);
-	if (rc)
-		goto enable_pm_out;
-	rc = ata_dev_set_dipm(dev, policy);
-
-enable_pm_out:
-	if (rc)
-		ap->pm_policy = MAX_PERFORMANCE;
-	else
-		ap->pm_policy = policy;
-	return /* rc */;	/* hopefully we can use 'rc' eventually */
-}
-
-#ifdef CONFIG_PM
-/**
- *	ata_dev_disable_pm - disable SATA interface power management
- *	@dev: device to disable power management
- *
- *	Disable SATA Interface power management.  This will disable
- *	Device Interface Power Management (DIPM) without changing
- * 	policy,  call driver specific callbacks for disabling Host
- * 	Initiated Power management.
- *
- *	Locking: Caller.
- *	Returns: void
- */
-static void ata_dev_disable_pm(struct ata_device *dev)
-{
-	struct ata_port *ap = dev->link->ap;
-
-	ata_dev_set_dipm(dev, MAX_PERFORMANCE);
-	if (ap->ops->disable_pm)
-		ap->ops->disable_pm(ap);
-}
-#endif	/* CONFIG_PM */
-
-void ata_lpm_schedule(struct ata_port *ap, enum link_pm policy)
-{
-	ap->pm_policy = policy;
-	ap->link.eh_info.action |= ATA_EH_LPM;
-	ap->link.eh_info.flags |= ATA_EHI_NO_AUTOPSY;
-	ata_port_schedule_eh(ap);
-}
-
-#ifdef CONFIG_PM
-static void ata_lpm_enable(struct ata_host *host)
-{
-	struct ata_link *link;
-	struct ata_port *ap;
-	struct ata_device *dev;
-	int i;
-
-	for (i = 0; i < host->n_ports; i++) {
-		ap = host->ports[i];
-		ata_for_each_link(link, ap, EDGE) {
-			ata_for_each_dev(dev, link, ALL)
-				ata_dev_disable_pm(dev);
-		}
-	}
-}
-
-static void ata_lpm_disable(struct ata_host *host)
-{
-	int i;
-
-	for (i = 0; i < host->n_ports; i++) {
-		struct ata_port *ap = host->ports[i];
-		ata_lpm_schedule(ap, ap->pm_policy);
-	}
-}
-#endif	/* CONFIG_PM */
-
 /**
  *	ata_dev_classify - determine device type based on ATA-spec signature
  *	@tf: ATA taskfile register set for device to be identified
@@ -2566,13 +2388,6 @@ int ata_dev_configure(struct ata_device *dev)
 	if (dev->flags & ATA_DFLAG_LBA48)
 		dev->max_sectors = ATA_MAX_SECTORS_LBA48;

-	if (!(dev->horkage & ATA_HORKAGE_IPM)) {
-		if (ata_id_has_hipm(dev->id))
-			dev->flags |= ATA_DFLAG_HIPM;
-		if (ata_id_has_dipm(dev->id))
-			dev->flags |= ATA_DFLAG_DIPM;
-	}
-
 	/* Limit PATA drive on SATA cable bridge transfers to udma5,
 	   200 sectors */
 	if (ata_dev_knobble(dev)) {
@@ -2593,13 +2408,6 @@ int ata_dev_configure(struct ata_device *dev)
 		dev->max_sectors = min_t(unsigned int, ATA_MAX_SECTORS_128,
 					 dev->max_sectors);

-	if (ata_dev_blacklisted(dev) & ATA_HORKAGE_IPM) {
-		dev->horkage |= ATA_HORKAGE_IPM;
-
-		/* reset link pm_policy for this port to no pm */
-		ap->pm_policy = MAX_PERFORMANCE;
-	}
-
 	if (ap->ops->dev_config)
 		ap->ops->dev_config(dev);

@@ -3630,7 +3438,7 @@ int ata_wait_after_reset(struct ata_link *link, unsigned long deadline,
  *	@params: timing parameters { interval, duratinon, timeout } in msec
  *	@deadline: deadline jiffies for the operation
  *
-*	Make sure SStatus of @link reaches stable state, determined by
+ *	Make sure SStatus of @link reaches stable state, determined by
  *	holding the same value where DET is not 1 for @duration polled
  *	every @interval, before @timeout.  Timeout constraints the
  *	beginning of the stable state.  Because DET gets stuck at 1 on
@@ -3761,6 +3569,65 @@ int sata_link_resume(struct ata_link *link, const unsigned long *params,
 	return rc != -EINVAL ? rc : 0;
 }

+int sata_link_scr_ipm(struct ata_link *link, enum ata_ipm_policy policy,
+		      bool spm_wakeup)
+{
+	struct ata_eh_context *ehc = &link->eh_context;
+	bool woken_up = false;
+	u32 scontrol;
+	int rc;
+
+	ata_link_printk(link, KERN_INFO,
+			"XXX sata_link_scr_ipm: pol=%d spm_wakeup=%d\n",
+			policy, spm_wakeup);
+	rc = sata_scr_read(link, SCR_CONTROL, &scontrol);
+	if (rc)
+		return rc;
+
+	switch (policy) {
+	case ATA_IPM_MAX_POWER:
+		/* disable all IPM transitions */
+		scontrol |= (0x3 << 8);
+		/* initiate transition to active state */
+		if (spm_wakeup) {
+			scontrol |= (0x4 << 12);
+			woken_up = true;
+		}
+		break;
+	case ATA_IPM_MED_POWER:
+		/* allow IPM to PARTIAL */
+		scontrol &= ~(0x1 << 8);
+		scontrol |= (0x2 << 8);
+		break;
+	case ATA_IPM_MIN_POWER:
+		/* no restrictions on IPM transitions */
+		scontrol &= ~(0x3 << 8);
+		break;
+	default:
+		WARN_ON(1);
+	}
+
+	ata_link_printk(link, KERN_INFO,
+			"XXX sata_link_scr_ipm: updating sctl to %x\n",
+			scontrol);
+	rc = sata_scr_write(link, SCR_CONTROL, scontrol);
+	if (rc)
+		return rc;
+
+	/* give the link time to transit out of IPM state */
+	if (woken_up) {
+		msleep(10);
+		ata_link_printk(link, KERN_INFO,
+				"XXX sata_link_scr_ipm: sleeping 10msec\n");
+	}
+
+	/* clear PHYRDY_CHG from SError */
+	ata_link_printk(link, KERN_INFO,
+			"XXX sata_link_scr_ipm: clearing serr\n");
+	ehc->i.serror &= ~SERR_PHYRDY_CHG;
+	return sata_scr_write(link, SCR_ERROR, SERR_PHYRDY_CHG);
+}
+
 /**
  *	ata_std_prereset - prepare for reset
  *	@link: ATA link to be reset
@@ -4570,6 +4437,7 @@ static unsigned int ata_dev_set_xfermode(struct ata_device *dev)
 	DPRINTK("EXIT, err_mask=%x\n", err_mask);
 	return err_mask;
 }
+
 /**
  *	ata_dev_set_feature - Issue SET FEATURES - SATA FEATURES
  *	@dev: Device to which command will be sent
@@ -4585,8 +4453,7 @@ static unsigned int ata_dev_set_xfermode(struct ata_device *dev)
  *	RETURNS:
  *	0 on success, AC_ERR_* mask otherwise.
  */
-static unsigned int ata_dev_set_feature(struct ata_device *dev, u8 enable,
-					u8 feature)
+unsigned int ata_dev_set_feature(struct ata_device *dev, u8 enable, u8 feature)
 {
 	struct ata_taskfile tf;
 	unsigned int err_mask;
@@ -5436,12 +5303,6 @@ int ata_host_suspend(struct ata_host *host, pm_message_t mesg)
 {
 	int rc;

-	/*
-	 * disable link pm on all ports before requesting
-	 * any pm activity
-	 */
-	ata_lpm_enable(host);
-
 	rc = ata_host_request_pm(host, mesg, 0, ATA_EHI_QUIET, 1);
 	if (rc == 0)
 		host->dev->power.power_state = mesg;
@@ -5464,9 +5325,6 @@ void ata_host_resume(struct ata_host *host)
 	ata_host_request_pm(host, PMSG_ON, ATA_EH_RESET,
 			    ATA_EHI_NO_AUTOPSY | ATA_EHI_QUIET, 0);
 	host->dev->power.power_state = PMSG_ON;
-
-	/* reenable link pm */
-	ata_lpm_disable(host);
 }
 #endif

@@ -6025,7 +5883,7 @@ static void async_port_probe(void *data, async_cookie_t cookie)
 		spin_lock_irqsave(ap->lock, flags);

 		ehi->probe_mask |= ATA_ALL_DEVICES;
-		ehi->action |= ATA_EH_RESET | ATA_EH_LPM;
+		ehi->action |= ATA_EH_RESET;
 		ehi->flags |= ATA_EHI_NO_AUTOPSY | ATA_EHI_QUIET;

 		ap->pflags &= ~ATA_PFLAG_INITIALIZING;
@@ -6698,6 +6556,7 @@ EXPORT_SYMBOL_GPL(sata_set_spd);
 EXPORT_SYMBOL_GPL(ata_wait_after_reset);
 EXPORT_SYMBOL_GPL(sata_link_debounce);
 EXPORT_SYMBOL_GPL(sata_link_resume);
+EXPORT_SYMBOL_GPL(sata_link_scr_ipm);
 EXPORT_SYMBOL_GPL(ata_std_prereset);
 EXPORT_SYMBOL_GPL(sata_link_hardreset);
 EXPORT_SYMBOL_GPL(sata_std_hardreset);
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index f77a673..bd77d94 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1568,14 +1568,15 @@ static void ata_eh_analyze_serror(struct ata_link *link)
 		action |= ATA_EH_RESET;
 	}

-	/* Determine whether a hotplug event has occurred.  Both
+	/*
+	 * Determine whether a hotplug event has occurred.  Both
 	 * SError.N/X are considered hotplug events for enabled or
 	 * host links.  For disabled PMP links, only N bit is
 	 * considered as X bit is left at 1 for link plugging.
 	 */
-	hotplug_mask = 0;
-
-	if (!(link->flags & ATA_LFLAG_DISABLED) || ata_is_host_link(link))
+	if (link->ipm_policy != ATA_IPM_MAX_POWER)
+		hotplug_mask = 0;	/* hotplug doesn't work w/ IPM */
+	else if (!(link->flags & ATA_LFLAG_DISABLED) || ata_is_host_link(link))
 		hotplug_mask = SERR_PHYRDY_CHG | SERR_DEV_XCHG;
 	else
 		hotplug_mask = SERR_PHYRDY_CHG;
@@ -2776,8 +2777,9 @@ int ata_eh_reset(struct ata_link *link, int classify,
 	ata_eh_done(link, NULL, ATA_EH_RESET);
 	if (slave)
 		ata_eh_done(slave, NULL, ATA_EH_RESET);
-	ehc->last_reset = jiffies;	/* update to completion time */
+	ehc->last_reset = jiffies;		/* update to completion time */
 	ehc->i.action |= ATA_EH_REVALIDATE;
+	link->ipm_policy = ATA_IPM_UNKNOWN;	/* reset IPM state */

 	rc = 0;
  out:
@@ -3203,6 +3205,124 @@ static int ata_eh_maybe_retry_flush(struct ata_device *dev)
 	return rc;
 }

+/**
+ *	ata_eh_set_ipm - configure SATA interface power management
+ *	@link: link to configure power management
+ *	@policy: the link power management policy
+ *	@r_failed_dev: out parameter for failed device
+ *
+ *	Enable SATA Interface power management.  This will enable
+ *	Device Interface Power Management (DIPM) for min_power
+ * 	policy, and then call driver specific callbacks for
+ *	enabling Host Initiated Power management.
+ *
+ *	LOCKING:
+ *	EH context.
+ *
+ *	RETURNS:
+ *	0 on success, -errno on failure.
+ */
+static int ata_eh_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
+			  struct ata_device **r_failed_dev)
+{
+	struct ata_port *ap = ata_is_host_link(link) ? link->ap : NULL;
+	struct ata_eh_context *ehc = &link->eh_context;
+	struct ata_device *dev, *link_dev = NULL, *ipm_dev = NULL;
+	unsigned int hints = ATA_IPM_EMPTY | ATA_IPM_HIPM;
+	unsigned int err_mask;
+	int rc;
+
+	/* if the link or host doesn't do IPM, noop */
+	if ((link->flags & ATA_LFLAG_NO_IPM) || (ap && !ap->ops->set_ipm))
+		return 0;
+
+	/*
+	 * DIPM is enabled only for MIN_POWER as some devices
+	 * misbehave when the host NACKs transition to SLUMBER.  Order
+	 * device and link configurations such that the host always
+	 * allows DIPM requests.
+	 */
+	ata_for_each_dev(dev, link, ENABLED) {
+		bool hipm = ata_id_has_hipm(dev->id);
+		bool dipm = ata_id_has_dipm(dev->id);
+
+		/* find the first enabled and IPM enabled devices */
+		if (!link_dev)
+			link_dev = dev;
+
+		if (!ipm_dev && (hipm || dipm))
+			ipm_dev = dev;
+
+		hints &= ~ATA_IPM_EMPTY;
+		if (!hipm)
+			hints &= ~ATA_IPM_HIPM;
+
+		/* disable DIPM before changing link config */
+		if (policy != ATA_IPM_MIN_POWER && dipm) {
+			ata_dev_printk(dev, KERN_INFO, "XXX ata_eh_set_ipm: disabling DIPM\n");
+			err_mask = ata_dev_set_feature(dev,
+					SETFEATURES_SATA_DISABLE, SATA_DIPM);
+			if (err_mask && err_mask != AC_ERR_DEV) {
+				ata_dev_printk(dev, KERN_WARNING,
+					       "error while disabling DIPM\n");
+				rc = -EIO;
+				goto fail;
+			}
+		}
+	}
+
+	if (ap) {
+		rc = ap->ops->set_ipm(link, policy, hints);
+		if (!rc && ap->slave_link)
+			rc = ap->ops->set_ipm(ap->slave_link, policy, hints);
+	} else
+		rc = sata_pmp_set_ipm(link, policy, hints);
+
+	/*
+	 * Attribute link config failure to the first (IPM) enabled
+	 * device on the link.
+	 */
+	if (rc) {
+		if (rc == -EOPNOTSUPP) {
+			link->flags |= ATA_LFLAG_NO_IPM;
+			return 0;
+		}
+		dev = ipm_dev ? ipm_dev : link_dev;
+		goto fail;
+	}
+
+	/* host config updated, enable DIPM if transitioning to MIN_POWER */
+	ata_for_each_dev(dev, link, ENABLED) {
+		if (policy == ATA_IPM_MIN_POWER && ata_id_has_dipm(dev->id)) {
+			ata_dev_printk(dev, KERN_INFO, "XXX ata_eh_set_ipm: enabling DIPM\n");
+			err_mask = ata_dev_set_feature(dev,
+					SETFEATURES_SATA_ENABLE, SATA_DIPM);
+			if (err_mask && err_mask != AC_ERR_DEV) {
+				ata_dev_printk(dev, KERN_WARNING,
+					       "error while enabling DIPM\n");
+				rc = -EIO;
+				goto fail;
+			}
+		}
+	}
+
+	link->ipm_policy = policy;
+	if (ap && ap->slave_link)
+		ap->slave_link->ipm_policy = policy;
+	return 0;
+
+fail:
+	/* if no device or the last chance for the device, disable IPM */
+	if (!dev || ehc->tries[dev->devno] == 1) {
+		ata_link_printk(link, KERN_WARNING,
+				"disabling IPM on the link\n");
+		link->flags |= ATA_LFLAG_NO_IPM;
+	}
+	if (r_failed_dev)
+		*r_failed_dev = dev;
+	return rc;
+}
+
 static int ata_link_nr_enabled(struct ata_link *link)
 {
 	struct ata_device *dev;
@@ -3283,6 +3403,16 @@ static int ata_eh_schedule_probe(struct ata_device *dev)
 	ehc->saved_xfer_mode[dev->devno] = 0;
 	ehc->saved_ncq_enabled &= ~(1 << dev->devno);

+	/* the link may be in a deep sleep, wake it up */
+	if (link->ipm_policy > ATA_IPM_MAX_POWER) {
+		if (ata_is_host_link(link))
+			link->ap->ops->set_ipm(link, ATA_IPM_MAX_POWER,
+					       ATA_IPM_EMPTY);
+		else
+			sata_pmp_set_ipm(link, ATA_IPM_MAX_POWER,
+					 ATA_IPM_EMPTY);
+	}
+
 	/* Record and count probe trials on the ering.  The specific
 	 * error mask used is irrelevant.  Because a successful device
 	 * detection clears the ering, this count accumulates only if
@@ -3384,8 +3514,7 @@ int ata_eh_recover(struct ata_port *ap, ata_prereset_fn_t prereset,
 {
 	struct ata_link *link;
 	struct ata_device *dev;
-	int nr_failed_devs;
-	int rc;
+	int rc, nr_fails;
 	unsigned long flags, deadline;

 	DPRINTK("ENTER\n");
@@ -3426,7 +3555,6 @@ int ata_eh_recover(struct ata_port *ap, ata_prereset_fn_t prereset,

  retry:
 	rc = 0;
-	nr_failed_devs = 0;

 	/* if UNLOADING, finish immediately */
 	if (ap->pflags & ATA_PFLAG_UNLOADING)
@@ -3511,13 +3639,17 @@ int ata_eh_recover(struct ata_port *ap, ata_prereset_fn_t prereset,
 	}

 	/* the rest */
-	ata_for_each_link(link, ap, EDGE) {
+	nr_fails = 0;
+	ata_for_each_link(link, ap, PMP_FIRST) {
 		struct ata_eh_context *ehc = &link->eh_context;

+		if (sata_pmp_attached(ap) && ata_is_host_link(link))
+			goto config_ipm;
+
 		/* revalidate existing devices and attach new ones */
 		rc = ata_eh_revalidate_and_attach(link, &dev);
 		if (rc)
-			goto dev_fail;
+			goto rest_fail;

 		/* if PMP got attached, return, pmp EH will take care of it */
 		if (link->device->class == ATA_DEV_PMP) {
@@ -3529,7 +3661,7 @@ int ata_eh_recover(struct ata_port *ap, ata_prereset_fn_t prereset,
 		if (ehc->i.flags & ATA_EHI_SETMODE) {
 			rc = ata_set_mode(link, &dev);
 			if (rc)
-				goto dev_fail;
+				goto rest_fail;
 			ehc->i.flags &= ~ATA_EHI_SETMODE;
 		}

@@ -3542,7 +3674,7 @@ int ata_eh_recover(struct ata_port *ap, ata_prereset_fn_t prereset,
 					continue;
 				rc = atapi_eh_clear_ua(dev);
 				if (rc)
-					goto dev_fail;
+					goto rest_fail;
 			}
 		}

@@ -3552,21 +3684,25 @@ int ata_eh_recover(struct ata_port *ap, ata_prereset_fn_t prereset,
 				continue;
 			rc = ata_eh_maybe_retry_flush(dev);
 			if (rc)
-				goto dev_fail;
+				goto rest_fail;
 		}

+	config_ipm:
 		/* configure link power saving */
-		if (ehc->i.action & ATA_EH_LPM)
-			ata_for_each_dev(dev, link, ALL)
-				ata_dev_enable_pm(dev, ap->pm_policy);
+		if (link->ipm_policy != ap->target_ipm_policy) {
+			rc = ata_eh_set_ipm(link, ap->target_ipm_policy, &dev);
+			if (rc)
+				goto rest_fail;
+		}

 		/* this link is okay now */
 		ehc->i.flags = 0;
 		continue;

-dev_fail:
-		nr_failed_devs++;
-		ata_eh_handle_dev_fail(dev, rc);
+	rest_fail:
+		nr_fails++;
+		if (dev)
+			ata_eh_handle_dev_fail(dev, rc);

 		if (ap->pflags & ATA_PFLAG_FROZEN) {
 			/* PMP reset requires working host port.
@@ -3578,7 +3714,7 @@ dev_fail:
 		}
 	}

-	if (nr_failed_devs)
+	if (nr_fails)
 		goto retry;

  out:
diff --git a/drivers/ata/libata-pmp.c b/drivers/ata/libata-pmp.c
index 224faab..06a66ca 100644
--- a/drivers/ata/libata-pmp.c
+++ b/drivers/ata/libata-pmp.c
@@ -185,6 +185,27 @@ int sata_pmp_scr_write(struct ata_link *link, int reg, u32 val)
 }

 /**
+ *	sata_pmp_set_ipm - configure IPM for a PMP link
+ *	@link: PMP link to configure IPM for
+ *	@policy: target IPM policy
+ *	@hints: IPM hints
+ *
+ *	Configure IPM for @link.  This function will contain any PMP
+ *	specific workarounds if necessary.
+ *
+ *	LOCKING:
+ *	EH context.
+ *
+ *	RETURNS:
+ *	0 on success, -errno on failure.
+ */
+int sata_pmp_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
+		     unsigned hints)
+{
+	return sata_link_scr_ipm(link, policy, true);
+}
+
+/**
  *	sata_pmp_read_gscr - read GSCR block of SATA PMP
  *	@dev: PMP device
  *	@gscr: buffer to read GSCR block into
@@ -351,6 +372,9 @@ static void sata_pmp_quirks(struct ata_port *ap)
 	if (vendor == 0x1095 && devid == 0x3726) {
 		/* sil3726 quirks */
 		ata_for_each_link(link, ap, EDGE) {
+			/* link reports offline after IPM */
+			link->flags |= ATA_LFLAG_NO_IPM;
+
 			/* Class code report is unreliable and SRST
 			 * times out under certain configurations.
 			 */
@@ -366,6 +390,9 @@ static void sata_pmp_quirks(struct ata_port *ap)
 	} else if (vendor == 0x1095 && devid == 0x4723) {
 		/* sil4723 quirks */
 		ata_for_each_link(link, ap, EDGE) {
+			/* link reports offline after IPM */
+			link->flags |= ATA_LFLAG_NO_IPM;
+
 			/* class code report is unreliable */
 			if (link->pmp < 2)
 				link->flags |= ATA_LFLAG_ASSUME_ATA;
@@ -378,6 +405,9 @@ static void sata_pmp_quirks(struct ata_port *ap)
 	} else if (vendor == 0x1095 && devid == 0x4726) {
 		/* sil4726 quirks */
 		ata_for_each_link(link, ap, EDGE) {
+			/* link reports offline after IPM */
+			link->flags |= ATA_LFLAG_NO_IPM;
+
 			/* Class code report is unreliable and SRST
 			 * times out under certain configurations.
 			 * Config device can be at port 0 or 5 and
@@ -938,15 +968,26 @@ static int sata_pmp_eh_recover(struct ata_port *ap)
 	if (rc)
 		goto link_fail;

-	/* Connection status might have changed while resetting other
-	 * links, check SATA_PMP_GSCR_ERROR before returning.
-	 */
-
+
 	/* clear SNotification */
 	rc = sata_scr_read(&ap->link, SCR_NOTIFICATION, &sntf);
 	if (rc == 0)
 		sata_scr_write(&ap->link, SCR_NOTIFICATION, sntf);

+	/*
+	 * If IPM is active on any fan-out port, hotplug wouldn't
+	 * work.  Return w/ PHY event notification disabled.
+	 */
+	ata_for_each_link(link, ap, EDGE)
+		if (link->ipm_policy > ATA_IPM_MAX_POWER)
+			return 0;
+
+	/*
+	 * Connection status might have changed while resetting other
+	 * links, enable notification and check SATA_PMP_GSCR_ERROR
+	 * before returning.
+	 */
+
 	/* enable notification */
 	if (pmp_dev->flags & ATA_DFLAG_AN) {
 		gscr[SATA_PMP_GSCR_FEAT_EN] |= SATA_PMP_FEAT_NOTIFY;
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index a54273d..8801342 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -116,73 +116,55 @@ static struct scsi_transport_template ata_scsi_transport_template = {
 	.user_scan		= ata_scsi_user_scan,
 };

-
-static const struct {
-	enum link_pm	value;
-	const char	*name;
-} link_pm_policy[] = {
-	{ NOT_AVAILABLE, "max_performance" },
-	{ MIN_POWER, "min_power" },
-	{ MAX_PERFORMANCE, "max_performance" },
-	{ MEDIUM_POWER, "medium_power" },
+static const char *ata_ipm_policy_names[] = {
+	[ATA_IPM_UNKNOWN]	= "max_performance",
+	[ATA_IPM_MAX_POWER]	= "max_performance",
+	[ATA_IPM_MED_POWER]	= "medium_power",
+	[ATA_IPM_MIN_POWER]	= "min_power",
 };

-static const char *ata_scsi_lpm_get(enum link_pm policy)
-{
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(link_pm_policy); i++)
-		if (link_pm_policy[i].value == policy)
-			return link_pm_policy[i].name;
-
-	return NULL;
-}
-
-static ssize_t ata_scsi_lpm_put(struct device *dev,
-				struct device_attribute *attr,
-				const char *buf, size_t count)
+static ssize_t ata_scsi_ipm_store(struct device *dev,
+				  struct device_attribute *attr,
+				  const char *buf, size_t count)
 {
 	struct Scsi_Host *shost = class_to_shost(dev);
 	struct ata_port *ap = ata_shost_to_port(shost);
-	enum link_pm policy = 0;
-	int i;
+	enum ata_ipm_policy policy;
+	unsigned long flags;

-	/*
-	 * we are skipping array location 0 on purpose - this
-	 * is because a value of NOT_AVAILABLE is displayed
-	 * to the user as max_performance, but when the user
-	 * writes "max_performance", they actually want the
-	 * value to match MAX_PERFORMANCE.
-	 */
-	for (i = 1; i < ARRAY_SIZE(link_pm_policy); i++) {
-		const int len = strlen(link_pm_policy[i].name);
-		if (strncmp(link_pm_policy[i].name, buf, len) == 0) {
-			policy = link_pm_policy[i].value;
+	/* UNKNOWN is internal state, iterate from MAX_POWER */
+	for (policy = ATA_IPM_MAX_POWER;
+	     policy < ARRAY_SIZE(ata_ipm_policy_names); policy++) {
+		const char *name = ata_ipm_policy_names[policy];
+
+		if (strncmp(name, buf, strlen(name)) == 0)
 			break;
-		}
 	}
-	if (!policy)
+	if (policy == ARRAY_SIZE(ata_ipm_policy_names))
 		return -EINVAL;

-	ata_lpm_schedule(ap, policy);
+	spin_lock_irqsave(ap->lock, flags);
+	ap->target_ipm_policy = policy;
+	ata_port_schedule_eh(ap);
+	spin_unlock_irqrestore(ap->lock, flags);
+
 	return count;
 }

-static ssize_t
-ata_scsi_lpm_show(struct device *dev, struct device_attribute *attr, char *buf)
+static ssize_t ata_scsi_ipm_show(struct device *dev,
+				 struct device_attribute *attr, char *buf)
 {
 	struct Scsi_Host *shost = class_to_shost(dev);
 	struct ata_port *ap = ata_shost_to_port(shost);
-	const char *policy =
-		ata_scsi_lpm_get(ap->pm_policy);

-	if (!policy)
+	if (ap->target_ipm_policy >= ARRAY_SIZE(ata_ipm_policy_names))
 		return -EINVAL;

-	return snprintf(buf, 23, "%s\n", policy);
+	return snprintf(buf, PAGE_SIZE, "%s\n",
+			ata_ipm_policy_names[ap->target_ipm_policy]);
 }
 DEVICE_ATTR(link_power_management_policy, S_IRUGO | S_IWUSR,
-		ata_scsi_lpm_show, ata_scsi_lpm_put);
+	    ata_scsi_ipm_show, ata_scsi_ipm_store);
 EXPORT_SYMBOL_GPL(dev_attr_link_power_management_policy);

 static ssize_t ata_scsi_park_show(struct device *device,
diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
index 4b84ed6..2dd0dfe 100644
--- a/drivers/ata/libata.h
+++ b/drivers/ata/libata.h
@@ -87,6 +87,8 @@ extern int ata_dev_revalidate(struct ata_device *dev, unsigned int new_class,
 extern int ata_dev_configure(struct ata_device *dev);
 extern int sata_down_spd_limit(struct ata_link *link, u32 spd_limit);
 extern int ata_down_xfermask_limit(struct ata_device *dev, unsigned int sel);
+extern unsigned int ata_dev_set_feature(struct ata_device *dev,
+					u8 enable, u8 feature);
 extern void ata_sg_clean(struct ata_queued_cmd *qc);
 extern void ata_qc_free(struct ata_queued_cmd *qc);
 extern void ata_qc_issue(struct ata_queued_cmd *qc);
@@ -101,8 +103,6 @@ extern int sata_link_init_spd(struct ata_link *link);
 extern int ata_task_ioctl(struct scsi_device *scsidev, void __user *arg);
 extern int ata_cmd_ioctl(struct scsi_device *scsidev, void __user *arg);
 extern struct ata_port *ata_port_alloc(struct ata_host *host);
-extern void ata_dev_enable_pm(struct ata_device *dev, enum link_pm policy);
-extern void ata_lpm_schedule(struct ata_port *ap, enum link_pm);

 /* libata-acpi.c */
 #ifdef CONFIG_ATA_ACPI
@@ -170,6 +170,8 @@ extern void ata_eh_finish(struct ata_port *ap);
 #ifdef CONFIG_SATA_PMP
 extern int sata_pmp_scr_read(struct ata_link *link, int reg, u32 *val);
 extern int sata_pmp_scr_write(struct ata_link *link, int reg, u32 val);
+extern int sata_pmp_set_ipm(struct ata_link *link, enum ata_ipm_policy policy,
+			    unsigned hints);
 extern int sata_pmp_attach(struct ata_device *dev);
 #else /* CONFIG_SATA_PMP */
 static inline int sata_pmp_scr_read(struct ata_link *link, int reg, u32 *val)
@@ -182,6 +184,12 @@ static inline int sata_pmp_scr_write(struct ata_link *link, int reg, u32 val)
 	return -EINVAL;
 }

+static inline int sata_pmp_set_ipm(struct ata_link *link,
+				   enum ata_ipm_policy policy, unsigned hints)
+{
+	return -EINVAL;
+}
+
 static inline int sata_pmp_attach(struct ata_device *dev)
 {
 	return -EINVAL;
diff --git a/include/linux/libata.h b/include/linux/libata.h
index b85f3ff..1f90dc5 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -172,6 +172,7 @@ enum {
 	ATA_LFLAG_NO_RETRY	= (1 << 5), /* don't retry this link */
 	ATA_LFLAG_DISABLED	= (1 << 6), /* link is disabled */
 	ATA_LFLAG_SW_ACTIVITY	= (1 << 7), /* keep activity stats */
+	ATA_LFLAG_NO_IPM	= (1 << 8), /* disable IPM on this link */

 	/* struct ata_port flags */
 	ATA_FLAG_SLAVE_POSS	= (1 << 0), /* host supports slave dev */
@@ -324,12 +325,11 @@ enum {
 	ATA_EH_HARDRESET	= (1 << 2), /* meaningful only in ->prereset */
 	ATA_EH_RESET		= ATA_EH_SOFTRESET | ATA_EH_HARDRESET,
 	ATA_EH_ENABLE_LINK	= (1 << 3),
-	ATA_EH_LPM		= (1 << 4),  /* link power management action */
 	ATA_EH_PARK		= (1 << 5), /* unload heads and stop I/O */

 	ATA_EH_PERDEV_MASK	= ATA_EH_REVALIDATE | ATA_EH_PARK,
 	ATA_EH_ALL_ACTIONS	= ATA_EH_REVALIDATE | ATA_EH_RESET |
-				  ATA_EH_ENABLE_LINK | ATA_EH_LPM,
+				  ATA_EH_ENABLE_LINK,

 	/* ata_eh_info->flags */
 	ATA_EHI_HOTPLUGGED	= (1 << 0),  /* could have been hotplugged */
@@ -376,7 +376,6 @@ enum {
 	ATA_HORKAGE_BROKEN_HPA	= (1 << 4),	/* Broken HPA */
 	ATA_HORKAGE_DISABLE	= (1 << 5),	/* Disable it */
 	ATA_HORKAGE_HPA_SIZE	= (1 << 6),	/* native size off by one */
-	ATA_HORKAGE_IPM		= (1 << 7),	/* Link PM problems */
 	ATA_HORKAGE_IVB		= (1 << 8),	/* cbl det validity bit bugs */
 	ATA_HORKAGE_STUCK_ERR	= (1 << 9),	/* stuck ERR on next PACKET */
 	ATA_HORKAGE_BRIDGE_OK	= (1 << 10),	/* no bridge limits */
@@ -463,6 +462,22 @@ enum ata_completion_errors {
 	AC_ERR_NCQ		= (1 << 10), /* marker for offending NCQ qc */
 };

+/*
+ * Link pm policy: If you alter this, you also need to alter
+ * libata-scsi.c (for the ascii descriptions)
+ */
+enum ata_ipm_policy {
+	ATA_IPM_UNKNOWN,
+	ATA_IPM_MAX_POWER,
+	ATA_IPM_MED_POWER,
+	ATA_IPM_MIN_POWER,
+};
+
+enum ata_ipm_hints {
+	ATA_IPM_EMPTY		= (1 << 0), /* port empty/probing */
+	ATA_IPM_HIPM		= (1 << 1), /* may use HIPM */
+};
+
 /* forward declarations */
 struct scsi_device;
 struct ata_port_operations;
@@ -477,16 +492,6 @@ typedef int (*ata_reset_fn_t)(struct ata_link *link, unsigned int *classes,
 			      unsigned long deadline);
 typedef void (*ata_postreset_fn_t)(struct ata_link *link, unsigned int *classes);

-/*
- * host pm policy: If you alter this, you also need to alter libata-scsi.c
- * (for the ascii descriptions)
- */
-enum link_pm {
-	NOT_AVAILABLE,
-	MIN_POWER,
-	MAX_PERFORMANCE,
-	MEDIUM_POWER,
-};
 extern struct device_attribute dev_attr_link_power_management_policy;
 extern struct device_attribute dev_attr_unload_heads;
 extern struct device_attribute dev_attr_em_message_type;
@@ -698,6 +703,7 @@ struct ata_link {
 	unsigned int		hw_sata_spd_limit;
 	unsigned int		sata_spd_limit;
 	unsigned int		sata_spd;	/* current SATA PHY speed */
+	enum ata_ipm_policy	ipm_policy;

 	/* record runtime error info, protected by host_set lock */
 	struct ata_eh_info	eh_info;
@@ -764,7 +770,7 @@ struct ata_port {

 	pm_message_t		pm_mesg;
 	int			*pm_result;
-	enum link_pm		pm_policy;
+	enum ata_ipm_policy	target_ipm_policy;

 	struct timer_list	fastdrain_timer;
 	unsigned long		fastdrain_cnt;
@@ -830,8 +836,8 @@ struct ata_port_operations {
 	int  (*scr_write)(struct ata_link *link, unsigned int sc_reg, u32 val);
 	void (*pmp_attach)(struct ata_port *ap);
 	void (*pmp_detach)(struct ata_port *ap);
-	int  (*enable_pm)(struct ata_port *ap, enum link_pm policy);
-	void (*disable_pm)(struct ata_port *ap);
+	int  (*set_ipm)(struct ata_link *link, enum ata_ipm_policy policy,
+			unsigned hints);

 	/*
 	 * Start, stop, suspend and resume
@@ -943,6 +949,8 @@ extern int sata_link_debounce(struct ata_link *link,
 			const unsigned long *params, unsigned long deadline);
 extern int sata_link_resume(struct ata_link *link, const unsigned long *params,
 			    unsigned long deadline);
+extern int sata_link_scr_ipm(struct ata_link *link, enum ata_ipm_policy policy,
+			     bool spm_wakeup);
 extern int sata_link_hardreset(struct ata_link *link,
 			const unsigned long *timing, unsigned long deadline,
 			bool *online, int (*check_ready)(struct ata_link *));
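
For readers following the sysfs side of the patch: the loop in
ata_scsi_ipm_store() starts at ATA_IPM_MAX_POWER, so a written
"max_performance" never resolves to the internal ATA_IPM_UNKNOWN slot,
and it compares only strlen(name) bytes, so the trailing newline from
echo is ignored. Below is a minimal user-space sketch of that lookup.
The enum and name table are copied from the patch; parse_ipm_policy()
and the local ARRAY_SIZE definition are hypothetical stand-ins for
illustration, not kernel code.

```c
#include <stddef.h>
#include <string.h>

/* Copied from the patch (include/linux/libata.h). */
enum ata_ipm_policy {
	ATA_IPM_UNKNOWN,
	ATA_IPM_MAX_POWER,
	ATA_IPM_MED_POWER,
	ATA_IPM_MIN_POWER,
};

/* Local stand-in for the kernel's ARRAY_SIZE macro. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Copied from the patch (drivers/ata/libata-scsi.c). */
static const char *ata_ipm_policy_names[] = {
	[ATA_IPM_UNKNOWN]	= "max_performance",
	[ATA_IPM_MAX_POWER]	= "max_performance",
	[ATA_IPM_MED_POWER]	= "medium_power",
	[ATA_IPM_MIN_POWER]	= "min_power",
};

/*
 * Hypothetical helper: return the policy whose name is a prefix of
 * @buf, or -1 if nothing matches.  UNKNOWN is internal state, so the
 * scan starts at MAX_POWER, as in the patch.
 */
int parse_ipm_policy(const char *buf)
{
	size_t policy;

	for (policy = ATA_IPM_MAX_POWER;
	     policy < ARRAY_SIZE(ata_ipm_policy_names); policy++) {
		const char *name = ata_ipm_policy_names[policy];

		/* Prefix match: anything after the name is ignored. */
		if (strncmp(name, buf, strlen(name)) == 0)
			return (int)policy;
	}

	return -1;
}
```

Because this is a prefix match, an input such as "min_powerXYZ" is
accepted as min_power as well; the sketch keeps that behaviour rather
than tightening it.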

* Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM
  2010-07-28 21:50     ` [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM (was: Re: HDD not suspending properly / dead on resume) Rafael J. Wysocki
@ 2010-07-30 14:18       ` Tejun Heo
  2010-08-05 16:08         ` Tejun Heo
  0 siblings, 1 reply; 46+ messages in thread
From: Tejun Heo @ 2010-07-30 14:18 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Stephan Diestelhorst, linux-kernel, linux-ide, linux-pm,
	stephan.diestelhorst

Hello, Rafael.

Sorry about the delay.  There was a tiny crisis here, and link PM as a
whole seems to need a lot more work than I originally expected.  I'm
working on it now and will probably have something for you to test in
a few days.

Thanks.

-- 
tejun

Thread overview: 46+ messages
2010-08-28 20:28 [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM Siddhartha Jain
2010-08-30 12:55 ` Tejun Heo
2010-08-30 15:25   ` Siddhartha Jain
2010-08-30 22:26     ` Siddhartha Jain
2010-08-31  0:59       ` Siddhartha Jain
2010-08-31  2:03         ` Siddhartha Jain
2010-09-01 19:47           ` Siddhartha Jain
2010-09-02 10:09             ` Tejun Heo
2010-09-02 19:56               ` Siddhartha Jain
2010-09-03  9:04                 ` Tejun Heo
2010-10-28  2:11                   ` Siddhartha Jain
2010-10-29  5:49                     ` Jeff Garzik
  -- strict thread matches above, loose matches on Subject: below --
2010-07-09 15:50 HDD not suspending properly / dead on resume Stephan Diestelhorst
2010-07-10  6:50 ` Stephan Diestelhorst
2010-07-10 10:03   ` Tejun Heo
2010-07-28 21:50     ` [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM (was: Re: HDD not suspending properly / dead on resume) Rafael J. Wysocki
2010-07-30 14:18       ` [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM Tejun Heo
2010-08-05 16:08         ` Tejun Heo
2010-08-05 19:58           ` Rafael J. Wysocki
2010-08-06  6:30           ` Stephan Diestelhorst
2010-08-06  7:06             ` Tejun Heo
2010-08-06  9:04               ` Stephan Diestelhorst
2010-08-17  7:51           ` Stephan Diestelhorst
2010-08-17  8:08             ` Tejun Heo
2010-08-17  9:32               ` Stephan Diestelhorst
2010-08-17 10:15                 ` Tejun Heo
2010-08-17 10:29                   ` Stephan Diestelhorst
2010-08-17 10:51                     ` Stephan Diestelhorst
2010-08-17 15:04                       ` Tejun Heo
2010-08-17 21:28                         ` Stephan Diestelhorst
2010-08-18  6:12                           ` Tejun Heo
2010-08-19 16:23                             ` Stephan Diestelhorst
2010-08-23 12:03                               ` Tejun Heo
2010-08-23 18:58                                 ` Rafael J. Wysocki
2010-08-24  7:37                                   ` Tejun Heo
2010-08-24 20:39                                     ` Rafael J. Wysocki
2010-08-26 23:09                                       ` Rafael J. Wysocki
2010-08-26 23:46                                         ` Rafael J. Wysocki
2010-09-02  9:06                                         ` Tejun Heo
2010-08-24 16:07                                 ` Stephan Diestelhorst
2010-08-24 16:11                                   ` Stephan Diestelhorst
2010-08-26 16:15                                     ` Stephan Diestelhorst
2010-08-26 18:24                                       ` Rafael J. Wysocki
2010-08-27 23:35                                         ` Rafael J. Wysocki
2010-09-02 14:31                                           ` Stephan Diestelhorst
2010-09-02 14:31                                             ` Stephan Diestelhorst
2010-08-17 11:19                   ` Rafael J. Wysocki
2010-08-17 11:29                     ` Tejun Heo
2010-08-17 12:10                       ` Stephan Diestelhorst
2010-08-17 12:09                         ` Tejun Heo