All of lore.kernel.org
 help / color / mirror / Atom feed
* AHCI driver problem on SB700/SB800 w/ Acer Ferrari One
@ 2011-05-11 22:25 Rafael J. Wysocki
  2011-05-13 12:37 ` AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One) Michael Leun
  2011-05-13 12:37 ` Michael Leun
  0 siblings, 2 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-11 22:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, Linux PM mailing list, LKML

Hi,

There seems to be a problem on my Acer box which is related to DIPM
and switching the power source.  Namely, when I detach the AC adapter from
the machine, the disk (which is an Intel SSD) freezes for a while and
something like this appears in dmesg:

[ 2177.808120] ata1.00: exception Emask 0x0 SAct 0x7 SErr 0xd0002 action 0x6 frozen
[ 2177.808140] ata1: SError: { RecovComm PHYRdyChg CommWake 10B8B }
[ 2177.808153] ata1.00: failed command: WRITE FPDMA QUEUED
[ 2177.808171] ata1.00: cmd 61/08:00:08:ed:37/00:00:06:00:00/40 tag 0 ncq 4096 out
[ 2177.808174]          res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 2177.808192] ata1.00: status: { DRDY }
[ 2177.808201] ata1.00: failed command: WRITE FPDMA QUEUED
[ 2177.808218] ata1.00: cmd 61/08:08:48:f4:c7/00:00:05:00:00/40 tag 1 ncq 4096 out
[ 2177.808220]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 2177.808239] ata1.00: status: { DRDY }
[ 2177.808247] ata1.00: failed command: WRITE FPDMA QUEUED
[ 2177.808275] ata1.00: cmd 61/08:10:50:f4:c7/00:00:05:00:00/40 tag 2 ncq 4096 out
[ 2177.808277]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 2177.808303] ata1.00: status: { DRDY }
[ 2177.808320] ata1: hard resetting link
[ 2187.812149] ata1: softreset failed (1st FIS failed)
[ 2187.812227] ata1: hard resetting link
[ 2188.304128] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 2188.304774] ata1.00: configured for UDMA/133
[ 2188.320333] ata1.00: device reported invalid CHS sector 0
[ 2188.320361] ata1.00: device reported invalid CHS sector 0
[ 2188.320394] ata1: EH complete

Then, when the AC adapter is attached again, the SSD freezes again and this
appears in dmesg:

[178523.064145] ata1.00: qc timeout (cmd 0xef)
[178523.064228] ata1.00: failed to disable DIPM, Emask 0x4
[178523.064248] ata1: hard resetting link
[178523.556096] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[178523.556793] ata1.00: configured for UDMA/133
[178523.572051] ata1: EH complete

That seems to be reproducible 100% of the time.  Moreover, if the box is
suspended on AC power and then resume on battery power, something more serious
happens:

[178881.824160] ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0xd0002 action 0x6 frozen
[178881.824203] ata1: SError: { RecovComm PHYRdyChg CommWake 10B8B }
[178881.824223] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824249] ata1.00: cmd 61/08:00:00:d0:9e/00:00:01:00:00/40 tag 0 ncq 4096 out
[178881.824252]          res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824288] ata1.00: status: { DRDY }
[178881.824304] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824328] ata1.00: cmd 61/08:08:48:d0:9e/00:00:01:00:00/40 tag 1 ncq 4096 out
[178881.824331]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824366] ata1.00: status: { DRDY }
[178881.824381] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824405] ata1.00: cmd 61/08:10:00:d2:9e/00:00:01:00:00/40 tag 2 ncq 4096 out
[178881.824408]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824443] ata1.00: status: { DRDY }
[178881.824457] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824481] ata1.00: cmd 61/08:18:20:d3:9e/00:00:01:00:00/40 tag 3 ncq 4096 out
[178881.824484]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824519] ata1.00: status: { DRDY }
[178881.824534] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824557] ata1.00: cmd 61/10:20:90:da:9e/00:00:01:00:00/40 tag 4 ncq 8192 out
[178881.824560]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824595] ata1.00: status: { DRDY }
[178881.824610] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824633] ata1.00: cmd 61/08:28:c0:5a:6c/00:00:01:00:00/40 tag 5 ncq 4096 out
[178881.824637]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824671] ata1.00: status: { DRDY }
[178881.824710] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824731] ata1.00: cmd 61/10:30:48:5f:98/00:00:01:00:00/40 tag 6 ncq 8192 out
[178881.824734]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824760] ata1.00: status: { DRDY }
[178881.824773] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824794] ata1.00: cmd 61/08:38:e0:1a:a4/00:00:01:00:00/40 tag 7 ncq 4096 out
[178881.824797]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824823] ata1.00: status: { DRDY }
[178881.824836] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824856] ata1.00: cmd 61/08:40:90:29:ac/00:00:01:00:00/40 tag 8 ncq 4096 out
[178881.824859]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824885] ata1.00: status: { DRDY }
[178881.824898] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824919] ata1.00: cmd 61/08:48:a0:33:a4/00:00:01:00:00/40 tag 9 ncq 4096 out
[178881.824922]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824949] ata1.00: status: { DRDY }
[178881.824961] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824982] ata1.00: cmd 61/08:50:c8:51:ac/00:00:01:00:00/40 tag 10 ncq 4096 out
[178881.824985]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825012] ata1.00: status: { DRDY }
[178881.825025] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825046] ata1.00: cmd 61/10:58:00:50:96/00:00:01:00:00/40 tag 11 ncq 8192 out
[178881.825049]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825075] ata1.00: status: { DRDY }
[178881.825087] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825108] ata1.00: cmd 61/08:60:20:50:96/00:00:01:00:00/40 tag 12 ncq 4096 out
[178881.825111]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825137] ata1.00: status: { DRDY }
[178881.825149] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825170] ata1.00: cmd 61/08:68:40:50:96/00:00:01:00:00/40 tag 13 ncq 4096 out
[178881.825173]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825199] ata1.00: status: { DRDY }
[178881.825212] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825232] ata1.00: cmd 61/10:70:e0:4c:94/00:00:01:00:00/40 tag 14 ncq 8192 out
[178881.825235]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825261] ata1.00: status: { DRDY }
[178881.825273] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825294] ata1.00: cmd 61/10:78:80:55:7b/00:00:01:00:00/40 tag 15 ncq 8192 out
[178881.825297]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825323] ata1.00: status: { DRDY }
[178881.825335] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825356] ata1.00: cmd 61/08:80:68:60:64/00:00:01:00:00/40 tag 16 ncq 4096 out
[178881.825359]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825385] ata1.00: status: { DRDY }
[178881.825398] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825419] ata1.00: cmd 61/08:88:c8:5a:6c/00:00:01:00:00/40 tag 17 ncq 4096 out
[178881.825422]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825449] ata1.00: status: { DRDY }
[178881.825461] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825461] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825482] ata1.00: cmd 61/08:90:b8:b6:64/00:00:01:00:00/40 tag 18 ncq 4096 out
[178881.825485]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825512] ata1.00: status: { DRDY }
[178881.825524] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825545] ata1.00: cmd 61/10:98:a8:a2:6d/00:00:01:00:00/40 tag 19 ncq 8192 out
[178881.825548]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825574] ata1.00: status: { DRDY }
[178881.825587] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825608] ata1.00: cmd 61/08:a0:58:f6:76/00:00:01:00:00/40 tag 20 ncq 4096 out
[178881.825611]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825637] ata1.00: status: { DRDY }
[178881.825649] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825670] ata1.00: cmd 61/08:a8:e0:28:3b/00:00:06:00:00/40 tag 21 ncq 4096 out
[178881.825673]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825699] ata1.00: status: { DRDY }
[178881.825712] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825732] ata1.00: cmd 61/48:b0:c0:1e:a5/01:00:04:00:00/40 tag 22 ncq 167936 out
[178881.825735]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825761] ata1.00: status: { DRDY }
[178881.825774] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825794] ata1.00: cmd 61/08:b8:78:c1:37/00:00:06:00:00/40 tag 23 ncq 4096 out
[178881.825798]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825824] ata1.00: status: { DRDY }
[178881.825836] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825857] ata1.00: cmd 61/18:c0:98:c1:37/00:00:06:00:00/40 tag 24 ncq 12288 out
[178881.825860]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825886] ata1.00: status: { DRDY }
[178881.825899] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825919] ata1.00: cmd 61/08:c8:10:c1:37/00:00:06:00:00/40 tag 25 ncq 4096 out
[178881.825923]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825949] ata1.00: status: { DRDY }
[178881.825961] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825982] ata1.00: cmd 61/10:d0:30:c1:37/00:00:06:00:00/40 tag 26 ncq 8192 out
[178881.825985]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.826011] ata1.00: status: { DRDY }
[178881.826024] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.826045] ata1.00: cmd 61/18:d8:50:c1:37/00:00:06:00:00/40 tag 27 ncq 12288 out
[178881.826048]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.826074] ata1.00: status: { DRDY }
[178881.826086] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.826107] ata1.00: cmd 61/08:e0:d8:a2:35/00:00:06:00:00/40 tag 28 ncq 4096 out
[178881.826110]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.826137] ata1.00: status: { DRDY }
[178881.826149] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.826170] ata1.00: cmd 61/08:e8:d0:a3:35/00:00:06:00:00/40 tag 29 ncq 4096 out
[178881.826173]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.826199] ata1.00: status: { DRDY }
[178881.826211] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.826232] ata1.00: cmd 61/08:f0:50:e5:64/00:00:06:00:00/40 tag 30 ncq 4096 out
[178881.826235]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.826261] ata1.00: status: { DRDY }
[178881.826278] ata1: hard resetting link
[178891.824149] ata1: softreset failed (1st FIS failed)
[178891.824188] ata1: hard resetting link
[178892.320170] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[178892.320874] ata1.00: configured for UDMA/133
[178892.336156] ata1.00: device reported invalid CHS sector 0
[178892.336199] ata1.00: device reported invalid CHS sector 0
[178892.336218] ata1.00: device reported invalid CHS sector 0
[178892.336236] ata1.00: device reported invalid CHS sector 0
[178892.336254] ata1.00: device reported invalid CHS sector 0
[178892.336272] ata1.00: device reported invalid CHS sector 0
[178892.336290] ata1.00: device reported invalid CHS sector 0
[178892.336308] ata1.00: device reported invalid CHS sector 0
[178892.336327] ata1.00: device reported invalid CHS sector 0
[178892.336345] ata1.00: device reported invalid CHS sector 0
[178892.336363] ata1.00: device reported invalid CHS sector 0
[178892.336381] ata1.00: device reported invalid CHS sector 0
[178892.336399] ata1.00: device reported invalid CHS sector 0
[178892.336418] ata1.00: device reported invalid CHS sector 0
[178892.336436] ata1.00: device reported invalid CHS sector 0
[178892.336455] ata1.00: device reported invalid CHS sector 0
[178892.336474] ata1.00: device reported invalid CHS sector 0
[178892.336492] ata1.00: device reported invalid CHS sector 0
[178892.336512] ata1.00: device reported invalid CHS sector 0
[178892.336530] ata1.00: device reported invalid CHS sector 0
[178892.336548] ata1.00: device reported invalid CHS sector 0
[178892.336567] ata1.00: device reported invalid CHS sector 0
[178892.336586] ata1.00: device reported invalid CHS sector 0
[178892.336604] ata1.00: device reported invalid CHS sector 0
[178892.336622] ata1.00: device reported invalid CHS sector 0
[178892.336640] ata1.00: device reported invalid CHS sector 0
[178892.336658] ata1.00: device reported invalid CHS sector 0
[178892.336677] ata1.00: device reported invalid CHS sector 0
[178892.336694] ata1.00: device reported invalid CHS sector 0
[178892.336712] ata1.00: device reported invalid CHS sector 0
[178892.336731] ata1.00: device reported invalid CHS sector 0
[178892.336837] ata1: EH complete

This doesn't happen if the box is both suspended and resumed on battery
power.

I have verified that the issue described above doesn't happen with
the appended patch applied.

The hardware is:

00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [AHCI mode] (prog-if 01 [AHCI 1.0])
        Subsystem: Acer Incorporated [ALI] Device 029e
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 64
        Interrupt: pin A routed to IRQ 22
        Region 0: I/O ports at 8420 [size=8]
        Region 1: I/O ports at 8414 [size=4]
        Region 2: I/O ports at 8418 [size=8]
        Region 3: I/O ports at 8410 [size=4]
        Region 4: I/O ports at 8400 [size=16]
        Region 5: Memory at f0508000 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [60] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [70] SATA HBA v1.0 InCfgSpace
        Kernel driver in use: ahci

[    3.336087] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    3.340777] ata1.00: ATA-7: INTEL SSDSA2M080G2GC, 2CV102HD, max UDMA/133
[    3.345233] ata1.00: 156301488 sectors, multi 16: LBA48 NCQ (depth 31/32)
[    3.350070] ata1.00: configured for UDMA/133
[    3.354969] scsi 0:0:0:0: Direct-Access     ATA      INTEL SSDSA2M080 2CV1 PQ: 0 ANSI: 5

Please tell me if there's anything I can do as a permanent workaround.

Thanks,
Rafael


---
 drivers/ata/ahci.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/drivers/ata/ahci.c
===================================================================
--- linux-2.6.orig/drivers/ata/ahci.c
+++ linux-2.6/drivers/ata/ahci.c
@@ -193,7 +193,7 @@ static const struct ata_port_info ahci_p
 	[board_ahci_sb700] =	/* for SB700 and SB800 */
 	{
 		AHCI_HFLAGS	(AHCI_HFLAG_IGN_SERR_INTERNAL),
-		.flags		= AHCI_FLAG_COMMON,
+		.flags		= AHCI_FLAG_COMMON | ATA_FLAG_NO_DIPM,
 		.pio_mask	= ATA_PIO4,
 		.udma_mask	= ATA_UDMA6,
 		.port_ops	= &ahci_sb600_ops,

^ permalink raw reply	[flat|nested] 62+ messages in thread

* AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-11 22:25 AHCI driver problem on SB700/SB800 w/ Acer Ferrari One Rafael J. Wysocki
  2011-05-13 12:37 ` AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One) Michael Leun
@ 2011-05-13 12:37 ` Michael Leun
  2011-05-13 16:56   ` Rafael J. Wysocki
  2011-05-13 16:56   ` Rafael J. Wysocki
  1 sibling, 2 replies; 62+ messages in thread
From: Michael Leun @ 2011-05-13 12:37 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Rafael J. Wysocki, linux-ide, Linux PM mailing list, LKML

On Thu, 12 May 2011 00:25:34 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> There seems to be a problem on my Acer box which is related to DIPM
> and switching the power source.  Namely, when I detach the AC adapter
> from the machine, the disk (which is an Intel SSD) freezes for a
> while and something like this appears in dmesg:
[...]

Since updating to 2.6.39-rc7 (from 2.6.38.x) I have similar problems
with two notebooks using Intel ICH:

Acer PTZ1825
00:1f.2 SATA controller: Intel Corporation ICH9M/M-E SATA AHCI
Controller (rev 03) (prog-if 01 [AHCI 1.0]) Subsystem: Acer
Incorporated [ALI] Device 0300 Flags: bus master, 66MHz, medium devsel,
latency 0, IRQ 42 I/O ports at 30e8 [size=8]
        I/O ports at 30fc [size=4]
        I/O ports at 30e0 [size=8]
        I/O ports at 30f8 [size=4]
        I/O ports at 3020 [size=32]
        Memory at d4504000 (32-bit, non-prefetchable) [size=2K]
        Capabilities: [80] MSI: Enable+ Count=1/16 Maskable- 64bit-
        Capabilities: [70] Power Management version 3
        Capabilities: [a8] SATA HBA v1.0
        Capabilities: [b0] PCI Advanced Features
        Kernel driver in use: ahci


Dell Latitude E6510
00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset
6 port SATA AHCI Cont roller (rev 05) (prog-if 01 [AHCI 1.0])
        Subsystem: Dell Device 040b
        Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 41
        I/O ports at 8090 [size=8]
        I/O ports at 8080 [size=4]
        I/O ports at 8070 [size=8]
        I/O ports at 8060 [size=4]
        I/O ports at 8020 [size=32]
        Memory at e9640000 (32-bit, non-prefetchable) [size=2K]
        Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Capabilities: [70] Power Management version 3
        Capabilities: [a8] SATA HBA v1.0
        Capabilities: [b0] PCI Advanced Features
        Kernel driver in use: ahci

Patch not tried yet (do not know / not looked into yet where to change
for this chipset).

Seems to be a 2.6.39-rcX regression?


-- 
MfG,

Michael Leun


^ permalink raw reply	[flat|nested] 62+ messages in thread

* AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-11 22:25 AHCI driver problem on SB700/SB800 w/ Acer Ferrari One Rafael J. Wysocki
@ 2011-05-13 12:37 ` Michael Leun
  2011-05-13 12:37 ` Michael Leun
  1 sibling, 0 replies; 62+ messages in thread
From: Michael Leun @ 2011-05-13 12:37 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, Linux PM mailing list, LKML

On Thu, 12 May 2011 00:25:34 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> There seems to be a problem on my Acer box which is related to DIPM
> and switching the power source.  Namely, when I detach the AC adapter
> from the machine, the disk (which is an Intel SSD) freezes for a
> while and something like this appears in dmesg:
[...]

Since updating to 2.6.39-rc7 (from 2.6.38.x) I have similar problems
with two notebooks using Intel ICH:

Acer PTZ1825
00:1f.2 SATA controller: Intel Corporation ICH9M/M-E SATA AHCI
Controller (rev 03) (prog-if 01 [AHCI 1.0]) Subsystem: Acer
Incorporated [ALI] Device 0300 Flags: bus master, 66MHz, medium devsel,
latency 0, IRQ 42 I/O ports at 30e8 [size=8]
        I/O ports at 30fc [size=4]
        I/O ports at 30e0 [size=8]
        I/O ports at 30f8 [size=4]
        I/O ports at 3020 [size=32]
        Memory at d4504000 (32-bit, non-prefetchable) [size=2K]
        Capabilities: [80] MSI: Enable+ Count=1/16 Maskable- 64bit-
        Capabilities: [70] Power Management version 3
        Capabilities: [a8] SATA HBA v1.0
        Capabilities: [b0] PCI Advanced Features
        Kernel driver in use: ahci


Dell Latitude E6510
00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset
6 port SATA AHCI Cont roller (rev 05) (prog-if 01 [AHCI 1.0])
        Subsystem: Dell Device 040b
        Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 41
        I/O ports at 8090 [size=8]
        I/O ports at 8080 [size=4]
        I/O ports at 8070 [size=8]
        I/O ports at 8060 [size=4]
        I/O ports at 8020 [size=32]
        Memory at e9640000 (32-bit, non-prefetchable) [size=2K]
        Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Capabilities: [70] Power Management version 3
        Capabilities: [a8] SATA HBA v1.0
        Capabilities: [b0] PCI Advanced Features
        Kernel driver in use: ahci

Patch not tried yet (do not know / not looked into yet where to change
for this chipset).

Seems to be a 2.6.39-rcX regression?


-- 
MfG,

Michael Leun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 12:37 ` Michael Leun
  2011-05-13 16:56   ` Rafael J. Wysocki
@ 2011-05-13 16:56   ` Rafael J. Wysocki
  2011-05-13 17:39     ` Tejun Heo
  2011-05-13 17:39     ` Tejun Heo
  1 sibling, 2 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-13 16:56 UTC (permalink / raw)
  To: Michael Leun; +Cc: Tejun Heo, linux-ide, Linux PM mailing list, LKML

On Friday, May 13, 2011, Michael Leun wrote:
> On Thu, 12 May 2011 00:25:34 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > There seems to be a problem on my Acer box which is related to DIPM
> > and switching the power source.  Namely, when I detach the AC adapter
> > from the machine, the disk (which is an Intel SSD) freezes for a
> > while and something like this appears in dmesg:
> [...]
> 
> Since updating to 2.6.39-rc7 (from 2.6.38.x) I have similar problems
> with two notebooks using Intel ICH:
> 
> Acer PTZ1825
> 00:1f.2 SATA controller: Intel Corporation ICH9M/M-E SATA AHCI
> Controller (rev 03) (prog-if 01 [AHCI 1.0]) Subsystem: Acer
> Incorporated [ALI] Device 0300 Flags: bus master, 66MHz, medium devsel,
> latency 0, IRQ 42 I/O ports at 30e8 [size=8]
>         I/O ports at 30fc [size=4]
>         I/O ports at 30e0 [size=8]
>         I/O ports at 30f8 [size=4]
>         I/O ports at 3020 [size=32]
>         Memory at d4504000 (32-bit, non-prefetchable) [size=2K]
>         Capabilities: [80] MSI: Enable+ Count=1/16 Maskable- 64bit-
>         Capabilities: [70] Power Management version 3
>         Capabilities: [a8] SATA HBA v1.0
>         Capabilities: [b0] PCI Advanced Features
>         Kernel driver in use: ahci
> 
> 
> Dell Latitude E6510
> 00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset
> 6 port SATA AHCI Cont roller (rev 05) (prog-if 01 [AHCI 1.0])
>         Subsystem: Dell Device 040b
>         Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 41
>         I/O ports at 8090 [size=8]
>         I/O ports at 8080 [size=4]
>         I/O ports at 8070 [size=8]
>         I/O ports at 8060 [size=4]
>         I/O ports at 8020 [size=32]
>         Memory at e9640000 (32-bit, non-prefetchable) [size=2K]
>         Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
>         Capabilities: [70] Power Management version 3
>         Capabilities: [a8] SATA HBA v1.0
>         Capabilities: [b0] PCI Advanced Features
>         Kernel driver in use: ahci
> 
> Patch not tried yet (do not know / not looked into yet where to change
> for this chipset).
> 
> Seems to be a 2.6.39-rcX regression?

Well, that very well may be the case.

Can you verify that it didn't happen with 2.6.38 and earlier?

Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 12:37 ` Michael Leun
@ 2011-05-13 16:56   ` Rafael J. Wysocki
  2011-05-13 16:56   ` Rafael J. Wysocki
  1 sibling, 0 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-13 16:56 UTC (permalink / raw)
  To: Michael Leun; +Cc: Tejun Heo, linux-ide, Linux PM mailing list, LKML

On Friday, May 13, 2011, Michael Leun wrote:
> On Thu, 12 May 2011 00:25:34 +0200
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > There seems to be a problem on my Acer box which is related to DIPM
> > and switching the power source.  Namely, when I detach the AC adapter
> > from the machine, the disk (which is an Intel SSD) freezes for a
> > while and something like this appears in dmesg:
> [...]
> 
> Since updating to 2.6.39-rc7 (from 2.6.38.x) I have similar problems
> with two notebooks using Intel ICH:
> 
> Acer PTZ1825
> 00:1f.2 SATA controller: Intel Corporation ICH9M/M-E SATA AHCI
> Controller (rev 03) (prog-if 01 [AHCI 1.0]) Subsystem: Acer
> Incorporated [ALI] Device 0300 Flags: bus master, 66MHz, medium devsel,
> latency 0, IRQ 42 I/O ports at 30e8 [size=8]
>         I/O ports at 30fc [size=4]
>         I/O ports at 30e0 [size=8]
>         I/O ports at 30f8 [size=4]
>         I/O ports at 3020 [size=32]
>         Memory at d4504000 (32-bit, non-prefetchable) [size=2K]
>         Capabilities: [80] MSI: Enable+ Count=1/16 Maskable- 64bit-
>         Capabilities: [70] Power Management version 3
>         Capabilities: [a8] SATA HBA v1.0
>         Capabilities: [b0] PCI Advanced Features
>         Kernel driver in use: ahci
> 
> 
> Dell Latitude E6510
> 00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset
> 6 port SATA AHCI Cont roller (rev 05) (prog-if 01 [AHCI 1.0])
>         Subsystem: Dell Device 040b
>         Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 41
>         I/O ports at 8090 [size=8]
>         I/O ports at 8080 [size=4]
>         I/O ports at 8070 [size=8]
>         I/O ports at 8060 [size=4]
>         I/O ports at 8020 [size=32]
>         Memory at e9640000 (32-bit, non-prefetchable) [size=2K]
>         Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
>         Capabilities: [70] Power Management version 3
>         Capabilities: [a8] SATA HBA v1.0
>         Capabilities: [b0] PCI Advanced Features
>         Kernel driver in use: ahci
> 
> Patch not tried yet (do not know / not looked into yet where to change
> for this chipset).
> 
> Seems to be a 2.6.39-rcX regression?

Well, that very well may be the case.

Can you verify that it didn't happen with 2.6.38 and earlier?

Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 16:56   ` Rafael J. Wysocki
@ 2011-05-13 17:39     ` Tejun Heo
  2011-05-13 19:54       ` Michael Leun
                         ` (3 more replies)
  2011-05-13 17:39     ` Tejun Heo
  1 sibling, 4 replies; 62+ messages in thread
From: Tejun Heo @ 2011-05-13 17:39 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Michael Leun, linux-ide, Linux PM mailing list, LKML

Hello,

Can you guys please take a look at bug#34692?  It seems to be the same problem.

  https://bugzilla.kernel.org/show_bug.cgi?id=34692

The offending commit seems to be 270dac35c2.  Does reverting it
resolves the problem for you guys too?

Thanks.

--
tejun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 16:56   ` Rafael J. Wysocki
  2011-05-13 17:39     ` Tejun Heo
@ 2011-05-13 17:39     ` Tejun Heo
  1 sibling, 0 replies; 62+ messages in thread
From: Tejun Heo @ 2011-05-13 17:39 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-ide, Linux PM mailing list, Michael Leun, LKML

Hello,

Can you guys please take a look at bug#34692?  It seems to be the same problem.

  https://bugzilla.kernel.org/show_bug.cgi?id=34692

The offending commit seems to be 270dac35c2.  Does reverting it
resolves the problem for you guys too?

Thanks.

--
tejun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 17:39     ` Tejun Heo
  2011-05-13 19:54       ` Michael Leun
@ 2011-05-13 19:54       ` Michael Leun
  2011-05-13 20:20         ` Rafael J. Wysocki
  2011-05-13 20:20         ` Rafael J. Wysocki
  2011-05-13 20:22       ` Rafael J. Wysocki
  2011-05-13 20:22       ` Rafael J. Wysocki
  3 siblings, 2 replies; 62+ messages in thread
From: Michael Leun @ 2011-05-13 19:54 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Rafael J. Wysocki, linux-ide, Linux PM mailing list, LKML

On Fri, 13 May 2011 19:39:51 +0200
Tejun Heo <tj@kernel.org> wrote:

> Can you guys please take a look at bug#34692?  It seems to be the
> same problem.
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=34692
> 
> The offending commit seems to be 270dac35c2.  Does reverting it
> resolves the problem for you guys too?

- Does not happen with 2.6.38.6

- Happens 100% of cases with 2.6.39-rc7 (connect ac adapter or suspend
  to ram with ac connected and resume with ac disconnected)

- Reverting 270dac35c26433d06a89150c51e75ca0181ca7e4 fixes issue

-- 
MfG,

Michael Leun


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 17:39     ` Tejun Heo
@ 2011-05-13 19:54       ` Michael Leun
  2011-05-13 19:54       ` Michael Leun
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 62+ messages in thread
From: Michael Leun @ 2011-05-13 19:54 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, Linux PM mailing list, LKML

On Fri, 13 May 2011 19:39:51 +0200
Tejun Heo <tj@kernel.org> wrote:

> Can you guys please take a look at bug#34692?  It seems to be the
> same problem.
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=34692
> 
> The offending commit seems to be 270dac35c2.  Does reverting it
> resolves the problem for you guys too?

- Does not happen with 2.6.38.6

- Happens 100% of cases with 2.6.39-rc7 (connect ac adapter or suspend
  to ram with ac connected and resume with ac disconnected)

- Reverting 270dac35c26433d06a89150c51e75ca0181ca7e4 fixes issue

-- 
MfG,

Michael Leun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 19:54       ` Michael Leun
@ 2011-05-13 20:20         ` Rafael J. Wysocki
  2011-05-13 20:20         ` Rafael J. Wysocki
  1 sibling, 0 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-13 20:20 UTC (permalink / raw)
  To: Michael Leun; +Cc: Tejun Heo, linux-ide, Linux PM mailing list

On Friday, May 13, 2011, Michael Leun wrote:
> On Fri, 13 May 2011 19:39:51 +0200
> Tejun Heo <tj@kernel.org> wrote:
> 
> > Can you guys please take a look at bug#34692?  It seems to be the
> > same problem.
> > 
> >   https://bugzilla.kernel.org/show_bug.cgi?id=34692
> > 
> > The offending commit seems to be 270dac35c2.  Does reverting it
> > resolves the problem for you guys too?
> 
> - Does not happen with 2.6.38.6
> 
> - Happens 100% of cases with 2.6.39-rc7 (connect ac adapter or suspend
>   to ram with ac connected and resume with ac disconnected)
> 
> - Reverting 270dac35c26433d06a89150c51e75ca0181ca7e4 fixes issue

I can confirm that.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 19:54       ` Michael Leun
  2011-05-13 20:20         ` Rafael J. Wysocki
@ 2011-05-13 20:20         ` Rafael J. Wysocki
  1 sibling, 0 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-13 20:20 UTC (permalink / raw)
  To: Michael Leun; +Cc: Tejun Heo, linux-ide, Linux PM mailing list

On Friday, May 13, 2011, Michael Leun wrote:
> On Fri, 13 May 2011 19:39:51 +0200
> Tejun Heo <tj@kernel.org> wrote:
> 
> > Can you guys please take a look at bug#34692?  It seems to be the
> > same problem.
> > 
> >   https://bugzilla.kernel.org/show_bug.cgi?id=34692
> > 
> > The offending commit seems to be 270dac35c2.  Does reverting it
> > resolves the problem for you guys too?
> 
> - Does not happen with 2.6.38.6
> 
> - Happens 100% of cases with 2.6.39-rc7 (connect ac adapter or suspend
>   to ram with ac connected and resume with ac disconnected)
> 
> - Reverting 270dac35c26433d06a89150c51e75ca0181ca7e4 fixes issue

I can confirm that.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 17:39     ` Tejun Heo
  2011-05-13 19:54       ` Michael Leun
  2011-05-13 19:54       ` Michael Leun
@ 2011-05-13 20:22       ` Rafael J. Wysocki
  2011-05-13 20:44         ` Rafael J. Wysocki
  2011-05-13 20:44         ` AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One) Rafael J. Wysocki
  2011-05-13 20:22       ` Rafael J. Wysocki
  3 siblings, 2 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-13 20:22 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Michael Leun, linux-ide, Linux PM mailing list, LKML

On Friday, May 13, 2011, Tejun Heo wrote:
> Hello,
> 
> Can you guys please take a look at bug#34692?  It seems to be the same problem.
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=34692

Yes, same issue.

> The offending commit seems to be 270dac35c2.  Does reverting it
> resolves the problem for you guys too?

Yes, it does.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 17:39     ` Tejun Heo
                         ` (2 preceding siblings ...)
  2011-05-13 20:22       ` Rafael J. Wysocki
@ 2011-05-13 20:22       ` Rafael J. Wysocki
  3 siblings, 0 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-13 20:22 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, Linux PM mailing list, Michael Leun, LKML

On Friday, May 13, 2011, Tejun Heo wrote:
> Hello,
> 
> Can you guys please take a look at bug#34692?  It seems to be the same problem.
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=34692

Yes, same issue.

> The offending commit seems to be 270dac35c2.  Does reverting it
> resolves the problem for you guys too?

Yes, it does.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 20:22       ` Rafael J. Wysocki
@ 2011-05-13 20:44         ` Rafael J. Wysocki
  2011-05-14  2:08           ` Robert Hancock
                             ` (3 more replies)
  2011-05-13 20:44         ` AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One) Rafael J. Wysocki
  1 sibling, 4 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-13 20:44 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Michael Leun, linux-ide, Linux PM mailing list, LKML

On Friday, May 13, 2011, Rafael J. Wysocki wrote:
> On Friday, May 13, 2011, Tejun Heo wrote:
> > Hello,
> > 
> > Can you guys please take a look at bug#34692?  It seems to be the same problem.
> > 
> >   https://bugzilla.kernel.org/show_bug.cgi?id=34692
> 
> Yes, same issue.
> 
> > The offending commit seems to be 270dac35c2.  Does reverting it
> > resolves the problem for you guys too?
> 
> Yes, it does.

Would you object if I asked Linus to revert that commit?

Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 20:22       ` Rafael J. Wysocki
  2011-05-13 20:44         ` Rafael J. Wysocki
@ 2011-05-13 20:44         ` Rafael J. Wysocki
  1 sibling, 0 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-13 20:44 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, Linux PM mailing list, Michael Leun, LKML

On Friday, May 13, 2011, Rafael J. Wysocki wrote:
> On Friday, May 13, 2011, Tejun Heo wrote:
> > Hello,
> > 
> > Can you guys please take a look at bug#34692?  It seems to be the same problem.
> > 
> >   https://bugzilla.kernel.org/show_bug.cgi?id=34692
> 
> Yes, same issue.
> 
> > The offending commit seems to be 270dac35c2.  Does reverting it
> > resolves the problem for you guys too?
> 
> Yes, it does.

Would you object if I asked Linus to revert that commit?

Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 20:44         ` Rafael J. Wysocki
@ 2011-05-14  2:08           ` Robert Hancock
  2011-05-14  2:08           ` Robert Hancock
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 62+ messages in thread
From: Robert Hancock @ 2011-05-14  2:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Michael Leun, linux-ide, Linux PM mailing list, LKML,
	jipeng2005

On 05/13/2011 02:44 PM, Rafael J. Wysocki wrote:
> On Friday, May 13, 2011, Rafael J. Wysocki wrote:
>> On Friday, May 13, 2011, Tejun Heo wrote:
>>> Hello,
>>>
>>> Can you guys please take a look at bug#34692?  It seems to be the same problem.
>>>
>>>    https://bugzilla.kernel.org/show_bug.cgi?id=34692
>>
>> Yes, same issue.
>>
>>> The offending commit seems to be 270dac35c2.  Does reverting it
>>> resolves the problem for you guys too?
>>
>> Yes, it does.
>
> Would you object if I asked Linus to revert that commit?

Think that's likely the best solution for now. I'm guessing there's some 
case where link power management trips up the check that was added. Most 
or all existing released controllers don't need it so I think Jian could 
likely try again after 2.6.39.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 20:44         ` Rafael J. Wysocki
  2011-05-14  2:08           ` Robert Hancock
@ 2011-05-14  2:08           ` Robert Hancock
  2011-05-14 10:18           ` Tejun Heo
  2011-05-14 10:18           ` Tejun Heo
  3 siblings, 0 replies; 62+ messages in thread
From: Robert Hancock @ 2011-05-14  2:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, linux-ide, jipeng2005, Tejun Heo, Linux PM mailing list,
	Michael Leun

On 05/13/2011 02:44 PM, Rafael J. Wysocki wrote:
> On Friday, May 13, 2011, Rafael J. Wysocki wrote:
>> On Friday, May 13, 2011, Tejun Heo wrote:
>>> Hello,
>>>
>>> Can you guys please take a look at bug#34692?  It seems to be the same problem.
>>>
>>>    https://bugzilla.kernel.org/show_bug.cgi?id=34692
>>
>> Yes, same issue.
>>
>>> The offending commit seems to be 270dac35c2.  Does reverting it
>>> resolves the problem for you guys too?
>>
>> Yes, it does.
>
> Would you object if I asked Linus to revert that commit?

Think that's likely the best solution for now. I'm guessing there's some 
case where link power management trips up the check that was added. Most 
or all existing released controllers don't need it so I think Jian could 
likely try again after 2.6.39.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 20:44         ` Rafael J. Wysocki
                             ` (2 preceding siblings ...)
  2011-05-14 10:18           ` Tejun Heo
@ 2011-05-14 10:18           ` Tejun Heo
  2011-05-14 10:28               ` Tejun Heo
  2011-05-14 10:28             ` Tejun Heo
  3 siblings, 2 replies; 62+ messages in thread
From: Tejun Heo @ 2011-05-14 10:18 UTC (permalink / raw)
  To: Rafael J. Wysocki, Jeff Garzik, Jian Peng
  Cc: Michael Leun, linux-ide, Linux PM mailing list, LKML

Hello,

On Fri, May 13, 2011 at 10:44:29PM +0200, Rafael J. Wysocki wrote:
> On Friday, May 13, 2011, Rafael J. Wysocki wrote:
> > On Friday, May 13, 2011, Tejun Heo wrote:
> > > Hello,
> > > 
> > > Can you guys please take a look at bug#34692?  It seems to be the same problem.
> > > 
> > >   https://bugzilla.kernel.org/show_bug.cgi?id=34692
> > 
> > Yes, same issue.
> > 
> > > The offending commit seems to be 270dac35c2.  Does reverting it
> > > resolves the problem for you guys too?
> > 
> > Yes, it does.
> 
> Would you object if I asked Linus to revert that commit?

Not at all.  I think that's the best option at this point.  I'll send
revert patch right now.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One)
  2011-05-13 20:44         ` Rafael J. Wysocki
  2011-05-14  2:08           ` Robert Hancock
  2011-05-14  2:08           ` Robert Hancock
@ 2011-05-14 10:18           ` Tejun Heo
  2011-05-14 10:18           ` Tejun Heo
  3 siblings, 0 replies; 62+ messages in thread
From: Tejun Heo @ 2011-05-14 10:18 UTC (permalink / raw)
  To: Rafael J. Wysocki, Jeff Garzik, Jian Peng
  Cc: linux-ide, Linux PM mailing list, Michael Leun, LKML

Hello,

On Fri, May 13, 2011 at 10:44:29PM +0200, Rafael J. Wysocki wrote:
> On Friday, May 13, 2011, Rafael J. Wysocki wrote:
> > On Friday, May 13, 2011, Tejun Heo wrote:
> > > Hello,
> > > 
> > > Can you guys please take a look at bug#34692?  It seems to be the same problem.
> > > 
> > >   https://bugzilla.kernel.org/show_bug.cgi?id=34692
> > 
> > Yes, same issue.
> > 
> > > The offending commit seems to be 270dac35c2.  Does reverting it
> > > resolves the problem for you guys too?
> > 
> > Yes, it does.
> 
> Would you object if I asked Linus to revert that commit?

Not at all.  I think that's the best option at this point.  I'll send
revert patch right now.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-14 10:18           ` Tejun Heo
@ 2011-05-14 10:28               ` Tejun Heo
  2011-05-14 10:28             ` Tejun Heo
  1 sibling, 0 replies; 62+ messages in thread
From: Tejun Heo @ 2011-05-14 10:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Michael Leun, linux-ide, Linux PM mailing list, LKML,
	Rafael J. Wysocki, Jeff Garzik, Jian Peng

This reverts commit 270dac35c26433d06a89150c51e75ca0181ca7e4.

The commits causes command timeouts on AC plug/unplug.  It isn't yet
clear why.  As the commit was for a single rather obscure controller,
revert the change for now.

The problem was reported and bisected by Gu Rui in bug#34692.

 https://bugzilla.kernel.org/show_bug.cgi?id=34692

Also, reported by Rafael and Michael in the following thread.

 http://thread.gmane.org/gmane.linux.kernel/1138771

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Gu Rui <chaos.proton@gmail.com>
Reported-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Michael Leun <lkml20100708@newton.leun.net>
Cc: Jian Peng <jipeng2005@gmail.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
---
As we're already in -rc7, I'm sending the revert patch to both Jeff
and Linus.

Thank you.

 drivers/ata/libahci.c |   21 ---------------------
 1 files changed, 0 insertions(+), 21 deletions(-)

diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index ff9d832..d38c40f 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -561,27 +561,6 @@ void ahci_start_engine(struct ata_port *ap)
 {
 	void __iomem *port_mmio = ahci_port_base(ap);
 	u32 tmp;
-	u8 status;
-
-	status = readl(port_mmio + PORT_TFDATA) & 0xFF;
-
-	/*
-	 * At end of section 10.1 of AHCI spec (rev 1.3), it states
-	 * Software shall not set PxCMD.ST to 1 until it is determined
-	 * that a functoinal device is present on the port as determined by
-	 * PxTFD.STS.BSY=0, PxTFD.STS.DRQ=0 and PxSSTS.DET=3h
-	 *
-	 * Even though most AHCI host controllers work without this check,
-	 * specific controller will fail under this condition
-	 */
-	if (status & (ATA_BUSY | ATA_DRQ))
-		return;
-	else {
-		ahci_scr_read(&ap->link, SCR_STATUS, &tmp);
-
-		if ((tmp & 0xf) != 0x3)
-			return;
-	}
 
 	/* start DMA */
 	tmp = readl(port_mmio + PORT_CMD);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
@ 2011-05-14 10:28               ` Tejun Heo
  0 siblings, 0 replies; 62+ messages in thread
From: Tejun Heo @ 2011-05-14 10:28 UTC (permalink / raw)
  To: Linus Torvalds, Jeff Garzik
  Cc: Michael Leun, linux-ide, Linux PM mailing list, LKML,
	Rafael J. Wysocki, Jeff Garzik, Jian Peng

This reverts commit 270dac35c26433d06a89150c51e75ca0181ca7e4.

The commits causes command timeouts on AC plug/unplug.  It isn't yet
clear why.  As the commit was for a single rather obscure controller,
revert the change for now.

The problem was reported and bisected by Gu Rui in bug#34692.

 https://bugzilla.kernel.org/show_bug.cgi?id=34692

Also, reported by Rafael and Michael in the following thread.

 http://thread.gmane.org/gmane.linux.kernel/1138771

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Gu Rui <chaos.proton@gmail.com>
Reported-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Michael Leun <lkml20100708@newton.leun.net>
Cc: Jian Peng <jipeng2005@gmail.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
---
As we're already in -rc7, I'm sending the revert patch to both Jeff
and Linus.

Thank you.

 drivers/ata/libahci.c |   21 ---------------------
 1 files changed, 0 insertions(+), 21 deletions(-)

diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index ff9d832..d38c40f 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -561,27 +561,6 @@ void ahci_start_engine(struct ata_port *ap)
 {
 	void __iomem *port_mmio = ahci_port_base(ap);
 	u32 tmp;
-	u8 status;
-
-	status = readl(port_mmio + PORT_TFDATA) & 0xFF;
-
-	/*
-	 * At end of section 10.1 of AHCI spec (rev 1.3), it states
-	 * Software shall not set PxCMD.ST to 1 until it is determined
-	 * that a functoinal device is present on the port as determined by
-	 * PxTFD.STS.BSY=0, PxTFD.STS.DRQ=0 and PxSSTS.DET=3h
-	 *
-	 * Even though most AHCI host controllers work without this check,
-	 * specific controller will fail under this condition
-	 */
-	if (status & (ATA_BUSY | ATA_DRQ))
-		return;
-	else {
-		ahci_scr_read(&ap->link, SCR_STATUS, &tmp);
-
-		if ((tmp & 0xf) != 0x3)
-			return;
-	}
 
 	/* start DMA */
 	tmp = readl(port_mmio + PORT_CMD);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-14 10:18           ` Tejun Heo
  2011-05-14 10:28               ` Tejun Heo
@ 2011-05-14 10:28             ` Tejun Heo
  1 sibling, 0 replies; 62+ messages in thread
From: Tejun Heo @ 2011-05-14 10:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff Garzik, LKML, linux-ide, Jian Peng, Linux PM mailing list,
	Michael Leun

This reverts commit 270dac35c26433d06a89150c51e75ca0181ca7e4.

The commits causes command timeouts on AC plug/unplug.  It isn't yet
clear why.  As the commit was for a single rather obscure controller,
revert the change for now.

The problem was reported and bisected by Gu Rui in bug#34692.

 https://bugzilla.kernel.org/show_bug.cgi?id=34692

Also, reported by Rafael and Michael in the following thread.

 http://thread.gmane.org/gmane.linux.kernel/1138771

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Gu Rui <chaos.proton@gmail.com>
Reported-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Michael Leun <lkml20100708@newton.leun.net>
Cc: Jian Peng <jipeng2005@gmail.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
---
As we're already in -rc7, I'm sending the revert patch to both Jeff
and Linus.

Thank you.

 drivers/ata/libahci.c |   21 ---------------------
 1 files changed, 0 insertions(+), 21 deletions(-)

diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index ff9d832..d38c40f 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -561,27 +561,6 @@ void ahci_start_engine(struct ata_port *ap)
 {
 	void __iomem *port_mmio = ahci_port_base(ap);
 	u32 tmp;
-	u8 status;
-
-	status = readl(port_mmio + PORT_TFDATA) & 0xFF;
-
-	/*
-	 * At end of section 10.1 of AHCI spec (rev 1.3), it states
-	 * Software shall not set PxCMD.ST to 1 until it is determined
-	 * that a functoinal device is present on the port as determined by
-	 * PxTFD.STS.BSY=0, PxTFD.STS.DRQ=0 and PxSSTS.DET=3h
-	 *
-	 * Even though most AHCI host controllers work without this check,
-	 * specific controller will fail under this condition
-	 */
-	if (status & (ATA_BUSY | ATA_DRQ))
-		return;
-	else {
-		ahci_scr_read(&ap->link, SCR_STATUS, &tmp);
-
-		if ((tmp & 0xf) != 0x3)
-			return;
-	}
 
 	/* start DMA */
 	tmp = readl(port_mmio + PORT_CMD);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-14 10:28               ` Tejun Heo
  (?)
  (?)
@ 2011-05-14 17:47               ` Linus Torvalds
  2011-05-14 18:49                 ` Jeff Garzik
  2011-05-14 18:49                 ` Jeff Garzik
  -1 siblings, 2 replies; 62+ messages in thread
From: Linus Torvalds @ 2011-05-14 17:47 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jeff Garzik, Michael Leun, linux-ide, Linux PM mailing list,
	LKML, Rafael J. Wysocki, Jian Peng

On Sat, May 14, 2011 at 3:28 AM, Tejun Heo <tj@kernel.org> wrote:
>
> As we're already in -rc7, I'm sending the revert patch to both Jeff
> and Linus.

Applied,

                   Linus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-14 10:28               ` Tejun Heo
  (?)
@ 2011-05-14 17:47               ` Linus Torvalds
  -1 siblings, 0 replies; 62+ messages in thread
From: Linus Torvalds @ 2011-05-14 17:47 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jeff Garzik, LKML, linux-ide, Jian Peng, Linux PM mailing list,
	Michael Leun

On Sat, May 14, 2011 at 3:28 AM, Tejun Heo <tj@kernel.org> wrote:
>
> As we're already in -rc7, I'm sending the revert patch to both Jeff
> and Linus.

Applied,

                   Linus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-14 17:47               ` Linus Torvalds
  2011-05-14 18:49                 ` Jeff Garzik
@ 2011-05-14 18:49                 ` Jeff Garzik
  1 sibling, 0 replies; 62+ messages in thread
From: Jeff Garzik @ 2011-05-14 18:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tejun Heo, Michael Leun, linux-ide, Linux PM mailing list, LKML,
	Rafael J. Wysocki, Jian Peng

On 05/14/2011 01:47 PM, Linus Torvalds wrote:
> On Sat, May 14, 2011 at 3:28 AM, Tejun Heo<tj@kernel.org>  wrote:
>>
>> As we're already in -rc7, I'm sending the revert patch to both Jeff
>> and Linus.
>
> Applied,

ACK, thanks

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-14 17:47               ` Linus Torvalds
@ 2011-05-14 18:49                 ` Jeff Garzik
  2011-05-14 18:49                 ` Jeff Garzik
  1 sibling, 0 replies; 62+ messages in thread
From: Jeff Garzik @ 2011-05-14 18:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: LKML, linux-ide, Jian Peng, Tejun Heo, Linux PM mailing list,
	Michael Leun

On 05/14/2011 01:47 PM, Linus Torvalds wrote:
> On Sat, May 14, 2011 at 3:28 AM, Tejun Heo<tj@kernel.org>  wrote:
>>
>> As we're already in -rc7, I'm sending the revert patch to both Jeff
>> and Linus.
>
> Applied,

ACK, thanks

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-14 10:28               ` Tejun Heo
                                 ` (2 preceding siblings ...)
  (?)
@ 2011-05-15  6:19               ` Jian Peng
  2011-05-15  9:25                 ` Rafael J. Wysocki
  2011-05-15  9:25                 ` Rafael J. Wysocki
  -1 siblings, 2 replies; 62+ messages in thread
From: Jian Peng @ 2011-05-15  6:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Michael Leun, Linus Torvalds, LKML, linux-ide,
	Linux PM mailing list, Jeff Garzik


[-- Attachment #1.1: Type: text/plain, Size: 2473 bytes --]

Hi, Michael/Rafael/Gu,

Could you help me understand the testing environme? I like to reproduce it
if it is possible and debug it. The host controller this patch dealing with
is not used on PC so my testing environment is quite different.

Thanks,
Jian

On Sat, May 14, 2011 at 3:28 AM, Tejun Heo <tj@kernel.org> wrote:

> This reverts commit 270dac35c26433d06a89150c51e75ca0181ca7e4.
>
> The commits causes command timeouts on AC plug/unplug.  It isn't yet
> clear why.  As the commit was for a single rather obscure controller,
> revert the change for now.
>
> The problem was reported and bisected by Gu Rui in bug#34692.
>
>  https://bugzilla.kernel.org/show_bug.cgi?id=34692
>
> Also, reported by Rafael and Michael in the following thread.
>
>  http://thread.gmane.org/gmane.linux.kernel/1138771
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Gu Rui <chaos.proton@gmail.com>
> Reported-by: Rafael J. Wysocki <rjw@sisk.pl>
> Reported-by: Michael Leun <lkml20100708@newton.leun.net>
> Cc: Jian Peng <jipeng2005@gmail.com>
> Cc: Jeff Garzik <jgarzik@pobox.com>
> ---
> As we're already in -rc7, I'm sending the revert patch to both Jeff
> and Linus.
>
> Thank you.
>
>  drivers/ata/libahci.c |   21 ---------------------
>  1 files changed, 0 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> index ff9d832..d38c40f 100644
> --- a/drivers/ata/libahci.c
> +++ b/drivers/ata/libahci.c
> @@ -561,27 +561,6 @@ void ahci_start_engine(struct ata_port *ap)
>  {
>        void __iomem *port_mmio = ahci_port_base(ap);
>        u32 tmp;
> -       u8 status;
> -
> -       status = readl(port_mmio + PORT_TFDATA) & 0xFF;
> -
> -       /*
> -        * At end of section 10.1 of AHCI spec (rev 1.3), it states
> -        * Software shall not set PxCMD.ST to 1 until it is determined
> -        * that a functoinal device is present on the port as determined by
> -        * PxTFD.STS.BSY=0, PxTFD.STS.DRQ=0 and PxSSTS.DET=3h
> -        *
> -        * Even though most AHCI host controllers work without this check,
> -        * specific controller will fail under this condition
> -        */
> -       if (status & (ATA_BUSY | ATA_DRQ))
> -               return;
> -       else {
> -               ahci_scr_read(&ap->link, SCR_STATUS, &tmp);
> -
> -               if ((tmp & 0xf) != 0x3)
> -                       return;
> -       }
>
>        /* start DMA */
>        tmp = readl(port_mmio + PORT_CMD);
> --
> 1.7.1
>
>

[-- Attachment #1.2: Type: text/html, Size: 3387 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-15  6:19               ` Jian Peng
  2011-05-15  9:25                 ` Rafael J. Wysocki
@ 2011-05-15  9:25                 ` Rafael J. Wysocki
  2011-05-15 12:20                   ` Michael Leun
  2011-05-15 12:20                   ` Michael Leun
  1 sibling, 2 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-15  9:25 UTC (permalink / raw)
  To: Jian Peng
  Cc: Tejun Heo, Linus Torvalds, Jeff Garzik, Michael Leun, linux-ide,
	Linux PM mailing list, LKML

On Sunday, May 15, 2011, Jian Peng wrote:
> Hi, Michael/Rafael/Gu,
> 
> Could you help me understand the testing environme? I like to reproduce it
> if it is possible and debug it. The host controller this patch dealing with
> is not used on PC so my testing environment is quite different.

Well, the system I can reproduce it on is a production one, so I can't
really test too much on it.  What exactly would you like to do?

Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-15  6:19               ` Jian Peng
@ 2011-05-15  9:25                 ` Rafael J. Wysocki
  2011-05-15  9:25                 ` Rafael J. Wysocki
  1 sibling, 0 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-15  9:25 UTC (permalink / raw)
  To: Jian Peng
  Cc: Michael Leun, Linus Torvalds, LKML, linux-ide, Tejun Heo,
	Linux PM mailing list, Jeff Garzik

On Sunday, May 15, 2011, Jian Peng wrote:
> Hi, Michael/Rafael/Gu,
> 
> Could you help me understand the testing environme? I like to reproduce it
> if it is possible and debug it. The host controller this patch dealing with
> is not used on PC so my testing environment is quite different.

Well, the system I can reproduce it on is a production one, so I can't
really test too much on it.  What exactly would you like to do?

Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-15  9:25                 ` Rafael J. Wysocki
  2011-05-15 12:20                   ` Michael Leun
@ 2011-05-15 12:20                   ` Michael Leun
  1 sibling, 0 replies; 62+ messages in thread
From: Michael Leun @ 2011-05-15 12:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jian Peng, Tejun Heo, Linus Torvalds, Jeff Garzik, linux-ide,
	Linux PM mailing list, LKML

On Sun, 15 May 2011 11:25:19 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Sunday, May 15, 2011, Jian Peng wrote:
> > Hi, Michael/Rafael/Gu,
> > 
> > Could you help me understand the testing environme? I like to
> > reproduce it if it is possible and debug it. The host controller
> > this patch dealing with is not used on PC so my testing environment
> > is quite different.

We have seen it on at least 3 different controllers from at least 2
different vendors (I do not know, what hardware Gu has).

So, for now, it looks like to happen on more different controllers than
not.

Testing environment - uhm, what shall I say? On each of the two
notebooks affected runs 64bit openSuSE 11.4 in more or less default
configuration (apart from kernel, of course).

Just do a "ls -lR /" and watch kernel log while plugging/unplugging
power.

Maybe running a 64bit system/kernel is part of picture? As far as I can
see all affected machines run 64bit.

> Well, the system I can reproduce it on is a production one, so I can't
> really test too much on it.  What exactly would you like to do?

I've at least one machine available where I can test. So, if there is
anything (preferably non destructive ;-) ) you want me to test this
should be possible.

Maybe I'll have a look if my netbook (running 32bit) is also affected.

-- 
MfG,

Michael Leun


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-15  9:25                 ` Rafael J. Wysocki
@ 2011-05-15 12:20                   ` Michael Leun
  2011-05-15 12:20                   ` Michael Leun
  1 sibling, 0 replies; 62+ messages in thread
From: Michael Leun @ 2011-05-15 12:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, LKML, linux-ide, Jian Peng, Tejun Heo,
	Linux PM mailing list, Jeff Garzik

On Sun, 15 May 2011 11:25:19 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Sunday, May 15, 2011, Jian Peng wrote:
> > Hi, Michael/Rafael/Gu,
> > 
> > Could you help me understand the testing environme? I like to
> > reproduce it if it is possible and debug it. The host controller
> > this patch dealing with is not used on PC so my testing environment
> > is quite different.

We have seen it on at least 3 different controllers from at least 2
different vendors (I do not know, what hardware Gu has).

So, for now, it looks like to happen on more different controllers than
not.

Testing environment - uhm, what shall I say? On each of the two
notebooks affected runs 64bit openSuSE 11.4 in more or less default
configuration (apart from kernel, of course).

Just do a "ls -lR /" and watch kernel log while plugging/unplugging
power.

Maybe running a 64bit system/kernel is part of picture? As far as I can
see all affected machines run 64bit.

> Well, the system I can reproduce it on is a production one, so I can't
> really test too much on it.  What exactly would you like to do?

I've at least one machine available where I can test. So, if there is
anything (preferably non destructive ;-) ) you want me to test this
should be possible.

Maybe I'll have a look if my netbook (running 32bit) is also affected.

-- 
MfG,

Michael Leun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-14 10:28               ` Tejun Heo
                                 ` (3 preceding siblings ...)
  (?)
@ 2011-05-16 17:02               ` Valdis.Kletnieks
  2011-05-18 19:23                 ` Jian Peng
  -1 siblings, 1 reply; 62+ messages in thread
From: Valdis.Kletnieks @ 2011-05-16 17:02 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linus Torvalds, Jeff Garzik, Michael Leun, linux-ide,
	Linux PM mailing list, LKML, Rafael J. Wysocki, Jian Peng

[-- Attachment #1: Type: text/plain, Size: 669 bytes --]

On Sat, 14 May 2011 12:28:04 +0200, Tejun Heo said:
> This reverts commit 270dac35c26433d06a89150c51e75ca0181ca7e4.
> 
> The commits causes command timeouts on AC plug/unplug.  It isn't yet
> clear why.  As the commit was for a single rather obscure controller,
> revert the change for now.
> 
> The problem was reported and bisected by Gu Rui in bug#34692.
> 
>  https://bugzilla.kernel.org/show_bug.cgi?id=34692
> 
> Also, reported by Rafael and Michael in the following thread.
> 
>  http://thread.gmane.org/gmane.linux.kernel/1138771

This also fixes the issue I had with a 10-second pause on my Dell Latitude E6500
laptop at boot while detecting the CD/DVD drive.

[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-14 10:28               ` Tejun Heo
                                 ` (4 preceding siblings ...)
  (?)
@ 2011-05-16 17:02               ` Valdis.Kletnieks
  -1 siblings, 0 replies; 62+ messages in thread
From: Valdis.Kletnieks @ 2011-05-16 17:02 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Michael Leun, Linus Torvalds, LKML, linux-ide, Jian Peng,
	Linux PM mailing list, Jeff Garzik


[-- Attachment #1.1: Type: text/plain, Size: 669 bytes --]

On Sat, 14 May 2011 12:28:04 +0200, Tejun Heo said:
> This reverts commit 270dac35c26433d06a89150c51e75ca0181ca7e4.
> 
> The commits causes command timeouts on AC plug/unplug.  It isn't yet
> clear why.  As the commit was for a single rather obscure controller,
> revert the change for now.
> 
> The problem was reported and bisected by Gu Rui in bug#34692.
> 
>  https://bugzilla.kernel.org/show_bug.cgi?id=34692
> 
> Also, reported by Rafael and Michael in the following thread.
> 
>  http://thread.gmane.org/gmane.linux.kernel/1138771

This also fixes the issue I had with a 10-second pause on my Dell Latitude E6500
laptop at boot while detecting the CD/DVD drive.

[-- Attachment #1.2: Type: application/pgp-signature, Size: 227 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-16 17:02               ` Valdis.Kletnieks
@ 2011-05-18 19:23                 ` Jian Peng
  2011-05-18 19:44                   ` Rafael J. Wysocki
  2011-05-18 19:44                   ` Rafael J. Wysocki
  0 siblings, 2 replies; 62+ messages in thread
From: Jian Peng @ 2011-05-18 19:23 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Jeff Garzik, LKML, linux-ide, Tejun Heo, Linux PM mailing list,
	Michael Leun


[-- Attachment #1.1: Type: text/plain, Size: 1799 bytes --]

Hi, Valdis/Rafael/Michael,

Could you help me test the following change?

After reverting 81ca7e4, add 5ms delay as follow since that seems also
fixing the issue on my SATA host controller that requires 81ca7e4.

In drivers/ata/libahci.c, inside ahci_hardreset() function,


1333 <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1333>
ahci_start_engine
<http://lxr.linux.no/linux+*/+code=ahci_start_engine>(ap
<http://lxr.linux.no/linux+*/+code=ap>);

            msleep(5);1334
<http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1334>1335
<http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1335>        if
(online <http://lxr.linux.no/linux+*/+code=online>)1336
<http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1336>
  *class <http://lxr.linux.no/linux+*/+code=class> = ahci_dev_classify
<http://lxr.linux.no/linux+*/+code=ahci_dev_classify>(ap
<http://lxr.linux.no/linux+*/+code=ap>);

Since my host controller requires time to switch internal state to be ready.
Please let me know your testing result.

Thanks,
Jian



On Mon, May 16, 2011 at 10:02 AM, <Valdis.Kletnieks@vt.edu> wrote:

> On Sat, 14 May 2011 12:28:04 +0200, Tejun Heo said:
> > This reverts commit 270dac35c26433d06a89150c51e75ca0181ca7e4.
> >
> > The commits causes command timeouts on AC plug/unplug.  It isn't yet
> > clear why.  As the commit was for a single rather obscure controller,
> > revert the change for now.
> >
> > The problem was reported and bisected by Gu Rui in bug#34692.
> >
> >  https://bugzilla.kernel.org/show_bug.cgi?id=34692
> >
> > Also, reported by Rafael and Michael in the following thread.
> >
> >  http://thread.gmane.org/gmane.linux.kernel/1138771
>
> This also fixes the issue I had with a 10-second pause on my Dell Latitude
> E6500
> laptop at boot while detecting the CD/DVD drive.
>

[-- Attachment #1.2: Type: text/html, Size: 2955 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-18 19:23                 ` Jian Peng
  2011-05-18 19:44                   ` Rafael J. Wysocki
@ 2011-05-18 19:44                   ` Rafael J. Wysocki
  2011-05-18 21:25                     ` Jian Peng
  1 sibling, 1 reply; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-18 19:44 UTC (permalink / raw)
  To: Jian Peng
  Cc: Valdis.Kletnieks, Tejun Heo, Jeff Garzik, Michael Leun,
	linux-ide, Linux PM mailing list, LKML

On Wednesday, May 18, 2011, Jian Peng wrote:
> Hi, Valdis/Rafael/Michael,
> 
> Could you help me test the following change?
> 
> After reverting 81ca7e4, add 5ms delay as follow since that seems also
> fixing the issue on my SATA host controller that requires 81ca7e4.
> 
> In drivers/ata/libahci.c, inside ahci_hardreset() function,
> 
> 
> 1333 <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1333>
> ahci_start_engine
> <http://lxr.linux.no/linux+*/+code=ahci_start_engine>(ap
> <http://lxr.linux.no/linux+*/+code=ap>);
> 
>             msleep(5);1334
> <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1334>1335
> <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1335>        if
> (online <http://lxr.linux.no/linux+*/+code=online>)1336
> <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1336>
>   *class <http://lxr.linux.no/linux+*/+code=class> = ahci_dev_classify
> <http://lxr.linux.no/linux+*/+code=ahci_dev_classify>(ap
> <http://lxr.linux.no/linux+*/+code=ap>);
> 
> Since my host controller requires time to switch internal state to be ready.
> Please let me know your testing result.

Could you simply post a patch?

Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-18 19:23                 ` Jian Peng
@ 2011-05-18 19:44                   ` Rafael J. Wysocki
  2011-05-18 19:44                   ` Rafael J. Wysocki
  1 sibling, 0 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-18 19:44 UTC (permalink / raw)
  To: Jian Peng
  Cc: Valdis.Kletnieks, Michael Leun, LKML, linux-ide, Tejun Heo,
	Linux PM mailing list, Jeff Garzik

On Wednesday, May 18, 2011, Jian Peng wrote:
> Hi, Valdis/Rafael/Michael,
> 
> Could you help me test the following change?
> 
> After reverting 81ca7e4, add 5ms delay as follow since that seems also
> fixing the issue on my SATA host controller that requires 81ca7e4.
> 
> In drivers/ata/libahci.c, inside ahci_hardreset() function,
> 
> 
> 1333 <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1333>
> ahci_start_engine
> <http://lxr.linux.no/linux+*/+code=ahci_start_engine>(ap
> <http://lxr.linux.no/linux+*/+code=ap>);
> 
>             msleep(5);1334
> <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1334>1335
> <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1335>        if
> (online <http://lxr.linux.no/linux+*/+code=online>)1336
> <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1336>
>   *class <http://lxr.linux.no/linux+*/+code=class> = ahci_dev_classify
> <http://lxr.linux.no/linux+*/+code=ahci_dev_classify>(ap
> <http://lxr.linux.no/linux+*/+code=ap>);
> 
> Since my host controller requires time to switch internal state to be ready.
> Please let me know your testing result.

Could you simply post a patch?

Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-18 19:44                   ` Rafael J. Wysocki
@ 2011-05-18 21:25                     ` Jian Peng
  2011-05-19  0:14                       ` Jian Peng
  0 siblings, 1 reply; 62+ messages in thread
From: Jian Peng @ 2011-05-18 21:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Valdis.Kletnieks, Michael Leun, LKML, linux-ide, Tejun Heo,
	Linux PM mailing list, Jeff Garzik


[-- Attachment #1.1: Type: text/plain, Size: 2333 bytes --]

Sure, Please try this. Thanks,

diff -Naur a/drivers/ata/libahci.c b/drivers/ata/libahci.c
--- a/drivers/ata/libahci.c 2011-05-18 14:23:36.564665643 -0700
+++ b/drivers/ata/libahci.c 2011-05-18 14:24:52.564614378 -0700
@@ -539,27 +539,6 @@
 {
  void __iomem *port_mmio = ahci_port_base(ap);
  u32 tmp;
- u8 status;
-
- status = readl(port_mmio + PORT_TFDATA) & 0xFF;
-
- /*
-  * At end of section 10.1 of AHCI spec (rev 1.3), it states
-  * Software shall not set PxCMD.ST to 1 until it is determined
-  * that a functoinal device is present on the port as determined by
-  * PxTFD.STS.BSY=0, PxTFD.STS.DRQ=0 and PxSSTS.DET=3h
-  *
-  * Even though most AHCI host controllers work without this check,
-  * specific controller will fail under this condition
-  */
- if (status & (ATA_BUSY | ATA_DRQ))
-  return;
- else {
-  ahci_scr_read(&ap->link, SCR_STATUS, &tmp);
-
-  if ((tmp & 0xf) != 0x3)
-   return;
- }

  /* start DMA */
  tmp = readl(port_mmio + PORT_CMD);
@@ -1353,6 +1332,8 @@

  ahci_start_engine(ap);

+ msleep(5);
+
  if (online)
   *class = ahci_dev_classify(ap);


2011/5/18 Rafael J. Wysocki <rjw@sisk.pl>

> On Wednesday, May 18, 2011, Jian Peng wrote:
> > Hi, Valdis/Rafael/Michael,
> >
> > Could you help me test the following change?
> >
> > After reverting 81ca7e4, add 5ms delay as follow since that seems also
> > fixing the issue on my SATA host controller that requires 81ca7e4.
> >
> > In drivers/ata/libahci.c, inside ahci_hardreset() function,
> >
> >
> > 1333 <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1333>
> > ahci_start_engine
> > <http://lxr.linux.no/linux+*/+code=ahci_start_engine>(ap
> > <http://lxr.linux.no/linux+*/+code=ap>);
> >
> >             msleep(5);1334
> > <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1334>1335
> > <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1335>        if
> > (online <http://lxr.linux.no/linux+*/+code=online>)1336
> > <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1336>
> >   *class <http://lxr.linux.no/linux+*/+code=class> = ahci_dev_classify
> > <http://lxr.linux.no/linux+*/+code=ahci_dev_classify>(ap
> > <http://lxr.linux.no/linux+*/+code=ap>);
> >
> > Since my host controller requires time to switch internal state to be
> ready.
> > Please let me know your testing result.
>
> Could you simply post a patch?
>
> Rafael
>

[-- Attachment #1.2: Type: text/html, Size: 3825 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-18 21:25                     ` Jian Peng
@ 2011-05-19  0:14                       ` Jian Peng
  2011-05-19 10:19                         ` Rafael J. Wysocki
                                           ` (3 more replies)
  0 siblings, 4 replies; 62+ messages in thread
From: Jian Peng @ 2011-05-19  0:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Valdis.Kletnieks, Michael Leun, LKML, linux-ide, Tejun Heo,
	Linux PM mailing list, Jeff Garzik


[-- Attachment #1.1: Type: text/plain, Size: 2791 bytes --]

Hi, Rafael,

I am using Dell E6410 with same AHCI host controller as your system, but
could not reproduce the bug you found out (using Ubuntu 11.04 and 2.6.39-rc6
kernel).
Do you know which config file I need change to perform this testing?

Thanks,
Jian

On Wed, May 18, 2011 at 2:25 PM, Jian Peng <jipeng2005@gmail.com> wrote:

> Sure, Please try this. Thanks,
>
> diff -Naur a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> --- a/drivers/ata/libahci.c 2011-05-18 14:23:36.564665643 -0700
> +++ b/drivers/ata/libahci.c 2011-05-18 14:24:52.564614378 -0700
> @@ -539,27 +539,6 @@
>
>  {
>   void __iomem *port_mmio = ahci_port_base(ap);
>   u32 tmp;
> - u8 status;
> -
> - status = readl(port_mmio + PORT_TFDATA) & 0xFF;
> -
> - /*
> -  * At end of section 10.1 of AHCI spec (rev 1.3), it states
> -  * Software shall not set PxCMD.ST to 1 until it is determined
> -  * that a functoinal device is present on the port as determined by
> -  * PxTFD.STS.BSY=0, PxTFD.STS.DRQ=0 and PxSSTS.DET=3h
> -  *
> -  * Even though most AHCI host controllers work without this check,
>
> -  * specific controller will fail under this condition
> -  */
> - if (status & (ATA_BUSY | ATA_DRQ))
> -  return;
> - else {
> -  ahci_scr_read(&ap->link, SCR_STATUS, &tmp);
> -
> -  if ((tmp & 0xf) != 0x3)
> -   return;
> - }
>
>   /* start DMA */
>   tmp = readl(port_mmio + PORT_CMD);
> @@ -1353,6 +1332,8 @@
>
>
>   ahci_start_engine(ap);
>
> + msleep(5);
> +
>   if (online)
>
>    *class = ahci_dev_classify(ap);
>
>
> 2011/5/18 Rafael J. Wysocki <rjw@sisk.pl>
>
>  On Wednesday, May 18, 2011, Jian Peng wrote:
>> > Hi, Valdis/Rafael/Michael,
>> >
>> > Could you help me test the following change?
>> >
>> > After reverting 81ca7e4, add 5ms delay as follow since that seems also
>> > fixing the issue on my SATA host controller that requires 81ca7e4.
>> >
>> > In drivers/ata/libahci.c, inside ahci_hardreset() function,
>> >
>> >
>> > 1333 <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1333>
>> > ahci_start_engine
>> > <http://lxr.linux.no/linux+*/+code=ahci_start_engine>(ap
>> > <http://lxr.linux.no/linux+*/+code=ap>);
>> >
>> >             msleep(5);1334
>> > <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1334>1335
>> > <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1335>        if
>> > (online <http://lxr.linux.no/linux+*/+code=online>)1336
>> > <http://lxr.linux.no/linux+*/drivers/ata/libahci.c#L1336>
>> >   *class <http://lxr.linux.no/linux+*/+code=class> = ahci_dev_classify
>> > <http://lxr.linux.no/linux+*/+code=ahci_dev_classify>(ap
>> > <http://lxr.linux.no/linux+*/+code=ap>);
>> >
>> > Since my host controller requires time to switch internal state to be
>> ready.
>> > Please let me know your testing result.
>>
>> Could you simply post a patch?
>>
>> Rafael
>>
>
>

[-- Attachment #1.2: Type: text/html, Size: 4615 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-19  0:14                       ` Jian Peng
  2011-05-19 10:19                         ` Rafael J. Wysocki
@ 2011-05-19 10:19                         ` Rafael J. Wysocki
  2011-05-19 16:42                           ` Jian Peng
  2011-05-20 15:40                         ` Valdis.Kletnieks
  2011-05-20 15:40                         ` Valdis.Kletnieks
  3 siblings, 1 reply; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-19 10:19 UTC (permalink / raw)
  To: Jian Peng
  Cc: Valdis.Kletnieks, Tejun Heo, Jeff Garzik, Michael Leun,
	linux-ide, Linux PM mailing list, LKML

On Thursday, May 19, 2011, Jian Peng wrote:
> Hi, Rafael,

Hi,

> I am using Dell E6410 with same AHCI host controller as your system, but
> could not reproduce the bug you found out (using Ubuntu 11.04 and 2.6.39-rc6
> kernel).

OK, do I think correctly you tested without the patch you sent previously?

> Do you know which config file I need change to perform this testing?

I'm not sure what you mean.  The kernel .config?

Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-19  0:14                       ` Jian Peng
@ 2011-05-19 10:19                         ` Rafael J. Wysocki
  2011-05-19 10:19                         ` Rafael J. Wysocki
                                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-19 10:19 UTC (permalink / raw)
  To: Jian Peng
  Cc: Valdis.Kletnieks, Michael Leun, LKML, linux-ide, Tejun Heo,
	Linux PM mailing list, Jeff Garzik

On Thursday, May 19, 2011, Jian Peng wrote:
> Hi, Rafael,

Hi,

> I am using Dell E6410 with same AHCI host controller as your system, but
> could not reproduce the bug you found out (using Ubuntu 11.04 and 2.6.39-rc6
> kernel).

OK, do I think correctly you tested without the patch you sent previously?

> Do you know which config file I need change to perform this testing?

I'm not sure what you mean.  The kernel .config?

Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-19 10:19                         ` Rafael J. Wysocki
@ 2011-05-19 16:42                           ` Jian Peng
  2011-05-19 21:32                             ` Rafael J. Wysocki
  2011-05-19 21:32                             ` Rafael J. Wysocki
  0 siblings, 2 replies; 62+ messages in thread
From: Jian Peng @ 2011-05-19 16:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Valdis.Kletnieks, Michael Leun, LKML, linux-ide, Tejun Heo,
	Linux PM mailing list, Jeff Garzik


[-- Attachment #1.1: Type: text/plain, Size: 1063 bytes --]

Hi, Rafael,

Yes. I tested without reverting the patch that caused failure on E6410 (with
same AHCI controller), The system is Ubuntu 11.04 with default 2.6.39-rc6
kernel (guess this should be same kernel as yours before reverting the
patch).

My system worked well with AC power cable plugging in and out at boot time,
at runtime, and during "dd if=/dev/sda5 of=/dev/null ..".

The config file I mentioned is those under /etc/ or other directories that
is used to control the power management.

Did you test my patch? What it the result?

Thanks,
Jian


2011/5/19 Rafael J. Wysocki <rjw@sisk.pl>

> On Thursday, May 19, 2011, Jian Peng wrote:
> > Hi, Rafael,
>
> Hi,
>
> > I am using Dell E6410 with same AHCI host controller as your system, but
> > could not reproduce the bug you found out (using Ubuntu 11.04 and
> 2.6.39-rc6
> > kernel).
>
> OK, do I think correctly you tested without the patch you sent previously?
>
> > Do you know which config file I need change to perform this testing?
>
> I'm not sure what you mean.  The kernel .config?
>
> Rafael
>

[-- Attachment #1.2: Type: text/html, Size: 1588 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-19 16:42                           ` Jian Peng
  2011-05-19 21:32                             ` Rafael J. Wysocki
@ 2011-05-19 21:32                             ` Rafael J. Wysocki
  2011-05-19 21:57                               ` Jian Peng
  1 sibling, 1 reply; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-19 21:32 UTC (permalink / raw)
  To: Jian Peng
  Cc: Valdis.Kletnieks, Tejun Heo, Jeff Garzik, Michael Leun,
	linux-ide, Linux PM mailing list, LKML

On Thursday, May 19, 2011, Jian Peng wrote:
> Hi, Rafael,
> 
> Yes. I tested without reverting the patch that caused failure on E6410 (with
> same AHCI controller), The system is Ubuntu 11.04 with default 2.6.39-rc6
> kernel (guess this should be same kernel as yours before reverting the
> patch).
> 
> My system worked well with AC power cable plugging in and out at boot time,
> at runtime, and during "dd if=/dev/sda5 of=/dev/null ..".

What kind of disk did you use?  Mine is an Intel SSD.

> The config file I mentioned is those under /etc/ or other directories that
> is used to control the power management.

They aren't relevant to this issue.

> Did you test my patch? What it the result?

No, I didn't.  Unfortunately, I won't be able to do any tests on the affected
machine in the near future (most probably for the next two weeks or so).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-19 16:42                           ` Jian Peng
@ 2011-05-19 21:32                             ` Rafael J. Wysocki
  2011-05-19 21:32                             ` Rafael J. Wysocki
  1 sibling, 0 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-19 21:32 UTC (permalink / raw)
  To: Jian Peng
  Cc: Valdis.Kletnieks, Michael Leun, LKML, linux-ide, Tejun Heo,
	Linux PM mailing list, Jeff Garzik

On Thursday, May 19, 2011, Jian Peng wrote:
> Hi, Rafael,
> 
> Yes. I tested without reverting the patch that caused failure on E6410 (with
> same AHCI controller), The system is Ubuntu 11.04 with default 2.6.39-rc6
> kernel (guess this should be same kernel as yours before reverting the
> patch).
> 
> My system worked well with AC power cable plugging in and out at boot time,
> at runtime, and during "dd if=/dev/sda5 of=/dev/null ..".

What kind of disk did you use?  Mine is an Intel SSD.

> The config file I mentioned is those under /etc/ or other directories that
> is used to control the power management.

They aren't relevant to this issue.

> Did you test my patch? What it the result?

No, I didn't.  Unfortunately, I won't be able to do any tests on the affected
machine in the near future (most probably for the next two weeks or so).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-19 21:32                             ` Rafael J. Wysocki
@ 2011-05-19 21:57                               ` Jian Peng
  2011-05-19 22:03                                 ` Rafael J. Wysocki
  2011-05-19 22:03                                 ` Rafael J. Wysocki
  0 siblings, 2 replies; 62+ messages in thread
From: Jian Peng @ 2011-05-19 21:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Valdis.Kletnieks, Michael Leun, LKML, linux-ide, Tejun Heo,
	Linux PM mailing list, Jeff Garzik


[-- Attachment #1.1: Type: text/plain, Size: 1214 bytes --]

Hi, Rafael,

I am using internal WDC laptop HDD. I will give SDD a try even though I only
have non-Intel SDD now.

Is your SDD the boot disk (by swapping the built-in HDD out) or the extra
one installed using device bay on Dell laptop?

Thanks,
Jian


2011/5/19 Rafael J. Wysocki <rjw@sisk.pl>

> On Thursday, May 19, 2011, Jian Peng wrote:
> > Hi, Rafael,
> >
> > Yes. I tested without reverting the patch that caused failure on E6410
> (with
> > same AHCI controller), The system is Ubuntu 11.04 with default 2.6.39-rc6
> > kernel (guess this should be same kernel as yours before reverting the
> > patch).
> >
> > My system worked well with AC power cable plugging in and out at boot
> time,
> > at runtime, and during "dd if=/dev/sda5 of=/dev/null ..".
>
> What kind of disk did you use?  Mine is an Intel SSD.
>
> > The config file I mentioned is those under /etc/ or other directories
> that
> > is used to control the power management.
>
> They aren't relevant to this issue.
>
> > Did you test my patch? What it the result?
>
> No, I didn't.  Unfortunately, I won't be able to do any tests on the
> affected
> machine in the near future (most probably for the next two weeks or so).
>
> Thanks,
> Rafael
>

[-- Attachment #1.2: Type: text/html, Size: 1770 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-19 21:57                               ` Jian Peng
  2011-05-19 22:03                                 ` Rafael J. Wysocki
@ 2011-05-19 22:03                                 ` Rafael J. Wysocki
  1 sibling, 0 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-19 22:03 UTC (permalink / raw)
  To: Jian Peng
  Cc: Valdis.Kletnieks, Tejun Heo, Jeff Garzik, Michael Leun,
	linux-ide, Linux PM mailing list, LKML

On Thursday, May 19, 2011, Jian Peng wrote:
> Hi, Rafael,
> 
> I am using internal WDC laptop HDD. I will give SDD a try even though I only
> have non-Intel SDD now.
> 
> Is your SDD the boot disk (by swapping the built-in HDD out)

Yes, it is.

> or the extra one installed using device bay on Dell laptop?

The machine is not a Dell.  It's an Acer.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-19 21:57                               ` Jian Peng
@ 2011-05-19 22:03                                 ` Rafael J. Wysocki
  2011-05-19 22:03                                 ` Rafael J. Wysocki
  1 sibling, 0 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-19 22:03 UTC (permalink / raw)
  To: Jian Peng
  Cc: Valdis.Kletnieks, Michael Leun, LKML, linux-ide, Tejun Heo,
	Linux PM mailing list, Jeff Garzik

On Thursday, May 19, 2011, Jian Peng wrote:
> Hi, Rafael,
> 
> I am using internal WDC laptop HDD. I will give SDD a try even though I only
> have non-Intel SDD now.
> 
> Is your SDD the boot disk (by swapping the built-in HDD out)

Yes, it is.

> or the extra one installed using device bay on Dell laptop?

The machine is not a Dell.  It's an Acer.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-19  0:14                       ` Jian Peng
  2011-05-19 10:19                         ` Rafael J. Wysocki
  2011-05-19 10:19                         ` Rafael J. Wysocki
@ 2011-05-20 15:40                         ` Valdis.Kletnieks
  2011-05-20 15:43                           ` Tejun Heo
  2011-05-20 15:43                           ` Tejun Heo
  2011-05-20 15:40                         ` Valdis.Kletnieks
  3 siblings, 2 replies; 62+ messages in thread
From: Valdis.Kletnieks @ 2011-05-20 15:40 UTC (permalink / raw)
  To: Jian Peng
  Cc: Rafael J. Wysocki, Tejun Heo, Jeff Garzik, Michael Leun,
	linux-ide, Linux PM mailing list, LKML

[-- Attachment #1: Type: text/plain, Size: 629 bytes --]

On Wed, 18 May 2011 17:14:56 PDT, Jian Peng said:

> > @@ -1353,6 +1332,8 @@
> >
> >
> >   ahci_start_engine(ap);
> >
> > + msleep(5);
> > +
> >   if (online)
> >
> >    *class = ahci_dev_classify(ap);
> >

It may very well be that adding a magic msleep(5) here just Makes It Work, but
I have a gut feeling that it's in the wrong place (for starters, 'online' can't change
during the msleep() unless somebody *else* sets it - in which case the locking
is screwed up as we're not forcing a re-read of the value).  The msleep() probably
needs to be before something else further down in the code (but I have no idea
exactly what).

[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-19  0:14                       ` Jian Peng
                                           ` (2 preceding siblings ...)
  2011-05-20 15:40                         ` Valdis.Kletnieks
@ 2011-05-20 15:40                         ` Valdis.Kletnieks
  3 siblings, 0 replies; 62+ messages in thread
From: Valdis.Kletnieks @ 2011-05-20 15:40 UTC (permalink / raw)
  To: Jian Peng
  Cc: Michael Leun, LKML, linux-ide, Tejun Heo, Linux PM mailing list,
	Jeff Garzik


[-- Attachment #1.1: Type: text/plain, Size: 629 bytes --]

On Wed, 18 May 2011 17:14:56 PDT, Jian Peng said:

> > @@ -1353,6 +1332,8 @@
> >
> >
> >   ahci_start_engine(ap);
> >
> > + msleep(5);
> > +
> >   if (online)
> >
> >    *class = ahci_dev_classify(ap);
> >

It may very well be that adding a magic msleep(5) here just Makes It Work, but
I have a gut feeling that it's in the wrong place (for starters, 'online' can't change
during the msleep() unless somebody *else* sets it - in which case the locking
is screwed up as we're not forcing a re-read of the value).  The msleep() probably
needs to be before something else further down in the code (but I have no idea
exactly what).

[-- Attachment #1.2: Type: application/pgp-signature, Size: 227 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-20 15:40                         ` Valdis.Kletnieks
@ 2011-05-20 15:43                           ` Tejun Heo
  2011-05-20 17:02                             ` Jian Peng
                                               ` (2 more replies)
  2011-05-20 15:43                           ` Tejun Heo
  1 sibling, 3 replies; 62+ messages in thread
From: Tejun Heo @ 2011-05-20 15:43 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Jian Peng, Rafael J. Wysocki, Jeff Garzik, Michael Leun,
	linux-ide, Linux PM mailing list, LKML

On Fri, May 20, 2011 at 11:40:20AM -0400, Valdis.Kletnieks@vt.edu wrote:
> On Wed, 18 May 2011 17:14:56 PDT, Jian Peng said:
> 
> > > @@ -1353,6 +1332,8 @@
> > >
> > >
> > >   ahci_start_engine(ap);
> > >
> > > + msleep(5);
> > > +
> > >   if (online)
> > >
> > >    *class = ahci_dev_classify(ap);
> > >
> 
> It may very well be that adding a magic msleep(5) here just Makes It Work, but
> I have a gut feeling that it's in the wrong place (for starters, 'online' can't change
> during the msleep() unless somebody *else* sets it - in which case the locking
> is screwed up as we're not forcing a re-read of the value).  The msleep() probably
> needs to be before something else further down in the code (but I have no idea
> exactly what).

At this point, I think it would be better to simply add a flag and
enable the check for the affected controller.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-20 15:40                         ` Valdis.Kletnieks
  2011-05-20 15:43                           ` Tejun Heo
@ 2011-05-20 15:43                           ` Tejun Heo
  1 sibling, 0 replies; 62+ messages in thread
From: Tejun Heo @ 2011-05-20 15:43 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Jeff Garzik, LKML, linux-ide, Jian Peng, Linux PM mailing list,
	Michael Leun

On Fri, May 20, 2011 at 11:40:20AM -0400, Valdis.Kletnieks@vt.edu wrote:
> On Wed, 18 May 2011 17:14:56 PDT, Jian Peng said:
> 
> > > @@ -1353,6 +1332,8 @@
> > >
> > >
> > >   ahci_start_engine(ap);
> > >
> > > + msleep(5);
> > > +
> > >   if (online)
> > >
> > >    *class = ahci_dev_classify(ap);
> > >
> 
> It may very well be that adding a magic msleep(5) here just Makes It Work, but
> I have a gut feeling that it's in the wrong place (for starters, 'online' can't change
> during the msleep() unless somebody *else* sets it - in which case the locking
> is screwed up as we're not forcing a re-read of the value).  The msleep() probably
> needs to be before something else further down in the code (but I have no idea
> exactly what).

At this point, I think it would be better to simply add a flag and
enable the check for the affected controller.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-20 15:43                           ` Tejun Heo
@ 2011-05-20 17:02                             ` Jian Peng
  2011-05-20 18:25                               ` Valdis.Kletnieks
                                                 ` (3 more replies)
  2011-05-20 18:21                             ` Jian Peng
  2011-05-20 18:21                             ` Jian Peng
  2 siblings, 4 replies; 62+ messages in thread
From: Jian Peng @ 2011-05-20 17:02 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Valdis.Kletnieks, Michael Leun, LKML, linux-ide,
	Linux PM mailing list, Jeff Garzik


[-- Attachment #1.1: Type: text/plain, Size: 2577 bytes --]

Hi, Tejun/Valdis,

Since this is an interoperability issue of SATA host controller, the first
step I want to try it to make sure the tweak that MAKE my controller WORK
does not break other controllers.
You are both right that adding this majic 5ms delay at this place should not
be the final solution.

If this magic 5ms delay works on other affected systems, I plan to post a
new patch that inside ahci_start_engine(), still perform same check, and
show warning message if failed, but will set a flag, then still set START
bit, and if there is such failure flag, add 5ms delay.

Valdis, could you apply the following patch and retest it?

Tejun, please review it.

--- a/drivers/ata/libahci.c 2011-05-18 14:23:36.564665643 -0700
+++ c/drivers/ata/libahci.c 2011-05-20 09:48:06.194663506 -0700
@@ -540,6 +540,7 @@
  void __iomem *port_mmio = ahci_port_base(ap);
  u32 tmp;
  u8 status;
+ int err = 0;

  status = readl(port_mmio + PORT_TFDATA) & 0xFF;

@@ -553,12 +554,12 @@
   * specific controller will fail under this condition
   */
  if (status & (ATA_BUSY | ATA_DRQ))
-  return;
+  err = 1;
  else {
   ahci_scr_read(&ap->link, SCR_STATUS, &tmp);

   if ((tmp & 0xf) != 0x3)
-   return;
+   err = 1;
  }

  /* start DMA */
@@ -566,6 +567,13 @@
  tmp |= PORT_CMD_START;
  writel(tmp, port_mmio + PORT_CMD);
  readl(port_mmio + PORT_CMD); /* flush */
+
+ /* Some controllers need longer time to be ready */
+ if(err) {
+  printk(KERN_WARNING
+   "Controller in wrong state when setting START bit\n");
+  msleep(5);
+ }
 }
 EXPORT_SYMBOL_GPL(ahci_start_engine);




On Fri, May 20, 2011 at 8:43 AM, Tejun Heo <tj@kernel.org> wrote:

>  On Fri, May 20, 2011 at 11:40:20AM -0400, Valdis.Kletnieks@vt.edu wrote:
> > On Wed, 18 May 2011 17:14:56 PDT, Jian Peng said:
> >
> > > > @@ -1353,6 +1332,8 @@
> > > >
> > > >
> > > >   ahci_start_engine(ap);
> > > >
> > > > + msleep(5);
> > > > +
> > > >   if (online)
> > > >
> > > >    *class = ahci_dev_classify(ap);
> > > >
> >
> > It may very well be that adding a magic msleep(5) here just Makes It
> Work, but
> > I have a gut feeling that it's in the wrong place (for starters, 'online'
> can't change
> > during the msleep() unless somebody *else* sets it - in which case the
> locking
> > is screwed up as we're not forcing a re-read of the value).  The msleep()
> probably
> > needs to be before something else further down in the code (but I have no
> idea
> > exactly what).
>
> At this point, I think it would be better to simply add a flag and
> enable the check for the affected controller.
>
> Thanks.
>
> --
> tejun
>

[-- Attachment #1.2: Type: text/html, Size: 3454 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-20 15:43                           ` Tejun Heo
  2011-05-20 17:02                             ` Jian Peng
  2011-05-20 18:21                             ` Jian Peng
@ 2011-05-20 18:21                             ` Jian Peng
  2 siblings, 0 replies; 62+ messages in thread
From: Jian Peng @ 2011-05-20 18:21 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Valdis.Kletnieks, Rafael J. Wysocki, Jeff Garzik, Michael Leun,
	linux-ide, Linux PM mailing list, LKML

Sorry that I need fix my gmail setting to make it show up in LKML.

Hi, Tejun/Valdis,

Since this is an interoperability issue of SATA host controller, the
first step I want to try it to make sure the tweak that MAKE my
controller WORK does not break other controllers.
You are both right that adding this majic 5ms delay at this place
should not be the final solution.

If this magic 5ms delay works on other affected systems, I plan to
post a new patch that inside ahci_start_engine(), still perform same
check, and show warning message if failed, but will set a flag, then
still set START bit, and if there is such failure flag, add 5ms delay.

Valdis, could you apply the following patch and retest it?

Tejun, please review it.

--- a/drivers/ata/libahci.c 2011-05-18 14:23:36.564665643 -0700

+++ c/drivers/ata/libahci.c 2011-05-20 09:48:06.194663506 -0700
@@ -540,6 +540,7 @@

  void __iomem *port_mmio = ahci_port_base(ap);
  u32 tmp;
  u8 status;

+ int err = 0;


  status = readl(port_mmio + PORT_TFDATA) & 0xFF;


@@ -553,12 +554,12 @@

   * specific controller will fail under this condition
   */
  if (status & (ATA_BUSY | ATA_DRQ))
-  return;

+  err = 1;

  else {
   ahci_scr_read(&ap->link, SCR_STATUS, &tmp);

   if ((tmp & 0xf) != 0x3)
-   return;

+   err = 1;
  }

  /* start DMA */
@@ -566,6 +567,13 @@
  tmp |= PORT_CMD_START;
  writel(tmp, port_mmio + PORT_CMD);
  readl(port_mmio + PORT_CMD); /* flush */
+
+ /* Some controllers need longer time to be ready */
+ if(err) {
+  printk(KERN_WARNING
+   "Controller in wrong state when setting START bit\n");
+  msleep(5);
+ }
 }
 EXPORT_SYMBOL_GPL(ahci_start_engine);

On Fri, May 20, 2011 at 8:43 AM, Tejun Heo <tj@kernel.org> wrote:
>
> On Fri, May 20, 2011 at 11:40:20AM -0400, Valdis.Kletnieks@vt.edu wrote:
> > On Wed, 18 May 2011 17:14:56 PDT, Jian Peng said:
> >
> > > > @@ -1353,6 +1332,8 @@
> > > >
> > > >
> > > >   ahci_start_engine(ap);
> > > >
> > > > + msleep(5);
> > > > +
> > > >   if (online)
> > > >
> > > >    *class = ahci_dev_classify(ap);
> > > >
> >
> > It may very well be that adding a magic msleep(5) here just Makes It Work, but
> > I have a gut feeling that it's in the wrong place (for starters, 'online' can't change
> > during the msleep() unless somebody *else* sets it - in which case the locking
> > is screwed up as we're not forcing a re-read of the value).  The msleep() probably
> > needs to be before something else further down in the code (but I have no idea
> > exactly what).
>
> At this point, I think it would be better to simply add a flag and
> enable the check for the affected controller.
>
> Thanks.
>
> --
> tejun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-20 15:43                           ` Tejun Heo
  2011-05-20 17:02                             ` Jian Peng
@ 2011-05-20 18:21                             ` Jian Peng
  2011-05-20 18:21                             ` Jian Peng
  2 siblings, 0 replies; 62+ messages in thread
From: Jian Peng @ 2011-05-20 18:21 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Valdis.Kletnieks, Michael Leun, LKML, linux-ide,
	Linux PM mailing list, Jeff Garzik

Sorry that I need fix my gmail setting to make it show up in LKML.

Hi, Tejun/Valdis,

Since this is an interoperability issue of SATA host controller, the
first step I want to try it to make sure the tweak that MAKE my
controller WORK does not break other controllers.
You are both right that adding this majic 5ms delay at this place
should not be the final solution.

If this magic 5ms delay works on other affected systems, I plan to
post a new patch that inside ahci_start_engine(), still perform same
check, and show warning message if failed, but will set a flag, then
still set START bit, and if there is such failure flag, add 5ms delay.

Valdis, could you apply the following patch and retest it?

Tejun, please review it.

--- a/drivers/ata/libahci.c 2011-05-18 14:23:36.564665643 -0700

+++ c/drivers/ata/libahci.c 2011-05-20 09:48:06.194663506 -0700
@@ -540,6 +540,7 @@

  void __iomem *port_mmio = ahci_port_base(ap);
  u32 tmp;
  u8 status;

+ int err = 0;


  status = readl(port_mmio + PORT_TFDATA) & 0xFF;


@@ -553,12 +554,12 @@

   * specific controller will fail under this condition
   */
  if (status & (ATA_BUSY | ATA_DRQ))
-  return;

+  err = 1;

  else {
   ahci_scr_read(&ap->link, SCR_STATUS, &tmp);

   if ((tmp & 0xf) != 0x3)
-   return;

+   err = 1;
  }

  /* start DMA */
@@ -566,6 +567,13 @@
  tmp |= PORT_CMD_START;
  writel(tmp, port_mmio + PORT_CMD);
  readl(port_mmio + PORT_CMD); /* flush */
+
+ /* Some controllers need longer time to be ready */
+ if(err) {
+  printk(KERN_WARNING
+   "Controller in wrong state when setting START bit\n");
+  msleep(5);
+ }
 }
 EXPORT_SYMBOL_GPL(ahci_start_engine);

On Fri, May 20, 2011 at 8:43 AM, Tejun Heo <tj@kernel.org> wrote:
>
> On Fri, May 20, 2011 at 11:40:20AM -0400, Valdis.Kletnieks@vt.edu wrote:
> > On Wed, 18 May 2011 17:14:56 PDT, Jian Peng said:
> >
> > > > @@ -1353,6 +1332,8 @@
> > > >
> > > >
> > > >   ahci_start_engine(ap);
> > > >
> > > > + msleep(5);
> > > > +
> > > >   if (online)
> > > >
> > > >    *class = ahci_dev_classify(ap);
> > > >
> >
> > It may very well be that adding a magic msleep(5) here just Makes It Work, but
> > I have a gut feeling that it's in the wrong place (for starters, 'online' can't change
> > during the msleep() unless somebody *else* sets it - in which case the locking
> > is screwed up as we're not forcing a re-read of the value).  The msleep() probably
> > needs to be before something else further down in the code (but I have no idea
> > exactly what).
>
> At this point, I think it would be better to simply add a flag and
> enable the check for the affected controller.
>
> Thanks.
>
> --
> tejun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-20 17:02                             ` Jian Peng
  2011-05-20 18:25                               ` Valdis.Kletnieks
@ 2011-05-20 18:25                               ` Valdis.Kletnieks
  2011-05-22  2:00                                 ` Jian Peng
  2011-05-22  2:00                                 ` Jian Peng
  2011-05-23 12:13                               ` Tejun Heo
  2011-05-23 12:13                               ` Tejun Heo
  3 siblings, 2 replies; 62+ messages in thread
From: Valdis.Kletnieks @ 2011-05-20 18:25 UTC (permalink / raw)
  To: Jian Peng
  Cc: Tejun Heo, Rafael J. Wysocki, Jeff Garzik, Michael Leun,
	linux-ide, Linux PM mailing list, LKML

[-- Attachment #1: Type: text/plain, Size: 699 bytes --]

On Fri, 20 May 2011 10:02:56 PDT, Jian Peng said:
> --20cf307f38f6d842a904a3b81730

> You are both right that adding this majic 5ms delay at this place should not
> be the final solution.
> 
> If this magic 5ms delay works on other affected systems, I plan to post a
> new patch that inside ahci_start_engine(), still perform same check, and
> show warning message if failed, but will set a flag, then still set START
> bit, and if there is such failure flag, add 5ms delay.
> 
> Valdis, could you apply the following patch and retest it?

I should be able to do that this weekend.  To clarify - should this be with the
problem commit 270dac35c26433d06a89150c51e75ca0181ca7e4 applied, or reverted?


[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-20 17:02                             ` Jian Peng
@ 2011-05-20 18:25                               ` Valdis.Kletnieks
  2011-05-20 18:25                               ` Valdis.Kletnieks
                                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 62+ messages in thread
From: Valdis.Kletnieks @ 2011-05-20 18:25 UTC (permalink / raw)
  To: Jian Peng
  Cc: Michael Leun, LKML, linux-ide, Tejun Heo, Linux PM mailing list,
	Jeff Garzik


[-- Attachment #1.1: Type: text/plain, Size: 699 bytes --]

On Fri, 20 May 2011 10:02:56 PDT, Jian Peng said:
> --20cf307f38f6d842a904a3b81730

> You are both right that adding this majic 5ms delay at this place should not
> be the final solution.
> 
> If this magic 5ms delay works on other affected systems, I plan to post a
> new patch that inside ahci_start_engine(), still perform same check, and
> show warning message if failed, but will set a flag, then still set START
> bit, and if there is such failure flag, add 5ms delay.
> 
> Valdis, could you apply the following patch and retest it?

I should be able to do that this weekend.  To clarify - should this be with the
problem commit 270dac35c26433d06a89150c51e75ca0181ca7e4 applied, or reverted?


[-- Attachment #1.2: Type: application/pgp-signature, Size: 227 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-20 18:25                               ` Valdis.Kletnieks
  2011-05-22  2:00                                 ` Jian Peng
@ 2011-05-22  2:00                                 ` Jian Peng
  1 sibling, 0 replies; 62+ messages in thread
From: Jian Peng @ 2011-05-22  2:00 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Tejun Heo, Rafael J. Wysocki, Jeff Garzik, Michael Leun,
	linux-ide, Linux PM mailing list, LKML

HI, Valdis,

This patch is on top of reverted patch 81ca7e4. So you should not
revert 81ca7e4 before applying this new one.

Best regards,
Jian

On Fri, May 20, 2011 at 11:25 AM, <Valdis.Kletnieks@vt.edu> wrote:
>
> On Fri, 20 May 2011 10:02:56 PDT, Jian Peng said:
> > --20cf307f38f6d842a904a3b81730
>
> > You are both right that adding this majic 5ms delay at this place should not
> > be the final solution.
> >
> > If this magic 5ms delay works on other affected systems, I plan to post a
> > new patch that inside ahci_start_engine(), still perform same check, and
> > show warning message if failed, but will set a flag, then still set START
> > bit, and if there is such failure flag, add 5ms delay.
> >
> > Valdis, could you apply the following patch and retest it?
>
> I should be able to do that this weekend.  To clarify - should this be with the
> problem commit 270dac35c26433d06a89150c51e75ca0181ca7e4 applied, or reverted?
>

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-20 18:25                               ` Valdis.Kletnieks
@ 2011-05-22  2:00                                 ` Jian Peng
  2011-05-22  2:00                                 ` Jian Peng
  1 sibling, 0 replies; 62+ messages in thread
From: Jian Peng @ 2011-05-22  2:00 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Michael Leun, LKML, linux-ide, Tejun Heo, Linux PM mailing list,
	Jeff Garzik

HI, Valdis,

This patch is on top of reverted patch 81ca7e4. So you should not
revert 81ca7e4 before applying this new one.

Best regards,
Jian

On Fri, May 20, 2011 at 11:25 AM, <Valdis.Kletnieks@vt.edu> wrote:
>
> On Fri, 20 May 2011 10:02:56 PDT, Jian Peng said:
> > --20cf307f38f6d842a904a3b81730
>
> > You are both right that adding this majic 5ms delay at this place should not
> > be the final solution.
> >
> > If this magic 5ms delay works on other affected systems, I plan to post a
> > new patch that inside ahci_start_engine(), still perform same check, and
> > show warning message if failed, but will set a flag, then still set START
> > bit, and if there is such failure flag, add 5ms delay.
> >
> > Valdis, could you apply the following patch and retest it?
>
> I should be able to do that this weekend.  To clarify - should this be with the
> problem commit 270dac35c26433d06a89150c51e75ca0181ca7e4 applied, or reverted?
>

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-20 17:02                             ` Jian Peng
                                                 ` (2 preceding siblings ...)
  2011-05-23 12:13                               ` Tejun Heo
@ 2011-05-23 12:13                               ` Tejun Heo
  2011-05-24  1:04                                 ` Jian Peng
  2011-05-24  1:04                                 ` Jian Peng
  3 siblings, 2 replies; 62+ messages in thread
From: Tejun Heo @ 2011-05-23 12:13 UTC (permalink / raw)
  To: Jian Peng
  Cc: Valdis.Kletnieks, Rafael J. Wysocki, Jeff Garzik, Michael Leun,
	linux-ide, Linux PM mailing list, LKML

Hello,

On Fri, May 20, 2011 at 10:02:56AM -0700, Jian Peng wrote:
> Hi, Tejun/Valdis,
> 
> Since this is an interoperability issue of SATA host controller, the first
> step I want to try it to make sure the tweak that MAKE my controller WORK
> does not break other controllers.
> You are both right that adding this majic 5ms delay at this place should not
> be the final solution.
> 
> If this magic 5ms delay works on other affected systems, I plan to post a
> new patch that inside ahci_start_engine(), still perform same check, and
> show warning message if failed, but will set a flag, then still set START
> bit, and if there is such failure flag, add 5ms delay.

Yeah, sounds like a plan but please add ample comment explaining
what's going on for which controller including link to the mailing
list threads.  As we're basically adding a black magic, I would still
like to enable it only for the affected controllers so that we don't
have to worry whether there are controllers which are affected by this
problem but we don't know about.

> --- a/drivers/ata/libahci.c 2011-05-18 14:23:36.564665643 -0700
> +++ c/drivers/ata/libahci.c 2011-05-20 09:48:06.194663506 -0700
> @@ -540,6 +540,7 @@
>   void __iomem *port_mmio = ahci_port_base(ap);
>   u32 tmp;
>   u8 status;
> + int err = 0;

I think

 bool stat_failed = false;

would be more in line with recent coding style.

>   status = readl(port_mmio + PORT_TFDATA) & 0xFF;
> 
> @@ -553,12 +554,12 @@
>    * specific controller will fail under this condition
>    */
>   if (status & (ATA_BUSY | ATA_DRQ))
> -  return;
> +  err = 1;
>   else {
>    ahci_scr_read(&ap->link, SCR_STATUS, &tmp);
> 
>    if ((tmp & 0xf) != 0x3)
> -   return;
> +   err = 1;
>   }
> 
>   /* start DMA */
> @@ -566,6 +567,13 @@
>   tmp |= PORT_CMD_START;
>   writel(tmp, port_mmio + PORT_CMD);
>   readl(port_mmio + PORT_CMD); /* flush */
> +
> + /* Some controllers need longer time to be ready */
> + if(err) {
> +  printk(KERN_WARNING
> +   "Controller in wrong state when setting START bit\n");
> +  msleep(5);

 ata_port_printk()?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-20 17:02                             ` Jian Peng
  2011-05-20 18:25                               ` Valdis.Kletnieks
  2011-05-20 18:25                               ` Valdis.Kletnieks
@ 2011-05-23 12:13                               ` Tejun Heo
  2011-05-23 12:13                               ` Tejun Heo
  3 siblings, 0 replies; 62+ messages in thread
From: Tejun Heo @ 2011-05-23 12:13 UTC (permalink / raw)
  To: Jian Peng
  Cc: Valdis.Kletnieks, Michael Leun, LKML, linux-ide,
	Linux PM mailing list, Jeff Garzik

Hello,

On Fri, May 20, 2011 at 10:02:56AM -0700, Jian Peng wrote:
> Hi, Tejun/Valdis,
> 
> Since this is an interoperability issue of SATA host controller, the first
> step I want to try it to make sure the tweak that MAKE my controller WORK
> does not break other controllers.
> You are both right that adding this majic 5ms delay at this place should not
> be the final solution.
> 
> If this magic 5ms delay works on other affected systems, I plan to post a
> new patch that inside ahci_start_engine(), still perform same check, and
> show warning message if failed, but will set a flag, then still set START
> bit, and if there is such failure flag, add 5ms delay.

Yeah, sounds like a plan but please add ample comment explaining
what's going on for which controller including link to the mailing
list threads.  As we're basically adding a black magic, I would still
like to enable it only for the affected controllers so that we don't
have to worry whether there are controllers which are affected by this
problem but we don't know about.

> --- a/drivers/ata/libahci.c 2011-05-18 14:23:36.564665643 -0700
> +++ c/drivers/ata/libahci.c 2011-05-20 09:48:06.194663506 -0700
> @@ -540,6 +540,7 @@
>   void __iomem *port_mmio = ahci_port_base(ap);
>   u32 tmp;
>   u8 status;
> + int err = 0;

I think

 bool stat_failed = false;

would be more in line with recent coding style.

>   status = readl(port_mmio + PORT_TFDATA) & 0xFF;
> 
> @@ -553,12 +554,12 @@
>    * specific controller will fail under this condition
>    */
>   if (status & (ATA_BUSY | ATA_DRQ))
> -  return;
> +  err = 1;
>   else {
>    ahci_scr_read(&ap->link, SCR_STATUS, &tmp);
> 
>    if ((tmp & 0xf) != 0x3)
> -   return;
> +   err = 1;
>   }
> 
>   /* start DMA */
> @@ -566,6 +567,13 @@
>   tmp |= PORT_CMD_START;
>   writel(tmp, port_mmio + PORT_CMD);
>   readl(port_mmio + PORT_CMD); /* flush */
> +
> + /* Some controllers need longer time to be ready */
> + if(err) {
> +  printk(KERN_WARNING
> +   "Controller in wrong state when setting START bit\n");
> +  msleep(5);

 ata_port_printk()?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-23 12:13                               ` Tejun Heo
  2011-05-24  1:04                                 ` Jian Peng
@ 2011-05-24  1:04                                 ` Jian Peng
  1 sibling, 0 replies; 62+ messages in thread
From: Jian Peng @ 2011-05-24  1:04 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Valdis.Kletnieks, Rafael J. Wysocki, Jeff Garzik, Michael Leun,
	linux-ide, Linux PM mailing list, LKML

Hi, Tejun,

Defintely. I will clarify it as much as possible and meanwhile, wait
for it to be tested by reporters.

Thanks,
Jian

On Mon, May 23, 2011 at 5:13 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Fri, May 20, 2011 at 10:02:56AM -0700, Jian Peng wrote:
>> Hi, Tejun/Valdis,
>>
>> Since this is an interoperability issue of SATA host controller, the first
>> step I want to try it to make sure the tweak that MAKE my controller WORK
>> does not break other controllers.
>> You are both right that adding this majic 5ms delay at this place should not
>> be the final solution.
>>
>> If this magic 5ms delay works on other affected systems, I plan to post a
>> new patch that inside ahci_start_engine(), still perform same check, and
>> show warning message if failed, but will set a flag, then still set START
>> bit, and if there is such failure flag, add 5ms delay.
>
> Yeah, sounds like a plan but please add ample comment explaining
> what's going on for which controller including link to the mailing
> list threads.  As we're basically adding a black magic, I would still
> like to enable it only for the affected controllers so that we don't
> have to worry whether there are controllers which are affected by this
> problem but we don't know about.
>
>> --- a/drivers/ata/libahci.c 2011-05-18 14:23:36.564665643 -0700
>> +++ c/drivers/ata/libahci.c 2011-05-20 09:48:06.194663506 -0700
>> @@ -540,6 +540,7 @@
>>   void __iomem *port_mmio = ahci_port_base(ap);
>>   u32 tmp;
>>   u8 status;
>> + int err = 0;
>
> I think
>
>  bool stat_failed = false;
>
> would be more in line with recent coding style.
>
>>   status = readl(port_mmio + PORT_TFDATA) & 0xFF;
>>
>> @@ -553,12 +554,12 @@
>>    * specific controller will fail under this condition
>>    */
>>   if (status & (ATA_BUSY | ATA_DRQ))
>> -  return;
>> +  err = 1;
>>   else {
>>    ahci_scr_read(&ap->link, SCR_STATUS, &tmp);
>>
>>    if ((tmp & 0xf) != 0x3)
>> -   return;
>> +   err = 1;
>>   }
>>
>>   /* start DMA */
>> @@ -566,6 +567,13 @@
>>   tmp |= PORT_CMD_START;
>>   writel(tmp, port_mmio + PORT_CMD);
>>   readl(port_mmio + PORT_CMD); /* flush */
>> +
>> + /* Some controllers need longer time to be ready */
>> + if(err) {
>> +  printk(KERN_WARNING
>> +   "Controller in wrong state when setting START bit\n");
>> +  msleep(5);
>
>  ata_port_printk()?
>
> Thanks.
>
> --
> tejun
>

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec"
  2011-05-23 12:13                               ` Tejun Heo
@ 2011-05-24  1:04                                 ` Jian Peng
  2011-05-24  1:04                                 ` Jian Peng
  1 sibling, 0 replies; 62+ messages in thread
From: Jian Peng @ 2011-05-24  1:04 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Valdis.Kletnieks, Michael Leun, LKML, linux-ide,
	Linux PM mailing list, Jeff Garzik

Hi, Tejun,

Defintely. I will clarify it as much as possible and meanwhile, wait
for it to be tested by reporters.

Thanks,
Jian

On Mon, May 23, 2011 at 5:13 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Fri, May 20, 2011 at 10:02:56AM -0700, Jian Peng wrote:
>> Hi, Tejun/Valdis,
>>
>> Since this is an interoperability issue of SATA host controller, the first
>> step I want to try it to make sure the tweak that MAKE my controller WORK
>> does not break other controllers.
>> You are both right that adding this majic 5ms delay at this place should not
>> be the final solution.
>>
>> If this magic 5ms delay works on other affected systems, I plan to post a
>> new patch that inside ahci_start_engine(), still perform same check, and
>> show warning message if failed, but will set a flag, then still set START
>> bit, and if there is such failure flag, add 5ms delay.
>
> Yeah, sounds like a plan but please add ample comment explaining
> what's going on for which controller including link to the mailing
> list threads.  As we're basically adding a black magic, I would still
> like to enable it only for the affected controllers so that we don't
> have to worry whether there are controllers which are affected by this
> problem but we don't know about.
>
>> --- a/drivers/ata/libahci.c 2011-05-18 14:23:36.564665643 -0700
>> +++ c/drivers/ata/libahci.c 2011-05-20 09:48:06.194663506 -0700
>> @@ -540,6 +540,7 @@
>>   void __iomem *port_mmio = ahci_port_base(ap);
>>   u32 tmp;
>>   u8 status;
>> + int err = 0;
>
> I think
>
>  bool stat_failed = false;
>
> would be more in line with recent coding style.
>
>>   status = readl(port_mmio + PORT_TFDATA) & 0xFF;
>>
>> @@ -553,12 +554,12 @@
>>    * specific controller will fail under this condition
>>    */
>>   if (status & (ATA_BUSY | ATA_DRQ))
>> -  return;
>> +  err = 1;
>>   else {
>>    ahci_scr_read(&ap->link, SCR_STATUS, &tmp);
>>
>>    if ((tmp & 0xf) != 0x3)
>> -   return;
>> +   err = 1;
>>   }
>>
>>   /* start DMA */
>> @@ -566,6 +567,13 @@
>>   tmp |= PORT_CMD_START;
>>   writel(tmp, port_mmio + PORT_CMD);
>>   readl(port_mmio + PORT_CMD); /* flush */
>> +
>> + /* Some controllers need longer time to be ready */
>> + if(err) {
>> +  printk(KERN_WARNING
>> +   "Controller in wrong state when setting START bit\n");
>> +  msleep(5);
>
>  ata_port_printk()?
>
> Thanks.
>
> --
> tejun
>

^ permalink raw reply	[flat|nested] 62+ messages in thread

* AHCI driver problem on SB700/SB800 w/ Acer Ferrari One
@ 2011-05-11 22:25 Rafael J. Wysocki
  0 siblings, 0 replies; 62+ messages in thread
From: Rafael J. Wysocki @ 2011-05-11 22:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, Linux PM mailing list, LKML

Hi,

There seems to be a problem on my Acer box which is related to DIPM
and switching the power source.  Namely, when I detach the AC adapter from
the machine, the disk (which is an Intel SSD) freezes for a while and
something like this appears in dmesg:

[ 2177.808120] ata1.00: exception Emask 0x0 SAct 0x7 SErr 0xd0002 action 0x6 frozen
[ 2177.808140] ata1: SError: { RecovComm PHYRdyChg CommWake 10B8B }
[ 2177.808153] ata1.00: failed command: WRITE FPDMA QUEUED
[ 2177.808171] ata1.00: cmd 61/08:00:08:ed:37/00:00:06:00:00/40 tag 0 ncq 4096 out
[ 2177.808174]          res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 2177.808192] ata1.00: status: { DRDY }
[ 2177.808201] ata1.00: failed command: WRITE FPDMA QUEUED
[ 2177.808218] ata1.00: cmd 61/08:08:48:f4:c7/00:00:05:00:00/40 tag 1 ncq 4096 out
[ 2177.808220]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 2177.808239] ata1.00: status: { DRDY }
[ 2177.808247] ata1.00: failed command: WRITE FPDMA QUEUED
[ 2177.808275] ata1.00: cmd 61/08:10:50:f4:c7/00:00:05:00:00/40 tag 2 ncq 4096 out
[ 2177.808277]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 2177.808303] ata1.00: status: { DRDY }
[ 2177.808320] ata1: hard resetting link
[ 2187.812149] ata1: softreset failed (1st FIS failed)
[ 2187.812227] ata1: hard resetting link
[ 2188.304128] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 2188.304774] ata1.00: configured for UDMA/133
[ 2188.320333] ata1.00: device reported invalid CHS sector 0
[ 2188.320361] ata1.00: device reported invalid CHS sector 0
[ 2188.320394] ata1: EH complete

Then, when the AC adapter is attached again, the SSD freezes again and this
appears in dmesg:

[178523.064145] ata1.00: qc timeout (cmd 0xef)
[178523.064228] ata1.00: failed to disable DIPM, Emask 0x4
[178523.064248] ata1: hard resetting link
[178523.556096] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[178523.556793] ata1.00: configured for UDMA/133
[178523.572051] ata1: EH complete

That seems to be reproducible 100% of the time.  Moreover, if the box is
suspended on AC power and then resume on battery power, something more serious
happens:

[178881.824160] ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0xd0002 action 0x6 frozen
[178881.824203] ata1: SError: { RecovComm PHYRdyChg CommWake 10B8B }
[178881.824223] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824249] ata1.00: cmd 61/08:00:00:d0:9e/00:00:01:00:00/40 tag 0 ncq 4096 out
[178881.824252]          res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824288] ata1.00: status: { DRDY }
[178881.824304] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824328] ata1.00: cmd 61/08:08:48:d0:9e/00:00:01:00:00/40 tag 1 ncq 4096 out
[178881.824331]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824366] ata1.00: status: { DRDY }
[178881.824381] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824405] ata1.00: cmd 61/08:10:00:d2:9e/00:00:01:00:00/40 tag 2 ncq 4096 out
[178881.824408]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824443] ata1.00: status: { DRDY }
[178881.824457] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824481] ata1.00: cmd 61/08:18:20:d3:9e/00:00:01:00:00/40 tag 3 ncq 4096 out
[178881.824484]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824519] ata1.00: status: { DRDY }
[178881.824534] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824557] ata1.00: cmd 61/10:20:90:da:9e/00:00:01:00:00/40 tag 4 ncq 8192 out
[178881.824560]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824595] ata1.00: status: { DRDY }
[178881.824610] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824633] ata1.00: cmd 61/08:28:c0:5a:6c/00:00:01:00:00/40 tag 5 ncq 4096 out
[178881.824637]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824671] ata1.00: status: { DRDY }
[178881.824710] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824731] ata1.00: cmd 61/10:30:48:5f:98/00:00:01:00:00/40 tag 6 ncq 8192 out
[178881.824734]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824760] ata1.00: status: { DRDY }
[178881.824773] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824794] ata1.00: cmd 61/08:38:e0:1a:a4/00:00:01:00:00/40 tag 7 ncq 4096 out
[178881.824797]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824823] ata1.00: status: { DRDY }
[178881.824836] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824856] ata1.00: cmd 61/08:40:90:29:ac/00:00:01:00:00/40 tag 8 ncq 4096 out
[178881.824859]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824885] ata1.00: status: { DRDY }
[178881.824898] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824919] ata1.00: cmd 61/08:48:a0:33:a4/00:00:01:00:00/40 tag 9 ncq 4096 out
[178881.824922]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.824949] ata1.00: status: { DRDY }
[178881.824961] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.824982] ata1.00: cmd 61/08:50:c8:51:ac/00:00:01:00:00/40 tag 10 ncq 4096 out
[178881.824985]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825012] ata1.00: status: { DRDY }
[178881.825025] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825046] ata1.00: cmd 61/10:58:00:50:96/00:00:01:00:00/40 tag 11 ncq 8192 out
[178881.825049]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825075] ata1.00: status: { DRDY }
[178881.825087] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825108] ata1.00: cmd 61/08:60:20:50:96/00:00:01:00:00/40 tag 12 ncq 4096 out
[178881.825111]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825137] ata1.00: status: { DRDY }
[178881.825149] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825170] ata1.00: cmd 61/08:68:40:50:96/00:00:01:00:00/40 tag 13 ncq 4096 out
[178881.825173]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825199] ata1.00: status: { DRDY }
[178881.825212] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825232] ata1.00: cmd 61/10:70:e0:4c:94/00:00:01:00:00/40 tag 14 ncq 8192 out
[178881.825235]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825261] ata1.00: status: { DRDY }
[178881.825273] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825294] ata1.00: cmd 61/10:78:80:55:7b/00:00:01:00:00/40 tag 15 ncq 8192 out
[178881.825297]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825323] ata1.00: status: { DRDY }
[178881.825335] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825356] ata1.00: cmd 61/08:80:68:60:64/00:00:01:00:00/40 tag 16 ncq 4096 out
[178881.825359]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825385] ata1.00: status: { DRDY }
[178881.825398] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825419] ata1.00: cmd 61/08:88:c8:5a:6c/00:00:01:00:00/40 tag 17 ncq 4096 out
[178881.825422]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825449] ata1.00: status: { DRDY }
[178881.825461] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825461] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825482] ata1.00: cmd 61/08:90:b8:b6:64/00:00:01:00:00/40 tag 18 ncq 4096 out
[178881.825485]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825512] ata1.00: status: { DRDY }
[178881.825524] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825545] ata1.00: cmd 61/10:98:a8:a2:6d/00:00:01:00:00/40 tag 19 ncq 8192 out
[178881.825548]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825574] ata1.00: status: { DRDY }
[178881.825587] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825608] ata1.00: cmd 61/08:a0:58:f6:76/00:00:01:00:00/40 tag 20 ncq 4096 out
[178881.825611]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825637] ata1.00: status: { DRDY }
[178881.825649] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825670] ata1.00: cmd 61/08:a8:e0:28:3b/00:00:06:00:00/40 tag 21 ncq 4096 out
[178881.825673]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825699] ata1.00: status: { DRDY }
[178881.825712] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825732] ata1.00: cmd 61/48:b0:c0:1e:a5/01:00:04:00:00/40 tag 22 ncq 167936 out
[178881.825735]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825761] ata1.00: status: { DRDY }
[178881.825774] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825794] ata1.00: cmd 61/08:b8:78:c1:37/00:00:06:00:00/40 tag 23 ncq 4096 out
[178881.825798]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825824] ata1.00: status: { DRDY }
[178881.825836] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825857] ata1.00: cmd 61/18:c0:98:c1:37/00:00:06:00:00/40 tag 24 ncq 12288 out
[178881.825860]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825886] ata1.00: status: { DRDY }
[178881.825899] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825919] ata1.00: cmd 61/08:c8:10:c1:37/00:00:06:00:00/40 tag 25 ncq 4096 out
[178881.825923]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.825949] ata1.00: status: { DRDY }
[178881.825961] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.825982] ata1.00: cmd 61/10:d0:30:c1:37/00:00:06:00:00/40 tag 26 ncq 8192 out
[178881.825985]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.826011] ata1.00: status: { DRDY }
[178881.826024] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.826045] ata1.00: cmd 61/18:d8:50:c1:37/00:00:06:00:00/40 tag 27 ncq 12288 out
[178881.826048]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.826074] ata1.00: status: { DRDY }
[178881.826086] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.826107] ata1.00: cmd 61/08:e0:d8:a2:35/00:00:06:00:00/40 tag 28 ncq 4096 out
[178881.826110]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.826137] ata1.00: status: { DRDY }
[178881.826149] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.826170] ata1.00: cmd 61/08:e8:d0:a3:35/00:00:06:00:00/40 tag 29 ncq 4096 out
[178881.826173]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.826199] ata1.00: status: { DRDY }
[178881.826211] ata1.00: failed command: WRITE FPDMA QUEUED
[178881.826232] ata1.00: cmd 61/08:f0:50:e5:64/00:00:06:00:00/40 tag 30 ncq 4096 out
[178881.826235]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[178881.826261] ata1.00: status: { DRDY }
[178881.826278] ata1: hard resetting link
[178891.824149] ata1: softreset failed (1st FIS failed)
[178891.824188] ata1: hard resetting link
[178892.320170] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[178892.320874] ata1.00: configured for UDMA/133
[178892.336156] ata1.00: device reported invalid CHS sector 0
[178892.336199] ata1.00: device reported invalid CHS sector 0
[178892.336218] ata1.00: device reported invalid CHS sector 0
[178892.336236] ata1.00: device reported invalid CHS sector 0
[178892.336254] ata1.00: device reported invalid CHS sector 0
[178892.336272] ata1.00: device reported invalid CHS sector 0
[178892.336290] ata1.00: device reported invalid CHS sector 0
[178892.336308] ata1.00: device reported invalid CHS sector 0
[178892.336327] ata1.00: device reported invalid CHS sector 0
[178892.336345] ata1.00: device reported invalid CHS sector 0
[178892.336363] ata1.00: device reported invalid CHS sector 0
[178892.336381] ata1.00: device reported invalid CHS sector 0
[178892.336399] ata1.00: device reported invalid CHS sector 0
[178892.336418] ata1.00: device reported invalid CHS sector 0
[178892.336436] ata1.00: device reported invalid CHS sector 0
[178892.336455] ata1.00: device reported invalid CHS sector 0
[178892.336474] ata1.00: device reported invalid CHS sector 0
[178892.336492] ata1.00: device reported invalid CHS sector 0
[178892.336512] ata1.00: device reported invalid CHS sector 0
[178892.336530] ata1.00: device reported invalid CHS sector 0
[178892.336548] ata1.00: device reported invalid CHS sector 0
[178892.336567] ata1.00: device reported invalid CHS sector 0
[178892.336586] ata1.00: device reported invalid CHS sector 0
[178892.336604] ata1.00: device reported invalid CHS sector 0
[178892.336622] ata1.00: device reported invalid CHS sector 0
[178892.336640] ata1.00: device reported invalid CHS sector 0
[178892.336658] ata1.00: device reported invalid CHS sector 0
[178892.336677] ata1.00: device reported invalid CHS sector 0
[178892.336694] ata1.00: device reported invalid CHS sector 0
[178892.336712] ata1.00: device reported invalid CHS sector 0
[178892.336731] ata1.00: device reported invalid CHS sector 0
[178892.336837] ata1: EH complete

This doesn't happen if the box is both suspended and resumed on battery
power.

I have verified that the issue described above doesn't happen with
the appended patch applied.

The hardware is:

00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [AHCI mode] (prog-if 01 [AHCI 1.0])
        Subsystem: Acer Incorporated [ALI] Device 029e
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 64
        Interrupt: pin A routed to IRQ 22
        Region 0: I/O ports at 8420 [size=8]
        Region 1: I/O ports at 8414 [size=4]
        Region 2: I/O ports at 8418 [size=8]
        Region 3: I/O ports at 8410 [size=4]
        Region 4: I/O ports at 8400 [size=16]
        Region 5: Memory at f0508000 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [60] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [70] SATA HBA v1.0 InCfgSpace
        Kernel driver in use: ahci

[    3.336087] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    3.340777] ata1.00: ATA-7: INTEL SSDSA2M080G2GC, 2CV102HD, max UDMA/133
[    3.345233] ata1.00: 156301488 sectors, multi 16: LBA48 NCQ (depth 31/32)
[    3.350070] ata1.00: configured for UDMA/133
[    3.354969] scsi 0:0:0:0: Direct-Access     ATA      INTEL SSDSA2M080 2CV1 PQ: 0 ANSI: 5

Please tell me if there's anything I can do as a permanent workaround.

Thanks,
Rafael


---
 drivers/ata/ahci.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/drivers/ata/ahci.c
===================================================================
--- linux-2.6.orig/drivers/ata/ahci.c
+++ linux-2.6/drivers/ata/ahci.c
@@ -193,7 +193,7 @@ static const struct ata_port_info ahci_p
 	[board_ahci_sb700] =	/* for SB700 and SB800 */
 	{
 		AHCI_HFLAGS	(AHCI_HFLAG_IGN_SERR_INTERNAL),
-		.flags		= AHCI_FLAG_COMMON,
+		.flags		= AHCI_FLAG_COMMON | ATA_FLAG_NO_DIPM,
 		.pio_mask	= ATA_PIO4,
 		.udma_mask	= ATA_UDMA6,
 		.port_ops	= &ahci_sb600_ops,

^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2011-05-24  1:04 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-11 22:25 AHCI driver problem on SB700/SB800 w/ Acer Ferrari One Rafael J. Wysocki
2011-05-13 12:37 ` AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One) Michael Leun
2011-05-13 12:37 ` Michael Leun
2011-05-13 16:56   ` Rafael J. Wysocki
2011-05-13 16:56   ` Rafael J. Wysocki
2011-05-13 17:39     ` Tejun Heo
2011-05-13 19:54       ` Michael Leun
2011-05-13 19:54       ` Michael Leun
2011-05-13 20:20         ` Rafael J. Wysocki
2011-05-13 20:20         ` Rafael J. Wysocki
2011-05-13 20:22       ` Rafael J. Wysocki
2011-05-13 20:44         ` Rafael J. Wysocki
2011-05-14  2:08           ` Robert Hancock
2011-05-14  2:08           ` Robert Hancock
2011-05-14 10:18           ` Tejun Heo
2011-05-14 10:18           ` Tejun Heo
2011-05-14 10:28             ` [PATCH v2.6.38-rc7] Revert "libata: ahci_start_engine compliant to AHCI spec" Tejun Heo
2011-05-14 10:28               ` Tejun Heo
2011-05-14 17:47               ` Linus Torvalds
2011-05-14 17:47               ` Linus Torvalds
2011-05-14 18:49                 ` Jeff Garzik
2011-05-14 18:49                 ` Jeff Garzik
2011-05-15  6:19               ` Jian Peng
2011-05-15  9:25                 ` Rafael J. Wysocki
2011-05-15  9:25                 ` Rafael J. Wysocki
2011-05-15 12:20                   ` Michael Leun
2011-05-15 12:20                   ` Michael Leun
2011-05-16 17:02               ` Valdis.Kletnieks
2011-05-18 19:23                 ` Jian Peng
2011-05-18 19:44                   ` Rafael J. Wysocki
2011-05-18 19:44                   ` Rafael J. Wysocki
2011-05-18 21:25                     ` Jian Peng
2011-05-19  0:14                       ` Jian Peng
2011-05-19 10:19                         ` Rafael J. Wysocki
2011-05-19 10:19                         ` Rafael J. Wysocki
2011-05-19 16:42                           ` Jian Peng
2011-05-19 21:32                             ` Rafael J. Wysocki
2011-05-19 21:32                             ` Rafael J. Wysocki
2011-05-19 21:57                               ` Jian Peng
2011-05-19 22:03                                 ` Rafael J. Wysocki
2011-05-19 22:03                                 ` Rafael J. Wysocki
2011-05-20 15:40                         ` Valdis.Kletnieks
2011-05-20 15:43                           ` Tejun Heo
2011-05-20 17:02                             ` Jian Peng
2011-05-20 18:25                               ` Valdis.Kletnieks
2011-05-20 18:25                               ` Valdis.Kletnieks
2011-05-22  2:00                                 ` Jian Peng
2011-05-22  2:00                                 ` Jian Peng
2011-05-23 12:13                               ` Tejun Heo
2011-05-23 12:13                               ` Tejun Heo
2011-05-24  1:04                                 ` Jian Peng
2011-05-24  1:04                                 ` Jian Peng
2011-05-20 18:21                             ` Jian Peng
2011-05-20 18:21                             ` Jian Peng
2011-05-20 15:43                           ` Tejun Heo
2011-05-20 15:40                         ` Valdis.Kletnieks
2011-05-16 17:02               ` Valdis.Kletnieks
2011-05-14 10:28             ` Tejun Heo
2011-05-13 20:44         ` AHCI driver problem on ICH (Re: AHCI driver problem on SB700/SB800 w/ Acer Ferrari One) Rafael J. Wysocki
2011-05-13 20:22       ` Rafael J. Wysocki
2011-05-13 17:39     ` Tejun Heo
  -- strict thread matches above, loose matches on Subject: below --
2011-05-11 22:25 AHCI driver problem on SB700/SB800 w/ Acer Ferrari One Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.