* high latency on 82573L
@ 2010-09-02  3:39 Tony Jones
  2010-09-02 18:49 ` Allan, Bruce W
  0 siblings, 1 reply; 6+ messages in thread
From: Tony Jones @ 2010-09-02  3:39 UTC
  To: jeffrey.t.kirsher, jesse.brandeburg, bruce.w.allan,
	alexander.h.duyck, peter.p.waskiewicz.jr, john.ronciak
  Cc: e1000-devel, linux-kernel, bphilips

Hi.

Since commit 6f461f6c7c (e1000e: enable/disable ASPM L0s and L1 and ERT 
according to hardware errata) I'm seeing high latencies on my Thinkpad T60p.

02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
	Subsystem: Lenovo ThinkPad T60
	Flags: bus master, fast devsel, latency 0, IRQ 30
	Memory at ee000000 (32-bit, non-prefetchable) [size=128K]
	I/O ports at 3000 [size=32]
	Capabilities: [c8] Power Management version 2
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [e0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 00-1a-6b-ff-ff-6c-7e-a4
	Kernel driver in use: e1000e

# uname -r
2.6.36-rc3-0-default
# ping -c 20 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=1.06 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=1007 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=698 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=198 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=697 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=1007 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=690 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=1007 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=682 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=1008 ms
64 bytes from 192.168.1.1: icmp_seq=11 ttl=64 time=1.03 ms
64 bytes from 192.168.1.1: icmp_seq=12 ttl=64 time=1007 ms
64 bytes from 192.168.1.1: icmp_seq=13 ttl=64 time=0.874 ms
64 bytes from 192.168.1.1: icmp_seq=14 ttl=64 time=1009 ms
64 bytes from 192.168.1.1: icmp_seq=15 ttl=64 time=1.72 ms
64 bytes from 192.168.1.1: icmp_seq=16 ttl=64 time=1006 ms
64 bytes from 192.168.1.1: icmp_seq=17 ttl=64 time=650 ms
64 bytes from 192.168.1.1: icmp_seq=18 ttl=64 time=1008 ms
64 bytes from 192.168.1.1: icmp_seq=19 ttl=64 time=642 ms
64 bytes from 192.168.1.1: icmp_seq=20 ttl=64 time=1643 ms

--- 192.168.1.1 ping statistics ---
20 packets transmitted, 20 received, 0% packet loss, time 19063ms
rtt min/avg/max/mdev = 0.874/698.661/1643.392/439.376 ms, pipe 2

This is 2.6.36-rc3 yet commit 19833b5dff (e1000e: disable ASPM L1 on 82573)
isn't having any effect for me.

For our OpenSUSE 11.3 kernels (2.6.34 based), reverting just 6f461f6c7c
solves the issue. For .36-rc3 reverting 6f461f6c7c plus of course 19833b5dff 
does the trick.

I'm happy to perform any testing to help narrow this down. LMK.

Tony


* RE: high latency on 82573L
  2010-09-02  3:39 high latency on 82573L Tony Jones
@ 2010-09-02 18:49 ` Allan, Bruce W
  2010-09-03 17:51   ` Tony Jones
  0 siblings, 1 reply; 6+ messages in thread
From: Allan, Bruce W @ 2010-09-02 18:49 UTC
  To: Tony Jones, Kirsher, Jeffrey T, Brandeburg, Jesse, Duyck,
	Alexander H, Waskiewicz Jr, Peter P, Ronciak, John
  Cc: e1000-devel, linux-kernel, bphilips

On Wednesday, September 01, 2010 8:39 PM, Tony Jones wrote:
> Hi.
> 
> Since commit 6f461f6c7c (e1000e: enable/disable ASPM L0s and L1 and
> ERT according to hardware errata) I'm seeing high latencies on my
> Thinkpad T60p. 
> 
> 02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
> 	Subsystem: Lenovo ThinkPad T60
> 	Flags: bus master, fast devsel, latency 0, IRQ 30
> 	Memory at ee000000 (32-bit, non-prefetchable) [size=128K]
> 	I/O ports at 3000 [size=32]
> 	Capabilities: [c8] Power Management version 2
> 	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> 	Capabilities: [e0] Express Endpoint, MSI 00
> 	Capabilities: [100] Advanced Error Reporting
> 	Capabilities: [140] Device Serial Number 00-1a-6b-ff-ff-6c-7e-a4
> 	Kernel driver in use: e1000e
> 
> # uname -r
> 2.6.36-rc3-0-default
> # ping -c 20 192.168.1.1
> PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=1.06 ms
> 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=1007 ms
> 64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=698 ms
> 64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=198 ms
> 64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=697 ms
> 64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=1007 ms
> 64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=690 ms
> 64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=1007 ms
> 64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=682 ms
> 64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=1008 ms
> 64 bytes from 192.168.1.1: icmp_seq=11 ttl=64 time=1.03 ms
> 64 bytes from 192.168.1.1: icmp_seq=12 ttl=64 time=1007 ms
> 64 bytes from 192.168.1.1: icmp_seq=13 ttl=64 time=0.874 ms
> 64 bytes from 192.168.1.1: icmp_seq=14 ttl=64 time=1009 ms
> 64 bytes from 192.168.1.1: icmp_seq=15 ttl=64 time=1.72 ms
> 64 bytes from 192.168.1.1: icmp_seq=16 ttl=64 time=1006 ms
> 64 bytes from 192.168.1.1: icmp_seq=17 ttl=64 time=650 ms
> 64 bytes from 192.168.1.1: icmp_seq=18 ttl=64 time=1008 ms
> 64 bytes from 192.168.1.1: icmp_seq=19 ttl=64 time=642 ms
> 64 bytes from 192.168.1.1: icmp_seq=20 ttl=64 time=1643 ms
> 
> --- 192.168.1.1 ping statistics ---
> 20 packets transmitted, 20 received, 0% packet loss, time 19063ms
> rtt min/avg/max/mdev = 0.874/698.661/1643.392/439.376 ms, pipe 2
> 
> This is 2.6.36-rc3 yet commit 19833b5dff (e1000e: disable ASPM L1 on
> 82573) isn't having any effect for me.
> 
> For our OpenSUSE 11.3 kernels (2.6.34 based), reverting just 6f461f6c7c
> solves the issue. For .36-rc3 reverting 6f461f6c7c plus of course 19833b5dff
> does the trick.
> 
> I'm happy to perform any testing to help narrow this down. LMK.
> 
> Tony

Please provide more verbose lspci output and include the PCI config
space, i.e. 'lspci -s 2:0.0 -vvv -xxx' after the driver is loaded,
your kernel .config and the full version number of the OpenSUSE
kernel.  Are there any messages in the system log regarding disabling
ASPM L0s and/or L1 on that device?

I can understand the latency with the OpenSUSE 2.6.34-based kernels
assuming commit 19833b5dff is not present, but I do not understand
the latency with 2.6.36-rc3.

Thanks,
Bruce.


* Re: high latency on 82573L
  2010-09-02 18:49 ` Allan, Bruce W
@ 2010-09-03 17:51   ` Tony Jones
  2010-09-03 18:59     ` Allan, Bruce W
  0 siblings, 1 reply; 6+ messages in thread
From: Tony Jones @ 2010-09-03 17:51 UTC
  To: Allan, Bruce W
  Cc: Kirsher, Jeffrey T, Brandeburg, Jesse, Duyck, Alexander H,
	Waskiewicz Jr, Peter P, Ronciak, John, e1000-devel, linux-kernel,
	bphilips

On Thu, Sep 02, 2010 at 11:49:12AM -0700, Allan, Bruce W wrote:
> Please provide more verbose lspci output and include the PCI config
> space, i.e. 'lspci -s 2:0.0 -vvv -xxx' after the driver is loaded,

# lspci -s 2:0.0 -vvv -xxx
02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
	Subsystem: Lenovo ThinkPad T60
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 46
	Region 0: Memory at ee000000 (32-bit, non-prefetchable) [size=128K]
	Region 2: I/O ports at 3000 [size=32]
	Capabilities: [c8] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0100c  Data: 41c9
	Capabilities: [e0] Express (v1) Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
		DevCtl:	Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <64us
			ClockPM+ Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr+ BadTLP+ BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		AERCap:	First Error Pointer: 14, GenCap- CGenEn- ChkCap- ChkEn-
	Capabilities: [140 v1] Device Serial Number 00-1a-6b-ff-ff-6c-7e-a4
	Kernel driver in use: e1000e
00: 86 80 9a 10 07 05 10 00 00 00 00 02 10 00 00 00
10: 00 00 00 ee 00 00 00 00 01 30 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 aa 17 01 20
30: 00 00 00 00 c8 00 00 00 00 00 00 00 0b 01 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 01 d0 22 c8 00 20 00 0f
d0: 05 e0 81 00 0c 10 e0 fe 00 00 00 00 c9 41 00 00
e0: 10 00 01 00 c1 0c 00 00 1f 28 1a 00 11 1c 07 00
f0: 42 01 11 10 00 00 00 00 00 00 00 00 00 00 00 00

> kernel.  Are there any messages in the system log regarding disabling
> ASPM L0s and/or L1 on that device?

It would appear it is being disabled:

[    0.194271] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
[    0.297112] pci 0000:01:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[    0.298003] pci 0000:02:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[    0.299123] pci 0000:03:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
[   18.135907] e1000e 0000:02:00.0: Disabling ASPM  L1
[   18.137262] e1000e 0000:02:00.0: Disabling ASPM L0s 

but I see the same high ping latencies.

> I can understand the latency with the OpenSUSE 2.6.34-based kernels
> assuming commit 19833b5dff is not present, but I do not understand
> the latency with 2.6.36-rc3.

The first thing I tried was OpenSUSE 2.6.34 plus 19833b5dff, which did not help.
This led me to think it wasn't related to ASPM, so I resorted to a bisect, which
ended up pointing at 6f461f6c7c.

Anyways, all of the above is from vanilla 2.6.36-rc3 so let's ignore OpenSUSE
kernels.

http://ftp.suse.com/pub/people/tonyj/82573L/config  is the config for .36-rc3
generated using localmodconfig, defaults chosen for all prompts.

http://ftp.suse.com/pub/people/tonyj/82573L/dmesg  is the full dmesg

Tony


* RE: high latency on 82573L
  2010-09-03 17:51   ` Tony Jones
@ 2010-09-03 18:59     ` Allan, Bruce W
  2010-09-08 18:21       ` Jesse Barnes
  0 siblings, 1 reply; 6+ messages in thread
From: Allan, Bruce W @ 2010-09-03 18:59 UTC
  To: Tony Jones
  Cc: Kirsher, Jeffrey T, Brandeburg, Jesse, Duyck, Alexander H,
	Waskiewicz Jr, Peter P, Ronciak, John, e1000-devel, linux-kernel,
	bphilips, linux-pci, jbarnes

On Friday, September 03, 2010 10:51 AM, Tony Jones wrote:
> On Thu, Sep 02, 2010 at 11:49:12AM -0700, Allan, Bruce W wrote:
>> Please provide more verbose lspci output and include the PCI config
>> space, i.e. 'lspci -s 2:0.0 -vvv -xxx' after the driver is loaded,
> 
> # lspci -s 2:0.0 -vvv -xxx
> 02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
> 	Subsystem: Lenovo ThinkPad T60
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Latency: 0, Cache Line Size: 64 bytes
> 	Interrupt: pin A routed to IRQ 46
> 	Region 0: Memory at ee000000 (32-bit, non-prefetchable) [size=128K]
> 	Region 2: I/O ports at 3000 [size=32]
> 	Capabilities: [c8] Power Management version 2
> 		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> 		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
> 	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> 		Address: 00000000fee0100c  Data: 41c9
> 	Capabilities: [e0] Express (v1) Endpoint, MSI 00
> 		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
> 			ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
> 		DevCtl:	Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
> 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> 			MaxPayload 128 bytes, MaxReadReq 512 bytes
> 		DevSta:	CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+ TransPend-
> 		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <64us
> 			ClockPM+ Surprise- LLActRep- BwNot-
> 		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
> 			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
> 		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> 	Capabilities: [100 v1] Advanced Error Reporting
> 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
> 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UESvrt:	DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> 		CESta:	RxErr+ BadTLP+ BadDLLP- Rollover- Timeout- NonFatalErr-
> 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> 		AERCap:	First Error Pointer: 14, GenCap- CGenEn- ChkCap- ChkEn-
> 	Capabilities: [140 v1] Device Serial Number 00-1a-6b-ff-ff-6c-7e-a4
> 	Kernel driver in use: e1000e
> 00: 86 80 9a 10 07 05 10 00 00 00 00 02 10 00 00 00
> 10: 00 00 00 ee 00 00 00 00 01 30 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 aa 17 01 20
> 30: 00 00 00 00 c8 00 00 00 00 00 00 00 0b 01 00 00
> 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> c0: 00 00 00 00 00 00 00 00 01 d0 22 c8 00 20 00 0f
> d0: 05 e0 81 00 0c 10 e0 fe 00 00 00 00 c9 41 00 00
> e0: 10 00 01 00 c1 0c 00 00 1f 28 1a 00 11 1c 07 00
> f0: 42 01 11 10 00 00 00 00 00 00 00 00 00 00 00 00
> 
>> kernel.  Are there any messages in the system log regarding disabling
>> ASPM L0s and/or L1 on that device?
> 
> It would appear it is being disabled:
> 
> [    0.194271] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
> [    0.297112] pci 0000:01:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
> [    0.298003] pci 0000:02:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
> [    0.299123] pci 0000:03:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
> [   18.135907] e1000e 0000:02:00.0: Disabling ASPM  L1
> [   18.137262] e1000e 0000:02:00.0: Disabling ASPM L0s
> 
> but I see the same high ping latencies.
> 
>> I can understand the latency with the OpenSUSE 2.6.34-based kernels
>> assuming commit 19833b5dff is not present, but I do not understand
>> the latency with 2.6.36-rc3.
> 
> The first thing I tried was OpenSUSE 2.6.34 plus 19833b5dff.   This led me to
> think it wasn't related to ASPM so I resorted to a bisect which ended up showing
> it was 6f461f6c7c.
> 
> Anyways, all of the above is from vanilla 2.6.36-rc3 so let's ignore OpenSUSE
> kernels.
> 
> http://ftp.suse.com/pub/people/tonyj/82573L/config  is the config for .36-rc3
> generated using localmodconfig, defaults chosen for all prompts.
> 
> http://ftp.suse.com/pub/people/tonyj/82573L/dmesg  is the full dmesg
> 
> Tony

ASPM L1 must be disabled on this device otherwise the latency described
above will happen.  And even though there are log messages indicating
ASPM L1 is disabled, it really isn't according to the verbose lspci
output and PCI config space for the 2:0.0 device (see LnkCtl above).
Since CONFIG_PCIEASPM is enabled in your kernel config, the driver is
calling the kernel function pci_disable_link_state() to disable ASPM L1
which it fails to do because the variable aspm_disabled=1 (as indicated
by the "ACPI FADT declares the system doesn't support PCIe ASPM, so
disable it" message).
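
As a cross-check of the LnkCtl reading, the following minimal Python sketch
decodes the quoted 'lspci -xxx' dump directly.  It is an illustration added to
the thread, not something from the original posters: the function names are
made up here, and the offsets are assumed from the standard PCI/PCIe config
space layout (capability pointer at 0x34, PCI Express capability ID 0x10, Link
Control at capability offset 0x10, ASPM control in bits 1:0).

```python
# Non-zero rows of the 'lspci -xxx' dump quoted in this thread.
DUMP = """
00: 86 80 9a 10 07 05 10 00 00 00 00 02 10 00 00 00
10: 00 00 00 ee 00 00 00 00 01 30 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 aa 17 01 20
30: 00 00 00 00 c8 00 00 00 00 00 00 00 0b 01 00 00
c0: 00 00 00 00 00 00 00 00 01 d0 22 c8 00 20 00 0f
d0: 05 e0 81 00 0c 10 e0 fe 00 00 00 00 c9 41 00 00
e0: 10 00 01 00 c1 0c 00 00 1f 28 1a 00 11 1c 07 00
f0: 42 01 11 10 00 00 00 00 00 00 00 00 00 00 00 00
"""

PCI_CAPABILITY_LIST = 0x34  # pointer to the first capability
PCI_CAP_ID_EXP = 0x10       # PCI Express capability ID
PCI_EXP_LNKCTL = 0x10       # Link Control offset inside that capability
ASPM_L0S = 0x1              # LnkCtl bit 0
ASPM_L1 = 0x2               # LnkCtl bit 1


def parse_dump(text):
    """Turn 'offset: b0 b1 ...' rows into a 256-byte config space image."""
    cfg = bytearray(256)
    for line in text.strip().splitlines():
        offset, data = line.split(":", 1)
        row = bytes.fromhex(data.strip())
        start = int(offset, 16)
        cfg[start:start + len(row)] = row
    return cfg


def read16(cfg, offset):
    """Read a little-endian 16-bit register."""
    return int.from_bytes(cfg[offset:offset + 2], "little")


def find_capability(cfg, cap_id):
    """Walk the capability list; return the offset of cap_id or None."""
    ptr = cfg[PCI_CAPABILITY_LIST]
    for _ in range(48):  # guard against a malformed, looping list
        if ptr == 0:
            return None
        if cfg[ptr] == cap_id:
            return ptr
        ptr = cfg[ptr + 1]
    return None


def aspm_states(lnkctl):
    """Names of the ASPM states enabled in a Link Control value."""
    states = []
    if lnkctl & ASPM_L0S:
        states.append("L0s")
    if lnkctl & ASPM_L1:
        states.append("L1")
    return states


if __name__ == "__main__":
    cfg = parse_dump(DUMP)
    cap = find_capability(cfg, PCI_CAP_ID_EXP)
    lnkctl = read16(cfg, cap + PCI_EXP_LNKCTL)
    print(f"PCIe cap at {cap:#x}, LnkCtl = {lnkctl:#06x}, ASPM: {aspm_states(lnkctl)}")
```

The L1 bit is still set in the raw bytes at offset f0, which agrees with the
"ASPM L1 Enabled" text lspci prints for LnkCtl.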

I'm unclear on whether the aspm_disabled variable is meant to indicate
ASPM L0s or both ASPM L0s _and_ L1 are disabled (added PCI maintainer
and linux-pci mail-list).  To resolve this issue, we need to either a)
change e1000e to directly write the PCI config space to disable ASPM L1
as was done before 6f461f6c7c, or b) fix pci_disable_link_state() et al.
to allow for ASPM L1 to be disabled properly.  I would prefer the latter
option so that other drivers do not have to use the same kludge to write
to the PCI config space.  Any input from the PCI guys?
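
Option a) amounts to a read-modify-write of the Link Control register.  The
sketch below models that on an in-memory copy of the config space; it does not
touch hardware and is not the e1000e code.  The helper names and the buffer are
hypothetical, the starting value 0x0142 is the LnkCtl value from the dump in
this thread, and the offsets are assumed from the standard PCIe layout.

```python
PCIE_CAP = 0xE0            # PCI Express capability offset on this device (from the dump)
LNKCTL = PCIE_CAP + 0x10   # Link Control register
ASPM_L0S = 0x1             # LnkCtl bit 0
ASPM_L1 = 0x2              # LnkCtl bit 1


def make_cfg():
    """Hypothetical 256-byte config space image holding only LnkCtl = 0x0142."""
    cfg = bytearray(256)
    cfg[LNKCTL:LNKCTL + 2] = (0x0142).to_bytes(2, "little")
    return cfg


def disable_aspm(cfg, states):
    """Clear the given ASPM enable bits in LnkCtl; return (old, new) values."""
    old = int.from_bytes(cfg[LNKCTL:LNKCTL + 2], "little")
    new = old & ~states & 0xFFFF
    if new != old:
        # Write back only when something actually changes.
        cfg[LNKCTL:LNKCTL + 2] = new.to_bytes(2, "little")
    return old, new


if __name__ == "__main__":
    cfg = make_cfg()
    old, new = disable_aspm(cfg, ASPM_L1)
    print(f"LnkCtl {old:#06x} -> {new:#06x}")
```

Only bit 1 changes; the other Link Control bits (CommClk, ClockPM) are left as
they were, which is the point of doing a read-modify-write rather than writing
a fixed value.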

Alternatively, in the meantime, if you disable CONFIG_PCIEASPM the e1000e
driver will behave as it did before 6f461f6c7c, i.e. it will write the PCI
config space directly to disable ASPM L1.
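
To check whether a given kernel config has the option set, something like the
following Python sketch can be used.  It is only an illustration: the helper
name is made up, and the sample text is invented for the example rather than
taken from the config posted in this thread.  It relies on the usual .config
conventions ("CONFIG_FOO=y" when set, "# CONFIG_FOO is not set" when not).

```python
import re


def kconfig_value(config_text, option):
    """Return the value of a Kconfig option ('y', 'm', ...) or None if unset."""
    # Anchor on the full option name followed by '=' so that, for example,
    # CONFIG_PCIEASPM_DEBUG does not match a lookup of CONFIG_PCIEASPM.
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


# Invented sample, not the config from this thread.
SAMPLE = """\
CONFIG_PCIEPORTBUS=y
# CONFIG_PCIEASPM is not set
CONFIG_E1000E=m
"""

if __name__ == "__main__":
    state = kconfig_value(SAMPLE, "CONFIG_PCIEASPM")
    print("CONFIG_PCIEASPM:", state if state is not None else "not set")
```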

Thanks,
Bruce.


* Re: high latency on 82573L
  2010-09-03 18:59     ` Allan, Bruce W
@ 2010-09-08 18:21       ` Jesse Barnes
  2010-09-20 22:31         ` Allan, Bruce W
  0 siblings, 1 reply; 6+ messages in thread
From: Jesse Barnes @ 2010-09-08 18:21 UTC
  To: Allan, Bruce W
  Cc: Tony Jones, Kirsher, Jeffrey T, Brandeburg, Jesse, Duyck,
	Alexander H, Waskiewicz Jr, Peter P, Ronciak, John, e1000-devel,
	linux-kernel, bphilips, linux-pci

On Fri, 3 Sep 2010 11:59:30 -0700
"Allan, Bruce W" <bruce.w.allan@intel.com> wrote:

> ASPM L1 must be disabled on this device otherwise the latency described
> above will happen.  And even though there are log messages indicating
> ASPM L1 is disabled, it really isn't according to the verbose lspci
> output and PCI config space for the 2:0.0 device (see LnkCtl above).
> Since CONFIG_PCIEASPM is enabled in your kernel config, the driver is
> calling the kernel function pci_disable_link_state() to disable ASPM L1
> which it fails to do because the variable aspm_disabled=1 (as indicated
> by the "ACPI FADT declares the system doesn't support PCIe ASPM, so
> disable it" message).
> 
> I'm unclear on whether the aspm_disabled variable is meant to indicate
> ASPM L0s or both ASPM L0s _and_ L1 are disabled (added PCI maintainer
> and linux-pci mail-list).  To resolve this issue, we need to either a)
> change e1000e to directly write the PCI config space to disable ASPM L1
> as was done before 6f461f6c7c, or b) fix pci_disable_link_state() et al.
> to allow for ASPM L1 to be disabled properly.  I would prefer the latter
> option so that other drivers do not have to use the same kludge to write
> to the PCI config space.  Any input from the PCI guys?

Yeah, I'd prefer this code to be in the core.  Are there any patches
available yet?

-- 
Jesse Barnes, Intel Open Source Technology Center


* RE: high latency on 82573L
  2010-09-08 18:21       ` Jesse Barnes
@ 2010-09-20 22:31         ` Allan, Bruce W
  0 siblings, 0 replies; 6+ messages in thread
From: Allan, Bruce W @ 2010-09-20 22:31 UTC
  To: Jesse Barnes
  Cc: Tony Jones, Kirsher, Jeffrey T, Brandeburg, Jesse, Duyck,
	Alexander H, Waskiewicz Jr, Peter P, Ronciak, John, e1000-devel,
	linux-kernel, bphilips, linux-pci

On Wednesday, September 08, 2010 11:21 AM, Jesse Barnes wrote:
> On Fri, 3 Sep 2010 11:59:30 -0700
> "Allan, Bruce W" <bruce.w.allan@intel.com> wrote:
> 
>> ASPM L1 must be disabled on this device otherwise the latency
>> described above will happen.  And even though there are log messages
>> indicating ASPM L1 is disabled, it really isn't according to the
>> verbose lspci output and PCI config space for the 2:0.0 device (see
>> LnkCtl above). Since CONFIG_PCIEASPM is enabled in your kernel
>> config, the driver is calling the kernel function
>> pci_disable_link_state() to disable ASPM L1 which it fails to do
>> because the variable aspm_disabled=1 (as indicated by the "ACPI FADT
>> declares the system doesn't support PCIe ASPM, so disable it"
>> message).  
>> 
>> I'm unclear on whether the aspm_disabled variable is meant to
>> indicate ASPM L0s or both ASPM L0s _and_ L1 are disabled (added PCI
>> maintainer and linux-pci mail-list).  To resolve this issue, we need
>> to either a) change e1000e to directly write the PCI config space to
>> disable ASPM L1 as was done before 6f461f6c7c, or b) fix
>> pci_disable_link_state() et al. to allow for ASPM L1 to be disabled
>> properly.  I would prefer the latter option so that other drivers do
>> not have to use the same kludge to write to the PCI config space. 
>> Any input from the PCI guys? 
> 
> Yeah, I'd prefer this code to be in the core.  Are there any patches
> available yet?

Nothing from me at this point (been on vacation).  I might be able to
look into this further but not for another week or so (need to catch
up on the backlog that piled up while away).

Bruce.

