linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paul Menzel <pmenzel@molgen.mpg.de>
To: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>,
	Matthias Andree <matthias.andree@gmx.de>
Subject: Re: [Regression] pcie_wait_for_link_delay (1132.853 ms @ 5039.414431)
Date: Tue, 6 Aug 2019 11:57:26 +0200	[thread overview]
Message-ID: <acca5213-7d8b-7db1-ff3c-cb5b4a704f04@molgen.mpg.de> (raw)
In-Reply-To: <20190806093626.GF2548@lahna.fi.intel.com>

[-- Attachment #1: Type: text/plain, Size: 10935 bytes --]

Dear Mika,


Thank you for your quick reply.


On 06.08.19 11:36, Mika Westerberg wrote:
> +Nicholas and Matthias
> 
> On Tue, Aug 06, 2019 at 11:20:37AM +0200, Paul Menzel wrote:

>> Commit c2bf1fc2 (PCI: Add missing link delays required by the PCIe spec) [1] 
>> increases the resume time from ACPI S3 on a desktop system Dell OptiPlex 5040
>> by one second. It looks like this is expected from the commit message, but
>> breaks existing systems with boot time requirements. I measured this with the
>> help of the pm-graph script `sleepgraph.py` [2].
>>
>>     0000:00:01.0 resume_noirq (1134.715 ms @ 5039.412578 to 5040.547293)
>>         …
>>             pcie_wait_for_link_delay (1132.853 ms @ 5039.414431)

By the way, here is the trace excerpt with the interesting comments.

```
 5040.547284 |   1)  kworker-3594  |   1132852 us  |              } /* schedule_timeout */
 5040.547284 |   1)  kworker-3594  |   1132853 us  |            } /* msleep */
 5040.547284 |   1)  kworker-3594  |   1132853 us  |          } /* pcie_wait_for_link_delay */
 5040.547284 |   1)  kworker-3594  |   1132856 us  |        } /* wait_for_downstream_link */
 5040.547285 |   1)  kworker-3594  |               |        device_for_each_child() {
 5040.547285 |   1)  kworker-3594  |   0.185 us    |          _raw_spin_lock_irqsave();
 5040.547286 |   1)  kworker-3594  |   0.136 us    |          _raw_spin_unlock_irqrestore();
 5040.547286 |   1)  kworker-3594  |   0.190 us    |          pm_iter();
 5040.547286 |   1)  kworker-3594  |   0.129 us    |          _raw_spin_lock_irqsave();
 5040.547287 |   1)  kworker-3594  |   0.134 us    |          _raw_spin_unlock_irqrestore();
 5040.547287 |   1)  kworker-3594  |   0.194 us    |          pm_iter();
 5040.547287 |   1)  kworker-3594  |   0.134 us    |          _raw_spin_lock_irqsave();
 5040.547288 |   1)  kworker-3594  |   0.133 us    |          _raw_spin_unlock_irqrestore();
 5040.547288 |   1)  kworker-3594  |   0.187 us    |          pm_iter();
 5040.547288 |   1)  kworker-3594  |   0.135 us    |          _raw_spin_lock_irqsave();
 5040.547289 |   1)  kworker-3594  |   0.135 us    |          _raw_spin_unlock_irqrestore();
 5040.547289 |   1)  kworker-3594  |   0.271 us    |          pm_iter();
 5040.547289 |   1)  kworker-3594  |   0.132 us    |          _raw_spin_lock_irqsave();
 5040.547290 |   1)  kworker-3594  |   0.137 us    |          _raw_spin_unlock_irqrestore();
 5040.547290 |   1)  kworker-3594  |   5.036 us    |        } /* device_for_each_child */
 5040.547290 |   1)  kworker-3594  |   1132862 us  |      } /* pcie_port_device_resume_noirq */
 5040.547290 |   1)  kworker-3594  |   1134709 us  |    } /* pci_pm_resume_noirq */
```

>> $ lspci -nn
>> 00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [8086:191f] (rev 07)
>> 00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
>> 00:14.0 USB controller [0c03]: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller [8086:a12f] (rev 31)
>> 00:14.2 Signal processing controller [1180]: Intel Corporation Sunrise Point-H Thermal subsystem [8086:a131] (rev 31)
>> 00:16.0 Communication controller [0780]: Intel Corporation Sunrise Point-H CSME HECI #1 [8086:a13a] (rev 31)
>> 00:17.0 SATA controller [0106]: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] [8086:a102] (rev 31)
>> 00:1c.0 PCI bridge [0604]: Intel Corporation Sunrise Point-H PCI Express Root Port #1 [8086:a110] (rev f1)
>> 00:1f.0 ISA bridge [0601]: Intel Corporation Sunrise Point-H LPC Controller [8086:a146] (rev 31)
>> 00:1f.2 Memory controller [0580]: Intel Corporation Sunrise Point-H PMC [8086:a121] (rev 31)
>> 00:1f.3 Audio device [0403]: Intel Corporation Sunrise Point-H HD Audio [8086:a170] (rev 31)
>> 00:1f.4 SMBus [0c05]: Intel Corporation Sunrise Point-H SMBus [8086:a123] (rev 31)
>> 00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-V [8086:15b8] (rev 31)
>> 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Oland XT [Radeon HD 8670 / R7 250/350] [1002:6610] (rev 81)
>> 01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [1002:aab0]
>> 02:00.0 PCI bridge [0604]: Texas Instruments XIO2001 PCI Express-to-PCI Bridge [104c:8240]
>>
>> So, it’s about the internal Intel graphics device, which is not used on this 
>> system, as there is an external AMD graphics device plugged in.
>>
>> As far as I understand it, it’s a bug in the firmware, that a one second delay
>> is specified?

How can I read out the delay from the system as done in?

```
static int get_downstream_delay(struct pci_bus *bus)
{
        struct pci_dev *pdev;
        int min_delay = 100;
        int max_delay = 0;

        list_for_each_entry(pdev, &bus->devices, bus_list) {
                if (!pdev->imm_ready)
                        min_delay = 0;
                else if (pdev->d3cold_delay < min_delay)
                        min_delay = pdev->d3cold_delay;
                if (pdev->d3cold_delay > max_delay)
                        max_delay = pdev->d3cold_delay;
        }

        return max(min_delay, max_delay);
}
```

>> Anyway, there is such firmware out there, so I’d like to avoid the time
>> increases.
>>
>> As a first step, the commit should be extended to print a warning (maybe if
>> `initcall_debug` is specified), when the delay is higher than let’s say 50(?)
>> ms. Also better documentation how to debug these delays would be appreciated.

As your commit message says the standard demands a delay of at least 100 ms, 50 ms
is of course too short, and maybe 150 ms or so should be used as the threshold.

>> If there is no easy solution, it’d be great if the commit could be reverted for
>> now, and a better solution be discussed for the next release.
> 
> There is also kernel bugzilla entry about another regression this causes
> here:
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=204413
> 
> I agree we should revert c2bf1fc2 now. I'll try to come up alternative
> solution to these missing delays that hopefully does not break existing
> setups.
> 
> Rafael, Bjorn,
> 
> Can you revert the commit or do you want me to send a revert patch?
> 
> Thanks and sorry about the breakage.

No worries.


Kind regards,

Paul


PS:

```
$ sudo lspci -vvv -s 00:00.0
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers (rev 07)
	Subsystem: Dell Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
	Latency: 0
	Capabilities: [e0] Vendor Specific Information: Len=10 <?>
	Kernel driver in use: skl_uncore

$ sudo lspci -vvv -s 00:01
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 07) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 120
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 0000e000-0000efff [size=4K]
	Memory behind bridge: f7e00000-f7efffff [size=1M]
	Prefetchable memory behind bridge: 00000000e0000000-00000000efffffff [size=256M]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA+ MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [88] Subsystem: Dell Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16)
	Capabilities: [80] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee04004  Data: 4021
	Capabilities: [a0] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 256 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #2, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <256ns, L1 <8us
			ClockPM- Surprise- LLActRep- BwNot+ ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt+ ABWMgmt+
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #2, PowerLimit 75.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet+ LinkState-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR+, OBFF Via WAKE# ARIFwd-
			 AtomicOpsCap: Routing- 32bit+ 64bit+ 128bitCAS+
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Via WAKE# ARIFwd-
			 AtomicOpsCtl: ReqEn- EgressBlck-
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
	Capabilities: [140 v1] Root Complex Link
		Desc:	PortNumber=02 ComponentID=01 EltType=Config
		Link0:	Desc:	TargetPort=00 TargetComponent=01 AssocRCRB- LinkType=MemMapped LinkValid+
			Addr:	00000000fed19000
	Capabilities: [d94 v1] #19
	Kernel driver in use: pcieport
```


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5174 bytes --]

  reply	other threads:[~2019-08-06  9:57 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-06  9:20 [Regression] pcie_wait_for_link_delay (1132.853 ms @ 5039.414431) Paul Menzel
2019-08-06  9:36 ` Mika Westerberg
2019-08-06  9:57   ` Paul Menzel [this message]
2019-08-06 11:31     ` Mika Westerberg
2019-08-06 14:02       ` Paul Menzel
2019-08-06 14:15         ` Mika Westerberg
2019-08-06 19:56       ` Matthias Andree
2019-08-07 10:06         ` Mika Westerberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=acca5213-7d8b-7db1-ff3c-cb5b4a704f04@molgen.mpg.de \
    --to=pmenzel@molgen.mpg.de \
    --cc=bhelgaas@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=matthias.andree@gmx.de \
    --cc=mika.westerberg@linux.intel.com \
    --cc=nicholas.johnson-opensource@outlook.com.au \
    --cc=rafael.j.wysocki@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).