From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from gw1.transmode.se ([195.58.98.146]:51743 "EHLO gw1.transmode.se"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1753597Ab2GSJRL (ORCPT ); Thu, 19 Jul 2012 05:17:11 -0400
In-Reply-To:
References:
Subject: Re: PICe hotplug problems
Cc: Yinghai Lu, linux-pci@vger.kernel.org, yhlu.kernel@gmail.com
Message-ID:
From: Joakim Tjernlund
Date: Thu, 19 Jul 2012 11:17:08 +0200
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
To: unlisted-recipients:; (no To-header on input)
Sender: linux-pci-owner@vger.kernel.org
List-ID:

Joakim Tjernlund/Transmode wrote on 2012/07/19 01:34:14:
>
> Joakim Tjernlund/Transmode wrote on 2012/07/18 15:07:09:
> >
> > yhlu.kernel@gmail.com wrote on 2012/07/11 03:33:05:
> > >
> > > On Tue, Jul 10, 2012 at 6:07 PM, Joakim Tjernlund
> > > wrote:
> > > > yhlu.kernel@gmail.com wrote on 2012/07/11 00:09:00:
> > > >
> > > >> No. Can you compile lspci util as static and run it ?
> > > >
> > > > That wasn't so hard so here:
> > > >
> > > > root@P2020RDB ~ # ./lspci -vvxxx
> > > > 00:00.0 Class 0604: Device 1957:0079 (rev 21)
> > > >         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
> > > >         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
> > >         Latency: 0, Cache Line Size: 32 bytes
> > > >         Region 0: Memory at (32-bit, non-prefetchable)
> > > >         Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
> > > >         I/O behind bridge: 00000000-00000fff
> > > >         Memory behind bridge: 80000000-9fffffff
> > > >         Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
> > >         BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
> > > >                 PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
> > > >         Capabilities: [44] Power Management version 2
> > > >                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> > > >                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> > > >
Capabilities: [4c] Express (v1) Root Port (Slot-), MSI 00
> > > >                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
> > > >                         ExtTag- RBE- FLReset-
> > > >                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
> > > >                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> > > >                         MaxPayload 128 bytes, MaxReadReq 512 bytes
> > > >                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> > > >                 LnkCap: Port #0, Speed 2.5GT/s, Width x4, ASPM L0s, Latency L0 <2us, L1 unlimited
> > > >                         ClockPM- Surprise- LLActRep- BwNot-
> > > >                 LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- Retrain- CommClk-
> > > >                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > > >                 LnkSta: Speed 2.5GT/s, Width x2, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
> > > >                 RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
> > > >                 RootCap: CRSVisible-
> > > >                 RootSta: PME ReqID 0000, PMEStatus- PMEPending-
> > >
> > > There is no slot cap etc, so pciehp will not be loaded.
> > > the power of your child device can not be turned off/on.
> > >
> > > Not sure if can use link off/on make the clock effective.
> > >
> > > You can turn off and on the pcie link like following:
> > >
> > > 1. remove the child device
> > > echo 1 > /sys/..../0000:01:00.0/remove
> > > 2. disable link
> > > echo 1 > /sys/..../0000:00:00.0/pcie_link_disable
> > > 3. enable link
> > > echo 0 > /sys/..../0000:00:00.0/pcie_link_disable
> > > 4. rescan the pci bus.
> > > echo 1 > /sys/..../0000:00:00.0/rescan_bridge
> > >
> > > please check link disable patch at
> > > git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
> > > for-pci-pcie-link
> > >
> > > | Subject: [PATCH] PCI: Add link_disable in /sysfs for pcie device
> > > |
> > > | Found PCIe cards from one vendor, will not respond to scan from bridge,
> > > | if we change bus number setting in bridge device.
> > > |
> > > | Have to do link disable/enable on the pcie root port.
> > > |
> > > | So try to expose link disable bit of pcie link control register. We can use
> > > |    echo 1 > /sys/..../link_disable
> > > |    echo 0 > /sys/..../link_disable
> > > | to bring the pcie device back to respond to scan.
> >
> > Sorry for the delay, got pulled into some high prio stuff.
> >
> > Anyhow, I have backported your patch to 3.4 as we cannot upgrade easily and
> > I cannot make it work. I do
> >
> > # > echo 1 > /sys/devices/pci0000:00/0000:00:00.0/pcie_link_disable
> > # > echo 0 > /sys/devices/pci0000:00/0000:00:00.0/pcie_link_disable
> > # > echo 1 > /sys/devices/pci0000:00/0000:00:00.0/rescan
> > # > lspci
> > 00:00.0 Class 0604: Device 1957:0079 (rev 21)
> >
> > #> dmesg
> > pcieport 0000:00:00.0: pcie_link_disable_set: lnk_ctrl = 18
> > pcieport 0000:00:00.0: pcie_link_disable_set: lnk_ctrl = 8
> >
> > find /sys -name pcie_link_disable
> > find /sys -name remove
> > find /sys -name rescan
> >
> > shows
> > /sys/devices/pci0000:00/0000:00:00.0/pcie_link_disable
> > /sys/devices/pci0000:00/0000:00:00.0/remove
> > /sys/bus/pci/rescan
> > /sys/devices/pci0000:00/0000:00:00.0/rescan
> > /sys/devices/pci0000:00/0000:00:00.0/pci_bus/0000:01/rescan
> > /sys/devices/pci0000:00/pci_bus/0000:00/rescan
> >
> > so something is missing, but what?
>
> Tried with FAKE hotplug too but no luck.
>
> What is best pcie_hotplug or fake hotplug?
> Would be nice if I could eliminate one of them.
I found a register to turn on Slot and SlotClk so now I get:

root@P2020RDB ~ # lspci -vvvvxx
00:00.0 Class 0604: Device 1957:0079 (rev 21)
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- (32-bit, non-prefetchable)
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: 00000000-00000fff
        Memory behind bridge: 90000000-900fffff
        Prefetchable memory behind bridge: 0000000080000000-000000008dffffff
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [44] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [4c] Express (v1) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- RBE- FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x4, ASPM L0s, Latency L0 <2us, L1 unlimited
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- Retrain- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x2, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise-
                        Slot #0, PowerLimit 15.000W; Interlock- NoCompl-
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
                        Control: AttnInd Off, PwrInd Off, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
                        Changed: MRL- PresDet- LinkState-
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
                RootCap: CRSVisible-
                RootSta: PME ReqID 0000,
                         PMEStatus- PMEPending-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
00: 57 19 79 00 06 01 10 00 21 00 20 0b 00 00 01 00
10: 00 00 f0 ff 00 00 00 00 00 01 01 00 00 00 00 00
20: 00 90 00 90 01 80 f1 8d 00 00 00 00 00 00 00 00
30: 00 00 00 00 44 00 00 00 00 00 00 00 00 00 00 00

However, I still can't get it to work:
  echo 1 > /sys/..../rescan
does not expose the device behind the RC, nor does toggling pcie_link_disable.

 Jocke
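For anyone wanting to cross-check the hex dump against the decoded lspci text, here is a minimal Python sketch. It assumes the standard type-1 (PCI-to-PCI bridge) configuration header layout; the helper names (parse_dump, le, window, decode_bridge) are made up for illustration and are not part of pciutils or the kernel.

```python
# Sketch: decode a few type-1 (PCI-to-PCI bridge) header fields from the
# `lspci -xx` dump in this mail. Helper names are illustrative only.
DUMP = """\
00: 57 19 79 00 06 01 10 00 21 00 20 0b 00 00 01 00
10: 00 00 f0 ff 00 00 00 00 00 01 01 00 00 00 00 00
20: 00 90 00 90 01 80 f1 8d 00 00 00 00 00 00 00 00
30: 00 00 00 00 44 00 00 00 00 00 00 00 00 00 00 00
"""


def parse_dump(text):
    """Turn 'OFFSET: b0 b1 ...' lines into a bytes object."""
    data = bytearray()
    for line in text.strip().splitlines():
        _offset, _, hexbytes = line.partition(":")
        data.extend(int(b, 16) for b in hexbytes.split())
    return bytes(data)


def le(cfg, off, size):
    """Read a little-endian unsigned integer of `size` bytes at `off`."""
    return int.from_bytes(cfg[off:off + size], "little")


def window(cfg, base_off, limit_off, upper_base_off=None, upper_limit_off=None):
    """Decode a 1 MiB-granular bridge memory window as (base, limit)."""
    base = (le(cfg, base_off, 2) & 0xFFF0) << 16
    limit = ((le(cfg, limit_off, 2) & 0xFFF0) << 16) | 0xFFFFF
    # Low nibble 0x1 marks a 64-bit prefetchable window; the upper
    # 32 bits then live in separate registers.
    if upper_base_off is not None and le(cfg, base_off, 2) & 0x1:
        base |= le(cfg, upper_base_off, 4) << 32
        limit |= le(cfg, upper_limit_off, 4) << 32
    return base, limit


def decode_bridge(cfg):
    """Pick out the fields relevant to bus numbering and windows."""
    return {
        "vendor": le(cfg, 0x00, 2),
        "device": le(cfg, 0x02, 2),
        "header_type": cfg[0x0E] & 0x7F,
        "primary": cfg[0x18],
        "secondary": cfg[0x19],
        "subordinate": cfg[0x1A],
        "mem": window(cfg, 0x20, 0x22),
        "pref": window(cfg, 0x24, 0x26, 0x28, 0x2C),
        "cap_ptr": cfg[0x34],
    }


if __name__ == "__main__":
    info = decode_bridge(parse_dump(DUMP))
    print("%04x:%04x type %d" % (info["vendor"], info["device"], info["header_type"]))
    print("bus %02x -> %02x..%02x" % (info["primary"], info["secondary"], info["subordinate"]))
    print("mem  %08x-%08x" % info["mem"])
    print("pref %016x-%016x" % info["pref"])
```

The decoded values agree with the lspci text: secondary and subordinate bus 01, memory window 90000000-900fffff, prefetchable window 80000000-8dffffff, first capability at 0x44. So the bridge itself is programmed for bus 01; the dump by itself says nothing about whether a device answers behind it.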