* Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
@ 2014-09-23 19:03 Andreas Hartmann
  2014-09-23 20:07 ` Alex Williamson
  0 siblings, 1 reply; 42+ messages in thread
From: Andreas Hartmann @ 2014-09-23 19:03 UTC
To: linux-pci

Hello!

For a long time now, I've been using PCIe pass through w/o any problem
with a Gigabyte GA-990XA-UD3/GA-990XA-UD3 mainboard (AMD 990X chipset)
and enabled IOMMU with vfio-pci.

The last kernel working w/o any problem is kernel 3.13.7 (I didn't use
.8 and .9, but I do not think they would have been problematic).

Since 3.14.19 (I didn't test any 3.14 kernel before) I'm encountering a
hard and silent lock up of the complete machine when starting the VM
with the PCIe card passed through.

This is the relevant PCIe card, which locks up the machine when passed
to the VM (output taken while running w/ 3.12.28):

03:00.0 Network controller: Qualcomm Atheros AR93xx Wireless Network Adapter (rev 01)
    Subsystem: Qualcomm Atheros Device 3112
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 17
    Region 0: Memory at fdbc0000 (64-bit, non-prefetchable) [size=128K]
    Expansion ROM at fda00000 [size=64K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/4 Maskable+ 64bit+
        Address: 0000000000000000  Data: 0000
        Masking: 00000000  Pending: 00000000
    Capabilities: [70] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <2us, L1 <64us
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [100 v1] Advanced Error Reporting
        UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout+ NonFatalErr+
        CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
    Capabilities: [140 v1] Virtual Channel
        Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb:    Fixed- WRR32- WRR64- WRR128-
        Ctrl:   ArbSelect=Fixed
        Status: InProgress-
        VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
            Status: NegoPending- InProgress-
    Capabilities: [300 v1] Device Serial Number 00-00-00-00-00-00-00-00
    Kernel driver in use: vfio-pci
    Kernel modules: ath9k

Unbinding it works w/o any problem. The lock up occurs about 4 s
after the start of the VM.

On 3.12.x, I can see the following message on the error terminal when
starting the VM:
vfio-pci: 03:00.0: invalid ROM contents.

I compared AMD-Vi debug output between 3.12 and 3.14, but couldn't see
any difference. I compared /proc/interrupts between 3.12 and 3.14
and couldn't see any difference either so far.

qemu version I'm using is 1.7.0.

It is strange(?) that a second VM using PCI (legacy) pass through works
w/o any problem. I also tried to start the problematic VM w/o the
second VM running - same result: machine is locked up hard.

Do you have any idea what could be going on there? Or how to debug it
to see what happened?

Thanks,
kind regards,
Andreas Hartmann

* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
From: Alex Williamson @ 2014-09-23 20:07 UTC
To: Andreas Hartmann; +Cc: linux-pci

On Tue, 2014-09-23 at 21:03 +0200, Andreas Hartmann wrote:
> [...]
> The last kernel working w/o any problem is kernel 3.13.7 (I didn't use
> .8 and .9, but I do not think they would have been problematic).
>
> Since 3.14.19 (I didn't test any 3.14 kernel before) I'm encountering a
> hard and silent lock up of the complete machine when starting the VM
> with the PCIe card passed through.
> [...]
> On 3.12.x, I can see the following message on the error terminal when
> starting the VM:
> vfio-pci: 03:00.0: invalid ROM contents.
> [...]
> qemu version I'm using is 1.7.0.
> [...]
> Do you have any idea what could be going on there? Or how to debug it
> to see what happened?

Are you able to set up a serial console on this system?  Enabling sysrq
and getting a dump of task states (t) via serial is often the best way
to determine the problem.

There weren't many vfio changes between 3.13 and 3.14.  Have you tested
whether the problem still occurs on 3.16 + newer QEMU?  Maybe also
remove the ROM from the equation with the rombar=0 option for the
vfio-pci device in QEMU.  Thanks,

Alex
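
The rombar=0 suggestion can be sketched as a QEMU command-line fragment. This is only a minimal sketch, assuming the card sits at host address 03:00.0 as in the lspci output; the helper name vfio_device_arg is made up for illustration, and the rest of the QEMU command line is left out.

```shell
#!/bin/sh
# Hypothetical helper (not part of QEMU): builds the value for QEMU's
# -device option so that the option ROM BAR of the passed-through card
# is not exposed to the guest (rombar=0).
vfio_device_arg() {
    host_addr="$1"    # PCI address on the host, e.g. 03:00.0
    printf 'vfio-pci,host=%s,rombar=0\n' "$host_addr"
}

# Example: print the option for the Atheros card from the report.
vfio_device_arg 03:00.0
```

The printed string would then be passed along the lines of `-device "$(vfio_device_arg 03:00.0)"` next to the existing QEMU options.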

* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
From: Andreas Hartmann @ 2014-09-24 14:54 UTC
To: Alex Williamson; +Cc: linux-pci

Alex Williamson wrote:
> On Tue, 2014-09-23 at 21:03 +0200, Andreas Hartmann wrote:
>> [...]
>> Do you have any idea what could be going on there? Or how to debug it
>> to see what happened?
>
> Are you able to set up a serial console on this system?  Enabling sysrq
> and getting a dump of task states (t) via serial is often the best way
> to determine the problem.

I'll try it. Most probably it should be something like

console=tty0 console=ttyS0,115200n8

as kernel option on the sending machine and

/sbin/agetty -h -t 60 ttyS0 115200 vt102

on the client.

Probably my biggest problem: I don't have a second machine with a serial
port :-(. I hope this USB to serial adapter will be supported on the
client side (as receiver): Logilink USB 2.0 Seriell Adapter

> There weren't many vfio changes between 3.13 and 3.14.

Could it be a pci problem, too?

> Have you tested whether the problem still occurs on 3.16 +

Same problem.

> newer QEMU?

Reluctantly - it is a production system.

> Maybe also remove the ROM from the equation with the
> rombar=0 option for the vfio-pci device in QEMU.

Same problem :-(.

The machine really is completely dead: it doesn't even respond to pings
any more.

Regards,
Andreas

* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
From: Andreas Hartmann @ 2014-09-24 17:16 UTC
To: Alex Williamson; +Cc: linux-pci

Andreas Hartmann wrote:
> Alex Williamson wrote:
[...]
>> Are you able to set up a serial console on this system?  Enabling sysrq
>> and getting a dump of task states (t) via serial is often the best way
>> to determine the problem.
>
> I'll try it.

I did it now like this:

On the client: minicom.
Magic SysRq via minicom:
Ctrl-A shift-f [SysRq key, like m or t, ...]

On the sender:
As kernel option:
console=tty0 console=ttyS0,115200n8

Before that, you have to enable SysRq via
sysctl -w kernel.sysrq=1

After doing all of this, you can test w/ m or t. The output will appear
on tty9 (via Alt-F10 on openSUSE).

But what was the result after starting the VM?
-> The machine is definitely completely dead. It doesn't react to
anything any more.

Remarkable: After a hard reset, the USB keyboard doesn't work any more in
Linux (but it does in Grub 2), because the driver gets a timeout
accessing the USB 3 hw (other USB chips are working fine). It is
necessary to switch off the machine completely and disconnect it from
the mains. After ~ 30s, it can be powered up again and all is working
fine again (after repairing the broken FS on the host on which the VM
resides).

Any more hints are welcome :-)

Thanks,
kind regards,
Andreas
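
The setup described in this message can be summarised as a short sketch. The commands in the comments are the ones from the message plus /proc/sysrq-trigger, the kernel's interface for triggering a SysRq function without a keyboard. The helper sysrq_allows_task_dump is a made-up name, and the meaning of bit 8 (debugging dumps such as the task list) is an assumption taken from the kernel's SysRq documentation.

```shell
#!/bin/sh
# On the machine under test (needs root), as in the message above:
#   kernel option:  console=tty0 console=ttyS0,115200n8
#   sysctl -w kernel.sysrq=1          # enable all SysRq functions
#   echo t > /proc/sysrq-trigger      # dump task states without a keyboard
#
# Hypothetical helper: given a kernel.sysrq value, report whether the
# task-state dump (t) may be triggered from keyboard or serial console.
# Assumption: 1 enables everything, otherwise bit 8 gates debugging dumps.
sysrq_allows_task_dump() {
    value="$1"
    [ "$value" -eq 1 ] || [ $(( value & 8 )) -ne 0 ]
}

# Example check against the running system, if the file is readable.
if [ -r /proc/sys/kernel/sysrq ]; then
    current=$(cat /proc/sys/kernel/sysrq)
    if sysrq_allows_task_dump "$current"; then
        echo "kernel.sysrq=$current: task dump (t) is allowed"
    else
        echo "kernel.sysrq=$current: task dump (t) is not allowed"
    fi
fi
```

If the check reports that the dump is not allowed, setting kernel.sysrq=1 as described in the message is the simplest way to make sure m and t work over the serial line.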

* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
From: Andreas Hartmann @ 2014-10-10 9:39 UTC
To: Alex Williamson; +Cc: linux-pci

In short: I retested w/ qemu 2.1.0 and Linux 3.17.0 - no change in behaviour.

Alex Williamson wrote:
> On Tue, 2014-09-23 at 21:03 +0200, Andreas Hartmann wrote:
>> [...]
>> Do you have any idea what could be going on there? Or how to debug it
>> to see what happened?
>
> There weren't many vfio changes between 3.13 and 3.14.

Could it be a pci problem, too? It is strange that there is no problem
with the pci card, but the pcie card hangs the machine!

> Have you tested whether the problem still occurs on 3.16 +

Same problem with 3.17.0.

> newer QEMU?

Same problem with qemu 2.1.0.

> Maybe also remove the ROM from the equation with the
> rombar=0 option for the vfio-pci device in QEMU.

Same problem :-(.

The machine really is completely dead: it doesn't even respond to pings
any more.

Regards,
Andreas

* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
From: Bjorn Helgaas @ 2014-10-10 14:37 UTC
To: Andreas Hartmann; +Cc: Alex Williamson, linux-pci

On Fri, Oct 10, 2014 at 3:39 AM, Andreas Hartmann
<andihartmann@freenet.de> wrote:
> In short: I retested w/ qemu 2.1.0 and Linux 3.17.0 - no change in behaviour.
>
> Alex Williamson wrote:
>> On Tue, 2014-09-23 at 21:03 +0200, Andreas Hartmann wrote:
>>> [...]
>>> The last kernel working w/o any problem is kernel 3.13.7 (I didn't use
>>> .8 and .9, but I do not think they would have been problematic).
>>>
>>> Since 3.14.19 (I didn't test any 3.14 kernel before) I'm encountering a
>>> hard and silent lock up of the complete machine when starting the VM
>>> with the PCIe card passed through.

Since we're not really making any progress on this yet, would it be
possible to bisect it?  We already know that 3.13.7 works and 3.14.19
fails, and "git bisect start v3.14 v3.13" says it's about 13 steps.  I
know that's still quite a bit of work, but at least it sounds like the
problem is easy to reproduce.

Bjorn
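
The "about 13 steps" figure follows from bisection halving the candidate range in each round. A rough sketch, assuming a hypothetical commit count (the real number for v3.13..v3.14 could be obtained with `git rev-list --count v3.13..v3.14` in a kernel clone); bisect_steps is an illustrative helper, not a git command, and git's own estimate can differ slightly because kernel history is not linear.

```shell
#!/bin/sh
# Illustrative helper: smallest number of halvings needed to narrow
# n candidate commits down to one, i.e. ceil(log2(n)).
bisect_steps() {
    n="$1"
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$(( span * 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

# Assumed example value, not measured from the kernel tree.
bisect_steps 8192
```

Each step means one kernel build, one reboot, and one VM start, so the estimate gives a feel for the effort before starting.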

* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
From: Andreas Hartmann @ 2014-10-10 14:49 UTC
To: Bjorn Helgaas; +Cc: Alex Williamson, linux-pci

Bjorn Helgaas wrote:
> On Fri, Oct 10, 2014 at 3:39 AM, Andreas Hartmann
> <andihartmann@freenet.de> wrote:
>> [...]
>
> Since we're not really making any progress on this yet, would it be
> possible to bisect it?  We already know that 3.13.7 works and 3.14.19
> fails, and "git bisect start v3.14 v3.13" says it's about 13 steps.  I
> know that's still quite a bit of work, but at least it sounds like the
> problem is easy to reproduce.

Which git repository would be best to use? Is it possible to do one
checkout and then always work from that?

Unfortunately my internet connection is very slow :-(.

Thanks for your hint!

Regards,
Andreas

* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
From: Bjorn Helgaas @ 2014-10-10 15:55 UTC
To: Andreas Hartmann; +Cc: Alex Williamson, linux-pci

On Fri, Oct 10, 2014 at 8:49 AM, Andreas Hartmann
<andihartmann@freenet.de> wrote:
> Bjorn Helgaas wrote:
>> [...]
>> Since we're not really making any progress on this yet, would it be
>> possible to bisect it?  We already know that 3.13.7 works and 3.14.19
>> fails, and "git bisect start v3.14 v3.13" says it's about 13 steps.  I
>> know that's still quite a bit of work, but at least it sounds like the
>> problem is easy to reproduce.
>
> Which git repository would be best to use?

The linux-stable repository [1] contains both the v3.13.x and the
v3.14.x branches, but apparently you can't bisect directly between
v3.13.7 and v3.14.19:

  $ git bisect start v3.14.19 v3.13.7
  Bisecting: a merge base must be tested
  [d8ec26d7f8287f5788a494f56e8814210f0e64be] Linux 3.13

I'm not an expert at bisecting, but here's what I would try:

  - Clone the repo from [1] (this same repo can be used for all your
    testing)
  - Checkout, build, and test v3.14
  - If v3.14 works (unlikely), bisect between v3.14 and v3.14.19 to
    see which change broke it
  - If v3.14 fails, checkout, build, and test v3.13
  - If v3.13 fails (very unlikely), bisect between v3.13 and v3.13.7
    to see which change fixed it
  - If v3.13 works and v3.14 fails (most likely), bisect between v3.13
    and v3.14

Bjorn

[1] git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
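
The decision procedure in this message can be sketched as a small helper that turns the two manual test results into the matching bisect command. The function name choose_bisect and its "good"/"bad" arguments are made up for illustration. The --term-old/--term-new options used for the "which change fixed it" case are an assumption about the git version in use: they were added to git after this thread was written.

```shell
#!/bin/sh
# Illustrative helper: given the manual test results for v3.14 and v3.13
# ("good" = VM starts, "bad" = host locks up), print the bisect command
# that matches the procedure described above.
choose_bisect() {
    v314="$1"
    v313="$2"
    if [ "$v314" = good ]; then
        # The breakage came in with a later 3.14.y stable update.
        echo "git bisect start v3.14.19 v3.14"
    elif [ "$v313" = bad ]; then
        # Looking for the commit that FIXED it, so use custom terms.
        echo "git bisect start --term-old=broken --term-new=fixed v3.13.7 v3.13"
    else
        # Most likely case: regression between the two releases.
        echo "git bisect start v3.14 v3.13"
    fi
}

# Example: v3.14 locks up, v3.13 works.
choose_bisect bad good
```

After the start command, every round is the same: build and boot the kernel git checked out, start the VM, then report the result with `git bisect good` or `git bisect bad` (or the custom terms) until git names a single commit.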

* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
From: Andreas Hartmann @ 2014-10-10 16:09 UTC
To: Bjorn Helgaas; +Cc: Alex Williamson, linux-pci

Bjorn Helgaas wrote:
> On Fri, Oct 10, 2014 at 8:49 AM, Andreas Hartmann
> <andihartmann@freenet.de> wrote:
>> [...]
>> Which git repository should I use best?
>
> The linux-stable repository [1] contains both the v3.13.x and the
> v3.14.x branches, but apparently you can't bisect directly between
> v3.13.7 and v3.14.19:

I know that the first version after 3.13.0 (patch-v3.13-next-20140121)
is already broken. Therefore, it must be between 3.13.7 and
patch-v3.13-next-20140121.

Thanks,
regards,
Andreas
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-10 16:09 ` Andreas Hartmann @ 2014-10-10 16:41 ` Bjorn Helgaas 2014-10-10 22:32 ` Andreas Hartmann 0 siblings, 1 reply; 42+ messages in thread From: Bjorn Helgaas @ 2014-10-10 16:41 UTC (permalink / raw) To: Andreas Hartmann; +Cc: Alex Williamson, linux-pci On Fri, Oct 10, 2014 at 10:09 AM, Andreas Hartmann <andihartmann@freenet.de> wrote: > Bjorn Helgaas wrote: >> On Fri, Oct 10, 2014 at 8:49 AM, Andreas Hartmann >> <andihartmann@freenet.de> wrote: >>> Bjorn Helgaas wrote: >>>> On Fri, Oct 10, 2014 at 3:39 AM, Andreas Hartmann >>>> <andihartmann@freenet.de> wrote: >>>>> shortly: I retested w/ qemu 2.1.0 and Linux 3.17.0 - no change in behaviour. >>>>> >>>>> Alex Williamson wrote: >>>>>> On Tue, 2014-09-23 at 21:03 +0200, Andreas Hartmann wrote: >>>>>>> Hello! >>>>>>> >>>>>>> Since long time now, I'm using w/o any problem PCIe pass through with a >>>>>>> Gigabyte GA-990XA-UD3/GA-990XA-UD3 mainboard (AMD 990X chipset) and >>>>>>> enabled IOMMU with vfio-pci. >>>>>>> >>>>>>> The last kernel working w/o any problem is kernel 3.13.7 (I didn't use >>>>>>> .8 and .9, but I do not think they would have been problematic). >>>>>>> >>>>>>> Since 3.14.19 (I didn't test any 3.14 kernel before) I'm encountering a >>>>>>> hard and silent lock up of the complete machine when starting the VM >>>>>>> with the PCIe card passed through. >>>> >>>> Since we're not really making any progress on this yet, would it be >>>> possible to bisect it? We already know that 3.13.7 works and 3.14.19 >>>> fails, and "git bisect start v3.14 v3.13" says it's about 13 steps. I >>>> know that's still quite a bit of work, but at least it sounds like the >>>> problem is easy to reproduce. >>> >>> Which git repository should I use best? 
>> >> The linux-stable repository [1] contains both the v3.13.x and the >> v3.14.x branches, but apparently you can't bisect directly between >> v3.13.7 and v3.14.19: > > I know that the first version after 3.13.0 (patch-v3.13-next-20140121) > is already broken. Therefore, it must be between 3.13.7 and > patch-v3.13-next-20140121. I assume patch-v3.13-next-20140121 is the linux-next tree from 20140121. v3.13 was released on Jan 19, 2014, so 20140121 was during the merge window, and the linux-next tree from that day would be Linus' tree (v3.13 plus whatever he had merged during the first day or two), plus all the remaining stuff in subsystem trees that had not yet been merged. The result (patch-v3.13-next-20140121) should be a fairly good approximation of v3.14. v3.13.7 is a branch based on v3.13. patch-v3.13-next-20140121 would essentially be a branch based on v3.13 also. So while they share a common v3.13 ancestor, I don't think you can bisect directly between them. And linux-next is rebuilt from scratch every day, so I don't think there is a git tree with patch-v3.13-next-20140121 in it anyway. Bjorn ^ permalink raw reply [flat|nested] 42+ messages in thread
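Since the stable branch and the linux-next snapshot both descend from v3.13, one way to follow the suggestion is to bisect mainline between the v3.13 and v3.14 tags. The sketch below only estimates how many build-and-boot rounds such a bisect needs. The helper name bisect_steps is made up for illustration, and the real commit count has to come from your own clone.

```shell
#!/bin/sh
# bisect_steps N: print ceil(log2(N)), a rough estimate of how many kernels
# must be built and tested when bisecting a range of N commits.
# (Hypothetical helper; git prints its own estimate at every step.)
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Typical use inside a kernel clone (not executed here):
#   bisect_steps "$(git rev-list --count v3.13..v3.14)"
#   git bisect start v3.14 v3.13     # bad revision first, then good
#   ... build, boot, start the VM, then "git bisect good" or "git bisect bad"
```

With a turnaround of a few minutes per kernel, the estimate gives a feel for whether the bisect fits into one evening.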
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-10 16:41 ` Bjorn Helgaas @ 2014-10-10 22:32 ` Andreas Hartmann 2014-10-10 22:54 ` Bjorn Helgaas 0 siblings, 1 reply; 42+ messages in thread From: Andreas Hartmann @ 2014-10-10 22:32 UTC (permalink / raw) To: Bjorn Helgaas, Alex Williamson; +Cc: linux-pci Bjorn Helgaas wrote: > On Fri, Oct 10, 2014 at 10:09 AM, Andreas Hartmann > <andihartmann@freenet.de> wrote: >> Bjorn Helgaas wrote: >>> On Fri, Oct 10, 2014 at 8:49 AM, Andreas Hartmann >>> <andihartmann@freenet.de> wrote: >>>> Bjorn Helgaas wrote: >>>>> On Fri, Oct 10, 2014 at 3:39 AM, Andreas Hartmann >>>>> <andihartmann@freenet.de> wrote: >>>>>> shortly: I retested w/ qemu 2.1.0 and Linux 3.17.0 - no change in behaviour. >>>>>> >>>>>> Alex Williamson wrote: >>>>>>> On Tue, 2014-09-23 at 21:03 +0200, Andreas Hartmann wrote: >>>>>>>> Hello! >>>>>>>> >>>>>>>> Since long time now, I'm using w/o any problem PCIe pass through with a >>>>>>>> Gigabyte GA-990XA-UD3/GA-990XA-UD3 mainboard (AMD 990X chipset) and >>>>>>>> enabled IOMMU with vfio-pci. >>>>>>>> >>>>>>>> The last kernel working w/o any problem is kernel 3.13.7 (I didn't use >>>>>>>> .8 and .9, but I do not think they would have been problematic). >>>>>>>> >>>>>>>> Since 3.14.19 (I didn't test any 3.14 kernel before) I'm encountering a >>>>>>>> hard and silent lock up of the complete machine when starting the VM >>>>>>>> with the PCIe card passed through. >>>>> >>>>> Since we're not really making any progress on this yet, would it be >>>>> possible to bisect it? We already know that 3.13.7 works and 3.14.19 >>>>> fails, and "git bisect start v3.14 v3.13" says it's about 13 steps. I >>>>> know that's still quite a bit of work, but at least it sounds like the >>>>> problem is easy to reproduce. >>>> >>>> Which git repository should I use best? 
>>> >>> The linux-stable repository [1] contains both the v3.13.x and the >>> v3.14.x branches, but apparently you can't bisect directly between >>> v3.13.7 and v3.14.19: >> >> I know that the first version after 3.13.0 (patch-v3.13-next-20140121) >> is already broken. Therefore, it must be between 3.13.7 and >> patch-v3.13-next-20140121. Ok, this is the result of git bisect: 425c1b223dac456d00a61fd6b451b6d1cf00d065 is the first bad commit commit 425c1b223dac456d00a61fd6b451b6d1cf00d065 Author: Alex Williamson <alex.williamson@redhat.com> Date: Tue Dec 17 16:43:51 2013 -0700 PCI: Add Virtual Channel to save/restore support While we don't really have any infrastructure for making use of VC support, the system BIOS can configure the topology to non-default VC values prior to boot. This may be due to silicon bugs, desire to reserve traffic classes, or perhaps just BIOS bugs. When we reset devices, the VC configuration may return to default values, which can be incompatible with devices upstream. For instance, Nvidia GRID cards provide a PCIe switch and some number of GPUs, all supporting VC. The power-on default for VC is to support TC0-7 across VC0, however some platforms will only enable TC0/VC0 mapping across the topology. When we do a secondary bus reset on the downstream switch port, the GPU is reset to a TC0-7/VC0 mapping while the opposite end of the link only enables TC0/VC0. If the GPU attempts to use TC1-7, it fails. This patch attempts to provide complete support for VC save/restore, even beyond the minimally required use case above. This includes save/restore and reload of the arbitration table, save/restore and reload of the port arbitration tables, and re-enabling of the channels for VC, VC9, and MFVC capabilities. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Kind regards, Andreas ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-10 22:32 ` Andreas Hartmann @ 2014-10-10 22:54 ` Bjorn Helgaas 2014-10-11 6:20 ` Andreas Hartmann 0 siblings, 1 reply; 42+ messages in thread From: Bjorn Helgaas @ 2014-10-10 22:54 UTC (permalink / raw) To: Andreas Hartmann; +Cc: Alex Williamson, linux-pci On Sat, Oct 11, 2014 at 12:32:19AM +0200, Andreas Hartmann wrote: > Bjorn Helgaas wrote: > > On Fri, Oct 10, 2014 at 10:09 AM, Andreas Hartmann > > <andihartmann@freenet.de> wrote: > >> Bjorn Helgaas wrote: > >>> On Fri, Oct 10, 2014 at 8:49 AM, Andreas Hartmann > >>> <andihartmann@freenet.de> wrote: > >>>> Bjorn Helgaas wrote: > >>>>> On Fri, Oct 10, 2014 at 3:39 AM, Andreas Hartmann > >>>>> <andihartmann@freenet.de> wrote: > >>>>>> shortly: I retested w/ qemu 2.1.0 and Linux 3.17.0 - no change in behaviour. > >>>>>> > >>>>>> Alex Williamson wrote: > >>>>>>> On Tue, 2014-09-23 at 21:03 +0200, Andreas Hartmann wrote: > >>>>>>>> Hello! > >>>>>>>> > >>>>>>>> Since long time now, I'm using w/o any problem PCIe pass through with a > >>>>>>>> Gigabyte GA-990XA-UD3/GA-990XA-UD3 mainboard (AMD 990X chipset) and > >>>>>>>> enabled IOMMU with vfio-pci. > >>>>>>>> > >>>>>>>> The last kernel working w/o any problem is kernel 3.13.7 (I didn't use > >>>>>>>> .8 and .9, but I do not think they would have been problematic). > >>>>>>>> > >>>>>>>> Since 3.14.19 (I didn't test any 3.14 kernel before) I'm encountering a > >>>>>>>> hard and silent lock up of the complete machine when starting the VM > >>>>>>>> with the PCIe card passed through. > >>>>> > >>>>> Since we're not really making any progress on this yet, would it be > >>>>> possible to bisect it? We already know that 3.13.7 works and 3.14.19 > >>>>> fails, and "git bisect start v3.14 v3.13" says it's about 13 steps. I > >>>>> know that's still quite a bit of work, but at least it sounds like the > >>>>> problem is easy to reproduce. > >>>> > >>>> Which git repository should I use best? 
> >>> > >>> The linux-stable repository [1] contains both the v3.13.x and the > >>> v3.14.x branches, but apparently you can't bisect directly between > >>> v3.13.7 and v3.14.19: > >> > >> I know that the first version after 3.13.0 (patch-v3.13-next-20140121) > >> is already broken. Therefore, it must be between 3.13.7 and > >> patch-v3.13-next-20140121. > > > Ok, this is the result of git bisect: > > 425c1b223dac456d00a61fd6b451b6d1cf00d065 is the first bad commit > commit 425c1b223dac456d00a61fd6b451b6d1cf00d065 > Author: Alex Williamson <alex.williamson@redhat.com> > Date: Tue Dec 17 16:43:51 2013 -0700 > > PCI: Add Virtual Channel to save/restore support > > While we don't really have any infrastructure for making use of VC > support, the system BIOS can configure the topology to non-default > VC values prior to boot. This may be due to silicon bugs, desire to > reserve traffic classes, or perhaps just BIOS bugs. When we reset > devices, the VC configuration may return to default values, which can > be incompatible with devices upstream. For instance, Nvidia GRID > cards provide a PCIe switch and some number of GPUs, all supporting > VC. The power-on default for VC is to support TC0-7 across VC0, > however some platforms will only enable TC0/VC0 mapping across the > topology. When we do a secondary bus reset on the downstream switch > port, the GPU is reset to a TC0-7/VC0 mapping while the opposite end > of the link only enables TC0/VC0. If the GPU attempts to use TC1-7, > it fails. > > This patch attempts to provide complete support for VC save/restore, > even beyond the minimally required use case above. This includes > save/restore and reload of the arbitration table, save/restore and > reload of the port arbitration tables, and re-enabling of the > channels for VC, VC9, and MFVC capabilities. 
>
>     Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Wow, I'm amazed that you could get that done so fast... you must have spent
your whole day working on this!

To double-check this, can you try applying the patch below?  It should be
enough to make things work if 425c1b223dac is really what's causing the
trouble.

This patch is based on v3.17, but 425c1b223dac appeared in v3.14, so you
should be able to apply it to v3.14 or any later kernel.

Bjorn


diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2c9ac70254e2..8ef8bc56a584 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1007,8 +1007,6 @@ int pci_save_state(struct pci_dev *dev)
 		return i;
 	if ((i = pci_save_pcix_state(dev)) != 0)
 		return i;
-	if ((i = pci_save_vc_state(dev)) != 0)
-		return i;
 	return 0;
 }
 EXPORT_SYMBOL(pci_save_state);
@@ -1072,7 +1070,6 @@ void pci_restore_state(struct pci_dev *dev)
 	/* PCI Express register must be restored first */
 	pci_restore_pcie_state(dev);
 	pci_restore_ats_state(dev);
-	pci_restore_vc_state(dev);
 
 	pci_restore_config_space(dev);
 
@@ -2170,8 +2167,6 @@ void pci_allocate_cap_save_buffers(struct pci_dev *dev)
 	if (error)
 		dev_err(&dev->dev,
 			"unable to preallocate PCI-X save buffer\n");
-
-	pci_allocate_vc_save_buffers(dev);
 }
 
 void pci_free_cap_save_buffers(struct pci_dev *dev)
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-10 22:54 ` Bjorn Helgaas @ 2014-10-11 6:20 ` Andreas Hartmann 2014-10-15 8:04 ` Alex Williamson 0 siblings, 1 reply; 42+ messages in thread From: Andreas Hartmann @ 2014-10-11 6:20 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: Alex Williamson, linux-pci Bjorn Helgaas wrote: > On Sat, Oct 11, 2014 at 12:32:19AM +0200, Andreas Hartmann wrote: >> Bjorn Helgaas wrote: >>> On Fri, Oct 10, 2014 at 10:09 AM, Andreas Hartmann >>> <andihartmann@freenet.de> wrote: >>>> Bjorn Helgaas wrote: >>>>> On Fri, Oct 10, 2014 at 8:49 AM, Andreas Hartmann >>>>> <andihartmann@freenet.de> wrote: >>>>>> Bjorn Helgaas wrote: >>>>>>> On Fri, Oct 10, 2014 at 3:39 AM, Andreas Hartmann >>>>>>> <andihartmann@freenet.de> wrote: >>>>>>>> shortly: I retested w/ qemu 2.1.0 and Linux 3.17.0 - no change in behaviour. >>>>>>>> >>>>>>>> Alex Williamson wrote: >>>>>>>>> On Tue, 2014-09-23 at 21:03 +0200, Andreas Hartmann wrote: >>>>>>>>>> Hello! >>>>>>>>>> >>>>>>>>>> Since long time now, I'm using w/o any problem PCIe pass through with a >>>>>>>>>> Gigabyte GA-990XA-UD3/GA-990XA-UD3 mainboard (AMD 990X chipset) and >>>>>>>>>> enabled IOMMU with vfio-pci. >>>>>>>>>> >>>>>>>>>> The last kernel working w/o any problem is kernel 3.13.7 (I didn't use >>>>>>>>>> .8 and .9, but I do not think they would have been problematic). >>>>>>>>>> >>>>>>>>>> Since 3.14.19 (I didn't test any 3.14 kernel before) I'm encountering a >>>>>>>>>> hard and silent lock up of the complete machine when starting the VM >>>>>>>>>> with the PCIe card passed through. >>>>>>> >>>>>>> Since we're not really making any progress on this yet, would it be >>>>>>> possible to bisect it? We already know that 3.13.7 works and 3.14.19 >>>>>>> fails, and "git bisect start v3.14 v3.13" says it's about 13 steps. I >>>>>>> know that's still quite a bit of work, but at least it sounds like the >>>>>>> problem is easy to reproduce. 
>>>>>> >>>>>> Which git repository should I use best? >>>>> >>>>> The linux-stable repository [1] contains both the v3.13.x and the >>>>> v3.14.x branches, but apparently you can't bisect directly between >>>>> v3.13.7 and v3.14.19: >>>> >>>> I know that the first version after 3.13.0 (patch-v3.13-next-20140121) >>>> is already broken. Therefore, it must be between 3.13.7 and >>>> patch-v3.13-next-20140121. >> >> >> Ok, this is the result of git bisect: >> >> 425c1b223dac456d00a61fd6b451b6d1cf00d065 is the first bad commit >> commit 425c1b223dac456d00a61fd6b451b6d1cf00d065 >> Author: Alex Williamson <alex.williamson@redhat.com> >> Date: Tue Dec 17 16:43:51 2013 -0700 >> >> PCI: Add Virtual Channel to save/restore support >> >> While we don't really have any infrastructure for making use of VC >> support, the system BIOS can configure the topology to non-default >> VC values prior to boot. This may be due to silicon bugs, desire to >> reserve traffic classes, or perhaps just BIOS bugs. When we reset >> devices, the VC configuration may return to default values, which can >> be incompatible with devices upstream. For instance, Nvidia GRID >> cards provide a PCIe switch and some number of GPUs, all supporting >> VC. The power-on default for VC is to support TC0-7 across VC0, >> however some platforms will only enable TC0/VC0 mapping across the >> topology. When we do a secondary bus reset on the downstream switch >> port, the GPU is reset to a TC0-7/VC0 mapping while the opposite end >> of the link only enables TC0/VC0. If the GPU attempts to use TC1-7, >> it fails. >> >> This patch attempts to provide complete support for VC save/restore, >> even beyond the minimally required use case above. This includes >> save/restore and reload of the arbitration table, save/restore and >> reload of the port arbitration tables, and re-enabling of the >> channels for VC, VC9, and MFVC capabilities. 
>>
>>     Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
>>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> Wow, I'm amazed that you could get that done so fast... you must have spent
> your whole day working on this!

If I had been more familiar with the versioning of the kernels, had a
faster internet connection, and had not been bitten by another bug in
systemd when booting with a broken fs (but I found a cool workaround now
:-)), I would have been much faster: with my 8-core machine and 8 GB of
RAM for compiling the kernel, my special kernel config (which I've been
using since 3.10) containing only what I need, and parts of the process
automated, a turnaround of ~7 minutes is possible :-).
I also had no problem with reproducibility, because the problem always
comes up 1 or 2 secs after the start of the VM.

>
> To double-check this, can you try applying the patch below?  It should be
> enough to make things work if 425c1b223dac is really what's causing the
> trouble.
>
> This patch is based on v3.17, but 425c1b223dac appeared in v3.14, so you
> should be able to apply it to v3.14 or any later kernel.
>
> Bjorn
>
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 2c9ac70254e2..8ef8bc56a584 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1007,8 +1007,6 @@ int pci_save_state(struct pci_dev *dev)
>  		return i;
>  	if ((i = pci_save_pcix_state(dev)) != 0)
>  		return i;
> -	if ((i = pci_save_vc_state(dev)) != 0)
> -		return i;
>  	return 0;
>  }
>  EXPORT_SYMBOL(pci_save_state);
> @@ -1072,7 +1070,6 @@ void pci_restore_state(struct pci_dev *dev)
>  	/* PCI Express register must be restored first */
>  	pci_restore_pcie_state(dev);
>  	pci_restore_ats_state(dev);
> -	pci_restore_vc_state(dev);
>
>  	pci_restore_config_space(dev);
>
> @@ -2170,8 +2167,6 @@ void pci_allocate_cap_save_buffers(struct pci_dev *dev)
>  	if (error)
>  		dev_err(&dev->dev,
>  			"unable to preallocate PCI-X save buffer\n");
> -
> -	pci_allocate_vc_save_buffers(dev);
>  }
>
>  void pci_free_cap_save_buffers(struct pci_dev *dev)
>

This patch confirmed the git bisect result. I applied it to
patch-v3.13-next-20140122 and the machine worked fine :-).


Thanks,
Regards,
Andreas
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-11 6:20 ` Andreas Hartmann @ 2014-10-15 8:04 ` Alex Williamson 2014-10-17 1:04 ` Andreas Hartmann 0 siblings, 1 reply; 42+ messages in thread From: Alex Williamson @ 2014-10-15 8:04 UTC (permalink / raw) To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci On Sat, 2014-10-11 at 08:20 +0200, Andreas Hartmann wrote: > Bjorn Helgaas wrote: > > On Sat, Oct 11, 2014 at 12:32:19AM +0200, Andreas Hartmann wrote: > >> Bjorn Helgaas wrote: > >>> On Fri, Oct 10, 2014 at 10:09 AM, Andreas Hartmann > >>> <andihartmann@freenet.de> wrote: > >>>> Bjorn Helgaas wrote: > >>>>> On Fri, Oct 10, 2014 at 8:49 AM, Andreas Hartmann > >>>>> <andihartmann@freenet.de> wrote: > >>>>>> Bjorn Helgaas wrote: > >>>>>>> On Fri, Oct 10, 2014 at 3:39 AM, Andreas Hartmann > >>>>>>> <andihartmann@freenet.de> wrote: > >>>>>>>> shortly: I retested w/ qemu 2.1.0 and Linux 3.17.0 - no change in behaviour. > >>>>>>>> > >>>>>>>> Alex Williamson wrote: > >>>>>>>>> On Tue, 2014-09-23 at 21:03 +0200, Andreas Hartmann wrote: > >>>>>>>>>> Hello! > >>>>>>>>>> > >>>>>>>>>> Since long time now, I'm using w/o any problem PCIe pass through with a > >>>>>>>>>> Gigabyte GA-990XA-UD3/GA-990XA-UD3 mainboard (AMD 990X chipset) and > >>>>>>>>>> enabled IOMMU with vfio-pci. > >>>>>>>>>> > >>>>>>>>>> The last kernel working w/o any problem is kernel 3.13.7 (I didn't use > >>>>>>>>>> .8 and .9, but I do not think they would have been problematic). > >>>>>>>>>> > >>>>>>>>>> Since 3.14.19 (I didn't test any 3.14 kernel before) I'm encountering a > >>>>>>>>>> hard and silent lock up of the complete machine when starting the VM > >>>>>>>>>> with the PCIe card passed through. > >>>>>>> > >>>>>>> Since we're not really making any progress on this yet, would it be > >>>>>>> possible to bisect it? We already know that 3.13.7 works and 3.14.19 > >>>>>>> fails, and "git bisect start v3.14 v3.13" says it's about 13 steps. 
I > >>>>>>> know that's still quite a bit of work, but at least it sounds like the > >>>>>>> problem is easy to reproduce. > >>>>>> > >>>>>> Which git repository should I use best? > >>>>> > >>>>> The linux-stable repository [1] contains both the v3.13.x and the > >>>>> v3.14.x branches, but apparently you can't bisect directly between > >>>>> v3.13.7 and v3.14.19: > >>>> > >>>> I know that the first version after 3.13.0 (patch-v3.13-next-20140121) > >>>> is already broken. Therefore, it must be between 3.13.7 and > >>>> patch-v3.13-next-20140121. > >> > >> > >> Ok, this is the result of git bisect: > >> > >> 425c1b223dac456d00a61fd6b451b6d1cf00d065 is the first bad commit > >> commit 425c1b223dac456d00a61fd6b451b6d1cf00d065 > >> Author: Alex Williamson <alex.williamson@redhat.com> > >> Date: Tue Dec 17 16:43:51 2013 -0700 > >> > >> PCI: Add Virtual Channel to save/restore support > >> > >> While we don't really have any infrastructure for making use of VC > >> support, the system BIOS can configure the topology to non-default > >> VC values prior to boot. This may be due to silicon bugs, desire to > >> reserve traffic classes, or perhaps just BIOS bugs. When we reset > >> devices, the VC configuration may return to default values, which can > >> be incompatible with devices upstream. For instance, Nvidia GRID > >> cards provide a PCIe switch and some number of GPUs, all supporting > >> VC. The power-on default for VC is to support TC0-7 across VC0, > >> however some platforms will only enable TC0/VC0 mapping across the > >> topology. When we do a secondary bus reset on the downstream switch > >> port, the GPU is reset to a TC0-7/VC0 mapping while the opposite end > >> of the link only enables TC0/VC0. If the GPU attempts to use TC1-7, > >> it fails. > >> > >> This patch attempts to provide complete support for VC save/restore, > >> even beyond the minimally required use case above. 
This includes > >> save/restore and reload of the arbitration table, save/restore and > >> reload of the port arbitration tables, and re-enabling of the > >> channels for VC, VC9, and MFVC capabilities. > >> > >> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > >> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> > > > > Wow, I'm amazed that you could get that done so fast... you must have spent > > your whole day working on this! > > If I would have been more familiar with the versioning of the kernels > and if I would have a faster internet connection and if there wouldn't > be another bug in systemd, which has bitten me on booting with broken fs > (but I found a cool workaround now :-)), I would have been much faster: > my 8 core machine and 8 GB of RAM, where I've been compiling the kernel > in and my special kernel config (which Im using since 3.10) only > containing my requests, with parts of the process automated makes it > possible to have a turn around of ~ 7 minutes :-). > I too had no problem with reproducibility, because the problem always > comes up at the start of the vm after 1 or 2 secs. Hi Andreas, Sorry for the breakage. Is it possible to run lspci on the device in a loop from the host and capture whether we're failing to restore some of the VC bits to their previous state? Does the problem also occur if you unbind from host driver, echo 1 > reset in pci-sysfs, and re-bind to the host? I'll also try to reproduce on my 990fx system, but I won't be able to do that until next week due to travel. Thanks, Alex > > To double-check this, can you try applying the patch below? It should be > > enough to make things work if 425c1b223dac is really what's causing the > > trouble. > > > > This patch is based on v3.17, but 425c1b223dac appeared in v3.14, so you > > should be able to apply it to v3.14 or any later kernel. 
> > > > Bjorn > > > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > index 2c9ac70254e2..8ef8bc56a584 100644 > > --- a/drivers/pci/pci.c > > +++ b/drivers/pci/pci.c > > @@ -1007,8 +1007,6 @@ int pci_save_state(struct pci_dev *dev) > > return i; > > if ((i = pci_save_pcix_state(dev)) != 0) > > return i; > > - if ((i = pci_save_vc_state(dev)) != 0) > > - return i; > > return 0; > > } > > EXPORT_SYMBOL(pci_save_state); > > @@ -1072,7 +1070,6 @@ void pci_restore_state(struct pci_dev *dev) > > /* PCI Express register must be restored first */ > > pci_restore_pcie_state(dev); > > pci_restore_ats_state(dev); > > - pci_restore_vc_state(dev); > > > > pci_restore_config_space(dev); > > > > @@ -2170,8 +2167,6 @@ void pci_allocate_cap_save_buffers(struct pci_dev *dev) > > if (error) > > dev_err(&dev->dev, > > "unable to preallocate PCI-X save buffer\n"); > > - > > - pci_allocate_vc_save_buffers(dev); > > } > > > > void pci_free_cap_save_buffers(struct pci_dev *dev) > > > > This patch proofed the git bisect result. I applied it to > patch-v3.13-next-20140122 and the machine worked pretty fine :-). > > > Thanks, > Regards, > Andreas ^ permalink raw reply [flat|nested] 42+ messages in thread
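The request to run lspci on the device in a loop could be handled along these lines. This is only a sketch: the function name watch_vc, the LSPCI override variable, and the sample count and delay defaults are all assumptions made for illustration. Because the machine locks up hard, each sample is followed by sync so that the log has a chance of reaching the disk.

```shell
#!/bin/sh
# LSPCI can be overridden, e.g. to point at a specific lspci binary.
LSPCI=${LSPCI:-lspci}

# watch_vc DEV OUT [COUNT] [DELAY]
# Append COUNT verbose lspci snapshots of DEV to OUT, DELAY seconds apart.
watch_vc() {
    dev=$1
    out=$2
    count=${3:-60}
    delay=${4:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '=== sample %s ===\n' "$i" >> "$out"
        $LSPCI -vvv -s "$dev" >> "$out" 2>&1
        sync                       # flush before a possible hard lock-up
        sleep "$delay"
        i=$((i + 1))
    done
}

# Example (as root, started just before the VM): watch_vc 03:00.0 /root/vc.log
```

Comparing the "Virtual Channel" block of consecutive samples in the log shows whether any VC bits changed across the reset.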
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
  2014-10-15  8:04 ` Alex Williamson
@ 2014-10-17  1:04 ` Andreas Hartmann
  2014-10-21 21:06 ` Alex Williamson
  0 siblings, 1 reply; 42+ messages in thread
From: Andreas Hartmann @ 2014-10-17  1:04 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci

Hello Alex,

Alex Williamson wrote:
> Hi Andreas,
[...]
> Sorry for the breakage.  Is it possible to run lspci on the device in a
> loop from the host and capture whether we're failing to restore some of
> the VC bits to their previous state?

> Does the problem also occur if you
> unbind from host driver,

The machine is booted w/ ath9k blacklisted. Then, the device is bound to
vfio:

echo "168c 0030" > /sys/bus/pci/drivers/vfio-pci/new_id
echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
echo 0000:03:00.0 > /sys/bus/pci/drivers/vfio-pci/bind

Afterwards the VM is started -> hang.

W/o starting the VM, I can bind it to vfio and unbind it from vfio w/o
any problem.

> echo 1 > reset in pci-sysfs,

echo 1 > /sys/bus/pci/devices/0000:03:00.0/reset works w/o any problem
while bound to vfio. Even after unbinding from vfio and rebinding to vfio
again ... .

> and re-bind to the

Do you mean loading ath9k in the host system after unbinding from vfio? If
yes: Works w/o any problem. It's even possible to reset it or do an
ifconfig wlan0 up, ifconfig wlan0 down, rmmod ath9k, bind it to vfio
again and reset it, ....

Looks like the hang is only triggered by qemu-system-x86_64 on startup
of the VM.

> host?  I'll also try to reproduce on my 990fx system, but I won't be
> able to do that until next week due to travel.  Thanks,

Regards,
Andreas
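The three sysfs writes in this message can be wrapped in a small function so the same steps are repeated identically on every test boot. This is a sketch rather than a tested tool: the function name bind_to_vfio and the optional sysfs-root argument (added only so the function can be tried against a scratch directory) are assumptions.

```shell
#!/bin/sh
# bind_to_vfio BDF "VENDOR DEVICE" [SYSFS_ROOT]
# Register the ID with vfio-pci, release the device from its current driver
# (if it has one), then bind it to vfio-pci.
bind_to_vfio() {
    bdf=$1
    ids=$2
    root=${3:-/sys}
    drv="$root/bus/pci/drivers/vfio-pci"
    dev="$root/bus/pci/devices/$bdf"

    echo "$ids" > "$drv/new_id"
    # Only unbind when a driver is actually attached.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$bdf" > "$dev/driver/unbind"
    fi
    echo "$bdf" > "$drv/bind"
}

# Example (as root): bind_to_vfio 0000:03:00.0 "168c 0030"
```

On a real system the write to new_id may already bind an unclaimed device, in which case the following unbind and bind simply repeat what the mail does by hand.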
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-17 1:04 ` Andreas Hartmann @ 2014-10-21 21:06 ` Alex Williamson 2014-10-21 21:32 ` Alex Williamson 2014-10-22 15:34 ` Andreas Hartmann 0 siblings, 2 replies; 42+ messages in thread From: Alex Williamson @ 2014-10-21 21:06 UTC (permalink / raw) To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci Hi Andreas, On Fri, 2014-10-17 at 03:04 +0200, Andreas Hartmann wrote: > Hello Alex, > > Alex Williamson wrote: > > Hi Andreas, > [...] > > Sorry for the breakage. Is it possible to run lspci on the device in a > > loop from the host and capture whether we're failing to restore some of > > the VC bits to their previous state? > > > Does the problem also occur if you > > unbind from host driver, > > The machine is booted w/ blacklisted ath9k. Then, the device is bound to > vfio: > > echo "168c 0030" > /sys/bus/pci/drivers/vfio-pci/new_id > echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind > echo 0000:03:00.0 > /sys/bus/pci/drivers/vfio-pci/bind > > afterwards the VM is started -> hang. > > W/o starting th VM, I can bind it to vfio and unbind it from vfio w/o > any problem. > > > echo 1 > reset in pci-sysfs, > > echo 1 > /sys/bus/pci/devices/0000:03:00.0 works w/o any problem while > bound to vfio. Even after unbinding from vfio and rebinding to vfio > again ... . > > > and re-bind to the > > Do you mean loading ath9k in host system after unbinding from vfio? If > yes: Works w/o any problem. It's even possible to reset it or do a > ifconfig wlan0 up, ifconfig wlan0 down, rmmod ath9k, bind it to vfio > again and reset it, .... > > Looks like the hang only is triggered by qemu-system_x86_64 on startup > the VM. > > > host? I'll also try to reproduce on my 990fx system, but I won't be > > able to do that until next week due to travel. Thanks, Could you send me the lspci -vvvxxxx for the device and parent root port? 
Thanks, Alex
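To collect the requested dumps, the parent port can be read from the device's resolved sysfs path, where each directory level is one hop upstream. The sketch below assumes that layout; the function names parent_of and dump_dev_and_parent are invented for illustration.

```shell
#!/bin/sh
# parent_of PATH: given the resolved sysfs path of a PCI device, print the
# name of the directory one level up, i.e. the upstream bridge's address.
# For a device directly on the root bus this prints e.g. "pci0000:00".
parent_of() {
    basename "$(dirname "$1")"
}

# dump_dev_and_parent BDF: full verbose and hex dump of device and parent.
dump_dev_and_parent() {
    path=$(readlink -f "/sys/bus/pci/devices/$1")
    lspci -vvvxxxx -s "$1"
    lspci -vvvxxxx -s "$(parent_of "$path")"
}

# Example (as root, so the extended config space is readable):
#   dump_dev_and_parent 0000:03:00.0 > lspci-dump.txt
```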
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-21 21:06 ` Alex Williamson @ 2014-10-21 21:32 ` Alex Williamson 2014-10-22 16:22 ` Andreas Hartmann 2014-10-22 15:34 ` Andreas Hartmann 1 sibling, 1 reply; 42+ messages in thread From: Alex Williamson @ 2014-10-21 21:32 UTC (permalink / raw) To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci On Tue, 2014-10-21 at 15:06 -0600, Alex Williamson wrote: > Hi Andreas, > > On Fri, 2014-10-17 at 03:04 +0200, Andreas Hartmann wrote: > > Hello Alex, > > > > Alex Williamson wrote: > > > Hi Andreas, > > [...] > > > Sorry for the breakage. Is it possible to run lspci on the device in a > > > loop from the host and capture whether we're failing to restore some of > > > the VC bits to their previous state? > > > > > Does the problem also occur if you > > > unbind from host driver, > > > > The machine is booted w/ blacklisted ath9k. Then, the device is bound to > > vfio: > > > > echo "168c 0030" > /sys/bus/pci/drivers/vfio-pci/new_id > > echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind > > echo 0000:03:00.0 > /sys/bus/pci/drivers/vfio-pci/bind > > > > afterwards the VM is started -> hang. > > > > W/o starting th VM, I can bind it to vfio and unbind it from vfio w/o > > any problem. > > > > > echo 1 > reset in pci-sysfs, > > > > echo 1 > /sys/bus/pci/devices/0000:03:00.0 works w/o any problem while > > bound to vfio. Even after unbinding from vfio and rebinding to vfio > > again ... . > > > > > and re-bind to the > > > > Do you mean loading ath9k in host system after unbinding from vfio? If > > yes: Works w/o any problem. It's even possible to reset it or do a > > ifconfig wlan0 up, ifconfig wlan0 down, rmmod ath9k, bind it to vfio > > again and reset it, .... > > > > Looks like the hang only is triggered by qemu-system_x86_64 on startup > > the VM. 
Also, this might be because QEMU since 1.7 will favor doing a bus reset
for a device over PM reset while the sysfs reset interface will only do
a bus reset if there are no other methods available and there are no
other devices on the bus.  Can you reproduce the hang using the sysfs
reset interface without QEMU if you modify the kernel like this:

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3308,15 +3308,15 @@ static int __pci_dev_reset(struct pci_dev *dev, int prob
 	if (rc != -ENOTTY)
 		goto done;
 
-	rc = pci_pm_reset(dev, probe);
+	rc = pci_dev_reset_slot_function(dev, probe);
 	if (rc != -ENOTTY)
 		goto done;
 
-	rc = pci_dev_reset_slot_function(dev, probe);
+	rc = pci_parent_bus_reset(dev, probe);
 	if (rc != -ENOTTY)
 		goto done;
 
-	rc = pci_parent_bus_reset(dev, probe);
+	rc = pci_pm_reset(dev, probe);
 done:
 	return rc;
 }

> > > host?  I'll also try to reproduce on my 990fx system, but I won't be
> > > able to do that until next week due to travel.  Thanks,
>
> Could you send me the lspci -vvvxxxx for the device and parent root
> port?  Thanks,
>
> Alex
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-21 21:32 ` Alex Williamson @ 2014-10-22 16:22 ` Andreas Hartmann 2014-10-22 20:36 ` Alex Williamson 0 siblings, 1 reply; 42+ messages in thread From: Andreas Hartmann @ 2014-10-22 16:22 UTC (permalink / raw) To: Alex Williamson, Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci Alex Williamson wrote: > On Tue, 2014-10-21 at 15:06 -0600, Alex Williamson wrote: >> Hi Andreas, >> >> On Fri, 2014-10-17 at 03:04 +0200, Andreas Hartmann wrote: >>> Hello Alex, >>> >>> Alex Williamson wrote: >>>> Hi Andreas, >>> [...] >>>> Sorry for the breakage. Is it possible to run lspci on the device in a >>>> loop from the host and capture whether we're failing to restore some of >>>> the VC bits to their previous state? >>> >>>> Does the problem also occur if you >>>> unbind from host driver, >>> >>> The machine is booted w/ blacklisted ath9k. Then, the device is bound to >>> vfio: >>> >>> echo "168c 0030" > /sys/bus/pci/drivers/vfio-pci/new_id >>> echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind >>> echo 0000:03:00.0 > /sys/bus/pci/drivers/vfio-pci/bind >>> >>> afterwards the VM is started -> hang. >>> >>> W/o starting th VM, I can bind it to vfio and unbind it from vfio w/o >>> any problem. >>> >>>> echo 1 > reset in pci-sysfs, >>> >>> echo 1 > /sys/bus/pci/devices/0000:03:00.0 works w/o any problem while >>> bound to vfio. Even after unbinding from vfio and rebinding to vfio >>> again ... . >>> >>>> and re-bind to the >>> >>> Do you mean loading ath9k in host system after unbinding from vfio? If >>> yes: Works w/o any problem. It's even possible to reset it or do a >>> ifconfig wlan0 up, ifconfig wlan0 down, rmmod ath9k, bind it to vfio >>> again and reset it, .... >>> >>> Looks like the hang only is triggered by qemu-system_x86_64 on startup >>> the VM. 
> > Also, this might be because QEMU since 1.7 will favor doing a bus reset > for a device over PM reset while the sysfs reset interface will only do > a bus reset if there are no other methods available and there are no > other devices on the bus. Can you reproduce the hang using the sysfs > reset interface without QEMU if you modify the kernel like this: > > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -3308,15 +3308,15 @@ static int __pci_dev_reset(struct pci_dev *dev, int prob > if (rc != -ENOTTY) > goto done; > > - rc = pci_pm_reset(dev, probe); > + rc = pci_dev_reset_slot_function(dev, probe); > if (rc != -ENOTTY) > goto done; > > - rc = pci_dev_reset_slot_function(dev, probe); > + rc = pci_parent_bus_reset(dev, probe); > if (rc != -ENOTTY) > goto done; > > - rc = pci_parent_bus_reset(dev, probe); > + rc = pci_pm_reset(dev, probe); > done: > return rc; > } This way it's crashing with echo 1 > reset, too. Regards, Andreas ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
  2014-10-22 16:22 ` Andreas Hartmann
@ 2014-10-22 20:36 ` Alex Williamson
  2014-10-23 16:00 ` Andreas Hartmann
  0 siblings, 1 reply; 42+ messages in thread
From: Alex Williamson @ 2014-10-22 20:36 UTC (permalink / raw)
  To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci

On Wed, 2014-10-22 at 18:22 +0200, Andreas Hartmann wrote:
> Alex Williamson wrote:
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -3308,15 +3308,15 @@ static int __pci_dev_reset(struct pci_dev *dev, int prob
> >  	if (rc != -ENOTTY)
> >  		goto done;
> >  
> > -	rc = pci_pm_reset(dev, probe);
> > +	rc = pci_dev_reset_slot_function(dev, probe);
> >  	if (rc != -ENOTTY)
> >  		goto done;
> >  
> > -	rc = pci_dev_reset_slot_function(dev, probe);
> > +	rc = pci_parent_bus_reset(dev, probe);
> >  	if (rc != -ENOTTY)
> >  		goto done;
> >  
> > -	rc = pci_parent_bus_reset(dev, probe);
> > +	rc = pci_pm_reset(dev, probe);
> >  done:
> >  	return rc;
> >  }
>
> This way it's crashing with echo 1 > reset, too.

Ok, so it's somehow related to doing a bus reset with virtual channel
save/restore while PM reset with VC save/restore works ok as apparently
does bus reset without VC save/restore. Let's try to do a manual bus
reset so we can look at the post reset state of the device before the
kernel tries to restore it.

First bind the target device 03:00.0 to pci-stub or vfio-pci so that we
know it's not being used.

Next capture lspci -xxxx -s 3:00.0 so we have the starting state.

Then we'll do a bus reset using setpci:
# setpci -s 00:05.0 3e.w=40:40
<if you script this, wait at least 2ms here>
# setpci -s 00:05.0 3e.w=00:40
<wait 1 second here>

Now re-capture lspci -xxxx -s 3:00.0

The interesting lines for your device are 140: and 150:, so if you want
to avoid sending massive emails you can just send those for the before
and after. You'll need to reboot the system before you do anything else
with this device since it's now in an uninitialized state.

Based on what the lspci output reports (or whether you experience a hang
simply from this), we may want to try writing additional bits with
setpci to mimic the VC restore behavior. Thanks,

Alex
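The manual procedure above could be scripted roughly as follows. This is only a sketch under assumptions: the helper names (vc_lines, manual_bus_reset), the 10 ms delay and the before.txt/after.txt file names are illustrative and do not come from this thread; the setpci and lspci invocations and the two device addresses are the ones given in the message. It needs root, and the endpoint is left uninitialized afterwards, so a reboot is still required.

```shell
#!/bin/sh
# Sketch only: helper names, delay value and file names are assumptions.

DEV=03:00.0      # endpoint under test (address from the thread)
BRIDGE=00:05.0   # root port above it (address from the thread)

# Keep only the config-space dump rows at offsets 0x140 and 0x150,
# the two rows called out as interesting for this device.
vc_lines() {
    grep -E '^(140|150):'
}

# Pulse the Secondary Bus Reset bit (bit 6 of the bridge control register
# at offset 0x3e) on the root port and capture the rows before and after.
manual_bus_reset() {
    lspci -xxxx -s "$DEV" | vc_lines > before.txt
    setpci -s "$BRIDGE" 3e.w=40:40   # assert secondary bus reset
    sleep 0.01                       # at least 2 ms; fractional sleep assumes GNU coreutils
    setpci -s "$BRIDGE" 3e.w=00:40   # release the reset
    sleep 1
    lspci -xxxx -s "$DEV" | vc_lines > after.txt
    diff before.txt after.txt
}
```

Sourcing the file only defines the helpers; nothing touches the hardware until manual_bus_reset is called explicitly. The diff output, or the two small files, would then be what gets sent to the list.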
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-22 20:36 ` Alex Williamson @ 2014-10-23 16:00 ` Andreas Hartmann 2014-10-23 16:33 ` Alex Williamson 0 siblings, 1 reply; 42+ messages in thread From: Andreas Hartmann @ 2014-10-23 16:00 UTC (permalink / raw) To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci [-- Attachment #1: Type: text/plain, Size: 1941 bytes --] Alex Williamson wrote: > On Wed, 2014-10-22 at 18:22 +0200, Andreas Hartmann wrote: >> Alex Williamson wrote: >>> --- a/drivers/pci/pci.c >>> +++ b/drivers/pci/pci.c >>> @@ -3308,15 +3308,15 @@ static int __pci_dev_reset(struct pci_dev *dev, int prob >>> if (rc != -ENOTTY) >>> goto done; >>> >>> - rc = pci_pm_reset(dev, probe); >>> + rc = pci_dev_reset_slot_function(dev, probe); >>> if (rc != -ENOTTY) >>> goto done; >>> >>> - rc = pci_dev_reset_slot_function(dev, probe); >>> + rc = pci_parent_bus_reset(dev, probe); >>> if (rc != -ENOTTY) >>> goto done; >>> >>> - rc = pci_parent_bus_reset(dev, probe); >>> + rc = pci_pm_reset(dev, probe); >>> done: >>> return rc; >>> } >> >> This way it's crashing with echo 1 > reset, too. > > Ok, so it's somehow related to doing a bus reset with virtual channel > save/restore while PM reset with VC save/restore works ok as apparently > does bus reset without VC save/restore. Let's try to do a manual bus > reset so we can look at the post reset state of the device before the > kernel tries to restore it. > > First bind the target device 03:00.0 to pci-stub or vfio-pci so that we > know it's not being used. > > Next capture lspci -xxxx -s 3:00.0 so we have the starting state. 
>
> Then we'll do a bus reset using setpci:
> # setpci -s 00:05.0 3e.w=40:40
> <if you script this, wait at least 2ms here>
> # setpci -s 00:05.0 3e.w=00:40
> <wait 1 second here>
>
> Now re-capture lspci -xxxx -s 3:00.0

The machine is booted w/ vfio bound to 3:00.0 as usual (now for testing
linux 3.14)

lspci -xxxx -s 3:00.0
setpci -s 00:05.0 3e.w=40:40
usleep 10
setpci -s 00:05.0 3e.w=00:40
sleep 1
lspci -xxxx -s 3:00.0

I didn't get the second lspci because the machine already was hanging.
The first output is attached completely.


Hope this helps,
thanks,
regards,
Andreas

[-- Attachment #2: atheros-pci1.gz --]
[-- Type: application/x-gzip, Size: 963 bytes --]
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-23 16:00 ` Andreas Hartmann @ 2014-10-23 16:33 ` Alex Williamson 2014-10-23 17:12 ` Andreas Hartmann 2014-10-23 17:33 ` Andreas Hartmann 0 siblings, 2 replies; 42+ messages in thread From: Alex Williamson @ 2014-10-23 16:33 UTC (permalink / raw) To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci On Thu, 2014-10-23 at 18:00 +0200, Andreas Hartmann wrote: > Alex Williamson wrote: > > On Wed, 2014-10-22 at 18:22 +0200, Andreas Hartmann wrote: > >> Alex Williamson wrote: > >>> --- a/drivers/pci/pci.c > >>> +++ b/drivers/pci/pci.c > >>> @@ -3308,15 +3308,15 @@ static int __pci_dev_reset(struct pci_dev *dev, int prob > >>> if (rc != -ENOTTY) > >>> goto done; > >>> > >>> - rc = pci_pm_reset(dev, probe); > >>> + rc = pci_dev_reset_slot_function(dev, probe); > >>> if (rc != -ENOTTY) > >>> goto done; > >>> > >>> - rc = pci_dev_reset_slot_function(dev, probe); > >>> + rc = pci_parent_bus_reset(dev, probe); > >>> if (rc != -ENOTTY) > >>> goto done; > >>> > >>> - rc = pci_parent_bus_reset(dev, probe); > >>> + rc = pci_pm_reset(dev, probe); > >>> done: > >>> return rc; > >>> } > >> > >> This way it's crashing with echo 1 > reset, too. > > > > Ok, so it's somehow related to doing a bus reset with virtual channel > > save/restore while PM reset with VC save/restore works ok as apparently > > does bus reset without VC save/restore. Let's try to do a manual bus > > reset so we can look at the post reset state of the device before the > > kernel tries to restore it. > > > > First bind the target device 03:00.0 to pci-stub or vfio-pci so that we > > know it's not being used. > > > > Next capture lspci -xxxx -s 3:00.0 so we have the starting state. 
> >
> > Then we'll do a bus reset using setpci:
> > # setpci -s 00:05.0 3e.w=40:40
> > <if you script this, wait at least 2ms here>
> > # setpci -s 00:05.0 3e.w=00:40
> > <wait 1 second here>
> >
> > Now re-capture lspci -xxxx -s 3:00.0
>
> The machine is booted w/ vfio bound to 3:00.0 as usual (now for testing
> linux 3.14)
>
> lspci -xxxx -s 3:00.0
> setpci -s 00:05.0 3e.w=40:40
> usleep 10
> setpci -s 00:05.0 3e.w=00:40
> sleep 1
> lspci -xxxx -s 3:00.0
>
> I didn't get the second lspci because the machine already was hanging.
> The first output is attached completely.

Hmm, that doesn't make much sense. You had found that if you disabled
the VC save/restore then QEMU works. That should have still been using
secondary bus reset as we're trying to do here, so I don't understand
why we can't do a manual secondary bus reset now.

If you use Bjorn's previous patch to disable VC save/restore and my
patch to reorder the reset mechanisms, does echo 1 > reset for the sysfs
entry for the device also still cause a hang?

Can you provide a link to the specific model for this card? Thanks,

Alex
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-23 16:33 ` Alex Williamson @ 2014-10-23 17:12 ` Andreas Hartmann 2014-10-23 17:33 ` Andreas Hartmann 1 sibling, 0 replies; 42+ messages in thread From: Andreas Hartmann @ 2014-10-23 17:12 UTC (permalink / raw) To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci Alex Williamson wrote: > On Thu, 2014-10-23 at 18:00 +0200, Andreas Hartmann wrote: >> Alex Williamson wrote: >>> On Wed, 2014-10-22 at 18:22 +0200, Andreas Hartmann wrote: >>>> Alex Williamson wrote: >>>>> --- a/drivers/pci/pci.c >>>>> +++ b/drivers/pci/pci.c >>>>> @@ -3308,15 +3308,15 @@ static int __pci_dev_reset(struct pci_dev *dev, int prob >>>>> if (rc != -ENOTTY) >>>>> goto done; >>>>> >>>>> - rc = pci_pm_reset(dev, probe); >>>>> + rc = pci_dev_reset_slot_function(dev, probe); >>>>> if (rc != -ENOTTY) >>>>> goto done; >>>>> >>>>> - rc = pci_dev_reset_slot_function(dev, probe); >>>>> + rc = pci_parent_bus_reset(dev, probe); >>>>> if (rc != -ENOTTY) >>>>> goto done; >>>>> >>>>> - rc = pci_parent_bus_reset(dev, probe); >>>>> + rc = pci_pm_reset(dev, probe); >>>>> done: >>>>> return rc; >>>>> } >>>> >>>> This way it's crashing with echo 1 > reset, too. >>> >>> Ok, so it's somehow related to doing a bus reset with virtual channel >>> save/restore while PM reset with VC save/restore works ok as apparently >>> does bus reset without VC save/restore. Let's try to do a manual bus >>> reset so we can look at the post reset state of the device before the >>> kernel tries to restore it. >>> >>> First bind the target device 03:00.0 to pci-stub or vfio-pci so that we >>> know it's not being used. >>> >>> Next capture lspci -xxxx -s 3:00.0 so we have the starting state. 
>>> >>> Then we'll do a bus reset using setpci: >>> # setpci -s 00:05.0 3e.w=40:40 >>> <if you script this, wait at least 2ms here> >>> # setpci -s 00:05.0 3e.w=00:40 >>> <wait 1 second here> >>> >>> Now re-capture lspci -xxxx -s 3:00.0 >> >> The machine is booted w/ vfio bound to 3:00.0 as usual (now for testing >> linux 3.14) >> >> lspci -xxxx -s 3:00.0 >> setpci -s 00:05.0 3e.w=40:40 >> usleep 10 >> setpci -s 00:05.0 3e.w=00:40 >> sleep 1 >> lspci -xxxx -s 3:00.0 >> >> I didn't get the second lspci because the machine already was hanging. >> The first output is attached completely. > > Hmm, that doesn't make much sense. You had found that if you disabled > the VC save/restore then QEMU works. That should have still been using > secondary bus reset as we're trying to do here, so I don't understand > why we can't do a manual secondary bus reset now. > > If you use Bjorn's previous patch to disable VC save/restore and my > patch to reorder the reset mechanisms, does echo 1 > reset for the sysfs > entry for the device also still cause a hang? I will test it. > Can you provide a link to the specific model for this card? Thanks, http://www.tp-link.com.de/support/download/?model=TL-WDN4800&version=V1 Regards, Andreas ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-23 16:33 ` Alex Williamson 2014-10-23 17:12 ` Andreas Hartmann @ 2014-10-23 17:33 ` Andreas Hartmann 2014-10-23 19:37 ` Alex Williamson 1 sibling, 1 reply; 42+ messages in thread From: Andreas Hartmann @ 2014-10-23 17:33 UTC (permalink / raw) To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci Alex Williamson wrote: [...] > If you use Bjorn's previous patch to disable VC save/restore and my > patch to reorder the reset mechanisms, does echo 1 > reset for the sysfs > entry for the device also still cause a hang? Yes - it's hanging too (w/ vfio bound to the device - didn't test other possibilities). Regards, Andreas ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-23 17:33 ` Andreas Hartmann @ 2014-10-23 19:37 ` Alex Williamson 2014-10-24 14:21 ` Andreas Hartmann 2014-10-25 6:03 ` Andreas Hartmann 0 siblings, 2 replies; 42+ messages in thread From: Alex Williamson @ 2014-10-23 19:37 UTC (permalink / raw) To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci On Thu, 2014-10-23 at 19:33 +0200, Andreas Hartmann wrote: > Alex Williamson wrote: > [...] > > If you use Bjorn's previous patch to disable VC save/restore and my > > patch to reorder the reset mechanisms, does echo 1 > reset for the sysfs > > entry for the device also still cause a hang? > > Yes - it's hanging too (w/ vfio bound to the device - didn't test other > possibilities). Does it happen regardless of the slot the card is plugged into? Thanks, Alex ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
  2014-10-23 19:37 ` Alex Williamson
@ 2014-10-24 14:21 ` Andreas Hartmann
  2014-10-25 6:03 ` Andreas Hartmann
  1 sibling, 0 replies; 42+ messages in thread
From: Andreas Hartmann @ 2014-10-24 14:21 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci

Alex Williamson wrote:
> On Thu, 2014-10-23 at 19:33 +0200, Andreas Hartmann wrote:
>> Alex Williamson wrote:
>> [...]
>>> If you use Bjorn's previous patch to disable VC save/restore and my
>>> patch to reorder the reset mechanisms, does echo 1 > reset for the sysfs
>>> entry for the device also still cause a hang?
>>
>> Yes - it's hanging too (w/ vfio bound to the device - didn't test other
>> possibilities).
>
> Does it happen regardless of the slot the card is plugged into? Thanks,

Can't say - there is only one usable small PCIe slot. The other slot is
blocked by the graphics card - and the third slot, which should be there
according to the documentation, doesn't exist in reality :-(.


Regards,
Andreas
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
  2014-10-23 19:37 ` Alex Williamson
  2014-10-24 14:21 ` Andreas Hartmann
@ 2014-10-25 6:03 ` Andreas Hartmann
  2014-10-28 21:51 ` Alex Williamson
  1 sibling, 1 reply; 42+ messages in thread
From: Andreas Hartmann @ 2014-10-25 6:03 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci

Alex Williamson wrote:
> On Thu, 2014-10-23 at 19:33 +0200, Andreas Hartmann wrote:
>> Alex Williamson wrote:
>> [...]
>>> If you use Bjorn's previous patch to disable VC save/restore and my
>>> patch to reorder the reset mechanisms, does echo 1 > reset for the sysfs
>>> entry for the device also still cause a hang?
>>
>> Yes - it's hanging too (w/ vfio bound to the device - didn't test other
>> possibilities).
>
> Does it happen regardless of the slot the card is plugged into? Thanks,

As I already wrote, it's not possible to plug the device into another
slot. But besides that, let me stress some "findings" I made over the
past few weeks since I've known about this problem. Maybe they give you
an idea about what's going on:

- I did all of the tests in text mode on the console. Normally, there is
  a blinking cursor. When doing the echo 1 > reset, the shell doesn't
  come back and the blinking of the cursor immediately gets slower,
  meaning it takes longer for it to switch on / off again. This way, it
  "blinks" at most another 2 times until it's finally dead. It looks as
  if the machine suddenly had extremely high load (there are 8 cores!),
  but this doesn't seem to be true, because the CPU fan stays silent:
  the rpm doesn't change at all.

- Most of the time when I run tests which fail, I have problems with USB
  after the hang (it's the Etron device). Problem means: the initrd
  isn't able to communicate with the device (but BIOS and grub2 didn't
  have any problem, because the keyboard, which is connected via USB 3,
  worked fine). At this point, it is necessary to disconnect the mains
  completely and wait half a minute until the problem disappears.
  Rarely, I had this problem even at the BIOS stage: the keyboard
  couldn't be seen even by the BIOS any more.

- Sometimes (really seldom, it has now happened about 3 times), it gets
  extremely hard to return to normal operation after that hang. This
  means: For a few weeks now, I've been running kernel
  3.12.28-3-desktop out of the box (= as provided by openSUSE).
  Sometimes now, I got (apparently) the same problems (= PCIe
  passthrough hangs the complete machine) w/ 3.12.28 as I'm having with
  stock >= 3.14 after testing. It's even useless then to reconnect the
  mains (I experienced this 2 times in a row after one hang yesterday).
  At this point, I have to run kernel 3.10.x (which runs pretty fine as
  usual) and only after that does 3.12 work again as expected (as
  happened once yesterday during tests w/ the USB 3 devices disabled via
  BIOS).

- I think there is a relationship between how long the hang is active
  and the subsequent problems coming up. If the hang is immediately (max
  about 1s) reset w/ the reset button, it is possible that there is no
  USB problem after reboot and the machine works completely fine with
  3.12.x again.


Conclusion (from my point of view):
The broken reset seems to do something really _extremely ugly_ w/ the
hardware, which has the potential to break the hardware "lastingly", or
the subsequent software isn't able at all to correctly reconfigure the
system again, even after reconnecting the mains. Fortunately I have an
old kernel version (3.10.x), which seems to be able to "repair" the
hardware again. But I have to emphasize that the situation is really
highly questionable and I'm meanwhile afraid of finally breaking my
board, which is otherwise working really _extremely_ stable.


Out of interest:
Bjorn's patch disables vc save/restore support - and the machine works
fine again. Why is it needed at all if it seems to work perfectly w/o
it? What's the additional benefit? Or in other words: What am I missing
until today :-) ? What would be better? What could I do more?


Thanks,
kind regards,
Andreas
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
  2014-10-25 6:03 ` Andreas Hartmann
@ 2014-10-28 21:51 ` Alex Williamson
  2014-10-29 16:47 ` Andreas Hartmann
  0 siblings, 1 reply; 42+ messages in thread
From: Alex Williamson @ 2014-10-28 21:51 UTC (permalink / raw)
  To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci

On Sat, 2014-10-25 at 08:03 +0200, Andreas Hartmann wrote:
>
> Out of interest:
> Bjorn's patch disables vc save/restore support - and the machine works
> fine again. Why is it needed at all if it seems to work perfectly w/o
> it? What's the additional benefit? Or in other words: What am I missing
> until today :-) ? What would be better? What could I do more?


You're right, in the configuration you have the endpoint device has a
Virtual Channel capability but the upstream root port does not. The
spec is not at all clear about defining the endpoints for enabling
Virtual Channel in each type of configuration, but I think that if we
have an upstream port that does not support Virtual Channel, we can skip
the save/restore. Please test the patch below.

I'm also still completely confused about whether this is a VC
save/restore issue or a bus reset issue. You originally bisected this
back to the VC save/restore patch, but you also found that a manual,
setpci-based bus reset triggered a system hang. I believe that
re-ordering the kernel reset mechanisms also triggered this. Since
recent versions of QEMU are going to favor a bus reset over PM reset, I
don't have a lot of confidence that we're actually solving the problem
for you. Please make sure to test with a recent QEMU to be sure we'll
do a bus reset. Thanks,

Alex

diff --git a/drivers/pci/vc.c b/drivers/pci/vc.c
index 7e1304d..6d13d34 100644
--- a/drivers/pci/vc.c
+++ b/drivers/pci/vc.c
@@ -339,6 +339,25 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos,
 	return buf ? 0 : len;
 }
 
+/**
+ * pci_vc_needs_save - Determine whether a VC capability needs to be saved
+ * @dev: device
+ * @id: VC capability ID (VC/VC9/MFVC)
+ *
+ * In configurations where we have a VC or MFVC capability, but the upstream
+ * device does not, we assume that VC save (and therefore restore) is not
+ * necessary. The intention is to only do VC save/restore in configuration
+ * where it's necessary and hopefully avoid reset issues.
+ */
+static bool pci_vc_needs_save(struct pci_dev *dev, u16 id)
+{
+	if (id == PCI_EXT_CAP_ID_VC9 || pci_is_root_bus(dev->bus) ||
+	    pci_find_ext_capability(dev->bus->self, PCI_EXT_CAP_ID_VC))
+		return true;
+
+	return false;
+}
+
 static struct {
 	u16 id;
 	const char *name;
@@ -362,7 +381,7 @@ int pci_save_vc_state(struct pci_dev *dev)
 		struct pci_cap_saved_state *save_state;
 
 		pos = pci_find_ext_capability(dev, vc_caps[i].id);
-		if (!pos)
+		if (!posi || !pci_vc_needs_save(dev, vc_caps[i].id))
 			continue;
 
 		save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id);
@@ -422,7 +441,7 @@ void pci_allocate_vc_save_buffers(struct pci_dev *dev)
 	for (i = 0; i < ARRAY_SIZE(vc_caps); i++) {
 		int len, pos = pci_find_ext_capability(dev, vc_caps[i].id);
 
-		if (!pos)
+		if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id))
 			continue;
 
 		len = pci_vc_do_save_buffer(dev, pos, NULL, false);
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-28 21:51 ` Alex Williamson @ 2014-10-29 16:47 ` Andreas Hartmann 2014-10-29 17:44 ` Alex Williamson 0 siblings, 1 reply; 42+ messages in thread From: Andreas Hartmann @ 2014-10-29 16:47 UTC (permalink / raw) To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci Alex Williamson wrote: > On Sat, 2014-10-25 at 08:03 +0200, Andreas Hartmann wrote: >> >> Out of interest: >> Bjorn's patch disables vc save/restore support - and the machine works >> fine again. Why is it needed at all if it seems to work perfectly w/o >> it? What's the additional benefit? Or in other words: What am I missing >> until today :-) ? What would be better? What could I do more? > > > You're right, in the configuration you have the endpoint device has a > Virtual Channel capability but the upstream root port does not. The > spec is not at all clear about defining the endpoints for enabling > Virtual Channel in each type of configuration, but I think that if we > have an upstream port that does not support Virtual Channel, we can skip > the save/restore. Please test the patch below. > > I'm also still completely confused about whether this is a VC > save/restore issue or a bus reset issue. You originally bisected this > back to the VC save/restore patch, but you also found that a manual, > setpci-based bus reset triggered a system hang. With your additional patch posted here: http://article.gmane.org/gmane.linux.kernel.pci/36162 > I believe that > re-ordering the kernel reset mechanisms also triggered this. Since > recent versions of QEMU are going to favor a bus reset over PM reset, I > don't have a lot of confidence that we're actually solving the problem > for you. Please make sure to test with a recent QEMU to be sure we'll > do a bus reset. I'm running qemu 2.1.0 (newest is 2.1.2 - but this shouldn't be a problem) and tested w/ linux 3.17. 
> diff --git a/drivers/pci/vc.c b/drivers/pci/vc.c
> index 7e1304d..6d13d34 100644
> --- a/drivers/pci/vc.c
> +++ b/drivers/pci/vc.c
> @@ -339,6 +339,25 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos,
>  	return buf ? 0 : len;
>  }
>  
> +/**
> + * pci_vc_needs_save - Determine whether a VC capability needs to be saved
> + * @dev: device
> + * @id: VC capability ID (VC/VC9/MFVC)
> + *
> + * In configurations where we have a VC or MFVC capability, but the upstream
> + * device does not, we assume that VC save (and therefore restore) is not
> + * necessary. The intention is to only do VC save/restore in configuration
> + * where it's necessary and hopefully avoid reset issues.
> + */
> +static bool pci_vc_needs_save(struct pci_dev *dev, u16 id)
> +{
> +	if (id == PCI_EXT_CAP_ID_VC9 || pci_is_root_bus(dev->bus) ||
> +	    pci_find_ext_capability(dev->bus->self, PCI_EXT_CAP_ID_VC))
> +		return true;
> +
> +	return false;
> +}
> +
>  static struct {
>  	u16 id;
>  	const char *name;
> @@ -362,7 +381,7 @@ int pci_save_vc_state(struct pci_dev *dev)
>  		struct pci_cap_saved_state *save_state;
>  
>  		pos = pci_find_ext_capability(dev, vc_caps[i].id);
> -		if (!pos)
> +		if (!posi || !pci_vc_needs_save(dev, vc_caps[i].id))
                        ^
This should most probably be !pos (and not !posi - because !posi throws
a compile error).

>  			continue;
>  
>  		save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id);
> @@ -422,7 +441,7 @@ void pci_allocate_vc_save_buffers(struct pci_dev *dev)
>  	for (i = 0; i < ARRAY_SIZE(vc_caps); i++) {
>  		int len, pos = pci_find_ext_capability(dev, vc_caps[i].id);
>  
> -		if (!pos)
> +		if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id))
>  			continue;
>  
>  		len = pci_vc_do_save_buffer(dev, pos, NULL, false);

W/ the above patch, the machine hangs again (w/ qemu and setpci), but w/
Bjorn's patch (and nothing more applied) which disables vc save/restore,
the machine just works fine ... . I especially retested this case to be
really sure. I'm so sorry. But that's how it behaves here :-(


Thanks,
regards,
Andreas
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-29 16:47 ` Andreas Hartmann @ 2014-10-29 17:44 ` Alex Williamson 2014-10-29 17:57 ` Andreas Hartmann 0 siblings, 1 reply; 42+ messages in thread From: Alex Williamson @ 2014-10-29 17:44 UTC (permalink / raw) To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci On Wed, 2014-10-29 at 17:47 +0100, Andreas Hartmann wrote: > Alex Williamson wrote: > > On Sat, 2014-10-25 at 08:03 +0200, Andreas Hartmann wrote: > >> > >> Out of interest: > >> Bjorn's patch disables vc save/restore support - and the machine works > >> fine again. Why is it needed at all if it seems to work perfectly w/o > >> it? What's the additional benefit? Or in other words: What am I missing > >> until today :-) ? What would be better? What could I do more? > > > > > > You're right, in the configuration you have the endpoint device has a > > Virtual Channel capability but the upstream root port does not. The > > spec is not at all clear about defining the endpoints for enabling > > Virtual Channel in each type of configuration, but I think that if we > > have an upstream port that does not support Virtual Channel, we can skip > > the save/restore. Please test the patch below. > > > > I'm also still completely confused about whether this is a VC > > save/restore issue or a bus reset issue. You originally bisected this > > back to the VC save/restore patch, but you also found that a manual, > > setpci-based bus reset triggered a system hang. > > With your additional patch posted here: > http://article.gmane.org/gmane.linux.kernel.pci/36162 Right, a reset via sysfs also triggered it with that patch, but the reset via setpci is independent of any VC save/restore and still hung your box. > > > I believe that > > re-ordering the kernel reset mechanisms also triggered this. 
Since > > recent versions of QEMU are going to favor a bus reset over PM reset, I > > don't have a lot of confidence that we're actually solving the problem > > for you. Please make sure to test with a recent QEMU to be sure we'll > > do a bus reset. > > I'm running qemu 2.1.0 (newest is 2.1.2 - but this shouldn't be a > problem) and tested w/ linux 3.17. Yep, just want to make sure it's QEMU new enough to do a bus reset and kernel with matching support. > > diff --git a/drivers/pci/vc.c b/drivers/pci/vc.c > > index 7e1304d..6d13d34 100644 > > --- a/drivers/pci/vc.c > > +++ b/drivers/pci/vc.c > > @@ -339,6 +339,25 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos, > > return buf ? 0 : len; > > } > > > > +/** > > + * pci_vc_needs_save - Determine whether a VC capability needs to be saved > > + * @dev: device > > + * @id: VC capability ID (VC/VC9/MFVC) > > + * > > + * In configurations where we have a VC or MFVC capability, but the upstream > > + * device does not, we assume that VC save (and therefore restore) is not > > + * necessary. The intention is to only do VC save/restore in configuration > > + * where it's necessary and hopefully avoid reset issues. > > + */ > > +static bool pci_vc_needs_save(struct pci_dev *dev, u16 id) > > +{ > > + if (id == PCI_EXT_CAP_ID_VC9 || pci_is_root_bus(dev->bus) || > > + pci_find_ext_capability(dev->bus->self, PCI_EXT_CAP_ID_VC)) > > + return true; > > + > > + return false; > > +} > > + > > static struct { > > u16 id; > > const char *name; > > @@ -362,7 +381,7 @@ int pci_save_vc_state(struct pci_dev *dev) > > struct pci_cap_saved_state *save_state; > > > > pos = pci_find_ext_capability(dev, vc_caps[i].id); > > - if (!pos) > > + if (!posi || !pci_vc_needs_save(dev, vc_caps[i].id)) > ^ > This should be most probably !pos (and not !posi - because !posi does > through a compile error). Oops, sorry. 
> >  			continue;
> >  
> >  		save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id);
> > @@ -422,7 +441,7 @@ void pci_allocate_vc_save_buffers(struct pci_dev *dev)
> >  	for (i = 0; i < ARRAY_SIZE(vc_caps); i++) {
> >  		int len, pos = pci_find_ext_capability(dev, vc_caps[i].id);
> >  
> > -		if (!pos)
> > +		if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id))
> >  			continue;
> >  
> >  		len = pci_vc_do_save_buffer(dev, pos, NULL, false);
>
> W/ the above patch, the machine hangs again (w/ qemu and setpci), but w/
> Bjorn's patch (and nothing more applied) which disables vc save/restore,
> the machine just works fine ... . I especially retested this case to be
> really sure. I'm so sorry. But that's how it behaves here :-(

Hmm, the intention was that this should effectively do the same thing as
Bjorn's patch. The Atheros device (03:00.0) reports a VC capability but
the root port above it (00:05.0) does not. The test in
pci_vc_needs_save() should therefore be false for all tests within the
if() block and the function should return false, causing us to neither
allocate a save buffer nor perform a save. The restore will
automatically be skipped since there's no save buffer.

This is what I was afraid of, on one hand you've bisected and proved via
patching that the problem is exclusively due to VC save/restore, but we
also have testing that indicates that we can't do a bus reset at all on
this port. So I re-iterate my confusion that we don't seem to have a
good idea which is the problem.

Do you have any other devices that you can install in the slot for
testing? Maybe if we knew that bus reset is only a problem with the
Atheros card then we could blacklist it. Thanks,

Alex
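The condition described above (the endpoint advertises a Virtual Channel capability, the root port above it does not) can also be checked from userspace. A rough sketch, assuming lspci -vv labels the capability with the text "Virtual Channel" as in the dump at the start of this thread. The helper names has_vc_cap and vc_save_expected are hypothetical, and the check deliberately ignores the VC9 and root-bus cases that pci_vc_needs_save() also handles:

```shell
#!/bin/sh
# Sketch only: has_vc_cap and vc_save_expected are hypothetical helper names.

# Succeeds if the lspci -vv text on stdin lists a Virtual Channel extended
# capability, e.g. "Capabilities: [140 v1] Virtual Channel".
has_vc_cap() {
    grep -q 'Capabilities: \[[0-9a-f]* v[0-9]*\] Virtual Channel'
}

# Simplified userspace approximation of the intent of the patch: prints
# "save" only if both the endpoint ($1) and its upstream port ($2) list
# the capability, otherwise "skip". Reading extended capabilities
# generally requires root.
vc_save_expected() {
    if lspci -vv -s "$1" | has_vc_cap && lspci -vv -s "$2" | has_vc_cap; then
        echo save
    else
        echo skip
    fi
}
```

For the addresses in this thread that would be vc_save_expected 03:00.0 00:05.0, which should print "skip" if the root port really lacks the capability. It only reports what lspci sees and says nothing about what the patched kernel actually does at reset time.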
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
  2014-10-29 17:44 ` Alex Williamson
@ 2014-10-29 17:57 ` Andreas Hartmann
  2014-10-29 18:16 ` Alex Williamson
  0 siblings, 1 reply; 42+ messages in thread
From: Andreas Hartmann @ 2014-10-29 17:57 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci

Alex Williamson wrote:
> On Wed, 2014-10-29 at 17:47 +0100, Andreas Hartmann wrote:
>> Alex Williamson wrote:
>>> On Sat, 2014-10-25 at 08:03 +0200, Andreas Hartmann wrote:
>>>>
>>>> Out of interest:
>>>> Bjorn's patch disables vc save/restore support - and the machine works
>>>> fine again. Why is it needed at all if it seems to work perfectly w/o
>>>> it? What's the additional benefit? Or in other words: What am I missing
>>>> until today :-) ? What would be better? What could I do more?
>>>
>>>
>>> You're right, in the configuration you have the endpoint device has a
>>> Virtual Channel capability but the upstream root port does not. The
>>> spec is not at all clear about defining the endpoints for enabling
>>> Virtual Channel in each type of configuration, but I think that if we
>>> have an upstream port that does not support Virtual Channel, we can skip
>>> the save/restore. Please test the patch below.
>>>
>>> I'm also still completely confused about whether this is a VC
>>> save/restore issue or a bus reset issue. You originally bisected this
>>> back to the VC save/restore patch, but you also found that a manual,
>>> setpci-based bus reset triggered a system hang.
>>
>> With your additional patch posted here:
>> http://article.gmane.org/gmane.linux.kernel.pci/36162
>
> Right, a reset via sysfs also triggered it with that patch, but the
> reset via setpci is independent of any VC save/restore and still hung
> your box.
>
>>
>>> I believe that
>>> re-ordering the kernel reset mechanisms also triggered this. Since
>>> recent versions of QEMU are going to favor a bus reset over PM reset, I
>>> don't have a lot of confidence that we're actually solving the problem
>>> for you. Please make sure to test with a recent QEMU to be sure we'll
>>> do a bus reset.
>>
>> I'm running qemu 2.1.0 (newest is 2.1.2 - but this shouldn't be a
>> problem) and tested w/ linux 3.17.
>
> Yep, just want to make sure it's QEMU new enough to do a bus reset and
> kernel with matching support.
>
>>> diff --git a/drivers/pci/vc.c b/drivers/pci/vc.c
>>> index 7e1304d..6d13d34 100644
>>> --- a/drivers/pci/vc.c
>>> +++ b/drivers/pci/vc.c
>>> @@ -339,6 +339,25 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos,
>>>  	return buf ? 0 : len;
>>>  }
>>>  
>>> +/**
>>> + * pci_vc_needs_save - Determine whether a VC capability needs to be saved
>>> + * @dev: device
>>> + * @id: VC capability ID (VC/VC9/MFVC)
>>> + *
>>> + * In configurations where we have a VC or MFVC capability, but the upstream
>>> + * device does not, we assume that VC save (and therefore restore) is not
>>> + * necessary. The intention is to only do VC save/restore in configuration
>>> + * where it's necessary and hopefully avoid reset issues.
>>> + */
>>> +static bool pci_vc_needs_save(struct pci_dev *dev, u16 id)
>>> +{
>>> +	if (id == PCI_EXT_CAP_ID_VC9 || pci_is_root_bus(dev->bus) ||
>>> +	    pci_find_ext_capability(dev->bus->self, PCI_EXT_CAP_ID_VC))
>>> +		return true;
>>> +
>>> +	return false;
>>> +}
>>> +
>>>  static struct {
>>>  	u16 id;
>>>  	const char *name;
>>> @@ -362,7 +381,7 @@ int pci_save_vc_state(struct pci_dev *dev)
>>>  		struct pci_cap_saved_state *save_state;
>>>  
>>>  		pos = pci_find_ext_capability(dev, vc_caps[i].id);
>>> -		if (!pos)
>>> +		if (!posi || !pci_vc_needs_save(dev, vc_caps[i].id))
>>                      ^
>> This should most probably be !pos (and not !posi - because !posi throws
>> a compile error).
>
> Oops, sorry.
>
>>>  			continue;
>>>  
>>>  		save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id);
>>> @@ -422,7 +441,7 @@ void pci_allocate_vc_save_buffers(struct pci_dev *dev)
>>>  	for (i = 0; i < ARRAY_SIZE(vc_caps); i++) {
>>>  		int len, pos = pci_find_ext_capability(dev, vc_caps[i].id);
>>>  
>>> -		if (!pos)
>>> +		if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id))
>>>  			continue;
>>>  
>>>  		len = pci_vc_do_save_buffer(dev, pos, NULL, false);
>>
>> W/ the above patch, the machine hangs again (w/ qemu and setpci), but w/
>> Bjorn's patch (and nothing more applied) which disables vc save/restore,
>> the machine just works fine ... . I especially retested this case to be
>> really sure. I'm so sorry. But that's how it behaves here :-(
>
> Hmm, the intention was that this should effectively do the same thing as
> Bjorn's patch. The Atheros device (03:00.0) reports a VC capability but
> the root port above it (00:05.0) does not.

Are you sure that this patch really works (-> here!) as expected? Would
it be possible to add some debug output printing to the actual console
(not to a log file) to be sure it really works as expected? Maybe some
more output to get an idea of what's actually going on?

Or is it just a timing issue?


Thanks,
Andreas
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-29 17:57 ` Andreas Hartmann @ 2014-10-29 18:16 ` Alex Williamson 2014-10-29 19:43 ` Andreas Hartmann 0 siblings, 1 reply; 42+ messages in thread From: Alex Williamson @ 2014-10-29 18:16 UTC (permalink / raw) To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci On Wed, 2014-10-29 at 18:57 +0100, Andreas Hartmann wrote: > Alex Williamson schrieb: > > On Wed, 2014-10-29 at 17:47 +0100, Andreas Hartmann wrote: > >> Alex Williamson wrote: > >>> On Sat, 2014-10-25 at 08:03 +0200, Andreas Hartmann wrote: > >>>> > >>>> Out of interest: > >>>> Bjorn's patch disables vc save/restore support - and the machine works > >>>> fine again. Why is it needed at all if it seems to work perfectly w/o > >>>> it? What's the additional benefit? Or in other words: What am I missing > >>>> until today :-) ? What would be better? What could I do more? > >>> > >>> > >>> You're right, in the configuration you have the endpoint device has a > >>> Virtual Channel capability but the upstream root port does not. The > >>> spec is not at all clear about defining the endpoints for enabling > >>> Virtual Channel in each type of configuration, but I think that if we > >>> have an upstream port that does not support Virtual Channel, we can skip > >>> the save/restore. Please test the patch below. > >>> > >>> I'm also still completely confused about whether this is a VC > >>> save/restore issue or a bus reset issue. You originally bisected this > >>> back to the VC save/restore patch, but you also found that a manual, > >>> setpci-based bus reset triggered a system hang. > >> > >> With your additional patch posted here: > >> http://article.gmane.org/gmane.linux.kernel.pci/36162 > > > > Right, a reset via sysfs also triggered it with that patch, but the > > reset via setpci is independent of any VC save/restore and still hung > > your box. 
> > > >> > >>> I believe that > >>> re-ordering the kernel reset mechanisms also triggered this. Since > >>> recent versions of QEMU are going to favor a bus reset over PM reset, I > >>> don't have a lot of confidence that we're actually solving the problem > >>> for you. Please make sure to test with a recent QEMU to be sure we'll > >>> do a bus reset. > >> > >> I'm running qemu 2.1.0 (newest is 2.1.2 - but this shouldn't be a > >> problem) and tested w/ linux 3.17. > > > > Yep, just want to make sure it's QEMU new enough to do a bus reset and > > kernel with matching support. > > > >>> diff --git a/drivers/pci/vc.c b/drivers/pci/vc.c > >>> index 7e1304d..6d13d34 100644 > >>> --- a/drivers/pci/vc.c > >>> +++ b/drivers/pci/vc.c > >>> @@ -339,6 +339,25 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos, > >>> return buf ? 0 : len; > >>> } > >>> > >>> +/** > >>> + * pci_vc_needs_save - Determine whether a VC capability needs to be saved > >>> + * @dev: device > >>> + * @id: VC capability ID (VC/VC9/MFVC) > >>> + * > >>> + * In configurations where we have a VC or MFVC capability, but the upstream > >>> + * device does not, we assume that VC save (and therefore restore) is not > >>> + * necessary. The intention is to only do VC save/restore in configuration > >>> + * where it's necessary and hopefully avoid reset issues. 
> >>> + */ > >>> +static bool pci_vc_needs_save(struct pci_dev *dev, u16 id) > >>> +{ > >>> + if (id == PCI_EXT_CAP_ID_VC9 || pci_is_root_bus(dev->bus) || > >>> + pci_find_ext_capability(dev->bus->self, PCI_EXT_CAP_ID_VC)) > >>> + return true; > >>> + > >>> + return false; > >>> +} > >>> + > >>> static struct { > >>> u16 id; > >>> const char *name; > >>> @@ -362,7 +381,7 @@ int pci_save_vc_state(struct pci_dev *dev) > >>> struct pci_cap_saved_state *save_state; > >>> > >>> pos = pci_find_ext_capability(dev, vc_caps[i].id); > >>> - if (!pos) > >>> + if (!posi || !pci_vc_needs_save(dev, vc_caps[i].id)) > >> ^ > >> This should be most probably !pos (and not !posi - because !posi does > >> through a compile error). > > > > Oops, sorry. > > > >>> continue; > >>> > >>> save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id); > >>> @@ -422,7 +441,7 @@ void pci_allocate_vc_save_buffers(struct pci_dev *dev) > >>> for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { > >>> int len, pos = pci_find_ext_capability(dev, vc_caps[i].id); > >>> > >>> - if (!pos) > >>> + if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id)) > >>> continue; > >>> > >>> len = pci_vc_do_save_buffer(dev, pos, NULL, false); > >> > >> W/ the above patch, the machine hangs again (w/ qemu and setpci), but w/ > >> Bjorn's patch (and nothing more applied) which disables vc save/restore, > >> the machine just works fine ... . I especially retested this case to be > >> really sure. I'm so sorry. But that's how it behaves here :-( > > > > Hmm, the intention was that this should effectively do the same thing as > > Bjorn's patch. The Atheros device (03:00.0) reports a VC capability but > > the root port above it (00:05.0) does not. > > Are you sure, that this patch really works (-> here!) as expected? Would > it be possible to add some debug output printing to the actual console > (not to log file) to be sure it really works as expected? Maybe some > more output to get an idea what's actually going on? 
Or is it just a
> timing issue?

Sure, here are some added printks (with posi fixed). You should be able
to run 'dmesg | grep pci_vc_needs_save' after boot and see device
0000:03:00.0. Hopefully you won't see the pci_save_vc_state() printk as
you assign the device.

diff --git a/drivers/pci/vc.c b/drivers/pci/vc.c
index 7e1304d..300e126 100644
--- a/drivers/pci/vc.c
+++ b/drivers/pci/vc.c
@@ -339,6 +339,26 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos,
 	return buf ? 0 : len;
 }
 
+/**
+ * pci_vc_needs_save - Determine whether a VC capability needs to be saved
+ * @dev: device
+ * @id: VC capability ID (VC/VC9/MFVC)
+ *
+ * In configurations where we have a VC or MFVC capability, but the upstream
+ * device does not, we assume that VC save (and therefore restore) is not
+ * necessary.  The intention is to only do VC save/restore in configuration
+ * where it's necessary and hopefully avoid reset issues.
+ */
+static bool pci_vc_needs_save(struct pci_dev *dev, u16 id)
+{
+	if (id == PCI_EXT_CAP_ID_VC9 || pci_is_root_bus(dev->bus) ||
+	    pci_find_ext_capability(dev->bus->self, PCI_EXT_CAP_ID_VC))
+		return true;
+
+	printk("%s(%s, %x) returning false\n", __func__, pci_name(dev), id);
+	return false;
+}
+
 static struct {
 	u16 id;
 	const char *name;
@@ -362,7 +382,7 @@ int pci_save_vc_state(struct pci_dev *dev)
 		struct pci_cap_saved_state *save_state;
 
 		pos = pci_find_ext_capability(dev, vc_caps[i].id);
-		if (!pos)
+		if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id))
 			continue;
 
 		save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id);
@@ -372,6 +392,7 @@ int pci_save_vc_state(struct pci_dev *dev)
 			return -ENOMEM;
 		}
 
+		printk("%s doing %s save on %s\n", __func__, vc_caps[i].name, pci_name(dev));
 		ret = pci_vc_do_save_buffer(dev, pos, save_state, true);
 		if (ret) {
 			dev_err(&dev->dev, "%s save unsuccessful %s\n",
@@ -403,6 +424,7 @@ void pci_restore_vc_state(struct pci_dev *dev)
 		if (!save_state || !pos)
 			continue;
 
+		printk("%s doing %s restore on %s\n", __func__, vc_caps[i].name, pci_name(dev));
 		pci_vc_do_save_buffer(dev, pos, save_state, false);
 	}
 }
@@ -422,7 +444,7 @@ void pci_allocate_vc_save_buffers(struct pci_dev *dev)
 	for (i = 0; i < ARRAY_SIZE(vc_caps); i++) {
 		int len, pos = pci_find_ext_capability(dev, vc_caps[i].id);
 
-		if (!pos)
+		if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id))
 			continue;
 
 		len = pci_vc_do_save_buffer(dev, pos, NULL, false);

^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-29 18:16 ` Alex Williamson @ 2014-10-29 19:43 ` Andreas Hartmann 2014-10-29 20:50 ` Alex Williamson 0 siblings, 1 reply; 42+ messages in thread From: Andreas Hartmann @ 2014-10-29 19:43 UTC (permalink / raw) To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci Alex Williamson wrote: > On Wed, 2014-10-29 at 18:57 +0100, Andreas Hartmann wrote: >> Alex Williamson wrote: >>> On Wed, 2014-10-29 at 17:47 +0100, Andreas Hartmann wrote: >>>> Alex Williamson wrote: >>>>> On Sat, 2014-10-25 at 08:03 +0200, Andreas Hartmann wrote: >>>>>> >>>>>> Out of interest: >>>>>> Bjorn's patch disables vc save/restore support - and the machine works >>>>>> fine again. Why is it needed at all if it seems to work perfectly w/o >>>>>> it? What's the additional benefit? Or in other words: What am I missing >>>>>> until today :-) ? What would be better? What could I do more? >>>>> >>>>> >>>>> You're right, in the configuration you have the endpoint device has a >>>>> Virtual Channel capability but the upstream root port does not. The >>>>> spec is not at all clear about defining the endpoints for enabling >>>>> Virtual Channel in each type of configuration, but I think that if we >>>>> have an upstream port that does not support Virtual Channel, we can skip >>>>> the save/restore. Please test the patch below. >>>>> >>>>> I'm also still completely confused about whether this is a VC >>>>> save/restore issue or a bus reset issue. You originally bisected this >>>>> back to the VC save/restore patch, but you also found that a manual, >>>>> setpci-based bus reset triggered a system hang. >>>> >>>> With your additional patch posted here: >>>> http://article.gmane.org/gmane.linux.kernel.pci/36162 >>> >>> Right, a reset via sysfs also triggered it with that patch, but the >>> reset via setpci is independent of any VC save/restore and still hung >>> your box. 
>>> >>>> >>>>> I believe that >>>>> re-ordering the kernel reset mechanisms also triggered this. Since >>>>> recent versions of QEMU are going to favor a bus reset over PM reset, I >>>>> don't have a lot of confidence that we're actually solving the problem >>>>> for you. Please make sure to test with a recent QEMU to be sure we'll >>>>> do a bus reset. >>>> >>>> I'm running qemu 2.1.0 (newest is 2.1.2 - but this shouldn't be a >>>> problem) and tested w/ linux 3.17. >>> >>> Yep, just want to make sure it's QEMU new enough to do a bus reset and >>> kernel with matching support. >>> >>>>> diff --git a/drivers/pci/vc.c b/drivers/pci/vc.c >>>>> index 7e1304d..6d13d34 100644 >>>>> --- a/drivers/pci/vc.c >>>>> +++ b/drivers/pci/vc.c >>>>> @@ -339,6 +339,25 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos, >>>>> return buf ? 0 : len; >>>>> } >>>>> >>>>> +/** >>>>> + * pci_vc_needs_save - Determine whether a VC capability needs to be saved >>>>> + * @dev: device >>>>> + * @id: VC capability ID (VC/VC9/MFVC) >>>>> + * >>>>> + * In configurations where we have a VC or MFVC capability, but the upstream >>>>> + * device does not, we assume that VC save (and therefore restore) is not >>>>> + * necessary. The intention is to only do VC save/restore in configuration >>>>> + * where it's necessary and hopefully avoid reset issues. 
>>>>> + */ >>>>> +static bool pci_vc_needs_save(struct pci_dev *dev, u16 id) >>>>> +{ >>>>> + if (id == PCI_EXT_CAP_ID_VC9 || pci_is_root_bus(dev->bus) || >>>>> + pci_find_ext_capability(dev->bus->self, PCI_EXT_CAP_ID_VC)) >>>>> + return true; >>>>> + >>>>> + return false; >>>>> +} >>>>> + >>>>> static struct { >>>>> u16 id; >>>>> const char *name; >>>>> @@ -362,7 +381,7 @@ int pci_save_vc_state(struct pci_dev *dev) >>>>> struct pci_cap_saved_state *save_state; >>>>> >>>>> pos = pci_find_ext_capability(dev, vc_caps[i].id); >>>>> - if (!pos) >>>>> + if (!posi || !pci_vc_needs_save(dev, vc_caps[i].id)) >>>> ^ >>>> This should be most probably !pos (and not !posi - because !posi does >>>> through a compile error). >>> >>> Oops, sorry. >>> >>>>> continue; >>>>> >>>>> save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id); >>>>> @@ -422,7 +441,7 @@ void pci_allocate_vc_save_buffers(struct pci_dev *dev) >>>>> for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { >>>>> int len, pos = pci_find_ext_capability(dev, vc_caps[i].id); >>>>> >>>>> - if (!pos) >>>>> + if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id)) >>>>> continue; >>>>> >>>>> len = pci_vc_do_save_buffer(dev, pos, NULL, false); >>>> >>>> W/ the above patch, the machine hangs again (w/ qemu and setpci), but w/ >>>> Bjorn's patch (and nothing more applied) which disables vc save/restore, >>>> the machine just works fine ... . I especially retested this case to be >>>> really sure. I'm so sorry. But that's how it behaves here :-( >>> >>> Hmm, the intention was that this should effectively do the same thing as >>> Bjorn's patch. The Atheros device (03:00.0) reports a VC capability but >>> the root port above it (00:05.0) does not. >> >> Are you sure, that this patch really works (-> here!) as expected? Would >> it be possible to add some debug output printing to the actual console >> (not to log file) to be sure it really works as expected? Maybe some >> more output to get an idea what's actually going on? 
Or is it just a
>> timing issue?
>
> Sure, here's some added printks (and fixed posi). You should be able to
> run 'dmesg | grep pci_vc_needs_save' after boot and see device
> 0000:03:00.0. Hopefully you won't see the pci_save_vc_state() printk as
> you assign the device.

[...]

I'm getting the expected output:

[    1.156857] pci_vc_needs_save(0000:03:00.0, 2) returning false
[    1.158866] pci_vc_needs_save(0000:04:00.0, 2) returning false

This is most probably triggered by pci_allocate_vc_save_buffers, true?

Therefore, I should never need pci_save_vc_state and
pci_restore_vc_state. Thus, it should be ok to add "return" at the
beginning of each of these functions, true? Then it should work.

I tested it. It worked.

But if I remove only one of these returns, either in
pci_save_vc_state or pci_restore_vc_state, the machine hangs again.

Therefore, there must be something odd going on in the for loops. Isn't
it possible to add some useful debug code to these loops to see what's
really going on? But the output *must* go to the actual console,
otherwise I can't see it!


int pci_save_vc_state(struct pci_dev *dev)
{
return 0; // must be set
	int i;

	for (i = 0; i < ARRAY_SIZE(vc_caps); i++) {
		int pos, ret;
		struct pci_cap_saved_state *save_state;

		pos = pci_find_ext_capability(dev, vc_caps[i].id);
		if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id))
			continue;

		save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id);
		if (!save_state) {
			dev_err(&dev->dev, "%s buffer not found in %s\n",
				vc_caps[i].name, __func__);
			return -ENOMEM;
		}

		printk("%s doing %s save on %s\n", __func__, vc_caps[i].name, pci_name(dev));
		ret = pci_vc_do_save_buffer(dev, pos, save_state, true);
		if (ret) {
			dev_err(&dev->dev, "%s save unsuccessful %s\n",
				vc_caps[i].name, __func__);
			return ret;
		}
	}

	return 0;
}


void pci_restore_vc_state(struct pci_dev *dev)
{
return; // must be set
	int i;

	for (i = 0; i < ARRAY_SIZE(vc_caps); i++) {
		int pos;
		struct pci_cap_saved_state *save_state;

		pos = pci_find_ext_capability(dev, vc_caps[i].id);
		save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id);
		if (!save_state || !pos)
			continue;

		printk("%s doing %s restore on %s\n", __func__, vc_caps[i].name, pci_name(dev));
		pci_vc_do_save_buffer(dev, pos, save_state, false);
	}
}


Thanks,
Andreas

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-29 19:43 ` Andreas Hartmann @ 2014-10-29 20:50 ` Alex Williamson 2014-10-29 21:35 ` Andreas Hartmann 2014-10-30 16:35 ` Andreas Hartmann 0 siblings, 2 replies; 42+ messages in thread From: Alex Williamson @ 2014-10-29 20:50 UTC (permalink / raw) To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci On Wed, 2014-10-29 at 20:43 +0100, Andreas Hartmann wrote: > Alex Williamson wrote: > > On Wed, 2014-10-29 at 18:57 +0100, Andreas Hartmann wrote: > >> Alex Williamson wrote: > >>> On Wed, 2014-10-29 at 17:47 +0100, Andreas Hartmann wrote: > >>>> Alex Williamson wrote: > >>>>> On Sat, 2014-10-25 at 08:03 +0200, Andreas Hartmann wrote: > >>>>>> > >>>>>> Out of interest: > >>>>>> Bjorn's patch disables vc save/restore support - and the machine works > >>>>>> fine again. Why is it needed at all if it seems to work perfectly w/o > >>>>>> it? What's the additional benefit? Or in other words: What am I missing > >>>>>> until today :-) ? What would be better? What could I do more? > >>>>> > >>>>> > >>>>> You're right, in the configuration you have the endpoint device has a > >>>>> Virtual Channel capability but the upstream root port does not. The > >>>>> spec is not at all clear about defining the endpoints for enabling > >>>>> Virtual Channel in each type of configuration, but I think that if we > >>>>> have an upstream port that does not support Virtual Channel, we can skip > >>>>> the save/restore. Please test the patch below. > >>>>> > >>>>> I'm also still completely confused about whether this is a VC > >>>>> save/restore issue or a bus reset issue. You originally bisected this > >>>>> back to the VC save/restore patch, but you also found that a manual, > >>>>> setpci-based bus reset triggered a system hang. 
> >>>> > >>>> With your additional patch posted here: > >>>> http://article.gmane.org/gmane.linux.kernel.pci/36162 > >>> > >>> Right, a reset via sysfs also triggered it with that patch, but the > >>> reset via setpci is independent of any VC save/restore and still hung > >>> your box. > >>> > >>>> > >>>>> I believe that > >>>>> re-ordering the kernel reset mechanisms also triggered this. Since > >>>>> recent versions of QEMU are going to favor a bus reset over PM reset, I > >>>>> don't have a lot of confidence that we're actually solving the problem > >>>>> for you. Please make sure to test with a recent QEMU to be sure we'll > >>>>> do a bus reset. > >>>> > >>>> I'm running qemu 2.1.0 (newest is 2.1.2 - but this shouldn't be a > >>>> problem) and tested w/ linux 3.17. > >>> > >>> Yep, just want to make sure it's QEMU new enough to do a bus reset and > >>> kernel with matching support. > >>> > >>>>> diff --git a/drivers/pci/vc.c b/drivers/pci/vc.c > >>>>> index 7e1304d..6d13d34 100644 > >>>>> --- a/drivers/pci/vc.c > >>>>> +++ b/drivers/pci/vc.c > >>>>> @@ -339,6 +339,25 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos, > >>>>> return buf ? 0 : len; > >>>>> } > >>>>> > >>>>> +/** > >>>>> + * pci_vc_needs_save - Determine whether a VC capability needs to be saved > >>>>> + * @dev: device > >>>>> + * @id: VC capability ID (VC/VC9/MFVC) > >>>>> + * > >>>>> + * In configurations where we have a VC or MFVC capability, but the upstream > >>>>> + * device does not, we assume that VC save (and therefore restore) is not > >>>>> + * necessary. The intention is to only do VC save/restore in configuration > >>>>> + * where it's necessary and hopefully avoid reset issues. 
> >>>>> + */ > >>>>> +static bool pci_vc_needs_save(struct pci_dev *dev, u16 id) > >>>>> +{ > >>>>> + if (id == PCI_EXT_CAP_ID_VC9 || pci_is_root_bus(dev->bus) || > >>>>> + pci_find_ext_capability(dev->bus->self, PCI_EXT_CAP_ID_VC)) > >>>>> + return true; > >>>>> + > >>>>> + return false; > >>>>> +} > >>>>> + > >>>>> static struct { > >>>>> u16 id; > >>>>> const char *name; > >>>>> @@ -362,7 +381,7 @@ int pci_save_vc_state(struct pci_dev *dev) > >>>>> struct pci_cap_saved_state *save_state; > >>>>> > >>>>> pos = pci_find_ext_capability(dev, vc_caps[i].id); > >>>>> - if (!pos) > >>>>> + if (!posi || !pci_vc_needs_save(dev, vc_caps[i].id)) > >>>> ^ > >>>> This should be most probably !pos (and not !posi - because !posi does > >>>> through a compile error). > >>> > >>> Oops, sorry. > >>> > >>>>> continue; > >>>>> > >>>>> save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id); > >>>>> @@ -422,7 +441,7 @@ void pci_allocate_vc_save_buffers(struct pci_dev *dev) > >>>>> for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { > >>>>> int len, pos = pci_find_ext_capability(dev, vc_caps[i].id); > >>>>> > >>>>> - if (!pos) > >>>>> + if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id)) > >>>>> continue; > >>>>> > >>>>> len = pci_vc_do_save_buffer(dev, pos, NULL, false); > >>>> > >>>> W/ the above patch, the machine hangs again (w/ qemu and setpci), but w/ > >>>> Bjorn's patch (and nothing more applied) which disables vc save/restore, > >>>> the machine just works fine ... . I especially retested this case to be > >>>> really sure. I'm so sorry. But that's how it behaves here :-( > >>> > >>> Hmm, the intention was that this should effectively do the same thing as > >>> Bjorn's patch. The Atheros device (03:00.0) reports a VC capability but > >>> the root port above it (00:05.0) does not. > >> > >> Are you sure, that this patch really works (-> here!) as expected? 
Would > >> it be possible to add some debug output printing to the actual console > >> (not to log file) to be sure it really works as expected? Maybe some > >> more output to get an idea what's actually going on? Or is it just a > >> timing issue? > > > > Sure, here's some added printks (and fixed posi). You should be able to > > run 'dmesg | grep pci_vc_needs_save' after boot and see device > > 0000:03:00.0. Hopefully you won't see the pci_save_vc_state() printk as > > you assign the device. > > [...] > > I'm getting the expected output: > > [ 1.156857] pci_vc_needs_save(0000:03:00.0, 2) returning false > [ 1.158866] pci_vc_needs_save(0000:04:00.0, 2) returning false > > This is most probably triggered by pci_allocate_vc_save_buffers, true? Yes, it will be done at device discovery. > Therefore, I never should need pci_save_vc_state and > pci_restore_vc_state. Thus, it should be ok to add "return" at the > beginning of each of these function, true? Then it should work. > > I tested it. It worked. > > But if I'm removing only one of these returns either in > pci_save_vc_state or pci_restore_vc_state, the machine hangs again. > > Therefore, there must be something odd going on in the for loops. Isn't > it possible to add some useful debug code to these loops to see what's > really going on? But the output *must* go to the actual console, > otherwise I can't see it! > > > int pci_save_vc_state(struct pci_dev *dev) > { > return 0; // must be set > int i; > > for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { > int pos, ret; > struct pci_cap_saved_state *save_state; > > pos = pci_find_ext_capability(dev, vc_caps[i].id); > if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id)) > continue; Take the next logical step, comment out the if here and we'll statically take the continue. Does it still fail? If so, move the continue above the call to pci_find_ext_capability(), if not... > > save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id); If not, add a continue; here. 
Unless my pci_vc_needs_save() function is broken, we shouldn't be getting here anyway. > if (!save_state) { > dev_err(&dev->dev, "%s buffer not found in %s\n", > vc_caps[i].name, __func__); > return -ENOMEM; > } > > printk("%s doing %s save on %s\n", __func__, vc_caps[i].name, pci_name(dev)); > ret = pci_vc_do_save_buffer(dev, pos, save_state, true); > if (ret) { > dev_err(&dev->dev, "%s save unsuccessful %s\n", > vc_caps[i].name, __func__); > return ret; > } > } > > return 0; > } > > > void pci_restore_vc_state(struct pci_dev *dev) > { > return; // must be set > int i; > > for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { > int pos; > struct pci_cap_saved_state *save_state; > > pos = pci_find_ext_capability(dev, vc_caps[i].id); > save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id); This should never find a save_state with the pci_vc_needs_save() patch, so we should always take the branch below. Comment out the if (... and leave the continue, does the behavior change? If so, add a continue; line above pci_find_saved_ext_cap(), does it work? If not, add another continue above pci_find_ext_capability(). > if (!save_state || !pos) > continue; > > printk("%s doing %s restore on %s\n", __func__, vc_caps[i].name, pci_name(dev)); > pci_vc_do_save_buffer(dev, pos, save_state, false); > } > } In the "working" case with Bjorn's patch, are you actually trying to use the device or just testing to see if the system survives reset? You might at least want to run lspci -xxxx on it after reset to make sure it's really there. Thanks, Alex ^ permalink raw reply [flat|nested] 42+ messages in thread
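The reasoning in this message (no buffer is allocated, so save is skipped and restore never finds a save_state) can be checked with a small userspace model. This is only a sketch under stated assumptions: `struct mock_cap` and all `mock_*` functions are hypothetical names invented here, the order of the table entries is not significant, and the counters stand in for the config-space accesses done by pci_vc_do_save_buffer().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-capability state; not a kernel structure. */
struct mock_cap {
	const char *name;
	bool present;     /* the device exposes this capability */
	bool needs_save;  /* outcome of the pci_vc_needs_save()-style check */
	bool have_buffer; /* set only by the allocation step */
	int saves;
	int restores;
};

/* Models pci_allocate_vc_save_buffers(): buffers only where needed. */
void mock_allocate_buffers(struct mock_cap *caps, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!caps[i].present || !caps[i].needs_save)
			continue;
		caps[i].have_buffer = true;
	}
}

/* Models pci_save_vc_state(): same filter, then one save per buffer. */
int mock_save_state(struct mock_cap *caps, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!caps[i].present || !caps[i].needs_save)
			continue;
		if (!caps[i].have_buffer)
			return -1; /* stands in for -ENOMEM */
		caps[i].saves++;
	}
	return 0;
}

/* Models pci_restore_vc_state(): no buffer means nothing to restore. */
void mock_restore_state(struct mock_cap *caps, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!caps[i].have_buffer || !caps[i].present)
			continue;
		caps[i].restores++;
	}
}

/* Runs allocate, save, restore once; returns the number of VC accesses. */
int mock_vc_accesses(bool upstream_has_vc)
{
	struct mock_cap caps[] = {
		{ "MFVC", false, false, false, 0, 0 },
		{ "VC", true, upstream_has_vc, false, 0, 0 },
		{ "VC9", false, true, false, 0, 0 },
	};
	size_t n = sizeof(caps) / sizeof(caps[0]);
	int total = 0;

	assert(n == 3);
	mock_allocate_buffers(caps, n);
	if (mock_save_state(caps, n) != 0)
		return -1;
	mock_restore_state(caps, n);

	for (size_t i = 0; i < n; i++)
		total += caps[i].saves + caps[i].restores;
	return total;
}
```

With `upstream_has_vc` false the model performs no save and no restore at all, which is why a hang that persists with the patch applied is hard to attribute to the VC save/restore code itself.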
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
  2014-10-29 20:50 ` Alex Williamson
@ 2014-10-29 21:35 ` Andreas Hartmann
  2014-10-30 16:35 ` Andreas Hartmann
  1 sibling, 0 replies; 42+ messages in thread
From: Andreas Hartmann @ 2014-10-29 21:35 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci

Alex Williamson wrote:

[...] more tomorrow

> In the "working" case with Bjorn's patch, are you actually trying to use
> the device or just testing to see if the system survives reset?

The VM starts hostapd using this device. Hostapd didn't show any error in
the logfile using this device. Therefore I think it should have worked:

1414609514.733245: Using interface wlan1 with hwaddr 64:70:02:... and ssid "..."
1414609514.766353: wlan1: interface state HT_SCAN->ENABLED
1414609514.766380: wlan1: AP-ENABLED

Regards,
Andreas

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-29 20:50 ` Alex Williamson 2014-10-29 21:35 ` Andreas Hartmann @ 2014-10-30 16:35 ` Andreas Hartmann 2014-10-30 16:58 ` Alex Williamson 1 sibling, 1 reply; 42+ messages in thread From: Andreas Hartmann @ 2014-10-30 16:35 UTC (permalink / raw) To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci Alex Williamson wrote: > On Wed, 2014-10-29 at 20:43 +0100, Andreas Hartmann wrote: [...] >> Therefore, I never should need pci_save_vc_state and >> pci_restore_vc_state. Thus, it should be ok to add "return" at the >> beginning of each of these function, true? Then it should work. >> >> I tested it. It worked. >> >> But if I'm removing only one of these returns either in >> pci_save_vc_state or pci_restore_vc_state, the machine hangs again. >> >> Therefore, there must be something odd going on in the for loops. Isn't >> it possible to add some useful debug code to these loops to see what's >> really going on? But the output *must* go to the actual console, >> otherwise I can't see it! >> >> >> int pci_save_vc_state(struct pci_dev *dev) >> { >> return 0; // must be set >> int i; >> >> for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { // continue; -> works >> int pos, ret; >> struct pci_cap_saved_state *save_state; // continue does not work! --> Most probably the struct pci_cap_saved_state *save_state; makes the system hang! ARRAY_SIZE(vc_caps) is 3 and the whole function is called 3 times when starting the vm. >> >> pos = pci_find_ext_capability(dev, vc_caps[i].id); >> if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id)) >> continue; > > Take the next logical step, comment out the if here and we'll statically > take the continue. Does it still fail? If so, move the continue above > the call to pci_find_ext_capability(), if not... > >> >> save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id); > > If not, add a continue; here. 
Unless my pci_vc_needs_save() function is > broken, we shouldn't be getting here anyway. > >> if (!save_state) { >> dev_err(&dev->dev, "%s buffer not found in %s\n", >> vc_caps[i].name, __func__); >> return -ENOMEM; >> } >> >> printk("%s doing %s save on %s\n", __func__, vc_caps[i].name, pci_name(dev)); >> ret = pci_vc_do_save_buffer(dev, pos, save_state, true); >> if (ret) { >> dev_err(&dev->dev, "%s save unsuccessful %s\n", >> vc_caps[i].name, __func__); >> return ret; >> } >> } >> >> return 0; >> } >> >> >> void pci_restore_vc_state(struct pci_dev *dev) >> { >> return; // must be set >> int i; >> >> for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { >> int pos; >> struct pci_cap_saved_state *save_state; >> >> pos = pci_find_ext_capability(dev, vc_caps[i].id); >> save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id); > > This should never find a save_state with the pci_vc_needs_save() patch, > so we should always take the branch below. Comment out the if (... and > leave the continue, does the behavior change? If so, add a continue; > line above pci_find_saved_ext_cap(), does it work? If not, add another > continue above pci_find_ext_capability(). > >> if (!save_state || !pos) >> continue; >> >> printk("%s doing %s restore on %s\n", __func__, vc_caps[i].name, pci_name(dev)); >> pci_vc_do_save_buffer(dev, pos, save_state, false); >> } >> } > > In the "working" case with Bjorn's patch, are you actually trying to use > the device or just testing to see if the system survives reset? You > might at least want to run lspci -xxxx on it after reset to make sure > it's really there. Thanks, > > Alex > ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-30 16:35 ` Andreas Hartmann @ 2014-10-30 16:58 ` Alex Williamson 2014-10-30 19:09 ` Andreas Hartmann 0 siblings, 1 reply; 42+ messages in thread From: Alex Williamson @ 2014-10-30 16:58 UTC (permalink / raw) To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci On Thu, 2014-10-30 at 17:35 +0100, Andreas Hartmann wrote: > Alex Williamson wrote: > > On Wed, 2014-10-29 at 20:43 +0100, Andreas Hartmann wrote: > [...] > >> Therefore, I never should need pci_save_vc_state and > >> pci_restore_vc_state. Thus, it should be ok to add "return" at the > >> beginning of each of these function, true? Then it should work. > >> > >> I tested it. It worked. > >> > >> But if I'm removing only one of these returns either in > >> pci_save_vc_state or pci_restore_vc_state, the machine hangs again. > >> > >> Therefore, there must be something odd going on in the for loops. Isn't > >> it possible to add some useful debug code to these loops to see what's > >> really going on? But the output *must* go to the actual console, > >> otherwise I can't see it! > >> > >> > >> int pci_save_vc_state(struct pci_dev *dev) > >> { > >> return 0; // must be set > >> int i; > >> > >> for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { > // continue; -> works > >> int pos, ret; > >> struct pci_cap_saved_state *save_state; > // continue does not work! > > --> Most probably the > > struct pci_cap_saved_state *save_state; > > makes the system hang! We've done nothing more than declare variables there, there's no actual code. What happens if you increase the delay after bus reset, edit drivers/pci/pci.c, find the call to ssleep(1) and change the 1 to a 2, doubling the delay after reset. It seems like VC save/restore is just a scapegoat for the platform already being broken by the bus reset. 
Also, if you have any other card to test in this slot, it would be useful comparison data to know if we're dealing with an endpoint issue or a bus issue. > ARRAY_SIZE(vc_caps) is 3 and the whole > function is called 3 times when starting the vm. Sounds right. The array is declared right above these functions and has entries for VC, VC9, and MFVC types. VFIO will try to reset the device when it's initially opened and then QEMU does it twice (for some reason), so that makes 3. Thanks, Alex > >> > >> pos = pci_find_ext_capability(dev, vc_caps[i].id); > >> if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id)) > >> continue; > > > > Take the next logical step, comment out the if here and we'll statically > > take the continue. Does it still fail? If so, move the continue above > > the call to pci_find_ext_capability(), if not... > > > >> > >> save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id); > > > > If not, add a continue; here. Unless my pci_vc_needs_save() function is > > broken, we shouldn't be getting here anyway. > > > >> if (!save_state) { > >> dev_err(&dev->dev, "%s buffer not found in %s\n", > >> vc_caps[i].name, __func__); > >> return -ENOMEM; > >> } > >> > >> printk("%s doing %s save on %s\n", __func__, vc_caps[i].name, pci_name(dev)); > >> ret = pci_vc_do_save_buffer(dev, pos, save_state, true); > >> if (ret) { > >> dev_err(&dev->dev, "%s save unsuccessful %s\n", > >> vc_caps[i].name, __func__); > >> return ret; > >> } > >> } > >> > >> return 0; > >> } > >> > >> > >> void pci_restore_vc_state(struct pci_dev *dev) > >> { > >> return; // must be set > >> int i; > >> > >> for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { > >> int pos; > >> struct pci_cap_saved_state *save_state; > >> > >> pos = pci_find_ext_capability(dev, vc_caps[i].id); > >> save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id); > > > > This should never find a save_state with the pci_vc_needs_save() patch, > > so we should always take the branch below. Comment out the if (... 
and > > leave the continue, does the behavior change? If so, add a continue; > > line above pci_find_saved_ext_cap(), does it work? If not, add another > > continue above pci_find_ext_capability(). > > > >> if (!save_state || !pos) > >> continue; > >> > >> printk("%s doing %s restore on %s\n", __func__, vc_caps[i].name, pci_name(dev)); > >> pci_vc_do_save_buffer(dev, pos, save_state, false); > >> } > >> } > > > > In the "working" case with Bjorn's patch, are you actually trying to use > > the device or just testing to see if the system survives reset? You > > might at least want to run lspci -xxxx on it after reset to make sure > > it's really there. Thanks, > > > > Alex > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-30 16:58 ` Alex Williamson @ 2014-10-30 19:09 ` Andreas Hartmann 2014-10-30 19:45 ` Alex Williamson 0 siblings, 1 reply; 42+ messages in thread From: Andreas Hartmann @ 2014-10-30 19:09 UTC (permalink / raw) To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci Alex Williamson wrote: > On Thu, 2014-10-30 at 17:35 +0100, Andreas Hartmann wrote: >> Alex Williamson wrote: >>> On Wed, 2014-10-29 at 20:43 +0100, Andreas Hartmann wrote: >> [...] >>>> Therefore, I never should need pci_save_vc_state and >>>> pci_restore_vc_state. Thus, it should be ok to add "return" at the >>>> beginning of each of these function, true? Then it should work. >>>> >>>> I tested it. It worked. >>>> >>>> But if I'm removing only one of these returns either in >>>> pci_save_vc_state or pci_restore_vc_state, the machine hangs again. >>>> >>>> Therefore, there must be something odd going on in the for loops. Isn't >>>> it possible to add some useful debug code to these loops to see what's >>>> really going on? But the output *must* go to the actual console, >>>> otherwise I can't see it! >>>> >>>> >>>> int pci_save_vc_state(struct pci_dev *dev) >>>> { >>>> return 0; // must be set >>>> int i; >>>> >>>> for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { >> // continue; -> works >>>> int pos, ret; >>>> struct pci_cap_saved_state *save_state; >> // continue does not work! >> >> --> Most probably the >> >> struct pci_cap_saved_state *save_state; >> >> makes the system hang! > > We've done nothing more than declare variables there, there's no actual > code. What happens if you increase the delay after bus reset, edit > drivers/pci/pci.c, find the call to ssleep(1) and change the 1 to a 2, > doubling the delay after reset. Same behaviour. > It seems like VC save/restore is just a > scapegoat for the platform already being broken by the bus reset. 
Also, > if you have any other card to test in this slot, it would be useful > comparison data to know if we're dealing with an endpoint issue or a bus > issue. I organized an Intel pcie card: 03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection Subsystem: Intel Corporation Gigabit CT Desktop Adapter Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Interrupt: pin A routed to IRQ 17 Region 0: Memory at fdbc0000 (32-bit, non-prefetchable) [disabled] [size=128K] Region 1: Memory at fdb00000 (32-bit, non-prefetchable) [disabled] [size=512K] Region 2: I/O ports at cf00 [disabled] [size=32] Region 3: Memory at fdbfc000 (32-bit, non-prefetchable) [disabled] [size=16K] [virtual] Expansion ROM at fdb80000 [disabled] [size=256K] Capabilities: [c8] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 NoSoftRst- PME-Enable+ DSel=0 DScale=1 PME- Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+ Address: 0000000000000000 Data: 0000 Capabilities: [e0] Express (v1) Endpoint, MSI 00 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ MaxPayload 128 bytes, MaxReadReq 512 bytes DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend- LnkCap: Port #1, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <64us ClockPM- Surprise- LLActRep- BwNot- LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- Capabilities: [a0] MSI-X: Enable- Count=5 Masked- Vector table: BAR=3 offset=00000000 PBA: BAR=3 offset=00002000 
Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn- Capabilities: [140 v1] Device Serial Number 00-1b-21-ff-ff-cf-8f-57 Kernel driver in use: vfio-pci and tested with the same kernel, which hangs w/ atheros card. It just worked. Not just once, but each of the tests I did. I retested w/ atheros -> hang. Tested again with intel-card -> works. Back to atheros -> hang. Seems to be really a problem w/ the atheros card, which is triggered by new vc save/restore. Well, but what to do now? I know how to "fix" it. But this means I have to compile my kernels again on my own if it is >= 3.14. Thanks, kind regards, Andreas ^ permalink raw reply [flat|nested] 42+ messages in thread
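A side note on the two lspci listings in this thread: the Atheros listing shows a "Virtual Channel" extended capability at [140 v1], while the Intel 82574L listing above shows only Advanced Error Reporting and Device Serial Number among its extended capabilities. Whether that difference matters for the hang is not established here. A minimal shell sketch for checking any device the same way follows. The helper name has_vc_cap is made up for illustration, and it only greps the text printed by lspci -vvv, so it assumes lspci is run with enough privilege to show extended config space.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): read `lspci -vvv`
# output on stdin and report whether a plain "Virtual Channel" extended
# capability line is present.
has_vc_cap() {
    if grep -q 'Capabilities: \[[0-9a-f]* v[0-9]*] Virtual Channel'; then
        echo yes
    else
        echo no
    fi
}

# Usage on a live system (device address taken from this thread):
#   lspci -vvv -s 03:00.0 | has_vc_cap
```

This only compares what lspci reports; it does not touch the device or trigger a reset.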
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-30 19:09 ` Andreas Hartmann @ 2014-10-30 19:45 ` Alex Williamson 2014-10-30 20:21 ` Andreas Hartmann 0 siblings, 1 reply; 42+ messages in thread From: Alex Williamson @ 2014-10-30 19:45 UTC (permalink / raw) To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci On Thu, 2014-10-30 at 20:09 +0100, Andreas Hartmann wrote: > Alex Williamson wrote: > > On Thu, 2014-10-30 at 17:35 +0100, Andreas Hartmann wrote: > >> Alex Williamson wrote: > >>> On Wed, 2014-10-29 at 20:43 +0100, Andreas Hartmann wrote: > >> [...] > >>>> Therefore, I never should need pci_save_vc_state and > >>>> pci_restore_vc_state. Thus, it should be ok to add "return" at the > >>>> beginning of each of these function, true? Then it should work. > >>>> > >>>> I tested it. It worked. > >>>> > >>>> But if I'm removing only one of these returns either in > >>>> pci_save_vc_state or pci_restore_vc_state, the machine hangs again. > >>>> > >>>> Therefore, there must be something odd going on in the for loops. Isn't > >>>> it possible to add some useful debug code to these loops to see what's > >>>> really going on? But the output *must* go to the actual console, > >>>> otherwise I can't see it! > >>>> > >>>> > >>>> int pci_save_vc_state(struct pci_dev *dev) > >>>> { > >>>> return 0; // must be set > >>>> int i; > >>>> > >>>> for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { > >> // continue; -> works > >>>> int pos, ret; > >>>> struct pci_cap_saved_state *save_state; > >> // continue does not work! > >> > >> --> Most probably the > >> > >> struct pci_cap_saved_state *save_state; > >> > >> makes the system hang! > > > > We've done nothing more than declare variables there, there's no actual > > code. What happens if you increase the delay after bus reset, edit > > drivers/pci/pci.c, find the call to ssleep(1) and change the 1 to a 2, > > doubling the delay after reset. > > Same behaviour. 
> > > It seems like VC save/restore is just a > > scapegoat for the platform already being broken by the bus reset. Also, > > if you have any other card to test in this slot, it would be useful > > comparison data to know if we're dealing with an endpoint issue or a bus > > issue. > > I organized an Intel pcie card: > > 03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection > Subsystem: Intel Corporation Gigabit CT Desktop Adapter > Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- > Interrupt: pin A routed to IRQ 17 > Region 0: Memory at fdbc0000 (32-bit, non-prefetchable) [disabled] [size=128K] > Region 1: Memory at fdb00000 (32-bit, non-prefetchable) [disabled] [size=512K] > Region 2: I/O ports at cf00 [disabled] [size=32] > Region 3: Memory at fdbfc000 (32-bit, non-prefetchable) [disabled] [size=16K] > [virtual] Expansion ROM at fdb80000 [disabled] [size=256K] > Capabilities: [c8] Power Management version 2 > Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) > Status: D0 NoSoftRst- PME-Enable+ DSel=0 DScale=1 PME- > Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+ > Address: 0000000000000000 Data: 0000 > Capabilities: [e0] Express (v1) Endpoint, MSI 00 > DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us > ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- > DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- > RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ > MaxPayload 128 bytes, MaxReadReq 512 bytes > DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend- > LnkCap: Port #1, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <64us > ClockPM- Surprise- LLActRep- BwNot- > LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+ > ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- > LnkSta: Speed 
2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- > Capabilities: [a0] MSI-X: Enable- Count=5 Masked- > Vector table: BAR=3 offset=00000000 > PBA: BAR=3 offset=00002000 > Capabilities: [100 v1] Advanced Error Reporting > UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- > UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- > UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- > CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ > CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ > AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn- > Capabilities: [140 v1] Device Serial Number 00-1b-21-ff-ff-cf-8f-57 > Kernel driver in use: vfio-pci > > > and tested with the same kernel, which hangs w/ atheros card. It just > worked. Not just once, but each of the tests I did. I retested w/ > atheros -> hang. Tested again with intel-card -> works. Back to > atheros -> hang. Thanks for the test. > Seems to be really a problem w/ the atheros card, which is triggered by > new vc save/restore. It seems more like the bus reset, not the VC save/restore. As far a interacting with hardware is concerned, there's no difference between the two cases where you found one continue works and the other doesn't. > Well, but what to do now? I know how to "fix" it. But this means I have > to compile my kernels again on my own if it is >= 3.14. Let's not give up hope just yet, I'd like to try another bus reset mechanism with setpci. Install the Atheros card and bind it to pci-stub, then do: setpci -s 00:05.0 68.w=0010:0010 sleep 0.1 setpci -s 00:05.0 68.w=0000:0010 sleep 1 lspci -xxx -s 3:00.0 This uses the link disable control rather than the secondary bus reset. Typically the results between the two are the same, but maybe we'll get lucky. 
The BIOS manages to reset the bus with this device installed somehow, so there must be a mechanism to do it. Thanks, Alex ^ permalink raw reply [flat|nested] 42+ messages in thread
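For readers unfamiliar with the setpci syntax used above: 68.w=0010:0010 is a masked 16-bit write, where the part after the colon is a mask and only the bits set in the mask are changed. Going by Alex's description, register 0x68 on the root port 00:05.0 is presumably the PCIe Link Control register, in which bit 4 is Link Disable. A minimal shell sketch of the arithmetic follows. masked_write is a hypothetical helper (not part of pciutils), and the starting value 0x0040 is an assumption chosen for illustration, not a value read from this machine.

```shell
#!/bin/sh
# Hypothetical helper (not part of pciutils): compute the 16-bit result of a
# setpci-style masked write "value:mask" applied to an old register value.
#   $1 = old register value, $2 = value, $3 = mask
masked_write() {
    printf '0x%04x\n' $(( ($1 & ~$3 & 0xffff) | ($2 & $3) ))
}

# Link Control bit 4 is Link Disable.
LINK_DISABLE=0x0010

# Assumed prior register value 0x0040 (illustration only).
masked_write 0x0040 0x0010 "$LINK_DISABLE"   # set bit 4:   prints 0x0050
masked_write 0x0050 0x0000 "$LINK_DISABLE"   # clear bit 4: prints 0x0040
```

The point of the mask is that the other Link Control bits are left as they were, so the two setpci calls only toggle the link off and back on.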
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
  2014-10-30 19:45 ` Alex Williamson
@ 2014-10-30 20:21 ` Andreas Hartmann
  0 siblings, 0 replies; 42+ messages in thread
From: Andreas Hartmann @ 2014-10-30 20:21 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci

Alex Williamson wrote:
> On Thu, 2014-10-30 at 20:09 +0100, Andreas Hartmann wrote:
[...]
>> Well, but what to do now? I know how to "fix" it. But this means I have
>> to compile my kernels again on my own if it is >= 3.14.
>
> Let's not give up hope just yet, I'd like to try another bus reset
> mechanism with setpci. Install the Atheros card and bind it to
> pci-stub, then do:
>
> setpci -s 00:05.0 68.w=0010:0010
> sleep 0.1
> setpci -s 00:05.0 68.w=0000:0010
> sleep 1
> lspci -xxx -s 3:00.0

Exactly same -> hang :-(.

Thanks,
Regards,
Andreas

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) 2014-10-21 21:06 ` Alex Williamson 2014-10-21 21:32 ` Alex Williamson @ 2014-10-22 15:34 ` Andreas Hartmann 2014-10-22 16:02 ` Alex Williamson 1 sibling, 1 reply; 42+ messages in thread From: Andreas Hartmann @ 2014-10-22 15:34 UTC (permalink / raw) To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci Alex Williamson schrieb: > Hi Andreas, > > On Fri, 2014-10-17 at 03:04 +0200, Andreas Hartmann wrote: >> Hello Alex, >> >> Alex Williamson wrote: >>> Hi Andreas, >> [...] >>> Sorry for the breakage. Is it possible to run lspci on the device in a >>> loop from the host and capture whether we're failing to restore some of >>> the VC bits to their previous state? >> >>> Does the problem also occur if you >>> unbind from host driver, >> >> The machine is booted w/ blacklisted ath9k. Then, the device is bound to >> vfio: >> >> echo "168c 0030" > /sys/bus/pci/drivers/vfio-pci/new_id >> echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind >> echo 0000:03:00.0 > /sys/bus/pci/drivers/vfio-pci/bind >> >> afterwards the VM is started -> hang. >> >> W/o starting th VM, I can bind it to vfio and unbind it from vfio w/o >> any problem. >> >>> echo 1 > reset in pci-sysfs, >> >> echo 1 > /sys/bus/pci/devices/0000:03:00.0 works w/o any problem while >> bound to vfio. Even after unbinding from vfio and rebinding to vfio >> again ... . >> >>> and re-bind to the >> >> Do you mean loading ath9k in host system after unbinding from vfio? If >> yes: Works w/o any problem. It's even possible to reset it or do a >> ifconfig wlan0 up, ifconfig wlan0 down, rmmod ath9k, bind it to vfio >> again and reset it, .... >> >> Looks like the hang only is triggered by qemu-system_x86_64 on startup >> the VM. >> >>> host? I'll also try to reproduce on my 990fx system, but I won't be >>> able to do that until next week due to travel. 
Thanks, > > Could you send me the lspci -vvvxxxx for the device and parent root > port? Thanks, Done with kernel 3.12.28 in host while the device was used in VM: # lspci -vt -[0000:00]-+-00.0 Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B) +-00.2 Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory Management Unit (IOMMU) +-02.0-[01]--+-00.0 Advanced Micro Devices, Inc. [AMD/ATI] Turks PRO [Radeon HD 6570/7570] | \-00.1 Advanced Micro Devices, Inc. [AMD/ATI] Turks/Whistler HDMI Audio [Radeon HD 6000 Series] +-04.0-[02]----00.0 Etron Technology, Inc. EJ168 USB 3.0 Host Controller +-05.0-[03]----00.0 Qualcomm Atheros AR93xx Wireless Network Adapter +-09.0-[04]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller +-0a.0-[05]----00.0 Etron Technology, Inc. EJ168 USB 3.0 Host Controller +-11.0 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] +-12.0 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller +-12.2 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller +-13.0 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller +-13.2 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller +-14.0 Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller +-14.2 Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) +-14.3 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller +-14.4-[06]--+-06.0 Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 | \-0e.0 VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller +-14.5 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller +-15.0-[07]-- +-16.0 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller +-16.2 Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller +-18.0 Advanced Micro Devices, Inc. 
[AMD] Family 15h Processor Function 0 +-18.1 Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 1 +-18.2 Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 2 +-18.3 Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 3 +-18.4 Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 4 \-18.5 Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 5 # lspci -s 03:00 -vvvxxxx 03:00.0 Network controller: Qualcomm Atheros AR93xx Wireless Network Adapter (rev 01) Subsystem: Qualcomm Atheros Device 3112 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 17 Region 0: Memory at fdbc0000 (64-bit, non-prefetchable) [size=128K] [virtual] Expansion ROM at fda00000 [disabled] [size=64K] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Capabilities: [50] MSI: Enable- Count=1/4 Maskable+ 64bit+ Address: 0000000000000000 Data: 0000 Masking: 00000000 Pending: 00000000 Capabilities: [70] Express (v2) Endpoint, MSI 00 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 128 bytes, MaxReadReq 512 bytes DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend- LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <2us, L1 <64us ClockPM- Surprise- LLActRep- BwNot- LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: 
Not Supported, TimeoutDis+, LTR-, OBFF Not Supported DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest- Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout+ NonFatalErr+ CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn- Capabilities: [140 v1] Virtual Channel Caps: LPEVC=0 RefClk=100ns PATEntryBits=1 Arb: Fixed- WRR32- WRR64- WRR128- Ctrl: ArbSelect=Fixed Status: InProgress- VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans- Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256- Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff Status: NegoPending- InProgress- Capabilities: [300 v1] Device Serial Number 00-00-00-00-00-00-00-00 Kernel driver in use: vfio-pci Kernel modules: ath9k 00: 8c 16 30 00 07 01 10 00 01 00 80 02 10 00 00 00 10: 04 00 bc fd 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 8c 16 12 31 30: 00 00 00 00 40 00 00 00 00 00 00 00 05 01 00 00 40: 01 50 c3 5b 00 00 00 00 00 00 00 00 00 00 00 00 50: 05 70 84 01 00 00 00 00 00 00 00 00 00 00 00 00 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 70: 10 00 02 00 00 87 2c 01 10 20 09 00 11 5c 03 00 80: 40 00 11 10 00 00 00 00 00 00 00 00 00 00 00 00 90: 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 a0: 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 100: 01 00 01 14 00 00 00 00 00 00 00 00 30 20 06 00 110: 00 30 00 00 00 20 00 00 00 00 00 00 00 00 00 00 120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 140: 02 00 01 30 00 00 00 00 00 00 00 00 00 00 00 00 150: 00 00 00 00 ff 00 00 80 00 00 00 00 00 00 00 00 160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 250: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 260: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 270: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 300: 03 00 01 00 00 00 
00 00 00 00 00 00 00 00 00 00 310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 410: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 420: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 470: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 530: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 540: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 550: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 560: 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 570: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 610: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 620: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 630: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 640: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 650: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 660: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 670: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 690: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 700: 76 00 63 01 ff ff ff ff 04 00 00 07 01 3f 3f 17 710: 20 01 01 00 00 00 00 00 aa 83 00 00 80 02 00 00 720: 00 00 00 00 03 00 00 00 51 5c ae 03 10 01 00 08 730: 40 00 01 00 ff 0f 01 00 ff ff 0f 00 00 00 00 00 740: 0f 00 00 00 00 00 00 00 08 40 20 00 03 40 20 00 750: 00 00 80 00 00 00 00 00 00 00 00 00 00 00 00 00 760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7a0: 00 00 00 00 00 00 00 00 2c 00 08 00 00 00 00 00 7b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 7c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 800: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 810: ff ff ff ff 00 00 00 00 00 03 00 00 00 00 00 00 820: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 830: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 840: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 850: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 860: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 870: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 890: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 910: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 990: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a10: 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 a20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 aa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ad0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c70: 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ca0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ce0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 da0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 db0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 dc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 dd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 de0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 df0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ea0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 eb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
ed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ef0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
fa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00


I'm not sure what you mean by "parent root port". Could it be this:

# lspci -s 00:00 -vvvxxxx
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
	Subsystem: Gigabyte Technology Co., Ltd Device 5000
	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
	Region 3: Memory at <ignored> (64-bit, non-prefetchable) [size=512M]
	Capabilities: [f0] HyperTransport: MSI Mapping Enable+ Fixed+
	Capabilities: [c4] HyperTransport: Slave or Primary Interface
		Command: BaseUnitID=0 UnitCnt=20 MastHost- DefDir- DUL-
		Link Control 0: CFlE- CST- CFE- <LkFail- Init+ EOC- TXO- <CRCErr=0 IsocEn- LSEn- ExtCTL- 64b-
		Link Config 0: MLWI=16bit DwFcIn- MLWO=16bit DwFcOut- LWI=16bit DwFcInEn- LWO=16bit DwFcOutEn-
		Link Control 1: CFlE- CST- CFE- <LkFail+ Init- EOC+ TXO+ <CRCErr=0 IsocEn- LSEn- ExtCTL- 64b-
		Link Config 1: MLWI=8bit DwFcIn- MLWO=8bit DwFcOut- LWI=8bit DwFcInEn- LWO=8bit DwFcOutEn-
		Revision ID: 3.00
		Link Frequency 0: [e]
		Link Error 0: <Prot- <Ovfl- <EOC- CTLTm-
		Link Frequency Capability 0: 200MHz+ 300MHz- 400MHz+ 500MHz- 600MHz+ 800MHz+ 1.0GHz+ 1.2GHz+ 1.4GHz- 1.6GHz- Vend-
		Feature Capability: IsocFC+ LDTSTOP+ CRCTM- ECTLT- 64bA+ UIDRD-
		Link Frequency 1: 200MHz
		Link Error 1: <Prot- <Ovfl- <EOC- CTLTm-
		Link Frequency Capability 1: 200MHz- 300MHz- 400MHz- 500MHz- 600MHz- 800MHz- 1.0GHz- 1.2GHz- 1.4GHz- 1.6GHz- Vend-
		Error Handling: PFlE- OFlE- PFE- OFE- EOCFE- RFE- CRCFE- SERRFE- CF- RE- PNFE- ONFE- EOCNFE- RNFE- CRCNFE- SERRNFE-
		Prefetchable memory behind bridge Upper: 00-00
		Bus Number: 00
	Capabilities: [40] HyperTransport: Retry Mode
	Capabilities: [54] HyperTransport: UnitID Clumping
	Capabilities: [9c] HyperTransport: #1a
	Capabilities: [70] MSI: Enable- Count=1/4 Maskable- 64bit-
		Address: 00000000  Data: 0000
00: 02 10 14 5a 02 00 10 20 02 00 00 06 00 00 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 04 00 00 e0
20: 00 00 00 00 00 00 00 00 00 00 00 00 58 14 00 50
30: 00 00 00 00 f0 00 00 00 00 00 00 00 ff 00 00 00
40: 08 54 00 c0 c1 00 00 00 00 00 00 00 42 20 05 00
50: 58 14 00 50 08 9c 00 90 08 10 00 00 08 10 00 00
60: 00 00 00 00 86 01 00 00 00 00 00 40 64 56 00 78
70: 05 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 10 00 00 03 20 02 30 00 31 20 00 00
90: 00 00 00 d0 00 00 00 00 10 09 00 00 08 70 3c d0
a0: 66 00 00 00 00 00 00 05 00 00 00 00 79 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 80 08 40 80 02 20 00 11 11 d0 00 00 00
d0: 60 0e f5 7f 13 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 05 00 ff ff ff ff 00 00 00 00 00 00 00 00
f0: 08 c4 03 a8 00 80 80 00 01 00 00 00 08 00 c0 fe

00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory Management Unit (IOMMU)
	Subsystem: Gigabyte Technology Co., Ltd Device 5000
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 40
	Capabilities: [40] Secure device <?>
	Capabilities: [54] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000feeff00c  Data: 4171
	Capabilities: [64] HyperTransport: MSI Mapping Enable+ Fixed+
00: 02 10 23 5a 00 04 10 00 00 00 06 08 00 00 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 58 14 00 50
30: 00 00 00 00 40 00 00 00 00 00 00 00 00 01 00 00
40: 0f 54 0b 01 01 00 c3 fe 00 00 00 00 00 00 00 00
50: 00 34 20 00 05 64 81 00 0c f0 ef fe 00 00 00 00
60: 71 41 00 00 08 00 03 a8 58 14 00 50 01 01 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 82 00 00 00 00 00 00 00 11 00 05 00 00 00 00 00


Hope this helps!

Thanks,
Andreas
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
  2014-10-22 15:34 ` Andreas Hartmann
@ 2014-10-22 16:02 ` Alex Williamson
  2014-10-22 16:20 ` Andreas Hartmann
From: Alex Williamson @ 2014-10-22 16:02 UTC
To: Andreas Hartmann; +Cc: Bjorn Helgaas, linux-pci

On Wed, 2014-10-22 at 17:34 +0200, Andreas Hartmann wrote:
> Alex Williamson wrote:
> > Hi Andreas,
> >
> > On Fri, 2014-10-17 at 03:04 +0200, Andreas Hartmann wrote:
> >> Hello Alex,
> >>
> >> Alex Williamson wrote:
> >>> Hi Andreas,
> >> [...]
> >>> Sorry for the breakage. Is it possible to run lspci on the device in a
> >>> loop from the host and capture whether we're failing to restore some of
> >>> the VC bits to their previous state?
> >>
> >>> Does the problem also occur if you
> >>> unbind from host driver,
> >>
> >> The machine is booted w/ blacklisted ath9k. Then, the device is bound to
> >> vfio:
> >>
> >> echo "168c 0030" > /sys/bus/pci/drivers/vfio-pci/new_id
> >> echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
> >> echo 0000:03:00.0 > /sys/bus/pci/drivers/vfio-pci/bind
> >>
> >> Afterwards the VM is started -> hang.
> >>
> >> W/o starting the VM, I can bind it to vfio and unbind it from vfio w/o
> >> any problem.
> >>
> >>> echo 1 > reset in pci-sysfs,
> >>
> >> echo 1 > /sys/bus/pci/devices/0000:03:00.0/reset works w/o any problem
> >> while bound to vfio. Even after unbinding from vfio and rebinding to
> >> vfio again ... .
> >>
> >>> and re-bind to the
> >>
> >> Do you mean loading ath9k in host system after unbinding from vfio? If
> >> yes: Works w/o any problem. It's even possible to reset it or do an
> >> ifconfig wlan0 up, ifconfig wlan0 down, rmmod ath9k, bind it to vfio
> >> again and reset it, ....
> >>
> >> Looks like the hang is triggered only by qemu-system-x86_64 when
> >> starting the VM.
> >>
> >>> host? I'll also try to reproduce on my 990fx system, but I won't be
> >>> able to do that until next week due to travel. Thanks,
> >
> > Could you send me the lspci -vvvxxxx for the device and parent root
> > port? Thanks,
>
>
> Done with kernel 3.12.28 in host while the device was used in VM:
>
> # lspci -vt
> -[0000:00]-+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B)
>            +-00.2  Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory Management Unit (IOMMU)
>            +-02.0-[01]--+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] Turks PRO [Radeon HD 6570/7570]
>            |            \-00.1  Advanced Micro Devices, Inc. [AMD/ATI] Turks/Whistler HDMI Audio [Radeon HD 6000 Series]
>            +-04.0-[02]----00.0  Etron Technology, Inc. EJ168 USB 3.0 Host Controller
>            +-05.0-[03]----00.0  Qualcomm Atheros AR93xx Wireless Network Adapter
>            +-09.0-[04]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
>            +-0a.0-[05]----00.0  Etron Technology, Inc. EJ168 USB 3.0 Host Controller
>            +-11.0  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]
>            +-12.0  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
>            +-12.2  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
>            +-13.0  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
>            +-13.2  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
>            +-14.0  Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller
>            +-14.2  Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA)
>            +-14.3  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller
>            +-14.4-[06]--+-06.0  Intel Corporation 82557/8/9/0/1 Ethernet Pro 100
>            |            \-0e.0  VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller
>            +-14.5  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
>            +-15.0-[07]--
>            +-16.0  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
>            +-16.2  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
>            +-18.0  Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 0
>            +-18.1  Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 1
>            +-18.2  Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 2
>            +-18.3  Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 3
>            +-18.4  Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 4
>            \-18.5  Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 5
>
>
> # lspci -s 03:00 -vvvxxxx
> 03:00.0 Network controller: Qualcomm Atheros AR93xx Wireless Network Adapter (rev 01)

[snip]

>
>
> I'm not sure what you mean by "parent root port". Could it be this:

No, it's 00:05.0
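For reference, the parent of 03:00.0 can also be looked up without reading the `lspci -vt` tree by eye. The following is only a sketch: it relies on `/sys/bus/pci/devices/<BDF>` being a symlink into `/sys/devices/pci0000:00/<parent BDF>/<BDF>`, so the directory one level up is the upstream bridge. The function name `pci_parent` and its optional sysfs-root argument are made up for illustration; the second argument exists only so the function can be tried against a mock tree.

```shell
#!/bin/sh
# pci_parent BDF [SYSFS_ROOT]
# Print the PCI address of the bridge directly upstream of BDF.
# SYSFS_ROOT defaults to /sys; it is a hypothetical parameter that lets
# the function be exercised against a mock directory tree.
pci_parent() {
    pp_link="${2:-/sys}/bus/pci/devices/$1"
    # Resolve the symlink to the real device directory.
    pp_path=$(readlink -f "$pp_link") || return 1
    [ -d "$pp_path" ] || return 1
    pp_parent=$(basename "$(dirname "$pp_path")")
    # A device sitting directly on the root bus has the host bridge
    # directory (pciDDDD:BB) as its parent, which is not a PCI function.
    case "$pp_parent" in
        pci*) return 1 ;;
        *) printf '%s\n' "$pp_parent" ;;
    esac
}
```

On the machine from this thread, `pci_parent 0000:03:00.0` would be expected to print `0000:00:05.0`, matching the `+-05.0-[03]----00.0` branch of the tree above; the result can then be passed to `lspci -s ... -vvvxxxx`.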
* Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
  2014-10-22 16:02 ` Alex Williamson
@ 2014-10-22 16:20 ` Andreas Hartmann
From: Andreas Hartmann @ 2014-10-22 16:20 UTC
To: Alex Williamson; +Cc: Bjorn Helgaas, linux-pci

On Wed, 22 Oct 2014 10:02:29 -0600,
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Wed, 2014-10-22 at 17:34 +0200, Andreas Hartmann wrote:
> > Alex Williamson wrote:
> > > Hi Andreas,
> > >
> > > On Fri, 2014-10-17 at 03:04 +0200, Andreas Hartmann wrote:
> > >> Hello Alex,
> > >>
> > >> Alex Williamson wrote:
> > >>> Hi Andreas,
> > >> [...]
> > >>> Sorry for the breakage. Is it possible to run lspci on the device in a
> > >>> loop from the host and capture whether we're failing to restore some of
> > >>> the VC bits to their previous state?
> > >>
> > >>> Does the problem also occur if you
> > >>> unbind from host driver,
> > >>
> > >> The machine is booted w/ blacklisted ath9k. Then, the device is bound to
> > >> vfio:
> > >>
> > >> echo "168c 0030" > /sys/bus/pci/drivers/vfio-pci/new_id
> > >> echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
> > >> echo 0000:03:00.0 > /sys/bus/pci/drivers/vfio-pci/bind
> > >>
> > >> Afterwards the VM is started -> hang.
> > >>
> > >> W/o starting the VM, I can bind it to vfio and unbind it from vfio w/o
> > >> any problem.
> > >>
> > >>> echo 1 > reset in pci-sysfs,
> > >>
> > >> echo 1 > /sys/bus/pci/devices/0000:03:00.0/reset works w/o any problem
> > >> while bound to vfio. Even after unbinding from vfio and rebinding to
> > >> vfio again ... .
> > >>
> > >>> and re-bind to the
> > >>
> > >> Do you mean loading ath9k in host system after unbinding from vfio? If
> > >> yes: Works w/o any problem. It's even possible to reset it or do an
> > >> ifconfig wlan0 up, ifconfig wlan0 down, rmmod ath9k, bind it to vfio
> > >> again and reset it, ....
> > >>
> > >> Looks like the hang is triggered only by qemu-system-x86_64 when
> > >> starting the VM.
> > >>
> > >>> host? I'll also try to reproduce on my 990fx system, but I won't be
> > >>> able to do that until next week due to travel. Thanks,
> > >
> > > Could you send me the lspci -vvvxxxx for the device and parent root
> > > port? Thanks,
> >
> >
> > Done with kernel 3.12.28 in host while the device was used in VM:
> >
> > # lspci -vt
> > -[0000:00]-+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI to PCI bridge (external gfx0 port B)
> >            +-00.2  Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory Management Unit (IOMMU)
> >            +-02.0-[01]--+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] Turks PRO [Radeon HD 6570/7570]
> >            |            \-00.1  Advanced Micro Devices, Inc. [AMD/ATI] Turks/Whistler HDMI Audio [Radeon HD 6000 Series]
> >            +-04.0-[02]----00.0  Etron Technology, Inc. EJ168 USB 3.0 Host Controller
> >            +-05.0-[03]----00.0  Qualcomm Atheros AR93xx Wireless Network Adapter
> >            +-09.0-[04]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
> >            +-0a.0-[05]----00.0  Etron Technology, Inc. EJ168 USB 3.0 Host Controller
> >            +-11.0  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]
> >            +-12.0  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> >            +-12.2  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
> >            +-13.0  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> >            +-13.2  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
> >            +-14.0  Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller
> >            +-14.2  Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA)
> >            +-14.3  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller
> >            +-14.4-[06]--+-06.0  Intel Corporation 82557/8/9/0/1 Ethernet Pro 100
> >            |            \-0e.0  VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller
> >            +-14.5  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
> >            +-15.0-[07]--
> >            +-16.0  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> >            +-16.2  Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
> >            +-18.0  Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 0
> >            +-18.1  Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 1
> >            +-18.2  Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 2
> >            +-18.3  Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 3
> >            +-18.4  Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 4
> >            \-18.5  Advanced Micro Devices, Inc. [AMD] Family 15h Processor Function 5
> >
> >
> > # lspci -s 03:00 -vvvxxxx
> > 03:00.0 Network controller: Qualcomm Atheros AR93xx Wireless Network Adapter (rev 01)
>
> [snip]
>
> >
> >
> > I'm not sure what you mean by "parent root port". Could it be this:
>
> No, it's 00:05.0

# lspci -s 00:05.0 -vvvxxxx
00:05.0 PCI bridge: Advanced Micro Devices, Inc.
[AMD/ATI] RD890 PCI to PCI bridge (PCI express gpp port E) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
	I/O behind bridge: 0000c000-0000cfff
	Memory behind bridge: fdb00000-fdbfffff
	Prefetchable memory behind bridge: 00000000fda00000-00000000fdafffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Express (v2) Root Port (Slot+), MSI 00
		DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag+ RBE+ FLReset-
		DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap: Port #1, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <1us, L1 <8us
			ClockPM- Surprise- LLActRep+ BwNot+
		LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
		SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #5, PowerLimit 75.000W; Interlock- NoCompl+
		SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet+ LinkState+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd+
		DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Capabilities: [b0] Subsystem: Gigabyte Technology Co., Ltd Device 5000
	Capabilities: [b8] HyperTransport: MSI Mapping Enable+ Fixed+
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [190 v1] Access Control Services
		ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans+
		ACSCtl: SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans-
	Kernel driver in use: pcieport
	Kernel modules: shpchp
00: 02 10 19 5a 07 00 10 00 00 00 04 06 10 00 01 00
10: 00 00 00 00 00 00 00 00 00 03 03 00 c1 c1 00 20
20: b0 fd b0 fd a1 fd a1 fd 00 00 00 00 00 00 00 00
30: 00 00 00 00 50 00 00 00 00 00 00 00 05 01 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 01 58 03 c8 00 00 00 00 10 a0 42 01 20 80 00 00
60: 10 08 00 00 12 cc 31 01 40 00 11 70 80 25 2c 00
70: 00 00 48 01 00 00 01 00 00 00 00 00 3f 00 00 00
80: 06 00 00 00 00 00 00 00 42 00 01 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 05 b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 0d b8 00 00 58 14 00 50 08 00 03 a8 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 100: 0b 00 01 19 01 00 01 01 00 58 22 00 00 00 00 00 110: 02 00 01 19 00 00 00 00 00 00 00 00 00 00 00 00 120: 01 00 00 00 ff 00 00 80 00 00 00 00 01 00 00 00 130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 140: 03 00 01 19 00 00 00 00 00 00 00 00 00 00 00 00 150: 01 00 01 19 00 00 00 00 00 00 00 00 30 20 06 00 160: 00 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 190: 0d 00 01 00 5f 00 1d 00 00 00 00 00 00 00 00 00 1a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 1f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 250: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 260: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 270: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 350: 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 360: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 410: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 420: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 470: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 530: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 540: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 550: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 560: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 570: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5b0: 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 5c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 5f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 610: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 620: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 630: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 640: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 650: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 660: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 670: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 690: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 710: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 720: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 730: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 740: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 750: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
810: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 820: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 830: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 840: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 850: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 860: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 870: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 890: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 910: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 990: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 9f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a60: 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 a70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 aa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ad0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ca0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cc0: 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 cd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ce0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 cf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 da0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 db0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 dc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 dd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 de0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 df0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ea0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 eb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ef0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f20: 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Thanks, Andreas
Thread overview: 42+ messages
2014-09-23 19:03 Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) Andreas Hartmann
2014-09-23 20:07 ` Alex Williamson
2014-09-24 14:54 ` Andreas Hartmann
2014-09-24 17:16 ` Andreas Hartmann
2014-10-10 9:39 ` Andreas Hartmann
2014-10-10 14:37 ` Bjorn Helgaas
2014-10-10 14:49 ` Andreas Hartmann
2014-10-10 15:55 ` Bjorn Helgaas
2014-10-10 16:09 ` Andreas Hartmann
2014-10-10 16:41 ` Bjorn Helgaas
2014-10-10 22:32 ` Andreas Hartmann
2014-10-10 22:54 ` Bjorn Helgaas
2014-10-11 6:20 ` Andreas Hartmann
2014-10-15 8:04 ` Alex Williamson
2014-10-17 1:04 ` Andreas Hartmann
2014-10-21 21:06 ` Alex Williamson
2014-10-21 21:32 ` Alex Williamson
2014-10-22 16:22 ` Andreas Hartmann
2014-10-22 20:36 ` Alex Williamson
2014-10-23 16:00 ` Andreas Hartmann
2014-10-23 16:33 ` Alex Williamson
2014-10-23 17:12 ` Andreas Hartmann
2014-10-23 17:33 ` Andreas Hartmann
2014-10-23 19:37 ` Alex Williamson
2014-10-24 14:21 ` Andreas Hartmann
2014-10-25 6:03 ` Andreas Hartmann
2014-10-28 21:51 ` Alex Williamson
2014-10-29 16:47 ` Andreas Hartmann
2014-10-29 17:44 ` Alex Williamson
2014-10-29 17:57 ` Andreas Hartmann
2014-10-29 18:16 ` Alex Williamson
2014-10-29 19:43 ` Andreas Hartmann
2014-10-29 20:50 ` Alex Williamson
2014-10-29 21:35 ` Andreas Hartmann
2014-10-30 16:35 ` Andreas Hartmann
2014-10-30 16:58 ` Alex Williamson
2014-10-30 19:09 ` Andreas Hartmann
2014-10-30 19:45 ` Alex Williamson
2014-10-30 20:21 ` Andreas Hartmann
2014-10-22 15:34 ` Andreas Hartmann
2014-10-22 16:02 ` Alex Williamson
2014-10-22 16:20 ` Andreas Hartmann