All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andreas Hartmann <andihartmann@freenet.de>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: linux-pci <linux-pci@vger.kernel.org>
Subject: Re: Hard and silent lock up since linux 3.14 with PCIe pass through (vfio)
Date: Fri, 10 Oct 2014 11:39:36 +0200	[thread overview]
Message-ID: <5437A958.3000201@maya.org> (raw)
In-Reply-To: <1411502866.24563.8.camel@ul30vt.home>

shortly: I retested w/ qemu 2.1.0 and Linux 3.17.0 - no change in behaviour.

Alex Williamson wrote:
> On Tue, 2014-09-23 at 21:03 +0200, Andreas Hartmann wrote:
>> Hello!
>>
>> Since long time now, I'm using w/o any problem PCIe pass through with a
>> Gigabyte GA-990XA-UD3/GA-990XA-UD3 mainboard (AMD 990X chipset) and
>> enabled IOMMU with vfio-pci.
>>
>> The last kernel working w/o any problem is kernel 3.13.7 (I didn't use
>> .8 and .9, but I do not think they would have been problematic).
>>
>> Since 3.14.19 (I didn't test any 3.14 kernel before) I'm encountering a
>> hard and silent lock up of the complete machine when starting the VM
>> with the PCIe card passed through.
>>
>> That's the relevant PCIe card, which locks up the machine (here
>> running w/ 3.12.28) when passed to the VM:
>>
>> 03:00.0 Network controller: Qualcomm Atheros AR93xx Wireless Network Adapter (rev 01)
>>         Subsystem: Qualcomm Atheros Device 3112
>>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>         Latency: 0, Cache Line Size: 64 bytes
>>         Interrupt: pin A routed to IRQ 17
>>         Region 0: Memory at fdbc0000 (64-bit, non-prefetchable) [size=128K]
>>         Expansion ROM at fda00000 [size=64K]
>>         Capabilities: [40] Power Management version 3
>>                 Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
>>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>>         Capabilities: [50] MSI: Enable- Count=1/4 Maskable+ 64bit+
>>                 Address: 0000000000000000  Data: 0000
>>                 Masking: 00000000  Pending: 00000000
>>         Capabilities: [70] Express (v2) Endpoint, MSI 00
>>                 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
>>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>>                 DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
>>                         MaxPayload 128 bytes, MaxReadReq 512 bytes
>>                 DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
>>                 LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <2us, L1 <64us
>>                         ClockPM- Surprise- LLActRep- BwNot-
>>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
>>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>                 LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>                 DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
>>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>>                 LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
>>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>>                          Compliance De-emphasis: -6dB
>>                 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>>                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>>         Capabilities: [100 v1] Advanced Error Reporting
>>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout+ NonFatalErr+
>>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>>                 AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
>>         Capabilities: [140 v1] Virtual Channel
>>                 Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
>>                 Arb:    Fixed- WRR32- WRR64- WRR128-
>>                 Ctrl:   ArbSelect=Fixed
>>                 Status: InProgress-
>>                 VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
>>                         Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
>>                         Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
>>                         Status: NegoPending- InProgress-
>>         Capabilities: [300 v1] Device Serial Number 00-00-00-00-00-00-00-00
>>         Kernel driver in use: vfio-pci
>>         Kernel modules: ath9k
>>
>>
>> Unbinding it works w/o any problem. The lock up encounters about 4 s
>> after the start of the VM.
>>
>> On 3.12.x, I can see the following message on the error terminal when
>> starting the VM: 
>> vfio-pci: 03:00.0: invalid ROM contents.
>>
>> I compared AMD-Vi debug output between 3.12 and 3.14, but couldn't see
>> any difference. I compared /proc/interrupts between 3.12 and 3.14
>> and couldn't see any difference too so far.
>>
>>
>> qemu version I'm using is 1.7.0.
>>
>>
>> It is strange(?), that a second VM using PCI (legacy) pass through works
>> w/o any problem. I tried to start the problematic VM even w/o running
>> this VM - same result: machine is locked up hard.
>>
>>
>> Do you have any idea, what could be going on there? Or how to debug it
>> to see what happened?

> There weren't many vfio changes between 3.13 and 3.14.

It could be a pci problem, too? It is strange, that there is no problem
with the pci-card, but the pcie card hangs the machine!

> Have you tested whether the problem still occurs on 3.16 +

Same problem with 3.17.0

> newer QEMU?

Same problem With qemu 2.1.0.

>  Maybe also remove the ROM from the equation with the
> rombar=0 option for the vfio-pci device in QEMU.

Same problem :-(. The machine really is completely dead: it even pings
any more.



Regards,
Andreas

  parent reply	other threads:[~2014-10-10  9:45 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-23 19:03 Hard and silent lock up since linux 3.14 with PCIe pass through (vfio) Andreas Hartmann
2014-09-23 20:07 ` Alex Williamson
2014-09-24 14:54   ` Andreas Hartmann
2014-09-24 17:16     ` Andreas Hartmann
2014-10-10  9:39   ` Andreas Hartmann [this message]
2014-10-10 14:37     ` Bjorn Helgaas
2014-10-10 14:49       ` Andreas Hartmann
2014-10-10 15:55         ` Bjorn Helgaas
2014-10-10 16:09           ` Andreas Hartmann
2014-10-10 16:41             ` Bjorn Helgaas
2014-10-10 22:32               ` Andreas Hartmann
2014-10-10 22:54                 ` Bjorn Helgaas
2014-10-11  6:20                   ` Andreas Hartmann
2014-10-15  8:04                     ` Alex Williamson
2014-10-17  1:04                       ` Andreas Hartmann
2014-10-21 21:06                         ` Alex Williamson
2014-10-21 21:32                           ` Alex Williamson
2014-10-22 16:22                             ` Andreas Hartmann
2014-10-22 20:36                               ` Alex Williamson
2014-10-23 16:00                                 ` Andreas Hartmann
2014-10-23 16:33                                   ` Alex Williamson
2014-10-23 17:12                                     ` Andreas Hartmann
2014-10-23 17:33                                     ` Andreas Hartmann
2014-10-23 19:37                                       ` Alex Williamson
2014-10-24 14:21                                         ` Andreas Hartmann
2014-10-25  6:03                                         ` Andreas Hartmann
2014-10-28 21:51                                           ` Alex Williamson
2014-10-29 16:47                                             ` Andreas Hartmann
2014-10-29 17:44                                               ` Alex Williamson
2014-10-29 17:57                                                 ` Andreas Hartmann
2014-10-29 18:16                                                   ` Alex Williamson
2014-10-29 19:43                                                     ` Andreas Hartmann
2014-10-29 20:50                                                       ` Alex Williamson
2014-10-29 21:35                                                         ` Andreas Hartmann
2014-10-30 16:35                                                         ` Andreas Hartmann
2014-10-30 16:58                                                           ` Alex Williamson
2014-10-30 19:09                                                             ` Andreas Hartmann
2014-10-30 19:45                                                               ` Alex Williamson
2014-10-30 20:21                                                                 ` Andreas Hartmann
2014-10-22 15:34                           ` Andreas Hartmann
2014-10-22 16:02                             ` Alex Williamson
2014-10-22 16:20                               ` Andreas Hartmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5437A958.3000201@maya.org \
    --to=andihartmann@freenet.de \
    --cc=alex.williamson@redhat.com \
    --cc=linux-pci@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.