linux-kernel.vger.kernel.org archive mirror
* DMAR faults from unrelated device when vfio is used
@ 2013-02-04 10:10 David Gstir
  2013-02-04 15:49 ` Alex Williamson
  0 siblings, 1 reply; 13+ messages in thread
From: David Gstir @ 2013-02-04 10:10 UTC
  To: kvm
  Cc: dwmw2, alex.williamson, iommu, linux-kernel, airlied, dri-devel,
	daniel.vetter, richard

[-- Attachment #1: Type: text/plain, Size: 937 bytes --]

Hi!

I get the following error messages over and over again when using vfio in qemu-kvm:

[ 1692.021403] dmar: DMAR:[DMA Read] Request device [00:02.0] fault addr 1a45aa9000 
[ 1692.021403] DMAR:[fault reason 12] non-zero reserved fields in PTE
[ 1692.021416] dmar: DRHD: handling fault status reg 2

This pci device is the graphics card, which I did not assign to qemu! I did assign the following devices:
00:1a.0, 00:1b.0, 00:1c.0, 00:1c.6, 00:1d.0, 03:00.0.

The error occurs at random and is not reproducible every time. It happens about every third reboot. 
I'm running qemu-kvm 1.3.0 (kvm-1.3.0-187.3), kernel 3.8.0-rc5 and Windows 7 as guest OS. The hardware uses an Intel IOMMU. See attachments for output of lspci, and details on iommu groups.

I'm not sure if this problem originates from qemu, kvm, vfio or the GPU driver.
Do you have any hints how to debug this further?


Thanks,
David

PS: please cc me, I'm not subscribed.

[-- Attachment #2: lspci.txt --]
[-- Type: text/plain, Size: 22840 bytes --]

# /sbin/lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation 2nd Generation Core Processor Family DRAM Controller [8086:0100] (rev 09)
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port [8086:0101] (rev 09)
00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0102] (rev 09)
00:16.0 Communication controller [0780]: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 [8086:1c3a] (rev 04)
00:16.2 IDE interface [0101]: Intel Corporation 6 Series/C200 Series Chipset Family IDE-r Controller [8086:1c3c] (rev 04)
00:16.3 Serial controller [0700]: Intel Corporation 6 Series/C200 Series Chipset Family KT Controller [8086:1c3d] (rev 04)
00:19.0 Ethernet controller [0200]: Intel Corporation 82579LM Gigabit Network Connection [8086:1502] (rev 04)
00:1a.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 [8086:1c2d] (rev 04)
00:1b.0 Audio device [0403]: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller [8086:1c20] (rev 04)
00:1c.0 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 [8086:1c10] (rev b4)
00:1c.6 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 7 [8086:1c1c] (rev b4)
00:1d.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 [8086:1c26] (rev 04)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e] (rev a4)
00:1f.0 ISA bridge [0601]: Intel Corporation Q67 Express Chipset Family LPC Controller [8086:1c4e] (rev 04)
00:1f.2 SATA controller [0106]: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller [8086:1c02] (rev 04)
00:1f.3 SMBus [0c05]: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller [8086:1c22] (rev 04)
03:00.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0 Host Controller [1033:0194] (rev ff)




# lspci -vv
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
	Subsystem: Intel Corporation Device 2008
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
	Latency: 0
	Capabilities: [e0] Vendor Specific Information: Len=0c <?>

00:01.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [88] Subsystem: Intel Corporation Device 2008
	Capabilities: [80] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee002d8  Data: 0000
	Capabilities: [a0] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #2, Speed 5GT/s, Width x16, ASPM L0s L1, Latency L0 <1us, L1 <4us
			ClockPM- Surprise- LLActRep- BwNot+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x0, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #0, PowerLimit 0.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
			Changed: MRL- PresDet- LinkState-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis- ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- ARIFwd-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
	Capabilities: [140 v1] Root Complex Link
		Desc:	PortNumber=02 ComponentID=01 EltType=Config
		Link0:	Desc:	TargetPort=00 TargetComponent=01 AssocRCRB- LinkType=MemMapped LinkValid+
			Addr:	00000000fed19000
	Kernel driver in use: pcieport

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
	Subsystem: Intel Corporation Device 2008
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 51
	Region 0: Memory at fe000000 (64-bit, non-prefetchable) [size=4M]
	Region 2: Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Region 4: I/O ports at f000 [size=64]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee00018  Data: 0000
	Capabilities: [d0] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [a4] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: i915

00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04)
	Subsystem: Intel Corporation Device 2008
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 52
	Region 0: Memory at fe62a000 (64-bit, non-prefetchable) [size=16]
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [8c] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee00378  Data: 0000
	Kernel driver in use: mei

00:16.2 IDE interface: Intel Corporation 6 Series/C200 Series Chipset Family IDE-r Controller (rev 04) (prog-if 85 [Master SecO PriO])
	Subsystem: Intel Corporation Device 2008
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin C routed to IRQ 18
	Region 0: I/O ports at f130 [size=8]
	Region 1: I/O ports at f120 [size=4]
	Region 2: I/O ports at f110 [size=8]
	Region 3: I/O ports at f100 [size=4]
	Region 4: I/O ports at f0f0 [size=16]
	Capabilities: [c8] Power Management version 3
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Kernel driver in use: ata_generic

00:16.3 Serial controller: Intel Corporation 6 Series/C200 Series Chipset Family KT Controller (rev 04) (prog-if 02 [16550])
	Subsystem: Intel Corporation Device 2008
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin B routed to IRQ 17
	Region 0: I/O ports at f0e0 [size=8]
	Region 1: Memory at fe629000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: [c8] Power Management version 3
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Kernel driver in use: serial

00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
	Subsystem: Intel Corporation Device 2008
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 53
	Region 0: Memory at fe600000 (32-bit, non-prefetchable) [size=128K]
	Region 1: Memory at fe628000 (32-bit, non-prefetchable) [size=4K]
	Region 2: I/O ports at f080 [size=32]
	Capabilities: [c8] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee003b8  Data: 0000
	Capabilities: [e0] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: e1000e

00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 04) (prog-if 20 [EHCI])
	Subsystem: Intel Corporation Device 2008
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 16
	Region 0: [virtual] Memory at fe627000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D3 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Debug port: BAR=1 offset=00a0
	Capabilities: [98] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: vfio-pci

00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller (rev 04)
	Subsystem: Intel Corporation Device 2008
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 22
	Region 0: Memory at fe620000 (64-bit, non-prefetchable) [disabled] [size=16K]
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D3 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [60] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [70] Express (v1) Root Complex Integrated Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE- FLReset+
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed unknown, Width x0, ASPM unknown, Latency L0 <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; Disabled- Retrain- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed unknown, Width x0, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
		VC1:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=1 ArbSelect=Fixed TC/VC=00
			Status:	NegoPending- InProgress-
	Capabilities: [130 v1] Root Complex Link
		Desc:	PortNumber=0f ComponentID=00 EltType=Config
		Link0:	Desc:	TargetPort=00 TargetComponent=00 AssocRCRB- LinkType=MemMapped LinkValid+
			Addr:	00000000fed1c000
	Kernel driver in use: vfio-pci

00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b4) (prog-if 00 [Normal decode])
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #1, Speed 5GT/s, Width x4, ASPM L0s L1, Latency L0 <1us, L1 <4us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x0, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #0, PowerLimit 25.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
			Changed: MRL- PresDet- LinkState-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range BC, TimeoutDis+ ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- ARIFwd-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: fee002f8  Data: 0000
	Capabilities: [90] Subsystem: Intel Corporation Device 2008
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D3 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: vfio-pci

00:1c.6 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 7 (rev b4) (prog-if 00 [Normal decode])
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
	Memory behind bridge: fe500000-fe5fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #7, Speed 5GT/s, Width x1, ASPM L1, Latency L0 <512ns, L1 <4us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #6, PowerLimit 10.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet+ LinkState+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range BC, TimeoutDis+ ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- ARIFwd-
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: fee00318  Data: 0000
	Capabilities: [90] Subsystem: Intel Corporation Device 2008
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D3 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: vfio-pci

00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
	Subsystem: Intel Corporation Device 2008
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 23
	Region 0: [virtual] Memory at fe626000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D3 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Debug port: BAR=1 offset=00a0
	Capabilities: [98] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: vfio-pci

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a4) (prog-if 01 [Subtractive decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Bus: primary=00, secondary=04, subordinate=04, sec-latency=32
	Memory behind bridge: fe400000-fe4fffff
	Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [50] Subsystem: Intel Corporation Device 2008

00:1f.0 ISA bridge: Intel Corporation Q67 Express Chipset Family LPC Controller (rev 04)
	Subsystem: Intel Corporation Device 2008
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Capabilities: [e0] Vendor Specific Information: Len=0c <?>
	Kernel driver in use: lpc_ich

00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller (rev 04) (prog-if 01 [AHCI 1.0])
	Subsystem: Intel Corporation Device 2008
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin B routed to IRQ 45
	Region 0: I/O ports at f0d0 [size=8]
	Region 1: I/O ports at f0c0 [size=4]
	Region 2: I/O ports at f0b0 [size=8]
	Region 3: I/O ports at f0a0 [size=4]
	Region 4: I/O ports at f060 [size=32]
	Region 5: Memory at fe625000 (32-bit, non-prefetchable) [size=2K]
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee00358  Data: 0000
	Capabilities: [70] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004
	Capabilities: [b0] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: ahci

00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 04)
	Subsystem: Intel Corporation Device 2008
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin C routed to IRQ 18
	Region 0: Memory at fe624000 (64-bit, non-prefetchable) [size=256]
	Region 4: I/O ports at f040 [size=32]
	Kernel driver in use: i801_smbus

03:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev ff) (prog-if ff)
	!!! Unknown header type 7f
	Kernel driver in use: vfio-pci

[-- Attachment #3: iommu_groups.txt --]
[-- Type: text/plain, Size: 3204 bytes --]

$ ls -l /sys/kernel/iommu_groups/*/devices
/sys/kernel/iommu_groups/0/devices:
total 0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:00.0 -> ../../../../devices/pci0000:00/0000:00:00.0

/sys/kernel/iommu_groups/10/devices:
total 0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1f.0 -> ../../../../devices/pci0000:00/0000:00:1f.0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1f.2 -> ../../../../devices/pci0000:00/0000:00:1f.2
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1f.3 -> ../../../../devices/pci0000:00/0000:00:1f.3

/sys/kernel/iommu_groups/1/devices:
total 0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:01.0 -> ../../../../devices/pci0000:00/0000:00:01.0

/sys/kernel/iommu_groups/2/devices:
total 0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:02.0 -> ../../../../devices/pci0000:00/0000:00:02.0

/sys/kernel/iommu_groups/3/devices:
total 0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:16.0 -> ../../../../devices/pci0000:00/0000:00:16.0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:16.2 -> ../../../../devices/pci0000:00/0000:00:16.2
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:16.3 -> ../../../../devices/pci0000:00/0000:00:16.3

/sys/kernel/iommu_groups/4/devices:
total 0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:19.0 -> ../../../../devices/pci0000:00/0000:00:19.0

/sys/kernel/iommu_groups/5/devices:
total 0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1a.0 -> ../../../../devices/pci0000:00/0000:00:1a.0

/sys/kernel/iommu_groups/6/devices:
total 0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1b.0 -> ../../../../devices/pci0000:00/0000:00:1b.0

/sys/kernel/iommu_groups/7/devices:
total 0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1c.0 -> ../../../../devices/pci0000:00/0000:00:1c.0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1c.6 -> ../../../../devices/pci0000:00/0000:00:1c.6
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:03:00.0 -> ../../../../devices/pci0000:00/0000:00:1c.6/0000:03:00.0

/sys/kernel/iommu_groups/8/devices:
total 0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1d.0 -> ../../../../devices/pci0000:00/0000:00:1d.0

/sys/kernel/iommu_groups/9/devices:
total 0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1e.0 -> ../../../../devices/pci0000:00/0000:00:1e.0


# ls -l /sys/bus/pci/drivers/vfio-pci/
total 0
lrwxrwxrwx 1 root root    0 Feb  4 10:32 0000:00:1a.0 -> ../../../../devices/pci0000:00/0000:00:1a.0
lrwxrwxrwx 1 root root    0 Feb  4 10:32 0000:00:1b.0 -> ../../../../devices/pci0000:00/0000:00:1b.0
lrwxrwxrwx 1 root root    0 Feb  4 10:32 0000:00:1c.0 -> ../../../../devices/pci0000:00/0000:00:1c.0
lrwxrwxrwx 1 root root    0 Feb  4 10:32 0000:00:1c.6 -> ../../../../devices/pci0000:00/0000:00:1c.6
lrwxrwxrwx 1 root root    0 Feb  4 10:32 0000:00:1d.0 -> ../../../../devices/pci0000:00/0000:00:1d.0
lrwxrwxrwx 1 root root    0 Feb  4 10:32 0000:03:00.0 -> ../../../../devices/pci0000:00/0000:00:1c.6/0000:03:00.0
--w------- 1 root root 4096 Feb  4 10:32 bind
lrwxrwxrwx 1 root root    0 Feb  4 10:32 module -> ../../../../module/vfio_pci
--w------- 1 root root 4096 Feb  4 09:46 new_id
--w------- 1 root root 4096 Feb  4 10:32 remove_id
--w------- 1 root root 4096 Feb  4 09:46 uevent
--w------- 1 root root 4096 Feb  4 10:32 unbind


* Re: DMAR faults from unrelated device when vfio is used
  2013-02-04 10:10 DMAR faults from unrelated device when vfio is used David Gstir
@ 2013-02-04 15:49 ` Alex Williamson
  2013-02-05 13:31   ` David Gstir
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Williamson @ 2013-02-04 15:49 UTC
  To: David Gstir
  Cc: kvm, dwmw2, iommu, linux-kernel, airlied, dri-devel,
	daniel.vetter, richard

On Mon, 2013-02-04 at 11:10 +0100, David Gstir wrote:
> Hi!
> 
> I get the following error messages over and over again when using vfio
> in qemu-kvm:
> 
> [ 1692.021403] dmar: DMAR:[DMA Read] Request device [00:02.0] fault addr 1a45aa9000 
> [ 1692.021403] DMAR:[fault reason 12] non-zero reserved fields in PTE
> [ 1692.021416] dmar: DRHD: handling fault status reg 2
> 
> This pci device is the graphics card, which I did not assign to qemu!
> I did assign the following devices:
> 00:1a.0, 00:1b.0, 00:1c.0, 00:1c.6, 00:1d.0, 03:00.0.

Piecing together your logs:

iommu_group 5
00:1a.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 [8086:1c2d] (rev 04)
iommu_group 6
00:1b.0 Audio device [0403]: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller [8086:1c20] (rev 04)
iommu_group 7
00:1c.0 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 [8086:1c10] (rev b4)
00:1c.6 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 7 [8086:1c1c] (rev b4)
03:00.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0 Host Controller [1033:0194] (rev ff)
iommu_group 8
00:1d.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 [8086:1c26] (rev 04)
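
The grouping above can be reproduced from sysfs. What follows is a minimal Python sketch, assuming the /sys/kernel/iommu_groups/<N>/devices layout shown in the iommu_groups.txt attachment. The function names iommu_group_map and groups_for are illustrative and not part of any existing tool.

```python
import os


def iommu_group_map(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted list of PCI addresses} read from sysfs."""
    groups = {}
    for name in os.listdir(root):
        devices_dir = os.path.join(root, name, "devices")
        # Each group is a numbered directory holding one entry per device.
        if name.isdigit() and os.path.isdir(devices_dir):
            groups[int(name)] = sorted(os.listdir(devices_dir))
    return groups


def groups_for(devices, group_map):
    """Return the sorted group numbers containing any of the given devices."""
    wanted = set(devices)
    return sorted(g for g, members in group_map.items() if wanted & set(members))


if __name__ == "__main__":
    root = "/sys/kernel/iommu_groups"
    # Only touch sysfs when the host actually exposes IOMMU groups.
    if os.path.isdir(root):
        for group, members in sorted(iommu_group_map(root).items()):
            print(group, " ".join(members))
```

Calling groups_for with the full addresses of the assigned devices (for example "0000:03:00.0") shows which groups they share with other devices, such as the two root ports in group 7.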

Can you clarify what you mean by assign?  Are you actually assigning the
root ports to the qemu guest (1c.0 & 1c.6)?  vfio will require they be
owned by vfio-pci to make use of 3:00.0, but assigning them to the guest
is not recommended.  Can you provide your qemu command line?  We need
to re-visit how to handle pcieport devices with vfio-pci, perhaps
white-listing it as a vfio "compatible" driver, but this still should
not interfere with devices external to the group.

The DMAR fault address looks pretty bogus unless you happen to have
100GB+ of ram in the system.
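
The "100GB+" estimate follows from reading the logged fault address as a hexadecimal byte address. Here is a small Python sketch of that arithmetic; the helper names are illustrative, and comparing the address against installed RAM is only the same rough sanity check made above, not a statement about how the IOMMU maps addresses.

```python
GIB = 1 << 30  # bytes per GiB

# "fault addr 1a45aa9000" from the DMAR log line, read as hexadecimal.
FAULT_ADDR = 0x1A45AA9000


def addr_in_gib(addr):
    """Convert a byte address to GiB."""
    return addr / GIB


def beyond_ram(addr, ram_gib):
    """Return True if addr lies at or above ram_gib GiB."""
    return addr >= ram_gib * GIB


if __name__ == "__main__":
    print("fault address is about %.1f GiB" % addr_in_gib(FAULT_ADDR))
    print("beyond 100 GiB:", beyond_ram(FAULT_ADDR, 100))
```

The address works out to roughly 105 GiB, so it can only refer to real memory on a machine with more than that installed.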

> The error occurs at random and is not reproducible every time. It
> happens about every third reboot. 
> I'm running qemu-kvm 1.3.0 (kvm-1.3.0-187.3), kernel 3.8.0-rc5 and
> Windows 7 as guest OS. The hardware uses an Intel IOMMU. See
> attachments for output of lspci, and details on iommu groups.
> 
> I'm not sure if this problem originates from qemu, kvm, vfio or the
> GPU driver.
> Do you have any hints how to debug this further?

vfio makes use of the IOMMU API for programming DMA translations, so any
reserved fields would have to be programmed by intel-iommu itself.  We
could of course be passing some kind of bogus data that intel-iommu
isn't catching.  If you're assigning the root ports to the guest, I'd
start with that, don't do it.  Attach them to vfio, but don't give them
to the guest.  Maybe that'll give us a hint.  I also notice that your
USB 3 controller is dead:

03:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev ff) (prog-if ff)
	!!! Unknown header type 7f

We only see unknown header type 7f when the read from the device returns
-1.  This might have something to do with the root port above it (1c.6)
being in state D3.  Windows likes to put unused devices in D3, which
leads me to suspect you are giving it to the guest.  Let's see what
happens without that.  Thanks,

Alex
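
To illustrate why a failed config-space read shows up as "header type 7f": the header type register is one byte, where bit 7 is the multi-function flag and bits 6:0 select the header layout. A read that returns all ones (0xff, the "-1" above) therefore decodes to layout 0x7f, which is not a defined layout; the revision and prog-if bytes read back as ff for the same reason. This is a minimal Python sketch with an illustrative function name, not lspci's actual code.

```python
# Header layouts defined for the low 7 bits of the header type register.
LAYOUTS = {
    0x00: "endpoint",
    0x01: "PCI-to-PCI bridge",
    0x02: "CardBus bridge",
}


def decode_header_type(raw):
    """Split a raw header type byte into (layout description, multi-function flag)."""
    layout = raw & 0x7F               # bits 6:0 select the header layout
    multifunction = bool(raw & 0x80)  # bit 7 marks a multi-function device
    description = LAYOUTS.get(layout, "unknown header type %02x" % layout)
    return description, multifunction


if __name__ == "__main__":
    # 0xff is what a read returns when the device does not respond.
    print(decode_header_type(0xFF))
    print(decode_header_type(0x01))
```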



* Re: DMAR faults from unrelated device when vfio is used
  2013-02-04 15:49 ` Alex Williamson
@ 2013-02-05 13:31   ` David Gstir
  2013-02-05 15:37     ` Alex Williamson
  0 siblings, 1 reply; 13+ messages in thread
From: David Gstir @ 2013-02-05 13:31 UTC
  To: Alex Williamson; +Cc: kvm, linux-kernel, richard

On Mon, 2013-02-04 at 08:49 -0700, Alex Williamson wrote:

> Can you clarify what you mean by assign?  Are you actually assigning the
> root ports to the qemu guest (1c.0 & 1c.6)?  vfio will require they be
> owned by vfio-pci to make use of 3:00.0, but assigning them to the guest
> is not recommended.  Can you provide your qemu command line?

I did hand all of them to the guest OS. Removing 1c.0 & 1c.6 from the qemu 
command line seems to have done the trick. Thanks!

Here's my working qemu command line:
qemu-kvm -no-reboot -enable-kvm -cpu host -smp 4 -m 6G \
  -drive file=/home/test/qemu/images/win7_base_updated.qcow2,if=virtio,cache=none,media=disk,format=qcow2,index=0 \
  -full-screen -no-quit -no-frame -display sdl -vnc :1 -k de -usbdevice tablet \
  -vga std -global VGA.vgamem_mb=256 \
  -netdev tap,id=guest0,ifname=tap0,script=no,downscript=no \
  -net nic,netdev=guest0,model=virtio,macaddr=00:16:35:BE:EF:12  \
  -rtc base=localtime \
  -device vfio-pci,host=00:1b.0,id=audio \
  -device vfio-pci,host=00:1a.0,id=ehci1 \
  -device vfio-pci,host=00:1d.0,id=ehci2 \
  -device vfio-pci,host=03:00.0,id=xhci1 \
  -monitor tcp::5555,server,nowait
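
The change boils down to this: the root ports stay bound to vfio-pci on the
host, but only the endpoints get a -device vfio-pci option.  A minimal
sketch in Python, assuming the device list from this thread; the helper name
and the hard-coded root-port set are illustrative only, not something qemu
or vfio provides:

```python
# Devices assumed to be bound to vfio-pci on the host (list from the report).
BOUND_TO_VFIO = ["00:1a.0", "00:1b.0", "00:1c.0", "00:1c.6", "00:1d.0", "03:00.0"]

# PCIe root ports: keep them bound to vfio-pci, but do not give them to the guest.
ROOT_PORTS = {"00:1c.0", "00:1c.6"}


def guest_device_args(bound, root_ports):
    """Build qemu -device arguments for every bound device that is not a root port."""
    args = []
    for bdf in bound:
        if bdf in root_ports:
            continue  # owned by vfio-pci on the host, never passed to qemu
        args += ["-device", f"vfio-pci,host={bdf}"]
    return args


if __name__ == "__main__":
    print(" ".join(guest_device_args(BOUND_TO_VFIO, ROOT_PORTS)))
```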


> We need
> to re-visit how to handle pcieport devices with vfio-pci, perhaps
> white-listing it as a vfio "compatible" driver, but this still should
> not interfere with devices external to the group.
> 
> The DMAR fault address looks pretty bogus unless you happen to have
> 100GB+ of ram in the system.

Nope, definitely not. :)

> vfio makes use of the IOMMU API for programming DMA translations, so any
> reserved fields would have to be programmed by intel-iommu itself.  We
> could of course be passing some kind of bogus data that intel-iommu
> isn't catching.  If you're assigning the root ports to the guest, I'd
> start with that, don't do it.  Attach them to vfio, but don't give them
> to the guest.  Maybe that'll give us a hint.  I also notice that your
> USB 3 controller is dead:
> 
> 03:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev ff) (prog-if ff)
> 	!!! Unknown header type 7f
> 
> We only see unknown header type 7f when the read from the device returns
> -1.  This might have something to do with the root port above it (1c.6)
> being in state D3.  Windows likes to put unused devices in D3, which
> leads me to suspect you are giving it to the guest.  

The error no longer occurs. lspci now shows this:

-- snip --
03:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04) (prog-if 30 [XHCI])
	Subsystem: Intel Corporation Device 2008
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 18
	Region 0: Memory at fe500000 (64-bit, non-prefetchable) [disabled] [size=8K]
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D3 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [90] MSI-X: Enable- Count=8 Masked-
		Vector table: BAR=0 offset=00001000
		PBA: BAR=0 offset=00001080
	Capabilities: [a0] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <4us, L1 unlimited
			ClockPM+ Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
	Capabilities: [140 v1] Device Serial Number ff-ff-ff-ff-ff-ff-ff-ff
	Capabilities: [150 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Kernel driver in use: vfio-pci
-- snip --

Most likely because I don't hand the root ports over to the guest anymore. 
However, there seems to be another issue with the USB 3 controller since 
windows 7 can't start the device (error 10 in windows device manager). Using 
these USB ports in the host linux worked fine. Could this issue be related to 
pci-express?

Thanks,
David






* Re: DMAR faults from unrelated device when vfio is used
  2013-02-05 13:31   ` David Gstir
@ 2013-02-05 15:37     ` Alex Williamson
  2013-02-05 20:36       ` Alex Williamson
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Williamson @ 2013-02-05 15:37 UTC (permalink / raw)
  To: David Gstir; +Cc: kvm, linux-kernel, richard

On Tue, 2013-02-05 at 14:31 +0100, David Gstir wrote:
> On Monday, 04.02.2013, at 08:49 -0700, Alex Williamson wrote:
> 
> > Can you clarify what you mean by assign?  Are you actually assigning the
> > root ports to the qemu guest (1c.0 & 1c.6)?  vfio will require they be
> > owned by vfio-pci to make use of 3:00.0, but assigning them to the guest
> > is not recommended.  Can you provide your qemu command line?
> 
> I did hand all of them to the guest OS. Removing 1c.0 & 1c.6 from the qemu 
> command line seems to have done the trick. Thanks!

Great, though I'm still not sure how we were generating those DMAR
faults.

> Here's my working qemu command line:
> qemu-kvm -no-reboot -enable-kvm -cpu host -smp 4 -m 6G \
>   -drive file=/home/test/qemu/images/win7_base_updated.qcow2,if=virtio,cache=none,media=disk,format=qcow2,index=0 \
>   -full-screen -no-quit -no-frame -display sdl -vnc :1 -k de -usbdevice tablet \
>   -vga std -global VGA.vgamem_mb=256 \
>   -netdev tap,id=guest0,ifname=tap0,script=no,downscript=no \
>   -net nic,netdev=guest0,model=virtio,macaddr=00:16:35:BE:EF:12  \
>   -rtc base=localtime \
>   -device vfio-pci,host=00:1b.0,id=audio \
>   -device vfio-pci,host=00:1a.0,id=ehci1 \
>   -device vfio-pci,host=00:1d.0,id=ehci2 \
>   -device vfio-pci,host=03:00.0,id=xhci1 \
>   -monitor tcp::5555,server,nowait
> 
> 
> > We need
> > to re-visit how to handle pcieport devices with vfio-pci, perhaps
> > white-listing it as a vfio "compatible" driver, but this still should
> > not interfere with devices external to the group.
> > 
> > The DMAR fault address looks pretty bogus unless you happen to have
> > 100GB+ of ram in the system.
> 
> Nope, definitely not. :)
> 
> > vfio makes use of the IOMMU API for programming DMA translations, so any
> > reserved fields would have to be programmed by intel-iommu itself.  We
> > could of course be passing some kind of bogus data that intel-iommu
> > isn't catching.  If you're assigning the root ports to the guest, I'd
> > start with that, don't do it.  Attach them to vfio, but don't give them
> > to the guest.  Maybe that'll give us a hint.  I also notice that your
> > USB 3 controller is dead:
> > 
> > 03:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev ff) (prog-if ff)
> > 	!!! Unknown header type 7f
> > 
> > We only see unknown header type 7f when the read from the device returns
> > -1.  This might have something to do with the root port above it (1c.6)
> > being in state D3.  Windows likes to put unused devices in D3, which
> > leads me to suspect you are giving it to the guest.  
> 
> The error no longer occurs. lspci now shows this:
> 
> -- snip --
> 03:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04) (prog-if 30 [XHCI])
> 	Subsystem: Intel Corporation Device 2008
> 	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> 	Interrupt: pin A routed to IRQ 18
> 	Region 0: Memory at fe500000 (64-bit, non-prefetchable) [disabled] [size=8K]
> 	Capabilities: [50] Power Management version 3
> 		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> 		Status: D3 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> 	Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
> 		Address: 0000000000000000  Data: 0000
> 	Capabilities: [90] MSI-X: Enable- Count=8 Masked-
> 		Vector table: BAR=0 offset=00001000
> 		PBA: BAR=0 offset=00001080
> 	Capabilities: [a0] Express (v2) Endpoint, MSI 00
> 		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
> 			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
> 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> 			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
> 			MaxPayload 128 bytes, MaxReadReq 128 bytes
> 		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
> 		LnkCap:	Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <4us, L1 unlimited
> 			ClockPM+ Surprise- LLActRep- BwNot-
> 		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
> 			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
> 		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> 		DevCap2: Completion Timeout: Not Supported, TimeoutDis+
> 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
> 		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
> 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> 			 Compliance De-emphasis: -6dB
> 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
> 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> 	Capabilities: [100 v1] Advanced Error Reporting
> 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> 		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> 		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
> 	Capabilities: [140 v1] Device Serial Number ff-ff-ff-ff-ff-ff-ff-ff
> 	Capabilities: [150 v1] Latency Tolerance Reporting
> 		Max snoop latency: 0ns
> 		Max no snoop latency: 0ns
> 	Kernel driver in use: vfio-pci
> -- snip --
> 
> Most likely because I don't hand the root ports over to the guest anymore. 
> However, there seems to be another issue with the USB 3 controller since 
> windows 7 can't start the device (error 10 in windows device manager). Using 
> these USB ports in the host linux worked fine. Could this issue be related to 
> pci-express?

Ugh, the infamous and useless error 10.  It could be anything.  I've got
a system with onboard usb3, let me see what windows does with it here
first.  Thanks,

Alex



* Re: DMAR faults from unrelated device when vfio is used
  2013-02-05 15:37     ` Alex Williamson
@ 2013-02-05 20:36       ` Alex Williamson
  2013-02-05 20:41         ` Richard Weinberger
  2013-02-06 18:09         ` Richard Weinberger
  0 siblings, 2 replies; 13+ messages in thread
From: Alex Williamson @ 2013-02-05 20:36 UTC (permalink / raw)
  To: David Gstir; +Cc: kvm, linux-kernel, richard

On Tue, 2013-02-05 at 08:37 -0700, Alex Williamson wrote:
> On Tue, 2013-02-05 at 14:31 +0100, David Gstir wrote:
> > On Monday, 04.02.2013, at 08:49 -0700, Alex Williamson wrote:
> > 
> > > Can you clarify what you mean by assign?  Are you actually assigning the
> > > root ports to the qemu guest (1c.0 & 1c.6)?  vfio will require they be
> > > owned by vfio-pci to make use of 3:00.0, but assigning them to the guest
> > > is not recommended.  Can you provide your qemu command line?
> > 
> > I did hand all of them to the guest OS. Removing 1c.0 & 1c.6 from the qemu 
> > command line seems to have done the trick. Thanks!
> 
> Great, though I'm still not sure how we were generating those DMAR
> faults.
> 
> > Here's my working qemu command line:
> > qemu-kvm -no-reboot -enable-kvm -cpu host -smp 4 -m 6G \
> >   -drive file=/home/test/qemu/images/win7_base_updated.qcow2,if=virtio,cache=none,media=disk,format=qcow2,index=0 \
> >   -full-screen -no-quit -no-frame -display sdl -vnc :1 -k de -usbdevice tablet \
> >   -vga std -global VGA.vgamem_mb=256 \
> >   -netdev tap,id=guest0,ifname=tap0,script=no,downscript=no \
> >   -net nic,netdev=guest0,model=virtio,macaddr=00:16:35:BE:EF:12  \
> >   -rtc base=localtime \
> >   -device vfio-pci,host=00:1b.0,id=audio \
> >   -device vfio-pci,host=00:1a.0,id=ehci1 \
> >   -device vfio-pci,host=00:1d.0,id=ehci2 \
> >   -device vfio-pci,host=03:00.0,id=xhci1 \
> >   -monitor tcp::5555,server,nowait
> > 
> > 
> > > We need
> > > to re-visit how to handle pcieport devices with vfio-pci, perhaps
> > > white-listing it as a vfio "compatible" driver, but this still should
> > > not interfere with devices external to the group.
> > > 
> > > The DMAR fault address looks pretty bogus unless you happen to have
> > > 100GB+ of ram in the system.
> > 
> > Nope, definitely not. :)
> > 
> > > vfio makes use of the IOMMU API for programming DMA translations, so any
> > > reserved fields would have to be programmed by intel-iommu itself.  We
> > > could of course be passing some kind of bogus data that intel-iommu
> > > isn't catching.  If you're assigning the root ports to the guest, I'd
> > > start with that, don't do it.  Attach them to vfio, but don't give them
> > > to the guest.  Maybe that'll give us a hint.  I also notice that your
> > > USB 3 controller is dead:
> > > 
> > > 03:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev ff) (prog-if ff)
> > > 	!!! Unknown header type 7f
> > > 
> > > We only see unknown header type 7f when the read from the device returns
> > > -1.  This might have something to do with the root port above it (1c.6)
> > > being in state D3.  Windows likes to put unused devices in D3, which
> > > leads me to suspect you are giving it to the guest.  
> > 
> > The error no longer occurs. lspci now shows this:
> > 
> > -- snip --
> > 03:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04) (prog-if 30 [XHCI])
> > 	Subsystem: Intel Corporation Device 2008
> > 	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
> > 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> > 	Interrupt: pin A routed to IRQ 18
> > 	Region 0: Memory at fe500000 (64-bit, non-prefetchable) [disabled] [size=8K]
> > 	Capabilities: [50] Power Management version 3
> > 		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> > 		Status: D3 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > 	Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
> > 		Address: 0000000000000000  Data: 0000
> > 	Capabilities: [90] MSI-X: Enable- Count=8 Masked-
> > 		Vector table: BAR=0 offset=00001000
> > 		PBA: BAR=0 offset=00001080
> > 	Capabilities: [a0] Express (v2) Endpoint, MSI 00
> > 		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
> > 			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
> > 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> > 			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
> > 			MaxPayload 128 bytes, MaxReadReq 128 bytes
> > 		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
> > 		LnkCap:	Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <4us, L1 unlimited
> > 			ClockPM+ Surprise- LLActRep- BwNot-
> > 		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
> > 			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
> > 		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> > 		DevCap2: Completion Timeout: Not Supported, TimeoutDis+
> > 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
> > 		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
> > 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> > 			 Compliance De-emphasis: -6dB
> > 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
> > 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> > 	Capabilities: [100 v1] Advanced Error Reporting
> > 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > 		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> > 		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> > 		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
> > 	Capabilities: [140 v1] Device Serial Number ff-ff-ff-ff-ff-ff-ff-ff
> > 	Capabilities: [150 v1] Latency Tolerance Reporting
> > 		Max snoop latency: 0ns
> > 		Max no snoop latency: 0ns
> > 	Kernel driver in use: vfio-pci
> > -- snip --
> > 
> > Most likely because I don't hand the root ports over to the guest anymore. 
> > However, there seems to be another issue with the USB 3 controller since 
> > windows 7 can't start the device (error 10 in windows device manager). Using 
> > these USB ports in the host linux worked fine. Could this issue be related to 
> > pci-express?
> 
> Ugh, the infamous and useless error 10.  It could be anything.  I've got
> a system with onboard usb3, let me see what windows does with it here
> first.  Thanks,

Well, I've got an Etron USB3 HBA and (un)fortunately it works just fine
with a Win7 guest.  There's really nothing special about USB controllers
from a PCI device assignment perspective.  Have you tried the latest
upstream qemu bits?  Thanks,

Alex




* Re: DMAR faults from unrelated device when vfio is used
  2013-02-05 20:36       ` Alex Williamson
@ 2013-02-05 20:41         ` Richard Weinberger
  2013-02-06 18:09         ` Richard Weinberger
  1 sibling, 0 replies; 13+ messages in thread
From: Richard Weinberger @ 2013-02-05 20:41 UTC (permalink / raw)
  To: Alex Williamson; +Cc: David Gstir, kvm, linux-kernel

On Tue, 05 Feb 2013 13:36:53 -0700,
Alex Williamson <alex.williamson@redhat.com> wrote:
> > Ugh, the infamous and useless error 10.  It could be anything.
> > I've got a system with onboard usb3, let me see what windows does
> > with it here first.  Thanks,
> 
> Well, I've got an Etron USB3 HBA and (un)fortunately it works just
> fine with a Win7 guest.  There's really nothing special about USB
> controllers from a PCI device assignment perspective.  Have you tried
> the latest upstream qemu bits?  Thanks,

We also tried qemu v1.4.0-rc0 (git as of today) without success.
As a next step we'll test Linux as the guest; maybe it is more chatty than
Windows regarding the issue. :-)

Thanks,
//richard


* Re: DMAR faults from unrelated device when vfio is used
  2013-02-05 20:36       ` Alex Williamson
  2013-02-05 20:41         ` Richard Weinberger
@ 2013-02-06 18:09         ` Richard Weinberger
  2013-02-06 18:47           ` Alex Williamson
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Weinberger @ 2013-02-06 18:09 UTC (permalink / raw)
  To: Alex Williamson; +Cc: David Gstir, kvm, linux-kernel

Hi,

On Tue, 05 Feb 2013 13:36:53 -0700,
Alex Williamson <alex.williamson@redhat.com> wrote:
> > Ugh, the infamous and useless error 10.  It could be anything.
> > I've got a system with onboard usb3, let me see what windows does
> > with it here first.  Thanks,
> 
> Well, I've got an Etron USB3 HBA and (un)fortunately it works just
> fine with a Win7 guest.  There's really nothing special about USB
> controllers from a PCI device assignment perspective.  Have you tried
> the latest upstream qemu bits?  Thanks,

USB3 does not work within a Linux guest either.
xhci in debug mode gives a bit more info.

[    1.157888] xhci_hcd 0000:00:07.0: xHCI Host Controller
[    1.157899] xhci_hcd 0000:00:07.0: new USB bus registered, assigned bus number 4
[    1.157948] xhci_hcd 0000:00:07.0: // Halt the HC
[    1.157957] xhci_hcd 0000:00:07.0: Resetting HCD
[    1.157962] xhci_hcd 0000:00:07.0: // Reset the HC
[    1.158111] usb 3-1: new full-speed USB device number 2 using uhci_hcd
[    1.158125] xhci_hcd 0000:00:07.0: Wait for controller to be ready for doorbell rings
[    1.158130] xhci_hcd 0000:00:07.0: Reset complete
[    1.158133] xhci_hcd 0000:00:07.0: Enabling 64-bit DMA addresses.
[    1.158135] xhci_hcd 0000:00:07.0: Calling HCD init
[    1.158136] xhci_hcd 0000:00:07.0: xhci_init
[    1.158137] xhci_hcd 0000:00:07.0: xHCI doesn't need link TRB QUIRK
[    1.158640] xhci_hcd 0000:00:07.0: Finished xhci_init
[    1.158642] xhci_hcd 0000:00:07.0: Called HCD init
[    1.158698] xhci_hcd 0000:00:07.0: irq 11, io mem 0xfebf4000
[    1.158699] xhci_hcd 0000:00:07.0: xhci_run
[    1.159578] xhci_hcd 0000:00:07.0: irq 40 for MSI/MSI-X
[    1.159697] xhci_hcd 0000:00:07.0: irq 41 for MSI/MSI-X
[    1.159720] xhci_hcd 0000:00:07.0: irq 42 for MSI/MSI-X
[    1.159736] xhci_hcd 0000:00:07.0: irq 43 for MSI/MSI-X
[    1.159752] xhci_hcd 0000:00:07.0: irq 44 for MSI/MSI-X
[    1.179682] xhci_hcd 0000:00:07.0: Setting event ring polling timer
[    1.179686] xhci_hcd 0000:00:07.0: Command ring memory map follows:
[    1.179693] xhci_hcd 0000:00:07.0: ERST memory map follows:
[    1.179695] xhci_hcd 0000:00:07.0: Event ring:
[    1.179702] xhci_hcd 0000:00:07.0: ERST deq = 64'h36820400
[    1.179703] xhci_hcd 0000:00:07.0: // Set the interrupt modulation register
[    1.179710] xhci_hcd 0000:00:07.0: // Enable interrupts, cmd = 0x4.
[    1.179715] xhci_hcd 0000:00:07.0: // Enabling event ring interrupter ffffc90000e68620 by writing 0x2 to irq_pending
[    1.179737] xhci_hcd 0000:00:07.0: Finished xhci_run for USB2 roothub
[    1.179752] usb usb4: New USB device found, idVendor=1d6b, idProduct=0002
[    1.179753] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    1.179755] usb usb4: Product: xHCI Host Controller
[    1.179756] usb usb4: Manufacturer: Linux 3.8.0-rc6-2.10-desktop xhci_hcd
[    1.179757] usb usb4: SerialNumber: 0000:00:07.0
[    1.179967] xHCI xhci_add_endpoint called for root hub
[    1.179971] xHCI xhci_check_bandwidth called for root hub
[    1.180081] hub 4-0:1.0: USB hub found
[    1.180094] hub 4-0:1.0: 2 ports detected
[    1.180200] xhci_hcd 0000:00:07.0: xHCI Host Controller
[    1.180206] xhci_hcd 0000:00:07.0: new USB bus registered, assigned bus number 5
[    1.180214] xhci_hcd 0000:00:07.0: Enabling 64-bit DMA addresses.
[    1.180219] xhci_hcd 0000:00:07.0: // Turn on HC, cmd = 0x5.
[    1.245201] xhci_hcd 0000:00:07.0: Host took too long to start, waited 16000 microseconds.

This one looks interesting.

[    1.245414] xhci_hcd 0000:00:07.0: // Halt the HC
[    1.245424] xhci_hcd 0000:00:07.0: startup error -19
[    1.245551] xhci_hcd 0000:00:07.0: USB bus 5 deregistered
[    1.245556] xhci_hcd 0000:00:07.0: remove, state 1
[    1.245560] usb usb4: USB disconnect, device number 1
[    1.245608] xHCI xhci_drop_endpoint called for root hub
[    1.245609] xHCI xhci_check_bandwidth called for root hub
[    1.245684] xhci_hcd 0000:00:07.0: // Halt the HC
[    1.245695] xhci_hcd 0000:00:07.0: // Reset the HC
[    1.245741] xhci_hcd 0000:00:07.0: Wait for controller to be ready for doorbell rings
[    1.256413] xhci_hcd 0000:00:07.0: // Disabling event ring interrupts
[    1.256427] xhci_hcd 0000:00:07.0: cleaning up memory
[    1.256440] xhci_hcd 0000:00:07.0: xhci_stop completed - status = 1
[    1.256446] xhci_hcd 0000:00:07.0: USB bus 4 deregistered
[    1.258194] ata_piix 0000:00:01.1: version 2.13

Within the guest, lspci -vv gives:

00:07.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04) (prog-if 30 [XHCI])
        Subsystem: Intel Corporation Device 2008
        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at febf4000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [90] MSI-X: Enable- Count=8 Masked-
                Vector table: BAR=0 offset=00001000
                PBA: BAR=0 offset=00001080
        Capabilities: [a0] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <4us, L1 unlimited
                        ClockPM+ Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis+
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
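
For what it is worth, the MSI-X numbers in the lspci output above are
self-consistent.  A minimal sketch in Python, assuming the usual MSI-X
layout (16 bytes per table entry, one pending bit per vector packed into
64-bit words); the function name and the returned keys are made up for
illustration:

```python
MSIX_ENTRY_SIZE = 16  # bytes per table entry: addr lo, addr hi, data, vector control


def msix_layout(table_offset, pba_offset, count, bar_size, page_size=4096):
    """Check where the MSI-X table and PBA end and how they relate inside one BAR."""
    table_end = table_offset + count * MSIX_ENTRY_SIZE
    # One pending bit per vector, rounded up to whole 64-bit words.
    pba_end = pba_offset + ((count + 63) // 64) * 8
    return {
        "table_end": table_end,
        "pba_end": pba_end,
        "overlap": table_offset < pba_end and pba_offset < table_end,
        "same_page": table_offset // page_size == pba_offset // page_size,
        "fits_in_bar": max(table_end, pba_end) <= bar_size,
    }


if __name__ == "__main__":
    # Values from lspci: Count=8, table at 0x1000, PBA at 0x1080, BAR 0 is 8K.
    layout = msix_layout(0x1000, 0x1080, 8, 8 * 1024)
    for key, value in layout.items():
        print(key, hex(value) if isinstance(value, int) and not isinstance(value, bool) else value)
```

Eight entries take 0x80 bytes, so the PBA starts right where the table ends
and both share one 4K page of BAR 0.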

Is there anything else we can do?

Thanks,
//richard


* Re: DMAR faults from unrelated device when vfio is used
  2013-02-06 18:09         ` Richard Weinberger
@ 2013-02-06 18:47           ` Alex Williamson
  2013-02-06 20:25             ` Richard Weinberger
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Williamson @ 2013-02-06 18:47 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: David Gstir, kvm, linux-kernel

On Wed, 2013-02-06 at 19:09 +0100, Richard Weinberger wrote:
> Hi,
> 
> On Tue, 05 Feb 2013 13:36:53 -0700,
> Alex Williamson <alex.williamson@redhat.com> wrote:
> > > Ugh, the infamous and useless error 10.  It could be anything.
> > > I've got a system with onboard usb3, let me see what windows does
> > > with it here first.  Thanks,
> > 
> > Well, I've got an Etron USB3 HBA and (un)fortunately it works just
> > fine with a Win7 guest.  There's really nothing special about USB
> > controllers from a PCI device assignment perspective.  Have you tried
> > the latest upstream qemu bits?  Thanks,
> 
> USB3 does not work within a Linux guest either.
> xhci in debug mode gives a bit more info.

Does the card work with pci-assign or are both broken?
> 
> [    1.157888] xhci_hcd 0000:00:07.0: xHCI Host Controller
> [    1.157899] xhci_hcd 0000:00:07.0: new USB bus registered, assigned bus number 4
> [    1.157948] xhci_hcd 0000:00:07.0: // Halt the HC
> [    1.157957] xhci_hcd 0000:00:07.0: Resetting HCD
> [    1.157962] xhci_hcd 0000:00:07.0: // Reset the HC
> [    1.158111] usb 3-1: new full-speed USB device number 2 using uhci_hcd
> [    1.158125] xhci_hcd 0000:00:07.0: Wait for controller to be ready for doorbell rings
> [    1.158130] xhci_hcd 0000:00:07.0: Reset complete
> [    1.158133] xhci_hcd 0000:00:07.0: Enabling 64-bit DMA addresses.
> [    1.158135] xhci_hcd 0000:00:07.0: Calling HCD init
> [    1.158136] xhci_hcd 0000:00:07.0: xhci_init
> [    1.158137] xhci_hcd 0000:00:07.0: xHCI doesn't need link TRB QUIRK
> [    1.158640] xhci_hcd 0000:00:07.0: Finished xhci_init
> [    1.158642] xhci_hcd 0000:00:07.0: Called HCD init
> [    1.158698] xhci_hcd 0000:00:07.0: irq 11, io mem 0xfebf4000
> [    1.158699] xhci_hcd 0000:00:07.0: xhci_run
> [    1.159578] xhci_hcd 0000:00:07.0: irq 40 for MSI/MSI-X
> [    1.159697] xhci_hcd 0000:00:07.0: irq 41 for MSI/MSI-X
> [    1.159720] xhci_hcd 0000:00:07.0: irq 42 for MSI/MSI-X
> [    1.159736] xhci_hcd 0000:00:07.0: irq 43 for MSI/MSI-X
> [    1.159752] xhci_hcd 0000:00:07.0: irq 44 for MSI/MSI-X
> [    1.179682] xhci_hcd 0000:00:07.0: Setting event ring polling timer
> [    1.179686] xhci_hcd 0000:00:07.0: Command ring memory map follows:
> [    1.179693] xhci_hcd 0000:00:07.0: ERST memory map follows:
> [    1.179695] xhci_hcd 0000:00:07.0: Event ring:
> [    1.179702] xhci_hcd 0000:00:07.0: ERST deq = 64'h36820400
> [    1.179703] xhci_hcd 0000:00:07.0: // Set the interrupt modulation register
> [    1.179710] xhci_hcd 0000:00:07.0: // Enable interrupts, cmd = 0x4.
> [    1.179715] xhci_hcd 0000:00:07.0: // Enabling event ring interrupter ffffc90000e68620 by writing 0x2 to irq_pending
> [    1.179737] xhci_hcd 0000:00:07.0: Finished xhci_run for USB2 roothub
> [    1.179752] usb usb4: New USB device found, idVendor=1d6b, idProduct=0002
> [    1.179753] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
> [    1.179755] usb usb4: Product: xHCI Host Controller
> [    1.179756] usb usb4: Manufacturer: Linux 3.8.0-rc6-2.10-desktop xhci_hcd
> [    1.179757] usb usb4: SerialNumber: 0000:00:07.0
> [    1.179967] xHCI xhci_add_endpoint called for root hub
> [    1.179971] xHCI xhci_check_bandwidth called for root hub
> [    1.180081] hub 4-0:1.0: USB hub found
> [    1.180094] hub 4-0:1.0: 2 ports detected
> [    1.180200] xhci_hcd 0000:00:07.0: xHCI Host Controller
> [    1.180206] xhci_hcd 0000:00:07.0: new USB bus registered, assigned bus number 5
> [    1.180214] xhci_hcd 0000:00:07.0: Enabling 64-bit DMA addresses.
> [    1.180219] xhci_hcd 0000:00:07.0: // Turn on HC, cmd = 0x5.
> [    1.245201] xhci_hcd 0000:00:07.0: Host took too long to start, waited 16000 microseconds.
> 
> This one looks interesting.

Yep, the register never got to the state it was looking for.
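
Roughly, that wait is a poll-until-timeout loop.  A minimal sketch in
Python, modeled only loosely on the driver's polling helper: the function
name, the simulated register callbacks and the one-iteration-per-microsecond
assumption are all illustrative, and no real delay is modeled.

```python
import errno

ALL_ONES = 0xFFFFFFFF
STS_HALT = 1 << 0          # "controller halted" status bit, assumed to be bit 0
MAX_START_USEC = 16000     # the 16000 microseconds reported in the log


def handshake(read_reg, mask, done, usec):
    """Poll read_reg() until (value & mask) == done, for at most usec iterations.

    Returns 0 on success, -ENODEV if the register reads all ones (device gone),
    and -ETIMEDOUT if the wanted state never shows up.
    """
    for _ in range(usec):
        value = read_reg()
        if value == ALL_ONES:
            return -errno.ENODEV
        if (value & mask) == done:
            return 0
    return -errno.ETIMEDOUT


if __name__ == "__main__":
    # A controller that never leaves the halted state times out.
    stuck = handshake(lambda: STS_HALT, STS_HALT, 0, MAX_START_USEC)
    print("stuck controller:", stuck)

    # A controller that clears the halted bit after a few reads succeeds.
    reads = iter([STS_HALT, STS_HALT, 0])
    print("healthy controller:", handshake(lambda: next(reads), STS_HALT, 0, MAX_START_USEC))
```

In the log, the wait for the halted bit to clear expired after 16000
microseconds, which corresponds to the timeout branch here.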

> [    1.245414] xhci_hcd 0000:00:07.0: // Halt the HC
> [    1.245424] xhci_hcd 0000:00:07.0: startup error -19
> [    1.245551] xhci_hcd 0000:00:07.0: USB bus 5 deregistered
> [    1.245556] xhci_hcd 0000:00:07.0: remove, state 1
> [    1.245560] usb usb4: USB disconnect, device number 1
> [    1.245608] xHCI xhci_drop_endpoint called for root hub
> [    1.245609] xHCI xhci_check_bandwidth called for root hub
> [    1.245684] xhci_hcd 0000:00:07.0: // Halt the HC
> [    1.245695] xhci_hcd 0000:00:07.0: // Reset the HC
> [    1.245741] xhci_hcd 0000:00:07.0: Wait for controller to be ready for doorbell rings
> [    1.256413] xhci_hcd 0000:00:07.0: // Disabling event ring interrupts
> [    1.256427] xhci_hcd 0000:00:07.0: cleaning up memory
> [    1.256440] xhci_hcd 0000:00:07.0: xhci_stop completed - status = 1
> [    1.256446] xhci_hcd 0000:00:07.0: USB bus 4 deregistered
> [    1.258194] ata_piix 0000:00:01.1: version 2.13
> 
> Within the guest lspci -vv gives:
> 
> 00:07.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04) (prog-if 30 [XHCI])
>         Subsystem: Intel Corporation Device 2008
>         Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
>         Interrupt: pin A routed to IRQ 11
>         Region 0: Memory at febf4000 (64-bit, non-prefetchable) [size=8K]
>         Capabilities: [50] Power Management version 3
>                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>         Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [90] MSI-X: Enable- Count=8 Masked-
>                 Vector table: BAR=0 offset=00001000
>                 PBA: BAR=0 offset=00001080

It's possible there's a bug in how we're managing the vector table and
PBA here.  Can you get to the monitor and run 'info mtree' and provide
the results?  Thanks,

Alex
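
For reference, the extents of the MSI-X structures follow directly from the lspci values quoted above (table at BAR 0 offset 0x1000, PBA at offset 0x1080, Count=8). A minimal sketch, assuming the standard 16-byte MSI-X table entry and one pending bit per vector packed into 64-bit words; the struct and function names are made up for illustration and do not come from QEMU or the kernel:

```c
#include <stdint.h>

/* Standard MSI-X sizes: 16 bytes per table entry, PBA in 64-bit words. */
#define MSIX_ENTRY_BYTES 16u

/* Hypothetical helper type: inclusive byte offsets within the BAR. */
struct msix_extent {
    uint32_t table_start, table_end;
    uint32_t pba_start, pba_end;
};

/* Compute where the vector table and PBA live for nvec vectors. */
struct msix_extent msix_compute_extent(uint32_t table_off, uint32_t pba_off,
                                       uint32_t nvec)
{
    struct msix_extent e;
    uint32_t pba_bytes = ((nvec + 63u) / 64u) * 8u; /* round up to qwords */

    e.table_start = table_off;
    e.table_end = table_off + nvec * MSIX_ENTRY_BYTES - 1u;
    e.pba_start = pba_off;
    e.pba_end = pba_off + pba_bytes - 1u;
    return e;
}
```

Calling msix_compute_extent(0x1000, 0x1080, 8) places the table at 0x1000-0x107f and the PBA at 0x1080-0x1087, so both sit in the second 4K page of the 8K BAR.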



* Re: DMAR faults from unrelated device when vfio is used
  2013-02-06 18:47           ` Alex Williamson
@ 2013-02-06 20:25             ` Richard Weinberger
  2013-02-06 22:45               ` Alex Williamson
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Weinberger @ 2013-02-06 20:25 UTC (permalink / raw)
  To: Alex Williamson; +Cc: David Gstir, kvm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 393 bytes --]

Hi,

On Wed, 06 Feb 2013 11:47:20 -0700,
Alex Williamson <alex.williamson@redhat.com> wrote:
> Does the card work with pci-assign or are both broken?

It works with pci-assign. :-\

 
> It's possible there's a bug in how we're managing the vector table and
> PBA here.  Can you get to the monitor and run 'info mtree' and provide
> the results?  Thanks,

Please see attachment.

Thanks,
//richard

[-- Attachment #2: mtree_vfio.txt --]
[-- Type: text/plain, Size: 7698 bytes --]

(qemu) info mtree
info mtree
memory
0000000000000000-7ffffffffffffffe (prio 0, RW): system
  0000000000000000-00000000dfffffff (prio 0, RW): alias ram-below-4g @pc.ram 0000000000000000-00000000dfffffff
  00000000000a0000-00000000000bffff (prio 1, RW): alias smram-region @pci 00000000000a0000-00000000000bffff
  00000000000c0000-00000000000c3fff (prio 1, R-): alias pam-rom @pc.ram 00000000000c0000-00000000000c3fff
  00000000000c4000-00000000000c7fff (prio 1, R-): alias pam-rom @pc.ram 00000000000c4000-00000000000c7fff
  00000000000c8000-00000000000cbfff (prio 1, R-): alias pam-rom @pc.ram 00000000000c8000-00000000000cbfff
  00000000000cb000-00000000000cdfff (prio 1000, RW): alias kvmvapic-rom @pc.ram 00000000000cb000-00000000000cdfff
  00000000000cc000-00000000000cffff (prio 1, R-): alias pam-rom @pc.ram 00000000000cc000-00000000000cffff
  00000000000d0000-00000000000d3fff (prio 1, RW): alias pam-ram @pc.ram 00000000000d0000-00000000000d3fff
  00000000000d4000-00000000000d7fff (prio 1, RW): alias pam-ram @pc.ram 00000000000d4000-00000000000d7fff
  00000000000d8000-00000000000dbfff (prio 1, RW): alias pam-ram @pc.ram 00000000000d8000-00000000000dbfff
  00000000000dc000-00000000000dffff (prio 1, RW): alias pam-ram @pc.ram 00000000000dc000-00000000000dffff
  00000000000e0000-00000000000e3fff (prio 1, RW): alias pam-ram @pc.ram 00000000000e0000-00000000000e3fff
  00000000000e4000-00000000000e7fff (prio 1, RW): alias pam-ram @pc.ram 00000000000e4000-00000000000e7fff
  00000000000e8000-00000000000ebfff (prio 1, RW): alias pam-ram @pc.ram 00000000000e8000-00000000000ebfff
  00000000000ec000-00000000000effff (prio 1, RW): alias pam-ram @pc.ram 00000000000ec000-00000000000effff
  00000000000f0000-00000000000fffff (prio 1, R-): alias pam-rom @pc.ram 00000000000f0000-00000000000fffff
  00000000e0000000-00000000ffffffff (prio 0, RW): alias pci-hole @pci 00000000e0000000-00000000ffffffff
  00000000fec00000-00000000fec00fff (prio 0, RW): kvm-ioapic
  00000000fed00000-00000000fed003ff (prio 0, RW): hpet
  00000000fee00000-00000000feefffff (prio 0, RW): kvm-apic-msi
  0000000100000000-000000019fffffff (prio 0, RW): alias ram-above-4g @pc.ram 00000000e0000000-000000017fffffff
  00000001a0000000-400000019fffffff (prio 0, RW): alias pci-hole64 @pci 00000001a0000000-400000019fffffff
I/O
0000000000000000-000000000000ffff (prio 0, RW): io
  0000000000000020-0000000000000021 (prio 0, RW): kvm-pic
  0000000000000040-0000000000000043 (prio 0, RW): kvm-pit
  0000000000000060-0000000000000060 (prio 0, RW): i8042-data
  0000000000000061-0000000000000061 (prio 0, RW): elcr
  0000000000000064-0000000000000064 (prio 0, RW): i8042-cmd
  0000000000000070-0000000000000071 (prio 0, RW): rtc
  000000000000007e-000000000000007f (prio 0, RW): kvmvapic
  0000000000000092-0000000000000092 (prio 0, RW): port92
  00000000000000a0-00000000000000a1 (prio 0, RW): kvm-pic
  0000000000000170-0000000000000177 (prio 0, RW): alias ide @ide 0000000000000170-0000000000000177
  00000000000001ce-00000000000001d0 (prio 0, RW): alias vbe @vbe 00000000000001ce-00000000000001d0
  00000000000001f0-00000000000001f7 (prio 0, RW): alias ide @ide 00000000000001f0-00000000000001f7
  0000000000000376-0000000000000376 (prio 0, RW): alias ide @ide 0000000000000376-0000000000000376
  0000000000000378-000000000000037f (prio 0, RW): alias parallel @parallel 0000000000000378-000000000000037f
  00000000000003b4-00000000000003b5 (prio 0, RW): alias vga @vga 00000000000003b4-00000000000003b5
  00000000000003ba-00000000000003ba (prio 0, RW): alias vga @vga 00000000000003ba-00000000000003ba
  00000000000003c0-00000000000003cf (prio 0, RW): alias vga @vga 00000000000003c0-00000000000003cf
  00000000000003d4-00000000000003d5 (prio 0, RW): alias vga @vga 00000000000003d4-00000000000003d5
  00000000000003da-00000000000003da (prio 0, RW): alias vga @vga 00000000000003da-00000000000003da
  00000000000003f1-00000000000003f5 (prio 0, RW): alias fdc @fdc 00000000000003f1-00000000000003f5
  00000000000003f6-00000000000003f6 (prio 0, RW): alias ide @ide 00000000000003f6-00000000000003f6
  00000000000003f7-00000000000003f7 (prio 0, RW): alias fdc @fdc 00000000000003f7-00000000000003f7
  00000000000003f8-00000000000003ff (prio 0, RW): serial
  00000000000004d0-00000000000004d0 (prio 0, RW): kvm-elcr
  00000000000004d1-00000000000004d1 (prio 0, RW): kvm-elcr
  0000000000000510-0000000000000511 (prio 0, RW): fwcfg
  0000000000000cf8-0000000000000cfb (prio 0, RW): pci-conf-idx
  0000000000000cfc-0000000000000cff (prio 0, RW): pci-conf-data
  0000000000005658-0000000000005658 (prio 0, RW): vmport
  000000000000c000-000000000000c01f (prio 1, RW): uhci
  000000000000c020-000000000000c03f (prio 1, RW): virtio-pci
  000000000000c040-000000000000c04f (prio 1, RW): piix-bmdma-container
    000000000000c040-000000000000c043 (prio 0, RW): piix-bmdma
    000000000000c044-000000000000c047 (prio 0, RW): bmdma
    000000000000c048-000000000000c04b (prio 0, RW): piix-bmdma
    000000000000c04c-000000000000c04f (prio 0, RW): bmdma
aliases
pc.ram
0000000000000000-000000017fffffff (prio 0, RW): pc.ram
pci
0000000000000000-7ffffffffffffffe (prio 0, RW): pci
  00000000000a0000-00000000000affff (prio 2, RW): alias vga.chain4 @vga.vram 0000000000000000-000000000000ffff
  00000000000a0000-00000000000bffff (prio 1, RW): vga-lowmem
  00000000000c0000-00000000000dffff (prio 1, RW): pc.rom
  00000000000e0000-00000000000fffff (prio 1, R-): alias isa-bios @pc.bios 0000000000000000-000000000001ffff
  00000000e0000000-00000000efffffff (prio 1, RW): vga.vram
  00000000febf0000-00000000febf3fff (prio 1, RW): VFIO 0000:00:1b.0 BAR 0
    00000000febf0000-00000000febf3fff (prio 0, RW): VFIO 0000:00:1b.0 BAR 0 mmap
  00000000febf4000-00000000febf5fff (prio 1, RW): VFIO 0000:03:00.0 BAR 0
    00000000febf4000-00000000febf4fff (prio 0, RW): VFIO 0000:03:00.0 BAR 0 mmap
    00000000febf5000-00000000febf507f (prio 0, RW): msix-table
    00000000febf5080-00000000febf5087 (prio 0, RW): msix-pba
    00000000febf6000-00000000febf5fff (prio 0, RW): VFIO 0000:03:00.0 BAR 0 mmap msix-hi
  00000000febf6000-00000000febf6fff (prio 1, RW): vga.mmio
    00000000febf6400-00000000febf641f (prio 0, RW): vga ioports remapped
    00000000febf6500-00000000febf6515 (prio 0, RW): bochs dispi interface
  00000000febf7000-00000000febf7fff (prio 1, RW): virtio-net-pci-msix
    00000000febf7000-00000000febf702f (prio 0, RW): msix-table
    00000000febf7800-00000000febf7807 (prio 0, RW): msix-pba
  00000000febf8000-00000000febf83ff (prio 1, RW): VFIO 0000:00:1a.0 BAR 0
    00000000febf8000-00000000febf7fff (prio 0, RW): VFIO 0000:00:1a.0 BAR 0 mmap
  00000000febf9000-00000000febf93ff (prio 1, RW): VFIO 0000:00:1d.0 BAR 0
    00000000febf9000-00000000febf8fff (prio 0, RW): VFIO 0000:00:1d.0 BAR 0 mmap
  00000000fffe0000-00000000ffffffff (prio 0, R-): pc.bios
ide
0000000000000000-7ffffffffffffffe (prio 0, RW): ide
vbe
0000000000000000-7ffffffffffffffe (prio 0, RW): vbe
ide
0000000000000000-7ffffffffffffffe (prio 0, RW): ide
ide
0000000000000000-7ffffffffffffffe (prio 0, RW): ide
parallel
0000000000000000-7ffffffffffffffe (prio 0, RW): parallel
vga
0000000000000000-7ffffffffffffffe (prio 0, RW): vga
vga
0000000000000000-7ffffffffffffffe (prio 0, RW): vga
vga
0000000000000000-7ffffffffffffffe (prio 0, RW): vga
vga
0000000000000000-7ffffffffffffffe (prio 0, RW): vga
vga
0000000000000000-7ffffffffffffffe (prio 0, RW): vga
fdc
0000000000000000-7ffffffffffffffe (prio 0, RW): fdc
ide
0000000000000000-7ffffffffffffffe (prio 0, RW): ide
fdc
0000000000000000-7ffffffffffffffe (prio 0, RW): fdc
vga.vram
00000000e0000000-00000000efffffff (prio 1, RW): vga.vram
pc.bios
00000000fffe0000-00000000ffffffff (prio 0, R-): pc.bios

[-- Attachment #3: mtee_pciassign.txt --]
[-- Type: text/plain, Size: 7545 bytes --]

(qemu) info mtree
info mtree
memory
0000000000000000-7ffffffffffffffe (prio 0, RW): system
  0000000000000000-00000000dfffffff (prio 0, RW): alias ram-below-4g @pc.ram 0000000000000000-00000000dfffffff
  00000000000a0000-00000000000bffff (prio 1, RW): alias smram-region @pci 00000000000a0000-00000000000bffff
  00000000000c0000-00000000000c3fff (prio 1, R-): alias pam-rom @pc.ram 00000000000c0000-00000000000c3fff
  00000000000c4000-00000000000c7fff (prio 1, R-): alias pam-rom @pc.ram 00000000000c4000-00000000000c7fff
  00000000000c8000-00000000000cbfff (prio 1, R-): alias pam-rom @pc.ram 00000000000c8000-00000000000cbfff
  00000000000cb000-00000000000cdfff (prio 1000, RW): alias kvmvapic-rom @pc.ram 00000000000cb000-00000000000cdfff
  00000000000cc000-00000000000cffff (prio 1, R-): alias pam-rom @pc.ram 00000000000cc000-00000000000cffff
  00000000000d0000-00000000000d3fff (prio 1, RW): alias pam-ram @pc.ram 00000000000d0000-00000000000d3fff
  00000000000d4000-00000000000d7fff (prio 1, RW): alias pam-ram @pc.ram 00000000000d4000-00000000000d7fff
  00000000000d8000-00000000000dbfff (prio 1, RW): alias pam-ram @pc.ram 00000000000d8000-00000000000dbfff
  00000000000dc000-00000000000dffff (prio 1, RW): alias pam-ram @pc.ram 00000000000dc000-00000000000dffff
  00000000000e0000-00000000000e3fff (prio 1, RW): alias pam-ram @pc.ram 00000000000e0000-00000000000e3fff
  00000000000e4000-00000000000e7fff (prio 1, RW): alias pam-ram @pc.ram 00000000000e4000-00000000000e7fff
  00000000000e8000-00000000000ebfff (prio 1, RW): alias pam-ram @pc.ram 00000000000e8000-00000000000ebfff
  00000000000ec000-00000000000effff (prio 1, RW): alias pam-ram @pc.ram 00000000000ec000-00000000000effff
  00000000000f0000-00000000000fffff (prio 1, R-): alias pam-rom @pc.ram 00000000000f0000-00000000000fffff
  00000000e0000000-00000000ffffffff (prio 0, RW): alias pci-hole @pci 00000000e0000000-00000000ffffffff
  00000000fec00000-00000000fec00fff (prio 0, RW): kvm-ioapic
  00000000fed00000-00000000fed003ff (prio 0, RW): hpet
  00000000fee00000-00000000feefffff (prio 0, RW): kvm-apic-msi
  0000000100000000-000000019fffffff (prio 0, RW): alias ram-above-4g @pc.ram 00000000e0000000-000000017fffffff
  00000001a0000000-400000019fffffff (prio 0, RW): alias pci-hole64 @pci 00000001a0000000-400000019fffffff
I/O
0000000000000000-000000000000ffff (prio 0, RW): io
  0000000000000020-0000000000000021 (prio 0, RW): kvm-pic
  0000000000000040-0000000000000043 (prio 0, RW): kvm-pit
  0000000000000060-0000000000000060 (prio 0, RW): i8042-data
  0000000000000061-0000000000000061 (prio 0, RW): elcr
  0000000000000064-0000000000000064 (prio 0, RW): i8042-cmd
  0000000000000070-0000000000000071 (prio 0, RW): rtc
  000000000000007e-000000000000007f (prio 0, RW): kvmvapic
  0000000000000092-0000000000000092 (prio 0, RW): port92
  00000000000000a0-00000000000000a1 (prio 0, RW): kvm-pic
  0000000000000170-0000000000000177 (prio 0, RW): alias ide @ide 0000000000000170-0000000000000177
  00000000000001ce-00000000000001d0 (prio 0, RW): alias vbe @vbe 00000000000001ce-00000000000001d0
  00000000000001f0-00000000000001f7 (prio 0, RW): alias ide @ide 00000000000001f0-00000000000001f7
  0000000000000376-0000000000000376 (prio 0, RW): alias ide @ide 0000000000000376-0000000000000376
  0000000000000378-000000000000037f (prio 0, RW): alias parallel @parallel 0000000000000378-000000000000037f
  00000000000003b4-00000000000003b5 (prio 0, RW): alias vga @vga 00000000000003b4-00000000000003b5
  00000000000003ba-00000000000003ba (prio 0, RW): alias vga @vga 00000000000003ba-00000000000003ba
  00000000000003c0-00000000000003cf (prio 0, RW): alias vga @vga 00000000000003c0-00000000000003cf
  00000000000003d4-00000000000003d5 (prio 0, RW): alias vga @vga 00000000000003d4-00000000000003d5
  00000000000003da-00000000000003da (prio 0, RW): alias vga @vga 00000000000003da-00000000000003da
  00000000000003f1-00000000000003f5 (prio 0, RW): alias fdc @fdc 00000000000003f1-00000000000003f5
  00000000000003f6-00000000000003f6 (prio 0, RW): alias ide @ide 00000000000003f6-00000000000003f6
  00000000000003f7-00000000000003f7 (prio 0, RW): alias fdc @fdc 00000000000003f7-00000000000003f7
  00000000000003f8-00000000000003ff (prio 0, RW): serial
  00000000000004d0-00000000000004d0 (prio 0, RW): kvm-elcr
  00000000000004d1-00000000000004d1 (prio 0, RW): kvm-elcr
  0000000000000510-0000000000000511 (prio 0, RW): fwcfg
  0000000000000cf8-0000000000000cfb (prio 0, RW): pci-conf-idx
  0000000000000cfc-0000000000000cff (prio 0, RW): pci-conf-data
  0000000000005658-0000000000005658 (prio 0, RW): vmport
  000000000000c000-000000000000c01f (prio 1, RW): uhci
  000000000000c020-000000000000c03f (prio 1, RW): virtio-pci
  000000000000c040-000000000000c04f (prio 1, RW): piix-bmdma-container
    000000000000c040-000000000000c043 (prio 0, RW): piix-bmdma
    000000000000c044-000000000000c047 (prio 0, RW): bmdma
    000000000000c048-000000000000c04b (prio 0, RW): piix-bmdma
    000000000000c04c-000000000000c04f (prio 0, RW): bmdma
aliases
pc.ram
0000000000000000-000000017fffffff (prio 0, RW): pc.ram
pci
0000000000000000-7ffffffffffffffe (prio 0, RW): pci
  00000000000a0000-00000000000affff (prio 2, RW): alias vga.chain4 @vga.vram 0000000000000000-000000000000ffff
  00000000000a0000-00000000000bffff (prio 1, RW): vga-lowmem
  00000000000c0000-00000000000dffff (prio 1, RW): pc.rom
  00000000000e0000-00000000000fffff (prio 1, R-): alias isa-bios @pc.bios 0000000000000000-000000000001ffff
  00000000e0000000-00000000efffffff (prio 1, RW): vga.vram
  00000000febf0000-00000000febf3fff (prio 1, RW): VFIO 0000:00:1b.0 BAR 0
    00000000febf0000-00000000febf3fff (prio 0, RW): VFIO 0000:00:1b.0 BAR 0 mmap
  00000000febf4000-00000000febf5fff (prio 1, RW): assigned-dev-container
    00000000febf4000-00000000febf5fff (prio 0, RW): kvm-pci-assign.bar0
    00000000febf5000-00000000febf5fff (prio 1, RW): assigned-dev-msix
  00000000febf6000-00000000febf6fff (prio 1, RW): vga.mmio
    00000000febf6400-00000000febf641f (prio 0, RW): vga ioports remapped
    00000000febf6500-00000000febf6515 (prio 0, RW): bochs dispi interface
  00000000febf7000-00000000febf7fff (prio 1, RW): virtio-net-pci-msix
    00000000febf7000-00000000febf702f (prio 0, RW): msix-table
    00000000febf7800-00000000febf7807 (prio 0, RW): msix-pba
  00000000febf8000-00000000febf83ff (prio 1, RW): VFIO 0000:00:1a.0 BAR 0
    00000000febf8000-00000000febf7fff (prio 0, RW): VFIO 0000:00:1a.0 BAR 0 mmap
  00000000febf9000-00000000febf93ff (prio 1, RW): VFIO 0000:00:1d.0 BAR 0
    00000000febf9000-00000000febf8fff (prio 0, RW): VFIO 0000:00:1d.0 BAR 0 mmap
  00000000fffe0000-00000000ffffffff (prio 0, R-): pc.bios
ide
0000000000000000-7ffffffffffffffe (prio 0, RW): ide
vbe
0000000000000000-7ffffffffffffffe (prio 0, RW): vbe
ide
0000000000000000-7ffffffffffffffe (prio 0, RW): ide
ide
0000000000000000-7ffffffffffffffe (prio 0, RW): ide
parallel
0000000000000000-7ffffffffffffffe (prio 0, RW): parallel
vga
0000000000000000-7ffffffffffffffe (prio 0, RW): vga
vga
0000000000000000-7ffffffffffffffe (prio 0, RW): vga
vga
0000000000000000-7ffffffffffffffe (prio 0, RW): vga
vga
0000000000000000-7ffffffffffffffe (prio 0, RW): vga
vga
0000000000000000-7ffffffffffffffe (prio 0, RW): vga
fdc
0000000000000000-7ffffffffffffffe (prio 0, RW): fdc
ide
0000000000000000-7ffffffffffffffe (prio 0, RW): ide
fdc
0000000000000000-7ffffffffffffffe (prio 0, RW): fdc
vga.vram
00000000e0000000-00000000efffffff (prio 1, RW): vga.vram
pc.bios
00000000fffe0000-00000000ffffffff (prio 0, R-): pc.bios
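
In the vfio tree above, BAR 0 of 03:00.0 is split into an mmap window below the MSI-X page and an empty "mmap msix-hi" region (febf6000-febf5fff), and the 0x400-byte BARs of 00:1a.0 and 00:1d.0 also show empty mmap regions. The sketch below reproduces those sizes under the assumption that only whole 4K pages not shared with the MSI-X table are mapped, and that an empty region prints with end = start - 1. It is an illustration of that arithmetic, not QEMU's code, and all names in it are hypothetical:

```c
#include <stdint.h>

#define SKETCH_PAGE 0x1000u /* assumed 4K page size */

/* Hypothetical window descriptor: offset and length inside the BAR. */
struct bar_window {
    uint64_t start;
    uint64_t size;
};

/* Round an offset down or up to a page boundary. */
uint64_t page_down(uint64_t x) { return x & ~(uint64_t)(SKETCH_PAGE - 1); }
uint64_t page_up(uint64_t x) { return page_down(x + SKETCH_PAGE - 1); }

/* Mappable part of the BAR below the page holding the MSI-X table. */
struct bar_window mmap_below(uint64_t table_off)
{
    struct bar_window w = { 0, page_down(table_off) };
    return w;
}

/* Mappable part of the BAR above the page(s) holding the MSI-X table. */
struct bar_window mmap_above(uint64_t bar_size, uint64_t table_off,
                             uint64_t table_bytes)
{
    struct bar_window w;

    w.start = page_up(table_off + table_bytes);
    w.size = w.start < bar_size ? bar_size - w.start : 0;
    return w;
}
```

For the 8K BAR with a 0x80-byte table at 0x1000, mmap_below() yields 0x1000 bytes at offset 0 and mmap_above() yields a zero-length window at 0x2000; page_down(0x400) is likewise 0 for the sub-page BARs.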


* Re: DMAR faults from unrelated device when vfio is used
  2013-02-06 20:25             ` Richard Weinberger
@ 2013-02-06 22:45               ` Alex Williamson
  2013-02-07 22:23                 ` Richard Weinberger
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Williamson @ 2013-02-06 22:45 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: David Gstir, kvm, linux-kernel

On Wed, 2013-02-06 at 21:25 +0100, Richard Weinberger wrote:
> Hi,
> 
> On Wed, 06 Feb 2013 11:47:20 -0700,
> Alex Williamson <alex.williamson@redhat.com> wrote:
> > Does the card work with pci-assign or are both broken?
> 
> It works with pci-assign. :-\

When you tested this, did you detach the group from vfio or use it as
is?  In your previous message I see this:

03:00.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0 Host Controller [1033:0194] (rev ff)

/sys/kernel/iommu_groups/7/devices:
total 0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1c.0 -> ../../../../devices/pci0000:00/0000:00:1c.0
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1c.6 -> ../../../../devices/pci0000:00/0000:00:1c.6
lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:03:00.0 -> ../../../../devices/pci0000:00/0000:00:1c.6/0000:03:00.0

This seemed like a good card to have in my test cache, so I went and got
one and it works fine for me... but I've been playing with pcieport
because I don't think we're handling them correctly in vfio.

Can you provide lspci -vvv -s 1c.6 while the guest is running?  I'm
going to bet that

Control: I/O+ Mem+ BusMaster+

is not set, which it would have been if pci-assign was tested without
the group bound to vfio.  I think the solution is going to be something
around white-listing pcieport, which you can easily test with a kernel
patch like this:

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 12c264d..48a97fb 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -442,7 +442,7 @@ static struct vfio_device *vfio_group_get_device(struct vfio
  * a device.  It's not always practical to leave a device within a group
  * driverless as it could get re-bound to something unsafe.
  */
-static const char * const vfio_driver_whitelist[] = { "pci-stub" };
+static const char * const vfio_driver_whitelist[] = { "pci-stub", "pcieport" };
 
 static bool vfio_whitelisted_driver(struct device_driver *drv)
 {
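
The patch only shows the whitelist array and the signature of vfio_whitelisted_driver(); its body is not quoted. A userspace sketch of what such a check presumably amounts to (an exact name match against the array) is given here. The function below is a stand-in written for illustration and takes a driver name string rather than a struct device_driver:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the patched array: drivers tolerated on group members. */
static const char *const driver_whitelist[] = { "pci-stub", "pcieport" };

/* Return true if name exactly matches one of the whitelisted drivers. */
bool whitelisted_driver(const char *name)
{
    size_t n = sizeof(driver_whitelist) / sizeof(driver_whitelist[0]);
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(name, driver_whitelist[i]) == 0)
            return true;
    }
    return false;
}
```

With the extra entry, a root port still bound to pcieport no longer prevents the group from being considered viable, while a device bound to an ordinary host driver such as xhci_hcd still would.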

Then you won't need to bind 1c.0 or 1c.6 to vfio-pci and hopefully
things will work.  The other problem you might hit is that the pciehp
service driver may also be bound to these slots and somehow deletes the
pci device and re-adds it when a device reset happens.  This causes all
sorts of badness.  The solution here is to unbind the child device from
pciehp, ie:

echo 0000:00:1c.0:pcie04 | sudo \
    tee /sys/bus/pci_express/drivers/pciehp/unbind
echo 0000:00:1c.6:pcie04 | sudo \
    tee /sys/bus/pci_express/drivers/pciehp/unbind
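
The same unbind can be done from a small program rather than echo and tee. A minimal sketch, assuming it runs with enough privilege to write the sysfs file; write_unbind() is a hypothetical helper, and the path and service device names are the ones from the commands above:

```c
#include <stdio.h>

/*
 * Write dev_name plus a newline to unbind_path, which is what
 * "echo NAME | tee PATH" does. Returns 0 on success, -1 on failure.
 */
int write_unbind(const char *unbind_path, const char *dev_name)
{
    FILE *f = fopen(unbind_path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fprintf(f, "%s\n", dev_name) > 0;
    if (fclose(f) != 0) /* sysfs reports write errors at flush time */
        ok = 0;
    return ok ? 0 : -1;
}
```

It would be called once per port, e.g. write_unbind("/sys/bus/pci_express/drivers/pciehp/unbind", "0000:00:1c.6:pcie04").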

Hopefully combined that will make things work, please let me know.
Another option is to move the device to a slot where it isn't grouped
with the root port above it, assuming it's a plugin card.  Also if we
could determine that these root ports support PCI ACS but just don't
report it, we could change the grouping and avoid root ports grouped
with devices.

I'm still trying to formulate how to fix this long term, whether we
should whitelist pcieport and require userspace to do this kind of setup
(need a hotplug stub driver?) or if vfio-pci needs to gain some basic
pcieport functionality that can enable the device and bind service
drivers we want (aer) and avoid ones we don't (pciehp).  Thanks,

Alex



* Re: DMAR faults from unrelated device when vfio is used
  2013-02-06 22:45               ` Alex Williamson
@ 2013-02-07 22:23                 ` Richard Weinberger
  2013-02-07 22:49                   ` Alex Williamson
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Weinberger @ 2013-02-07 22:23 UTC (permalink / raw)
  To: Alex Williamson; +Cc: David Gstir, kvm, linux-kernel

Hi,

On Wed, 06 Feb 2013 15:45:37 -0700,
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Wed, 2013-02-06 at 21:25 +0100, Richard Weinberger wrote:
> > Hi,
> > 
> > On Wed, 06 Feb 2013 11:47:20 -0700,
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > Does the card work with pci-assign or are both broken?
> > 
> > It works with pci-assign. :-\
> 
> When you tested this, did you detach the group from vfio or use it as
> is?  In your previous message I see this:

I've detached it.

> 03:00.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0 Host Controller [1033:0194] (rev ff)
> 
> /sys/kernel/iommu_groups/7/devices:
> total 0
> lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1c.0 -> ../../../../devices/pci0000:00/0000:00:1c.0
> lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1c.6 -> ../../../../devices/pci0000:00/0000:00:1c.6
> lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:03:00.0 -> ../../../../devices/pci0000:00/0000:00:1c.6/0000:03:00.0
> 
> This seemed like a good card to have in my test cache, so I went and
> got one and it works fine for me... but I've been playing with
> pcieport because I don't think we're handling them correctly in vfio.
> 
> Can you provide lspci -vvv -s 1c.6 while the guest is running?  I'm
> going to bet that
> 
> Control: I/O+ Mem+ BusMaster+

Do you want "lspci -vvv -s 1c.6" after attaching 1c.6 to vfio and not
using pci-assign?

> is not set, which it would have been if pci-assign was tested without
> the group bound to vfio.  I think the solution is going to be
> something around white-listing pcieport, which you can easily test
> with a kernel patch like this:
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 12c264d..48a97fb 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -442,7 +442,7 @@ static struct vfio_device *vfio_group_get_device(struct vfio
>   * a device.  It's not always practical to leave a device within a group
>   * driverless as it could get re-bound to something unsafe.
>   */
> -static const char * const vfio_driver_whitelist[] = { "pci-stub" };
> +static const char * const vfio_driver_whitelist[] = { "pci-stub", "pcieport" };
>  
>  static bool vfio_whitelisted_driver(struct device_driver *drv)
>  {

If I whitelist pcieport, USB3 works within the guests. :-)
Binding 1c.0 and 1c.6 is no longer needed.
Next week I'll run some more tests with USB3 devices.
 
Thanks,
//richard


* Re: DMAR faults from unrelated device when vfio is used
  2013-02-07 22:23                 ` Richard Weinberger
@ 2013-02-07 22:49                   ` Alex Williamson
  2013-02-07 23:26                     ` Richard Weinberger
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Williamson @ 2013-02-07 22:49 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: David Gstir, kvm, linux-kernel

On Thu, 2013-02-07 at 23:23 +0100, Richard Weinberger wrote:
> Hi,
> 
> On Wed, 06 Feb 2013 15:45:37 -0700,
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > On Wed, 2013-02-06 at 21:25 +0100, Richard Weinberger wrote:
> > > Hi,
> > > 
> > > On Wed, 06 Feb 2013 11:47:20 -0700,
> > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > > Does the card work with pci-assign or are both broken?
> > > 
> > > It works with pci-assign. :-\
> > 
> > When you tested this, did you detach the group from vfio or use it as
> > is?  In your previous message I see this:
> 
> I've detached it.
> 
> > 03:00.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0 Host Controller [1033:0194] (rev ff)
> > 
> > /sys/kernel/iommu_groups/7/devices:
> > total 0
> > lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1c.0 -> ../../../../devices/pci0000:00/0000:00:1c.0
> > lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:00:1c.6 -> ../../../../devices/pci0000:00/0000:00:1c.6
> > lrwxrwxrwx 1 root root 0 Feb  4 10:29 0000:03:00.0 -> ../../../../devices/pci0000:00/0000:00:1c.6/0000:03:00.0
> > 
> > This seemed like a good card to have in my test cache, so I went and
> > got one and it works fine for me... but I've been playing with
> > pcieport because I don't think we're handling them correctly in vfio.
> > 
> > Can you provide lspci -vvv -s 1c.6 while the guest is running?  I'm
> > going to bet that
> > 
> > Control: I/O+ Mem+ BusMaster+
> 
> Do you want "lspci -vvv -s 1c.6" after attaching 1c.6 to vfio and not
> using pci-assign?

I was looking for the output while attached to vfio, with the guest
running, after xhci has failed to attach to it.  It's not really
necessary, though; I'm pretty sure of the result given that it works
when the root port is left alone.
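
For anyone who does want to check, the three flags in the lspci "Control:" line are the low bits of the PCI command register (config space offset 0x04): I/O space enable is bit 0, memory space enable is bit 1 and bus master enable is bit 2. A minimal decoding sketch follows; the struct and function names are illustrative, and the forwarding rule in the second helper is the standard PCI bridge behaviour rather than anything vfio-specific:

```c
#include <stdbool.h>
#include <stdint.h>

/* Low three bits of the PCI command register (config offset 0x04). */
#define CMD_IO     0x1u
#define CMD_MEMORY 0x2u
#define CMD_MASTER 0x4u

/* Hypothetical decoded view of the bits lspci prints as I/O, Mem, BusMaster. */
struct pci_cmd_flags {
    bool io;
    bool mem;
    bool bus_master;
};

struct pci_cmd_flags decode_command(uint16_t cmd)
{
    struct pci_cmd_flags f;

    f.io = (cmd & CMD_IO) != 0;
    f.mem = (cmd & CMD_MEMORY) != 0;
    f.bus_master = (cmd & CMD_MASTER) != 0;
    return f;
}

/*
 * A bridge passes MMIO down to the device only with memory decode
 * enabled, and passes the device's DMA up only with bus mastering enabled.
 */
bool bridge_passes_mmio_and_dma(uint16_t cmd)
{
    return (cmd & CMD_MEMORY) && (cmd & CMD_MASTER);
}
```

A command value of 0x0007 decodes to "I/O+ Mem+ BusMaster+"; a root port left at 0x0000 would pass neither MMIO nor DMA for the device behind it.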


> > is not set, which it would have been if pci-assign was tested without
> > the group bound to vfio.  I think the solution is going to be
> > something around white-listing pcieport, which you can easily test
> > with a kernel patch like this:
> > 
> > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > index 12c264d..48a97fb 100644
> > --- a/drivers/vfio/vfio.c
> > +++ b/drivers/vfio/vfio.c
> > @@ -442,7 +442,7 @@ static struct vfio_device *vfio_group_get_device(struct vfio
> >   * a device.  It's not always practical to leave a device within a group
> >   * driverless as it could get re-bound to something unsafe.
> >   */
> > -static const char * const vfio_driver_whitelist[] = { "pci-stub" };
> > +static const char * const vfio_driver_whitelist[] = { "pci-stub", "pcieport" };
> >  
> >  static bool vfio_whitelisted_driver(struct device_driver *drv)
> >  {
> 
> If I whitelist pcieport USB3 works within the guests. :-)
> Binding 1c.0 and 1c.6 is no longer needed.
> Next week I'll run some more tests with USB3 devices.

Great!  Thanks for the test.  I assume you didn't need to do anything
with unbinding pciehp?

Alex



* Re: DMAR faults from unrelated device when vfio is used
  2013-02-07 22:49                   ` Alex Williamson
@ 2013-02-07 23:26                     ` Richard Weinberger
  0 siblings, 0 replies; 13+ messages in thread
From: Richard Weinberger @ 2013-02-07 23:26 UTC (permalink / raw)
  To: Alex Williamson; +Cc: David Gstir, kvm, linux-kernel

On Thu, 07 Feb 2013 15:49:58 -0700,
Alex Williamson <alex.williamson@redhat.com> wrote:
> > If I whitelist pcieport USB3 works within the guests. :-)
> > Binding 1c.0 and 1c.6 is no longer needed.
> > Next week I'll run some more tests with USB3 devices.
> 
> Great!  Thanks for the test.  I assume you didn't need to do anything
> with unbinding pciehp?

Yeah, unbinding from pciehp was not needed.
Next week I'll have physical access to that box and be able to run more
tests.
So far everything looks fine.

Thanks,
//richard


end of thread, other threads:[~2013-02-07 23:26 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-04 10:10 DMAR faults from unrelated device when vfio is used David Gstir
2013-02-04 15:49 ` Alex Williamson
2013-02-05 13:31   ` David Gstir
2013-02-05 15:37     ` Alex Williamson
2013-02-05 20:36       ` Alex Williamson
2013-02-05 20:41         ` Richard Weinberger
2013-02-06 18:09         ` Richard Weinberger
2013-02-06 18:47           ` Alex Williamson
2013-02-06 20:25             ` Richard Weinberger
2013-02-06 22:45               ` Alex Williamson
2013-02-07 22:23                 ` Richard Weinberger
2013-02-07 22:49                   ` Alex Williamson
2013-02-07 23:26                     ` Richard Weinberger
