* PCI pass-through problem for SN570 NVME SSD
@ 2022-07-02 17:43 G.R.
  2022-07-04  6:37 ` G.R.
  2022-07-04  9:50 ` Roger Pau Monné
  0 siblings, 2 replies; 31+ messages in thread
From: G.R. @ 2022-07-02 17:43 UTC (permalink / raw)
  To: xen-users, xen-devel

Hi everybody,

I ran into problems passing through an SN570 NVMe SSD to an HVM guest.
So far I don't know whether the problem lies with this specific SSD,
with the CPU + motherboard combination, or with the SW stack.
I'm looking for suggestions on troubleshooting.

List of build info:
CPU+motherboard: E-2146G + Gigabyte C246N-WU2
XEN version: 4.14.3
Dom0: Linux Kernel 5.10 (built from Debian 11.2 kernel source package)
The SN570 SSD sits here in the PCI tree:
           +-1d.0-[05]----00.0  Sandisk Corp Device 501a

Symptoms observed:
With ASPM enabled, pciback has problems seizing the device.

Jul  2 00:36:54 gaia kernel: [    1.648270] pciback 0000:05:00.0:
xen_pciback: seizing device
...
Jul  2 00:36:54 gaia kernel: [    1.768646] pcieport 0000:00:1d.0:
AER: enabled with IRQ 150
Jul  2 00:36:54 gaia kernel: [    1.768716] pcieport 0000:00:1d.0:
DPC: enabled with IRQ 150
Jul  2 00:36:54 gaia kernel: [    1.768717] pcieport 0000:00:1d.0:
DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+
SwTrigger+ RP PIO Log 4, DL_ActiveErr+
...
Jul  2 00:36:54 gaia kernel: [    1.770039] xen: registering gsi 16
triggering 0 polarity 1
Jul  2 00:36:54 gaia kernel: [    1.770041] Already setup the GSI :16
Jul  2 00:36:54 gaia kernel: [    1.770314] pcieport 0000:00:1d.0:
DPC: containment event, status:0x1f11 source:0x0000
Jul  2 00:36:54 gaia kernel: [    1.770315] pcieport 0000:00:1d.0:
DPC: unmasked uncorrectable error detected
Jul  2 00:36:54 gaia kernel: [    1.770320] pcieport 0000:00:1d.0:
PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction
Layer, (Receiver ID)
Jul  2 00:36:54 gaia kernel: [    1.770371] pcieport 0000:00:1d.0:
device [8086:a330] error status/mask=00200000/00010000
Jul  2 00:36:54 gaia kernel: [    1.770413] pcieport 0000:00:1d.0:
[21] ACSViol                (First)
Jul  2 00:36:54 gaia kernel: [    1.770466] pciback 0000:05:00.0:
xen_pciback: device is not found/assigned
Jul  2 00:36:54 gaia kernel: [    1.920195] pciback 0000:05:00.0:
xen_pciback: device is not found/assigned
Jul  2 00:36:54 gaia kernel: [    1.920260] pcieport 0000:00:1d.0:
AER: device recovery successful
Jul  2 00:36:54 gaia kernel: [    1.920263] pcieport 0000:00:1d.0:
DPC: containment event, status:0x1f01 source:0x0000
Jul  2 00:36:54 gaia kernel: [    1.920264] pcieport 0000:00:1d.0:
DPC: unmasked uncorrectable error detected
Jul  2 00:36:54 gaia kernel: [    1.920267] pciback 0000:05:00.0:
xen_pciback: device is not found/assigned
Jul  2 00:36:54 gaia kernel: [    1.938406] xen: registering gsi 16
triggering 0 polarity 1
Jul  2 00:36:54 gaia kernel: [    1.938408] Already setup the GSI :16
Jul  2 00:36:54 gaia kernel: [    1.938666] xen_pciback: backend is vpci
...
Jul  2 00:43:48 gaia kernel: [  420.231955] pcieport 0000:00:1d.0:
DPC: containment event, status:0x1f01 source:0x0000
Jul  2 00:43:48 gaia kernel: [  420.231961] pcieport 0000:00:1d.0:
DPC: unmasked uncorrectable error detected
Jul  2 00:43:48 gaia kernel: [  420.231993] pcieport 0000:00:1d.0:
PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction
Layer, (Requester ID)
Jul  2 00:43:48 gaia kernel: [  420.235775] pcieport 0000:00:1d.0:
device [8086:a330] error status/mask=00100000/00010000
Jul  2 00:43:48 gaia kernel: [  420.235779] pcieport 0000:00:1d.0:
[20] UnsupReq               (First)
Jul  2 00:43:48 gaia kernel: [  420.235783] pcieport 0000:00:1d.0:
AER:   TLP Header: 34000000 05000010 00000000 88458845
Jul  2 00:43:48 gaia kernel: [  420.235819] pci 0000:05:00.0: AER:
can't recover (no error_detected callback)
Jul  2 00:43:48 gaia kernel: [  420.384349] pcieport 0000:00:1d.0:
AER: device recovery successful
... // The following might relate to an attempt to assign the device
to guest, not very sure...
Jul  2 00:46:06 gaia kernel: [  559.147333] pciback 0000:05:00.0:
xen_pciback: seizing device
Jul  2 00:46:06 gaia kernel: [  559.147435] pciback 0000:05:00.0:
enabling device (0000 -> 0002)
Jul  2 00:46:06 gaia kernel: [  559.147508] xen: registering gsi 16
triggering 0 polarity 1
Jul  2 00:46:06 gaia kernel: [  559.147511] Already setup the GSI :16
Jul  2 00:46:06 gaia kernel: [  559.147558] pciback 0000:05:00.0:
xen_pciback: MSI-X preparation failed (-6)


With pcie_aspm=off, the pciback-related error log goes away.
But I suspect some problems are still hidden: I no longer see any
'AER: enabled' messages, so errors may simply go unreported.
I have xen_pciback built directly into the kernel and assign the
SSD to it on the kernel command line.
However, the results of the pci-assignable-xxx commands are not very consistent:

root@gaia:~# xl pci-assignable-list
0000:00:17.0
0000:05:00.0
root@gaia:~# xl pci-assignable-remove 05:00.0
libxl: error: libxl_pci.c:853:libxl__device_pci_assignable_remove:
failed to de-quarantine 0000:05:00.0 <===== Here!!!
root@gaia:~# xl pci-assignable-add 05:00.0
libxl: warning: libxl_pci.c:794:libxl__device_pci_assignable_add:
0000:05:00.0 already assigned to pciback <==== Here!!!
root@gaia:~# xl pci-assignable-remove 05:00.0
root@gaia:~# xl pci-assignable-list
0000:00:17.0
root@gaia:~# xl pci-assignable-add 05:00.0
libxl: warning: libxl_pci.c:814:libxl__device_pci_assignable_add:
0000:05:00.0 not bound to a driver, will not be rebound.
root@gaia:~# xl pci-assignable-list
0000:00:17.0
0000:05:00.0
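For what it's worth, xl's state can be cross-checked against the
kernel's view of the device (a sketch assuming the standard sysfs
layout of the built-in xen-pciback driver; the BDF is the SSD from
above):

```shell
# Which driver currently owns the device, and which slots pciback
# believes it has seized? (The device was hidden at boot via the
# xen-pciback.hide=(0000:05:00.0) kernel parameter.)
readlink /sys/bus/pci/devices/0000:05:00.0/driver
cat /sys/bus/pci/drivers/pciback/slots
```

If these disagree with `xl pci-assignable-list`, the inconsistency is
in libxl's bookkeeping rather than in the kernel.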


After 'xl pci-assignable-list' appears to be self-consistent, creating
a VM with the SSD assigned still leads to a guest crash.
From the qemu log:
[00:06.0] xen_pt_region_update: Error: create new mem mapping failed! (err: 1)
qemu-system-i386: terminating on signal 1 from pid 1192 (xl)

From the 'xl dmesg' output:
(XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted
(XEN) domain_crash called from p2m.c:1301
(XEN) Domain 1 reported crashed by domain 0 on cpu#4:
(XEN) memory_map:fail: dom1 gfn=f3078 mfn=a2504 nr=1 ret:-1


Which of the three symptoms is the most fundamental?
1. The DPC / AER error log
2. The inconsistency in 'xl pci-assignable-list' state tracking
3. The GFN mapping failure on guest setup

Any suggestions for the next step?


Thanks,
G.R.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-02 17:43 PCI pass-through problem for SN570 NVME SSD G.R.
@ 2022-07-04  6:37 ` G.R.
  2022-07-04 10:31   ` Jan Beulich
  2022-07-04  9:50 ` Roger Pau Monné
  1 sibling, 1 reply; 31+ messages in thread
From: G.R. @ 2022-07-04  6:37 UTC (permalink / raw)
  To: xen-users, xen-devel

[-- Attachment #1: Type: text/plain, Size: 8193 bytes --]

Updating with some findings from extra triage effort...
Detailed logs can be found in the attachments.
1. Confirmed that the stock Debian 11.2 kernel (5.10) shows the same symptom.
2. With loglvl=all, the log reveals why the mapping failure happens;
it looks like it comes from a duplicated mapping:
(XEN) memory_map:add: dom1 gfn=f3074 mfn=a2610 nr=2
(XEN) memory_map:add: dom1 gfn=f3077 mfn=a2615 nr=1
(XEN) memory_map:add: dom1 gfn=f3078 mfn=a2616 nr=1 <===========Here
(XEN) ioport_map:add: dom1 gport=c140 mport=4060 nr=20
(XEN) ioport_map:add: dom1 gport=c170 mport=4090 nr=8
(XEN) ioport_map:add: dom1 gport=c178 mport=4080 nr=4
(XEN) memory_map:add: dom1 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom1 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom1 gfn=f3078 mfn=a2504 nr=1 <===========Here
(XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted
(XEN) domain_crash called from p2m.c:1301
(XEN) Domain 1 reported crashed by domain 0 on cpu#2:
(XEN) memory_map:fail: dom1 gfn=f3078 mfn=a2504 nr=1 ret:-1
(XEN) memory_map:remove: dom1 gfn=f3078 mfn=a2504 nr=1
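As a cross-check of the duplicated-mapping theory, the 4 KiB guest
pages occupied by each BAR can be computed from the hvmloader log in
the attachment (BDFs, addresses and sizes are copied from that log;
treating the low nibble of the printed addresses as BAR flag bits is
my assumption). The exact GFN in the Xen log is one page higher, but
the arithmetic shows sub-page BARs of the emulated 05:0 and the
passed-through 06:0 (named in the qemu log) packed into one page:

```python
# Map each BAR from the hvmloader log to the guest page frames it spans
# and report pages claimed by more than one device.
PAGE_SHIFT = 12

bars = [  # (guest device, BAR offset, printed address, size)
    ("06:0", 0x10, 0x0F3070004, 0x4000),
    ("05:0", 0x10, 0x0F3074000, 0x2000),
    ("03:0", 0x14, 0x0F3076000, 0x1000),
    ("05:0", 0x24, 0x0F3077000, 0x800),
    ("05:0", 0x14, 0x0F3077800, 0x100),
    ("06:0", 0x20, 0x0F3077904, 0x100),
]

owners = {}  # gfn -> set of devices with a BAR in that page
for dev, _bar, addr, size in bars:
    base = addr & ~0xF  # strip BAR flag bits from the printed value
    first, last = base >> PAGE_SHIFT, (base + size - 1) >> PAGE_SHIFT
    for gfn in range(first, last + 1):
        owners.setdefault(gfn, set()).add(dev)

shared = {gfn: devs for gfn, devs in owners.items() if len(devs) > 1}
print({hex(g): sorted(d) for g, d in shared.items()})
# -> {'0xf3077': ['05:0', '06:0']}
```

Two devices with BARs in one page cannot both be mapped there with
their own MFNs, which would explain the "not permitted" rejection.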

3. Recompiled the kernel with DEBUG enabled for the xen_pciback driver
and played with the xl pci-assignable-XXX commands:
3.1 Confirmed that the DPC / AER errors happen only when xen_pciback
attempts to seize or release the device.
3.1.1 They only happen on the first add / remove operation each.
3.2 There is still an 'MSI-X preparation failed' message later on, but
otherwise adding / removing the device appears to succeed after the
first attempt.
3.3 Not necessarily related, but the DPC / AER log looks similar to
this report [1].


[1]: https://patchwork.kernel.org/project/linux-pci/patch/20220127025418.1989642-1-kai.heng.feng@canonical.com/#24713767
PS: Attempting to fix the line-wrapping issue in the quote below...
I have no idea what happened to the formatting.

On Sun, Jul 3, 2022 at 1:43 AM G.R. <firemeteor@users.sourceforge.net> wrote:
>
> Hi everybody,
>
> I run into problems passing through a SN570 NVME SSD to a HVM guest.
> So far I have no idea if the problem is with this specific SSD or with
> the CPU + motherboard combination or the SW stack.
> Looking for some suggestions on troubleshooting.
>
> List of build info:
> CPU+motherboard: E-2146G + Gigabyte C246N-WU2
> XEN version: 4.14.3
> Dom0: Linux Kernel 5.10 (built from Debian 11.2 kernel source package)
> The SN570 SSD sits here in the PCI tree:
>            +-1d.0-[05]----00.0  Sandisk Corp Device 501a
>
> Syndromes observed:
> With ASPM enabled, pciback has problem seizing the device.
>
> Jul  2 00:36:54 gaia kernel: [    1.648270] pciback 0000:05:00.0: xen_pciback: seizing device
> ...
> Jul  2 00:36:54 gaia kernel: [    1.768646] pcieport 0000:00:1d.0: AER: enabled with IRQ 150
> Jul  2 00:36:54 gaia kernel: [    1.768716] pcieport 0000:00:1d.0: DPC: enabled with IRQ 150
> Jul  2 00:36:54 gaia kernel: [    1.768717] pcieport 0000:00:1d.0: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
> ...
> Jul  2 00:36:54 gaia kernel: [    1.770039] xen: registering gsi 16 triggering 0 polarity 1
> Jul  2 00:36:54 gaia kernel: [    1.770041] Already setup the GSI :16
> Jul  2 00:36:54 gaia kernel: [    1.770314] pcieport 0000:00:1d.0: DPC: containment event, status:0x1f11 source:0x0000
> Jul  2 00:36:54 gaia kernel: [    1.770315] pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
> Jul  2 00:36:54 gaia kernel: [    1.770320] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
> Jul  2 00:36:54 gaia kernel: [    1.770371] pcieport 0000:00:1d.0: device [8086:a330] error status/mask=00200000/00010000
> Jul  2 00:36:54 gaia kernel: [    1.770413] pcieport 0000:00:1d.0: [21] ACSViol                (First)
> Jul  2 00:36:54 gaia kernel: [    1.770466] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
> Jul  2 00:36:54 gaia kernel: [    1.920195] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
> Jul  2 00:36:54 gaia kernel: [    1.920260] pcieport 0000:00:1d.0: AER: device recovery successful
> Jul  2 00:36:54 gaia kernel: [    1.920263] pcieport 0000:00:1d.0: DPC: containment event, status:0x1f01 source:0x0000
> Jul  2 00:36:54 gaia kernel: [    1.920264] pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
> Jul  2 00:36:54 gaia kernel: [    1.920267] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
> Jul  2 00:36:54 gaia kernel: [    1.938406] xen: registering gsi 16 triggering 0 polarity 1
> Jul  2 00:36:54 gaia kernel: [    1.938408] Already setup the GSI :16
> Jul  2 00:36:54 gaia kernel: [    1.938666] xen_pciback: backend is vpci
> ...
> Jul  2 00:43:48 gaia kernel: [  420.231955] pcieport 0000:00:1d.0: DPC: containment event, status:0x1f01 source:0x0000
> Jul  2 00:43:48 gaia kernel: [  420.231961] pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
> Jul  2 00:43:48 gaia kernel: [  420.231993] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
> Jul  2 00:43:48 gaia kernel: [  420.235775] pcieport 0000:00:1d.0: device [8086:a330] error status/mask=00100000/00010000
> Jul  2 00:43:48 gaia kernel: [  420.235779] pcieport 0000:00:1d.0: [20] UnsupReq               (First)
> Jul  2 00:43:48 gaia kernel: [  420.235783] pcieport 0000:00:1d.0: AER:   TLP Header: 34000000 05000010 00000000 88458845
> Jul  2 00:43:48 gaia kernel: [  420.235819] pci 0000:05:00.0: AER: can't recover (no error_detected callback)
> Jul  2 00:43:48 gaia kernel: [  420.384349] pcieport 0000:00:1d.0: AER: device recovery successful
> ... // The following might relate to an attempt to assign the device to guest, not very sure...
> Jul  2 00:46:06 gaia kernel: [  559.147333] pciback 0000:05:00.0: xen_pciback: seizing device
> Jul  2 00:46:06 gaia kernel: [  559.147435] pciback 0000:05:00.0: enabling device (0000 -> 0002)
> Jul  2 00:46:06 gaia kernel: [  559.147508] xen: registering gsi 16 triggering 0 polarity 1
> Jul  2 00:46:06 gaia kernel: [  559.147511] Already setup the GSI :16
> Jul  2 00:46:06 gaia kernel: [  559.147558] pciback 0000:05:00.0: xen_pciback: MSI-X preparation failed (-6)
>
>
> With pcie_aspm=off, the error log related to pciback goes away.
> But I suspect there are still some problems hidden -- since I don't
> see any AER enabled messages so errors may be hidden.
> I have the xen_pciback built directly into the kernel and assigned the
> SSD to it in the kernel command-line.
> However, the result from pci-assignable-xxx commands are not very consistent:
>
> root@gaia:~# xl pci-assignable-list
> 0000:00:17.0
> 0000:05:00.0
> root@gaia:~# xl pci-assignable-remove 05:00.0
> libxl: error: libxl_pci.c:853:libxl__device_pci_assignable_remove: failed to de-quarantine 0000:05:00.0 <===== Here!!!
> root@gaia:~# xl pci-assignable-add 05:00.0
> libxl: warning: libxl_pci.c:794:libxl__device_pci_assignable_add: 0000:05:00.0 already assigned to pciback <==== Here!!!
> root@gaia:~# xl pci-assignable-remove 05:00.0
> root@gaia:~# xl pci-assignable-list
> 0000:00:17.0
> root@gaia:~# xl pci-assignable-add 05:00.0
> libxl: warning: libxl_pci.c:814:libxl__device_pci_assignable_add: 0000:05:00.0 not bound to a driver, will not be rebound.
> root@gaia:~# xl pci-assignable-list
> 0000:00:17.0
> 0000:05:00.0
>
>
> After the 'xl pci-assignable-list' appears to be self-consistent, creating VM with the SSD assigned still leads to a guest crash:
> From qemu log:
> [00:06.0] xen_pt_region_update: Error: create new mem mapping failed! (err: 1)
> qemu-system-i386: terminating on signal 1 from pid 1192 (xl)
>
> From the 'xl dmesg' output:
> (XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted
> (XEN) domain_crash called from p2m.c:1301
> (XEN) Domain 1 reported crashed by domain 0 on cpu#4:
> (XEN) memory_map:fail: dom1 gfn=f3078 mfn=a2504 nr=1 ret:-1
>
>
> Which of the three syndromes are more fundamental?
> 1. The DPC / AER error log
> 2. The inconsistency in 'xl pci-assignable-list' state tracking
> 3. The GFN mapping failure on guest setup
>
> Any suggestions for the next step?
>
>
> Thanks,
> G.R.

[-- Attachment #2: xldmesg_sn570_pt_fail.log --]
[-- Type: text/x-log, Size: 3016 bytes --]

(XEN) HVM d1v0 save: CPU
(XEN) HVM d1v1 save: CPU
(XEN) HVM d1 save: PIC
(XEN) HVM d1 save: IOAPIC
(XEN) HVM d1v0 save: LAPIC
(XEN) HVM d1v1 save: LAPIC
(XEN) HVM d1v0 save: LAPIC_REGS
(XEN) HVM d1v1 save: LAPIC_REGS
(XEN) HVM d1 save: PCI_IRQ
(XEN) HVM d1 save: ISA_IRQ
(XEN) HVM d1 save: PCI_LINK
(XEN) HVM d1 save: PIT
(XEN) HVM d1 save: RTC
(XEN) HVM d1 save: HPET
(XEN) HVM d1 save: PMTIMER
(XEN) HVM d1v0 save: MTRR
(XEN) HVM d1v1 save: MTRR
(XEN) HVM d1 save: VIRIDIAN_DOMAIN
(XEN) HVM d1v0 save: CPU_XSAVE
(XEN) HVM d1v1 save: CPU_XSAVE
(XEN) HVM d1v0 save: VIRIDIAN_VCPU
(XEN) HVM d1v1 save: VIRIDIAN_VCPU
(XEN) HVM d1v0 save: VMCE_VCPU
(XEN) HVM d1v1 save: VMCE_VCPU
(XEN) HVM d1v0 save: TSC_ADJUST
(XEN) HVM d1v1 save: TSC_ADJUST
(XEN) HVM d1v0 save: CPU_MSR
(XEN) HVM d1v1 save: CPU_MSR
(XEN) HVM1 restore: CPU 0
(d1) HVM Loader
(d1) Detected Xen v4.14.3
(d1) Xenbus rings @0xfeffc000, event channel 1
(d1) System requested SeaBIOS
(d1) CPU speed is 3505 MHz
(d1) Relocating guest memory for lowmem MMIO space disabled
(d1) PCI-ISA link 0 routed to IRQ5
(d1) PCI-ISA link 1 routed to IRQ10
(d1) PCI-ISA link 2 routed to IRQ11
(d1) PCI-ISA link 3 routed to IRQ5
(d1) pci dev 01:3 INTA->IRQ10
(d1) pci dev 02:0 INTA->IRQ11
(d1) pci dev 04:0 INTA->IRQ5
(d1) pci dev 05:0 INTA->IRQ10
(d1) pci dev 06:0 INTA->IRQ11
(d1) RAM in high memory; setting high_mem resource base to 40f800000
(d1) pci dev 03:0 bar 10 size 002000000: 0f0000008
(d1) pci dev 02:0 bar 14 size 001000000: 0f2000008
(d1) pci dev 04:0 bar 30 size 000040000: 0f3000000
(d1) pci dev 04:0 bar 10 size 000020000: 0f3040000
(d1) pci dev 03:0 bar 30 size 000010000: 0f3060000
(d1) pci dev 06:0 bar 10 size 000004000: 0f3070004
(d1) pci dev 05:0 bar 10 size 000002000: 0f3074000
(d1) pci dev 03:0 bar 14 size 000001000: 0f3076000
(d1) pci dev 05:0 bar 24 size 000000800: 0f3077000
(d1) pci dev 02:0 bar 10 size 000000100: 00000c001
(d1) pci dev 05:0 bar 14 size 000000100: 0f3077800
(d1) pci dev 06:0 bar 20 size 000000100: 0f3077904
(d1) pci dev 04:0 bar 14 size 000000040: 00000c101
(d1) pci dev 05:0 bar 20 size 000000020: 00000c141
(d1) pci dev 01:1 bar 20 size 000000010: 00000c161
(d1) pci dev 05:0 bar 18 size 000000008: 00000c171
(d1) pci dev 05:0 bar 1c size 000000004: 00000c179
(XEN) memory_map:add: dom1 gfn=f3074 mfn=a2610 nr=2
(XEN) memory_map:add: dom1 gfn=f3077 mfn=a2615 nr=1
(XEN) memory_map:add: dom1 gfn=f3078 mfn=a2616 nr=1
(XEN) ioport_map:add: dom1 gport=c140 mport=4060 nr=20
(XEN) ioport_map:add: dom1 gport=c170 mport=4090 nr=8
(XEN) ioport_map:add: dom1 gport=c178 mport=4080 nr=4
(XEN) memory_map:add: dom1 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom1 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom1 gfn=f3078 mfn=a2504 nr=1
(XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted
(XEN) domain_crash called from p2m.c:1301
(XEN) Domain 1 reported crashed by domain 0 on cpu#2:
(XEN) memory_map:fail: dom1 gfn=f3078 mfn=a2504 nr=1 ret:-1
(XEN) memory_map:remove: dom1 gfn=f3078 mfn=a2504 nr=1


[-- Attachment #3: pciback_dbg_xl-pci_assignable_XXX.log --]
[-- Type: text/x-log, Size: 6680 bytes --]

root@gaia:~# xl pci-assignable-add 05:00.0
libxl: warning: libxl_pci.c:814:libxl__device_pci_assignable_add: 0000:05:00.0 not bound to a driver, will not be rebound.

[  323.448115] xen_pciback: wants to seize 0000:05:00.0
[  323.448136] pciback 0000:05:00.0: xen_pciback: probing...
[  323.448137] pciback 0000:05:00.0: xen_pciback: seizing device
[  323.448162] pciback 0000:05:00.0: xen_pciback: pcistub_device_alloc
[  323.448162] pciback 0000:05:00.0: xen_pciback: initializing...
[  323.448163] pciback 0000:05:00.0: xen_pciback: initializing config
[  323.448344] pciback 0000:05:00.0: xen_pciback: enabling device
[  323.448425] xen: registering gsi 16 triggering 0 polarity 1
[  323.448428] Already setup the GSI :16
[  323.448497] pciback 0000:05:00.0: xen_pciback: save state of device
[  323.448642] pciback 0000:05:00.0: xen_pciback: resetting (FLR, D3, etc) the device
[  323.448707] pcieport 0000:00:1d.0: DPC: containment event, status:0x1f11 source:0x0000
[  323.448730] pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
[  323.448760] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[  323.448786] pcieport 0000:00:1d.0:   device [8086:a330] error status/mask=00200000/00010000
[  323.448813] pcieport 0000:00:1d.0:    [21] ACSViol                (First)
[  324.690979] pciback 0000:05:00.0: not ready 1023ms after FLR; waiting
[  325.730706] pciback 0000:05:00.0: not ready 2047ms after FLR; waiting
[  327.997638] pciback 0000:05:00.0: not ready 4095ms after FLR; waiting
[  332.264251] pciback 0000:05:00.0: not ready 8191ms after FLR; waiting
[  340.584320] pciback 0000:05:00.0: not ready 16383ms after FLR; waiting
[  357.010896] pciback 0000:05:00.0: not ready 32767ms after FLR; waiting
[  391.143951] pciback 0000:05:00.0: not ready 65535ms after FLR; giving up
[  392.249252] pciback 0000:05:00.0: xen_pciback: reset device
[  392.249392] pciback 0000:05:00.0: xen_pciback: xen_pcibk_error_detected(bus:5,devfn:0)
[  392.249393] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
[  392.397074] pciback 0000:05:00.0: xen_pciback: xen_pcibk_error_resume(bus:5,devfn:0)
[  392.397080] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
[  392.397284] pcieport 0000:00:1d.0: AER: device recovery successful

libxl: error: libxl_pci.c:835:libxl__device_pci_assignable_add: failed to quarantine 0000:05:00.0

root@gaia:~# xl pci-assignable-remove 05:00.0
libxl: error: libxl_pci.c:853:libxl__device_pci_assignable_remove: failed to de-quarantine 0000:05:00.0
root@gaia:~# xl pci-assignable-add 05:00.0
libxl: warning: libxl_pci.c:794:libxl__device_pci_assignable_add: 0000:05:00.0 already assigned to pciback
root@gaia:~# xl pci-assignable-remove 05:00.0
[  603.928039] pciback 0000:05:00.0: xen_pciback: removing
[  603.928041] pciback 0000:05:00.0: xen_pciback: found device to remove 
[  603.928042] pciback 0000:05:00.0: xen_pciback: pcistub_device_release
[  604.033372] pcieport 0000:00:1d.0: DPC: containment event, status:0x1f11 source:0x0000
[  604.033512] pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
[  604.033631] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[  604.033758] pcieport 0000:00:1d.0:   device [8086:a330] error status/mask=00100000/00010000
[  604.033856] pcieport 0000:00:1d.0:    [20] UnsupReq               (First)
[  604.033939] pcieport 0000:00:1d.0: AER:   TLP Header: 34000000 05000010 00000000 88458845
[  604.034059] pci 0000:05:00.0: AER: can't recover (no error_detected callback)
[  604.034421] xen_pciback: removed 0000:05:00.0 from seize list
[  604.182597] pcieport 0000:00:1d.0: AER: device recovery successful

root@gaia:~# xl pci-assignable-add 05:00.0
libxl: warning: libxl_pci.c:814:libxl__device_pci_assignable_add: 0000:05:00.0 not bound to a driver, will not be rebound.
[  667.582051] xen_pciback: wants to seize 0000:05:00.0
[  667.582130] pciback 0000:05:00.0: xen_pciback: probing...
[  667.582134] pciback 0000:05:00.0: xen_pciback: seizing device
[  667.582228] pciback 0000:05:00.0: xen_pciback: pcistub_device_alloc
[  667.582231] pciback 0000:05:00.0: xen_pciback: initializing...
[  667.582235] pciback 0000:05:00.0: xen_pciback: initializing config
[  667.582548] pciback 0000:05:00.0: xen_pciback: enabling device
[  667.582599] pciback 0000:05:00.0: enabling device (0000 -> 0002)
[  667.582912] xen: registering gsi 16 triggering 0 polarity 1
[  667.582923] Already setup the GSI :16
[  667.583061] pciback 0000:05:00.0: xen_pciback: MSI-X preparation failed (-6)
[  667.583148] pciback 0000:05:00.0: xen_pciback: save state of device
[  667.583569] pciback 0000:05:00.0: xen_pciback: resetting (FLR, D3, etc) the device
[  667.689656] pciback 0000:05:00.0: xen_pciback: reset device

root@gaia:~# xl pci-assignable-remove 05:00.0
[  720.957988] pciback 0000:05:00.0: xen_pciback: removing
[  720.957996] pciback 0000:05:00.0: xen_pciback: found device to remove 
[  720.957999] pciback 0000:05:00.0: xen_pciback: pcistub_device_release
[  721.065222] pciback 0000:05:00.0: xen_pciback: MSI-X release failed (-16)
[  721.065667] xen_pciback: removed 0000:05:00.0 from seize list

root@gaia:~# xl pci-assignable-add 05:00.0
libxl: warning: libxl_pci.c:814:libxl__device_pci_assignable_add: 0000:05:00.0 not bound to a driver, will not be rebound.

[  763.888631] xen_pciback: wants to seize 0000:05:00.0
[  763.888690] pciback 0000:05:00.0: xen_pciback: probing...
[  763.888691] pciback 0000:05:00.0: xen_pciback: seizing device
[  763.888716] pciback 0000:05:00.0: xen_pciback: pcistub_device_alloc
[  763.888717] pciback 0000:05:00.0: xen_pciback: initializing...
[  763.888717] pciback 0000:05:00.0: xen_pciback: initializing config
[  763.888804] pciback 0000:05:00.0: xen_pciback: enabling device
[  763.888885] xen: registering gsi 16 triggering 0 polarity 1
[  763.888889] Already setup the GSI :16
[  763.888949] pciback 0000:05:00.0: xen_pciback: MSI-X preparation failed (-6)
[  763.888977] pciback 0000:05:00.0: xen_pciback: save state of device
[  763.889126] pciback 0000:05:00.0: xen_pciback: resetting (FLR, D3, etc) the device
[  763.994206] pciback 0000:05:00.0: xen_pciback: reset device

root@gaia:~# xl pci-assignable-remove 05:00.0
[  819.491000] pciback 0000:05:00.0: xen_pciback: removing
[  819.491002] pciback 0000:05:00.0: xen_pciback: found device to remove 
[  819.491003] pciback 0000:05:00.0: xen_pciback: pcistub_device_release
[  819.596113] pciback 0000:05:00.0: xen_pciback: MSI-X release failed (-16)
[  819.596466] xen_pciback: removed 0000:05:00.0 from seize list



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-02 17:43 PCI pass-through problem for SN570 NVME SSD G.R.
  2022-07-04  6:37 ` G.R.
@ 2022-07-04  9:50 ` Roger Pau Monné
  2022-07-04 10:34   ` Jan Beulich
  2022-07-04 11:34   ` G.R.
  1 sibling, 2 replies; 31+ messages in thread
From: Roger Pau Monné @ 2022-07-04  9:50 UTC (permalink / raw)
  To: G.R.; +Cc: xen-users, xen-devel

On Sun, Jul 03, 2022 at 01:43:11AM +0800, G.R. wrote:
> Hi everybody,
> 
> I run into problems passing through a SN570 NVME SSD to a HVM guest.
> So far I have no idea if the problem is with this specific SSD or with
> the CPU + motherboard combination or the SW stack.
> Looking for some suggestions on troubleshooting.
> 
> List of build info:
> CPU+motherboard: E-2146G + Gigabyte C246N-WU2
> XEN version: 4.14.3

Are you using a debug build of Xen? (if not it would be helpful to do
so).

> Dom0: Linux Kernel 5.10 (built from Debian 11.2 kernel source package)
> The SN570 SSD sits here in the PCI tree:
>            +-1d.0-[05]----00.0  Sandisk Corp Device 501a

It could be helpful to post the `lspci -vvv` output so we can see the
capabilities of the device.
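A concrete form of that request (BDFs taken from the thread; the root
port is included because the DPC/AER messages come from it):

```shell
# Full capability dump for the SSD and for its upstream root port.
lspci -vvv -s 0000:05:00.0
lspci -vvv -s 0000:00:1d.0
```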

> Syndromes observed:
> With ASPM enabled, pciback has problem seizing the device.
> 
> Jul  2 00:36:54 gaia kernel: [    1.648270] pciback 0000:05:00.0:
> xen_pciback: seizing device
> ...
> Jul  2 00:36:54 gaia kernel: [    1.768646] pcieport 0000:00:1d.0:
> AER: enabled with IRQ 150
> Jul  2 00:36:54 gaia kernel: [    1.768716] pcieport 0000:00:1d.0:
> DPC: enabled with IRQ 150
> Jul  2 00:36:54 gaia kernel: [    1.768717] pcieport 0000:00:1d.0:
> DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+
> SwTrigger+ RP PIO Log 4, DL_ActiveErr+

Is there a device reset involved here?  It's possible the device
doesn't reset properly and hence the Uncorrectable Error Status
Register ends up with inconsistent bits set.
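One way to peek at that register on the root port between attempts is
setpci; this is a sketch assuming a pciutils recent enough to know the
ECAP_AER capability name (otherwise the extended-capability offset has
to be located manually via lspci -vvv):

```shell
# Uncorrectable Error Status register: AER extended capability + 0x04.
setpci -s 00:1d.0 ECAP_AER+0x04.l
# The bits are RW1C, so writing the value read back clears them, e.g.:
#   setpci -s 00:1d.0 ECAP_AER+0x04.l=00200000
```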

> ...
> Jul  2 00:36:54 gaia kernel: [    1.770039] xen: registering gsi 16
> triggering 0 polarity 1
> Jul  2 00:36:54 gaia kernel: [    1.770041] Already setup the GSI :16
> Jul  2 00:36:54 gaia kernel: [    1.770314] pcieport 0000:00:1d.0:
> DPC: containment event, status:0x1f11 source:0x0000
> Jul  2 00:36:54 gaia kernel: [    1.770315] pcieport 0000:00:1d.0:
> DPC: unmasked uncorrectable error detected
> Jul  2 00:36:54 gaia kernel: [    1.770320] pcieport 0000:00:1d.0:
> PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction
> Layer, (Receiver ID)
> Jul  2 00:36:54 gaia kernel: [    1.770371] pcieport 0000:00:1d.0:
> device [8086:a330] error status/mask=00200000/00010000
> Jul  2 00:36:54 gaia kernel: [    1.770413] pcieport 0000:00:1d.0:
> [21] ACSViol                (First)
> Jul  2 00:36:54 gaia kernel: [    1.770466] pciback 0000:05:00.0:
> xen_pciback: device is not found/assigned
> Jul  2 00:36:54 gaia kernel: [    1.920195] pciback 0000:05:00.0:
> xen_pciback: device is not found/assigned
> Jul  2 00:36:54 gaia kernel: [    1.920260] pcieport 0000:00:1d.0:
> AER: device recovery successful
> Jul  2 00:36:54 gaia kernel: [    1.920263] pcieport 0000:00:1d.0:
> DPC: containment event, status:0x1f01 source:0x0000
> Jul  2 00:36:54 gaia kernel: [    1.920264] pcieport 0000:00:1d.0:
> DPC: unmasked uncorrectable error detected
> Jul  2 00:36:54 gaia kernel: [    1.920267] pciback 0000:05:00.0:
> xen_pciback: device is not found/assigned

That's from a different device (05:00.0).

> Jul  2 00:36:54 gaia kernel: [    1.938406] xen: registering gsi 16
> triggering 0 polarity 1
> Jul  2 00:36:54 gaia kernel: [    1.938408] Already setup the GSI :16
> Jul  2 00:36:54 gaia kernel: [    1.938666] xen_pciback: backend is vpci
> ...
> Jul  2 00:43:48 gaia kernel: [  420.231955] pcieport 0000:00:1d.0:
> DPC: containment event, status:0x1f01 source:0x0000
> Jul  2 00:43:48 gaia kernel: [  420.231961] pcieport 0000:00:1d.0:
> DPC: unmasked uncorrectable error detected
> Jul  2 00:43:48 gaia kernel: [  420.231993] pcieport 0000:00:1d.0:
> PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction
> Layer, (Requester ID)
> Jul  2 00:43:48 gaia kernel: [  420.235775] pcieport 0000:00:1d.0:
> device [8086:a330] error status/mask=00100000/00010000
> Jul  2 00:43:48 gaia kernel: [  420.235779] pcieport 0000:00:1d.0:
> [20] UnsupReq               (First)
> Jul  2 00:43:48 gaia kernel: [  420.235783] pcieport 0000:00:1d.0:
> AER:   TLP Header: 34000000 05000010 00000000 88458845
> Jul  2 00:43:48 gaia kernel: [  420.235819] pci 0000:05:00.0: AER:
> can't recover (no error_detected callback)
> Jul  2 00:43:48 gaia kernel: [  420.384349] pcieport 0000:00:1d.0:
> AER: device recovery successful
> ... // The following might relate to an attempt to assign the device
> to guest, not very sure...
> Jul  2 00:46:06 gaia kernel: [  559.147333] pciback 0000:05:00.0:
> xen_pciback: seizing device
> Jul  2 00:46:06 gaia kernel: [  559.147435] pciback 0000:05:00.0:
> enabling device (0000 -> 0002)
> Jul  2 00:46:06 gaia kernel: [  559.147508] xen: registering gsi 16
> triggering 0 polarity 1
> Jul  2 00:46:06 gaia kernel: [  559.147511] Already setup the GSI :16
> Jul  2 00:46:06 gaia kernel: [  559.147558] pciback 0000:05:00.0:
> xen_pciback: MSI-X preparation failed (-6)
> 
> 
> With pcie_aspm=off, the error log related to pciback goes away.
> But I suspect there are still some problems hidden -- since I don't
> see any AER enabled messages so errors may be hidden.
> I have the xen_pciback built directly into the kernel and assigned the
> SSD to it in the kernel command-line.
> However, the result from pci-assignable-xxx commands are not very consistent:
> 
> root@gaia:~# xl pci-assignable-list
> 0000:00:17.0
> 0000:05:00.0
> root@gaia:~# xl pci-assignable-remove 05:00.0
> libxl: error: libxl_pci.c:853:libxl__device_pci_assignable_remove:
> failed to de-quarantine 0000:05:00.0 <===== Here!!!
> root@gaia:~# xl pci-assignable-add 05:00.0
> libxl: warning: libxl_pci.c:794:libxl__device_pci_assignable_add:
> 0000:05:00.0 already assigned to pciback <==== Here!!!
> root@gaia:~# xl pci-assignable-remove 05:00.0
> root@gaia:~# xl pci-assignable-list
> 0000:00:17.0
> root@gaia:~# xl pci-assignable-add 05:00.0
> libxl: warning: libxl_pci.c:814:libxl__device_pci_assignable_add:
> 0000:05:00.0 not bound to a driver, will not be rebound.
> root@gaia:~# xl pci-assignable-list
> 0000:00:17.0
> 0000:05:00.0

I'm confused: the log above is mostly about the device at
0000:00:1d.0, while here you only have 0000:00:17.0 and 0000:05:00.0.
I assume 0000:00:1d.0 never appears in the output of `xl
pci-assignable-list`?

Also, you seem to be trying to assign 0000:05:00.0, which is not the
device that's giving the errors above. From the text above I had
assumed 0000:00:1d.0 was the NVMe that you wanted to assign to a
guest.

Could you attempt the same with only the problematic device marked as
assignable? (Having other devices in the list just makes the output
confusing.)

> 
> After the 'xl pci-assignable-list' appears to be self-consistent,
> creating a VM with the SSD assigned still leads to a guest crash:
> From qemu log:
> [00:06.0] xen_pt_region_update: Error: create new mem mapping failed! (err: 1)
> qemu-system-i386: terminating on signal 1 from pid 1192 (xl)
> 
> From the 'xl dmesg' output:
> (XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted

Seems like QEMU is attempting to remap a p2m_mmio_direct region.

Can you paste the full output of `xl dmesg`? (as that will contain the
memory map).
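Roger's "remap" reading can be checked mechanically once a loglvl=all `xl dmesg` is available: collect the memory_map:add lines and flag any guest frame that was asked to map to two different host frames. Below is a minimal sketch (not an existing tool), fed with the memory_map lines that appear later in this thread:

```python
import re
from collections import defaultdict

# memory_map:add lines copied from the loglvl=all output quoted later
# in this thread.
LOG = """\
(XEN) memory_map:add: dom1 gfn=f3074 mfn=a2610 nr=2
(XEN) memory_map:add: dom1 gfn=f3077 mfn=a2615 nr=1
(XEN) memory_map:add: dom1 gfn=f3078 mfn=a2616 nr=1
(XEN) memory_map:add: dom1 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom1 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom1 gfn=f3078 mfn=a2504 nr=1
"""

def conflicting_gfns(log):
    """Map each guest frame (gfn) to the set of host frames (mfns) it
    was asked to map to; a gfn with more than one mfn is a remap."""
    seen = defaultdict(set)
    for m in re.finditer(r"gfn=([0-9a-f]+) mfn=([0-9a-f]+) nr=([0-9a-f]+)", log):
        gfn, mfn, nr = (int(x, 16) for x in m.groups())
        for i in range(nr):
            seen[gfn + i].add(mfn + i)
    return {hex(g): sorted(map(hex, s)) for g, s in seen.items() if len(s) > 1}

print(conflicting_gfns(LOG))  # -> {'0xf3078': ['0xa2504', '0xa2616']}
```

The single conflicting frame it reports is exactly the GFN 0xf3078 that Xen rejects with "not permitted".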

Would also be helpful if you could get the RMRR regions from that
box. Booting with `iommu=verbose` on the Xen command line should print
those.

> (XEN) domain_crash called from p2m.c:1301
> (XEN) Domain 1 reported crashed by domain 0 on cpu#4:
> (XEN) memory_map:fail: dom1 gfn=f3078 mfn=a2504 nr=1 ret:-1
> 
> 
> Which of the three syndromes is more fundamental?
> 1. The DPC / AER error log
> 2. The inconsistency in 'xl pci-assignable-list' state tracking
> 3. The GFN mapping failure on guest setup
> 
> Any suggestions for the next step?

I'm slightly confused by the fact that the DPC / AER errors seem to be
from a device that's different from the one you attempt to pass through
(0000:00:1d.0 vs 0000:05:00.0).

Might be helpful to start by only attempting to pass through the device
you are having issues with, and leaving any other device out.

Roger.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04  6:37 ` G.R.
@ 2022-07-04 10:31   ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2022-07-04 10:31 UTC (permalink / raw)
  To: G.R.; +Cc: xen-devel

(Note: please don't cross-post; removing xen-users@.)

On 04.07.2022 08:37, G.R. wrote:
> Updating with some findings from extra triage effort...
> Detailed logs can be found in the attachments.
> 1. Confirmed the stock Debian 11.2 kernel (5.10) shows the same syndrome.
> 2. With loglvl=all, it reveals why the mapping failure happens; it looks
> like it comes from a duplicated mapping:
> (XEN) memory_map:add: dom1 gfn=f3074 mfn=a2610 nr=2
> (XEN) memory_map:add: dom1 gfn=f3077 mfn=a2615 nr=1
> (XEN) memory_map:add: dom1 gfn=f3078 mfn=a2616 nr=1 <===========Here
> (XEN) ioport_map:add: dom1 gport=c140 mport=4060 nr=20
> (XEN) ioport_map:add: dom1 gport=c170 mport=4090 nr=8
> (XEN) ioport_map:add: dom1 gport=c178 mport=4080 nr=4
> (XEN) memory_map:add: dom1 gfn=f3070 mfn=a2500 nr=2
> (XEN) memory_map:add: dom1 gfn=f3073 mfn=a2503 nr=1
> (XEN) memory_map:add: dom1 gfn=f3078 mfn=a2504 nr=1 <===========Here
> (XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted
> (XEN) domain_crash called from p2m.c:1301
> (XEN) Domain 1 reported crashed by domain 0 on cpu#2:
> (XEN) memory_map:fail: dom1 gfn=f3078 mfn=a2504 nr=1 ret:-1
> (XEN) memory_map:remove: dom1 gfn=f3078 mfn=a2504 nr=1

Neither here nor in your initial mail have I spotted information
on the BARs the device has. The above makes me wonder whether it
has two BARs each covering less than 4k and both sharing a page.
Or wait - the hvmloader output actually has some useful data:

(d1) pci dev 05:0 bar 24 size 000000800: 0f3077000
...
(d1) pci dev 05:0 bar 14 size 000000100: 0f3077800

The sharing is apparently introduced by hvmloader, and might not
have been deemed a problem there, even though it is generally
advisable (for security reasons) or even necessary (for
functionality) for the BARs of passed-through devices to all live
in distinct (4k) pages. However - while hvmloader has no knowledge
of the host addresses occupied by the BARs (so it has no indication
to place them in separate pages), it should still not put any two
BARs in the same (guest) page. Even then, as the P2M mapping occurs
at 4k granularity, it would further need to know the host's
offset-into-page value to correctly calculate the guest address.
IOW this would in addition require the host to put all BARs at the
beginning of 4k pages (which may well be the case for you already).

(d1) pci dev 06:0 bar 20 size 000000100: 0f3077904

would cause the same issue (afaict), unless the host BAR shared a
page with the earlier BAR of 05:0.
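The page-sharing suspicion can be checked arithmetically: two BARs collide exactly when their address ranges intersect the same 4k frame. Here is a minimal sketch (a hypothetical helper, not hvmloader code), fed with the guest BAR placements from the hvmloader log above:

```python
PAGE_SHIFT = 12  # 4 KiB pages, the granularity at which P2M maps

def pages_of(base, size):
    """Return the set of 4k page frame numbers a BAR occupies."""
    first = base >> PAGE_SHIFT
    last = (base + size - 1) >> PAGE_SHIFT
    return set(range(first, last + 1))

def shared_pages(bars):
    """bars: dict name -> (base, size). Yield pairs of BARs whose
    MMIO ranges touch at least one common 4k page."""
    names = sorted(bars)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            common = pages_of(*bars[a]) & pages_of(*bars[b])
            if common:
                yield a, b, sorted(common)

# Guest BAR placement taken from the hvmloader log above
bars = {
    "05:0 bar 14": (0x0F3077800, 0x100),
    "05:0 bar 24": (0x0F3077000, 0x800),
}
for a, b, pages in shared_pages(bars):
    print(f"{a} and {b} share page(s) {[hex(p << PAGE_SHIFT) for p in pages]}")
```

For the two BARs of 05:0 this reports a single shared page at 0xf3077000, matching Jan's reading of the hvmloader output.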

> 3. Recompiled the kernel with DEBUG enabled for the xen_pciback driver
> and played with the xl pci-assignable-XXX commands:
> 3.1 It's confirmed that the DPC / AER error log happens only when
> xen_pciback attempts to seize && release the device.
> 3.1.1 It only happens on each of the first add / remove operations.
> 3.2 There is still a 'MSI-X preparation failed' message later on, but
> otherwise adding / removing the device appears to succeed after
> the 1st attempt.
> 3.3 Not necessarily related, but the DPC / AER log looks similar to
> this report [1]

The only thing I can say here is that quite likely pciback needs
work to become up-to-date again with advanced feature handling
elsewhere in the kernel.

Jan

> [1]: https://patchwork.kernel.org/project/linux-pci/patch/20220127025418.1989642-1-kai.heng.feng@canonical.com/#24713767


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04  9:50 ` Roger Pau Monné
@ 2022-07-04 10:34   ` Jan Beulich
  2022-07-04 11:34   ` G.R.
  1 sibling, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2022-07-04 10:34 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, G.R.

On 04.07.2022 11:50, Roger Pau Monné wrote:
> On Sun, Jul 03, 2022 at 01:43:11AM +0800, G.R. wrote:
>> Hi everybody,
>>
>> I run into problems passing through a SN570 NVME SSD to a HVM guest.
>> So far I have no idea if the problem is with this specific SSD or with
>> the CPU + motherboard combination or the SW stack.
>> Looking for some suggestions on troubleshooting.
>>
>> List of build info:
>> CPU+motherboard: E-2146G + Gigabyte C246N-WU2
>> XEN version: 4.14.3
> 
> Are you using a debug build of Xen? (if not it would be helpful to do
> so).
> 
>> Dom0: Linux Kernel 5.10 (built from Debian 11.2 kernel source package)
>> The SN570 SSD sits here in the PCI tree:
>>            +-1d.0-[05]----00.0  Sandisk Corp Device 501a

As per this I understand that ...

> I'm slightly confused by the fact that the DPC / AER errors seem to be
> from a device that's different from the one you attempt to pass through
> (0000:00:1d.0 vs 0000:05:00.0).

... 00:1d.0 is the upstream bridge for 05:00.0.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04  9:50 ` Roger Pau Monné
  2022-07-04 10:34   ` Jan Beulich
@ 2022-07-04 11:34   ` G.R.
  2022-07-04 11:44     ` G.R.
  2022-07-04 13:09     ` Roger Pau Monné
  1 sibling, 2 replies; 31+ messages in thread
From: G.R. @ 2022-07-04 11:34 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-users, xen-devel

[-- Attachment #1: Type: text/plain, Size: 7118 bytes --]

On Mon, Jul 4, 2022 at 5:53 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Sun, Jul 03, 2022 at 01:43:11AM +0800, G.R. wrote:
> > Hi everybody,
> >
> > I run into problems passing through a SN570 NVME SSD to a HVM guest.
> > So far I have no idea if the problem is with this specific SSD or with
> > the CPU + motherboard combination or the SW stack.
> > Looking for some suggestions on troubleshooting.
> >
> > List of build info:
> > CPU+motherboard: E-2146G + Gigabyte C246N-WU2
> > XEN version: 4.14.3
>
> Are you using a debug build of Xen? (if not it would be helpful to do
> so).
It's a release build at the moment. I can switch to a debug build
later when my hands are free.
BTW, I made a DEBUG build of the xen_pciback driver to see how it plays
with the 'xl pci-assignable-xxx' commands.
You can find the results in my 2nd email in the chain.

>
> > Dom0: Linux Kernel 5.10 (built from Debian 11.2 kernel source package)
> > The SN570 SSD sits here in the PCI tree:
> >            +-1d.0-[05]----00.0  Sandisk Corp Device 501a
>
> Could be helpful to post the output with -vvv so we can see the
> capabilities of the device.
Sure, please find the -vvv output in the attachment.
The snippet above was just meant to indicate the connection in the PCI
tree, i.e. 05:00.0 is attached under 00:1d.0.

>
> > Syndromes observed:
> > With ASPM enabled, pciback has problem seizing the device.
> >
> > Jul  2 00:36:54 gaia kernel: [    1.648270] pciback 0000:05:00.0:
> > xen_pciback: seizing device
> > ...
> > Jul  2 00:36:54 gaia kernel: [    1.768646] pcieport 0000:00:1d.0:
> > AER: enabled with IRQ 150
> > Jul  2 00:36:54 gaia kernel: [    1.768716] pcieport 0000:00:1d.0:
> > DPC: enabled with IRQ 150
> > Jul  2 00:36:54 gaia kernel: [    1.768717] pcieport 0000:00:1d.0:
> > DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+
> > SwTrigger+ RP PIO Log 4, DL_ActiveErr+
>
> Is there a device reset involved here?  It's possible the device
> doesn't reset properly and hence the Uncorrectable Error Status
> Register ends up with inconsistent bits set.

xen_pciback appears to force an FLR whenever it attempts to seize a
capable device, as shown in pciback_dbg_xl-pci_assignable_XXX.log
attached to my 2nd mail:
[  323.448115] xen_pciback: wants to seize 0000:05:00.0
[  323.448136] pciback 0000:05:00.0: xen_pciback: probing...
[  323.448137] pciback 0000:05:00.0: xen_pciback: seizing device
[  323.448162] pciback 0000:05:00.0: xen_pciback: pcistub_device_alloc
[  323.448162] pciback 0000:05:00.0: xen_pciback: initializing...
[  323.448163] pciback 0000:05:00.0: xen_pciback: initializing config
[  323.448344] pciback 0000:05:00.0: xen_pciback: enabling device
[  323.448425] xen: registering gsi 16 triggering 0 polarity 1
[  323.448428] Already setup the GSI :16
[  323.448497] pciback 0000:05:00.0: xen_pciback: save state of device
[  323.448642] pciback 0000:05:00.0: xen_pciback: resetting (FLR, D3,
etc) the device
[  323.448707] pcieport 0000:00:1d.0: DPC: containment event,
status:0x1f11 source:0x0000
[  323.448730] pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
[  323.448760] pcieport 0000:00:1d.0: PCIe Bus Error:
severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver
ID)
[  323.448786] pcieport 0000:00:1d.0:   device [8086:a330] error
status/mask=00200000/00010000
[  323.448813] pcieport 0000:00:1d.0:    [21] ACSViol                (First)
[  324.690979] pciback 0000:05:00.0: not ready 1023ms after FLR;
waiting  <============ HERE
[  325.730706] pciback 0000:05:00.0: not ready 2047ms after FLR; waiting
[  327.997638] pciback 0000:05:00.0: not ready 4095ms after FLR; waiting
[  332.264251] pciback 0000:05:00.0: not ready 8191ms after FLR; waiting
[  340.584320] pciback 0000:05:00.0: not ready 16383ms after FLR;
waiting
[  357.010896] pciback 0000:05:00.0: not ready 32767ms after FLR; waiting
[  391.143951] pciback 0000:05:00.0: not ready 65535ms after FLR; giving up
[  392.249252] pciback 0000:05:00.0: xen_pciback: reset device
[  392.249392] pciback 0000:05:00.0: xen_pciback:
xen_pcibk_error_detected(bus:5,devfn:0)
[  392.249393] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
[  392.397074] pciback 0000:05:00.0: xen_pciback:
xen_pcibk_error_resume(bus:5,devfn:0)
[  392.397080] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
[  392.397284] pcieport 0000:00:1d.0: AER: device recovery successful
Note, I only see the FLR failure on the 1st attempt.
And my SATA controller, which doesn't support FLR, appears to pass
through just fine...
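For reference, whether an endpoint advertises FLR at all shows up in the DevCap block of `lspci -vvv` as `FLReset+`, while the `FLReset-` in the DevCtl line is merely the control bit that initiates a reset. Below is a minimal sketch (a hypothetical helper, not an existing tool) that makes that distinction; the sample lines are copied from the lspci attachment in this mail:

```python
import re

# Sample copied from the lspci -vvv attachment for 05:00.0 in this
# mail (PCI Express capability, DevCap/DevCtl lines).
SAMPLE = """\
Capabilities: [c0] Express (v2) Endpoint, MSI 00
\tDevCap:\tMaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 unlimited
\t\tExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
\tDevCtl:\tCorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
\t\tRlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
"""

def advertises_flr(lspci_text):
    """Return True if FLReset+ appears inside the DevCap block
    (capability advertisement), ignoring the DevCtl control bit."""
    in_devcap = False
    for line in lspci_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("DevCap:"):
            in_devcap = True
        elif re.match(r"Dev(Ctl|Sta):", stripped):
            in_devcap = False
        if in_devcap and "FLReset+" in line:
            return True
    return False

print(advertises_flr(SAMPLE))  # -> True: 05:00.0 claims FLR support
```

So the SN570 does claim FLR support, which makes the "not ready ... after FLR" timeouts look like a device-side reset problem rather than a missing capability.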

>
> > ...
> > Jul  2 00:36:54 gaia kernel: [    1.770039] xen: registering gsi 16
> > triggering 0 polarity 1
> > Jul  2 00:36:54 gaia kernel: [    1.770041] Already setup the GSI :16
> > Jul  2 00:36:54 gaia kernel: [    1.770314] pcieport 0000:00:1d.0:
> > DPC: containment event, status:0x1f11 source:0x0000
> > Jul  2 00:36:54 gaia kernel: [    1.770315] pcieport 0000:00:1d.0:
> > DPC: unmasked uncorrectable error detected
> > Jul  2 00:36:54 gaia kernel: [    1.770320] pcieport 0000:00:1d.0:
> > PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction
> > Layer, (Receiver ID)
> > Jul  2 00:36:54 gaia kernel: [    1.770371] pcieport 0000:00:1d.0:
> > device [8086:a330] error status/mask=00200000/00010000
> > Jul  2 00:36:54 gaia kernel: [    1.770413] pcieport 0000:00:1d.0:
> > [21] ACSViol                (First)
> > Jul  2 00:36:54 gaia kernel: [    1.770466] pciback 0000:05:00.0:
> > xen_pciback: device is not found/assigned
> > Jul  2 00:36:54 gaia kernel: [    1.920195] pciback 0000:05:00.0:
> > xen_pciback: device is not found/assigned
> > Jul  2 00:36:54 gaia kernel: [    1.920260] pcieport 0000:00:1d.0:
> > AER: device recovery successful
> > Jul  2 00:36:54 gaia kernel: [    1.920263] pcieport 0000:00:1d.0:
> > DPC: containment event, status:0x1f01 source:0x0000
> > Jul  2 00:36:54 gaia kernel: [    1.920264] pcieport 0000:00:1d.0:
> > DPC: unmasked uncorrectable error detected
> > Jul  2 00:36:54 gaia kernel: [    1.920267] pciback 0000:05:00.0:
> > xen_pciback: device is not found/assigned
>
> That's from a different device (05:00.0).
00:1d.0 is the bridge port that 05:00.0 attaches to.


> >
> > After the 'xl pci-assignable-list' appears to be self-consistent,
> > creating a VM with the SSD assigned still leads to a guest crash:
> > From qemu log:
> > [00:06.0] xen_pt_region_update: Error: create new mem mapping failed! (err: 1)
> > qemu-system-i386: terminating on signal 1 from pid 1192 (xl)
> >
> > From the 'xl dmesg' output:
> > (XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted
>
> Seems like QEMU is attempting to remap a p2m_mmio_direct region.
>
> Can you paste the full output of `xl dmesg`? (as that will contain the
> memory map).
Attached.

>
> Would also be helpful if you could get the RMRR regions from that
> box. Booting with `iommu=verbose` on the Xen command line should print
> those.
Coming in my next reply...

[-- Attachment #2: lspcivvv_cutdown.log --]
[-- Type: text/x-log, Size: 9299 bytes --]

00:1d.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #9 (rev f0) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 126
	IOMMU group: 10
	Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
	I/O behind bridge: 0000f000-00000fff [disabled]
	Memory behind bridge: a2600000-a26fffff [size=1M]
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [disabled]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16+ MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 256 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		LnkCap:	Port #9, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <1us, L1 <16us
			ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s (ok), Width x4 (ok)
			TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #12, PowerLimit 25.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet- LinkState+
		RootCap: CRSVisible-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range ABC, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS- LN System CLS Not Supported, TPHComp- ExtTPHComp- ARIFwd+
			 AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled, ARIFwd-
			 AtomicOpsCtl: ReqEn- EgressBlck-
		LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
			 EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee002b8  Data: 0000
	Capabilities: [90] Subsystem: Gigabyte Technology Co., Ltd Cannon Lake PCH PCI Express Root Port
	Capabilities: [a0] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
		RootCmd: CERptEn+ NFERptEn+ FERptEn+
		RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
			 FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
		ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
	Capabilities: [140 v1] Access Control Services
		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
	Capabilities: [150 v1] Precision Time Measurement
		PTMCap: Requester:- Responder:+ Root:+
		PTMClockGranularity: 4ns
		PTMControl: Enabled:+ RootSelected:+
		PTMEffectiveGranularity: Unknown
	Capabilities: [200 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=40us PortTPowerOnTime=44us
		L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
			   T_CommonMode=40us LTR1.2_Threshold=65536ns
		L1SubCtl2: T_PwrOn=44us
	Capabilities: [220 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn- PerformEqu-
		LaneErrStat: 0
	Capabilities: [250 v1] Downstream Port Containment
		DpcCap:	INT Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
		DpcCtl:	Trigger:1 Cmpl- INT+ ErrCor- PoisonedTLP- SwTrigger- DL_ActiveErr-
		DpcSta:	Trigger- Reason:00 INT- RPBusy- TriggerExt:00 RP PIO ErrPtr:1f
		Source:	0000
	Kernel driver in use: pcieport

05:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a (prog-if 02 [NVM Express])
	Subsystem: Sandisk Corp Device 501a
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 16
	NUMA node: 0
	IOMMU group: 13
	Region 0: Memory at a2600000 (64-bit, non-prefetchable) [size=16K]
	Region 4: Memory at a2604000 (64-bit, non-prefetchable) [size=256]
	Capabilities: [80] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [90] MSI: Enable- Count=1/32 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [b0] MSI-X: Enable+ Count=17 Masked-
		Vector table: BAR=0 offset=00002000
		PBA: BAR=4 offset=00000000
	Capabilities: [c0] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 unlimited
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <8us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s (ok), Width x4 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS- TPHComp- ExtTPHComp-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled,
			 AtomicOpsCtl: ReqEn-
		LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
			 EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [150 v1] Device Serial Number 00-00-00-00-00-00-00-00
	Capabilities: [1b8 v1] Latency Tolerance Reporting
		Max snoop latency: 3145728ns
		Max no snoop latency: 3145728ns
	Capabilities: [300 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn- PerformEqu-
		LaneErrStat: 0
	Capabilities: [900 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
			  PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
		L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
			   T_CommonMode=0us LTR1.2_Threshold=65536ns
		L1SubCtl2: T_PwrOn=44us
	Kernel driver in use: nvme
	Kernel modules: nvme


[-- Attachment #3: xldmesg_full.log --]
[-- Type: text/x-log, Size: 12609 bytes --]

 Xen 4.14.3
(XEN) Xen version 4.14.3 (firemeteor@) (gcc (Debian 11.2.0-13) 11.2.0) debug=n  Fri Jan  7 21:28:52 HKT 2022
(XEN) Latest ChangeSet: Sat Jan 14 15:41:32 2017 +0800 git:ff792a893a-dirty
(XEN) build-id: c43fcc69fb5fc9bcd292b31ba618a7d2335ec69a
(XEN) Bootloader: GRUB 2.04-20
(XEN) Command line: placeholder dom0_mem=2G,max:3G,min:1G dom0_max_vcpus=4 loglvl=all guest_loglvl=all
(XEN) Xen image load base address: 0x87a00000
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN)  Found 5 MBR signatures
(XEN)  Found 5 EDD information structures
(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 158 (0x9e), Stepping 10 (raw 000906ea)
(XEN) Xen-e820 RAM map:
(XEN)  [0000000000000000, 000000000009d3ff] (usable)
(XEN)  [000000000009d400, 000000000009ffff] (reserved)
(XEN)  [00000000000e0000, 00000000000fffff] (reserved)
(XEN)  [0000000000100000, 00000000835bffff] (usable)
(XEN)  [00000000835c0000, 00000000835c0fff] (ACPI NVS)
(XEN)  [00000000835c1000, 00000000835c1fff] (reserved)
(XEN)  [00000000835c2000, 0000000088c0bfff] (usable)
(XEN)  [0000000088c0c000, 000000008907dfff] (reserved)
(XEN)  [000000008907e000, 00000000891f4fff] (usable)
(XEN)  [00000000891f5000, 00000000895dcfff] (ACPI NVS)
(XEN)  [00000000895dd000, 0000000089efefff] (reserved)
(XEN)  [0000000089eff000, 0000000089efffff] (usable)
(XEN)  [0000000089f00000, 000000008f7fffff] (reserved)
(XEN)  [00000000e0000000, 00000000efffffff] (reserved)
(XEN)  [00000000fe000000, 00000000fe010fff] (reserved)
(XEN)  [00000000fec00000, 00000000fec00fff] (reserved)
(XEN)  [00000000fee00000, 00000000fee00fff] (reserved)
(XEN)  [00000000ff000000, 00000000ffffffff] (reserved)
(XEN)  [0000000100000000, 000000086e7fffff] (usable)
(XEN) ACPI: RSDP 000F05B0, 0024 (r2 ALASKA)
(XEN) ACPI: XSDT 895120A8, 00D4 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FACP 895509C0, 0114 (r6 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: DSDT 89512218, 3E7A6 (r2 ALASKA    A M I  1072009 INTL 20160527)
(XEN) ACPI: FACS 895DC080, 0040
(XEN) ACPI: APIC 89550AD8, 00F4 (r4 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FPDT 89550BD0, 0044 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FIDT 89550C18, 009C (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: MCFG 89550CB8, 003C (r1 ALASKA    A M I  1072009 MSFT       97)
(XEN) ACPI: SSDT 89550CF8, 0204 (r1 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 89550F00, 17D5 (r2 ALASKA    A M I     3000 INTL 20160527)
(XEN) ACPI: SSDT 895526D8, 933D (r1 ALASKA    A M I        1 INTL 20160527)
(XEN) ACPI: SSDT 8955BA18, 31C7 (r2 ALASKA    A M I     3000 INTL 20160527)
(XEN) ACPI: SSDT 8955EBE0, 2358 (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: HPET 89560F38, 0038 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: SSDT 89560F70, 1BE1 (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 89562B58, 0F9E (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 89563AF8, 2D1B (r2 ALASKA    A M I        0 INTL 20160527)
(XEN) ACPI: UEFI 89566818, 0042 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: LPIT 89566860, 005C (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: SSDT 895668C0, 27DE (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 895690A0, 0FFE (r2 ALASKA    A M I        0 INTL 20160527)
(XEN) ACPI: DBGP 8956A0A0, 0034 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: DBG2 8956A0D8, 0054 (r0 ALASKA    A M I        2       1000013)
(XEN) ACPI: DMAR 8956A130, 00A8 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: WSMT 8956A1D8, 0028 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) System RAM: 32629MB (33412220kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-000000086e800000
(XEN) Domain heap initialised
(XEN) found SMP MP-table at 000fce10
(XEN) SMBIOS 3.1 present.
(XEN) Using APIC driver default
(XEN) ACPI: PM-Timer IO Port: 0x1808 (24 bits)
(XEN) ACPI: v5 SLEEP INFO: control[1:1804], status[1:1800]
(XEN) ACPI: Invalid sleep control/status register data: 0:0x8:0x3 0:0x8:0x3
(XEN) ACPI: SLEEP INFO: pm1x_cnt[1:1804,1:0], pm1x_evt[1:1800,1:0]
(XEN) ACPI: 32/64X FACS address mismatch in FADT - 895dc080/0000000000000000, using 32
(XEN) ACPI:             wakeup_vec[895dc08c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x08] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x0a] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x01] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x08] lapic_id[0x03] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x09] lapic_id[0x05] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x07] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x09] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x0b] enabled)
(XEN) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x08] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x09] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x0a] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x0b] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x0c] high edge lint[0x1])
(XEN) Overriding APIC driver with bigsmp
(XEN) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-119
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) Enabling APIC mode:  Phys.  Using 1 I/O APICs
(XEN) ACPI: HPET id: 0x8086a201 base: 0xfed00000
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0000 buses 00 - ff
(XEN) PCI: MCFG area at e0000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-ff
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 12 CPUs (0 hotplug CPUs)
(XEN) IRQ limits: 120 GSI, 2376 MSI/MSI-X
(XEN) Switched to APIC driver x2apic_cluster
(XEN) CPU0: TSC: ratio: 292 / 2
(XEN) CPU0: bus: 100 MHz base: 3500 MHz max: 4500 MHz
(XEN) CPU0: 800 ... 3500 MHz
(XEN) xstate: size: 0x440 and states: 0x1f
(XEN) CPU0: Intel machine check reporting enabled
(XEN) Speculative mitigation facilities:
(XEN)   Hardware features: IBRS/IBPB STIBP L1D_FLUSH SSBD MD_CLEAR
(XEN)   Compiled-in support: INDIRECT_THUNK SHADOW_PAGING
(XEN)   Xen settings: BTI-Thunk JMP, SPEC_CTRL: IBRS+ SSBD-, Other: IBPB L1D_FLUSH VERW BRANCH_HARDEN
(XEN)   L1TF: believed vulnerable, maxphysaddr L1D 46, CPUID 39, Safe address 8000000000
(XEN)   Support for HVM VMs: MSR_SPEC_CTRL RSB EAGER_FPU MD_CLEAR
(XEN)   Support for PV VMs: MSR_SPEC_CTRL RSB EAGER_FPU MD_CLEAR
(XEN)   XPTI (64-bit PV only): Dom0 enabled, DomU enabled (with PCID)
(XEN)   PV L1TF shadowing: Dom0 disabled, DomU enabled
(XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Initializing Credit2 scheduler
(XEN)  load_precision_shift: 18
(XEN)  load_window_shift: 30
(XEN)  underload_balance_tolerance: 0
(XEN)  overload_balance_tolerance: -3
(XEN)  runqueues arrangement: socket
(XEN)  cap enforcement granularity: 10ms
(XEN) load tracking window length 1073741824 ns
(XEN) Platform timer is 24.000MHz HPET
(XEN) Detected 3504.767 MHz processor.
(XEN) alt table ffff82d040439290 -> ffff82d040444238
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d Snoop Control not enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Posted Interrupt not enabled.
(XEN) Intel VT-d Shared EPT tables enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) nr_sockets: 1
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) TSC deadline timer enabled
(XEN) Allocated console ring of 128 KiB.
(XEN) mwait-idle: MWAIT substates: 0x11142120
(XEN) mwait-idle: v0.4.1 model 0x9e
(XEN) mwait-idle: lapic_timer_reliable_states 0xffffffff
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN)  - VMCS shadowing
(XEN)  - VM Functions
(XEN)  - Virtualisation Exceptions
(XEN)  - Page Modification Logging
(XEN) HVM: ASIDs enabled.
(XEN) VMX: Disabling executable EPT superpages due to CVE-2018-12207
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) alt table ffff82d040439290 -> ffff82d040444238
(XEN) Brought up 12 CPUs
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Adding cpu 0 to runqueue 0
(XEN)  First cpu on runqueue, activating
(XEN) Adding cpu 1 to runqueue 0
(XEN) Adding cpu 2 to runqueue 0
(XEN) Adding cpu 3 to runqueue 0
(XEN) Adding cpu 4 to runqueue 0
(XEN) Adding cpu 5 to runqueue 0
(XEN) Adding cpu 6 to runqueue 0
(XEN) Adding cpu 7 to runqueue 0
(XEN) Adding cpu 8 to runqueue 0
(XEN) Adding cpu 9 to runqueue 0
(XEN) Adding cpu 10 to runqueue 0
(XEN) Adding cpu 11 to runqueue 0
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) NX (Execute Disable) protection active
(XEN) Dom0 has maximum 952 PIRQs
(XEN) *** Building a PV Dom0 ***
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x2e2c000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000000850000000->0000000854000000 (504314 pages to be allocated)
(XEN)  Init. ramdisk: 000000086d9fa000->000000086e7ff3f6
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff82e2c000
(XEN)  Init. ramdisk: 0000000000000000->0000000000000000
(XEN)  Phys-Mach map: 0000008000000000->0000008000400000
(XEN)  Start info:    ffffffff82e2c000->ffffffff82e2c4b8
(XEN)  Xenstore ring: 0000000000000000->0000000000000000
(XEN)  Console ring:  0000000000000000->0000000000000000
(XEN)  Page tables:   ffffffff82e2d000->ffffffff82e48000
(XEN)  Boot stack:    ffffffff82e48000->ffffffff82e49000
(XEN)  TOTAL:         ffffffff80000000->ffffffff83000000
(XEN)  ENTRY ADDRESS: ffffffff82ab6160
(XEN) Dom0 has maximum 4 VCPUs
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM in background
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***************************************************
(XEN) Booted on L1TF-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Please assess your configuration and choose an
(XEN) explicit 'smt=<bool>' setting.  See XSA-273.
(XEN) ***************************************************
(XEN) Booted on MLPDS/MFBDS-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Mitigations will not be fully effective.  Please
(XEN) choose an explicit smt=<bool> setting.  See XSA-297.
(XEN) ***************************************************
(XEN) 3... 2... 1... 
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) Freed 552kB init memory
(XEN) PCI add device 0000:00:00.0
(XEN) PCI add device 0000:00:01.0
(XEN) PCI add device 0000:00:02.0
(XEN) PCI add device 0000:00:12.0
(XEN) PCI add device 0000:00:14.0
(XEN) PCI add device 0000:00:14.2
(XEN) PCI add device 0000:00:16.0
(XEN) PCI add device 0000:00:16.3
(XEN) PCI add device 0000:00:17.0
(XEN) PCI add device 0000:00:1b.0
(XEN) PCI add device 0000:00:1c.0
(XEN) PCI add device 0000:00:1c.5
(XEN) PCI add device 0000:00:1d.0
(XEN) PCI add device 0000:00:1f.0
(XEN) PCI add device 0000:00:1f.4
(XEN) PCI add device 0000:00:1f.5
(XEN) PCI add device 0000:01:00.0
(XEN) PCI add device 0000:04:00.0
(XEN) PCI add device 0000:05:00.0


* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 11:34   ` G.R.
@ 2022-07-04 11:44     ` G.R.
  2022-07-04 13:09     ` Roger Pau Monné
  1 sibling, 0 replies; 31+ messages in thread
From: G.R. @ 2022-07-04 11:44 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-users, xen-devel

[-- Attachment #1: Type: text/plain, Size: 374 bytes --]

On Mon, Jul 4, 2022 at 7:34 PM G.R. <firemeteor@users.sourceforge.net> wrote:
>
> On Mon, Jul 4, 2022 at 5:53 PM Roger Pau Monné <roger.pau@citrix.com> wrote:

> >
> > Would also be helpful if you could get the RMRR regions from that
> > box. Booting with `iommu=verbose` on the Xen command line should print
> > those.
> Coming in my next reply...
See attached.
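The RMRR report requested above can be filtered out of a saved `xl dmesg` capture with a short grep. A minimal sketch follows; it is demonstrated on an inlined excerpt so it runs anywhere, but on a live dom0 you would point the same grep at the real log:

```shell
# Sketch: pull the VT-d RMRR report out of an `xl dmesg` capture.
# On a live system: xl dmesg > xldmesg.log && grep -A1 ACPI_DMAR_RMRR xldmesg.log
# An inlined excerpt stands in for the saved log so the snippet is self-contained.
grep -A1 'ACPI_DMAR_RMRR' <<'EOF'
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:00:14.0
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:00:02.0
EOF
```

`-A1` keeps the endpoint line that follows each RMRR marker, which is the part that identifies the affected device.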

[-- Attachment #2: xldmesg_full_rmrr.log --]
[-- Type: text/x-log, Size: 13454 bytes --]

 Xen 4.14.3
(XEN) Xen version 4.14.3 (firemeteor@) (gcc (Debian 11.2.0-13) 11.2.0) debug=n  Fri Jan  7 21:28:52 HKT 2022
(XEN) Latest ChangeSet: Sat Jan 14 15:41:32 2017 +0800 git:ff792a893a-dirty
(XEN) build-id: c43fcc69fb5fc9bcd292b31ba618a7d2335ec69a
(XEN) Bootloader: GRUB 2.04-20
(XEN) Command line: placeholder dom0_mem=2G,max:3G,min:1G dom0_max_vcpus=4 loglvl=all guest_loglvl=all iommu=verbose
(XEN) Xen image load base address: 0x87a00000
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN)  Found 5 MBR signatures
(XEN)  Found 5 EDD information structures
(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 158 (0x9e), Stepping 10 (raw 000906ea)
(XEN) Xen-e820 RAM map:
(XEN)  [0000000000000000, 000000000009d3ff] (usable)
(XEN)  [000000000009d400, 000000000009ffff] (reserved)
(XEN)  [00000000000e0000, 00000000000fffff] (reserved)
(XEN)  [0000000000100000, 00000000835bffff] (usable)
(XEN)  [00000000835c0000, 00000000835c0fff] (ACPI NVS)
(XEN)  [00000000835c1000, 00000000835c1fff] (reserved)
(XEN)  [00000000835c2000, 0000000088c0bfff] (usable)
(XEN)  [0000000088c0c000, 000000008907dfff] (reserved)
(XEN)  [000000008907e000, 00000000891f4fff] (usable)
(XEN)  [00000000891f5000, 00000000895dcfff] (ACPI NVS)
(XEN)  [00000000895dd000, 0000000089efefff] (reserved)
(XEN)  [0000000089eff000, 0000000089efffff] (usable)
(XEN)  [0000000089f00000, 000000008f7fffff] (reserved)
(XEN)  [00000000e0000000, 00000000efffffff] (reserved)
(XEN)  [00000000fe000000, 00000000fe010fff] (reserved)
(XEN)  [00000000fec00000, 00000000fec00fff] (reserved)
(XEN)  [00000000fee00000, 00000000fee00fff] (reserved)
(XEN)  [00000000ff000000, 00000000ffffffff] (reserved)
(XEN)  [0000000100000000, 000000086e7fffff] (usable)
(XEN) ACPI: RSDP 000F05B0, 0024 (r2 ALASKA)
(XEN) ACPI: XSDT 895120A8, 00D4 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FACP 895509C0, 0114 (r6 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: DSDT 89512218, 3E7A6 (r2 ALASKA    A M I  1072009 INTL 20160527)
(XEN) ACPI: FACS 895DC080, 0040
(XEN) ACPI: APIC 89550AD8, 00F4 (r4 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FPDT 89550BD0, 0044 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FIDT 89550C18, 009C (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: MCFG 89550CB8, 003C (r1 ALASKA    A M I  1072009 MSFT       97)
(XEN) ACPI: SSDT 89550CF8, 0204 (r1 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 89550F00, 17D5 (r2 ALASKA    A M I     3000 INTL 20160527)
(XEN) ACPI: SSDT 895526D8, 933D (r1 ALASKA    A M I        1 INTL 20160527)
(XEN) ACPI: SSDT 8955BA18, 31C7 (r2 ALASKA    A M I     3000 INTL 20160527)
(XEN) ACPI: SSDT 8955EBE0, 2358 (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: HPET 89560F38, 0038 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: SSDT 89560F70, 1BE1 (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 89562B58, 0F9E (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 89563AF8, 2D1B (r2 ALASKA    A M I        0 INTL 20160527)
(XEN) ACPI: UEFI 89566818, 0042 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: LPIT 89566860, 005C (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: SSDT 895668C0, 27DE (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 895690A0, 0FFE (r2 ALASKA    A M I        0 INTL 20160527)
(XEN) ACPI: DBGP 8956A0A0, 0034 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: DBG2 8956A0D8, 0054 (r0 ALASKA    A M I        2       1000013)
(XEN) ACPI: DMAR 8956A130, 00A8 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: WSMT 8956A1D8, 0028 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) System RAM: 32629MB (33412220kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-000000086e800000
(XEN) Domain heap initialised
(XEN) found SMP MP-table at 000fce10
(XEN) SMBIOS 3.1 present.
(XEN) Using APIC driver default
(XEN) ACPI: PM-Timer IO Port: 0x1808 (24 bits)
(XEN) ACPI: v5 SLEEP INFO: control[1:1804], status[1:1800]
(XEN) ACPI: Invalid sleep control/status register data: 0:0x8:0x3 0:0x8:0x3
(XEN) ACPI: SLEEP INFO: pm1x_cnt[1:1804,1:0], pm1x_evt[1:1800,1:0]
(XEN) ACPI: 32/64X FACS address mismatch in FADT - 895dc080/0000000000000000, using 32
(XEN) ACPI:             wakeup_vec[895dc08c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x08] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x0a] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x01] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x08] lapic_id[0x03] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x09] lapic_id[0x05] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x07] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x09] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x0b] enabled)
(XEN) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x08] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x09] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x0a] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x0b] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x0c] high edge lint[0x1])
(XEN) Overriding APIC driver with bigsmp
(XEN) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-119
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) Enabling APIC mode:  Phys.  Using 1 I/O APICs
(XEN) ACPI: HPET id: 0x8086a201 base: 0xfed00000
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0000 buses 00 - ff
(XEN) PCI: MCFG area at e0000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-ff
(XEN) [VT-D]Host address width 39
(XEN) [VT-D]found ACPI_DMAR_DRHD:
(XEN) [VT-D]  dmaru->address = fed90000
(XEN) [VT-D]drhd->address = fed90000 iommu->reg = ffff82c00021d000
(XEN) [VT-D]cap = 1c0000c40660462 ecap = 19e2ff0505e
(XEN) [VT-D] endpoint: 0000:00:02.0
(XEN) [VT-D]found ACPI_DMAR_DRHD:
(XEN) [VT-D]  dmaru->address = fed91000
(XEN) [VT-D]drhd->address = fed91000 iommu->reg = ffff82c00021f000
(XEN) [VT-D]cap = d2008c40660462 ecap = f050da
(XEN) [VT-D] IOAPIC: 0000:00:1e.7
(XEN) [VT-D] MSI HPET: 0000:00:1e.6
(XEN) [VT-D]  flags: INCLUDE_ALL
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:00:14.0
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:00:02.0
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 12 CPUs (0 hotplug CPUs)
(XEN) IRQ limits: 120 GSI, 2376 MSI/MSI-X
(XEN) Switched to APIC driver x2apic_cluster
(XEN) CPU0: TSC: ratio: 292 / 2
(XEN) CPU0: bus: 100 MHz base: 3500 MHz max: 4500 MHz
(XEN) CPU0: 800 ... 3500 MHz
(XEN) xstate: size: 0x440 and states: 0x1f
(XEN) CPU0: Intel machine check reporting enabled
(XEN) Speculative mitigation facilities:
(XEN)   Hardware features: IBRS/IBPB STIBP L1D_FLUSH SSBD MD_CLEAR
(XEN)   Compiled-in support: INDIRECT_THUNK SHADOW_PAGING
(XEN)   Xen settings: BTI-Thunk JMP, SPEC_CTRL: IBRS+ SSBD-, Other: IBPB L1D_FLUSH VERW BRANCH_HARDEN
(XEN)   L1TF: believed vulnerable, maxphysaddr L1D 46, CPUID 39, Safe address 8000000000
(XEN)   Support for HVM VMs: MSR_SPEC_CTRL RSB EAGER_FPU MD_CLEAR
(XEN)   Support for PV VMs: MSR_SPEC_CTRL RSB EAGER_FPU MD_CLEAR
(XEN)   XPTI (64-bit PV only): Dom0 enabled, DomU enabled (with PCID)
(XEN)   PV L1TF shadowing: Dom0 disabled, DomU enabled
(XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Initializing Credit2 scheduler
(XEN)  load_precision_shift: 18
(XEN)  load_window_shift: 30
(XEN)  underload_balance_tolerance: 0
(XEN)  overload_balance_tolerance: -3
(XEN)  runqueues arrangement: socket
(XEN)  cap enforcement granularity: 10ms
(XEN) load tracking window length 1073741824 ns
(XEN) Platform timer is 24.000MHz HPET
(XEN) Detected 3504.520 MHz processor.
(XEN) alt table ffff82d040439290 -> ffff82d040444238
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d Snoop Control not enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Posted Interrupt not enabled.
(XEN) Intel VT-d Shared EPT tables enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) nr_sockets: 1
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) TSC deadline timer enabled
(XEN) Allocated console ring of 128 KiB.
(XEN) mwait-idle: MWAIT substates: 0x11142120
(XEN) mwait-idle: v0.4.1 model 0x9e
(XEN) mwait-idle: lapic_timer_reliable_states 0xffffffff
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN)  - VMCS shadowing
(XEN)  - VM Functions
(XEN)  - Virtualisation Exceptions
(XEN)  - Page Modification Logging
(XEN) HVM: ASIDs enabled.
(XEN) VMX: Disabling executable EPT superpages due to CVE-2018-12207
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) alt table ffff82d040439290 -> ffff82d040444238
(XEN) Brought up 12 CPUs
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Adding cpu 0 to runqueue 0
(XEN)  First cpu on runqueue, activating
(XEN) Adding cpu 1 to runqueue 0
(XEN) Adding cpu 2 to runqueue 0
(XEN) Adding cpu 3 to runqueue 0
(XEN) Adding cpu 4 to runqueue 0
(XEN) Adding cpu 5 to runqueue 0
(XEN) Adding cpu 6 to runqueue 0
(XEN) Adding cpu 7 to runqueue 0
(XEN) Adding cpu 8 to runqueue 0
(XEN) Adding cpu 9 to runqueue 0
(XEN) Adding cpu 10 to runqueue 0
(XEN) Adding cpu 11 to runqueue 0
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) NX (Execute Disable) protection active
(XEN) Dom0 has maximum 952 PIRQs
(XEN) *** Building a PV Dom0 ***
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x2e2c000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000000850000000->0000000854000000 (504314 pages to be allocated)
(XEN)  Init. ramdisk: 000000086d9fa000->000000086e7ff3f6
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff82e2c000
(XEN)  Init. ramdisk: 0000000000000000->0000000000000000
(XEN)  Phys-Mach map: 0000008000000000->0000008000400000
(XEN)  Start info:    ffffffff82e2c000->ffffffff82e2c4b8
(XEN)  Xenstore ring: 0000000000000000->0000000000000000
(XEN)  Console ring:  0000000000000000->0000000000000000
(XEN)  Page tables:   ffffffff82e2d000->ffffffff82e48000
(XEN)  Boot stack:    ffffffff82e48000->ffffffff82e49000
(XEN)  TOTAL:         ffffffff80000000->ffffffff83000000
(XEN)  ENTRY ADDRESS: ffffffff82ab6160
(XEN) Dom0 has maximum 4 VCPUs
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021d000
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021f000
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM in background
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***************************************************
(XEN) Booted on L1TF-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Please assess your configuration and choose an
(XEN) explicit 'smt=<bool>' setting.  See XSA-273.
(XEN) ***************************************************
(XEN) Booted on MLPDS/MFBDS-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Mitigations will not be fully effective.  Please
(XEN) choose an explicit smt=<bool> setting.  See XSA-297.
(XEN) ***************************************************
(XEN) 3... 2... 1... 
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) Freed 552kB init memory
(XEN) PCI add device 0000:00:00.0
(XEN) PCI add device 0000:00:01.0
(XEN) PCI add device 0000:00:02.0
(XEN) PCI add device 0000:00:12.0
(XEN) PCI add device 0000:00:14.0
(XEN) PCI add device 0000:00:14.2
(XEN) PCI add device 0000:00:16.0
(XEN) PCI add device 0000:00:16.3
(XEN) PCI add device 0000:00:17.0
(XEN) PCI add device 0000:00:1b.0
(XEN) PCI add device 0000:00:1c.0
(XEN) PCI add device 0000:00:1c.5
(XEN) PCI add device 0000:00:1d.0
(XEN) PCI add device 0000:00:1f.0
(XEN) PCI add device 0000:00:1f.4
(XEN) PCI add device 0000:00:1f.5
(XEN) PCI add device 0000:01:00.0
(XEN) PCI add device 0000:04:00.0
(XEN) PCI add device 0000:05:00.0


* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 11:34   ` G.R.
  2022-07-04 11:44     ` G.R.
@ 2022-07-04 13:09     ` Roger Pau Monné
  2022-07-04 14:51       ` G.R.
  1 sibling, 1 reply; 31+ messages in thread
From: Roger Pau Monné @ 2022-07-04 13:09 UTC (permalink / raw)
  To: G.R.; +Cc: xen-devel

On Mon, Jul 04, 2022 at 07:34:47PM +0800, G.R. wrote:
> On Mon, Jul 4, 2022 at 5:53 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Sun, Jul 03, 2022 at 01:43:11AM +0800, G.R. wrote:
> > > Hi everybody,
> > >
> > > I run into problems passing through a SN570 NVME SSD to a HVM guest.
> > > So far I have no idea if the problem is with this specific SSD or with
> > > the CPU + motherboard combination or the SW stack.
> > > Looking for some suggestions on troubleshooting.
> > >
> > > List of build info:
> > > CPU+motherboard: E-2146G + Gigabyte C246N-WU2
> > > XEN version: 4.14.3
> >
> > Are you using a debug build of Xen? (if not it would be helpful to do
> > so).
> It's a release build at the moment. I can switch to a debug build
> later when I have a chance.
> BTW, I got a DEBUG build of the xen_pciback driver to see how it plays
> with 'xl pci-assignable-xxx' commands.
> You can find this in my 2nd email in the chain.
> 
> >
> > > Dom0: Linux Kernel 5.10 (built from Debian 11.2 kernel source package)
> > > The SN570 SSD sits here in the PCI tree:
> > >            +-1d.0-[05]----00.0  Sandisk Corp Device 501a
> >
> > Could be helpful to post the output with -vvv so we can see the
> > capabilities of the device.
> Sure, please find the -vvv output in the attachment.
> This one is just to indicate the connection in the PCI tree,
> i.e. 05:00.0 is attached under 00:1d.0.
> 
> >
> > > Syndromes observed:
> > > With ASPM enabled, pciback has problem seizing the device.
> > >
> > > Jul  2 00:36:54 gaia kernel: [    1.648270] pciback 0000:05:00.0:
> > > xen_pciback: seizing device
> > > ...
> > > Jul  2 00:36:54 gaia kernel: [    1.768646] pcieport 0000:00:1d.0:
> > > AER: enabled with IRQ 150
> > > Jul  2 00:36:54 gaia kernel: [    1.768716] pcieport 0000:00:1d.0:
> > > DPC: enabled with IRQ 150
> > > Jul  2 00:36:54 gaia kernel: [    1.768717] pcieport 0000:00:1d.0:
> > > DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+
> > > SwTrigger+ RP PIO Log 4, DL_ActiveErr+
> >
> > Is there a device reset involved here?  It's possible the device
> > doesn't reset properly and hence the Uncorrectable Error Status
> > Register ends up with inconsistent bits set.
> 
> xen_pciback appears to force an FLR whenever it attempts to seize a
> capable device, as shown in pciback_dbg_xl-pci_assignable_XXX.log
> attached to my 2nd mail:
> [  323.448115] xen_pciback: wants to seize 0000:05:00.0
> [  323.448136] pciback 0000:05:00.0: xen_pciback: probing...
> [  323.448137] pciback 0000:05:00.0: xen_pciback: seizing device
> [  323.448162] pciback 0000:05:00.0: xen_pciback: pcistub_device_alloc
> [  323.448162] pciback 0000:05:00.0: xen_pciback: initializing...
> [  323.448163] pciback 0000:05:00.0: xen_pciback: initializing config
> [  323.448344] pciback 0000:05:00.0: xen_pciback: enabling device
> [  323.448425] xen: registering gsi 16 triggering 0 polarity 1
> [  323.448428] Already setup the GSI :16
> [  323.448497] pciback 0000:05:00.0: xen_pciback: save state of device
> [  323.448642] pciback 0000:05:00.0: xen_pciback: resetting (FLR, D3,
> etc) the device
> [  323.448707] pcieport 0000:00:1d.0: DPC: containment event,
> status:0x1f11 source:0x0000
> [  323.448730] pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
> [  323.448760] pcieport 0000:00:1d.0: PCIe Bus Error:
> severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver
> ID)
> [  323.448786] pcieport 0000:00:1d.0:   device [8086:a330] error
> status/mask=00200000/00010000
> [  323.448813] pcieport 0000:00:1d.0:    [21] ACSViol                (First)
> [  324.690979] pciback 0000:05:00.0: not ready 1023ms after FLR;
> waiting  <============ HERE
> [  325.730706] pciback 0000:05:00.0: not ready 2047ms after FLR; waiting
> [  327.997638] pciback 0000:05:00.0: not ready 4095ms after FLR; waiting
> [  332.264251] pciback 0000:05:00.0: not ready 8191ms after FLR; waiting
> [  340.584320] pciback 0000:05:00.0: not ready 16383ms after FLR;
> waiting
> [  357.010896] pciback 0000:05:00.0: not ready 32767ms after FLR; waiting
> [  391.143951] pciback 0000:05:00.0: not ready 65535ms after FLR; giving up
> [  392.249252] pciback 0000:05:00.0: xen_pciback: reset device
> [  392.249392] pciback 0000:05:00.0: xen_pciback:
> xen_pcibk_error_detected(bus:5,devfn:0)
> [  392.249393] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
> [  392.397074] pciback 0000:05:00.0: xen_pciback:
> xen_pcibk_error_resume(bus:5,devfn:0)
> [  392.397080] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
> [  392.397284] pcieport 0000:00:1d.0: AER: device recovery successful
> Note, I only see this failure during the FLR on the 1st attempt.
> And my SATA controller, which doesn't support FLR, appears to pass
> through just fine...
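Whether a device advertises Function Level Reset can be confirmed by looking for `FLReset+` in the DevCap line of `lspci -vvv`. A minimal sketch, checking an inlined sample instead of live hardware (the BDF 05:00.0 is the SSD from this thread):

```shell
# Sketch: detect FLR support from lspci -vvv output.
# On real hardware: lspci -vvv -s 05:00.0 | grep -q 'FLReset+'
# An inlined DevCap sample stands in for the live output here.
sample='DevCap: MaxPayload 256 bytes, PhantFunc 0
                ExtTag- RBE+ FLReset+'
if printf '%s\n' "$sample" | grep -q 'FLReset+'; then
    echo "device advertises FLR"
else
    echo "no FLR capability"
fi
```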
> 
> >
> > > ...
> > > Jul  2 00:36:54 gaia kernel: [    1.770039] xen: registering gsi 16
> > > triggering 0 polarity 1
> > > Jul  2 00:36:54 gaia kernel: [    1.770041] Already setup the GSI :16
> > > Jul  2 00:36:54 gaia kernel: [    1.770314] pcieport 0000:00:1d.0:
> > > DPC: containment event, status:0x1f11 source:0x0000
> > > Jul  2 00:36:54 gaia kernel: [    1.770315] pcieport 0000:00:1d.0:
> > > DPC: unmasked uncorrectable error detected
> > > Jul  2 00:36:54 gaia kernel: [    1.770320] pcieport 0000:00:1d.0:
> > > PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction
> > > Layer, (Receiver ID)
> > > Jul  2 00:36:54 gaia kernel: [    1.770371] pcieport 0000:00:1d.0:
> > > device [8086:a330] error status/mask=00200000/00010000
> > > Jul  2 00:36:54 gaia kernel: [    1.770413] pcieport 0000:00:1d.0:
> > > [21] ACSViol                (First)
> > > Jul  2 00:36:54 gaia kernel: [    1.770466] pciback 0000:05:00.0:
> > > xen_pciback: device is not found/assigned
> > > Jul  2 00:36:54 gaia kernel: [    1.920195] pciback 0000:05:00.0:
> > > xen_pciback: device is not found/assigned
> > > Jul  2 00:36:54 gaia kernel: [    1.920260] pcieport 0000:00:1d.0:
> > > AER: device recovery successful
> > > Jul  2 00:36:54 gaia kernel: [    1.920263] pcieport 0000:00:1d.0:
> > > DPC: containment event, status:0x1f01 source:0x0000
> > > Jul  2 00:36:54 gaia kernel: [    1.920264] pcieport 0000:00:1d.0:
> > > DPC: unmasked uncorrectable error detected
> > > Jul  2 00:36:54 gaia kernel: [    1.920267] pciback 0000:05:00.0:
> > > xen_pciback: device is not found/assigned
> >
> > That's from a different device (05:00.0).
> 00:1d.0 is the bridge port that 05:00.0 attaches to.
> 
> 
> > >
> > > After the 'xl pci-assignable-list' appears to be self-consistent,
> > > creating VM with the SSD assigned still leads to a guest crash:
> > > From qemu log:
> > > [00:06.0] xen_pt_region_update: Error: create new mem mapping failed! (err: 1)
> > > qemu-system-i386: terminating on signal 1 from pid 1192 (xl)
> > >
> > > From the 'xl dmesg' output:
> > > (XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted
> >
> > Seems like QEMU is attempting to remap a p2m_mmio_direct region.
> >
> > Can you paste the full output of `xl dmesg`? (as that will contain the
> > memory map).
> Attached.
> 
> >
> > Would also be helpful if you could get the RMRR regions from that
> > box. Booting with `iommu=verbose` on the Xen command line should print
> > those.
> Coming in my next reply...

> 00:1d.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #9 (rev f0) (prog-if 00 [Normal decode])
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Latency: 0, Cache Line Size: 64 bytes
> 	Interrupt: pin A routed to IRQ 126
> 	IOMMU group: 10
> 	Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
> 	I/O behind bridge: 0000f000-00000fff [disabled]
> 	Memory behind bridge: a2600000-a26fffff [size=1M]
> 	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [disabled]
> 	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
> 	BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16+ MAbort- >Reset- FastB2B-
> 		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
> 	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
> 		DevCap:	MaxPayload 256 bytes, PhantFunc 0
> 			ExtTag- RBE+
> 		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
> 			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> 			MaxPayload 256 bytes, MaxReadReq 128 bytes
> 		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
> 		LnkCap:	Port #9, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <1us, L1 <16us
> 			ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
> 		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
> 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> 		LnkSta:	Speed 8GT/s (ok), Width x4 (ok)
> 			TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
> 		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
> 			Slot #12, PowerLimit 25.000W; Interlock- NoCompl+
> 		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
> 			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
> 		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
> 			Changed: MRL- PresDet- LinkState+
> 		RootCap: CRSVisible-
> 		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
> 		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
> 		DevCap2: Completion Timeout: Range ABC, TimeoutDis+ NROPrPrP- LTR+
> 			 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
> 			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
> 			 FRS- LN System CLS Not Supported, TPHComp- ExtTPHComp- ARIFwd+
> 			 AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
> 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled, ARIFwd-
> 			 AtomicOpsCtl: ReqEn- EgressBlck-
> 		LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
> 		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
> 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> 			 Compliance De-emphasis: -6dB
> 		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
> 			 EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
> 			 Retimer- 2Retimers- CrosslinkRes: unsupported
> 	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
> 		Address: fee002b8  Data: 0000
> 	Capabilities: [90] Subsystem: Gigabyte Technology Co., Ltd Cannon Lake PCH PCI Express Root Port
> 	Capabilities: [a0] Power Management version 3
> 		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> 		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> 	Capabilities: [100 v1] Advanced Error Reporting
> 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UESvrt:	DLP+ SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> 		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
> 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
> 		AERCap:	First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
> 			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
> 		HeaderLog: 00000000 00000000 00000000 00000000
> 		RootCmd: CERptEn+ NFERptEn+ FERptEn+
> 		RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
> 			 FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
> 		ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
> 	Capabilities: [140 v1] Access Control Services
> 		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> 		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> 	Capabilities: [150 v1] Precision Time Measurement
> 		PTMCap: Requester:- Responder:+ Root:+
> 		PTMClockGranularity: 4ns
> 		PTMControl: Enabled:+ RootSelected:+
> 		PTMEffectiveGranularity: Unknown
> 	Capabilities: [200 v1] L1 PM Substates
> 		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> 			  PortCommonModeRestoreTime=40us PortTPowerOnTime=44us
> 		L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> 			   T_CommonMode=40us LTR1.2_Threshold=65536ns
> 		L1SubCtl2: T_PwrOn=44us
> 	Capabilities: [220 v1] Secondary PCI Express
> 		LnkCtl3: LnkEquIntrruptEn- PerformEqu-
> 		LaneErrStat: 0
> 	Capabilities: [250 v1] Downstream Port Containment
> 		DpcCap:	INT Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
> 		DpcCtl:	Trigger:1 Cmpl- INT+ ErrCor- PoisonedTLP- SwTrigger- DL_ActiveErr-
> 		DpcSta:	Trigger- Reason:00 INT- RPBusy- TriggerExt:00 RP PIO ErrPtr:1f
> 		Source:	0000
> 	Kernel driver in use: pcieport
> 
> 05:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a (prog-if 02 [NVM Express])
> 	Subsystem: Sandisk Corp Device 501a
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Latency: 0, Cache Line Size: 64 bytes
> 	Interrupt: pin A routed to IRQ 16
> 	NUMA node: 0
> 	IOMMU group: 13
> 	Region 0: Memory at a2600000 (64-bit, non-prefetchable) [size=16K]
> 	Region 4: Memory at a2604000 (64-bit, non-prefetchable) [size=256]

I think I'm slightly confused, the overlapping happens at:

(XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted

So it's MFNs 0xa2616 and 0xa2504, yet none of those are in the BAR
ranges of this device.

Can you paste the lspci -vvv output for any other device you are also
passing through to this guest?

Thanks, Roger.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 13:09     ` Roger Pau Monné
@ 2022-07-04 14:51       ` G.R.
  2022-07-04 15:15         ` G.R.
  2022-07-04 15:33         ` Roger Pau Monné
  0 siblings, 2 replies; 31+ messages in thread
From: G.R. @ 2022-07-04 14:51 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel

[-- Attachment #1: Type: text/plain, Size: 2923 bytes --]

On Mon, Jul 4, 2022 at 9:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > 05:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a (prog-if 02 [NVM Express])
> >       Subsystem: Sandisk Corp Device 501a
> >       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> >       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> >       Latency: 0, Cache Line Size: 64 bytes
> >       Interrupt: pin A routed to IRQ 16
> >       NUMA node: 0
> >       IOMMU group: 13
> >       Region 0: Memory at a2600000 (64-bit, non-prefetchable) [size=16K]
> >       Region 4: Memory at a2604000 (64-bit, non-prefetchable) [size=256]
>
> I think I'm slightly confused, the overlapping happens at:
>
> (XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted
>
> So it's MFNs 0xa2616 and 0xa2504, yet none of those are in the BAR
> ranges of this device.
>
> Can you paste the lspci -vvv output for any other device you are also
> passing through to this guest?
>

I just realized that the addresses may change between environments.
In the previous emails I used a cached dump from a Linux environment
running outside the hypervisor.
Sorry for the confusion. Here is a refreshed dump taken from the Xen dom0.

The other device I passed through is a SATA controller. I think I can
see what you are looking for now:
both a2616 and a2504 are found!

00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI
Controller (rev 10) (prog-if 01 [AHCI 1.0])
        DeviceName: Onboard - SATA
        Subsystem: Gigabyte Technology Co., Ltd Cannon Lake PCH SATA
AHCI Controller
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at a2610000 (32-bit, non-prefetchable) [size=8K]
        Region 1: Memory at a2616000 (32-bit, non-prefetchable) [size=256]
        Region 2: I/O ports at 4090 [size=8]
        Region 3: I/O ports at 4080 [size=4]
        Region 4: I/O ports at 4060 [size=32]

05:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a
(prog-if 02 [NVM Express])
        Subsystem: Sandisk Corp Device 501a
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at a2500000 (64-bit, non-prefetchable) [size=16K]
        Region 4: Memory at a2504000 (64-bit, non-prefetchable) [size=256]
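
For reference, the link between the BAR addresses above and the MFNs in
Xen's "not permitted" message is plain 4KiB page-frame arithmetic. A
minimal sketch (illustration only, not Xen code):

```python
PAGE_SHIFT = 12  # 4 KiB pages

def bar_frames(base, size):
    """Page frame numbers covered by a BAR (illustration only)."""
    first = base >> PAGE_SHIFT
    last = (base + size - 1) >> PAGE_SHIFT
    return list(range(first, last + 1))

# 00:17.0 Region 1 and 05:00.0 Region 4, from the dumps above
sata_bar1 = bar_frames(0xa2616000, 0x100)
nvme_bar4 = bar_frames(0xa2504000, 0x100)

print([hex(f) for f in sata_bar1])  # ['0xa2616']
print([hex(f) for f in nvme_bar4])  # ['0xa2504']
```

Each 256-byte BAR occupies only part of a 4KiB page, which is how two
BARs from different devices can end up sharing a single guest page.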

Thanks,
G.R.



> Thanks, Roger.

[-- Attachment #2: lspcivvv_cutdown_refreshed.txt --]
[-- Type: text/plain, Size: 10383 bytes --]

00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI Controller (rev 10) (prog-if 01 [AHCI 1.0])
	DeviceName: Onboard - SATA
	Subsystem: Gigabyte Technology Co., Ltd Cannon Lake PCH SATA AHCI Controller
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at a2610000 (32-bit, non-prefetchable) [size=8K]
	Region 1: Memory at a2616000 (32-bit, non-prefetchable) [size=256]
	Region 2: I/O ports at 4090 [size=8]
	Region 3: I/O ports at 4080 [size=4]
	Region 4: I/O ports at 4060 [size=32]
	Region 5: Memory at a2615000 (32-bit, non-prefetchable) [size=2K]
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Capabilities: [70] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004
	Kernel driver in use: pciback
	Kernel modules: ahci

00:1d.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #9 (rev f0) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 150
	Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
	I/O behind bridge: 0000f000-00000fff [disabled]
	Memory behind bridge: a2500000-a25fffff [size=1M]
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [disabled]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16+ MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 256 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		LnkCap:	Port #9, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <1us, L1 <16us
			ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s (ok), Width x4 (ok)
			TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #12, PowerLimit 25.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet- LinkState+
		RootCap: CRSVisible-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range ABC, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS- LN System CLS Not Supported, TPHComp- ExtTPHComp- ARIFwd+
			 AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled, ARIFwd-
			 AtomicOpsCtl: ReqEn- EgressBlck-
		LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
			 EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee00f98  Data: 0000
	Capabilities: [90] Subsystem: Gigabyte Technology Co., Ltd Cannon Lake PCH PCI Express Root Port
	Capabilities: [a0] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
		RootCmd: CERptEn+ NFERptEn+ FERptEn+
		RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
			 FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
		ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
	Capabilities: [140 v1] Access Control Services
		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
	Capabilities: [150 v1] Precision Time Measurement
		PTMCap: Requester:- Responder:+ Root:+
		PTMClockGranularity: 4ns
		PTMControl: Enabled:+ RootSelected:+
		PTMEffectiveGranularity: Unknown
	Capabilities: [200 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=40us PortTPowerOnTime=44us
		L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
			   T_CommonMode=40us LTR1.2_Threshold=65536ns
		L1SubCtl2: T_PwrOn=44us
	Capabilities: [220 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn- PerformEqu-
		LaneErrStat: 0
	Capabilities: [250 v1] Downstream Port Containment
		DpcCap:	INT Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
		DpcCtl:	Trigger:1 Cmpl- INT+ ErrCor- PoisonedTLP- SwTrigger- DL_ActiveErr-
		DpcSta:	Trigger- Reason:00 INT- RPBusy- TriggerExt:00 RP PIO ErrPtr:1f
		Source:	0000
	Kernel driver in use: pcieport

05:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a (prog-if 02 [NVM Express])
	Subsystem: Sandisk Corp Device 501a
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 11
	Region 0: Memory at a2500000 (64-bit, non-prefetchable) [size=16K]
	Region 4: Memory at a2504000 (64-bit, non-prefetchable) [size=256]
	Capabilities: [80] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [90] MSI: Enable- Count=1/32 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [b0] MSI-X: Enable- Count=17 Masked-
		Vector table: BAR=0 offset=00002000
		PBA: BAR=4 offset=00000000
	Capabilities: [c0] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 unlimited
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <8us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s (ok), Width x4 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS- TPHComp- ExtTPHComp-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled,
			 AtomicOpsCtl: ReqEn-
		LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
			 EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [150 v1] Device Serial Number 00-00-00-00-00-00-00-00
	Capabilities: [1b8 v1] Latency Tolerance Reporting
		Max snoop latency: 3145728ns
		Max no snoop latency: 3145728ns
	Capabilities: [300 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn- PerformEqu-
		LaneErrStat: 0
	Capabilities: [900 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
			  PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
		L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
			   T_CommonMode=0us LTR1.2_Threshold=65536ns
		L1SubCtl2: T_PwrOn=44us
	Kernel modules: nvme



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 14:51       ` G.R.
@ 2022-07-04 15:15         ` G.R.
  2022-07-04 15:37           ` G.R.
  2022-07-04 15:33         ` Roger Pau Monné
  1 sibling, 1 reply; 31+ messages in thread
From: G.R. @ 2022-07-04 15:15 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel

On Mon, Jul 4, 2022 at 10:51 PM G.R. <firemeteor@users.sourceforge.net> wrote:
>
> On Mon, Jul 4, 2022 at 9:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > 05:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a (prog-if 02 [NVM Express])
> > >       Subsystem: Sandisk Corp Device 501a
> > >       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> > >       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > >       Latency: 0, Cache Line Size: 64 bytes
> > >       Interrupt: pin A routed to IRQ 16
> > >       NUMA node: 0
> > >       IOMMU group: 13
> > >       Region 0: Memory at a2600000 (64-bit, non-prefetchable) [size=16K]
> > >       Region 4: Memory at a2604000 (64-bit, non-prefetchable) [size=256]
> >
> > I think I'm slightly confused, the overlapping happens at:
> >
> > (XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted
> >
> > So it's MFNs 0xa2616 and 0xa2504, yet none of those are in the BAR
> > ranges of this device.
> >
> > Can you paste the lspci -vvv output for any other device you are also
> > passing through to this guest?
> >

Prompted by this request, I tried assigning this NVMe device to
another FreeBSD 12 domU.
This time it does not fail at the VM setup stage, but the device is
still not usable in the domU.
The nvmecontrol command is unable to talk to the device at all:
nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0

The QEMU log says the following:
[00:05.0] Write-back to unknown field 0x09 (partially) inhibited (0x00)
[00:05.0] If the device doesn't work, try enabling permissive mode
[00:05.0] (unsafe) and if it helps report the problem to xen-devel
[00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)

xl dmesg says the following:
(XEN) d[IO]: assign (0000:05:00.0) failed (-16)
(XEN) HVM d5v0 save: CPU
(XEN) HVM d5v1 save: CPU
(XEN) HVM d5v2 save: CPU
(XEN) HVM d5v3 save: CPU
(XEN) HVM d5 save: PIC
(XEN) HVM d5 save: IOAPIC
(XEN) HVM d5v0 save: LAPIC
(XEN) HVM d5v1 save: LAPIC
(XEN) HVM d5v2 save: LAPIC
(XEN) HVM d5v3 save: LAPIC
(XEN) HVM d5v0 save: LAPIC_REGS
(XEN) HVM d5v1 save: LAPIC_REGS
(XEN) HVM d5v2 save: LAPIC_REGS
(XEN) HVM d5v3 save: LAPIC_REGS
(XEN) HVM d5 save: PCI_IRQ
(XEN) HVM d5 save: ISA_IRQ
(XEN) HVM d5 save: PCI_LINK
(XEN) HVM d5 save: PIT
(XEN) HVM d5 save: RTC
(XEN) HVM d5 save: HPET
(XEN) HVM d5 save: PMTIMER
(XEN) HVM d5v0 save: MTRR
(XEN) HVM d5v1 save: MTRR
(XEN) HVM d5v2 save: MTRR
(XEN) HVM d5v3 save: MTRR
(XEN) HVM d5 save: VIRIDIAN_DOMAIN
(XEN) HVM d5v0 save: CPU_XSAVE
(XEN) HVM d5v1 save: CPU_XSAVE
(XEN) HVM d5v2 save: CPU_XSAVE
(XEN) HVM d5v3 save: CPU_XSAVE
(XEN) HVM d5v0 save: VIRIDIAN_VCPU
(XEN) HVM d5v1 save: VIRIDIAN_VCPU
(XEN) HVM d5v2 save: VIRIDIAN_VCPU
(XEN) HVM d5v3 save: VIRIDIAN_VCPU
(XEN) HVM d5v0 save: VMCE_VCPU
(XEN) HVM d5v1 save: VMCE_VCPU
(XEN) HVM d5v2 save: VMCE_VCPU
(XEN) HVM d5v3 save: VMCE_VCPU
(XEN) HVM d5v0 save: TSC_ADJUST
(XEN) HVM d5v1 save: TSC_ADJUST
(XEN) HVM d5v2 save: TSC_ADJUST
(XEN) HVM d5v3 save: TSC_ADJUST
(XEN) HVM d5v0 save: CPU_MSR
(XEN) HVM d5v1 save: CPU_MSR
(XEN) HVM d5v2 save: CPU_MSR
(XEN) HVM d5v3 save: CPU_MSR
(XEN) HVM5 restore: CPU 0
(XEN) d5: bind: m_gsi=16 g_gsi=36 dev=00.00.5 intx=0
(d5) HVM Loader
(d5) Detected Xen v4.14.3
(d5) Xenbus rings @0xfeffc000, event channel 1
(d5) System requested SeaBIOS
(d5) CPU speed is 3505 MHz
(d5) Relocating guest memory for lowmem MMIO space disabled
(d5) PCI-ISA link 0 routed to IRQ5
(d5) PCI-ISA link 1 routed to IRQ10
(d5) PCI-ISA link 2 routed to IRQ11
(d5) PCI-ISA link 3 routed to IRQ5
(d5) pci dev 01:3 INTA->IRQ10
(d5) pci dev 02:0 INTA->IRQ11
(d5) pci dev 04:0 INTA->IRQ5
(d5) pci dev 05:0 INTA->IRQ10
(d5) No RAM in high memory; setting high_mem resource base to 100000000
(d5) pci dev 03:0 bar 10 size 002000000: 0f0000008
(d5) pci dev 02:0 bar 14 size 001000000: 0f2000008
(d5) pci dev 04:0 bar 30 size 000040000: 0f3000000
(d5) pci dev 04:0 bar 10 size 000020000: 0f3040000
(d5) pci dev 03:0 bar 30 size 000010000: 0f3060000
(d5) pci dev 05:0 bar 10 size 000004000: 0f3070004
(d5) pci dev 03:0 bar 14 size 000001000: 0f3074000
(d5) pci dev 02:0 bar 10 size 000000100: 00000c001
(d5) pci dev 05:0 bar 20 size 000000100: 0f3075004
(d5) pci dev 04:0 bar 14 size 000000040: 00000c101
(d5) pci dev 01:1 bar 20 size 000000010: 00000c141
(XEN) memory_map:add: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom5 gfn=f3075 mfn=a2504 nr=1
(d5) Multiprocessor initialisation:
(d5)  - CPU0 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d5)  - CPU1 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d5)  - CPU2 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d5)  - CPU3 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d5) Writing SMBIOS tables ...
(d5) Loading SeaBIOS ...
(d5) Creating MP tables ...
(d5) Loading ACPI ...
(d5) vm86 TSS at fc100300
(d5) BIOS map:
(d5)  10000-100e3: Scratch space
(d5)  c0000-fffff: Main BIOS
(d5) E820 table:
(d5)  [00]: 00000000:00000000 - 00000000:000a0000: RAM
(d5)  HOLE: 00000000:000a0000 - 00000000:000c0000
(d5)  [01]: 00000000:000c0000 - 00000000:00100000: RESERVED
(d5)  [02]: 00000000:00100000 - 00000000:7f800000: RAM
(d5)  HOLE: 00000000:7f800000 - 00000000:fc000000
(d5)  [03]: 00000000:fc000000 - 00000000:fc00b000: NVS
(d5)  [04]: 00000000:fc00b000 - 00000001:00000000: RESERVED
(d5) Invoking SeaBIOS ...
(d5) SeaBIOS (version rel-1.13.0-1-gd542924-Xen)
(d5) BUILD: gcc: (Debian 11.2.0-13) 11.2.0 binutils: (GNU Binutils for
Debian) 2.37
(d5)
(d5) Found Xen hypervisor signature at 40000000
(XEN) memory_map:remove: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:remove: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:remove: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:add: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:remove: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:remove: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:remove: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:add: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:remove: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:remove: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:remove: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:add: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:remove: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:remove: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:remove: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:add: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom5 gfn=f3075 mfn=a2504 nr=1



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 14:51       ` G.R.
  2022-07-04 15:15         ` G.R.
@ 2022-07-04 15:33         ` Roger Pau Monné
  2022-07-04 15:44           ` G.R.
  2022-07-04 16:05           ` Jan Beulich
  1 sibling, 2 replies; 31+ messages in thread
From: Roger Pau Monné @ 2022-07-04 15:33 UTC (permalink / raw)
  To: G.R.; +Cc: xen-devel

On Mon, Jul 04, 2022 at 10:51:53PM +0800, G.R. wrote:
> On Mon, Jul 4, 2022 at 9:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > 05:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a (prog-if 02 [NVM Express])
> > >       Subsystem: Sandisk Corp Device 501a
> > >       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> > >       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > >       Latency: 0, Cache Line Size: 64 bytes
> > >       Interrupt: pin A routed to IRQ 16
> > >       NUMA node: 0
> > >       IOMMU group: 13
> > >       Region 0: Memory at a2600000 (64-bit, non-prefetchable) [size=16K]
> > >       Region 4: Memory at a2604000 (64-bit, non-prefetchable) [size=256]
> >
> > I think I'm slightly confused, the overlapping happens at:
> >
> > (XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted
> >
> > So it's MFNs 0xa2616 and 0xa2504, yet none of those are in the BAR
> > ranges of this device.
> >
> > Can you paste the lspci -vvv output for any other device you are also
> > passing through to this guest?
> >
> 
> I just realized that the address may change in different environments.
> In previous email chains, I used a cached dump from a Linux
> environment running outside the hypervisor.
> Sorry for the confusion. Refreshing with a XEN dom0 dump.
> 
> The other device I used is a SATA controller. I think I can get what
> you are looking for now.
> Both a2616 and a2504 are found!
> 
> 00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI
> Controller (rev 10) (prog-if 01 [AHCI 1.0])
>         DeviceName: Onboard - SATA
>         Subsystem: Gigabyte Technology Co., Ltd Cannon Lake PCH SATA
> AHCI Controller
>         Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B- DisINTx-
>         Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium
> >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Interrupt: pin A routed to IRQ 16
>         Region 0: Memory at a2610000 (32-bit, non-prefetchable) [size=8K]
>         Region 1: Memory at a2616000 (32-bit, non-prefetchable) [size=256]
>         Region 2: I/O ports at 4090 [size=8]
>         Region 3: I/O ports at 4080 [size=4]
>         Region 4: I/O ports at 4060 [size=32]
> 
> 05:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a
> (prog-if 02 [NVM Express])
>         Subsystem: Sandisk Corp Device 501a
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B- DisINTx-
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 11
>         Region 0: Memory at a2500000 (64-bit, non-prefetchable) [size=16K]
>         Region 4: Memory at a2504000 (64-bit, non-prefetchable) [size=256]

Right, so hvmloader attempts to place a BAR from 05:00.0 and a BAR
from 00:17.0 into the same page, which is not that good behavior.  It
might be sensible to attempt to share the page if both BARs belong to
the same device, but not if they belong to different devices.
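
The placement policy being suggested can be sketched like this (a toy
model of the idea with hypothetical names, not hvmloader's or the
patch's actual code):

```python
PAGE = 0x1000  # 4 KiB

def place_bars(bars, base=0xf3000000):
    """Assign MMIO addresses to (device, size) BARs in order.

    Sub-page BARs may share a 4 KiB page only when they belong to the
    same device; a device change forces alignment to a fresh page.
    Toy model only -- the base address is arbitrary.
    """
    placed = []
    addr = base
    prev_dev = None
    for dev, size in bars:
        if prev_dev is not None and dev != prev_dev and addr % PAGE:
            addr = (addr + PAGE - 1) & ~(PAGE - 1)  # round up to a new page
        placed.append((dev, addr, size))
        addr += size
        prev_dev = dev
    return placed

layout = place_bars([("05:00.0", 0x100),   # NVMe, sub-page BAR
                     ("05:00.0", 0x100),   # same device: may share the page
                     ("00:17.0", 0x100)])  # SATA: pushed to a fresh page
```

With this policy the two NVMe BARs land at 0xf3000000 and 0xf3000100
(sharing a page), while the SATA BAR is bumped to 0xf3001000 instead of
sharing the NVMe page.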

I think the following patch:

https://lore.kernel.org/xen-devel/20200117110811.43321-1-roger.pau@citrix.com/

Might help with this.

Thanks, Roger.



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 15:15         ` G.R.
@ 2022-07-04 15:37           ` G.R.
  2022-07-04 16:05             ` Roger Pau Monné
  0 siblings, 1 reply; 31+ messages in thread
From: G.R. @ 2022-07-04 15:37 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel

On Mon, Jul 4, 2022 at 11:15 PM G.R. <firemeteor@users.sourceforge.net> wrote:
>
> On Mon, Jul 4, 2022 at 10:51 PM G.R. <firemeteor@users.sourceforge.net> wrote:
> >
> > On Mon, Jul 4, 2022 at 9:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > Can you paste the lspci -vvv output for any other device you are also
> > > passing through to this guest?
> > >
>
> As reminded by this request, I tried to assign this nvme device to
> another FreeBSD12 domU.
Just to clarify: this time the NVMe SSD is the only device I passed to this VM.

> This time it does not fail at the VM setup stage, but the device is
> still not usable at the domU.
> The nvmecontrol command is not able to talk to the device at all:
> nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
> nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
> nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
> nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
>
> The QEMU log says the following:
> 00:05.0] Write-back to unknown field 0x09 (partially) inhibited (0x00)
> [00:05.0] If the device doesn't work, try enabling permissive mode
> [00:05.0] (unsafe) and if it helps report the problem to xen-devel
> [00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)

I retried with the following:
pci=['05:00.0,permissive=1,msitranslate=1']
Those extra options suppressed some of the error logging, but still
didn't make the device usable in the domU.
The nvmecontrol command still gets an ABORTED result from the kernel...

The only message remaining in the QEMU log is this one:
[00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)

The xl dmesg output appears to be identical, except that this line is gone:
(XEN) d[IO]: assign (0000:05:00.0) failed (-16)
In both cases I see the following, which suggests that the MSI-X
failure has been worked around:
(XEN) d5: bind: m_gsi=16 g_gsi=36 dev=00.00.5 intx=0

So what's the situation now?



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 15:33         ` Roger Pau Monné
@ 2022-07-04 15:44           ` G.R.
  2022-07-04 15:57             ` Roger Pau Monné
  2022-07-04 16:05           ` Jan Beulich
  1 sibling, 1 reply; 31+ messages in thread
From: G.R. @ 2022-07-04 15:44 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel

On Mon, Jul 4, 2022 at 11:33 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> Right, so hvmloader attempts to place a BAR from 05:00.0 and a BAR
> from 00:17.0 into the same page, which is not that good behavior.  It
> might be sensible to attempt to share the page if both BARs belong to
> the same device, but not if they belong to different devices.
>
> I think the following patch:
>
> https://lore.kernel.org/xen-devel/20200117110811.43321-1-roger.pau@citrix.com/
>
> Might help with this.
>
> Thanks, Roger.
I suppose this patch has been released in a newer XEN version that I
can pick up if I decide to upgrade?
Which version would it be?

On the other hand, judging from the other experiment I did, this may
not be the only issue related to this device.
I'm still not sure whether the device or the SW stack is at fault this time...

Thanks,
G.R.



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 15:44           ` G.R.
@ 2022-07-04 15:57             ` Roger Pau Monné
  2022-07-05 18:06               ` Jason Andryuk
  0 siblings, 1 reply; 31+ messages in thread
From: Roger Pau Monné @ 2022-07-04 15:57 UTC (permalink / raw)
  To: G.R., jandryuk; +Cc: xen-devel

On Mon, Jul 04, 2022 at 11:44:14PM +0800, G.R. wrote:
> On Mon, Jul 4, 2022 at 11:33 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > Right, so hvmloader attempts to place a BAR from 05:00.0 and a BAR
> > from 00:17.0 into the same page, which is not that good behavior.  It
> > might be sensible to attempt to share the page if both BARs belong to
> > the same device, but not if they belong to different devices.
> >
> > I think the following patch:
> >
> > https://lore.kernel.org/xen-devel/20200117110811.43321-1-roger.pau@citrix.com/
> >
> > Might help with this.
> >
> > Thanks, Roger.
> I suppose this patch has been released in a newer XEN version that I
> can pick up if I decide to upgrade?
> Which version would it be?
> 
> On the other hand, according to the other experiment I did, this may
> not be the only issue related to this device.
> Still not sure if the device or the SW stack is faulty this time...

I don't think this patch has been applied to any release, adding Jason
who I think was also interested in the fix and might provide more
info.

Thanks, Roger.



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 15:37           ` G.R.
@ 2022-07-04 16:05             ` Roger Pau Monné
  2022-07-04 16:07               ` Jan Beulich
  2022-07-04 16:31               ` G.R.
  0 siblings, 2 replies; 31+ messages in thread
From: Roger Pau Monné @ 2022-07-04 16:05 UTC (permalink / raw)
  To: G.R.; +Cc: xen-devel, Jan Beulich

On Mon, Jul 04, 2022 at 11:37:13PM +0800, G.R. wrote:
> On Mon, Jul 4, 2022 at 11:15 PM G.R. <firemeteor@users.sourceforge.net> wrote:
> >
> > On Mon, Jul 4, 2022 at 10:51 PM G.R. <firemeteor@users.sourceforge.net> wrote:
> > >
> > > On Mon, Jul 4, 2022 at 9:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > Can you paste the lspci -vvv output for any other device you are also
> > > > passing through to this guest?
> > > >
> >
> > As reminded by this request, I tried to assign this nvme device to
> > another FreeBSD12 domU.
> Just to clarify, this time this NVME SSD is the only device I passed to this VM.
> 
> > This time it does not fail at the VM setup stage, but the device is
> > still not usable at the domU.
> > The nvmecontrol command is not able to talk to the device at all:
> > nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
> > nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
> > nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
> > nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
> >
> > The QEMU log says the following:
> > [00:05.0] Write-back to unknown field 0x09 (partially) inhibited (0x00)
> > [00:05.0] If the device doesn't work, try enabling permissive mode
> > [00:05.0] (unsafe) and if it helps report the problem to xen-devel
> > [00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)
> 
> I retried with the following:
> pci=['05:00.0,permissive=1,msitranslate=1']
> Those extra options suppressed some error logging, but still didn't
> make the device usable to the domU.
> The nvmecontrol command still gets an ABORTED result from the kernel...
> 
> The only thing remaining in the QEMU log file is this one:
> [00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)

Hm it seems like Xen doesn't find the position of the MSI-X table
correctly, given there's only one error path from msi.c returning
-ENODATA (61).
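For context, finding the MSI-X table means walking the device's PCI capability list in config space and reading the Table Offset/BIR register from the MSI-X capability (ID 0x11). A minimal illustrative sketch of that lookup in Python, operating on a 256-byte config-space dump (not Xen's actual code, just the mechanism):

```python
def find_msix_table(cfg):
    """Walk the PCI capability list in a 256-byte config-space dump
    and return (BIR, table_offset) from the MSI-X capability (ID 0x11),
    or None if the device has no MSI-X capability."""
    status = cfg[0x06] | (cfg[0x07] << 8)
    if not (status & 0x10):            # capabilities list not present
        return None
    pos = cfg[0x34] & 0xFC             # capabilities pointer
    while pos:
        cap_id, nxt = cfg[pos], cfg[pos + 1]
        if cap_id == 0x11:             # MSI-X capability
            table = int.from_bytes(cfg[pos + 4:pos + 8], 'little')
            return table & 0x7, table & ~0x7   # BIR in low 3 bits
        pos = nxt & 0xFC
    return None
```

If this walk comes up empty or returns a bogus BIR/offset, the mapping step has nothing valid to work with, which would match the error being reported here.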

Are there errors from pciback when this happens?  I would expect the
call to pci_prepare_msix() from pciback to fail and thus also report
some error?

I think it's likely I will have to provide an additional debug patch
to Xen; maybe Jan has an idea of what could be going on.

Roger.



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 15:33         ` Roger Pau Monné
  2022-07-04 15:44           ` G.R.
@ 2022-07-04 16:05           ` Jan Beulich
  1 sibling, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2022-07-04 16:05 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, G.R.

On 04.07.2022 17:33, Roger Pau Monné wrote:
> On Mon, Jul 04, 2022 at 10:51:53PM +0800, G.R. wrote:
>> On Mon, Jul 4, 2022 at 9:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>
>>>> 05:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a (prog-if 02 [NVM Express])
>>>>       Subsystem: Sandisk Corp Device 501a
>>>>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>>>>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>>       Latency: 0, Cache Line Size: 64 bytes
>>>>       Interrupt: pin A routed to IRQ 16
>>>>       NUMA node: 0
>>>>       IOMMU group: 13
>>>>       Region 0: Memory at a2600000 (64-bit, non-prefetchable) [size=16K]
>>>>       Region 4: Memory at a2604000 (64-bit, non-prefetchable) [size=256]
>>>
>>> I think I'm slightly confused, the overlapping happens at:
>>>
>>> (XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted
>>>
>>> So it's MFNs 0xa2616 and 0xa2504, yet none of those are in the BAR
>>> ranges of this device.
>>>
>>> Can you paste the lspci -vvv output for any other device you are also
>>> passing through to this guest?
>>>
>>
>> I just realized that the address may change in different environments.
>> In previous email chains, I used a cached dump from a Linux
>> environment running outside the hypervisor.
>> Sorry for the confusion. Refreshing with a XEN dom0 dump.
>>
>> The other device I used is a SATA controller. I think I can get what
>> you are looking for now.
>> Both a2616 and a2504 are found!
>>
>> 00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI
>> Controller (rev 10) (prog-if 01 [AHCI 1.0])
>>         DeviceName: Onboard - SATA
>>         Subsystem: Gigabyte Technology Co., Ltd Cannon Lake PCH SATA
>> AHCI Controller
>>         Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop-
>> ParErr- Stepping- SERR- FastB2B- DisINTx-
>>         Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>         Interrupt: pin A routed to IRQ 16
>>         Region 0: Memory at a2610000 (32-bit, non-prefetchable) [size=8K]
>>         Region 1: Memory at a2616000 (32-bit, non-prefetchable) [size=256]
>>         Region 2: I/O ports at 4090 [size=8]
>>         Region 3: I/O ports at 4080 [size=4]
>>         Region 4: I/O ports at 4060 [size=32]
>>
>> 05:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a
>> (prog-if 02 [NVM Express])
>>         Subsystem: Sandisk Corp Device 501a
>>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>> ParErr- Stepping- SERR- FastB2B- DisINTx-
>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>         Latency: 0, Cache Line Size: 64 bytes
>>         Interrupt: pin A routed to IRQ 11
>>         Region 0: Memory at a2500000 (64-bit, non-prefetchable) [size=16K]
>>         Region 4: Memory at a2504000 (64-bit, non-prefetchable) [size=256]
> 
> Right, so hvmloader attempts to place a BAR from 05:00.0 and a BAR
> from 00:17.0 into the same page, which is not that good behavior.  It
> might be sensible to attempt to share the page if both BARs belong to
> the same device, but not if they belong to different devices.
> 
> I think the following patch:
> 
> https://lore.kernel.org/xen-devel/20200117110811.43321-1-roger.pau@citrix.com/

Hmm, yes, we definitely want to revive that one. Having gone through
the discussion again, I think what is needed is suitable checking in
the tool stack and in Xen for proper alignment. Unless of course
non-page-aligned BARs could be adjusted "on the fly" by some
interaction with the kernel (perhaps at pci-assignable-add time), in
which case it would only be Xen where a (final) check would want
adding. Of course, if we can't adjust things "on the fly", then clear
direction needs to be provided to users as to what they need to do in
order to be able to assign a given device to a guest.
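For illustration, the page-sharing problem can be checked directly against the lspci dump quoted above; a small Python sketch (pfn_span is a helper name made up for this illustration):

```python
PAGE_SHIFT = 12   # 4 KiB pages

def pfn_span(base, size):
    """Page frame numbers touched by a BAR at `base` of `size` bytes."""
    return set(range(base >> PAGE_SHIFT,
                     ((base + size - 1) >> PAGE_SHIFT) + 1))

# Host addresses from the lspci dump above:
sata_region1 = pfn_span(0xa2616000, 256)   # 00:17.0 Region 1
nvme_region4 = pfn_span(0xa2504000, 256)   # 05:00.0 Region 4

# Each 256-byte BAR fills only a slice of a single 4 KiB page, so
# hvmloader may pack both into one guest page -- yet they are backed
# by different machine frames, the exact pair in Xen's error:
#   (XEN) d1: GFN 0xf3078 (0xa2616,...) -> (0xa2504,...) not permitted
assert sata_region1 == {0xa2616}
assert nvme_region4 == {0xa2504}
```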

Jan



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 16:05             ` Roger Pau Monné
@ 2022-07-04 16:07               ` Jan Beulich
  2022-07-04 16:31               ` G.R.
  1 sibling, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2022-07-04 16:07 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, G.R.

On 04.07.2022 18:05, Roger Pau Monné wrote:
> On Mon, Jul 04, 2022 at 11:37:13PM +0800, G.R. wrote:
>> On Mon, Jul 4, 2022 at 11:15 PM G.R. <firemeteor@users.sourceforge.net> wrote:
>>>
>>> On Mon, Jul 4, 2022 at 10:51 PM G.R. <firemeteor@users.sourceforge.net> wrote:
>>>>
>>>> On Mon, Jul 4, 2022 at 9:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>> Can you paste the lspci -vvv output for any other device you are also
>>>>> passing through to this guest?
>>>>>
>>>
>>> As reminded by this request, I tried to assign this nvme device to
>>> another FreeBSD12 domU.
>> Just to clarify, this time this NVME SSD is the only device I passed to this VM.
>>
>>> This time it does not fail at the VM setup stage, but the device is
>>> still not usable at the domU.
>>> The nvmecontrol command is not able to talk to the device at all:
>>> nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
>>> nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
>>> nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
>>> nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
>>>
>>> The QEMU log says the following:
>>> [00:05.0] Write-back to unknown field 0x09 (partially) inhibited (0x00)
>>> [00:05.0] If the device doesn't work, try enabling permissive mode
>>> [00:05.0] (unsafe) and if it helps report the problem to xen-devel
>>> [00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)
>>
>> I retried with the following:
>> pci=['05:00.0,permissive=1,msitranslate=1']
>> Those extra options suppressed some error logging, but still didn't
>> make the device usable to the domU.
>> The nvmecontrol command still gets an ABORTED result from the kernel...
>>
>> The only thing remaining in the QEMU log file is this one:
>> [00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)
> 
> Hm it seems like Xen doesn't find the position of the MSI-X table
> correctly, given there's only one error path from msi.c returning
> -ENODATA (61).
> 
> Are there errors from pciback when this happens?  I would expect the
> call to pci_prepare_msix() from pciback to fail and thus also report
> some error?
> 
> I think it's likely I will have to provide an additional debug patch
> to Xen, maybe Jan has an idea of what could be going on.

No, sorry, not without - as you say - further debugging output added.

Jan



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 16:05             ` Roger Pau Monné
  2022-07-04 16:07               ` Jan Beulich
@ 2022-07-04 16:31               ` G.R.
  2022-07-05  7:29                 ` Jan Beulich
  1 sibling, 1 reply; 31+ messages in thread
From: G.R. @ 2022-07-04 16:31 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Jan Beulich

[-- Attachment #1: Type: text/plain, Size: 2628 bytes --]

On Tue, Jul 5, 2022 at 12:21 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Mon, Jul 04, 2022 at 11:37:13PM +0800, G.R. wrote:
> > On Mon, Jul 4, 2022 at 11:15 PM G.R. <firemeteor@users.sourceforge.net> wrote:
> > >
> > > On Mon, Jul 4, 2022 at 10:51 PM G.R. <firemeteor@users.sourceforge.net> wrote:
> > > >
> > > > On Mon, Jul 4, 2022 at 9:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > Can you paste the lspci -vvv output for any other device you are also
> > > > > passing through to this guest?
> > > > >
> > >
> > > As reminded by this request, I tried to assign this nvme device to
> > > another FreeBSD12 domU.
> > Just to clarify, this time this NVME SSD is the only device I passed to this VM.
> >
> > > This time it does not fail at the VM setup stage, but the device is
> > > still not usable at the domU.
> > > The nvmecontrol command is not able to talk to the device at all:
> > > nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
> > > nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
> > > nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
> > > nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
> > >
> > > The QEMU log says the following:
> > > [00:05.0] Write-back to unknown field 0x09 (partially) inhibited (0x00)
> > > [00:05.0] If the device doesn't work, try enabling permissive mode
> > > [00:05.0] (unsafe) and if it helps report the problem to xen-devel
> > > [00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)
> >
> > I retried with the following:
> > pci=['05:00.0,permissive=1,msitranslate=1']
> > Those extra options suppressed some error logging, but still didn't
> > make the device usable to the domU.
> > The nvmecontrol command still gets an ABORTED result from the kernel...
> > 
> > The only thing remaining in the QEMU log file is this one:
> > [00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)
>
> Hm it seems like Xen doesn't find the position of the MSI-X table
> correctly, given there's only one error path from msi.c returning
> -ENODATA (61).
>
> Are there errors from pciback when this happens?  I would expect the
> call to pci_prepare_msix() from pciback to fail and thus also report
> some error?
>
> I think it's likely I will have to provide an additional debug patch
> to Xen, maybe Jan has an idea of what could be going on.
>
pciback reports the same MSI-X related error.
But even with DEBUG enabled, I didn't see any more context reported.
Please find the details in the attachment.

> Roger.

[-- Attachment #2: pciback_dbg_xl-pci_assignable_XXX.log --]
[-- Type: text/x-log, Size: 6680 bytes --]

root@gaia:~# xl pci-assignable-add 05:00.0
libxl: warning: libxl_pci.c:814:libxl__device_pci_assignable_add: 0000:05:00.0 not bound to a driver, will not be rebound.

[  323.448115] xen_pciback: wants to seize 0000:05:00.0
[  323.448136] pciback 0000:05:00.0: xen_pciback: probing...
[  323.448137] pciback 0000:05:00.0: xen_pciback: seizing device
[  323.448162] pciback 0000:05:00.0: xen_pciback: pcistub_device_alloc
[  323.448162] pciback 0000:05:00.0: xen_pciback: initializing...
[  323.448163] pciback 0000:05:00.0: xen_pciback: initializing config
[  323.448344] pciback 0000:05:00.0: xen_pciback: enabling device
[  323.448425] xen: registering gsi 16 triggering 0 polarity 1
[  323.448428] Already setup the GSI :16
[  323.448497] pciback 0000:05:00.0: xen_pciback: save state of device
[  323.448642] pciback 0000:05:00.0: xen_pciback: resetting (FLR, D3, etc) the device
[  323.448707] pcieport 0000:00:1d.0: DPC: containment event, status:0x1f11 source:0x0000
[  323.448730] pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
[  323.448760] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[  323.448786] pcieport 0000:00:1d.0:   device [8086:a330] error status/mask=00200000/00010000
[  323.448813] pcieport 0000:00:1d.0:    [21] ACSViol                (First)
[  324.690979] pciback 0000:05:00.0: not ready 1023ms after FLR; waiting
[  325.730706] pciback 0000:05:00.0: not ready 2047ms after FLR; waiting
[  327.997638] pciback 0000:05:00.0: not ready 4095ms after FLR; waiting
[  332.264251] pciback 0000:05:00.0: not ready 8191ms after FLR; waiting
[  340.584320] pciback 0000:05:00.0: not ready 16383ms after FLR; waiting
[  357.010896] pciback 0000:05:00.0: not ready 32767ms after FLR; waiting
[  391.143951] pciback 0000:05:00.0: not ready 65535ms after FLR; giving up
[  392.249252] pciback 0000:05:00.0: xen_pciback: reset device
[  392.249392] pciback 0000:05:00.0: xen_pciback: xen_pcibk_error_detected(bus:5,devfn:0)
[  392.249393] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
[  392.397074] pciback 0000:05:00.0: xen_pciback: xen_pcibk_error_resume(bus:5,devfn:0)
[  392.397080] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
[  392.397284] pcieport 0000:00:1d.0: AER: device recovery successful

libxl: error: libxl_pci.c:835:libxl__device_pci_assignable_add: failed to quarantine 0000:05:00.0

root@gaia:~# xl pci-assignable-remove 05:00.0
libxl: error: libxl_pci.c:853:libxl__device_pci_assignable_remove: failed to de-quarantine 0000:05:00.0
root@gaia:~# xl pci-assignable-add 05:00.0
libxl: warning: libxl_pci.c:794:libxl__device_pci_assignable_add: 0000:05:00.0 already assigned to pciback
root@gaia:~# xl pci-assignable-remove 05:00.0
[  603.928039] pciback 0000:05:00.0: xen_pciback: removing
[  603.928041] pciback 0000:05:00.0: xen_pciback: found device to remove 
[  603.928042] pciback 0000:05:00.0: xen_pciback: pcistub_device_release
[  604.033372] pcieport 0000:00:1d.0: DPC: containment event, status:0x1f11 source:0x0000
[  604.033512] pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
[  604.033631] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[  604.033758] pcieport 0000:00:1d.0:   device [8086:a330] error status/mask=00100000/00010000
[  604.033856] pcieport 0000:00:1d.0:    [20] UnsupReq               (First)
[  604.033939] pcieport 0000:00:1d.0: AER:   TLP Header: 34000000 05000010 00000000 88458845
[  604.034059] pci 0000:05:00.0: AER: can't recover (no error_detected callback)
[  604.034421] xen_pciback: removed 0000:05:00.0 from seize list
[  604.182597] pcieport 0000:00:1d.0: AER: device recovery successful

root@gaia:~# xl pci-assignable-add 05:00.0
libxl: warning: libxl_pci.c:814:libxl__device_pci_assignable_add: 0000:05:00.0 not bound to a driver, will not be rebound.
[  667.582051] xen_pciback: wants to seize 0000:05:00.0
[  667.582130] pciback 0000:05:00.0: xen_pciback: probing...
[  667.582134] pciback 0000:05:00.0: xen_pciback: seizing device
[  667.582228] pciback 0000:05:00.0: xen_pciback: pcistub_device_alloc
[  667.582231] pciback 0000:05:00.0: xen_pciback: initializing...
[  667.582235] pciback 0000:05:00.0: xen_pciback: initializing config
[  667.582548] pciback 0000:05:00.0: xen_pciback: enabling device
[  667.582599] pciback 0000:05:00.0: enabling device (0000 -> 0002)
[  667.582912] xen: registering gsi 16 triggering 0 polarity 1
[  667.582923] Already setup the GSI :16
[  667.583061] pciback 0000:05:00.0: xen_pciback: MSI-X preparation failed (-6)
[  667.583148] pciback 0000:05:00.0: xen_pciback: save state of device
[  667.583569] pciback 0000:05:00.0: xen_pciback: resetting (FLR, D3, etc) the device
[  667.689656] pciback 0000:05:00.0: xen_pciback: reset device

root@gaia:~# xl pci-assignable-remove 05:00.0
[  720.957988] pciback 0000:05:00.0: xen_pciback: removing
[  720.957996] pciback 0000:05:00.0: xen_pciback: found device to remove 
[  720.957999] pciback 0000:05:00.0: xen_pciback: pcistub_device_release
[  721.065222] pciback 0000:05:00.0: xen_pciback: MSI-X release failed (-16)
[  721.065667] xen_pciback: removed 0000:05:00.0 from seize list

root@gaia:~# xl pci-assignable-add 05:00.0
libxl: warning: libxl_pci.c:814:libxl__device_pci_assignable_add: 0000:05:00.0 not bound to a driver, will not be rebound.

[  763.888631] xen_pciback: wants to seize 0000:05:00.0
[  763.888690] pciback 0000:05:00.0: xen_pciback: probing...
[  763.888691] pciback 0000:05:00.0: xen_pciback: seizing device
[  763.888716] pciback 0000:05:00.0: xen_pciback: pcistub_device_alloc
[  763.888717] pciback 0000:05:00.0: xen_pciback: initializing...
[  763.888717] pciback 0000:05:00.0: xen_pciback: initializing config
[  763.888804] pciback 0000:05:00.0: xen_pciback: enabling device
[  763.888885] xen: registering gsi 16 triggering 0 polarity 1
[  763.888889] Already setup the GSI :16
[  763.888949] pciback 0000:05:00.0: xen_pciback: MSI-X preparation failed (-6)
[  763.888977] pciback 0000:05:00.0: xen_pciback: save state of device
[  763.889126] pciback 0000:05:00.0: xen_pciback: resetting (FLR, D3, etc) the device
[  763.994206] pciback 0000:05:00.0: xen_pciback: reset device

root@gaia:~# xl pci-assignable-remove 05:00.0
[  819.491000] pciback 0000:05:00.0: xen_pciback: removing
[  819.491002] pciback 0000:05:00.0: xen_pciback: found device to remove 
[  819.491003] pciback 0000:05:00.0: xen_pciback: pcistub_device_release
[  819.596113] pciback 0000:05:00.0: xen_pciback: MSI-X release failed (-16)
[  819.596466] xen_pciback: removed 0000:05:00.0 from seize list
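The "not ready ...ms after FLR; waiting" lines earlier in this log back off exponentially before giving up at roughly 65 seconds. A minimal sketch of that wait pattern (is_ready and sleep_ms are stand-in names for this illustration, not kernel APIs):

```python
def wait_for_ready(is_ready, sleep_ms, timeout_ms=65535):
    """Poll `is_ready()` with exponentially growing delays, in the
    spirit of the post-FLR wait in the log above.  Returns the total
    milliseconds waited on success, or None after giving up, matching
    the "giving up" line in the log."""
    delay = 1000
    waited = 0
    while True:
        sleep_ms(delay)
        waited += delay
        if is_ready():
            return waited
        if waited >= timeout_ms:
            return None        # "not ready 65535ms after FLR; giving up"
        delay *= 2
```

A device that never comes back, as in the first attempt above, exhausts the whole backoff sequence and the caller then has to treat the reset as failed.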



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 16:31               ` G.R.
@ 2022-07-05  7:29                 ` Jan Beulich
  2022-07-05 11:31                   ` G.R.
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2022-07-05  7:29 UTC (permalink / raw)
  To: G.R.; +Cc: xen-devel, Roger Pau Monné

On 04.07.2022 18:31, G.R. wrote:
> On Tue, Jul 5, 2022 at 12:21 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>
>> On Mon, Jul 04, 2022 at 11:37:13PM +0800, G.R. wrote:
>>> On Mon, Jul 4, 2022 at 11:15 PM G.R. <firemeteor@users.sourceforge.net> wrote:
>>>>
>>>> On Mon, Jul 4, 2022 at 10:51 PM G.R. <firemeteor@users.sourceforge.net> wrote:
>>>>>
>>>>> On Mon, Jul 4, 2022 at 9:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>>> Can you paste the lspci -vvv output for any other device you are also
>>>>>> passing through to this guest?
>>>>>>
>>>>
>>>> As reminded by this request, I tried to assign this nvme device to
>>>> another FreeBSD12 domU.
>>> Just to clarify, this time this NVME SSD is the only device I passed to this VM.
>>>
>>>> This time it does not fail at the VM setup stage, but the device is
>>>> still not usable at the domU.
>>>> The nvmecontrol command is not able to talk to the device at all:
>>>> nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
>>>> nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
>>>> nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
>>>> nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
>>>>
>>>> The QEMU log says the following:
>>>> [00:05.0] Write-back to unknown field 0x09 (partially) inhibited (0x00)
>>>> [00:05.0] If the device doesn't work, try enabling permissive mode
>>>> [00:05.0] (unsafe) and if it helps report the problem to xen-devel
>>>> [00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)
>>>
>>> I retried with the following:
>>> pci=['05:00.0,permissive=1,msitranslate=1']
>>> Those extra options suppressed some error logging, but still didn't
>>> make the device usable to the domU.
>>> The nvmecontrol command still get ABORTED result from the kernel...
>>>
>>> The only thing remained in the QEMU file is this one:
>>> [00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)
>>
>> Hm it seems like Xen doesn't find the position of the MSI-X table
>> correctly, given there's only one error path from msi.c returning
>> -ENODATA (61).
>>
>> Are there errors from pciback when this happens?  I would expect the
>> call to pci_prepare_msix() from pciback to fail and thus also report
>> some error?
>>
>> I think it's likely I will have to provide an additional debug patch
>> to Xen, maybe Jan has an idea of what could be going on.
>>
> pciback reports the same MSI-x related error.
> But even with DEBUG enabled, I didn't see more context reported.
> Please find details from the attachment.

And nothing pertinent in "xl dmesg"? Looking back through the thread I
couldn't spot a complete hypervisor log (i.e. from boot to assignment
attempt). An issue with MSI-X table determination, as Roger suspects,
would typically be associated with a prominent warning emitted to the
log. But there are also further possible sources of -ENXIO, which
would go silently.
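The numeric codes scattered through this thread are ordinary Linux errno values, which makes the various logs easier to cross-reference; a quick sanity check in Python (these numeric values are Linux-specific):

```python
import errno

# err 61 in QEMU's msi_msix_setup message:
assert errno.errorcode[61] == 'ENODATA'
# pciback's "MSI-X preparation failed (-6)":
assert errno.errorcode[6] == 'ENXIO'
# pciback's "MSI-X release failed (-16)":
assert errno.errorcode[16] == 'EBUSY'
```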

Jan



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-05  7:29                 ` Jan Beulich
@ 2022-07-05 11:31                   ` G.R.
  2022-07-05 11:59                     ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: G.R. @ 2022-07-05 11:31 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Roger Pau Monné

[-- Attachment #1: Type: text/plain, Size: 1826 bytes --]

On Tue, Jul 5, 2022 at 5:04 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 04.07.2022 18:31, G.R. wrote:
> > On Tue, Jul 5, 2022 at 12:21 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >>> I retried with the following:
> >>> pci=['05:00.0,permissive=1,msitranslate=1']
> >>> Those extra options suppressed some error logging, but still didn't
> >>> make the device usable to the domU.
> >>> The nvmecontrol command still gets an ABORTED result from the kernel...
> >>>
> >>> The only thing remaining in the QEMU log file is this one:
> >>> [00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)
> >>
> >> Hm it seems like Xen doesn't find the position of the MSI-X table
> >> correctly, given there's only one error path from msi.c returning
> >> -ENODATA (61).
> >>
> >> Are there errors from pciback when this happens?  I would expect the
> >> call to pci_prepare_msix() from pciback to fail and thus also report
> >> some error?
> >>
> >> I think it's likely I will have to provide an additional debug patch
> >> to Xen, maybe Jan has an idea of what could be going on.
> >>
> > pciback reports the same MSI-x related error.
> > But even with DEBUG enabled, I didn't see more context reported.
> > Please find details from the attachment.
>
> And nothing pertinent in "xl dmesg"? Looking back through the thread I
> couldn't spot a complete hypervisor log (i.e. from boot to assignment
> attempt). An issue with MSI-X table determination, as Roger suspects,
> would typically be associated with a prominent warning emitted to the
> log. But there are also further possible sources of -ENXIO, which
> would go silently.
Please find the xl dmesg output in the attachment.
It covers the two FreeBSD domU attempts, so it should have captured
some culprits if there are any...

[-- Attachment #2: xldmesg_really_full.log --]
[-- Type: application/octet-stream, Size: 29261 bytes --]

 Xen 4.14.3
(XEN) Xen version 4.14.3 (firemeteor@) (gcc (Debian 11.2.0-13) 11.2.0) debug=n  Fri Jan  7 21:28:52 HKT 2022
(XEN) Latest ChangeSet: Sat Jan 14 15:41:32 2017 +0800 git:ff792a893a-dirty
(XEN) build-id: c43fcc69fb5fc9bcd292b31ba618a7d2335ec69a
(XEN) Bootloader: GRUB 2.04-20
(XEN) Command line: placeholder dom0_mem=2G,max:3G,min:1G dom0_max_vcpus=4 loglvl=all guest_loglvl=all iommu=verbose
(XEN) Xen image load base address: 0x87a00000
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN)  Found 5 MBR signatures
(XEN)  Found 5 EDD information structures
(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 158 (0x9e), Stepping 10 (raw 000906ea)
(XEN) Xen-e820 RAM map:
(XEN)  [0000000000000000, 000000000009d3ff] (usable)
(XEN)  [000000000009d400, 000000000009ffff] (reserved)
(XEN)  [00000000000e0000, 00000000000fffff] (reserved)
(XEN)  [0000000000100000, 00000000835bffff] (usable)
(XEN)  [00000000835c0000, 00000000835c0fff] (ACPI NVS)
(XEN)  [00000000835c1000, 00000000835c1fff] (reserved)
(XEN)  [00000000835c2000, 0000000088c0bfff] (usable)
(XEN)  [0000000088c0c000, 000000008907dfff] (reserved)
(XEN)  [000000008907e000, 00000000891f4fff] (usable)
(XEN)  [00000000891f5000, 00000000895dcfff] (ACPI NVS)
(XEN)  [00000000895dd000, 0000000089efefff] (reserved)
(XEN)  [0000000089eff000, 0000000089efffff] (usable)
(XEN)  [0000000089f00000, 000000008f7fffff] (reserved)
(XEN)  [00000000e0000000, 00000000efffffff] (reserved)
(XEN)  [00000000fe000000, 00000000fe010fff] (reserved)
(XEN)  [00000000fec00000, 00000000fec00fff] (reserved)
(XEN)  [00000000fee00000, 00000000fee00fff] (reserved)
(XEN)  [00000000ff000000, 00000000ffffffff] (reserved)
(XEN)  [0000000100000000, 000000086e7fffff] (usable)
(XEN) ACPI: RSDP 000F05B0, 0024 (r2 ALASKA)
(XEN) ACPI: XSDT 895120A8, 00D4 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FACP 895509C0, 0114 (r6 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: DSDT 89512218, 3E7A6 (r2 ALASKA    A M I  1072009 INTL 20160527)
(XEN) ACPI: FACS 895DC080, 0040
(XEN) ACPI: APIC 89550AD8, 00F4 (r4 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FPDT 89550BD0, 0044 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FIDT 89550C18, 009C (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: MCFG 89550CB8, 003C (r1 ALASKA    A M I  1072009 MSFT       97)
(XEN) ACPI: SSDT 89550CF8, 0204 (r1 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 89550F00, 17D5 (r2 ALASKA    A M I     3000 INTL 20160527)
(XEN) ACPI: SSDT 895526D8, 933D (r1 ALASKA    A M I        1 INTL 20160527)
(XEN) ACPI: SSDT 8955BA18, 31C7 (r2 ALASKA    A M I     3000 INTL 20160527)
(XEN) ACPI: SSDT 8955EBE0, 2358 (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: HPET 89560F38, 0038 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: SSDT 89560F70, 1BE1 (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 89562B58, 0F9E (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 89563AF8, 2D1B (r2 ALASKA    A M I        0 INTL 20160527)
(XEN) ACPI: UEFI 89566818, 0042 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: LPIT 89566860, 005C (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: SSDT 895668C0, 27DE (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 895690A0, 0FFE (r2 ALASKA    A M I        0 INTL 20160527)
(XEN) ACPI: DBGP 8956A0A0, 0034 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: DBG2 8956A0D8, 0054 (r0 ALASKA    A M I        2       1000013)
(XEN) ACPI: DMAR 8956A130, 00A8 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: WSMT 8956A1D8, 0028 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) System RAM: 32629MB (33412220kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-000000086e800000
(XEN) Domain heap initialised
(XEN) found SMP MP-table at 000fce10
(XEN) SMBIOS 3.1 present.
(XEN) Using APIC driver default
(XEN) ACPI: PM-Timer IO Port: 0x1808 (24 bits)
(XEN) ACPI: v5 SLEEP INFO: control[1:1804], status[1:1800]
(XEN) ACPI: Invalid sleep control/status register data: 0:0x8:0x3 0:0x8:0x3
(XEN) ACPI: SLEEP INFO: pm1x_cnt[1:1804,1:0], pm1x_evt[1:1800,1:0]
(XEN) ACPI: 32/64X FACS address mismatch in FADT - 895dc080/0000000000000000, using 32
(XEN) ACPI:             wakeup_vec[895dc08c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x08] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x0a] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x01] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x08] lapic_id[0x03] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x09] lapic_id[0x05] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x07] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x09] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x0b] enabled)
(XEN) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x08] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x09] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x0a] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x0b] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x0c] high edge lint[0x1])
(XEN) Overriding APIC driver with bigsmp
(XEN) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-119
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) Enabling APIC mode:  Phys.  Using 1 I/O APICs
(XEN) ACPI: HPET id: 0x8086a201 base: 0xfed00000
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0000 buses 00 - ff
(XEN) PCI: MCFG area at e0000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-ff
(XEN) [VT-D]Host address width 39
(XEN) [VT-D]found ACPI_DMAR_DRHD:
(XEN) [VT-D]  dmaru->address = fed90000
(XEN) [VT-D]drhd->address = fed90000 iommu->reg = ffff82c00021d000
(XEN) [VT-D]cap = 1c0000c40660462 ecap = 19e2ff0505e
(XEN) [VT-D] endpoint: 0000:00:02.0
(XEN) [VT-D]found ACPI_DMAR_DRHD:
(XEN) [VT-D]  dmaru->address = fed91000
(XEN) [VT-D]drhd->address = fed91000 iommu->reg = ffff82c00021f000
(XEN) [VT-D]cap = d2008c40660462 ecap = f050da
(XEN) [VT-D] IOAPIC: 0000:00:1e.7
(XEN) [VT-D] MSI HPET: 0000:00:1e.6
(XEN) [VT-D]  flags: INCLUDE_ALL
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:00:14.0
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:00:02.0
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 12 CPUs (0 hotplug CPUs)
(XEN) IRQ limits: 120 GSI, 2376 MSI/MSI-X
(XEN) Switched to APIC driver x2apic_cluster
(XEN) CPU0: TSC: ratio: 292 / 2
(XEN) CPU0: bus: 100 MHz base: 3500 MHz max: 4500 MHz
(XEN) CPU0: 800 ... 3500 MHz
(XEN) xstate: size: 0x440 and states: 0x1f
(XEN) CPU0: Intel machine check reporting enabled
(XEN) Speculative mitigation facilities:
(XEN)   Hardware features: IBRS/IBPB STIBP L1D_FLUSH SSBD MD_CLEAR
(XEN)   Compiled-in support: INDIRECT_THUNK SHADOW_PAGING
(XEN)   Xen settings: BTI-Thunk JMP, SPEC_CTRL: IBRS+ SSBD-, Other: IBPB L1D_FLUSH VERW BRANCH_HARDEN
(XEN)   L1TF: believed vulnerable, maxphysaddr L1D 46, CPUID 39, Safe address 8000000000
(XEN)   Support for HVM VMs: MSR_SPEC_CTRL RSB EAGER_FPU MD_CLEAR
(XEN)   Support for PV VMs: MSR_SPEC_CTRL RSB EAGER_FPU MD_CLEAR
(XEN)   XPTI (64-bit PV only): Dom0 enabled, DomU enabled (with PCID)
(XEN)   PV L1TF shadowing: Dom0 disabled, DomU enabled
(XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Initializing Credit2 scheduler
(XEN)  load_precision_shift: 18
(XEN)  load_window_shift: 30
(XEN)  underload_balance_tolerance: 0
(XEN)  overload_balance_tolerance: -3
(XEN)  runqueues arrangement: socket
(XEN)  cap enforcement granularity: 10ms
(XEN) load tracking window length 1073741824 ns
(XEN) Platform timer is 24.000MHz HPET
(XEN) Detected 3504.520 MHz processor.
(XEN) alt table ffff82d040439290 -> ffff82d040444238
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d Snoop Control not enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Posted Interrupt not enabled.
(XEN) Intel VT-d Shared EPT tables enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) nr_sockets: 1
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) TSC deadline timer enabled
(XEN) Allocated console ring of 128 KiB.
(XEN) mwait-idle: MWAIT substates: 0x11142120
(XEN) mwait-idle: v0.4.1 model 0x9e
(XEN) mwait-idle: lapic_timer_reliable_states 0xffffffff
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN)  - VMCS shadowing
(XEN)  - VM Functions
(XEN)  - Virtualisation Exceptions
(XEN)  - Page Modification Logging
(XEN) HVM: ASIDs enabled.
(XEN) VMX: Disabling executable EPT superpages due to CVE-2018-12207
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) alt table ffff82d040439290 -> ffff82d040444238
(XEN) Brought up 12 CPUs
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Adding cpu 0 to runqueue 0
(XEN)  First cpu on runqueue, activating
(XEN) Adding cpu 1 to runqueue 0
(XEN) Adding cpu 2 to runqueue 0
(XEN) Adding cpu 3 to runqueue 0
(XEN) Adding cpu 4 to runqueue 0
(XEN) Adding cpu 5 to runqueue 0
(XEN) Adding cpu 6 to runqueue 0
(XEN) Adding cpu 7 to runqueue 0
(XEN) Adding cpu 8 to runqueue 0
(XEN) Adding cpu 9 to runqueue 0
(XEN) Adding cpu 10 to runqueue 0
(XEN) Adding cpu 11 to runqueue 0
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) NX (Execute Disable) protection active
(XEN) Dom0 has maximum 952 PIRQs
(XEN) *** Building a PV Dom0 ***
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x2e2c000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000000850000000->0000000854000000 (504314 pages to be allocated)
(XEN)  Init. ramdisk: 000000086d9fa000->000000086e7ff3f6
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff82e2c000
(XEN)  Init. ramdisk: 0000000000000000->0000000000000000
(XEN)  Phys-Mach map: 0000008000000000->0000008000400000
(XEN)  Start info:    ffffffff82e2c000->ffffffff82e2c4b8
(XEN)  Xenstore ring: 0000000000000000->0000000000000000
(XEN)  Console ring:  0000000000000000->0000000000000000
(XEN)  Page tables:   ffffffff82e2d000->ffffffff82e48000
(XEN)  Boot stack:    ffffffff82e48000->ffffffff82e49000
(XEN)  TOTAL:         ffffffff80000000->ffffffff83000000
(XEN)  ENTRY ADDRESS: ffffffff82ab6160
(XEN) Dom0 has maximum 4 VCPUs
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021d000
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021f000
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM in background
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***************************************************
(XEN) Booted on L1TF-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Please assess your configuration and choose an
(XEN) explicit 'smt=<bool>' setting.  See XSA-273.
(XEN) ***************************************************
(XEN) Booted on MLPDS/MFBDS-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Mitigations will not be fully effective.  Please
(XEN) choose an explicit smt=<bool> setting.  See XSA-297.
(XEN) ***************************************************
(XEN) 3... 2... 1... 
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) Freed 552kB init memory
(XEN) PCI add device 0000:00:00.0
(XEN) PCI add device 0000:00:01.0
(XEN) PCI add device 0000:00:02.0
(XEN) PCI add device 0000:00:12.0
(XEN) PCI add device 0000:00:14.0
(XEN) PCI add device 0000:00:14.2
(XEN) PCI add device 0000:00:16.0
(XEN) PCI add device 0000:00:16.3
(XEN) PCI add device 0000:00:17.0
(XEN) PCI add device 0000:00:1b.0
(XEN) PCI add device 0000:00:1c.0
(XEN) PCI add device 0000:00:1c.5
(XEN) PCI add device 0000:00:1d.0
(XEN) PCI add device 0000:00:1f.0
(XEN) PCI add device 0000:00:1f.4
(XEN) PCI add device 0000:00:1f.5
(XEN) PCI add device 0000:01:00.0
(XEN) PCI add device 0000:04:00.0
(XEN) PCI add device 0000:05:00.0
(XEN) HVM d1v0 save: CPU
(XEN) HVM d1v1 save: CPU
(XEN) HVM d1 save: PIC
(XEN) HVM d1 save: IOAPIC
(XEN) HVM d1v0 save: LAPIC
(XEN) HVM d1v1 save: LAPIC
(XEN) HVM d1v0 save: LAPIC_REGS
(XEN) HVM d1v1 save: LAPIC_REGS
(XEN) HVM d1 save: PCI_IRQ
(XEN) HVM d1 save: ISA_IRQ
(XEN) HVM d1 save: PCI_LINK
(XEN) HVM d1 save: PIT
(XEN) HVM d1 save: RTC
(XEN) HVM d1 save: HPET
(XEN) HVM d1 save: PMTIMER
(XEN) HVM d1v0 save: MTRR
(XEN) HVM d1v1 save: MTRR
(XEN) HVM d1 save: VIRIDIAN_DOMAIN
(XEN) HVM d1v0 save: CPU_XSAVE
(XEN) HVM d1v1 save: CPU_XSAVE
(XEN) HVM d1v0 save: VIRIDIAN_VCPU
(XEN) HVM d1v1 save: VIRIDIAN_VCPU
(XEN) HVM d1v0 save: VMCE_VCPU
(XEN) HVM d1v1 save: VMCE_VCPU
(XEN) HVM d1v0 save: TSC_ADJUST
(XEN) HVM d1v1 save: TSC_ADJUST
(XEN) HVM d1v0 save: CPU_MSR
(XEN) HVM d1v1 save: CPU_MSR
(XEN) HVM1 restore: CPU 0
(XEN) d1: bind: m_gsi=16 g_gsi=36 dev=00.00.5 intx=0
(d1) HVM Loader
(d1) Detected Xen v4.14.3
(d1) Xenbus rings @0xfeffc000, event channel 1
(d1) System requested SeaBIOS
(d1) CPU speed is 3505 MHz
(d1) Relocating guest memory for lowmem MMIO space disabled
(d1) PCI-ISA link 0 routed to IRQ5
(d1) PCI-ISA link 1 routed to IRQ10
(d1) PCI-ISA link 2 routed to IRQ11
(d1) PCI-ISA link 3 routed to IRQ5
(d1) pci dev 01:3 INTA->IRQ10
(d1) pci dev 02:0 INTA->IRQ11
(d1) pci dev 04:0 INTA->IRQ5
(d1) pci dev 05:0 INTA->IRQ10
(d1) RAM in high memory; setting high_mem resource base to 40f800000
(d1) pci dev 03:0 bar 10 size 002000000: 0f0000008
(d1) pci dev 02:0 bar 14 size 001000000: 0f2000008
(d1) pci dev 04:0 bar 30 size 000040000: 0f3000000
(d1) pci dev 04:0 bar 10 size 000020000: 0f3040000
(d1) pci dev 03:0 bar 30 size 000010000: 0f3060000
(d1) pci dev 05:0 bar 10 size 000002000: 0f3070000
(d1) pci dev 03:0 bar 14 size 000001000: 0f3072000
(d1) pci dev 05:0 bar 24 size 000000800: 0f3073000
(d1) pci dev 02:0 bar 10 size 000000100: 00000c001
(d1) pci dev 05:0 bar 14 size 000000100: 0f3073800
(d1) pci dev 04:0 bar 14 size 000000040: 00000c101
(d1) pci dev 05:0 bar 20 size 000000020: 00000c141
(d1) pci dev 01:1 bar 20 size 000000010: 00000c161
(d1) pci dev 05:0 bar 18 size 000000008: 00000c171
(d1) pci dev 05:0 bar 1c size 000000004: 00000c179
(XEN) memory_map:add: dom1 gfn=f3070 mfn=a2610 nr=2
(XEN) memory_map:add: dom1 gfn=f3073 mfn=a2615 nr=1
(XEN) memory_map:add: dom1 gfn=f3074 mfn=a2616 nr=1
(XEN) ioport_map:add: dom1 gport=c140 mport=4060 nr=20
(XEN) ioport_map:add: dom1 gport=c170 mport=4090 nr=8
(XEN) ioport_map:add: dom1 gport=c178 mport=4080 nr=4
(d1) Multiprocessor initialisation:
(d1)  - CPU0 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d1)  - CPU1 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d1) Writing SMBIOS tables ...
(d1) Loading SeaBIOS ...
(d1) Creating MP tables ...
(d1) Loading ACPI ...
(d1) vm86 TSS at fc100280
(d1) BIOS map:
(d1)  10000-100e3: Scratch space
(d1)  c0000-fffff: Main BIOS
(d1) E820 table:
(d1)  [00]: 00000000:00000000 - 00000000:000a0000: RAM
(d1)  HOLE: 00000000:000a0000 - 00000000:000c0000
(d1)  [01]: 00000000:000c0000 - 00000000:00100000: RESERVED
(d1)  [02]: 00000000:00100000 - 00000000:f0000000: RAM
(d1)  HOLE: 00000000:f0000000 - 00000000:fc000000
(d1)  [03]: 00000000:fc000000 - 00000000:fc00b000: NVS
(d1)  [04]: 00000000:fc00b000 - 00000001:00000000: RESERVED
(d1)  [05]: 00000001:00000000 - 00000004:0f800000: RAM
(d1) Invoking SeaBIOS ...
(d1) SeaBIOS (version rel-1.13.0-1-gd542924-Xen)
(d1) BUILD: gcc: (Debian 11.2.0-13) 11.2.0 binutils: (GNU Binutils for Debian) 2.37
(d1) 
(d1) Found Xen hypervisor signature at 40000000
(XEN) memory_map:remove: dom1 gfn=f3070 mfn=a2610 nr=2
(XEN) memory_map:remove: dom1 gfn=f3073 mfn=a2615 nr=1
(XEN) memory_map:remove: dom1 gfn=f3074 mfn=a2616 nr=1
(XEN) memory_map:add: dom1 gfn=f3070 mfn=a2610 nr=2
(XEN) memory_map:add: dom1 gfn=f3073 mfn=a2615 nr=1
(XEN) memory_map:add: dom1 gfn=f3074 mfn=a2616 nr=1
(XEN) memory_map:remove: dom1 gfn=f3070 mfn=a2610 nr=2
(XEN) memory_map:remove: dom1 gfn=f3073 mfn=a2615 nr=1
(XEN) memory_map:remove: dom1 gfn=f3074 mfn=a2616 nr=1
(XEN) memory_map:add: dom1 gfn=f3070 mfn=a2610 nr=2
(XEN) memory_map:add: dom1 gfn=f3073 mfn=a2615 nr=1
(XEN) memory_map:add: dom1 gfn=f3074 mfn=a2616 nr=1
(XEN) ioport_map:remove: dom1 gport=c140 mport=4060 nr=20
(XEN) ioport_map:remove: dom1 gport=c170 mport=4090 nr=8
(XEN) ioport_map:remove: dom1 gport=c178 mport=4080 nr=4
(XEN) ioport_map:add: dom1 gport=c140 mport=4060 nr=20
(XEN) ioport_map:add: dom1 gport=c170 mport=4090 nr=8
(XEN) ioport_map:add: dom1 gport=c178 mport=4080 nr=4
(XEN) ioport_map:remove: dom1 gport=c140 mport=4060 nr=20
(XEN) ioport_map:remove: dom1 gport=c170 mport=4090 nr=8
(XEN) ioport_map:remove: dom1 gport=c178 mport=4080 nr=4
(XEN) ioport_map:add: dom1 gport=c140 mport=4060 nr=20
(XEN) ioport_map:add: dom1 gport=c170 mport=4090 nr=8
(XEN) ioport_map:add: dom1 gport=c178 mport=4080 nr=4
(XEN) ioport_map:remove: dom1 gport=c140 mport=4060 nr=20
(XEN) ioport_map:remove: dom1 gport=c170 mport=4090 nr=8
(XEN) ioport_map:remove: dom1 gport=c178 mport=4080 nr=4
(XEN) ioport_map:add: dom1 gport=c140 mport=4060 nr=20
(XEN) ioport_map:add: dom1 gport=c170 mport=4090 nr=8
(XEN) ioport_map:add: dom1 gport=c178 mport=4080 nr=4
(XEN) memory_map:remove: dom1 gfn=f3070 mfn=a2610 nr=2
(XEN) memory_map:remove: dom1 gfn=f3073 mfn=a2615 nr=1
(XEN) memory_map:remove: dom1 gfn=f3074 mfn=a2616 nr=1
(XEN) memory_map:add: dom1 gfn=f3070 mfn=a2610 nr=2
(XEN) memory_map:add: dom1 gfn=f3073 mfn=a2615 nr=1
(XEN) memory_map:add: dom1 gfn=f3074 mfn=a2616 nr=1
(XEN) d[IO]: assign (0000:05:00.0) failed (-16)
(XEN) HVM d5v0 save: CPU
(XEN) HVM d5v1 save: CPU
(XEN) HVM d5v2 save: CPU
(XEN) HVM d5v3 save: CPU
(XEN) HVM d5 save: PIC
(XEN) HVM d5 save: IOAPIC
(XEN) HVM d5v0 save: LAPIC
(XEN) HVM d5v1 save: LAPIC
(XEN) HVM d5v2 save: LAPIC
(XEN) HVM d5v3 save: LAPIC
(XEN) HVM d5v0 save: LAPIC_REGS
(XEN) HVM d5v1 save: LAPIC_REGS
(XEN) HVM d5v2 save: LAPIC_REGS
(XEN) HVM d5v3 save: LAPIC_REGS
(XEN) HVM d5 save: PCI_IRQ
(XEN) HVM d5 save: ISA_IRQ
(XEN) HVM d5 save: PCI_LINK
(XEN) HVM d5 save: PIT
(XEN) HVM d5 save: RTC
(XEN) HVM d5 save: HPET
(XEN) HVM d5 save: PMTIMER
(XEN) HVM d5v0 save: MTRR
(XEN) HVM d5v1 save: MTRR
(XEN) HVM d5v2 save: MTRR
(XEN) HVM d5v3 save: MTRR
(XEN) HVM d5 save: VIRIDIAN_DOMAIN
(XEN) HVM d5v0 save: CPU_XSAVE
(XEN) HVM d5v1 save: CPU_XSAVE
(XEN) HVM d5v2 save: CPU_XSAVE
(XEN) HVM d5v3 save: CPU_XSAVE
(XEN) HVM d5v0 save: VIRIDIAN_VCPU
(XEN) HVM d5v1 save: VIRIDIAN_VCPU
(XEN) HVM d5v2 save: VIRIDIAN_VCPU
(XEN) HVM d5v3 save: VIRIDIAN_VCPU
(XEN) HVM d5v0 save: VMCE_VCPU
(XEN) HVM d5v1 save: VMCE_VCPU
(XEN) HVM d5v2 save: VMCE_VCPU
(XEN) HVM d5v3 save: VMCE_VCPU
(XEN) HVM d5v0 save: TSC_ADJUST
(XEN) HVM d5v1 save: TSC_ADJUST
(XEN) HVM d5v2 save: TSC_ADJUST
(XEN) HVM d5v3 save: TSC_ADJUST
(XEN) HVM d5v0 save: CPU_MSR
(XEN) HVM d5v1 save: CPU_MSR
(XEN) HVM d5v2 save: CPU_MSR
(XEN) HVM d5v3 save: CPU_MSR
(XEN) HVM5 restore: CPU 0
(XEN) d5: bind: m_gsi=16 g_gsi=36 dev=00.00.5 intx=0
(d5) HVM Loader
(d5) Detected Xen v4.14.3
(d5) Xenbus rings @0xfeffc000, event channel 1
(d5) System requested SeaBIOS
(d5) CPU speed is 3505 MHz
(d5) Relocating guest memory for lowmem MMIO space disabled
(d5) PCI-ISA link 0 routed to IRQ5
(d5) PCI-ISA link 1 routed to IRQ10
(d5) PCI-ISA link 2 routed to IRQ11
(d5) PCI-ISA link 3 routed to IRQ5
(d5) pci dev 01:3 INTA->IRQ10
(d5) pci dev 02:0 INTA->IRQ11
(d5) pci dev 04:0 INTA->IRQ5
(d5) pci dev 05:0 INTA->IRQ10
(d5) No RAM in high memory; setting high_mem resource base to 100000000
(d5) pci dev 03:0 bar 10 size 002000000: 0f0000008
(d5) pci dev 02:0 bar 14 size 001000000: 0f2000008
(d5) pci dev 04:0 bar 30 size 000040000: 0f3000000
(d5) pci dev 04:0 bar 10 size 000020000: 0f3040000
(d5) pci dev 03:0 bar 30 size 000010000: 0f3060000
(d5) pci dev 05:0 bar 10 size 000004000: 0f3070004
(d5) pci dev 03:0 bar 14 size 000001000: 0f3074000
(d5) pci dev 02:0 bar 10 size 000000100: 00000c001
(d5) pci dev 05:0 bar 20 size 000000100: 0f3075004
(d5) pci dev 04:0 bar 14 size 000000040: 00000c101
(d5) pci dev 01:1 bar 20 size 000000010: 00000c141
(XEN) memory_map:add: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom5 gfn=f3075 mfn=a2504 nr=1
(d5) Multiprocessor initialisation:
(d5)  - CPU0 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d5)  - CPU1 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d5)  - CPU2 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d5)  - CPU3 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d5) Writing SMBIOS tables ...
(d5) Loading SeaBIOS ...
(d5) Creating MP tables ...
(d5) Loading ACPI ...
(d5) vm86 TSS at fc100300
(d5) BIOS map:
(d5)  10000-100e3: Scratch space
(d5)  c0000-fffff: Main BIOS
(d5) E820 table:
(d5)  [00]: 00000000:00000000 - 00000000:000a0000: RAM
(d5)  HOLE: 00000000:000a0000 - 00000000:000c0000
(d5)  [01]: 00000000:000c0000 - 00000000:00100000: RESERVED
(d5)  [02]: 00000000:00100000 - 00000000:7f800000: RAM
(d5)  HOLE: 00000000:7f800000 - 00000000:fc000000
(d5)  [03]: 00000000:fc000000 - 00000000:fc00b000: NVS
(d5)  [04]: 00000000:fc00b000 - 00000001:00000000: RESERVED
(d5) Invoking SeaBIOS ...
(d5) SeaBIOS (version rel-1.13.0-1-gd542924-Xen)
(d5) BUILD: gcc: (Debian 11.2.0-13) 11.2.0 binutils: (GNU Binutils for Debian) 2.37
(d5) 
(d5) Found Xen hypervisor signature at 40000000
(XEN) memory_map:remove: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:remove: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:remove: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:add: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:remove: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:remove: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:remove: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:add: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:remove: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:remove: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:remove: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:add: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:remove: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:remove: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:remove: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:add: dom5 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom5 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom5 gfn=f3075 mfn=a2504 nr=1
(XEN) HVM d6v0 save: CPU
(XEN) HVM d6v1 save: CPU
(XEN) HVM d6v2 save: CPU
(XEN) HVM d6v3 save: CPU
(XEN) HVM d6 save: PIC
(XEN) HVM d6 save: IOAPIC
(XEN) HVM d6v0 save: LAPIC
(XEN) HVM d6v1 save: LAPIC
(XEN) HVM d6v2 save: LAPIC
(XEN) HVM d6v3 save: LAPIC
(XEN) HVM d6v0 save: LAPIC_REGS
(XEN) HVM d6v1 save: LAPIC_REGS
(XEN) HVM d6v2 save: LAPIC_REGS
(XEN) HVM d6v3 save: LAPIC_REGS
(XEN) HVM d6 save: PCI_IRQ
(XEN) HVM d6 save: ISA_IRQ
(XEN) HVM d6 save: PCI_LINK
(XEN) HVM d6 save: PIT
(XEN) HVM d6 save: RTC
(XEN) HVM d6 save: HPET
(XEN) HVM d6 save: PMTIMER
(XEN) HVM d6v0 save: MTRR
(XEN) HVM d6v1 save: MTRR
(XEN) HVM d6v2 save: MTRR
(XEN) HVM d6v3 save: MTRR
(XEN) HVM d6 save: VIRIDIAN_DOMAIN
(XEN) HVM d6v0 save: CPU_XSAVE
(XEN) HVM d6v1 save: CPU_XSAVE
(XEN) HVM d6v2 save: CPU_XSAVE
(XEN) HVM d6v3 save: CPU_XSAVE
(XEN) HVM d6v0 save: VIRIDIAN_VCPU
(XEN) HVM d6v1 save: VIRIDIAN_VCPU
(XEN) HVM d6v2 save: VIRIDIAN_VCPU
(XEN) HVM d6v3 save: VIRIDIAN_VCPU
(XEN) HVM d6v0 save: VMCE_VCPU
(XEN) HVM d6v1 save: VMCE_VCPU
(XEN) HVM d6v2 save: VMCE_VCPU
(XEN) HVM d6v3 save: VMCE_VCPU
(XEN) HVM d6v0 save: TSC_ADJUST
(XEN) HVM d6v1 save: TSC_ADJUST
(XEN) HVM d6v2 save: TSC_ADJUST
(XEN) HVM d6v3 save: TSC_ADJUST
(XEN) HVM d6v0 save: CPU_MSR
(XEN) HVM d6v1 save: CPU_MSR
(XEN) HVM d6v2 save: CPU_MSR
(XEN) HVM d6v3 save: CPU_MSR
(XEN) HVM6 restore: CPU 0
(XEN) d6: bind: m_gsi=16 g_gsi=36 dev=00.00.5 intx=0
(d6) HVM Loader
(d6) Detected Xen v4.14.3
(d6) Xenbus rings @0xfeffc000, event channel 1
(d6) System requested SeaBIOS
(d6) CPU speed is 3505 MHz
(d6) Relocating guest memory for lowmem MMIO space disabled
(d6) PCI-ISA link 0 routed to IRQ5
(d6) PCI-ISA link 1 routed to IRQ10
(d6) PCI-ISA link 2 routed to IRQ11
(d6) PCI-ISA link 3 routed to IRQ5
(d6) pci dev 01:3 INTA->IRQ10
(d6) pci dev 02:0 INTA->IRQ11
(d6) pci dev 04:0 INTA->IRQ5
(d6) pci dev 05:0 INTA->IRQ10
(d6) No RAM in high memory; setting high_mem resource base to 100000000
(d6) pci dev 03:0 bar 10 size 002000000: 0f0000008
(d6) pci dev 02:0 bar 14 size 001000000: 0f2000008
(d6) pci dev 04:0 bar 30 size 000040000: 0f3000000
(d6) pci dev 04:0 bar 10 size 000020000: 0f3040000
(d6) pci dev 03:0 bar 30 size 000010000: 0f3060000
(d6) pci dev 05:0 bar 10 size 000004000: 0f3070004
(d6) pci dev 03:0 bar 14 size 000001000: 0f3074000
(d6) pci dev 02:0 bar 10 size 000000100: 00000c001
(d6) pci dev 05:0 bar 20 size 000000100: 0f3075004
(d6) pci dev 04:0 bar 14 size 000000040: 00000c101
(d6) pci dev 01:1 bar 20 size 000000010: 00000c141
(XEN) memory_map:add: dom6 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom6 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom6 gfn=f3075 mfn=a2504 nr=1
(d6) Multiprocessor initialisation:
(d6)  - CPU0 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d6)  - CPU1 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d6)  - CPU2 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d6)  - CPU3 ... 39-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
(d6) Writing SMBIOS tables ...
(d6) Loading SeaBIOS ...
(d6) Creating MP tables ...
(d6) Loading ACPI ...
(d6) vm86 TSS at fc100300
(d6) BIOS map:
(d6)  10000-100e3: Scratch space
(d6)  c0000-fffff: Main BIOS
(d6) E820 table:
(d6)  [00]: 00000000:00000000 - 00000000:000a0000: RAM
(d6)  HOLE: 00000000:000a0000 - 00000000:000c0000
(d6)  [01]: 00000000:000c0000 - 00000000:00100000: RESERVED
(d6)  [02]: 00000000:00100000 - 00000000:7f800000: RAM
(d6)  HOLE: 00000000:7f800000 - 00000000:fc000000
(d6)  [03]: 00000000:fc000000 - 00000000:fc00b000: NVS
(d6)  [04]: 00000000:fc00b000 - 00000001:00000000: RESERVED
(d6) Invoking SeaBIOS ...
(d6) SeaBIOS (version rel-1.13.0-1-gd542924-Xen)
(d6) BUILD: gcc: (Debian 11.2.0-13) 11.2.0 binutils: (GNU Binutils for Debian) 2.37
(d6) 
(d6) Found Xen hypervisor signature at 40000000
(XEN) memory_map:remove: dom6 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:remove: dom6 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:remove: dom6 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:add: dom6 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom6 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom6 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:remove: dom6 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:remove: dom6 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:remove: dom6 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:add: dom6 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom6 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom6 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:remove: dom6 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:remove: dom6 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:remove: dom6 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:add: dom6 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom6 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom6 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:remove: dom6 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:remove: dom6 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:remove: dom6 gfn=f3075 mfn=a2504 nr=1
(XEN) memory_map:add: dom6 gfn=f3070 mfn=a2500 nr=2
(XEN) memory_map:add: dom6 gfn=f3073 mfn=a2503 nr=1
(XEN) memory_map:add: dom6 gfn=f3075 mfn=a2504 nr=1

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-05 11:31                   ` G.R.
@ 2022-07-05 11:59                     ` Jan Beulich
  2022-07-06  6:25                       ` G.R.
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2022-07-05 11:59 UTC (permalink / raw)
  To: G.R.; +Cc: xen-devel, Roger Pau Monné

On 05.07.2022 13:31, G.R. wrote:
> On Tue, Jul 5, 2022 at 5:04 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 04.07.2022 18:31, G.R. wrote:
>>> On Tue, Jul 5, 2022 at 12:21 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>> I retried with the following:
>>>>> pci=['05:00.0,permissive=1,msitranslate=1']
>>>>> Those extra options suppressed some error logging, but still didn't
>>>>> make the device usable to the domU.
>>>>> The nvmecontrol command still get ABORTED result from the kernel...
>>>>>
>>>>> The only thing remained in the QEMU file is this one:
>>>>> [00:05.0] msi_msix_setup: Error: Mapping of MSI-X (err: 61, vec: 0x30, entry 0)
>>>>
>>>> Hm it seems like Xen doesn't find the position of the MSI-X table
>>>> correctly, given there's only one error path from msi.c returning
>>>> -ENODATA (61).
>>>>
>>>> Are there errors from pciback when this happens?  I would expect the
>>>> call to pci_prepare_msix() from pciback to fail and thus also report
>>>> some error?
>>>>
>>>> I think it's likely I will have to provide an additional debug patch
>>>> to Xen, maybe Jan has an idea of what could be going on.
>>>>
>>> pciback reports the same MSI-x related error.
>>> But even with DEBUG enabled, I didn't see more context reported.
>>> Please find details from the attachment.
>>
>> And nothing pertinent in "xl dmesg"? Looking back through the thread I
>> couldn't spot a complete hypervisor log (i.e. from boot to assignment
>> attempt). An issue with MSI-X table determination, as Roger suspects,
>> would typically be associated with a prominent warning emitted to the
>> log. But there are also further possible sources of -ENXIO, which
>> would go silently.
> Please find the xl dmesg in the attachment.
> It's with the two FreeBSD domU attempts so it should have captured
> some culprits if there is any...

Nothing useful in there. Yet independent of that I guess we need to
separate the issues you're seeing. Otherwise it'll be impossible to
know what piece of data belongs where.

Jan



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-04 15:57             ` Roger Pau Monné
@ 2022-07-05 18:06               ` Jason Andryuk
  0 siblings, 0 replies; 31+ messages in thread
From: Jason Andryuk @ 2022-07-05 18:06 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: G.R., xen-devel

On Mon, Jul 4, 2022 at 11:57 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Mon, Jul 04, 2022 at 11:44:14PM +0800, G.R. wrote:
> > On Mon, Jul 4, 2022 at 11:33 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > Right, so hvmloader attempts to place a BAR from 05:00.0 and a BAR
> > > from 00:17.0 into the same page, which is not that good behavior.  It
> > > might be sensible to attempt to share the page if both BARs belong to
> > > the same device, but not if they belong to different devices.
> > >
> > > I think the following patch:
> > >
> > > https://lore.kernel.org/xen-devel/20200117110811.43321-1-roger.pau@citrix.com/
> > >
> > > Might help with this.
> > >
> > > Thanks, Roger.
> > I suppose this patch has been released in a newer XEN version that I
> > can pick up if I decide to upgrade?
> > Which version would it be?
> >
> > On the other hand, according to the other experiment I did, this may
> > not be the only issue related to this device.
> > Still not sure if the device or the SW stack is faulty this time...
>
> I don't think this patch has been applied to any release, adding Jason
> who I think was also interested in the fix and might provide more
> info.

Roger wrote the above patch after I tried to upstream a Qubes QEMU
patch: https://lore.kernel.org/xen-devel/20190311180216.18811-7-jandryuk@gmail.com/.
The patch rounded up BAR sizes for passed-through devices, which
ensured they couldn't share a page.  But Roger rightfully pointed out
that changing the BAR size is incorrect, and that hvmloader could just
enforce a minimum alignment.  However, nothing prevents a guest from
relocating the BARs again.
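
To illustrate the difference (a rough sketch with made-up names, not
the actual hvmloader code): enforcing a minimum alignment leaves the
BAR size alone and only changes where the BAR is *placed*, e.g.:

```c
#include <stdint.h>

#define PAGE_SIZE 0x1000u

/* Hypothetical placement helper: BARs are naturally size-aligned, but
 * when starting on a different device we additionally round the
 * allocation cursor up to a page boundary, so that BARs of two
 * different devices can never end up in the same 4k page. */
static uint64_t place_bar(uint64_t cursor, uint64_t bar_size, int new_device)
{
    uint64_t align = bar_size;

    if (new_device && align < PAGE_SIZE)
        align = PAGE_SIZE;      /* force a fresh page for a new device */

    return (cursor + align - 1) & ~(align - 1);
}
```

With that, a small BAR of a new device (like the 0x800-byte BAR of
05:00.0 in the log above) lands on its own page instead of sharing one
with the previous device's BAR.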

I tested Roger's patch, but Qubes and OpenXT have kept using the QEMU
patch.  When I added the QEMU patch to OpenXT, I wrote in the commit
message that it fixed probing of an e1000e NIC.


Regards,
Jason



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-05 11:59                     ` Jan Beulich
@ 2022-07-06  6:25                       ` G.R.
  2022-07-06  6:33                         ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: G.R. @ 2022-07-06  6:25 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Roger Pau Monné

[-- Attachment #1: Type: text/plain, Size: 1681 bytes --]

On Tue, Jul 5, 2022 at 7:59 PM Jan Beulich <jbeulich@suse.com> wrote:
> Nothing useful in there. Yet independent of that I guess we need to
> separate the issues you're seeing. Otherwise it'll be impossible to
> know what piece of data belongs where.
Yep, I think I'm seeing several different issues here:
1. The FLR-related DPC / AER messages, seen only on the first attempt
when pciback tries to seize and release the SN570
    - Later-on pciback operations appear just fine.
2. The MSI-X preparation failure message that shows up each time the
SN570 is seized by pciback or when it's passed to the domU.
3. Xen tries to map BARs from two devices into the same page.
4. The "write-back to unknown field" message in the QEMU log that goes
away with the permissive=1 passthrough config.
5. The "irq 16: nobody cared" message that shows up *sometimes*, in a
pattern I haven't figured out. (See attached)
6. The FreeBSD domU sees the device but fails to use it because
low-level commands sent to it are aborted.
7. The device does not return to the pci-assignable list when the domU
it was assigned to shuts down. (See attached)

#3 appears to be a known issue that could be worked around with
patches from the list.
I suspect #1 may have something to do with the device itself. It's
still not clear whether it's deadly or just annoying.
I was able to update the firmware to the latest version and confirmed
that the new firmware didn't make any noticeable difference.

I suspect issues #2, #4, #5, #6 and #7 may be related, and that the
pass-through was not completely successful...
Should I expect a debug build of the Xen hypervisor to give better
diagnostic messages, even without the debug patch that Roger mentioned?
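
BTW, in case it helps interpret the MSI-X failures above: as far as I
understand, the table location that pciback/Xen try to determine comes
from the capability's Table and PBA registers, where the low 3 bits
select the BAR and the remaining bits are the byte offset. A purely
illustrative decode (not the actual Xen code) would be:

```c
#include <stdint.h>

/* Split an MSI-X Table or PBA register (the dwords at capability
 * offsets 4 and 8, per the PCI spec) into the BAR Indicator Register
 * (BIR, low 3 bits) and the byte offset within that BAR. */
static void decode_msix_reg(uint32_t reg, unsigned int *bir, uint32_t *off)
{
    *bir = reg & 0x7;
    *off = reg & ~0x7u;
}
```

So if Xen can't locate the table, either reading those registers
failed or the referenced BAR wasn't mapped as expected.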

Thanks,
Rui

[-- Attachment #2: dom0_dmsg_for_domu_shutdown.log --]
[-- Type: text/x-log, Size: 1414 bytes --]

[59213.312849] xenbr0: port 3(vif3.0) entered disabled state  //domU shutdown sequence start from here
[59215.247393] pciback 0000:05:00.0: xen_pciback: removing
[59215.247395] pciback 0000:05:00.0: xen_pciback: found device to remove
[59215.247396] pciback 0000:05:00.0: xen_pciback: pcistub_device_release
[59215.352893] pciback 0000:05:00.0: xen_pciback: MSI-X release failed (-16)
[59215.353199] xen_pciback: removed 0000:05:00.0 from seize list
[59216.474139] pciback 0000:05:00.0: xen_pciback: probing...
[59728.150053] xen_pciback: wants to seize 0000:05:00.0      //manual xl pci-assignable-add 05:00.0
[59728.150074] pciback 0000:05:00.0: xen_pciback: probing...
[59728.150075] pciback 0000:05:00.0: xen_pciback: seizing device
[59728.150076] pciback 0000:05:00.0: xen_pciback: pcistub_device_alloc
[59728.150076] pciback 0000:05:00.0: xen_pciback: initializing...
[59728.150077] pciback 0000:05:00.0: xen_pciback: initializing config
[59728.150165] pciback 0000:05:00.0: xen_pciback: enabling device
[59728.150247] xen: registering gsi 16 triggering 0 polarity 1
[59728.150250] Already setup the GSI :16
[59728.150293] pciback 0000:05:00.0: xen_pciback: MSI-X preparation failed (-6)
[59728.150582] pciback 0000:05:00.0: xen_pciback: save state of device
[59728.150731] pciback 0000:05:00.0: xen_pciback: resetting (FLR, D3, etc) the device
[59728.257558] pciback 0000:05:00.0: xen_pciback: reset device

[-- Attachment #3: bad_irq.log --]
[-- Type: text/x-log, Size: 2151 bytes --]

[ 3742.440487] irq 16: nobody cared (try booting with the "irqpoll" option)
[ 3742.440516] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.120.gaia.78.xenpcibackdbg #4
[ 3742.440516] Hardware name: Gigabyte Technology Co., Ltd. C246N-WU2/C246N-WU2-CF, BIOS F1 10/02/2019
[ 3742.440517] Call Trace:
[ 3742.440518]  <IRQ>
[ 3742.440522]  dump_stack+0x6b/0x83
[ 3742.440524]  __report_bad_irq+0x30/0xa2
[ 3742.440525]  note_interrupt.cold+0xb/0x61
[ 3742.440527]  handle_irq_event+0x9f/0xb0
[ 3742.440528]  handle_fasteoi_irq+0x73/0x1c0
[ 3742.440529]  generic_handle_irq+0x42/0x50
[ 3742.440531]  __evtchn_fifo_handle_events+0x155/0x170
[ 3742.440533]  __xen_evtchn_do_upcall+0x61/0xa0
[ 3742.440535]  __xen_pv_evtchn_do_upcall+0x11/0x20
[ 3742.440536]  asm_call_irq_on_stack+0x12/0x20
[ 3742.440537]  </IRQ>
[ 3742.440538]  xen_pv_evtchn_do_upcall+0xa2/0xc0
[ 3742.440539]  exc_xen_hypervisor_callback+0x8/0x10
[ 3742.440540] RIP: e030:xen_hypercall_sched_op+0xa/0x20
[ 3742.440542] Code: 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[ 3742.440542] RSP: e02b:ffffffff82403de0 EFLAGS: 00000246
[ 3742.440543] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff810023aa
[ 3742.440544] RDX: 0000000002d1a31a RSI: 0000000000000000 RDI: 0000000000000001
[ 3742.440544] RBP: ffffffff82415940 R08: 00000066a173b5fc R09: 000003676ebf842f
[ 3742.440545] R10: 00000000000340ee R11: 0000000000000246 R12: 0000000000000000
[ 3742.440545] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 3742.440546]  ? xen_hypercall_sched_op+0xa/0x20
[ 3742.440548]  ? xen_safe_halt+0xc/0x20
[ 3742.440549]  ? default_idle+0x5/0x10
[ 3742.440550]  ? default_idle_call+0x33/0xc0
[ 3742.440551]  ? do_idle+0x1e9/0x260
[ 3742.440553]  ? cpu_startup_entry+0x14/0x20
[ 3742.440555]  ? start_kernel+0x503/0x526
[ 3742.440556]  ? xen_start_kernel+0x60f/0x61b
[ 3742.440556]  ? startup_xen+0x3e/0x3e
[ 3742.440557] handlers:
[ 3742.440570] [<000000008e20908e>] i801_isr [i2c_i801]
[ 3742.440585] Disabling IRQ #16



* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-06  6:25                       ` G.R.
@ 2022-07-06  6:33                         ` Jan Beulich
  2022-07-07 15:24                           ` G.R.
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2022-07-06  6:33 UTC (permalink / raw)
  To: G.R.; +Cc: xen-devel, Roger Pau Monné

On 06.07.2022 08:25, G.R. wrote:
> On Tue, Jul 5, 2022 at 7:59 PM Jan Beulich <jbeulich@suse.com> wrote:
>> Nothing useful in there. Yet independent of that I guess we need to
>> separate the issues you're seeing. Otherwise it'll be impossible to
>> know what piece of data belongs where.
> Yep, I think I'm seeing several different issues here:
> 1. The FLR related DPC / AER message seen on the 1st attempt only when
> pciback tries to seize and release the SN570
>     - Later-on pciback operations appear just fine.
> 2. MSI-X preparation failure message that shows up each time the SN570
> is seized by pciback or when it's passed to domU.
> 3. XEN tries to map BAR from two devices to the same page
> 4. The "write-back to unknown field" message in QEMU log that goes
> away with permissive=1 passthrough config.
> 5. The "irq 16: nobody cared" message shows up *sometimes* in a
> pattern that I haven't figured out  (See attached)
> 6. The FreeBSD domU sees the device but fails to use it because low
> level commands sent to it are aborted.
> 7. The device does not return to the pci-assignable-list when the domU
it was assigned shuts down. (See attached)
> 
> #3 appears to be a known issue that could be worked around with
> patches from the list.
> I suspect #1 may have something to do with the device itself. It's
> still not clear if it's deadly or just annoying.
> I was able to update the firmware to the latest version and confirmed
> that the new firmware didn't make any noticeable difference.
> 
> I suspect issue #2, #4, #5, #6, #7 may be related, and the
> pass-through was not completely successful...
> 
> Should I expect a debug build of XEN hypervisor to give better
> diagnose messages, without the debug patch that Roger mentioned?

Well, "expect" is perhaps too much to say, but with problems like
yours (and even more so with multiple ones) using a debug
hypervisor (or kernel, if there such a build mode existed) is imo
always a good idea. As is using as up-to-date a version as
possible.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-06  6:33                         ` Jan Beulich
@ 2022-07-07 15:24                           ` G.R.
  2022-07-07 15:36                             ` G.R.
  2022-07-07 16:23                             ` Jan Beulich
  0 siblings, 2 replies; 31+ messages in thread
From: G.R. @ 2022-07-07 15:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Roger Pau Monné

On Wed, Jul 6, 2022 at 2:33 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 06.07.2022 08:25, G.R. wrote:
> > On Tue, Jul 5, 2022 at 7:59 PM Jan Beulich <jbeulich@suse.com> wrote:
> >> Nothing useful in there. Yet independent of that I guess we need to
> >> separate the issues you're seeing. Otherwise it'll be impossible to
> >> know what piece of data belongs where.
> > Yep, I think I'm seeing several different issues here:
> > 1. The FLR related DPC / AER message seen on the 1st attempt only when
> > pciback tries to seize and release the SN570
> >     - Later-on pciback operations appear just fine.
> > 2. MSI-X preparation failure message that shows up each time the SN570
> > is seized by pciback or when it's passed to domU.
> > 3. XEN tries to map BAR from two devices to the same page
> > 4. The "write-back to unknown field" message in QEMU log that goes
> > away with permissive=1 passthrough config.
> > 5. The "irq 16: nobody cared" message shows up *sometimes* in a
> > pattern that I haven't figured out  (See attached)
> > 6. The FreeBSD domU sees the device but fails to use it because low
> > level commands sent to it are aborted.
> > 7. The device does not return to the pci-assignable-list when the domU
> > it was assigned shuts down. (See attached)
> >
> > #3 appears to be a known issue that could be worked around with
> > patches from the list.
> > I suspect #1 may have something to do with the device itself. It's
> > still not clear if it's deadly or just annoying.
> > I was able to update the firmware to the latest version and confirmed
> > that the new firmware didn't make any noticeable difference.
> >
> > I suspect issue #2, #4, #5, #6, #7 may be related, and the
> > pass-through was not completely successful...
> >
> > Should I expect a debug build of XEN hypervisor to give better
> > diagnose messages, without the debug patch that Roger mentioned?
>
> Well, "expect" is perhaps too much to say, but with problems like
> yours (and even more so with multiple ones) using a debug
> hypervisor (or kernel, if there such a build mode existed) is imo
> always a good idea. As is using as up-to-date a version as
> possible.

I built both 4.14.3 debug version and 4.16.1 release version for
testing purposes.
Unfortunately they gave me absolutely zero information, since neither
of them is able to get past issue #1,
the FLR-related DPC / AER issue.
With 4.16.1 release, it actually can survive the 'xl
pci-assignable-add' which triggers the first AER failure.
But the 'xl pci-assignable-remove' will lead to xl segmentation fault...
>[  655.041442] xl[975]: segfault at 0 ip 00007f2cccdaf71f sp 00007ffd73a3d4d0 error 4 in libxenlight.so.4.16.0[7f2cccd92000+7c000]
>[  655.041460] Code: 61 06 00 eb 13 66 0f 1f 44 00 00 83 c3 01 39 5c 24 2c 0f 86 1b 01 00 00 48 8b 34 24 89 d8 4d 89 f9 4d 89 f0 4c 89 e9 4c 89 e2 <48> 8b 3c c6 31 c0 48 89 ee e8 53 44 fe ff 83 f8 04 75 ce 48 8b 44
Since I'll need a couple of pci-assignable-add &&
pci-assignable-remove to get to a seemingly normal state, I cannot
proceed from here.

With 4.14.3 debug build, the hypervisor / dom0 reboots on 'xl
pci-assignable-add'.

[  574.623143] pciback 0000:05:00.0: xen_pciback: resetting (FLR, D3,
etc) the device
[  574.623203] pcieport 0000:00:1d.0: DPC: containment event,
status:0x1f11 source:0x0000
[  574.623204] pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
[  574.623209] pcieport 0000:00:1d.0: PCIe Bus Error:
severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver
ID)
[  574.623240] pcieport 0000:00:1d.0:   device [8086:a330] error
status/mask=00200000/00010000
[  574.623261] pcieport 0000:00:1d.0:    [21] ACSViol                (First)
[  575.855026] pciback 0000:05:00.0: not ready 1023ms after FLR; waiting
[  576.895015] pciback 0000:05:00.0: not ready 2047ms after FLR; waiting
[  579.028311] pciback 0000:05:00.0: not ready 4095ms after FLR; waiting
[  583.294910] pciback 0000:05:00.0: not ready 8191ms after FLR; waiting
[  591.614965] pciback 0000:05:00.0: not ready 16383ms after FLR; waiting
[  609.534502] pciback 0000:05:00.0: not ready 32767ms after FLR; waiting
[  643.667069] pciback 0000:05:00.0: not ready 65535ms after FLR; giving up
//<=======The reboot happens somewhere here, not immediately, but
after a while...
//Maybe I can get something from xl dmesg if I was quick enough and
have connected from a second terminal...
[  644.773922] pciback 0000:05:00.0: xen_pciback: reset device
[  644.774050] pciback 0000:05:00.0: xen_pciback:
xen_pcibk_error_detected(bus:5,devfn:0)
[  644.774051] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
[  644.923432] pciback 0000:05:00.0: xen_pciback:
xen_pcibk_error_resume(bus:5,devfn:0)
[  644.923437] pciback 0000:05:00.0: xen_pciback: device is not found/assigned
[  644.923616] pcieport 0000:00:1d.0: AER: device recovery successful



>
> Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-07 15:24                           ` G.R.
@ 2022-07-07 15:36                             ` G.R.
  2022-07-07 16:18                               ` Jan Beulich
  2022-07-07 16:23                             ` Jan Beulich
  1 sibling, 1 reply; 31+ messages in thread
From: G.R. @ 2022-07-07 15:36 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Roger Pau Monné

On Thu, Jul 7, 2022 at 11:24 PM G.R. <firemeteor@users.sourceforge.net> wrote:
>
> On Wed, Jul 6, 2022 at 2:33 PM Jan Beulich <jbeulich@suse.com> wrote:
> >
> > > Should I expect a debug build of XEN hypervisor to give better
> > > diagnose messages, without the debug patch that Roger mentioned?
> >
> > Well, "expect" is perhaps too much to say, but with problems like
> > yours (and even more so with multiple ones) using a debug
> > hypervisor (or kernel, if there such a build mode existed) is imo
> > always a good idea. As is using as up-to-date a version as
> > possible.
>
> I built both 4.14.3 debug version and 4.16.1 release version for
> testing purposes.
> Unfortunately they gave me absolutely zero information, since neither
> of them is able to get past issue #1,
> the FLR-related DPC / AER issue.
> With 4.16.1 release, it actually can survive the 'xl
> pci-assignable-add' which triggers the first AER failure.
> But the 'xl pci-assignable-remove' will lead to xl segmentation fault...
> >[  655.041442] xl[975]: segfault at 0 ip 00007f2cccdaf71f sp 00007ffd73a3d4d0 error 4 in libxenlight.so.4.16.0[7f2cccd92000+7c000]
> >[  655.041460] Code: 61 06 00 eb 13 66 0f 1f 44 00 00 83 c3 01 39 5c 24 2c 0f 86 1b 01 00 00 48 8b 34 24 89 d8 4d 89 f9 4d 89 f0 4c 89 e9 4c 89 e2 <48> 8b 3c c6 31 c0 48 89 ee e8 53 44 fe ff 83 f8 04 75 ce 48 8b 44
> Since I'll need a couple of pci-assignable-add &&
> pci-assignable-remove to get to a seemingly normal state, I cannot
> proceed from here.
>
> With 4.14.3 debug build, the hypervisor / dom0 reboots on 'xl
> pci-assignable-add'.
>
> [  574.623143] pciback 0000:05:00.0: xen_pciback: resetting (FLR, D3,
> etc) the device
> [  574.623203] pcieport 0000:00:1d.0: DPC: containment event,
> status:0x1f11 source:0x0000
> [  574.623204] pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
> [  574.623209] pcieport 0000:00:1d.0: PCIe Bus Error:
> severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver
> ID)
> [  574.623240] pcieport 0000:00:1d.0:   device [8086:a330] error
> status/mask=00200000/00010000
> [  574.623261] pcieport 0000:00:1d.0:    [21] ACSViol                (First)
> [  575.855026] pciback 0000:05:00.0: not ready 1023ms after FLR; waiting
> [  576.895015] pciback 0000:05:00.0: not ready 2047ms after FLR; waiting
> [  579.028311] pciback 0000:05:00.0: not ready 4095ms after FLR; waiting
> [  583.294910] pciback 0000:05:00.0: not ready 8191ms after FLR; waiting
> [  591.614965] pciback 0000:05:00.0: not ready 16383ms after FLR; waiting
> [  609.534502] pciback 0000:05:00.0: not ready 32767ms after FLR; waiting
> [  643.667069] pciback 0000:05:00.0: not ready 65535ms after FLR; giving up
> //<=======The reboot happens somewhere here, not immediately, but
> after a while...
> //Maybe I can get something from xl dmesg if I was quick enough and
> have connected from a second terminal...

Unfortunately I didn't see anything from xl dmesg...
I wish 'xl dmesg' supported the follow mode (dmesg -w) that the
Linux dmesg does.
Here I have to repeat the command manually. The machine suddenly
freezes after the 'giving up' message is printed.
I see nothing special in the log. Maybe I'm just not lucky enough to
catch the output, not sure.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-07 15:36                             ` G.R.
@ 2022-07-07 16:18                               ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2022-07-07 16:18 UTC (permalink / raw)
  To: G.R.; +Cc: xen-devel, Roger Pau Monné

On 07.07.2022 17:36, G.R. wrote:
> On Thu, Jul 7, 2022 at 11:24 PM G.R. <firemeteor@users.sourceforge.net> wrote:
>>
>> On Wed, Jul 6, 2022 at 2:33 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>
>>>> Should I expect a debug build of XEN hypervisor to give better
>>>> diagnose messages, without the debug patch that Roger mentioned?
>>>
>>> Well, "expect" is perhaps too much to say, but with problems like
>>> yours (and even more so with multiple ones) using a debug
>>> hypervisor (or kernel, if there such a build mode existed) is imo
>>> always a good idea. As is using as up-to-date a version as
>>> possible.
>>
>> I built both 4.14.3 debug version and 4.16.1 release version for
>> testing purposes.
>> Unfortunately they gave me absolutely zero information, since neither
>> of them is able to get past issue #1,
>> the FLR-related DPC / AER issue.
>> With 4.16.1 release, it actually can survive the 'xl
>> pci-assignable-add' which triggers the first AER failure.
>> But the 'xl pci-assignable-remove' will lead to xl segmentation fault...
>>> [  655.041442] xl[975]: segfault at 0 ip 00007f2cccdaf71f sp 00007ffd73a3d4d0 error 4 in libxenlight.so.4.16.0[7f2cccd92000+7c000]
>>> [  655.041460] Code: 61 06 00 eb 13 66 0f 1f 44 00 00 83 c3 01 39 5c 24 2c 0f 86 1b 01 00 00 48 8b 34 24 89 d8 4d 89 f9 4d 89 f0 4c 89 e9 4c 89 e2 <48> 8b 3c c6 31 c0 48 89 ee e8 53 44 fe ff 83 f8 04 75 ce 48 8b 44
>> Since I'll need a couple of pci-assignable-add &&
>> pci-assignable-remove to get to a seemingly normal state, I cannot
>> proceed from here.
>>
>> With 4.14.3 debug build, the hypervisor / dom0 reboots on 'xl
>> pci-assignable-add'.
>>
>> [  574.623143] pciback 0000:05:00.0: xen_pciback: resetting (FLR, D3,
>> etc) the device
>> [  574.623203] pcieport 0000:00:1d.0: DPC: containment event,
>> status:0x1f11 source:0x0000
>> [  574.623204] pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
>> [  574.623209] pcieport 0000:00:1d.0: PCIe Bus Error:
>> severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver
>> ID)
>> [  574.623240] pcieport 0000:00:1d.0:   device [8086:a330] error
>> status/mask=00200000/00010000
>> [  574.623261] pcieport 0000:00:1d.0:    [21] ACSViol                (First)
>> [  575.855026] pciback 0000:05:00.0: not ready 1023ms after FLR; waiting
>> [  576.895015] pciback 0000:05:00.0: not ready 2047ms after FLR; waiting
>> [  579.028311] pciback 0000:05:00.0: not ready 4095ms after FLR; waiting
>> [  583.294910] pciback 0000:05:00.0: not ready 8191ms after FLR; waiting
>> [  591.614965] pciback 0000:05:00.0: not ready 16383ms after FLR; waiting
>> [  609.534502] pciback 0000:05:00.0: not ready 32767ms after FLR; waiting
>> [  643.667069] pciback 0000:05:00.0: not ready 65535ms after FLR; giving up
>> //<=======The reboot happens somewhere here, not immediately, but
>> after a while...
>> //Maybe I can get something from xl dmesg if I was quick enough and
>> have connected from a second terminal...
> 
> Unfortunately I didn't see anything from xl dmesg...
> I wish 'xl dmesg' supported the follow mode (dmesg -w) that the
> Linux dmesg does.
> Here I have to repeat the command manually. The machine suddenly
> freezes after the 'giving up' message is printed.
> I see nothing special in the log. Maybe I'm just not lucky enough to
> catch the output, not sure.

If the box reboots in the middle, I guess you really want to hook up
a serial console.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-07 15:24                           ` G.R.
  2022-07-07 15:36                             ` G.R.
@ 2022-07-07 16:23                             ` Jan Beulich
  2022-07-08  2:28                               ` G.R.
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2022-07-07 16:23 UTC (permalink / raw)
  To: G.R.; +Cc: xen-devel, Roger Pau Monné, Anthony Perard

On 07.07.2022 17:24, G.R. wrote:
> On Wed, Jul 6, 2022 at 2:33 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 06.07.2022 08:25, G.R. wrote:
>>> On Tue, Jul 5, 2022 at 7:59 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>> Nothing useful in there. Yet independent of that I guess we need to
>>>> separate the issues you're seeing. Otherwise it'll be impossible to
>>>> know what piece of data belongs where.
>>> Yep, I think I'm seeing several different issues here:
>>> 1. The FLR related DPC / AER message seen on the 1st attempt only when
>>> pciback tries to seize and release the SN570
>>>     - Later-on pciback operations appear just fine.
>>> 2. MSI-X preparation failure message that shows up each time the SN570
>>> is seized by pciback or when it's passed to domU.
>>> 3. XEN tries to map BAR from two devices to the same page
>>> 4. The "write-back to unknown field" message in QEMU log that goes
>>> away with permissive=1 passthrough config.
>>> 5. The "irq 16: nobody cared" message shows up *sometimes* in a
>>> pattern that I haven't figured out  (See attached)
>>> 6. The FreeBSD domU sees the device but fails to use it because low
>>> level commands sent to it are aborted.
>>> 7. The device does not return to the pci-assignable-list when the domU
>>> it was assigned shuts down. (See attached)
>>>
>>> #3 appears to be a known issue that could be worked around with
>>> patches from the list.
>>> I suspect #1 may have something to do with the device itself. It's
>>> still not clear if it's deadly or just annoying.
>>> I was able to update the firmware to the latest version and confirmed
>>> that the new firmware didn't make any noticeable difference.
>>>
>>> I suspect issue #2, #4, #5, #6, #7 may be related, and the
>>> pass-through was not completely successful...
>>>
>>> Should I expect a debug build of XEN hypervisor to give better
>>> diagnose messages, without the debug patch that Roger mentioned?
>>
>> Well, "expect" is perhaps too much to say, but with problems like
>> yours (and even more so with multiple ones) using a debug
>> hypervisor (or kernel, if there such a build mode existed) is imo
>> always a good idea. As is using as up-to-date a version as
>> possible.
> 
> I built both 4.14.3 debug version and 4.16.1 release version for
> testing purposes.
> > Unfortunately they gave me absolutely zero information, since neither
> > of them is able to get past issue #1,
> > the FLR-related DPC / AER issue.
> With 4.16.1 release, it actually can survive the 'xl
> pci-assignable-add' which triggers the first AER failure.

Then that's what needs debugging first. Yet from all I've seen so
far I'm not sure what on the Xen side could be doing that, the more
so without being able to repro it myself - this seems more like a
Linux side issue (and even outside of the pciback driver).

> But the 'xl pci-assignable-remove' will lead to xl segmentation fault...
>> [  655.041442] xl[975]: segfault at 0 ip 00007f2cccdaf71f sp 00007ffd73a3d4d0 error 4 in libxenlight.so.4.16.0[7f2cccd92000+7c000]
>> [  655.041460] Code: 61 06 00 eb 13 66 0f 1f 44 00 00 83 c3 01 39 5c 24 2c 0f 86 1b 01 00 00 48 8b 34 24 89 d8 4d 89 f9 4d 89 f0 4c 89 e9 4c 89 e2 <48> 8b 3c c6 31 c0 48 89 ee e8 53 44 fe ff 83 f8 04 75 ce 48 8b 44

That'll need debugging. Cc-ing Anthony for awareness, but I'm sure
he'll need more data to actually stand a chance of doing something
about it.

Is there any chance you could be doing some debugging work yourself,
at the very least to figure out where this (apparent) NULL deref is
happening?

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-07 16:23                             ` Jan Beulich
@ 2022-07-08  2:28                               ` G.R.
  2022-07-09  1:24                                 ` G.R.
  2022-07-09  4:27                                 ` G.R.
  0 siblings, 2 replies; 31+ messages in thread
From: G.R. @ 2022-07-08  2:28 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Roger Pau Monné, Anthony Perard

On Fri, Jul 8, 2022 at 12:38 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 07.07.2022 17:24, G.R. wrote:
> > On Wed, Jul 6, 2022 at 2:33 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 06.07.2022 08:25, G.R. wrote:
> >>> On Tue, Jul 5, 2022 at 7:59 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> Nothing useful in there. Yet independent of that I guess we need to
> >>>> separate the issues you're seeing. Otherwise it'll be impossible to
> >>>> know what piece of data belongs where.
> >>> Yep, I think I'm seeing several different issues here:
> >>> 1. The FLR related DPC / AER message seen on the 1st attempt only when
> >>> pciback tries to seize and release the SN570
> >>>     - Later-on pciback operations appear just fine.
> >>> 2. MSI-X preparation failure message that shows up each time the SN570
> >>> is seized by pciback or when it's passed to domU.
> >>> 3. XEN tries to map BAR from two devices to the same page
> >>> 4. The "write-back to unknown field" message in QEMU log that goes
> >>> away with permissive=1 passthrough config.
> >>> 5. The "irq 16: nobody cared" message shows up *sometimes* in a
> >>> pattern that I haven't figured out  (See attached)
> >>> 6. The FreeBSD domU sees the device but fails to use it because low
> >>> level commands sent to it are aborted.
> >>> 7. The device does not return to the pci-assignable-list when the domU
> >>> it was assigned shuts down. (See attached)
> >>>
> >>> #3 appears to be a known issue that could be worked around with
> >>> patches from the list.
> >>> I suspect #1 may have something to do with the device itself. It's
> >>> still not clear if it's deadly or just annoying.
> >>> I was able to update the firmware to the latest version and confirmed
> >>> that the new firmware didn't make any noticeable difference.
> >>>
> >>> I suspect issue #2, #4, #5, #6, #7 may be related, and the
> >>> pass-through was not completely successful...
> >>>
> >>> Should I expect a debug build of XEN hypervisor to give better
> >>> diagnose messages, without the debug patch that Roger mentioned?
> >>
> >> Well, "expect" is perhaps too much to say, but with problems like
> >> yours (and even more so with multiple ones) using a debug
> >> hypervisor (or kernel, if there such a build mode existed) is imo
> >> always a good idea. As is using as up-to-date a version as
> >> possible.
> >
> > I built both 4.14.3 debug version and 4.16.1 release version for
> > testing purposes.
> > Unfortunately they gave me absolutely zero information, since neither
> > of them is able to get past issue #1,
> > the FLR-related DPC / AER issue.
> > With 4.16.1 release, it actually can survive the 'xl
> > pci-assignable-add' which triggers the first AER failure.
>
> Then that's what needs debugging first. Yet from all I've seen so
> far I'm not sure what on the Xen side could be doing that, the more
> so without being able to repro it myself - this seems more like a
> Linux side issue (and even outside of the pciback driver).
>
Yep, this one is likely not XEN related, as I've seen some discussions
([1],[2]) on a similar syndrome (not necessarily the same root cause
though).
The question is why this only shows up during the FLR attempt, and
whether subsequent pci-assignable-adds that do not trigger the error
are actually reliable.
BTW, I'm under the impression that the device is still usable in dom0
afterwards, I'll have to double check though...

[1] https://patchwork.kernel.org/project/linux-pci/patch/20220408153159.106741-1-kai.heng.feng@canonical.com/
[2] https://patchwork.kernel.org/project/linux-pci/patch/20220127025418.1989642-1-kai.heng.feng@canonical.com/#24713767

> > But the 'xl pci-assignable-remove' will lead to xl segmentation fault...
> >> [  655.041442] xl[975]: segfault at 0 ip 00007f2cccdaf71f sp 00007ffd73a3d4d0 error 4 in libxenlight.so.4.16.0[7f2cccd92000+7c000]
> >> [  655.041460] Code: 61 06 00 eb 13 66 0f 1f 44 00 00 83 c3 01 39 5c 24 2c 0f 86 1b 01 00 00 48 8b 34 24 89 d8 4d 89 f9 4d 89 f0 4c 89 e9 4c 89 e2 <48> 8b 3c c6 31 c0 48 89 ee e8 53 44 fe ff 83 f8 04 75 ce 48 8b 44
>
> That'll need debugging. Cc-ing Anthony for awareness, but I'm sure
> he'll need more data to actually stand a chance of doing something
> about it.
>
> Is there any chance you could be doing some debugging work yourself,
> at the very least to figure out where this (apparent) NULL deref is
> happening?
Yep, I can collect the call-stack for sure.

>
> Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-08  2:28                               ` G.R.
@ 2022-07-09  1:24                                 ` G.R.
  2022-07-09  4:27                                 ` G.R.
  1 sibling, 0 replies; 31+ messages in thread
From: G.R. @ 2022-07-09  1:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Roger Pau Monné, Anthony Perard

[-- Attachment #1: Type: text/plain, Size: 2217 bytes --]

On Fri, Jul 8, 2022 at 10:28 AM G.R. <firemeteor@users.sourceforge.net> wrote:
>
> On Fri, Jul 8, 2022 at 12:38 AM Jan Beulich <jbeulich@suse.com> wrote:
> > > But the 'xl pci-assignable-remove' will lead to xl segmentation fault...
> > >> [  655.041442] xl[975]: segfault at 0 ip 00007f2cccdaf71f sp 00007ffd73a3d4d0 error 4 in libxenlight.so.4.16.0[7f2cccd92000+7c000]
> > >> [  655.041460] Code: 61 06 00 eb 13 66 0f 1f 44 00 00 83 c3 01 39 5c 24 2c 0f 86 1b 01 00 00 48 8b 34 24 89 d8 4d 89 f9 4d 89 f0 4c 89 e9 4c 89 e2 <48> 8b 3c c6 31 c0 48 89 ee e8 53 44 fe ff 83 f8 04 75 ce 48 8b 44
> >
> > That'll need debugging. Cc-ing Anthony for awareness, but I'm sure
> > he'll need more data to actually stand a chance of doing something
> > about it.
> >
> > Is there any chance you could be doing some debugging work yourself,
> > at the very least to figure out where this (apparent) NULL deref is
> > happening?
> Yep, I can collect the call-stack for sure.

The call-stack of the segfault is like this:
0x00007ffff7f0971f in name2bdf () from /usr/lib/libxenlight.so.4.16
(gdb) bt
#0  0x00007ffff7f0971f in name2bdf () from /usr/lib/libxenlight.so.4.16
#1  0x00007ffff7f0a75e in libxl_device_pci_assignable_remove () from
/usr/lib/libxenlight.so.4.16
#2  0x00005555555725bf in main_pciassignable_remove ()
#3  0x00005555555610ab in main ()
This is with a release build of libxenlight. Once I switch to a
debug build, the segmentation fault just goes away...
This allows me to move on and test the behavior on 4.16.1 --
unfortunately no change observed at all.
Once I get the SSD assigned to the FreeBSD 12 domU, the domU still
sees the device but fails to operate it.

This time I also built the debug version of 4.16.1 hypervisor.
But unfortunately it hits the same reboot on the first
pci-assignable-add.
I cannot follow the suggestion of attaching a serial console yet.
The motherboard does have a serial port connector, but I do not have a
cable at the moment.
Maybe I can grab one, but it takes some time...

What I was able to do is to dump the 'xl dmesg' output from the dom0
boot with a debug hypervisor (see attached).
It does give a few extra lines, and I hope they are helpful.

Thanks,
G.R.

[-- Attachment #2: xldmesg_4.16.1_dbgbuild.log --]
[-- Type: text/x-log, Size: 14831 bytes --]

 Xen 4.16.1
(XEN) Xen version 4.16.1 (firemeteor@) (gcc (Debian 11.2.0-13) 11.2.0) debug=y Fri Jul  8 21:09:41 HKT 2022
(XEN) Latest ChangeSet: Wed Jul 6 16:22:55 2022 +0800 git:514aba9623
(XEN) build-id: 3e07b621cf5201a82b867a44ef1ad58a233e4ec8
(XEN) Bootloader: GRUB 2.04-20
(XEN) Command line: placeholder dom0_mem=2G,max:3G,min:1G dom0_max_vcpus=4 loglvl=all guest_loglvl=all iommu=verbose
(XEN) Xen image load base address: 0x87a00000
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN)  Found 5 MBR signatures
(XEN)  Found 5 EDD information structures
(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 158 (0x9e), Stepping 10 (raw 000906ea)
(XEN) Xen-e820 RAM map:
(XEN)  [0000000000000000, 000000000009d3ff] (usable)
(XEN)  [000000000009d400, 000000000009ffff] (reserved)
(XEN)  [00000000000e0000, 00000000000fffff] (reserved)
(XEN)  [0000000000100000, 00000000835bffff] (usable)
(XEN)  [00000000835c0000, 00000000835c0fff] (ACPI NVS)
(XEN)  [00000000835c1000, 00000000835c1fff] (reserved)
(XEN)  [00000000835c2000, 0000000088c0bfff] (usable)
(XEN)  [0000000088c0c000, 000000008907dfff] (reserved)
(XEN)  [000000008907e000, 00000000891f4fff] (usable)
(XEN)  [00000000891f5000, 00000000895dcfff] (ACPI NVS)
(XEN)  [00000000895dd000, 0000000089efefff] (reserved)
(XEN)  [0000000089eff000, 0000000089efffff] (usable)
(XEN)  [0000000089f00000, 000000008f7fffff] (reserved)
(XEN)  [00000000e0000000, 00000000efffffff] (reserved)
(XEN)  [00000000fe000000, 00000000fe010fff] (reserved)
(XEN)  [00000000fec00000, 00000000fec00fff] (reserved)
(XEN)  [00000000fee00000, 00000000fee00fff] (reserved)
(XEN)  [00000000ff000000, 00000000ffffffff] (reserved)
(XEN)  [0000000100000000, 000000086c7fffff] (usable)
(XEN) ACPI: RSDP 000F05B0, 0024 (r2 ALASKA)
(XEN) ACPI: XSDT 895120A8, 00D4 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FACP 895509C0, 0114 (r6 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: DSDT 89512218, 3E7A6 (r2 ALASKA    A M I  1072009 INTL 20160527)
(XEN) ACPI: FACS 895DC080, 0040
(XEN) ACPI: APIC 89550AD8, 00F4 (r4 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FPDT 89550BD0, 0044 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: FIDT 89550C18, 009C (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) ACPI: MCFG 89550CB8, 003C (r1 ALASKA    A M I  1072009 MSFT       97)
(XEN) ACPI: SSDT 89550CF8, 0204 (r1 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 89550F00, 17D5 (r2 ALASKA    A M I     3000 INTL 20160527)
(XEN) ACPI: SSDT 895526D8, 933D (r1 ALASKA    A M I        1 INTL 20160527)
(XEN) ACPI: SSDT 8955BA18, 31C7 (r2 ALASKA    A M I     3000 INTL 20160527)
(XEN) ACPI: SSDT 8955EBE0, 2358 (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: HPET 89560F38, 0038 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: SSDT 89560F70, 1BE1 (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 89562B58, 0F9E (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 89563AF8, 2D1B (r2 ALASKA    A M I        0 INTL 20160527)
(XEN) ACPI: UEFI 89566818, 0042 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: LPIT 89566860, 005C (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: SSDT 895668C0, 27DE (r2 ALASKA    A M I     1000 INTL 20160527)
(XEN) ACPI: SSDT 895690A0, 0FFE (r2 ALASKA    A M I        0 INTL 20160527)
(XEN) ACPI: DBGP 8956A0A0, 0034 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: DBG2 8956A0D8, 0054 (r0 ALASKA    A M I        2       1000013)
(XEN) ACPI: DMAR 8956A130, 00A8 (r1 ALASKA    A M I        2       1000013)
(XEN) ACPI: WSMT 8956A1D8, 0028 (r1 ALASKA    A M I  1072009 AMI     10013)
(XEN) System RAM: 32597MB (33379452kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-000000086c800000
(XEN) Domain heap initialised
(XEN) found SMP MP-table at 000fce30
(XEN) SMBIOS 3.1 present.
(XEN) Using APIC driver default
(XEN) ACPI: PM-Timer IO Port: 0x1808 (24 bits)
(XEN) ACPI: v5 SLEEP INFO: control[1:1804], status[1:1800]
(XEN) ACPI: Invalid sleep control/status register data: 0:0x8:0x3 0:0x8:0x3
(XEN) ACPI: SLEEP INFO: pm1x_cnt[1:1804,1:0], pm1x_evt[1:1800,1:0]
(XEN) ACPI: 32/64X FACS address mismatch in FADT - 895dc080/0000000000000000, using 32
(XEN) ACPI:             wakeup_vec[895dc08c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) Overriding APIC driver with bigsmp
(XEN) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-119
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) Enabling APIC mode:  Phys.  Using 1 I/O APICs
(XEN) ACPI: HPET id: 0x8086a201 base: 0xfed00000
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0000 buses 00 - ff
(XEN) PCI: MCFG area at e0000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-ff
(XEN) [VT-D]Host address width 39
(XEN) [VT-D]found ACPI_DMAR_DRHD:
(XEN) [VT-D]  dmaru->address = fed90000
(XEN) [VT-D]drhd->address = fed90000 iommu->reg = ffff82c00021d000
(XEN) [VT-D]cap = 1c0000c40660462 ecap = 19e2ff0505e
(XEN) [VT-D] endpoint: 0000:00:02.0
(XEN) [VT-D]found ACPI_DMAR_DRHD:
(XEN) [VT-D]  dmaru->address = fed91000
(XEN) [VT-D]drhd->address = fed91000 iommu->reg = ffff82c00021f000
(XEN) [VT-D]cap = d2008c40660462 ecap = f050da
(XEN) [VT-D] IOAPIC: 0000:00:1e.7
(XEN) [VT-D] MSI HPET: 0000:00:1e.6
(XEN) [VT-D]  flags: INCLUDE_ALL
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:00:14.0
(XEN) [VT-D]dmar.c:617:  RMRR: [899e0000,89c29fff]
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:00:02.0
(XEN) [VT-D]dmar.c:617:  RMRR: [8b000000,8f7fffff]
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 12 CPUs (0 hotplug CPUs)
(XEN) IRQ limits: 120 GSI, 2376 MSI/MSI-X
(XEN) [VT-D]qinval.c:421: QI: using 256-entry ring(s)
(XEN) Switched to APIC driver x2apic_cluster
(XEN) CPU0: TSC: ratio: 292 / 2
(XEN) CPU0: bus: 100 MHz base: 3500 MHz max: 4500 MHz
(XEN) CPU0: 800 ... 3500 MHz
(XEN) xstate: size: 0x440 and states: 0x1f
(XEN) mce_intel.c:773: MCA Capability: firstbank 0, extended MCE MSR 0, BCAST, CMCI
(XEN) CPU0: Intel machine check reporting enabled
(XEN) Speculative mitigation facilities:
(XEN)   Hardware hints:
(XEN)   Hardware features: IBPB IBRS STIBP SSBD L1D_FLUSH MD_CLEAR
(XEN)   Compiled-in support: INDIRECT_THUNK SHADOW_PAGING
(XEN)   Xen settings: BTI-Thunk JMP, SPEC_CTRL: IBRS+ STIBP- SSBD-, Other: IBPB L1D_FLUSH VERW BRANCH_HARDEN
(XEN)   L1TF: believed vulnerable, maxphysaddr L1D 46, CPUID 39, Safe address 8000000000
(XEN)   Support for HVM VMs: MSR_SPEC_CTRL RSB EAGER_FPU MD_CLEAR
(XEN)   Support for PV VMs: MSR_SPEC_CTRL EAGER_FPU MD_CLEAR
(XEN)   XPTI (64-bit PV only): Dom0 enabled, DomU enabled (with PCID)
(XEN)   PV L1TF shadowing: Dom0 disabled, DomU enabled
(XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Initializing Credit2 scheduler
(XEN)  load_precision_shift: 18
(XEN)  load_window_shift: 30
(XEN)  underload_balance_tolerance: 0
(XEN)  overload_balance_tolerance: -3
(XEN)  runqueues arrangement: socket
(XEN)  cap enforcement granularity: 10ms
(XEN) load tracking window length 1073741824 ns
(XEN) Disabling HPET for being unreliable
(XEN) Platform timer is 3.580MHz ACPI PM Timer
(XEN) Detected 3504.012 MHz processor.
(XEN) Freed 1024kB unused BSS memory
(XEN) alt table ffff82d04048b5b0 -> ffff82d040497d06
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d Snoop Control not enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Posted Interrupt not enabled.
(XEN) Intel VT-d Shared EPT tables enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) nr_sockets: 1
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) TSC deadline timer enabled
(XEN) Allocated console ring of 128 KiB.
(XEN) mwait-idle: MWAIT substates: 0x11142120
(XEN) mwait-idle: v0.4.1 model 0x9e
(XEN) mwait-idle: lapic_timer_reliable_states 0xffffffff
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN)  - VMCS shadowing
(XEN)  - VM Functions
(XEN)  - Virtualisation Exceptions
(XEN)  - Page Modification Logging
(XEN) HVM: ASIDs enabled.
(XEN) VMX: Disabling executable EPT superpages due to CVE-2018-12207
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) alt table ffff82d04048b5b0 -> ffff82d040497d06
(XEN) Brought up 12 CPUs
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Adding cpu 0 to runqueue 0
(XEN)  First cpu on runqueue, activating
(XEN) Adding cpu 1 to runqueue 0
(XEN) Adding cpu 2 to runqueue 0
(XEN) Adding cpu 3 to runqueue 0
(XEN) Adding cpu 4 to runqueue 0
(XEN) Adding cpu 5 to runqueue 0
(XEN) Adding cpu 6 to runqueue 0
(XEN) Adding cpu 7 to runqueue 0
(XEN) Adding cpu 8 to runqueue 0
(XEN) Adding cpu 9 to runqueue 0
(XEN) Adding cpu 10 to runqueue 0
(XEN) Adding cpu 11 to runqueue 0
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) Running stub recovery selftests...
(XEN) Fixup #UD[0000]: ffff82d07fffe044 [ffff82d07fffe044] -> ffff82d040386809
(XEN) Fixup #GP[0000]: ffff82d07fffe045 [ffff82d07fffe045] -> ffff82d040386809
(XEN) Fixup #SS[0000]: ffff82d07fffe044 [ffff82d07fffe044] -> ffff82d040386809
(XEN) Fixup #BP[0000]: ffff82d07fffe045 [ffff82d07fffe045] -> ffff82d040386809
(XEN) NX (Execute Disable) protection active
(XEN) Dom0 has maximum 952 PIRQs
(XEN) *** Building a PV Dom0 ***
(XEN) ELF: phdr: paddr=0x1000000 memsz=0x1395d08
(XEN) ELF: phdr: paddr=0x2400000 memsz=0x685000
(XEN) ELF: phdr: paddr=0x2a85000 memsz=0x30d98
(XEN) ELF: phdr: paddr=0x2ab6000 memsz=0x376000
(XEN) ELF: memory: 0x1000000 -> 0x2e2c000
(XEN) ELF: note: GUEST_OS = "linux"
(XEN) ELF: note: GUEST_VERSION = "2.6"
(XEN) ELF: note: XEN_VERSION = "xen-3.0"
(XEN) ELF: note: VIRT_BASE = 0xffffffff80000000
(XEN) ELF: note: INIT_P2M = 0x8000000000
(XEN) ELF: note: ENTRY = 0xffffffff82ab6160
(XEN) ELF: note: HYPERCALL_PAGE = 0xffffffff81002000
(XEN) ELF: note: FEATURES = "!writable_page_tables|pae_pgdir_above_4gb"
(XEN) ELF: note: SUPPORTED_FEATURES = 0x8801
(XEN) ELF: note: PAE_MODE = "yes"
(XEN) ELF: note: LOADER = "generic"
(XEN) ELF: note: unknown (0xd)
(XEN) ELF: note: SUSPEND_CANCEL = 0x1
(XEN) ELF: note: MOD_START_PFN = 0x1
(XEN) ELF: note: HV_START_LOW = 0xffff800000000000
(XEN) ELF: note: PADDR_OFFSET = 0
(XEN) ELF: note: PHYS32_ENTRY = 0x10004b0
(XEN) ELF: addresses:
(XEN)     virt_base        = 0xffffffff80000000
(XEN)     elf_paddr_offset = 0x0
(XEN)     virt_offset      = 0xffffffff80000000
(XEN)     virt_kstart      = 0xffffffff81000000
(XEN)     virt_kend        = 0xffffffff82e2c000
(XEN)     virt_entry       = 0xffffffff82ab6160
(XEN)     p2m_base         = 0x8000000000
(XEN)  Xen  kernel: 64-bit, lsb
(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x2e2c000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000000850000000->0000000854000000 (504314 pages to be allocated)
(XEN)  Init. ramdisk: 000000086b9fa000->000000086c7ff3f6
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff82e2c000
(XEN)  Phys-Mach map: 0000008000000000->0000008000400000
(XEN)  Start info:    ffffffff82e2c000->ffffffff82e2c4b8
(XEN)  Page tables:   ffffffff82e2d000->ffffffff82e48000
(XEN)  Boot stack:    ffffffff82e48000->ffffffff82e49000
(XEN)  TOTAL:         ffffffff80000000->ffffffff83000000
(XEN)  ENTRY ADDRESS: ffffffff82ab6160
(XEN) Dom0 has maximum 4 VCPUs
(XEN) ELF: phdr 0 at 0xffffffff81000000 -> 0xffffffff82395d08
(XEN) ELF: phdr 1 at 0xffffffff82400000 -> 0xffffffff82a85000
(XEN) ELF: phdr 2 at 0xffffffff82a85000 -> 0xffffffff82ab5d98
(XEN) ELF: phdr 3 at 0xffffffff82ab6000 -> 0xffffffff82be0000
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021d000
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021f000
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM in background
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***************************************************
(XEN) Booted on L1TF-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Please assess your configuration and choose an
(XEN) explicit 'smt=<bool>' setting.  See XSA-273.
(XEN) ***************************************************
(XEN) Booted on MLPDS/MFBDS-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Mitigations will not be fully effective.  Please
(XEN) choose an explicit smt=<bool> setting.  See XSA-297.
(XEN) ***************************************************
(XEN) 3... 2... 1... 
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) Freed 620kB init memory
(XEN) d0: Forcing write emulation on MFNs e0000-effff
(XEN) PCI add device 0000:00:00.0
(XEN) PCI add device 0000:00:01.0
(XEN) PCI add device 0000:00:02.0
(XEN) PCI add device 0000:00:12.0
(XEN) PCI add device 0000:00:14.0
(XEN) PCI add device 0000:00:14.2
(XEN) PCI add device 0000:00:16.0
(XEN) PCI add device 0000:00:16.3
(XEN) PCI add device 0000:00:17.0
(XEN) PCI add device 0000:00:1b.0
(XEN) PCI add device 0000:00:1c.0
(XEN) PCI add device 0000:00:1c.5
(XEN) PCI add device 0000:00:1d.0
(XEN) PCI add device 0000:00:1f.0
(XEN) PCI add device 0000:00:1f.3
(XEN) PCI add device 0000:00:1f.4
(XEN) PCI add device 0000:00:1f.5
(XEN) PCI add device 0000:01:00.0
(XEN) PCI add device 0000:04:00.0
(XEN) PCI add device 0000:05:00.0
(XEN) emul-priv-op.c:1025:d0v2 RDMSR 0x00000639 unimplemented
(XEN) emul-priv-op.c:1025:d0v2 RDMSR 0x00000611 unimplemented
(XEN) emul-priv-op.c:1025:d0v2 RDMSR 0x00000619 unimplemented
(XEN) emul-priv-op.c:1025:d0v2 RDMSR 0x00000641 unimplemented
(XEN) emul-priv-op.c:1025:d0v2 RDMSR 0x0000064d unimplemented
(XEN) emul-priv-op.c:1025:d0v2 RDMSR 0x00000606 unimplemented
(XEN) emul-priv-op.c:1025:d0v2 RDMSR 0x0000064e unimplemented
(XEN) emul-priv-op.c:1025:d0v2 RDMSR 0x00000034 unimplemented
(XEN) d0: Forcing read-only access to MFN fed00
(XEN) emul-priv-op.c:1025:d0v3 RDMSR 0xc0011020 unimplemented


* Re: PCI pass-through problem for SN570 NVME SSD
  2022-07-08  2:28                               ` G.R.
  2022-07-09  1:24                                 ` G.R.
@ 2022-07-09  4:27                                 ` G.R.
  1 sibling, 0 replies; 31+ messages in thread
From: G.R. @ 2022-07-09  4:27 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Roger Pau Monné, Anthony Perard

On Fri, Jul 8, 2022 at 10:28 AM G.R. <firemeteor@users.sourceforge.net> wrote:
>
> On Fri, Jul 8, 2022 at 12:38 AM Jan Beulich <jbeulich@suse.com> wrote:
> >
> > > I built both a 4.14.3 debug build and a 4.16.1 release build for
> > > testing purposes.
> > > Unfortunately they gave me absolutely zero information, since neither
> > > of them is able to get past issue #1,
> > > the FLR-related DPC / AER issue.
> > > With the 4.16.1 release, it can actually survive the 'xl
> > > pci-assignable-add' that triggers the first AER failure.
> >
> > Then that's what needs debugging first. Yet from all I've seen so
> > far I'm not sure who on the Xen side could be doing that, the more
> > so without themselves being able to repro - this seems more like a
> > Linux-side issue (and even outside of the pciback driver).
> >
> Yep, this one is likely not Xen-related, as I've seen some discussions
> ([1],[2]) on similar syndromes (not necessarily the same root cause,
> though).
> The question is why this only shows up during the FLR attempt, and
> whether subsequent pci-assignable-add invocations that do not trigger
> the error are actually reliable.
> BTW, I'm under the impression that the device is still usable in dom0
> afterwards; I'll have to double-check, though...
I think I'm finally making progress here.
Today I verified that the SSD does not survive FLR in Linux as long as
AER / DPC is enabled.
The same syndrome can be observed regardless of which driver the
device attaches to, pciback or otherwise.
And after the unsuccessful FLR, I can't use the device at all, either
in dom0 or on bare-metal Linux.
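As a quick way to check whether a given boot hit the same failure, the DPC / AER events can be counted out of a dmesg capture. A minimal sketch — the embedded sample is the log excerpt quoted at the top of this thread, not live output; on a real system you would pipe `dmesg` into the same greps:

```shell
# Sample dmesg excerpt from the failing FLR attempt (copied from this thread).
log='pcieport 0000:00:1d.0: DPC: containment event, status:0x1f11 source:0x0000
pcieport 0000:00:1d.0: DPC: unmasked uncorrectable error detected
pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)'

# Count DPC containment events and unmasked uncorrectable errors in the capture.
dpc=$(printf '%s\n' "$log" | grep -c 'DPC: containment event')
aer=$(printf '%s\n' "$log" | grep -c 'unmasked uncorrectable error')
echo "DPC containment events: $dpc, uncorrectable errors: $aer"
# prints: DPC containment events: 1, uncorrectable errors: 1
```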

Forcibly disabling ASPM does not fix the issue as long as AER / DPC
are left enabled.
However, once AER / DPC are disabled (via the pcie_aspm=off kernel
command-line option, which ironically doesn't actually turn off ASPM),
the SSD survives FLR just fine and I'm still able to use it on the
Linux host afterwards.
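For anyone who wants to try the same workaround under Xen, this is roughly where the option goes in a Debian-style GRUB2 entry — a sketch only; the menu-entry label, file names and the other options are assumptions, not copied from this system:

```
menuentry 'Debian GNU/Linux, with Xen hypervisor' {
        multiboot2  /boot/xen.gz placeholder dom0_mem=4096M,max:4096M
        # pcie_aspm=off goes on the dom0 *kernel* line, not the Xen line
        module2     /boot/vmlinuz-5.10.0-amd64 placeholder root=/dev/sda1 ro pcie_aspm=off
        module2     /boot/initrd.img-5.10.0-amd64
}
```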

Even better, the same pcie_aspm=off kernel command line works in the
Xen environment too.
The FreeBSD 12 domU is now able to access the SSD without issue!
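For completeness, the Xen side of the setup is unchanged by the workaround; roughly (the BDF is the one from this thread, the rest is a sketch):

```
# On dom0, hand the device to pciback (or hide it at boot with
# xen-pciback.hide=(0000:05:00.0) on the dom0 kernel line):
xl pci-assignable-add 05:00.0

# In the FreeBSD domU configuration file:
pci = [ '05:00.0' ]
```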

Thanks,
G.R.

>
> [1] https://patchwork.kernel.org/project/linux-pci/patch/20220408153159.106741-1-kai.heng.feng@canonical.com/
> [2] https://patchwork.kernel.org/project/linux-pci/patch/20220127025418.1989642-1-kai.heng.feng@canonical.com/#24713767
>



Thread overview: 31+ messages
2022-07-02 17:43 PCI pass-through problem for SN570 NVME SSD G.R.
2022-07-04  6:37 ` G.R.
2022-07-04 10:31   ` Jan Beulich
2022-07-04  9:50 ` Roger Pau Monné
2022-07-04 10:34   ` Jan Beulich
2022-07-04 11:34   ` G.R.
2022-07-04 11:44     ` G.R.
2022-07-04 13:09     ` Roger Pau Monné
2022-07-04 14:51       ` G.R.
2022-07-04 15:15         ` G.R.
2022-07-04 15:37           ` G.R.
2022-07-04 16:05             ` Roger Pau Monné
2022-07-04 16:07               ` Jan Beulich
2022-07-04 16:31               ` G.R.
2022-07-05  7:29                 ` Jan Beulich
2022-07-05 11:31                   ` G.R.
2022-07-05 11:59                     ` Jan Beulich
2022-07-06  6:25                       ` G.R.
2022-07-06  6:33                         ` Jan Beulich
2022-07-07 15:24                           ` G.R.
2022-07-07 15:36                             ` G.R.
2022-07-07 16:18                               ` Jan Beulich
2022-07-07 16:23                             ` Jan Beulich
2022-07-08  2:28                               ` G.R.
2022-07-09  1:24                                 ` G.R.
2022-07-09  4:27                                 ` G.R.
2022-07-04 15:33         ` Roger Pau Monné
2022-07-04 15:44           ` G.R.
2022-07-04 15:57             ` Roger Pau Monné
2022-07-05 18:06               ` Jason Andryuk
2022-07-04 16:05           ` Jan Beulich
