On Fri, 17 Mar 2023, Roger Pau Monné wrote:
> On Fri, Mar 17, 2023 at 09:39:52AM +0100, Jan Beulich wrote:
> > On 17.03.2023 00:19, Stefano Stabellini wrote:
> > > On Thu, 16 Mar 2023, Jan Beulich wrote:
> > >> So yes, it then all boils down to that Linux-
> > >> internal question.
> > >
> > > Excellent question but we'll have to wait for Ray as he is the one with
> > > access to the hardware. But I have this data I can share in the
> > > meantime:
> > >
> > > [    1.260378] IRQ to pin mappings:
> > > [    1.260387] IRQ1 -> 0:1
> > > [    1.260395] IRQ2 -> 0:2
> > > [    1.260403] IRQ3 -> 0:3
> > > [    1.260410] IRQ4 -> 0:4
> > > [    1.260418] IRQ5 -> 0:5
> > > [    1.260425] IRQ6 -> 0:6
> > > [    1.260432] IRQ7 -> 0:7
> > > [    1.260440] IRQ8 -> 0:8
> > > [    1.260447] IRQ9 -> 0:9
> > > [    1.260455] IRQ10 -> 0:10
> > > [    1.260462] IRQ11 -> 0:11
> > > [    1.260470] IRQ12 -> 0:12
> > > [    1.260478] IRQ13 -> 0:13
> > > [    1.260485] IRQ14 -> 0:14
> > > [    1.260493] IRQ15 -> 0:15
> > > [    1.260505] IRQ106 -> 1:8
> > > [    1.260513] IRQ112 -> 1:4
> > > [    1.260521] IRQ116 -> 1:13
> > > [    1.260529] IRQ117 -> 1:14
> > > [    1.260537] IRQ118 -> 1:15
> > > [    1.260544] .................................... done.
> >
> > And what does Linux think are IRQs 16 ... 105? Have you compared with
> > Linux running baremetal on the same hardware?
>
> So I have some emails from Ray from the time he was looking into this,
> and on Linux dom0 PVH dmesg there is:
>
> [    0.065063] IOAPIC[0]: apic_id 33, version 17, address 0xfec00000, GSI 0-23
> [    0.065096] IOAPIC[1]: apic_id 34, version 17, address 0xfec01000, GSI 24-55
>
> So it seems the vIO-APIC data provided by Xen to dom0 is at least
> consistent.
>
> > > And I think Ray traced the point in Linux where Linux gives us an IRQ ==
> > > 112 (which is the one causing issues):
> > >
> > > __acpi_register_gsi->
> > >     acpi_register_gsi_ioapic->
> > >         mp_map_gsi_to_irq->
> > >             mp_map_pin_to_irq->
> > >                 __irq_resolve_mapping()
> > >
> > >         if (likely(data)) {
> > >                 desc = irq_data_to_desc(data);
> > >                 if (irq)
> > >                         *irq = data->irq;
> > >                 /* this IRQ is 112, IO-APIC-34 domain */
> > >         }
>
> Could this all be a result of patch 4/5 in the Linux series ("[RFC
> PATCH 4/5] x86/xen: acpi registers gsi for xen pvh"), where a different
> __acpi_register_gsi hook is installed for PVH in order to set up GSIs
> using PHYSDEV ops instead of doing it natively from the IO-APIC?
>
> FWIW, the introduced function in that patch
> (acpi_register_gsi_xen_pvh()) seems to unconditionally call
> acpi_register_gsi_ioapic() without checking if the GSI is already
> registered, which might lead to multiple IRQs being allocated for the
> same underlying GSI?

I understand this point and I think it needs investigating.

> As I commented there, I think that approach is wrong. If the GSI has
> not been mapped in Xen (because dom0 hasn't unmasked the respective
> IO-APIC pin) we should add some logic in the toolstack to map it
> before attempting to bind.

But this statement confuses me. The toolstack doesn't get involved in
IRQ setup for PCI devices for HVM guests, does it? Keep in mind that
this is a regular HVM guest creation on PVH Dom0, so normally the IRQ
setup is done by QEMU, and QEMU already calls xc_physdev_map_pirq and
xc_domain_bind_pt_pci_irq. So I don't follow your statement about "the
toolstack to map it before attempting to bind".
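
For context, this is roughly the sequence QEMU already goes through when
setting up interrupt passthrough for the guest (a minimal sketch from
memory of the libxc interfaces, with error handling trimmed;
setup_pt_pci_irq is just an illustrative name, not actual QEMU code):

#include <xenctrl.h>

/* Illustrative only: the two libxc calls QEMU makes to route a
 * passed-through PCI device's legacy interrupt into an HVM guest. */
static int setup_pt_pci_irq(xc_interface *xch, uint32_t domid, int gsi,
                            uint8_t bus, uint8_t device, uint8_t intx)
{
    int pirq = gsi; /* hint: ask Xen for pirq == gsi if available */
    int rc;

    /* Map the physical GSI to a pirq for the guest domain. */
    rc = xc_physdev_map_pirq(xch, domid, gsi, &pirq);
    if (rc)
        return rc;

    /* Bind that pirq to the guest's virtual PCI INTx line. */
    return xc_domain_bind_pt_pci_irq(xch, domid, pirq, bus, device, intx);
}

That is, both the map and the bind are already issued from QEMU, not
from the toolstack.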
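
Going back to the acpi_register_gsi_xen_pvh() point, the kind of reuse
check that seems to be missing would be something along these lines (an
untested sketch based on my reading of the RFC patch, not the patch's
actual code; my understanding is that mp_map_gsi_to_irq() called without
the alloc flag only looks up an existing mapping):

static int acpi_register_gsi_xen_pvh(struct device *dev, u32 gsi,
                                     int trigger, int polarity)
{
    int irq;

    /* If this GSI already has an IRQ mapping, reuse it instead of
     * allocating a second IRQ for the same IO-APIC pin. */
    irq = mp_map_gsi_to_irq(gsi, 0, NULL);
    if (irq > 0)
        return irq;

    /* Otherwise take the normal IO-APIC registration path; the
     * PHYSDEVOP calls that the RFC patch adds would follow here
     * (elided). */
    return acpi_register_gsi_ioapic(dev, gsi, trigger, polarity);
}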