* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
@ 2009-10-16  1:38 Cinco, Dante
  2009-10-16  2:34 ` Qing He
  0 siblings, 1 reply; 55+ messages in thread
From: Cinco, Dante @ 2009-10-16  1:38 UTC (permalink / raw)
  To: Qing He; +Cc: xen-devel, Keir Fraser, xiantao.zhang

I'm still trying to track down the problem of lost interrupts when I change /proc/irq/<irq#>/smp_affinity in domU. I'm now at Xen 3.5-unstable changeset 20320 and using pvops dom0 2.6.31.1.

In domU, my PCI devices are at virtual slots 5, 6, 7 and 8, so I use "lspci -vv" to get their respective IRQs and MSI message address/data. I can also see their IRQs in /proc/interrupts (I'm not showing all 16 CPUs):

lspci -vv -s 00:05.0 | grep IRQ; lspci -vv -s 00:06.0 | grep IRQ; lspci -vv -s 00:07.0 | grep IRQ; lspci -vv -s 00:08.0 | grep IRQ
        Interrupt: pin A routed to IRQ 48
        Interrupt: pin B routed to IRQ 49
        Interrupt: pin C routed to IRQ 50
        Interrupt: pin D routed to IRQ 51
lspci -vv -s 00:05.0 | grep Address; lspci -vv -s 00:06.0 | grep Address; lspci -vv -s 00:07.0 | grep Address; lspci -vv -s 00:08.0 | grep Address
                Address: 00000000fee00000  Data: 4071 (vector=113)
                Address: 00000000fee00000  Data: 4089 (vector=137)
                Address: 00000000fee00000  Data: 4099 (vector=153)
                Address: 00000000fee00000  Data: 40a9 (vector=169)
egrep '(HW_TACHYON|CPU0)' /proc/interrupts 
            CPU0       CPU1       
  48:    1571765          0          PCI-MSI-edge      HW_TACHYON
  49:    3204403          0          PCI-MSI-edge      HW_TACHYON
  50:    2643008          0          PCI-MSI-edge      HW_TACHYON
  51:    3270322          0          PCI-MSI-edge      HW_TACHYON

In dom0, my PCI devices show up as a 4-function device: 0:07:0.0, 0:07:0.1, 0:07:0.2, 0:07:0.3 and I also use "lspci -vv" to get the IRQs and MSI info:

lspci -vv -s 0:07:0.0 | grep IRQ;lspci -vv -s 0:07:0.1 | grep IRQ;lspci -vv -s 0:07:0.2 | grep IRQ;lspci -vv -s 0:07:0.3 | grep IRQ
        Interrupt: pin A routed to IRQ 11
        Interrupt: pin B routed to IRQ 10
        Interrupt: pin C routed to IRQ 7
        Interrupt: pin D routed to IRQ 5
lspci -vv -s 0:07:0.0 | grep Address;lspci -vv -s 0:07:0.1 | grep Address;lspci -vv -s 0:07:0.2 | grep Address;lspci -vv -s 0:07:0.3 | grep Address
                Address: 00000000fee00000  Data: 403c (vector=60)
                Address: 00000000fee00000  Data: 4044 (vector=68)
                Address: 00000000fee00000  Data: 404c (vector=76)
                Address: 00000000fee00000  Data: 4054 (vector=84)

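For reference, the "(vector=...)" and dest-ID annotations above and further below follow directly from the standard x86 MSI layout: the destination APIC ID sits in address bits 19:12 and the vector in data bits 7:0. Below is a minimal standalone C sketch of that decoding using values from this thread; it is an illustration only, not Xen or driver code.

/* msi_decode.c -- decode the MSI Address/Data pairs printed by lspci.
 * Standard x86 MSI format: address bits 19:12 = destination APIC ID,
 * data bits 7:0 = vector.  Illustration only. */
#include <stdio.h>
#include <stdint.h>

static void decode(uint64_t addr, uint32_t data)
{
    unsigned dest_id = (unsigned)(addr >> 12) & 0xff;  /* destination APIC ID */
    unsigned vector  = data & 0xff;                    /* interrupt vector    */
    printf("addr %#010llx data %#06x -> dest ID %u, vector 0x%x (%u)\n",
           (unsigned long long)addr, (unsigned)data, dest_id, vector, vector);
}

int main(void)
{
    decode(0xfee00000ULL, 0x4071); /* domU slot 5 above: dest 0, vector 0x71 = 113 */
    decode(0xfee00000ULL, 0x403c); /* dom0 07:0.0 above: dest 0, vector 0x3c = 60  */
    decode(0xfee02000ULL, 0x40b1); /* after the affinity change later in this mail */
    return 0;
}
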
I used the "Ctrl-a" "Ctrl-a" "Ctrl-a" "i" key sequence from the Xen console to print the guest interrupt information and the PCI devices. The vectors shown here are actually the vectors as seen from dom0 so I don't understand the label "Guest interrupt information." Meanwhile, the IRQs (74 - 77) do not match those from dom0 (11, 10, 7, 5) or domU (48, 49, 50, 51) as seen by "lspci -vv" but they do match those reported by the "Ctrl-a" key sequence followed by "Q" for PCI devices.

(XEN) Guest interrupt information:
(XEN)    IRQ:  74, IRQ affinity:0x00000001, Vec: 60 type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 79(----),
(XEN)    IRQ:  75, IRQ affinity:0x00000001, Vec: 68 type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 78(----),
(XEN)    IRQ:  76, IRQ affinity:0x00000001, Vec: 76 type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 77(----),
(XEN)    IRQ:  77, IRQ affinity:0x00000001, Vec: 84 type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 76(----),

(XEN) ==== PCI devices ====
(XEN) 07:00.3 - dom 1   - MSIs < 77 >
(XEN) 07:00.2 - dom 1   - MSIs < 76 >
(XEN) 07:00.1 - dom 1   - MSIs < 75 >
(XEN) 07:00.0 - dom 1   - MSIs < 74 >

If I look at /var/log/xen/qemu-dm-dpm.log, I see these 4 lines that show the pirq's, which match those in the last column of the guest interrupt information:

pt_msi_setup: msi mapped with pirq 4f (79)
pt_msi_setup: msi mapped with pirq 4e (78)
pt_msi_setup: msi mapped with pirq 4d (77)
pt_msi_setup: msi mapped with pirq 4c (76)

The gvec's (71, 89, 99, a9) match the vectors as seen by lspci in domU:

pt_msgctrl_reg_write: guest enabling MSI, disable MSI-INTx translation
pt_msi_update: Update msi with pirq 4f gvec 71 gflags 0
pt_msgctrl_reg_write: guest enabling MSI, disable MSI-INTx translation
pt_msi_update: Update msi with pirq 4e gvec 89 gflags 0
pt_msgctrl_reg_write: guest enabling MSI, disable MSI-INTx translation
pt_msi_update: Update msi with pirq 4d gvec 99 gflags 0
pt_msgctrl_reg_write: guest enabling MSI, disable MSI-INTx translation
pt_msi_update: Update msi with pirq 4c gvec a9 gflags 0

I see these same pirq's in the output of "xm dmesg":

(XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe: bdf = 7:0.0
(XEN) [VT-D]iommu.c:1175:d0 domain_context_mapping:PCIe: bdf = 7:0.0
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = 4f device = 5 intx = 0
(XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe: bdf = 7:0.1
(XEN) [VT-D]iommu.c:1175:d0 domain_context_mapping:PCIe: bdf = 7:0.1
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = 4e device = 6 intx = 0
(XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe: bdf = 7:0.2
(XEN) [VT-D]iommu.c:1175:d0 domain_context_mapping:PCIe: bdf = 7:0.2
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = 4d device = 7 intx = 0
(XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe: bdf = 7:0.3
(XEN) [VT-D]iommu.c:1175:d0 domain_context_mapping:PCIe: bdf = 7:0.3
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = 4c device = 8 intx = 0

The machine_gsi's match the pirq's, while the m_irq's match the IRQs from lspci in dom0. What are the guest_gsi's?

(XEN) io.c:316:d0 pt_irq_destroy_bind_vtd: machine_gsi=79 guest_gsi=36, device=5, intx=0.
(XEN) io.c:371:d0 XEN_DOMCTL_irq_unmapping: m_irq = 0x4f device = 0x5 intx = 0x0
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = b device = 5 intx = 0
(XEN) io.c:316:d0 pt_irq_destroy_bind_vtd: machine_gsi=78 guest_gsi=40, device=6, intx=0.
(XEN) io.c:371:d0 XEN_DOMCTL_irq_unmapping: m_irq = 0x4e device = 0x6 intx = 0x0
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = a device = 6 intx = 0
(XEN) io.c:316:d0 pt_irq_destroy_bind_vtd: machine_gsi=77 guest_gsi=44, device=7, intx=0.
(XEN) io.c:371:d0 XEN_DOMCTL_irq_unmapping: m_irq = 0x4d device = 0x7 intx = 0x0
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = 7 device = 7 intx = 0
(XEN) io.c:316:d0 pt_irq_destroy_bind_vtd: machine_gsi=76 guest_gsi=17, device=8, intx=0.
(XEN) io.c:371:d0 XEN_DOMCTL_irq_unmapping: m_irq = 0x4c device = 0x8 intx = 0x0
(XEN) [VT-D]io.c:291:d0 VT-d irq bind: m_irq = 5 device = 8 intx = 0

So now, when I finally get to the part where I change smp_affinity, I see corresponding changes in the guest interrupt information, qemu-dm-dpm.log, and lspci on both dom0 and domU:

cat /proc/irq/48/smp_affinity 
ffff
echo 2 > /proc/irq/48/smp_affinity
cat /proc/irq/48/smp_affinity 
0002

(XEN) Guest interrupt information: (IRQ affinity changed from 1 to 2, while vector changed from 60 to 92)
(XEN)    IRQ:  74, IRQ affinity:0x00000002, Vec: 92 type=PCI-MSI         status=00000010 in-flight=1 domain-list=1: 79(---M),

pt_msi_update: Update msi with pirq 4f gvec 71 gflags 2 (What is the significance of gflags 2?)
pt_msi_update: Update msi with pirq 4f gvec b1 gflags 2

domU: lspci -vv -s 00:05.0 | grep Address
                Address: 00000000fee02000  Data: 40b1 (dest ID changed from 0 to 2 and vector changed from 0x71 to 0xb1)

dom0: lspci -vv -s 0:07:0.0 | grep Address
                Address: 00000000fee00000  Data: 405c (vector changed from 0x3c (60 decimal) to 0x5c (92 decimal))

I'm confused about why there are 4 sets of IRQs: dom0 lspci: [11,10,7,5], domU lspci and /proc/interrupts: [48,49,50,51], pirq: [76,77,78,79], guest interrupt info: [74,75,76,77].

Are the changes resulting from changing the IRQ smp_affinity consistent with what is expected? Any recommendation on where to go from here?

Thanks in advance.

Dante


* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  1:38 IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem) Cinco, Dante
@ 2009-10-16  2:34 ` Qing He
  2009-10-16  6:37   ` Keir Fraser
  0 siblings, 1 reply; 55+ messages in thread
From: Qing He @ 2009-10-16  2:34 UTC (permalink / raw)
  To: Cinco, Dante; +Cc: xen-devel, Keir Fraser, Zhang, Xiantao

On Fri, 2009-10-16 at 09:38 +0800, Cinco, Dante wrote:
> I'm confused why there are 4 sets of IRQs: dom0 lspci:[11,10,7,5], domU
> lspci proc interrupts:[48,49,50,51], pirq:[76,77,78,79], guest int
> info:[74,75,76,77].

This is indeed a little confusing at first; I'll try to differentiate
them here:
1. dom0 IRQ [11,10,7,5]: this is decided by the dom0 kernel, based on
   information from the host ACPI
2. domU IRQ [48,49,50,51]: this is decided by the domU kernel, based on
   the virtual ACPI presented to the guest. If there are multiple domUs,
   this space overlaps
3. pirq [76,77,78,79]: this is a per-domain concept; it has nothing
   to do with the physical or virtual irq number, and its sole purpose
   is to provide an interface between domains (mainly PV) and the
   hypervisor. The GSI part happens to be identity-mapped, though.
4. irq [74,75,76,77]: a global hypervisor concept used to track all irqs
   for all domains. It was originally named `vector'; the name changed
   when per-CPU vectoring was introduced in the hypervisor.

> pt_msi_update: Update msi with pirq 4f gvec 71 gflags 2 (What is the significance of gflags 2?)
> pt_msi_update: Update msi with pirq 4f gvec b1 gflags 2

gflags is a custom interface that incorporates the address and data
fields: DM, dest, etc. gflags=2 means DM=0, dest=2.

The first line is an intermediate result, printed when the guest updates
the MSI address; the second line indicates an update to the MSI data.
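
Below is a minimal sketch of that decoding, assuming a layout with the destination ID in the low byte of gflags and DM in a single higher-order bit. The real masks are Xen's VMSI_DEST_ID_MASK / VMSI_DM_MASK, which appear in the patches later in this thread; the SKETCH_* values here are assumptions for illustration only.

/* gflags_decode.c -- toy decode of the qemu-dm "gflags" value above.
 * SKETCH_* masks are assumed for illustration; Xen's own masks are
 * VMSI_DEST_ID_MASK / VMSI_DM_MASK. */
#include <stdio.h>
#include <stdint.h>

#define SKETCH_DEST_ID_MASK 0x000000ffu  /* assumed: destination ID in bits 7:0 */
#define SKETCH_DM_MASK      0x00000200u  /* assumed: destination-mode bit       */

int main(void)
{
    uint32_t gflags = 0x2;  /* from "Update msi with pirq 4f gvec b1 gflags 2" */
    unsigned dm   = !!(gflags & SKETCH_DM_MASK);
    unsigned dest = gflags & SKETCH_DEST_ID_MASK;
    printf("gflags %#x -> DM=%u dest=%u\n", (unsigned)gflags, dm, dest);  /* DM=0 dest=2 */
    return 0;
}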

> (XEN) Guest interrupt information:
> (XEN)    IRQ:  74, IRQ affinity:0x00000001, Vec: 60 type=PCI-MSI
> status=00000010 in-flight=0 domain-list=1: 79(----),
>
> echo 2 > /proc/irq/48/smp_affinity
>
> (XEN) Guest interrupt information: (IRQ affinity changed from 1 to 2, while
> vector changed from 60 to 92)
> (XEN)    IRQ:  74, IRQ affinity:0x00000002, Vec: 92 type= PCI-MSI
> status=00000010 in-flight=1 domain-list=1: 79(---M),

`(---M)' means masked; that may be why the irq is not received.

Thanks,
Qing


* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  2:34 ` Qing He
@ 2009-10-16  6:37   ` Keir Fraser
  2009-10-16  7:32     ` Zhang, Xiantao
  0 siblings, 1 reply; 55+ messages in thread
From: Keir Fraser @ 2009-10-16  6:37 UTC (permalink / raw)
  To: Qing He, Cinco, Dante; +Cc: xen-devel, Zhang, Xiantao

On 16/10/2009 03:34, "Qing He" <qing.he@intel.com> wrote:

>> (XEN) Guest interrupt information: (IRQ affinity changed from 1 to 2, while
>> vector changed from 60 to 92)
>> (XEN)    IRQ:  74, IRQ affinity:0x00000002, Vec: 92 type= PCI-MSI
>> status=00000010 in-flight=1 domain-list=1: 79(---M),
> 
> `(---M)' means masked, that may be why the irq is not received.

Glad you managed to pick that out of the information overload. :-) It does
look like the next obvious lead to chase down.

 -- Keir


* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  6:37   ` Keir Fraser
@ 2009-10-16  7:32     ` Zhang, Xiantao
  2009-10-16  8:24       ` Qing He
  0 siblings, 1 reply; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-16  7:32 UTC (permalink / raw)
  To: Keir Fraser, He, Qing, Cinco, Dante; +Cc: xen-devel

Keir Fraser wrote:
> On 16/10/2009 03:34, "Qing He" <qing.he@intel.com> wrote:
> 
>>> (XEN) Guest interrupt information: (IRQ affinity changed from 1 to
>>> 2, while vector changed from 60 to 92) (XEN)    IRQ:  74, IRQ
>>> affinity:0x00000002, Vec: 92 type= PCI-MSI status=00000010
>>> in-flight=1 domain-list=1: 79(---M), 
>> 
>> `(---M)' means masked, that may be why the irq is not received.
> 
> Glad you managed to pick that out of the information overload. :-) It
> does look like the next obvious lead to chase down.

According to the description, the issue should be caused by a lost EOI write for the MSI interrupt, which leads to the interrupt being permanently masked. There should be a race between the guest setting a new vector and EOIing the old vector for the interrupt. Once the guest sets the new vector before it EOIs the old vector, the hypervisor can't find the pirq corresponding to the old vector (it has been changed to the new vector), so it can never EOI the old vector at the hardware level. Since the corresponding vector in the real processor can't be EOIed, the system may lose all interrupts and ultimately produce the reported issues. But I remembered there should be a timer to handle this case through a forcible EOI write to the real processor after a timeout; it seems it doesn't function in the expected way.
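
A self-contained toy sketch of that race follows (this is not Xen code; the names are simplified stand-ins for the per-pirq guest-MSI bookkeeping that the patches later in the thread modify). The point it shows: the EOI-time lookup keys on the currently recorded guest vector, so an EOI that arrives after the vector has already been rewritten finds no pirq and the physical EOI is never issued.

/* msi_eoi_race.c -- toy model of the lost-EOI race, not Xen code. */
#include <stdio.h>

struct gmsi { int pirq; int gvec; };       /* stand-in for per-pirq guest-MSI state   */
static struct gmsi msi = { 0x4f, 0x71 };   /* pirq 0x4f bound with guest vector 0x71  */

/* stand-in for the EOI-time "find pirq by guest vector" lookup */
static int find_pirq_by_gvec(int vector)
{
    return (msi.gvec == vector) ? msi.pirq : -1;
}

int main(void)
{
    int injected = msi.gvec;   /* interrupt fires: vector 0x71 injected into the guest */

    msi.gvec = 0xb1;           /* guest re-programs the MSI (affinity change) before
                                * it has EOIed the in-flight 0x71                      */

    int pirq = find_pirq_by_gvec(injected);   /* guest finally EOIs 0x71 */
    if (pirq < 0)
        printf("no pirq found for gvec 0x%x -> physical EOI never issued\n", injected);
    else
        printf("EOI forwarded for pirq 0x%x\n", pirq);
    return 0;
}
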
Xiantao 


* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  8:24       ` Qing He
@ 2009-10-16  8:22         ` Zhang, Xiantao
  2009-10-16  8:34           ` Qing He
  0 siblings, 1 reply; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-16  8:22 UTC (permalink / raw)
  To: He, Qing; +Cc: Cinco, Dante, xen-devel, Keir Fraser

He, Qing wrote:
> On Fri, 2009-10-16 at 15:32 +0800, Zhang, Xiantao wrote:
>> According to the description, the issue should be caused by lost EOI
>> write for the MSI interrupt and leads to permanent interrupt mask.
>> There should be a race between guest setting new vector and  EOIs
>> old vector for the interrupt.  Once guest sets new vector before it
>> EOIs the old vector, hypervisor can't find the pirq which
>> corresponds old vector(has changed 
>> to new vector) , so also can't EOI the old vector forever in hardware
>> level. Since the corresponding vector in real processor can't be
>> EOIed, 
>> so system may lose all interrupts and result the reported issues
>> ultimately. 
> 
>> But I remembered there should be a timer to handle this case
>> through a forcible EOI write to the real processor after timeout,
>> but seems it doesn't function in the expected way.
> 
> The EOI timer is supposed to deal with the irq sharing problem,
> since MSI doesn't share, this timer will not be started in the
> case of MSI.

That may be a problem, if so. If a malicious/buggy guest won't EOI the MSI vector, the host may hang due to the lack of a timeout mechanism?
Xiantao


* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  7:32     ` Zhang, Xiantao
@ 2009-10-16  8:24       ` Qing He
  2009-10-16  8:22         ` Zhang, Xiantao
  0 siblings, 1 reply; 55+ messages in thread
From: Qing He @ 2009-10-16  8:24 UTC (permalink / raw)
  To: Zhang, Xiantao; +Cc: Cinco, Dante, xen-devel, Keir Fraser

On Fri, 2009-10-16 at 15:32 +0800, Zhang, Xiantao wrote:
> According to the description, the issue should be caused by lost EOI write
> for the MSI interrupt and leads to permanent interrupt mask. There should
> be a race between guest setting new vector and  EOIs old vector for the
> interrupt.  Once guest sets new vector before it EOIs the old vector,
> hypervisor can't find the pirq which corresponds old vector(has changed
> to new vector) , so also can't EOI the old vector forever in hardware
> level. Since the corresponding vector in real processor can't be EOIed,
> so system may lose all interrupts and result the reported issues ultimately.

> But I remembered there should be a timer to handle this case
> through a forcible EOI write to the real processor after timeout,
> but seems it doesn't function in the expected way.

The EOI timer is supposed to deal with the irq sharing problem;
since MSI doesn't share, this timer is not started in the
case of MSI.


* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  8:22         ` Zhang, Xiantao
@ 2009-10-16  8:34           ` Qing He
  2009-10-16  8:35             ` Zhang, Xiantao
  0 siblings, 1 reply; 55+ messages in thread
From: Qing He @ 2009-10-16  8:34 UTC (permalink / raw)
  To: Zhang, Xiantao; +Cc: Cinco, Dante, xen-devel, Keir Fraser

On Fri, 2009-10-16 at 16:22 +0800, Zhang, Xiantao wrote:
> He, Qing wrote:
> > On Fri, 2009-10-16 at 15:32 +0800, Zhang, Xiantao wrote:
> >> According to the description, the issue should be caused by lost EOI
> >> write for the MSI interrupt and leads to permanent interrupt mask.
> >> There should be a race between guest setting new vector and  EOIs
> >> old vector for the interrupt.  Once guest sets new vector before it
> >> EOIs the old vector, hypervisor can't find the pirq which
> >> corresponds old vector(has changed 
> >> to new vector) , so also can't EOI the old vector forever in hardware
> >> level. Since the corresponding vector in real processor can't be
> >> EOIed, 
> >> so system may lose all interrupts and result the reported issues
> >> ultimately. 
> > 
> >> But I remembered there should be a timer to handle this case
> >> through a forcible EOI write to the real processor after timeout,
> >> but seems it doesn't function in the expected way.
> > 
> > The EOI timer is supposed to deal with the irq sharing problem,
> > since MSI doesn't share, this timer will not be started in the
> > case of MSI.
> 
> That maybe a problem if so. If a malicious/buggy guest won't EOI the
> MSI vector, so host may hang due to lack of timeout mechanism? 

Why would the host hang? Only the assigned interrupt will be blocked, and
that's exactly what the guest wants :-)


* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  8:34           ` Qing He
@ 2009-10-16  8:35             ` Zhang, Xiantao
  2009-10-16  9:01               ` Qing He
  2009-10-16  9:41               ` Keir Fraser
  0 siblings, 2 replies; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-16  8:35 UTC (permalink / raw)
  To: He, Qing; +Cc: Cinco, Dante, xen-devel, Keir Fraser

He, Qing wrote:
> On Fri, 2009-10-16 at 16:22 +0800, Zhang, Xiantao wrote:
>> He, Qing wrote:
>>> On Fri, 2009-10-16 at 15:32 +0800, Zhang, Xiantao wrote:
>>>> According to the description, the issue should be caused by lost
>>>> EOI write for the MSI interrupt and leads to permanent interrupt
>>>> mask. There should be a race between guest setting new vector and 
>>>> EOIs old vector for the interrupt.  Once guest sets new vector
>>>> before it EOIs the old vector, hypervisor can't find the pirq which
>>>> corresponds old vector(has changed
>>>> to new vector) , so also can't EOI the old vector forever in
>>>> hardware level. Since the corresponding vector in real processor
>>>> can't be EOIed, so system may lose all interrupts and result the
>>>> reported issues ultimately.
>>> 
>>>> But I remembered there should be a timer to handle this case
>>>> through a forcible EOI write to the real processor after timeout,
>>>> but seems it doesn't function in the expected way.
>>> 
>>> The EOI timer is supposed to deal with the irq sharing problem,
>>> since MSI doesn't share, this timer will not be started in the
>>> case of MSI.
>> 
>> That maybe a problem if so. If a malicious/buggy guest won't EOI the
>> MSI vector, so host may hang due to lack of timeout mechanism?
> 
> Why does host hang? Only the assigned interrupt will block, and that's
> exactly what the guest wants :-)

The hypervisor shouldn't EOI the real vector until the guest EOIs the corresponding virtual vector, right? Not sure. :-)
Xiantao


* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  8:35             ` Zhang, Xiantao
@ 2009-10-16  9:01               ` Qing He
  2009-10-16  9:42                 ` Qing He
  2009-10-16  9:49                 ` Zhang, Xiantao
  2009-10-16  9:41               ` Keir Fraser
  1 sibling, 2 replies; 55+ messages in thread
From: Qing He @ 2009-10-16  9:01 UTC (permalink / raw)
  To: Zhang, Xiantao; +Cc: Cinco, Dante, xen-devel, Keir Fraser

[-- Attachment #1: Type: text/plain, Size: 2017 bytes --]

On Fri, 2009-10-16 at 16:35 +0800, Zhang, Xiantao wrote:
> He, Qing wrote:
> > On Fri, 2009-10-16 at 16:22 +0800, Zhang, Xiantao wrote:
> >> He, Qing wrote:
> >>> On Fri, 2009-10-16 at 15:32 +0800, Zhang, Xiantao wrote:
> >>>> According to the description, the issue should be caused by lost
> >>>> EOI write for the MSI interrupt and leads to permanent interrupt
> >>>> mask. There should be a race between guest setting new vector and 
> >>>> EOIs old vector for the interrupt.  Once guest sets new vector
> >>>> before it EOIs the old vector, hypervisor can't find the pirq which
> >>>> corresponds old vector(has changed
> >>>> to new vector) , so also can't EOI the old vector forever in
> >>>> hardware level. Since the corresponding vector in real processor
> >>>> can't be EOIed, so system may lose all interrupts and result the
> >>>> reported issues ultimately.
> >>> 
> >>>> But I remembered there should be a timer to handle this case
> >>>> through a forcible EOI write to the real processor after timeout,
> >>>> but seems it doesn't function in the expected way.
> >>> 
> >>> The EOI timer is supposed to deal with the irq sharing problem,
> >>> since MSI doesn't share, this timer will not be started in the
> >>> case of MSI.
> >> 
> >> That maybe a problem if so. If a malicious/buggy guest won't EOI the
> >> MSI vector, so host may hang due to lack of timeout mechanism?
> > 
> > Why does host hang? Only the assigned interrupt will block, and that's
> > exactly what the guest wants :-)
> 
> Hypervisor shouldn't EOI the real vector until guest EOI the corresponding
> virtual vector , right ?  Not sure.:-)

Yes, it is the algorithm used today.

After reviewing the code, if the guest really does something like
changing affinity within the window between an irq firing and its EOI,
there is indeed a problem; the patch is attached. Although I kind of
doubt it: shouldn't desc->lock in the guest protect these paths and make
the two operations mutually exclusive?

Dante,
Can you see if this patch helps?

Thanks,
Qing

[-- Attachment #2: msi-eoi-before-update.patch --]
[-- Type: text/x-diff, Size: 752 bytes --]

diff -r 1d7221667204 xen/drivers/passthrough/io.c
--- a/xen/drivers/passthrough/io.c	Thu Oct 08 09:24:32 2009 +0100
+++ b/xen/drivers/passthrough/io.c	Fri Oct 16 16:38:06 2009 +0800
@@ -26,6 +26,7 @@
 #include <xen/hvm/irq.h>
 
 static void hvm_dirq_assist(unsigned long _d);
+static void __msi_pirq_eoi(struct domain *d, int pirq);
 
 static int pt_irq_need_timer(uint32_t flags)
 {
@@ -194,7 +195,9 @@
 	            spin_unlock(&d->event_lock);
         	    return -EBUSY;
             }
- 
+
+            __msi_pirq_eoi(d, pirq);
+
             /* if pirq is already mapped as vmsi, update the guest data/addr */
             old_gvec = hvm_irq_dpci->mirq[pirq].gmsi.gvec;
             hvm_irq_dpci->mirq[pirq].gmsi.gvec = pt_irq_bind->u.msi.gvec;


* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  8:35             ` Zhang, Xiantao
  2009-10-16  9:01               ` Qing He
@ 2009-10-16  9:41               ` Keir Fraser
  2009-10-16  9:57                 ` Qing He
  2009-10-16  9:58                 ` Zhang, Xiantao
  1 sibling, 2 replies; 55+ messages in thread
From: Keir Fraser @ 2009-10-16  9:41 UTC (permalink / raw)
  To: Zhang, Xiantao, He, Qing; +Cc: Cinco, Dante, xen-devel

On 16/10/2009 09:35, "Zhang, Xiantao" <xiantao.zhang@intel.com> wrote:

>>> That maybe a problem if so. If a malicious/buggy guest won't EOI the
>>> MSI vector, so host may hang due to lack of timeout mechanism?
>> 
>> Why does host hang? Only the assigned interrupt will block, and that's
>> exactly what the guest wants :-)
> 
> Hypervisor shouldn't EOI the real vector until guest EOI the corresponding
> virtual vector , right ?  Not sure.:-)

If the EOI is via the local APIC, which I suppose it must be, then a timeout
fallback probably is required. This is because priorities are assigned
arbitrarily to guest interrupts, and a non-EOIed interrupt blocks any
lower-priority interrupts. In particular, some of those could be owned by
dom0 for example, and be quite critical to forward progress of the entire
system.

 -- Keir


* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  9:01               ` Qing He
@ 2009-10-16  9:42                 ` Qing He
  2009-10-16  9:49                 ` Zhang, Xiantao
  1 sibling, 0 replies; 55+ messages in thread
From: Qing He @ 2009-10-16  9:42 UTC (permalink / raw)
  To: Zhang, Xiantao; +Cc: Cinco, Dante, xen-devel, Keir Fraser

On Fri, 2009-10-16 at 17:01 +0800, Qing He wrote:
> Yes, it is the algorithm used today.
> 
> After reviewing the code, if the guest really does something like
> changing affinity within the window between an irq fire and eoi,
> there is indeed a problem, attached is the patch. Although I kinda
> doubt it, shouldn't desc->lock in guest protect and make these two
> operations mutual exclusive.
> 
> Dante,
> Can you see if this patch helps?

Please ignore this patch. I intended to use it to see if it could
confirm the analysis (at the cost of lost interrupts), but it may
actually introduce more severe problems.

Thanks,
Qing


* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  9:01               ` Qing He
  2009-10-16  9:42                 ` Qing He
@ 2009-10-16  9:49                 ` Zhang, Xiantao
  2009-10-16 14:54                   ` Zhang, Xiantao
  1 sibling, 1 reply; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-16  9:49 UTC (permalink / raw)
  To: He, Qing; +Cc: Cinco, Dante, xen-devel, Keir Fraser

[-- Attachment #1: Type: text/plain, Size: 2501 bytes --]

He, Qing wrote:
> On Fri, 2009-10-16 at 16:35 +0800, Zhang, Xiantao wrote:
>> He, Qing wrote:
>>> On Fri, 2009-10-16 at 16:22 +0800, Zhang, Xiantao wrote:
>>>> He, Qing wrote:
>>>>> On Fri, 2009-10-16 at 15:32 +0800, Zhang, Xiantao wrote:
>>>>>> According to the description, the issue should be caused by lost
>>>>>> EOI write for the MSI interrupt and leads to permanent interrupt
>>>>>> mask. There should be a race between guest setting new vector and
>>>>>> EOIs old vector for the interrupt.  Once guest sets new vector
>>>>>> before it EOIs the old vector, hypervisor can't find the pirq
>>>>>> which corresponds old vector(has changed
>>>>>> to new vector) , so also can't EOI the old vector forever in
>>>>>> hardware level. Since the corresponding vector in real processor
>>>>>> can't be EOIed, so system may lose all interrupts and result the
>>>>>> reported issues ultimately.
>>>>> 
>>>>>> But I remembered there should be a timer to handle this case
>>>>>> through a forcible EOI write to the real processor after timeout,
>>>>>> but seems it doesn't function in the expected way.
>>>>> 
>>>>> The EOI timer is supposed to deal with the irq sharing problem,
>>>>> since MSI doesn't share, this timer will not be started in the
>>>>> case of MSI.
>>>> 
>>>> That maybe a problem if so. If a malicious/buggy guest won't EOI
>>>> the MSI vector, so host may hang due to lack of timeout mechanism?
>>> 
>>> Why does host hang? Only the assigned interrupt will block, and
>>> that's exactly what the guest wants :-)
>> 
>> Hypervisor shouldn't EOI the real vector until guest EOI the
>> corresponding virtual vector , right ?  Not sure.:-)
> 
> Yes, it is the algorithm used today.

So it should still be a problem. If the guest won't do the EOI, the host can't do the EOI either, which leads to a system hang without a timeout mechanism. So we may need to introduce a timer for each MSI interrupt source to avoid hanging the host, Keir?

> After reviewing the code, if the guest really does something like
> changing affinity within the window between an irq fire and eoi,
> there is indeed a problem, attached is the patch. Although I kinda
> doubt it, shouldn't desc->lock in guest protect and make these two
> operations mutual exclusive.

We shouldn't let the hypervisor do the real EOI before the guest does the corresponding virtual EOI, so this patch may have a correctness issue. :-)

Attached is the fix according to my previous guess, and it should fix the issue.

Xiantao

[-- Attachment #2: fix-irq-affinity-msi.patch --]
[-- Type: application/octet-stream, Size: 4449 bytes --]

# HG changeset patch
# User Xiantao Zhang <xiantao.zhang@intel.com>
# Date 1255684803 -28800
# Node ID d1b3cb3fe044285093c923761d4bc40c7af4d199
# Parent  2eba302831c4534ac40283491f887263c7197b4a
x86: vMSI: Fix msi irq affinity issue for hvm guest.

There is a race between guest setting new vector and doing EOI on old vector.
Once guest sets new vector before its doing EOI on vector, when guest does eoi,
hypervisor may fail to find the related pirq, and hypervisor may miss to EOI real
vector and leads to system hang.  We may need to add a timer for each pirq interrupt
source to avoid host hang, but this is another topic, and will be addressed later.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>

diff -r 2eba302831c4 -r d1b3cb3fe044 xen/drivers/passthrough/io.c
--- a/xen/drivers/passthrough/io.c	Thu Oct 15 16:49:21 2009 +0100
+++ b/xen/drivers/passthrough/io.c	Fri Oct 16 17:20:03 2009 +0800
@@ -164,7 +164,11 @@ int pt_irq_create_bind_vtd(
         {
             hvm_irq_dpci->mirq[pirq].flags = HVM_IRQ_DPCI_MACH_MSI |
                                              HVM_IRQ_DPCI_GUEST_MSI;
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gvec =
+                hvm_irq_dpci->mirq[pirq].gmsi.gvec ?:pt_irq_bind->u.msi.gvec;
             hvm_irq_dpci->mirq[pirq].gmsi.gvec = pt_irq_bind->u.msi.gvec;
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gflags =
+                hvm_irq_dpci->mirq[pirq].gmsi.gflags ?:pt_irq_bind->u.msi.gflags;
             hvm_irq_dpci->mirq[pirq].gmsi.gflags = pt_irq_bind->u.msi.gflags;
             /* bind after hvm_irq_dpci is setup to avoid race with irq handler*/
             rc = pirq_guest_bind(d->vcpu[0], pirq, 0);
@@ -178,6 +182,7 @@ int pt_irq_create_bind_vtd(
             {
                 hvm_irq_dpci->mirq[pirq].gmsi.gflags = 0;
                 hvm_irq_dpci->mirq[pirq].gmsi.gvec = 0;
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gvec = 0;
                 hvm_irq_dpci->mirq[pirq].flags = 0;
                 clear_bit(pirq, hvm_irq_dpci->mapping);
                 spin_unlock(&d->event_lock);
@@ -195,7 +200,11 @@ int pt_irq_create_bind_vtd(
             }
  
             /* if pirq is already mapped as vmsi, update the guest data/addr */
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gvec =
+                hvm_irq_dpci->mirq[pirq].gmsi.gvec ?:pt_irq_bind->u.msi.gvec;
             hvm_irq_dpci->mirq[pirq].gmsi.gvec = pt_irq_bind->u.msi.gvec;
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gflags =
+                hvm_irq_dpci->mirq[pirq].gmsi.gflags ?:pt_irq_bind->u.msi.gflags;
             hvm_irq_dpci->mirq[pirq].gmsi.gflags = pt_irq_bind->u.msi.gflags;
         }
         /* Caculate dest_vcpu_id for MSI-type pirq migration */
@@ -424,14 +433,21 @@ void hvm_dpci_msi_eoi(struct domain *d, 
           pirq = find_next_bit(hvm_irq_dpci->mapping, d->nr_pirqs, pirq + 1) )
     {
         if ( (!(hvm_irq_dpci->mirq[pirq].flags & HVM_IRQ_DPCI_MACH_MSI)) ||
-                (hvm_irq_dpci->mirq[pirq].gmsi.gvec != vector) )
+                (hvm_irq_dpci->mirq[pirq].gmsi.gvec != vector &&
+                 hvm_irq_dpci->mirq[pirq].gmsi.old_gvec != vector) )
             continue;
 
-        dest = hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DEST_ID_MASK;
-        dest_mode = !!(hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DM_MASK);
+        if ( hvm_irq_dpci->mirq[pirq].gmsi.gvec == vector ) {
+            dest = hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DEST_ID_MASK;
+            dest_mode = !!(hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DM_MASK);
+        } else {
+            dest = hvm_irq_dpci->mirq[pirq].gmsi.old_gflags & VMSI_DEST_ID_MASK;
+            dest_mode = !!(hvm_irq_dpci->mirq[pirq].gmsi.old_gflags & VMSI_DM_MASK);
+        }
         if ( vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest, dest_mode) )
             break;
     }
+
     if ( pirq < d->nr_pirqs )
         __msi_pirq_eoi(d, pirq);
     spin_unlock(&d->event_lock);
diff -r 2eba302831c4 -r d1b3cb3fe044 xen/include/xen/hvm/irq.h
--- a/xen/include/xen/hvm/irq.h	Thu Oct 15 16:49:21 2009 +0100
+++ b/xen/include/xen/hvm/irq.h	Fri Oct 16 17:20:03 2009 +0800
@@ -58,8 +58,10 @@ struct dev_intx_gsi_link {
 #define GLFAGS_SHIFT_TRG_MODE       15
 
 struct hvm_gmsi_info {
-    uint32_t gvec;
+    uint16_t gvec;
+    uint16_t old_gvec;
     uint32_t gflags;
+    uint32_t old_gflags;
     int dest_vcpu_id; /* -1 :multi-dest, non-negative: dest_vcpu_id */
 };
 


* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  9:41               ` Keir Fraser
@ 2009-10-16  9:57                 ` Qing He
  2009-10-16  9:58                 ` Zhang, Xiantao
  1 sibling, 0 replies; 55+ messages in thread
From: Qing He @ 2009-10-16  9:57 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Cinco, Dante, xen-devel, Zhang, Xiantao

On Fri, 2009-10-16 at 17:41 +0800, Keir Fraser wrote:
> On 16/10/2009 09:35, "Zhang, Xiantao" <xiantao.zhang@intel.com> wrote:
> 
> >>> That maybe a problem if so. If a malicious/buggy guest won't EOI the
> >>> MSI vector, so host may hang due to lack of timeout mechanism?
> >> 
> >> Why does host hang? Only the assigned interrupt will block, and that's
> >> exactly what the guest wants :-)
> > 
> > Hypervisor shouldn't EOI the real vector until guest EOI the corresponding
> > virtual vector , right ?  Not sure.:-)
> 
> If the EOI is via the local APIC, which I suppose it must be, then a timeout
> fallback probably is required. This is because priorities are assigned
> arbitrarily to guest interrupts, and a non-EOIed interrupt blocks any
> lower-priority interrupts. In particular, some of those could be owned by
> dom0 for example, and be quite critical to forward progress of the entire
> system.

Yeah, I just came to realize it.

Thanks,
Qing


* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  9:41               ` Keir Fraser
  2009-10-16  9:57                 ` Qing He
@ 2009-10-16  9:58                 ` Zhang, Xiantao
  2009-10-16 10:21                   ` Jan Beulich
  1 sibling, 1 reply; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-16  9:58 UTC (permalink / raw)
  To: Keir Fraser, He, Qing; +Cc: Cinco, Dante, xen-devel

Keir Fraser wrote:
> On 16/10/2009 09:35, "Zhang, Xiantao" <xiantao.zhang@intel.com> wrote:
> 
>>>> That maybe a problem if so. If a malicious/buggy guest won't EOI
>>>> the MSI vector, so host may hang due to lack of timeout mechanism?
>>> 
>>> Why does host hang? Only the assigned interrupt will block, and
>>> that's exactly what the guest wants :-)
>> 
>> Hypervisor shouldn't EOI the real vector until guest EOI the
>> corresponding virtual vector , right ?  Not sure.:-)
> 
> If the EOI is via the local APIC, which I suppose it must be, then a
> timeout fallback probably is required. This is because priorities are
> assigned arbitrarily to guest interrupts, and a non-EOIed interrupt
> blocks any lower-priority interrupts. In particular, some of those
> could be owned by dom0 for example, and be quite critical to forward
> progress of the entire system.

Yeah, exactly my concern. We may need to add a timeout mechanism for each interrupt source to prevent buggy/malicious guests from hanging the host by not writing the EOI.
Xiantao


* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  9:58                 ` Zhang, Xiantao
@ 2009-10-16 10:21                   ` Jan Beulich
  0 siblings, 0 replies; 55+ messages in thread
From: Jan Beulich @ 2009-10-16 10:21 UTC (permalink / raw)
  To: Qing He, Xiantao Zhang; +Cc: Dante Cinco, xen-devel, Keir Fraser

>>> "Zhang, Xiantao" <xiantao.zhang@intel.com> 16.10.09 11:58 >>>
>Keir Fraser wrote:
>> If the EOI is via the local APIC, which I suppose it must be, then a
>> timeout fallback probably is required. This is because priorities are
>> assigned arbitrarily to guest interrupts, and a non-EOIed interrupt
>> blocks any lower-priority interrupts. In particular, some of those
>> could be owned by dom0 for example, and be quite critical to forward
>> progress of the entire system.
>
>Yeah, exactly same with my concern.  We may need to add the timeout
>mechanism for each interrupt source to avoid that buggy/malicious
>guests hang host through not writing EOI.  

But that's (supposed to be) happening already: If an MSI interrupt is
maskable, the interrupt gets masked and the EOI is sent immediately.
If it's not maskable, a timer gets started to issue the EOI if the guest
doesn't.

Jan


* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  9:49                 ` Zhang, Xiantao
@ 2009-10-16 14:54                   ` Zhang, Xiantao
  2009-10-16 18:24                     ` Cinco, Dante
  0 siblings, 1 reply; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-16 14:54 UTC (permalink / raw)
  To: Zhang, Xiantao, He, Qing; +Cc: Cinco, Dante, xen-devel, Keir Fraser

[-- Attachment #1: Type: text/plain, Size: 2970 bytes --]

Attached is a new one, which should eliminate the race entirely.
Xiantao 

[-- Attachment #2: fix-irq-affinity-msi3.patch --]
[-- Type: application/octet-stream, Size: 5997 bytes --]

# HG changeset patch
# User Xiantao Zhang <xiantao.zhang@intel.com>
# Date 1255684803 -28800
# Node ID d1b3cb3fe044285093c923761d4bc40c7af4d199
# Parent  2eba302831c4534ac40283491f887263c7197b4a
x86: vMSI: Fix msi irq affinity issue for hvm guest.

There is a race between guest setting new vector and doing EOI on old vector.
Once guest sets new vector before its doing EOI on vector, when guest does eoi,
hypervisor may fail to find the related pirq, and hypervisor may miss to EOI real
vector and leads to system hang.  We may need to add a timer for each pirq interrupt
source to avoid host hang, but this is another topic, and will be addressed later.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>

diff -r 2eba302831c4 xen/arch/x86/hvm/vmsi.c
--- a/xen/arch/x86/hvm/vmsi.c	Thu Oct 15 16:49:21 2009 +0100
+++ b/xen/arch/x86/hvm/vmsi.c	Fri Oct 16 22:10:36 2009 +0800
@@ -92,8 +92,11 @@ int vmsi_deliver(struct domain *d, int p
     case dest_LowestPrio:
     {
         target = vlapic_lowest_prio(d, NULL, 0, dest, dest_mode);
-        if ( target != NULL )
+        if ( target != NULL ) {
             vmsi_inj_irq(d, target, vector, trig_mode, delivery_mode);
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gvec =
+                                    hvm_irq_dpci->mirq[pirq].gmsi.gvec;
+        }
         else
             HVM_DBG_LOG(DBG_LEVEL_IOAPIC, "null round robin: "
                         "vector=%x delivery_mode=%x\n",
@@ -106,9 +109,12 @@ int vmsi_deliver(struct domain *d, int p
     {
         for_each_vcpu ( d, v )
             if ( vlapic_match_dest(vcpu_vlapic(v), NULL,
-                                   0, dest, dest_mode) )
+                                   0, dest, dest_mode) ) {
                 vmsi_inj_irq(d, vcpu_vlapic(v),
                              vector, trig_mode, delivery_mode);
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gvec =
+                                    hvm_irq_dpci->mirq[pirq].gmsi.gvec;
+            }
         break;
     }
 
diff -r 2eba302831c4 xen/drivers/passthrough/io.c
--- a/xen/drivers/passthrough/io.c	Thu Oct 15 16:49:21 2009 +0100
+++ b/xen/drivers/passthrough/io.c	Fri Oct 16 21:54:55 2009 +0800
@@ -164,7 +164,9 @@ int pt_irq_create_bind_vtd(
         {
             hvm_irq_dpci->mirq[pirq].flags = HVM_IRQ_DPCI_MACH_MSI |
                                              HVM_IRQ_DPCI_GUEST_MSI;
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gvec = pt_irq_bind->u.msi.gvec;
             hvm_irq_dpci->mirq[pirq].gmsi.gvec = pt_irq_bind->u.msi.gvec;
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gflags = pt_irq_bind->u.msi.gflags;
             hvm_irq_dpci->mirq[pirq].gmsi.gflags = pt_irq_bind->u.msi.gflags;
             /* bind after hvm_irq_dpci is setup to avoid race with irq handler*/
             rc = pirq_guest_bind(d->vcpu[0], pirq, 0);
@@ -178,6 +180,8 @@ int pt_irq_create_bind_vtd(
             {
                 hvm_irq_dpci->mirq[pirq].gmsi.gflags = 0;
                 hvm_irq_dpci->mirq[pirq].gmsi.gvec = 0;
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gvec = 0;
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gflags = 0;
                 hvm_irq_dpci->mirq[pirq].flags = 0;
                 clear_bit(pirq, hvm_irq_dpci->mapping);
                 spin_unlock(&d->event_lock);
@@ -195,8 +199,14 @@ int pt_irq_create_bind_vtd(
             }
  
             /* if pirq is already mapped as vmsi, update the guest data/addr */
-            hvm_irq_dpci->mirq[pirq].gmsi.gvec = pt_irq_bind->u.msi.gvec;
-            hvm_irq_dpci->mirq[pirq].gmsi.gflags = pt_irq_bind->u.msi.gflags;
+            if ( hvm_irq_dpci->mirq[pirq].gmsi.gvec != pt_irq_bind->u.msi.gvec ) {
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gvec =
+                                    hvm_irq_dpci->mirq[pirq].gmsi.gvec;
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gflags =
+                                    hvm_irq_dpci->mirq[pirq].gmsi.gflags;
+                hvm_irq_dpci->mirq[pirq].gmsi.gvec = pt_irq_bind->u.msi.gvec;
+                hvm_irq_dpci->mirq[pirq].gmsi.gflags = pt_irq_bind->u.msi.gflags;
+            }
         }
         /* Caculate dest_vcpu_id for MSI-type pirq migration */
         dest = hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DEST_ID_MASK;
@@ -424,14 +434,21 @@ void hvm_dpci_msi_eoi(struct domain *d, 
           pirq = find_next_bit(hvm_irq_dpci->mapping, d->nr_pirqs, pirq + 1) )
     {
         if ( (!(hvm_irq_dpci->mirq[pirq].flags & HVM_IRQ_DPCI_MACH_MSI)) ||
-                (hvm_irq_dpci->mirq[pirq].gmsi.gvec != vector) )
+                (hvm_irq_dpci->mirq[pirq].gmsi.gvec != vector &&
+                 hvm_irq_dpci->mirq[pirq].gmsi.old_gvec != vector) )
             continue;
 
-        dest = hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DEST_ID_MASK;
-        dest_mode = !!(hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DM_MASK);
+        if ( hvm_irq_dpci->mirq[pirq].gmsi.gvec == vector ) {
+            dest = hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DEST_ID_MASK;
+            dest_mode = !!(hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DM_MASK);
+        } else {
+            dest = hvm_irq_dpci->mirq[pirq].gmsi.old_gflags & VMSI_DEST_ID_MASK;
+            dest_mode = !!(hvm_irq_dpci->mirq[pirq].gmsi.old_gflags & VMSI_DM_MASK);
+        }
         if ( vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest, dest_mode) )
             break;
     }
+
     if ( pirq < d->nr_pirqs )
         __msi_pirq_eoi(d, pirq);
     spin_unlock(&d->event_lock);
diff -r 2eba302831c4 xen/include/xen/hvm/irq.h
--- a/xen/include/xen/hvm/irq.h	Thu Oct 15 16:49:21 2009 +0100
+++ b/xen/include/xen/hvm/irq.h	Fri Oct 16 21:48:04 2009 +0800
@@ -58,8 +58,10 @@ struct dev_intx_gsi_link {
 #define GLFAGS_SHIFT_TRG_MODE       15
 
 struct hvm_gmsi_info {
-    uint32_t gvec;
+    uint16_t gvec;
+    uint16_t old_gvec;
     uint32_t gflags;
+    uint32_t old_gflags;
     int dest_vcpu_id; /* -1 :multi-dest, non-negative: dest_vcpu_id */
 };
 


* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16 14:54                   ` Zhang, Xiantao
@ 2009-10-16 18:24                     ` Cinco, Dante
  2009-10-17  0:59                       ` Zhang, Xiantao
  0 siblings, 1 reply; 55+ messages in thread
From: Cinco, Dante @ 2009-10-16 18:24 UTC (permalink / raw)
  To: Zhang, Xiantao, He, Qing; +Cc: Keir, xen-devel, Fraser

Xiantao,
I'm still losing the interrupts with your patch, but I see some differences. To simplify the data, I'm only going to focus on the first function of my 4-function PCI device.

After changing the IRQ affinity, the IRQ is no longer masked (unlike before the patch). What stands out for me is that the new vector (219) reported by "guest interrupt information" does not match the vector (187) in dom0 lspci. Before the patch, the new vector in "guest interrupt information" matched the new vector in dom0 lspci (the dest ID in dom0 lspci was unchanged). I also saw this message pop up on the Xen console when I changed smp_affinity:

(XEN) do_IRQ: 1.187 No irq handler for vector (irq -1).

187 is the vector from dom0 lspci both before and after the smp_affinity change, but "guest interrupt information" reports that the new vector is 219. To me, this looks like the new MSI message data (with vector=219) did not get written to the PCI device, right?

Here's a comparison before and after changing smp_affinity from ffff to 2 (dom0 is pvops 2.6.31.1, domU is 2.6.30.1):

------------------------------------------------------------------------

/proc/irq/48/smp_affinity=ffff (default):

dom0 lspci: Address: 00000000fee00000  Data: 40bb (vector=187)

domU lspci: Address: 00000000fee00000  Data: 4071 (vector=113)

qemu-dm-dpm.log: pt_msi_setup: msi mapped with pirq 4f (79)
                 pt_msi_update: Update msi with pirq 4f gvec 71 gflags 0

Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000001, Vec:187 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 79(----)

Xen console: (XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe: bdf = 7:0.0
             (XEN) [VT-D]iommu.c:1175:d0 domain_context_mapping:PCIe: bdf = 7:0.0
             (XEN) [VT-D]io.c:301:d0 VT-d irq bind: m_irq = 4f device = 5 intx = 0
             (XEN) io.c:326:d0 pt_irq_destroy_bind_vtd: machine_gsi=79 guest_gsi=36, device=5, intx=0
             (XEN) io.c:381:d0 XEN_DOMCTL_irq_unmapping: m_irq = 0x4f device = 0x5 intx = 0x0

------------------------------------------------------------------------

/proc/irq/48/smp_affinity=2:

dom0 lspci: Address: 00000000fee10000  Data: 40bb (dest ID changed from 0 (APIC ID of CPU0) to 16 (APIC ID of CPU1), vector unchanged)

domU lspci: Address: 00000000fee02000  Data: 40b1 (dest ID changed from 0 (APIC ID of CPU0) to 2 (APIC ID of CPU1), new vector=177)

Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000002, Vec:219 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 79(----)

qemu-dm-dpm.log: pt_msi_update: Update msi with pirq 4f gvec 71 gflags 2
                 pt_msi_update: Update msi with pirq 4f gvec b1 gflags 2

------------------------------------------------------------------------


* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16 18:24                     ` Cinco, Dante
@ 2009-10-17  0:59                       ` Zhang, Xiantao
  2009-10-20  0:19                         ` Cinco, Dante
  0 siblings, 1 reply; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-17  0:59 UTC (permalink / raw)
  To: Cinco, Dante, He, Qing; +Cc: xen-devel, Fraser

Dante,
It should be another issue, as you described. Can you try the following code to see whether it works for you? Just a try.
Xiantao

diff -r 0705efd9c69e xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c    Fri Oct 16 09:04:53 2009 +0100
+++ b/xen/arch/x86/hvm/hvm.c    Sat Oct 17 08:48:23 2009 +0800
@@ -243,7 +243,7 @@ void hvm_migrate_pirqs(struct vcpu *v)
             continue;
         irq = desc - irq_desc;
         ASSERT(MSI_IRQ(irq));
-        desc->handler->set_affinity(irq, *cpumask_of(v->processor));
+        //desc->handler->set_affinity(irq, *cpumask_of(v->processor));
         spin_unlock_irq(&desc->lock);
     }
     spin_unlock(&d->event_lock);

>>>>>> can't EOI the old vector forever in hardware level. Since the 
>>>>>> corresponding vector in real processor can't be EOIed, so system 
>>>>>> may lose all interrupts and result the reported issues 
>>>>>> ultimately.
>>>>> 
>>>>>> But I remembered there should be a timer to handle this case 
>>>>>> through a forcible EOI write to the real processor after timeout, 
>>>>>> but seems it doesn't function in the expected way.
>>>>> 
>>>>> The EOI timer is supposed to deal with the irq sharing problem, 
>>>>> since MSI doesn't share, this timer will not be started in the 
>>>>> case of MSI.
>>>> 
>>>> That maybe a problem if so. If a malicious/buggy guest won't EOI 
>>>> the MSI vector, so host may hang due to lack of timeout mechanism?
>>> 
>>> Why does host hang? Only the assigned interrupt will block, and 
>>> that's exactly what the guest wants :-)
>> 
>> Hypervisor shouldn't EOI the real vector until guest EOI the 
>> corresponding virtual vector , right ?  Not sure.:-)
> 
> Yes, it is the algorithm used today.

So it should be still a problem. If guest won't do eoi, host can't do eoi also, and leads to system hang without timeout mechanism. So we may need to introduce a timer for each MSI interrupt source to avoid hanging host, Keir? 

> After reviewing the code, if the guest really does something like 
> changing affinity within the window between an irq fire and eoi, there 
> is indeed a problem, attached is the patch. Although I kinda doubt it, 
> shouldn't desc->lock in guest protect and make these two operations 
> mutual exclusive.

We shouldn't let hypervisor do real EOI before guest does the correponding virtual EOI, so this patch maybe have a correctness issue. :-)

Attached the fix according to my privious guess, and it should fix the issue. 

Xiantao
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-17  0:59                       ` Zhang, Xiantao
@ 2009-10-20  0:19                         ` Cinco, Dante
  2009-10-20  5:46                           ` Zhang, Xiantao
  0 siblings, 1 reply; 55+ messages in thread
From: Cinco, Dante @ 2009-10-20  0:19 UTC (permalink / raw)
  To: Zhang, Xiantao, He, Qing; +Cc: Keir, xen-devel, Fraser

Xiantao,
With vcpus=16 (all CPUs) in domU, I'm able to change the IRQ smp_affinity to any one-hot value and see the interrupts routed to the specified CPU. Every now and then, though, both domU and dom0 will permanently lock up (cold reboot required) after changing the smp_affinity. If I change it manually from the command line, it seems to be okay, but if I change it within a script (such as shifting-left a walking "1" to test all 16 CPUs), it will lock up part way through the script.
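
For reference, a stand-alone reproducer along those lines could look like the sketch below (this is not the actual script used; the IRQ number, vcpu count and one-second delay are assumptions):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Walk a one-hot CPU mask across all vcpus by writing to
 * /proc/irq/<irq>/smp_affinity, pausing between steps so the point of
 * any lockup can be observed. */
int main(int argc, char **argv)
{
    int irq = (argc > 1) ? atoi(argv[1]) : 48;    /* e.g. domU IRQ 48 */
    int vcpus = (argc > 2) ? atoi(argv[2]) : 16;
    char path[64];

    snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
    for (int cpu = 0; cpu < vcpus; cpu++) {
        FILE *f = fopen(path, "w");
        if (!f) { perror(path); return 1; }
        fprintf(f, "%x\n", 1u << cpu);            /* walking "1" mask */
        fclose(f);
        printf("IRQ %d -> CPU %d (mask %x)\n", irq, cpu, 1u << cpu);
        sleep(1);
    }
    return 0;
}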

Other observations:

The MSI message address/data in dom0 "lspci -vv" stays the same, as does the "guest interrupt information" from the Xen console, even though I see the destination ID and vector change in domU "lspci -vv." You're probably expecting this behavior since you removed the set_affinity call in the last patch.

With vcpus=5, I can only change smp_affinity to 1. Any other value aside from 1 or 1f (the default) results in an instant, permanent lockup of both domU and dom0 (the Xen console is still accessible). I also observed that when I tried changing the smp_affinity of the first function of the 4-function PCI device to 2, the 3rd and 4th functions got masked:

(XEN)    IRQ: 66, IRQ affinity:0x00000001, Vec:186 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 79(----)
(XEN)    IRQ: 67, IRQ affinity:0x00000001, Vec:194 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 78(----)
(XEN)    IRQ: 68, IRQ affinity:0x00000001, Vec:202 type=PCI-MSI status=00000010 in-flight=1 domain-list=1: 77(---M)
(XEN)    IRQ: 69, IRQ affinity:0x00000001, Vec:210 type=PCI-MSI status=00000010 in-flight=1 domain-list=1: 76(---M)

In the above log, I had changed the smp_affinity for IRQ 66 but IRQ 68 and 69 got masked.

Dante

-----Original Message-----
From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com] 
Sent: Friday, October 16, 2009 5:59 PM
To: Cinco, Dante; He, Qing
Cc: xen-devel@lists.xensource.com; Fraser; Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

 Dante
 It should be another issue as you described.  Can you try the following code to see whether it works for you ?  Just a try.  
Xiantao

diff -r 0705efd9c69e xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c    Fri Oct 16 09:04:53 2009 +0100
+++ b/xen/arch/x86/hvm/hvm.c    Sat Oct 17 08:48:23 2009 +0800
@@ -243,7 +243,7 @@ void hvm_migrate_pirqs(struct vcpu *v)
             continue;
         irq = desc - irq_desc;
         ASSERT(MSI_IRQ(irq));
-        desc->handler->set_affinity(irq, *cpumask_of(v->processor));
+        //desc->handler->set_affinity(irq, *cpumask_of(v->processor));
         spin_unlock_irq(&desc->lock);
     }
     spin_unlock(&d->event_lock);

-----Original Message-----
From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Cinco, Dante
Sent: Saturday, October 17, 2009 2:24 AM
To: Zhang, Xiantao; He, Qing
Cc: Keir; xen-devel@lists.xensource.com; Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

Xiantao,
I'm still losing the interrupts with your patch but I see some differences. To simplifiy the data, I'm only going to focus on the first function of my 4-function PCI device.

After changing the IRQ affinity, the IRQ is not masked anymore (unlike before the patch). What stands out for me is the new vector (219) as reported by "guest interrupt information" does not match the vector (187) in dom0 lspci. Before the patch, the new vector in "guest interrupt information" matched the new vector in dom0 lspci (dest ID in dom0 lspci was unchanged). I also saw this message pop on the Xen console when I changed smp_affinity:

(XEN) do_IRQ: 1.187 No irq handler for vector (irq -1).

187 is the vector from dom0 lspci before and after the smp_affinity change but "guest interrupt information" reports the new vector is 219. To me, this looks like the new MSI message data (with vector=219) did not get written into the PCI device, right?

Here's a comparison before and after changing smp_affinity from ffff to 2 (dom0 is pvops 2.6.31.1, domU is 2.6.30.1):

------------------------------------------------------------------------

/proc/irq/48/smp_affinity=ffff (default):

dom0 lspci: Address: 00000000fee00000  Data: 40bb (vector=187)

domU lspci: Address: 00000000fee00000  Data: 4071 (vector=113)

qemu-dm-dpm.log: pt_msi_setup: msi mapped with pirq 4f (79)
                 pt_msi_update: Update msi with pirq 4f gvec 71 gflags 0

Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000001, Vec:187 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 79(----)

Xen console: (XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe: bdf = 7:0.0
             (XEN) [VT-D]iommu.c:1175:d0 domain_context_mapping:PCIe: bdf = 7:0.0
             (XEN) [VT-D]io.c:301:d0 VT-d irq bind: m_irq = 4f device = 5 intx = 0
             (XEN) io.c:326:d0 pt_irq_destroy_bind_vtd: machine_gsi=79 guest_gsi=36, device=5, intx=0
             (XEN) io.c:381:d0 XEN_DOMCTL_irq_unmapping: m_irq = 0x4f device = 0x5 intx = 0x0

------------------------------------------------------------------------

/proc/irq/48/smp_affinity=2:

dom0 lspci: Address: 00000000fee10000  Data: 40bb (dest ID changed from 0 (APIC ID of CPU0) to 16 (APIC ID of CPU1), vector unchanged)

domU lspci: Address: 00000000fee02000  Data: 40b1 (dest ID changed from 0 (APIC ID of CPU0) to 2 (APIC ID of CPU1), new vector=177)

Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000002, Vec:219 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 79(----)

qemu-dm-dpm.log: pt_msi_update: Update msi with pirq 4f gvec 71 gflags 2
                 pt_msi_update: Update msi with pirq 4f gvec b1 gflags 2

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-20  0:19                         ` Cinco, Dante
@ 2009-10-20  5:46                           ` Zhang, Xiantao
  2009-10-20  7:51                             ` Zhang, Xiantao
  0 siblings, 1 reply; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-20  5:46 UTC (permalink / raw)
  To: Cinco, Dante, He, Qing; +Cc: Keir, xen-devel, Fraser

Cinco, Dante wrote:
> Xiantao,
> With vcpus=16 (all CPUs) in domU, I'm able to change the IRQ
> smp_affinity to any one-hot value and see the interrupts routed to
> the specified CPU. Every now and then though, both domU and dom0 will
> permanently lockup (cold reboot required) after changing the
> smp_affinity. If I change it manually via command-line, it seems to
> be okay but if I change it within a script (such as shifting-left a
> walking "1" to test all 16 CPUs), it will lockup part way through the
> script. 

I can't reproduce the failure on my side after applying the patches, even with a similar script that changes the IRQ's affinity.  Could you share your script with me? 



> Other observations:
> 
> In the above log, I had changed the smp_affinity for IRQ 66 but IRQ
> 68 and 69 got masked. 

We can see the warning "No irq handler for vector", but it shouldn't hang the host; it may be related to another potential issue and may need further investigation.  

Xiantao

> -----Original Message-----
> From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com]
> Sent: Friday, October 16, 2009 5:59 PM
> To: Cinco, Dante; He, Qing
> Cc: xen-devel@lists.xensource.com; Fraser; Fraser
> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
> > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem) 
> 
>  Dante
>  It should be another issue as you described.  Can you try the
> following code to see whether it works for you ?  Just a try. 
> Xiantao
> 
> diff -r 0705efd9c69e xen/arch/x86/hvm/hvm.c
> --- a/xen/arch/x86/hvm/hvm.c    Fri Oct 16 09:04:53 2009 +0100
> +++ b/xen/arch/x86/hvm/hvm.c    Sat Oct 17 08:48:23 2009 +0800
> @@ -243,7 +243,7 @@ void hvm_migrate_pirqs(struct vcpu *v)
>              continue;
>          irq = desc - irq_desc;
>          ASSERT(MSI_IRQ(irq));
> -        desc->handler->set_affinity(irq, *cpumask_of(v->processor));
> +        //desc->handler->set_affinity(irq,
>          *cpumask_of(v->processor)); spin_unlock_irq(&desc->lock);
>      }
>      spin_unlock(&d->event_lock);
> 
> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com
> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Cinco,
> Dante  
> Sent: Saturday, October 17, 2009 2:24 AM
> To: Zhang, Xiantao; He, Qing
> Cc: Keir; xen-devel@lists.xensource.com; Fraser
> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
> > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem) 
> 
> Xiantao,
> I'm still losing the interrupts with your patch but I see some
> differences. To simplifiy the data, I'm only going to focus on the
> first function of my 4-function PCI device.  
> 
> After changing the IRQ affinity, the IRQ is not masked anymore
> (unlike before the patch). What stands out for me is the new vector
> (219) as reported by "guest interrupt information" does not match the
> vector (187) in dom0 lspci. Before the patch, the new vector in
> "guest interrupt information" matched the new vector in dom0 lspci
> (dest ID in dom0 lspci was unchanged). I also saw this message pop on
> the Xen console when I changed smp_affinity:      
> 
> (XEN) do_IRQ: 1.187 No irq handler for vector (irq -1).
> 
> 187 is the vector from dom0 lspci before and after the smp_affinity
> change but "guest interrupt information" reports the new vector is
> 219. To me, this looks like the new MSI message data (with
> vector=219) did not get written into the PCI device, right?   
> 
> Here's a comparison before and after changing smp_affinity from ffff
> to 2 (dom0 is pvops 2.6.31.1, domU is 2.6.30.1): 
> 
> ------------------------------------------------------------------------
> 
> /proc/irq/48/smp_affinity=ffff (default):
> 
> dom0 lspci: Address: 00000000fee00000  Data: 40bb (vector=187)
> 
> domU lspci: Address: 00000000fee00000  Data: 4071 (vector=113)
> 
> qemu-dm-dpm.log: pt_msi_setup: msi mapped with pirq 4f (79)
>                  pt_msi_update: Update msi with pirq 4f gvec 71
> gflags 0 
> 
> Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000001,
> Vec:187 type=PCI-MSI status=00000010 in-flight=0 domain-list=1:
> 79(----)  
> 
> Xen console: (XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe:
>              bdf = 7:0.0 (XEN) [VT-D]iommu.c:1175:d0
>              domain_context_mapping:PCIe: bdf = 7:0.0 (XEN)
>              [VT-D]io.c:301:d0 VT-d irq bind: m_irq = 4f device = 5
>              intx = 0 (XEN) io.c:326:d0 pt_irq_destroy_bind_vtd:
> machine_gsi=79 guest_gsi=36, device=5, intx=0 (XEN) io.c:381:d0
> XEN_DOMCTL_irq_unmapping: m_irq = 0x4f device = 0x5 intx = 0x0  
> 
> ------------------------------------------------------------------------
> 
> /proc/irq/48/smp_affinity=2:
> 
> dom0 lspci: Address: 00000000fee10000  Data: 40bb (dest ID changed
> from 0 (APIC ID of CPU0) to 16 (APIC ID of CPU1), vector unchanged) 
> 
> domU lspci: Address: 00000000fee02000  Data: 40b1 (dest ID changed
> from 0 (APIC ID of CPU0) to 2 (APIC ID of CPU1), new vector=177) 
> 
> Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000002,
> Vec:219 type=PCI-MSI status=00000010 in-flight=0 domain-list=1:
> 79(----)  
> 
> qemu-dm-dpm.log: pt_msi_update: Update msi with pirq 4f gvec 71
>                  gflags 2 pt_msi_update: Update msi with pirq 4f gvec
> b1 gflags 2 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-20  5:46                           ` Zhang, Xiantao
@ 2009-10-20  7:51                             ` Zhang, Xiantao
  2009-10-20 17:26                               ` Cinco, Dante
  2009-10-22  6:46                               ` Jan Beulich
  0 siblings, 2 replies; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-20  7:51 UTC (permalink / raw)
  To: Zhang, Xiantao, Cinco, Dante, He, Qing; +Cc: xen-devel, Fraser

[-- Attachment #1: Type: text/plain, Size: 6370 bytes --]

The two attached patches should fix the issues. For the issue that complains "(XEN) do_IRQ: 1.187 No irq handler for vector (irq -1)", I root-caused it.  Currently, when programming the MSI address & data, Xen doesn't perform the mask/unmask logic needed to avoid inconsistent interrupt generation. In this case, according to the spec, the interrupt generation behavior is undefined, and the device may generate MSI interrupts with the expected vector but an incorrect destination ID, which leads to the issue.  The two attached patches should address it. 
Fix-irq-affinity-msi3.patch:  same as the previous post.
Mask_msi_irq_when_programe_it.patch : mask the irq while programming the MSI registers. 

Xiantao


Zhang, Xiantao wrote:
> Cinco, Dante wrote:
>> Xiantao,
>> With vcpus=16 (all CPUs) in domU, I'm able to change the IRQ
>> smp_affinity to any one-hot value and see the interrupts routed to
>> the specified CPU. Every now and then though, both domU and dom0 will
>> permanently lockup (cold reboot required) after changing the
>> smp_affinity. If I change it manually via command-line, it seems to
>> be okay but if I change it within a script (such as shifting-left a
>> walking "1" to test all 16 CPUs), it will lockup part way through the
>> script.
> 
> I can't reproduce the failure at my side after applying the patches
> even with a similar script which changes irq's affinity.  Could you
> share your script with me ?  
> 
> 
> 
>> Other observations:
>> 
>> In the above log, I had changed the smp_affinity for IRQ 66 but IRQ
>> 68 and 69 got masked.
> 
> We can see the warning as "No irq handler for vector" but it
> shouldn't hang host, and it maybe related to another potential issue,
> and maybe need further investigation.  
> 
> Xiantao
> 
>> -----Original Message-----
>> From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com]
>> Sent: Friday, October 16, 2009 5:59 PM
>> To: Cinco, Dante; He, Qing
>> Cc: xen-devel@lists.xensource.com; Fraser; Fraser
>> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
>>> 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
>> 
>>  Dante
>>  It should be another issue as you described.  Can you try the
>> following code to see whether it works for you ?  Just a try. Xiantao
>> 
>> diff -r 0705efd9c69e xen/arch/x86/hvm/hvm.c
>> --- a/xen/arch/x86/hvm/hvm.c    Fri Oct 16 09:04:53 2009 +0100
>> +++ b/xen/arch/x86/hvm/hvm.c    Sat Oct 17 08:48:23 2009 +0800
>> @@ -243,7 +243,7 @@ void hvm_migrate_pirqs(struct vcpu *v)          
>>          continue; irq = desc - irq_desc;
>>          ASSERT(MSI_IRQ(irq));
>> -        desc->handler->set_affinity(irq, *cpumask_of(v->processor));
>> +        //desc->handler->set_affinity(irq,
>>          *cpumask_of(v->processor)); spin_unlock_irq(&desc->lock);  
>>      } spin_unlock(&d->event_lock);
>> 
>> -----Original Message-----
>> From: xen-devel-bounces@lists.xensource.com
>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Cinco,
>> Dante Sent: Saturday, October 17, 2009 2:24 AM
>> To: Zhang, Xiantao; He, Qing
>> Cc: Keir; xen-devel@lists.xensource.com; Fraser
>> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
>>> 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
>> 
>> Xiantao,
>> I'm still losing the interrupts with your patch but I see some
>> differences. To simplifiy the data, I'm only going to focus on the
>> first function of my 4-function PCI device.
>> 
>> After changing the IRQ affinity, the IRQ is not masked anymore
>> (unlike before the patch). What stands out for me is the new vector
>> (219) as reported by "guest interrupt information" does not match the
>> vector (187) in dom0 lspci. Before the patch, the new vector in
>> "guest interrupt information" matched the new vector in dom0 lspci
>> (dest ID in dom0 lspci was unchanged). I also saw this message pop on
>> the Xen console when I changed smp_affinity:
>> 
>> (XEN) do_IRQ: 1.187 No irq handler for vector (irq -1).
>> 
>> 187 is the vector from dom0 lspci before and after the smp_affinity
>> change but "guest interrupt information" reports the new vector is
>> 219. To me, this looks like the new MSI message data (with
>> vector=219) did not get written into the PCI device, right?
>> 
>> Here's a comparison before and after changing smp_affinity from ffff
>> to 2 (dom0 is pvops 2.6.31.1, domU is 2.6.30.1):
>> 
>> ------------------------------------------------------------------------
>> 
>> /proc/irq/48/smp_affinity=ffff (default):
>> 
>> dom0 lspci: Address: 00000000fee00000  Data: 40bb (vector=187)
>> 
>> domU lspci: Address: 00000000fee00000  Data: 4071 (vector=113)
>> 
>> qemu-dm-dpm.log: pt_msi_setup: msi mapped with pirq 4f (79)
>>                  pt_msi_update: Update msi with pirq 4f gvec 71
>> gflags 0 
>> 
>> Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000001,
>> Vec:187 type=PCI-MSI status=00000010 in-flight=0 domain-list=1:
>> 79(----) 
>> 
>> Xen console: (XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe:
>>              bdf = 7:0.0 (XEN) [VT-D]iommu.c:1175:d0
>>              domain_context_mapping:PCIe: bdf = 7:0.0 (XEN)
>>              [VT-D]io.c:301:d0 VT-d irq bind: m_irq = 4f device = 5
>>              intx = 0 (XEN) io.c:326:d0 pt_irq_destroy_bind_vtd:
>> machine_gsi=79 guest_gsi=36, device=5, intx=0 (XEN) io.c:381:d0
>> XEN_DOMCTL_irq_unmapping: m_irq = 0x4f device = 0x5 intx = 0x0
>> 
>> ------------------------------------------------------------------------
>> 
>> /proc/irq/48/smp_affinity=2:
>> 
>> dom0 lspci: Address: 00000000fee10000  Data: 40bb (dest ID changed
>> from 0 (APIC ID of CPU0) to 16 (APIC ID of CPU1), vector unchanged)
>> 
>> domU lspci: Address: 00000000fee02000  Data: 40b1 (dest ID changed
>> from 0 (APIC ID of CPU0) to 2 (APIC ID of CPU1), new vector=177)
>> 
>> Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000002,
>> Vec:219 type=PCI-MSI status=00000010 in-flight=0 domain-list=1:
>> 79(----) 
>> 
>> qemu-dm-dpm.log: pt_msi_update: Update msi with pirq 4f gvec 71
>>                  gflags 2 pt_msi_update: Update msi with pirq 4f gvec
>> b1 gflags 2
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel


[-- Attachment #2: mask_msi_irq_when_programe_it.patch --]
[-- Type: application/octet-stream, Size: 1762 bytes --]

# HG changeset patch
# User Xiantao Zhang <xiantao.zhang@intel.com>
# Date 1256023207 -28800
# Node ID dcfa6155b692a3f7ebfd2dc7db0335502f72c698
# Parent  7c65f0cdb1903ae1e3b8ecb9da5cf51699098ee3
x86: MSI: Mask/unmask the MSI irq during the window that
programs the MSI registers.

When programming MSI, it has to be masked first; otherwise it
may generate inconsistent interrupts. According to the spec,
if not masked, the interrupt generation behavior is undefined.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>

diff -r 7c65f0cdb190 -r dcfa6155b692 xen/arch/x86/msi.c
--- a/xen/arch/x86/msi.c    Tue Oct 20 15:02:10 2009 +0800
+++ b/xen/arch/x86/msi.c    Tue Oct 20 15:20:07 2009 +0800
@@ -231,6 +231,7 @@ static void write_msi_msg(struct msi_des
         u8 slot = PCI_SLOT(dev->devfn);
         u8 func = PCI_FUNC(dev->devfn);
 
+        mask_msi_irq(entry->irq);
         pci_conf_write32(bus, slot, func, msi_lower_address_reg(pos),
                          msg->address_lo);
         if ( entry->msi_attrib.is_64 )
@@ -243,6 +244,7 @@ static void write_msi_msg(struct msi_des
         else
             pci_conf_write16(bus, slot, func, msi_data_reg(pos, 0),
                              msg->data);
+        unmask_msi_irq(entry->irq);
         break;
     }
     case PCI_CAP_ID_MSIX:
@@ -250,11 +252,13 @@ static void write_msi_msg(struct msi_des
         void __iomem *base;
         base = entry->mask_base;
 
+        mask_msi_irq(entry->irq);
         writel(msg->address_lo,
                base + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
         writel(msg->address_hi,
                base + PCI_MSIX_ENTRY_UPPER_ADDR_OFFSET);
         writel(msg->data, base + PCI_MSIX_ENTRY_DATA_OFFSET);
+        unmask_msi_irq(entry->irq);
         break;
     }
     default:

[-- Attachment #3: fix-irq-affinity-msi3.patch --]
[-- Type: application/octet-stream, Size: 5997 bytes --]

# HG changeset patch
# User Xiantao Zhang <xiantao.zhang@intel.com>
# Date 1255684803 -28800
# Node ID d1b3cb3fe044285093c923761d4bc40c7af4d199
# Parent  2eba302831c4534ac40283491f887263c7197b4a
x86: vMSI: Fix msi irq affinity issue for hvm guest.

There is a race between the guest setting a new vector and doing the EOI on the old vector.
If the guest sets the new vector before it EOIs the old one, then when the guest does the EOI,
the hypervisor may fail to find the related pirq, miss the EOI of the real
vector, and hang the system.  We may need to add a timer for each pirq interrupt
source to avoid hanging the host, but that is another topic and will be addressed later.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>

diff -r 2eba302831c4 xen/arch/x86/hvm/vmsi.c
--- a/xen/arch/x86/hvm/vmsi.c	Thu Oct 15 16:49:21 2009 +0100
+++ b/xen/arch/x86/hvm/vmsi.c	Fri Oct 16 22:10:36 2009 +0800
@@ -92,8 +92,11 @@ int vmsi_deliver(struct domain *d, int p
     case dest_LowestPrio:
     {
         target = vlapic_lowest_prio(d, NULL, 0, dest, dest_mode);
-        if ( target != NULL )
+        if ( target != NULL ) {
             vmsi_inj_irq(d, target, vector, trig_mode, delivery_mode);
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gvec =
+                                    hvm_irq_dpci->mirq[pirq].gmsi.gvec;
+        }
         else
             HVM_DBG_LOG(DBG_LEVEL_IOAPIC, "null round robin: "
                         "vector=%x delivery_mode=%x\n",
@@ -106,9 +109,12 @@ int vmsi_deliver(struct domain *d, int p
     {
         for_each_vcpu ( d, v )
             if ( vlapic_match_dest(vcpu_vlapic(v), NULL,
-                                   0, dest, dest_mode) )
+                                   0, dest, dest_mode) ) {
                 vmsi_inj_irq(d, vcpu_vlapic(v),
                              vector, trig_mode, delivery_mode);
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gvec =
+                                    hvm_irq_dpci->mirq[pirq].gmsi.gvec;
+            }
         break;
     }
 
diff -r 2eba302831c4 xen/drivers/passthrough/io.c
--- a/xen/drivers/passthrough/io.c	Thu Oct 15 16:49:21 2009 +0100
+++ b/xen/drivers/passthrough/io.c	Fri Oct 16 21:54:55 2009 +0800
@@ -164,7 +164,9 @@ int pt_irq_create_bind_vtd(
         {
             hvm_irq_dpci->mirq[pirq].flags = HVM_IRQ_DPCI_MACH_MSI |
                                              HVM_IRQ_DPCI_GUEST_MSI;
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gvec = pt_irq_bind->u.msi.gvec;
             hvm_irq_dpci->mirq[pirq].gmsi.gvec = pt_irq_bind->u.msi.gvec;
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gflags = pt_irq_bind->u.msi.gflags;
             hvm_irq_dpci->mirq[pirq].gmsi.gflags = pt_irq_bind->u.msi.gflags;
             /* bind after hvm_irq_dpci is setup to avoid race with irq handler*/
             rc = pirq_guest_bind(d->vcpu[0], pirq, 0);
@@ -178,6 +180,8 @@ int pt_irq_create_bind_vtd(
             {
                 hvm_irq_dpci->mirq[pirq].gmsi.gflags = 0;
                 hvm_irq_dpci->mirq[pirq].gmsi.gvec = 0;
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gvec = 0;
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gflags = 0;
                 hvm_irq_dpci->mirq[pirq].flags = 0;
                 clear_bit(pirq, hvm_irq_dpci->mapping);
                 spin_unlock(&d->event_lock);
@@ -195,8 +199,14 @@ int pt_irq_create_bind_vtd(
             }
  
             /* if pirq is already mapped as vmsi, update the guest data/addr */
-            hvm_irq_dpci->mirq[pirq].gmsi.gvec = pt_irq_bind->u.msi.gvec;
-            hvm_irq_dpci->mirq[pirq].gmsi.gflags = pt_irq_bind->u.msi.gflags;
+            if ( hvm_irq_dpci->mirq[pirq].gmsi.gvec != pt_irq_bind->u.msi.gvec ) {
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gvec =
+                                    hvm_irq_dpci->mirq[pirq].gmsi.gvec;
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gflags =
+                                    hvm_irq_dpci->mirq[pirq].gmsi.gflags;
+                hvm_irq_dpci->mirq[pirq].gmsi.gvec = pt_irq_bind->u.msi.gvec;
+                hvm_irq_dpci->mirq[pirq].gmsi.gflags = pt_irq_bind->u.msi.gflags;
+            }
         }
         /* Caculate dest_vcpu_id for MSI-type pirq migration */
         dest = hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DEST_ID_MASK;
@@ -424,14 +434,21 @@ void hvm_dpci_msi_eoi(struct domain *d, 
           pirq = find_next_bit(hvm_irq_dpci->mapping, d->nr_pirqs, pirq + 1) )
     {
         if ( (!(hvm_irq_dpci->mirq[pirq].flags & HVM_IRQ_DPCI_MACH_MSI)) ||
-                (hvm_irq_dpci->mirq[pirq].gmsi.gvec != vector) )
+                (hvm_irq_dpci->mirq[pirq].gmsi.gvec != vector &&
+                 hvm_irq_dpci->mirq[pirq].gmsi.old_gvec != vector) )
             continue;
 
-        dest = hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DEST_ID_MASK;
-        dest_mode = !!(hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DM_MASK);
+        if ( hvm_irq_dpci->mirq[pirq].gmsi.gvec == vector ) {
+            dest = hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DEST_ID_MASK;
+            dest_mode = !!(hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DM_MASK);
+        } else {
+            dest = hvm_irq_dpci->mirq[pirq].gmsi.old_gflags & VMSI_DEST_ID_MASK;
+            dest_mode = !!(hvm_irq_dpci->mirq[pirq].gmsi.old_gflags & VMSI_DM_MASK);
+        }
         if ( vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest, dest_mode) )
             break;
     }
+
     if ( pirq < d->nr_pirqs )
         __msi_pirq_eoi(d, pirq);
     spin_unlock(&d->event_lock);
diff -r 2eba302831c4 xen/include/xen/hvm/irq.h
--- a/xen/include/xen/hvm/irq.h	Thu Oct 15 16:49:21 2009 +0100
+++ b/xen/include/xen/hvm/irq.h	Fri Oct 16 21:48:04 2009 +0800
@@ -58,8 +58,10 @@ struct dev_intx_gsi_link {
 #define GLFAGS_SHIFT_TRG_MODE       15
 
 struct hvm_gmsi_info {
-    uint32_t gvec;
+    uint16_t gvec;
+    uint16_t old_gvec;
     uint32_t gflags;
+    uint32_t old_gflags;
     int dest_vcpu_id; /* -1 :multi-dest, non-negative: dest_vcpu_id */
 };
 

[-- Attachment #4: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-20  7:51                             ` Zhang, Xiantao
@ 2009-10-20 17:26                               ` Cinco, Dante
  2009-10-21  1:10                                 ` Zhang, Xiantao
  2009-10-22  6:46                               ` Jan Beulich
  1 sibling, 1 reply; 55+ messages in thread
From: Cinco, Dante @ 2009-10-20 17:26 UTC (permalink / raw)
  To: Zhang, Xiantao, He, Qing; +Cc: Keir, xen-devel, Fraser

Xiantao,
With the latest patches (Fix-irq-affinity-msi3.patch, Mask_msi_irq_when_programe_it.patch), should I still apply the previous patch which removes "desc->handler->set_affinity(irq, *cpumask_of(v->processor))", or was that just a one-time experiment that should now be discarded?
Dante

-----Original Message-----
From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com] 
Sent: Tuesday, October 20, 2009 12:51 AM
To: Zhang, Xiantao; Cinco, Dante; He, Qing
Cc: xen-devel@lists.xensource.com; Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

Attached two patches should fix the issues. For the issue which complains "(XEN) do_IRQ: 1.187 No irq handler for vector (irq -1),", I root-caused it.  Currenlty, when programs MSI address & data, Xen doesn't perform the mask/unmask logic to avoid inconsistent interrupt genernation. In this case, according to spec, the interrupt generation behavior is undfined, and device may generate MSI interrupts with the expected vector and incorrect destination ID, so leads to the issue.  The attached two patches should address it. 
Fix-irq-affinity-msi3.patch:  same with the previous post.
Mask_msi_irq_when_programe_it.patch : disable irq when program msi. 

Xiantao


Zhang, Xiantao wrote:
> Cinco, Dante wrote:
>> Xiantao,
>> With vcpus=16 (all CPUs) in domU, I'm able to change the IRQ 
>> smp_affinity to any one-hot value and see the interrupts routed to 
>> the specified CPU. Every now and then though, both domU and dom0 will 
>> permanently lockup (cold reboot required) after changing the 
>> smp_affinity. If I change it manually via command-line, it seems to 
>> be okay but if I change it within a script (such as shifting-left a 
>> walking "1" to test all 16 CPUs), it will lockup part way through the 
>> script.
> 
> I can't reproduce the failure at my side after applying the patches 
> even with a similar script which changes irq's affinity.  Could you 
> share your script with me ?
> 
> 
> 
>> Other observations:
>> 
>> In the above log, I had changed the smp_affinity for IRQ 66 but IRQ
>> 68 and 69 got masked.
> 
> We can see the warning as "No irq handler for vector" but it shouldn't 
> hang host, and it maybe related to another potential issue, and maybe 
> need further investigation.
> 
> Xiantao
> 
>> -----Original Message-----
>> From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com]
>> Sent: Friday, October 16, 2009 5:59 PM
>> To: Cinco, Dante; He, Qing
>> Cc: xen-devel@lists.xensource.com; Fraser; Fraser
>> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
>>> 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
>> 
>>  Dante
>>  It should be another issue as you described.  Can you try the 
>> following code to see whether it works for you ?  Just a try. Xiantao
>> 
>> diff -r 0705efd9c69e xen/arch/x86/hvm/hvm.c
>> --- a/xen/arch/x86/hvm/hvm.c    Fri Oct 16 09:04:53 2009 +0100
>> +++ b/xen/arch/x86/hvm/hvm.c    Sat Oct 17 08:48:23 2009 +0800
>> @@ -243,7 +243,7 @@ void hvm_migrate_pirqs(struct vcpu *v)          
>>          continue; irq = desc - irq_desc;
>>          ASSERT(MSI_IRQ(irq));
>> -        desc->handler->set_affinity(irq, *cpumask_of(v->processor));
>> +        //desc->handler->set_affinity(irq,
>>          *cpumask_of(v->processor)); spin_unlock_irq(&desc->lock);  
>>      } spin_unlock(&d->event_lock);
>> 
>> -----Original Message-----
>> From: xen-devel-bounces@lists.xensource.com
>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Cinco, 
>> Dante Sent: Saturday, October 17, 2009 2:24 AM
>> To: Zhang, Xiantao; He, Qing
>> Cc: Keir; xen-devel@lists.xensource.com; Fraser
>> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
>>> 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
>> 
>> Xiantao,
>> I'm still losing the interrupts with your patch but I see some 
>> differences. To simplifiy the data, I'm only going to focus on the 
>> first function of my 4-function PCI device.
>> 
>> After changing the IRQ affinity, the IRQ is not masked anymore 
>> (unlike before the patch). What stands out for me is the new vector
>> (219) as reported by "guest interrupt information" does not match the 
>> vector (187) in dom0 lspci. Before the patch, the new vector in 
>> "guest interrupt information" matched the new vector in dom0 lspci 
>> (dest ID in dom0 lspci was unchanged). I also saw this message pop on 
>> the Xen console when I changed smp_affinity:
>> 
>> (XEN) do_IRQ: 1.187 No irq handler for vector (irq -1).
>> 
>> 187 is the vector from dom0 lspci before and after the smp_affinity 
>> change but "guest interrupt information" reports the new vector is 
>> 219. To me, this looks like the new MSI message data (with
>> vector=219) did not get written into the PCI device, right?
>> 
>> Here's a comparison before and after changing smp_affinity from ffff 
>> to 2 (dom0 is pvops 2.6.31.1, domU is 2.6.30.1):
>> 
>> ---------------------------------------------------------------------
>> ---
>> 
>> /proc/irq/48/smp_affinity=ffff (default):
>> 
>> dom0 lspci: Address: 00000000fee00000  Data: 40bb (vector=187)
>> 
>> domU lspci: Address: 00000000fee00000  Data: 4071 (vector=113)
>> 
>> qemu-dm-dpm.log: pt_msi_setup: msi mapped with pirq 4f (79)
>>                  pt_msi_update: Update msi with pirq 4f gvec 71 
>> gflags 0
>> 
>> Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000001,
>> Vec:187 type=PCI-MSI status=00000010 in-flight=0 domain-list=1:
>> 79(----)
>> 
>> Xen console: (XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe:
>>              bdf = 7:0.0 (XEN) [VT-D]iommu.c:1175:d0
>>              domain_context_mapping:PCIe: bdf = 7:0.0 (XEN)
>>              [VT-D]io.c:301:d0 VT-d irq bind: m_irq = 4f device = 5
>>              intx = 0 (XEN) io.c:326:d0 pt_irq_destroy_bind_vtd:
>> machine_gsi=79 guest_gsi=36, device=5, intx=0 (XEN) io.c:381:d0
>> XEN_DOMCTL_irq_unmapping: m_irq = 0x4f device = 0x5 intx = 0x0
>> 
>> ---------------------------------------------------------------------
>> ---
>> 
>> /proc/irq/48/smp_affinity=2:
>> 
>> dom0 lspci: Address: 00000000fee10000  Data: 40bb (dest ID changed 
>> from 0 (APIC ID of CPU0) to 16 (APIC ID of CPU1), vector unchanged)
>> 
>> domU lspci: Address: 00000000fee02000  Data: 40b1 (dest ID changed 
>> from 0 (APIC ID of CPU0) to 2 (APIC ID of CPU1), new vector=177)
>> 
>> Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000002,
>> Vec:219 type=PCI-MSI status=00000010 in-flight=0 domain-list=1:
>> 79(----)
>> 
>> qemu-dm-dpm.log: pt_msi_update: Update msi with pirq 4f gvec 71
>>                  gflags 2 pt_msi_update: Update msi with pirq 4f gvec
>> b1 gflags 2
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-20 17:26                               ` Cinco, Dante
@ 2009-10-21  1:10                                 ` Zhang, Xiantao
  2009-10-22  1:00                                   ` Cinco, Dante
  0 siblings, 1 reply; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-21  1:10 UTC (permalink / raw)
  To: Cinco, Dante, He, Qing; +Cc: Keir, xen-devel, Fraser

You only need to apply the two patches; the previous one should be discarded. 
Xiantao 

-----Original Message-----
From: Cinco, Dante [mailto:Dante.Cinco@lsi.com] 
Sent: Wednesday, October 21, 2009 1:27 AM
To: Zhang, Xiantao; He, Qing
Cc: xen-devel@lists.xensource.com; Keir Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

Xintao,
With the latest patch (Fix-irq-affinity-msi3.patch, Mask_msi_irq_when_programe_it.patch), should I still apply the previous patch with removes "desc->handler->set_affinity(irq, *cpumask_of(v->processor))" or was that just a one-time experiment that should now be discarded?
Dante

-----Original Message-----
From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com] 
Sent: Tuesday, October 20, 2009 12:51 AM
To: Zhang, Xiantao; Cinco, Dante; He, Qing
Cc: xen-devel@lists.xensource.com; Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

Attached two patches should fix the issues. For the issue which complains "(XEN) do_IRQ: 1.187 No irq handler for vector (irq -1),", I root-caused it.  Currenlty, when programs MSI address & data, Xen doesn't perform the mask/unmask logic to avoid inconsistent interrupt genernation. In this case, according to spec, the interrupt generation behavior is undfined, and device may generate MSI interrupts with the expected vector and incorrect destination ID, so leads to the issue.  The attached two patches should address it. 
Fix-irq-affinity-msi3.patch:  same with the previous post.
Mask_msi_irq_when_programe_it.patch : disable irq when program msi. 

Xiantao


Zhang, Xiantao wrote:
> Cinco, Dante wrote:
>> Xiantao,
>> With vcpus=16 (all CPUs) in domU, I'm able to change the IRQ 
>> smp_affinity to any one-hot value and see the interrupts routed to 
>> the specified CPU. Every now and then though, both domU and dom0 will 
>> permanently lockup (cold reboot required) after changing the 
>> smp_affinity. If I change it manually via command-line, it seems to 
>> be okay but if I change it within a script (such as shifting-left a 
>> walking "1" to test all 16 CPUs), it will lockup part way through the 
>> script.
> 
> I can't reproduce the failure at my side after applying the patches 
> even with a similar script which changes irq's affinity.  Could you 
> share your script with me ?
> 
> 
> 
>> Other observations:
>> 
>> In the above log, I had changed the smp_affinity for IRQ 66 but IRQ
>> 68 and 69 got masked.
> 
> We can see the warning as "No irq handler for vector" but it shouldn't 
> hang host, and it maybe related to another potential issue, and maybe 
> need further investigation.
> 
> Xiantao
> 
>> -----Original Message-----
>> From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com]
>> Sent: Friday, October 16, 2009 5:59 PM
>> To: Cinco, Dante; He, Qing
>> Cc: xen-devel@lists.xensource.com; Fraser; Fraser
>> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
>>> 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
>> 
>>  Dante
>>  It should be another issue as you described.  Can you try the 
>> following code to see whether it works for you ?  Just a try. Xiantao
>> 
>> diff -r 0705efd9c69e xen/arch/x86/hvm/hvm.c
>> --- a/xen/arch/x86/hvm/hvm.c    Fri Oct 16 09:04:53 2009 +0100
>> +++ b/xen/arch/x86/hvm/hvm.c    Sat Oct 17 08:48:23 2009 +0800
>> @@ -243,7 +243,7 @@ void hvm_migrate_pirqs(struct vcpu *v)          
>>          continue; irq = desc - irq_desc;
>>          ASSERT(MSI_IRQ(irq));
>> -        desc->handler->set_affinity(irq, *cpumask_of(v->processor));
>> +        //desc->handler->set_affinity(irq,
>>          *cpumask_of(v->processor)); spin_unlock_irq(&desc->lock);  
>>      } spin_unlock(&d->event_lock);
>> 
>> -----Original Message-----
>> From: xen-devel-bounces@lists.xensource.com
>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Cinco, 
>> Dante Sent: Saturday, October 17, 2009 2:24 AM
>> To: Zhang, Xiantao; He, Qing
>> Cc: Keir; xen-devel@lists.xensource.com; Fraser
>> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
>>> 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
>> 
>> Xiantao,
>> I'm still losing the interrupts with your patch but I see some 
>> differences. To simplifiy the data, I'm only going to focus on the 
>> first function of my 4-function PCI device.
>> 
>> After changing the IRQ affinity, the IRQ is not masked anymore 
>> (unlike before the patch). What stands out for me is the new vector
>> (219) as reported by "guest interrupt information" does not match the 
>> vector (187) in dom0 lspci. Before the patch, the new vector in 
>> "guest interrupt information" matched the new vector in dom0 lspci 
>> (dest ID in dom0 lspci was unchanged). I also saw this message pop on 
>> the Xen console when I changed smp_affinity:
>> 
>> (XEN) do_IRQ: 1.187 No irq handler for vector (irq -1).
>> 
>> 187 is the vector from dom0 lspci before and after the smp_affinity 
>> change but "guest interrupt information" reports the new vector is 
>> 219. To me, this looks like the new MSI message data (with
>> vector=219) did not get written into the PCI device, right?
>> 
>> Here's a comparison before and after changing smp_affinity from ffff 
>> to 2 (dom0 is pvops 2.6.31.1, domU is 2.6.30.1):
>> 
>> ---------------------------------------------------------------------
>> ---
>> 
>> /proc/irq/48/smp_affinity=ffff (default):
>> 
>> dom0 lspci: Address: 00000000fee00000  Data: 40bb (vector=187)
>> 
>> domU lspci: Address: 00000000fee00000  Data: 4071 (vector=113)
>> 
>> qemu-dm-dpm.log: pt_msi_setup: msi mapped with pirq 4f (79)
>>                  pt_msi_update: Update msi with pirq 4f gvec 71 
>> gflags 0
>> 
>> Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000001,
>> Vec:187 type=PCI-MSI status=00000010 in-flight=0 domain-list=1:
>> 79(----)
>> 
>> Xen console: (XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe:
>>              bdf = 7:0.0 (XEN) [VT-D]iommu.c:1175:d0
>>              domain_context_mapping:PCIe: bdf = 7:0.0 (XEN)
>>              [VT-D]io.c:301:d0 VT-d irq bind: m_irq = 4f device = 5
>>              intx = 0 (XEN) io.c:326:d0 pt_irq_destroy_bind_vtd:
>> machine_gsi=79 guest_gsi=36, device=5, intx=0 (XEN) io.c:381:d0
>> XEN_DOMCTL_irq_unmapping: m_irq = 0x4f device = 0x5 intx = 0x0
>> 
>> ---------------------------------------------------------------------
>> ---
>> 
>> /proc/irq/48/smp_affinity=2:
>> 
>> dom0 lspci: Address: 00000000fee10000  Data: 40bb (dest ID changed 
>> from 0 (APIC ID of CPU0) to 16 (APIC ID of CPU1), vector unchanged)
>> 
>> domU lspci: Address: 00000000fee02000  Data: 40b1 (dest ID changed 
>> from 0 (APIC ID of CPU0) to 2 (APIC ID of CPU1), new vector=177)
>> 
>> Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000002,
>> Vec:219 type=PCI-MSI status=00000010 in-flight=0 domain-list=1:
>> 79(----)
>> 
>> qemu-dm-dpm.log: pt_msi_update: Update msi with pirq 4f gvec 71
>>                  gflags 2 pt_msi_update: Update msi with pirq 4f gvec
>> b1 gflags 2
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-21  1:10                                 ` Zhang, Xiantao
@ 2009-10-22  1:00                                   ` Cinco, Dante
  2009-10-22  1:58                                     ` Zhang, Xiantao
  0 siblings, 1 reply; 55+ messages in thread
From: Cinco, Dante @ 2009-10-22  1:00 UTC (permalink / raw)
  To: Zhang, Xiantao, He, Qing; +Cc: Keir, xen-devel, Fraser

After adding a lot of dprintk's in the code (xen/arch/x86/msi.c, irq.c, pci.c, traps.c), I found out why I'm getting the message "do_IRQ: 1.186 No irq handler for vector (irq -1)." Some time after the new MSI message address (dest ID) and data (vector) were written to the PCI device, something or somebody called guest_io_write(), which overwrote the new vector (218) with the old vector (186).

I added an extra read_msi_msg() after write_msi_msg() just to make sure that the new MSI message address and data were actually written to the PCI device. I added some code in pci_conf_write() and pci_conf_read() to print the "cf8" and data if a write/read is targeted at the bus/dev/func/reg of the PCI device. One of my questions is: where did the old vector (186) come from? What data structure did guest_io_write() get the 186 from?
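
A sketch of that kind of instrumentation (the helper name and the bus 7 / dev 0 filter are illustrative, not the exact code that was added), called at the top of pci_conf_write() and pci_conf_read():

#include <xen/lib.h>
#include <xen/types.h>

/* Log config-space accesses that target the passed-through device
 * (bus 7, dev 0 here), mirroring the pci_conf_write/pci_conf_read
 * trace lines in the data below. */
static void trace_pci_cfg_access(const char *who, uint32_t cf8,
                                 unsigned int offset, unsigned int bytes,
                                 uint32_t data)
{
    unsigned int bus = (cf8 >> 16) & 0xff;
    unsigned int dev = (cf8 >> 11) & 0x1f;

    if ( bus == 7 && dev == 0 )   /* the 4-function device at 07:00.x */
        dprintk(XENLOG_INFO, "%s::cf8=%#x,offset=%u,bytes=%u,data=%#x\n",
                who, cf8, offset, bytes, data);
}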

I hope this data will help get to the bottom of this IRQ SMP affinity problem.

Dante

-------------------------------------------- BEGIN DATA

cat /proc/irq/48/smp_affinity 
ffff

(XEN) Guest interrupt information:
(XEN) IRQ: 66, IRQ affinity:0x00000001, Vec:186 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 79(----)

dom0 lspci -vv -s 0:07:0.0 | grep Address
                Address: 00000000fee00000  Data: 40ba (dest ID=0 APIC ID of CPU0, vector=186)

domU lspci -vv -s 00:05.0 | grep IRQ
        Interrupt: pin A routed to IRQ 48
domU lspci -vv -s 00:05.0 | grep Address
                Address: 00000000fee00000  Data: 4071

--------------------------------------------

domU: echo 2 > /proc/irq/48/smp_affinity

(XEN) irq.c:415: __assign_irq_vector::irq=66,old_vector=186,cfg->vector=218,cpu=1
(XEN) hvm.c:248: hvm_migrate_pirqs::irq=66, v->processor=1
(XEN) io_apic.c:339: set_desc_affinity::irq=66,apicid=16,vector=218
(XEN) msi.c:270: write_msi_msg::msg->address_lo=0xfee10000,msg->data=0x40da
(XEN) pci.c:53: pci_conf_write::cf8=0x80070064,offset=0,bytes=4,data=0xfee10000 (MSI message address low, dest ID)
(XEN) pci.c:53: pci_conf_write::cf8=0x80070068,offset=0,bytes=4,data=0x0        (MSI message address high, 64-bit)
(XEN) pci.c:53: pci_conf_write::cf8=0x8007006c,offset=0,bytes=2,data=0x40da     (MSI message data, vector)
(XEN) pci.c:42: pci_conf_read::cf8=0x80070064,offset=0,bytes=4,value=0xfee10000
(XEN) pci.c:42: pci_conf_read::cf8=0x80070068,offset=0,bytes=4,value=0x0
(XEN) pci.c:42: pci_conf_read::cf8=0x8007006c,offset=0,bytes=2,value=0x40da
(XEN) msi.c:204: read_msi_msg::msg->address_lo=0xfee10000,msg->data=0x40da
(XEN) traps.c:1626: guest_io_write::pci_conf_write data=0x40ba                  <<<<<<<<<< culprit
(XEN) pci.c:53: pci_conf_write::cf8=0x8007006c,offset=0,bytes=2,data=0x40ba     <<<<<<<<<< vector reverted back to 186
(XEN) do_IRQ: 1.186 No irq handler for vector (irq -1)                          <<<<<<<<<< can't find handler because vector should have been 218

(XEN) Guest interrupt information:
(XEN) IRQ: 66, IRQ affinity:0x00000002, Vec:218 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 79(----)

dom0 lspci -vv -s 0:07:0.0 | grep Address 
                Address: 00000000fee10000  Data: 40ba (dest ID=16 APIC ID of CPU1, vector=186)

domU lspci -vv -s 00:05.0 | grep Address
                Address: 00000000fee02000  Data: 40b1

I followed the call hierarchy for guest_io_write() as far as I can:

do_page_fault
  fixup_page_fault
    handle_gdt_ldt_mapping_fault
      do_general_protection
        emulate_privileged_op
          guest_io_write

-------------------------------------------- END DATA
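
As a side note, the dest ID / vector annotations in the data above can be reproduced with a small decoder like this (a sketch; for these xAPIC-style messages the destination APIC ID sits in bits 12-19 of the MSI address and the vector in bits 0-7 of the MSI data):

#include <stdio.h>
#include <stdint.h>

/* Decode an MSI message address/data pair as printed by lspci. */
static void decode_msi(uint32_t addr, uint16_t data)
{
    unsigned int dest_id = (addr >> 12) & 0xff;  /* destination APIC ID */
    unsigned int vector  = data & 0xff;          /* interrupt vector */
    printf("addr=0x%08x data=0x%04x -> dest ID %u, vector %u\n",
           addr, data, dest_id, vector);
}

int main(void)
{
    decode_msi(0xfee00000, 0x40ba);  /* dest ID 0,  vector 186 */
    decode_msi(0xfee10000, 0x40ba);  /* dest ID 16, vector 186 */
    decode_msi(0xfee02000, 0x40b1);  /* dest ID 2,  vector 177 */
    return 0;
}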

-----Original Message-----
From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com] 
Sent: Tuesday, October 20, 2009 6:11 PM
To: Cinco, Dante; He, Qing
Cc: xen-devel@lists.xensource.com; Keir Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

Only need to apply the two patches and the previous one should be discarded. 
Xiantao 

-----Original Message-----
From: Cinco, Dante [mailto:Dante.Cinco@lsi.com]
Sent: Wednesday, October 21, 2009 1:27 AM
To: Zhang, Xiantao; He, Qing
Cc: xen-devel@lists.xensource.com; Keir Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

Xintao,
With the latest patch (Fix-irq-affinity-msi3.patch, Mask_msi_irq_when_programe_it.patch), should I still apply the previous patch with removes "desc->handler->set_affinity(irq, *cpumask_of(v->processor))" or was that just a one-time experiment that should now be discarded?
Dante

-----Original Message-----
From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com]
Sent: Tuesday, October 20, 2009 12:51 AM
To: Zhang, Xiantao; Cinco, Dante; He, Qing
Cc: xen-devel@lists.xensource.com; Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

Attached two patches should fix the issues. For the issue which complains "(XEN) do_IRQ: 1.187 No irq handler for vector (irq -1),", I root-caused it.  Currenlty, when programs MSI address & data, Xen doesn't perform the mask/unmask logic to avoid inconsistent interrupt genernation. In this case, according to spec, the interrupt generation behavior is undfined, and device may generate MSI interrupts with the expected vector and incorrect destination ID, so leads to the issue.  The attached two patches should address it. 
Fix-irq-affinity-msi3.patch:  same with the previous post.
Mask_msi_irq_when_programe_it.patch : disable irq when program msi. 

Xiantao


Zhang, Xiantao wrote:
> Cinco, Dante wrote:
>> Xiantao,
>> With vcpus=16 (all CPUs) in domU, I'm able to change the IRQ 
>> smp_affinity to any one-hot value and see the interrupts routed to 
>> the specified CPU. Every now and then though, both domU and dom0 will 
>> permanently lockup (cold reboot required) after changing the 
>> smp_affinity. If I change it manually via command-line, it seems to 
>> be okay but if I change it within a script (such as shifting-left a 
>> walking "1" to test all 16 CPUs), it will lockup part way through the 
>> script.
> 
> I can't reproduce the failure at my side after applying the patches 
> even with a similar script which changes irq's affinity.  Could you 
> share your script with me ?
> 
> 
> 
>> Other observations:
>> 
>> In the above log, I had changed the smp_affinity for IRQ 66 but IRQ
>> 68 and 69 got masked.
> 
> We can see the warning as "No irq handler for vector" but it shouldn't 
> hang host, and it maybe related to another potential issue, and maybe 
> need further investigation.
> 
> Xiantao
> 
>> -----Original Message-----
>> From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com]
>> Sent: Friday, October 16, 2009 5:59 PM
>> To: Cinco, Dante; He, Qing
>> Cc: xen-devel@lists.xensource.com; Fraser; Fraser
>> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
>>> 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
>> 
>>  Dante
>>  It should be another issue as you described.  Can you try the 
>> following code to see whether it works for you ?  Just a try. Xiantao
>> 
>> diff -r 0705efd9c69e xen/arch/x86/hvm/hvm.c
>> --- a/xen/arch/x86/hvm/hvm.c    Fri Oct 16 09:04:53 2009 +0100
>> +++ b/xen/arch/x86/hvm/hvm.c    Sat Oct 17 08:48:23 2009 +0800
>> @@ -243,7 +243,7 @@ void hvm_migrate_pirqs(struct vcpu *v)          
>>          continue; irq = desc - irq_desc;
>>          ASSERT(MSI_IRQ(irq));
>> -        desc->handler->set_affinity(irq, *cpumask_of(v->processor));
>> +        //desc->handler->set_affinity(irq,
>>          *cpumask_of(v->processor)); spin_unlock_irq(&desc->lock);  
>>      } spin_unlock(&d->event_lock);
>> 
>> -----Original Message-----
>> From: xen-devel-bounces@lists.xensource.com
>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Cinco, 
>> Dante Sent: Saturday, October 17, 2009 2:24 AM
>> To: Zhang, Xiantao; He, Qing
>> Cc: Keir; xen-devel@lists.xensource.com; Fraser
>> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
>>> 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
>> 
>> Xiantao,
>> I'm still losing the interrupts with your patch but I see some 
>> differences. To simplifiy the data, I'm only going to focus on the 
>> first function of my 4-function PCI device.
>> 
>> After changing the IRQ affinity, the IRQ is not masked anymore 
>> (unlike before the patch). What stands out for me is the new vector
>> (219) as reported by "guest interrupt information" does not match the 
>> vector (187) in dom0 lspci. Before the patch, the new vector in 
>> "guest interrupt information" matched the new vector in dom0 lspci 
>> (dest ID in dom0 lspci was unchanged). I also saw this message pop on 
>> the Xen console when I changed smp_affinity:
>> 
>> (XEN) do_IRQ: 1.187 No irq handler for vector (irq -1).
>> 
>> 187 is the vector from dom0 lspci before and after the smp_affinity 
>> change but "guest interrupt information" reports the new vector is 
>> 219. To me, this looks like the new MSI message data (with
>> vector=219) did not get written into the PCI device, right?
>> 
>> Here's a comparison before and after changing smp_affinity from ffff 
>> to 2 (dom0 is pvops 2.6.31.1, domU is 2.6.30.1):
>> 
>> ---------------------------------------------------------------------
>> ---
>> 
>> /proc/irq/48/smp_affinity=ffff (default):
>> 
>> dom0 lspci: Address: 00000000fee00000  Data: 40bb (vector=187)
>> 
>> domU lspci: Address: 00000000fee00000  Data: 4071 (vector=113)
>> 
>> qemu-dm-dpm.log: pt_msi_setup: msi mapped with pirq 4f (79)
>>                  pt_msi_update: Update msi with pirq 4f gvec 71 
>> gflags 0
>> 
>> Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000001,
>> Vec:187 type=PCI-MSI status=00000010 in-flight=0 domain-list=1:
>> 79(----)
>> 
>> Xen console: (XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe:
>>              bdf = 7:0.0 (XEN) [VT-D]iommu.c:1175:d0
>>              domain_context_mapping:PCIe: bdf = 7:0.0 (XEN)
>>              [VT-D]io.c:301:d0 VT-d irq bind: m_irq = 4f device = 5
>>              intx = 0 (XEN) io.c:326:d0 pt_irq_destroy_bind_vtd:
>> machine_gsi=79 guest_gsi=36, device=5, intx=0 (XEN) io.c:381:d0
>> XEN_DOMCTL_irq_unmapping: m_irq = 0x4f device = 0x5 intx = 0x0
>> 
>> ---------------------------------------------------------------------
>> ---
>> 
>> /proc/irq/48/smp_affinity=2:
>> 
>> dom0 lspci: Address: 00000000fee10000  Data: 40bb (dest ID changed 
>> from 0 (APIC ID of CPU0) to 16 (APIC ID of CPU1), vector unchanged)
>> 
>> domU lspci: Address: 00000000fee02000  Data: 40b1 (dest ID changed 
>> from 0 (APIC ID of CPU0) to 2 (APIC ID of CPU1), new vector=177)
>> 
>> Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000002,
>> Vec:219 type=PCI-MSI status=00000010 in-flight=0 domain-list=1:
>> 79(----)
>> 
>> qemu-dm-dpm.log: pt_msi_update: Update msi with pirq 4f gvec 71
>>                  gflags 2 pt_msi_update: Update msi with pirq 4f gvec
>> b1 gflags 2
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  1:00                                   ` Cinco, Dante
@ 2009-10-22  1:58                                     ` Zhang, Xiantao
  2009-10-22  2:42                                       ` Zhang, Xiantao
  2009-10-22  5:10                                       ` Qing He
  0 siblings, 2 replies; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-22  1:58 UTC (permalink / raw)
  To: Cinco, Dante, He, Qing; +Cc: xen-devel, Fraser

Dante, 
   Have you applied the two patches when you did the testing?   Without them, we can reproduce the issue you reported, but with them the issue is gone.  The root cause is that when programming MSI, we have to mask the MSI interrupt source first; otherwise the device may generate inconsistent interrupts with an incorrect destination and the right vector, or an incorrect vector and the right destination.

For example, suppose the old MSI interrupt info is 0.186, meaning the destination ID is 0 and the vector is 186. When the IRQ migrates to another CPU (e.g. CPU 1), the MSI info should change to 1.194. If the MSI info is programmed into the PCI device without masking it first, the device may generate the interrupt as 1.186 or 0.194. Obviously, interrupts with the info 1.186 or 0.194 do not exist, yet according to the spec any combination is possible. Since Xen writes the address field first, it is likely to generate 1.186 rather than 0.194, so your PCI device may generate an interrupt with the new destination and the old vector (1.186).    Of my two patches, one fixes the guest interrupt affinity issue (a race exists between the guest EOIing the old vector and the guest setting the new vector), and the other safely programs the MSI info into the PCI device to avoid inconsistent interrupt generation.
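
To make the hazard window concrete, here is a minimal hypothetical sketch of the unmasked two-step update. The constants and parameters are made up for illustration; only the write ordering matters:

    /* Hypothetical illustration of the race described above.
     * Old message: destination 0, vector 186 (0xba).
     * New message: destination 1, vector 194 (0xc2). */
    #define NEW_MSI_ADDR_LO  0xfee01000u   /* encodes the new destination ID */
    #define NEW_MSI_DATA     0x40c2u       /* low byte = new vector 194 */

    static void reprogram_msi_unmasked(u8 bus, u8 slot, u8 func, unsigned int pos)
    {
        /* Step 1: the address register is written first (new destination). */
        pci_conf_write32(bus, slot, func, msi_lower_address_reg(pos),
                         NEW_MSI_ADDR_LO);

        /* If the device raises an interrupt in this window it sends
         * new destination + old vector, i.e. 1.186 -- the combination
         * behind the "No irq handler for vector" console messages. */

        /* Step 2: only now does the data register get the new vector. */
        pci_conf_write16(bus, slot, func, msi_data_reg(pos, 0), NEW_MSI_DATA);
    }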

> (XEN) traps.c:1626: guest_io_write::pci_conf_write data=0x40ba 

This write should come from dom0 (most likely from Qemu).  If it does, we may have to prohibit such unsafe MSI writes in Qemu.  

Xiantao

     
> <<<<<<<<<< culprit (XEN) pci.c:53:
> pci_conf_write::cf8=0x8007006c,offset=0,bytes=2,data=0x40ba    
> <<<<<<<<<< vector reverted back to 186 (XEN) do_IRQ: 1.186 No irq
> handler for vector (irq -1)                          <<<<<<<<<< can't
> find handler because vector should have been 218         
> 
> (XEN) Guest interrupt information:
> (XEN) IRQ: 66, IRQ affinity:0x00000002, Vec:218 type=PCI-MSI
> status=00000010 in-flight=0 domain-list=1: 79(----) 
> 
> dom0 lspci -vv -s 0:07:0.0 | grep Address
>                 Address: 00000000fee10000  Data: 40ba (dest ID=16
> APIC ID of CPU1, vector=186) 
> 
> domU lspci -vv -s 00:05.0 | grep Address
>                 Address: 00000000fee02000  Data: 40b1
> 
> I followed the call hierarchy for guest_io_write() as far as I can:
> 
> do_page_fault
>   fixup_page_fault
>     handle_gdt_ldt_mapping_fault
>       do_general_protection
>         emulate_privileged_op
>           guest_io_write
> 
> -------------------------------------------- END DATA
> 
> -----Original Message-----
> From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com]
> Sent: Tuesday, October 20, 2009 6:11 PM
> To: Cinco, Dante; He, Qing
> Cc: xen-devel@lists.xensource.com; Keir Fraser
> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
> > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem) 
> 
> Only need to apply the two patches and the previous one should be
> discarded. 
> Xiantao
> 
> -----Original Message-----
> From: Cinco, Dante [mailto:Dante.Cinco@lsi.com]
> Sent: Wednesday, October 21, 2009 1:27 AM
> To: Zhang, Xiantao; He, Qing
> Cc: xen-devel@lists.xensource.com; Keir Fraser
> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
> > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem) 
> 
> Xintao,
> With the latest patch (Fix-irq-affinity-msi3.patch,
> Mask_msi_irq_when_programe_it.patch), should I still apply the
> previous patch with removes "desc->handler->set_affinity(irq,
> *cpumask_of(v->processor))" or was that just a one-time experiment
> that should now be discarded?    
> Dante
> 
> -----Original Message-----
> From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com]
> Sent: Tuesday, October 20, 2009 12:51 AM
> To: Zhang, Xiantao; Cinco, Dante; He, Qing
> Cc: xen-devel@lists.xensource.com; Fraser
> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
> > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem) 
> 
> Attached two patches should fix the issues. For the issue which
> complains "(XEN) do_IRQ: 1.187 No irq handler for vector (irq -1),",
> I root-caused it.  Currenlty, when programs MSI address & data, Xen
> doesn't perform the mask/unmask logic to avoid inconsistent interrupt
> genernation. In this case, according to spec, the interrupt
> generation behavior is undfined, and device may generate MSI
> interrupts with the expected vector and incorrect destination ID, so
> leads to the issue.  The attached two patches should address it.
> Fix-irq-affinity-msi3.patch:  same with the previous post.       
> Mask_msi_irq_when_programe_it.patch : disable irq when program msi.
> 
> Xiantao
> 
> 
> Zhang, Xiantao wrote:
>> Cinco, Dante wrote:
>>> Xiantao,
>>> With vcpus=16 (all CPUs) in domU, I'm able to change the IRQ
>>> smp_affinity to any one-hot value and see the interrupts routed to
>>> the specified CPU. Every now and then though, both domU and dom0
>>> will permanently lockup (cold reboot required) after changing the
>>> smp_affinity. If I change it manually via command-line, it seems to
>>> be okay but if I change it within a script (such as shifting-left a
>>> walking "1" to test all 16 CPUs), it will lockup part way through
>>> the script.
>> 
>> I can't reproduce the failure at my side after applying the patches
>> even with a similar script which changes irq's affinity.  Could you
>> share your script with me ? 
>> 
>> 
>> 
>>> Other observations:
>>> 
>>> In the above log, I had changed the smp_affinity for IRQ 66 but IRQ
>>> 68 and 69 got masked.
>> 
>> We can see the warning as "No irq handler for vector" but it
>> shouldn't hang host, and it maybe related to another potential
>> issue, and maybe need further investigation. 
>> 
>> Xiantao
>> 
>>> -----Original Message-----
>>> From: Zhang, Xiantao [mailto:xiantao.zhang@intel.com]
>>> Sent: Friday, October 16, 2009 5:59 PM
>>> To: Cinco, Dante; He, Qing
>>> Cc: xen-devel@lists.xensource.com; Fraser; Fraser
>>> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with
>>> vcpus 
>>>> 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
>>> 
>>>  Dante
>>>  It should be another issue as you described.  Can you try the
>>> following code to see whether it works for you ?  Just a try.
>>> Xiantao 
>>> 
>>> diff -r 0705efd9c69e xen/arch/x86/hvm/hvm.c
>>> --- a/xen/arch/x86/hvm/hvm.c    Fri Oct 16 09:04:53 2009 +0100
>>> +++ b/xen/arch/x86/hvm/hvm.c    Sat Oct 17 08:48:23 2009 +0800
>>> @@ -243,7 +243,7 @@ void hvm_migrate_pirqs(struct vcpu *v)
>>>          continue; irq = desc - irq_desc;
>>>          ASSERT(MSI_IRQ(irq));
>>> -        desc->handler->set_affinity(irq,
>>> *cpumask_of(v->processor)); +       
>>>          //desc->handler->set_affinity(irq,
>>>      *cpumask_of(v->processor)); spin_unlock_irq(&desc->lock); }
>>> spin_unlock(&d->event_lock); 
>>> 
>>> -----Original Message-----
>>> From: xen-devel-bounces@lists.xensource.com
>>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Cinco,
>>> Dante Sent: Saturday, October 17, 2009 2:24 AM
>>> To: Zhang, Xiantao; He, Qing
>>> Cc: Keir; xen-devel@lists.xensource.com; Fraser
>>> Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with
>>> vcpus 
>>>> 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
>>> 
>>> Xiantao,
>>> I'm still losing the interrupts with your patch but I see some
>>> differences. To simplifiy the data, I'm only going to focus on the
>>> first function of my 4-function PCI device.
>>> 
>>> After changing the IRQ affinity, the IRQ is not masked anymore
>>> (unlike before the patch). What stands out for me is the new vector
>>> (219) as reported by "guest interrupt information" does not match
>>> the vector (187) in dom0 lspci. Before the patch, the new vector in
>>> "guest interrupt information" matched the new vector in dom0 lspci
>>> (dest ID in dom0 lspci was unchanged). I also saw this message pop
>>> on the Xen console when I changed smp_affinity:
>>> 
>>> (XEN) do_IRQ: 1.187 No irq handler for vector (irq -1).
>>> 
>>> 187 is the vector from dom0 lspci before and after the smp_affinity
>>> change but "guest interrupt information" reports the new vector is
>>> 219. To me, this looks like the new MSI message data (with
>>> vector=219) did not get written into the PCI device, right?
>>> 
>>> Here's a comparison before and after changing smp_affinity from ffff
>>> to 2 (dom0 is pvops 2.6.31.1, domU is 2.6.30.1):
>>> 
>>> ---------------------------------------------------------------------
>>> ---
>>> 
>>> /proc/irq/48/smp_affinity=ffff (default):
>>> 
>>> dom0 lspci: Address: 00000000fee00000  Data: 40bb (vector=187)
>>> 
>>> domU lspci: Address: 00000000fee00000  Data: 4071 (vector=113)
>>> 
>>> qemu-dm-dpm.log: pt_msi_setup: msi mapped with pirq 4f (79)
>>>                  pt_msi_update: Update msi with pirq 4f gvec 71
>>> gflags 0 
>>> 
>>> Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000001,
>>> Vec:187 type=PCI-MSI status=00000010 in-flight=0 domain-list=1:
>>> 79(----) 
>>> 
>>> Xen console: (XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe:
>>>              bdf = 7:0.0 (XEN) [VT-D]iommu.c:1175:d0
>>>              domain_context_mapping:PCIe: bdf = 7:0.0 (XEN)
>>>              [VT-D]io.c:301:d0 VT-d irq bind: m_irq = 4f device = 5
>>>              intx = 0 (XEN) io.c:326:d0 pt_irq_destroy_bind_vtd:
>>> machine_gsi=79 guest_gsi=36, device=5, intx=0 (XEN) io.c:381:d0
>>> XEN_DOMCTL_irq_unmapping: m_irq = 0x4f device = 0x5 intx = 0x0
>>> 
>>> ---------------------------------------------------------------------
>>> ---
>>> 
>>> /proc/irq/48/smp_affinity=2:
>>> 
>>> dom0 lspci: Address: 00000000fee10000  Data: 40bb (dest ID changed
>>> from 0 (APIC ID of CPU0) to 16 (APIC ID of CPU1), vector unchanged)
>>> 
>>> domU lspci: Address: 00000000fee02000  Data: 40b1 (dest ID changed
>>> from 0 (APIC ID of CPU0) to 2 (APIC ID of CPU1), new vector=177)
>>> 
>>> Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000002,
>>> Vec:219 type=PCI-MSI status=00000010 in-flight=0 domain-list=1:
>>> 79(----) 
>>> 
>>> qemu-dm-dpm.log: pt_msi_update: Update msi with pirq 4f gvec 71
>>>                  gflags 2 pt_msi_update: Update msi with pirq 4f
>>> gvec b1 gflags 2
>> 
>> 
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  1:58                                     ` Zhang, Xiantao
@ 2009-10-22  2:42                                       ` Zhang, Xiantao
  2009-10-22  6:25                                         ` Keir Fraser
  2009-10-22  5:10                                       ` Qing He
  1 sibling, 1 reply; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-22  2:42 UTC (permalink / raw)
  To: Zhang, Xiantao, Cinco, Dante, He, Qing; +Cc: xen-devel, Ian Jackson, Fraser

Zhang, Xiantao wrote:
> Dante,
>    Have you applied the two patches when you did the testing?  
> Without them, we can reproduce the issue you reported, but with them,
> the issue is gone.  The root-cause is that when program MSI, we have
> to mask the MSI interrupt source first, otherwise it may generate
> inconistent interrupts with incorrect destination and right vector or
> incorrect vector and right destination.     
> 
> For exmaple, if the old MSI interrupt info is 0.186 which means the
> destination id is 0 and the vector is 186, but when the IRQ migrates
> to another cpu(e.g.  Cpu 1), the MSI info should be changed to 1.194.
> When you programs MSI info to pci device, if not mask it first, it
> may generate the interrupt as 1.186 or 0.194. Obviously, ther
> interrupts with the info 1.186 and 0.194 doesn't exist, and according
> to the spec, any combination is possible. Since Xen writes addr field
> first, so it is likely to generate 1.186 instead of 0.194, so your
> pci devices may generate interrupt with new destination and old
> vector(1.186).    In my two patches, one is used to fix guest
> interrupt affinity issue(a race exists between guest eoi old vector
> and guest setting new vector), and another one is used to safely
> program MSI info to pci devices to avoid inconsistent interrupts
> generation.             
> 
>> (XEN) traps.c:1626: guest_io_write::pci_conf_write data=0x40ba
> 
> This should be written by dom0(likely to be Qemu).  And if it does
> exist, we may have to prohibit such unsafe writings about MSI in
> Qemu.  

Another issue may also be contributing.  Currently, both Qemu and the hypervisor can program MSI, but Xen lacks a synchronization mechanism between them to avoid the race.  As said in the last mail, Qemu shouldn't be allowed to do unsafe writes of the MSI info; instead, it should go through a hypercall to the hypervisor for MSI programming. Otherwise, Qemu may write stale MSI info to the PCI devices, which leads to these strange issues.   
Keir/Ian
	What's your opinion on this potential issue?  Should we add a lock between them, or just allow only the hypervisor to do the writing?    
Xiantao

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  1:58                                     ` Zhang, Xiantao
  2009-10-22  2:42                                       ` Zhang, Xiantao
@ 2009-10-22  5:10                                       ` Qing He
  2009-10-23  0:10                                         ` Cinco, Dante
  1 sibling, 1 reply; 55+ messages in thread
From: Qing He @ 2009-10-22  5:10 UTC (permalink / raw)
  To: Zhang, Xiantao; +Cc: Cinco, Dante, xen-devel, keir.fraser

[-- Attachment #1: Type: text/plain, Size: 1343 bytes --]

On Thu, 2009-10-22 at 09:58 +0800, Zhang, Xiantao wrote:
> > (XEN) traps.c:1626: guest_io_write::pci_conf_write data=0x40ba 
> 
> This should be written by dom0(likely to be Qemu).  And if it does
> exist, we may have to prohibit such unsafe writings about MSI in
> Qemu.  

Yes, that is the case: the problem happens in Qemu. The algorithm looks
like the following:

    pt_pci_write_config(new_value)
    {
        dev_value = pci_read_block();

        value = msi_write_handler(dev_value, new_value);

        pci_write_block(value);
    }

    msi_write_handler(dev_value, new_value)
    {
        HYPERVISOR_bind_pt_irq(); // updates MSI binding

        return dev_value;   // it decides not to change it
    }

The problem lies here: when bind_pt_irq is called, the real physical
data/address is updated by the hypervisor. No problem was exposed
before because at that time the hypervisor used a universal vector,
so the MSI data/address remained unchanged. But this is no longer the
case now that per-CPU vectors are used: the pci_write_block call is
undesirable in QEmu, because it writes the stale value back into the
register and invalidates any modifications.

Clearly, if QEmu decides to hand the management of these registers
to the hypervisor, it shouldn't touch them again. Here is a patch
to fix this by introducing a no_wb flag. Can you have a try?

Thanks,
Qing

[-- Attachment #2: qemu-msi-no-wb.patch --]
[-- Type: text/x-diff, Size: 2471 bytes --]

diff --git a/hw/pass-through.c b/hw/pass-through.c
index 8d80755..b1a3b09 100644
--- a/hw/pass-through.c
+++ b/hw/pass-through.c
@@ -626,6 +626,7 @@ static struct pt_reg_info_tbl pt_emu_reg_msi_tbl[] = {
         .init_val   = 0x00000000,
         .ro_mask    = 0x00000003,
         .emu_mask   = 0xFFFFFFFF,
+        .no_wb      = 1,
         .init       = pt_common_reg_init,
         .u.dw.read  = pt_long_reg_read,
         .u.dw.write = pt_msgaddr32_reg_write,
@@ -638,6 +639,7 @@ static struct pt_reg_info_tbl pt_emu_reg_msi_tbl[] = {
         .init_val   = 0x00000000,
         .ro_mask    = 0x00000000,
         .emu_mask   = 0xFFFFFFFF,
+        .no_wb      = 1,
         .init       = pt_msgaddr64_reg_init,
         .u.dw.read  = pt_long_reg_read,
         .u.dw.write = pt_msgaddr64_reg_write,
@@ -650,6 +652,7 @@ static struct pt_reg_info_tbl pt_emu_reg_msi_tbl[] = {
         .init_val   = 0x0000,
         .ro_mask    = 0x0000,
         .emu_mask   = 0xFFFF,
+        .no_wb      = 1,
         .init       = pt_msgdata_reg_init,
         .u.w.read   = pt_word_reg_read,
         .u.w.write  = pt_msgdata_reg_write,
@@ -662,6 +665,7 @@ static struct pt_reg_info_tbl pt_emu_reg_msi_tbl[] = {
         .init_val   = 0x0000,
         .ro_mask    = 0x0000,
         .emu_mask   = 0xFFFF,
+        .no_wb      = 1,
         .init       = pt_msgdata_reg_init,
         .u.w.read   = pt_word_reg_read,
         .u.w.write  = pt_msgdata_reg_write,
@@ -1550,10 +1554,12 @@ static void pt_pci_write_config(PCIDevice *d, uint32_t address, uint32_t val,
     val >>= ((address & 3) << 3);
 
 out:
-    ret = pci_write_block(pci_dev, address, (uint8_t *)&val, len);
+    if (!reg->no_wb) {
+        ret = pci_write_block(pci_dev, address, (uint8_t *)&val, len);
 
-    if (!ret)
-        PT_LOG("Error: pci_write_block failed. return value[%d].\n", ret);
+        if (!ret)
+            PT_LOG("Error: pci_write_block failed. return value[%d].\n", ret);
+    }
 
     if (pm_state != NULL && pm_state->flags & PT_FLAG_TRANSITING)
         /* set QEMUTimer */
diff --git a/hw/pass-through.h b/hw/pass-through.h
index 028a03e..3c79885 100644
--- a/hw/pass-through.h
+++ b/hw/pass-through.h
@@ -364,6 +364,8 @@ struct pt_reg_info_tbl {
     uint32_t ro_mask;
     /* reg emulate field mask (ON:emu, OFF:passthrough) */
     uint32_t emu_mask;
+    /* no write back allowed */
+    uint32_t no_wb;
     /* emul reg initialize method */
     conf_reg_init init;
     union {

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  2:42                                       ` Zhang, Xiantao
@ 2009-10-22  6:25                                         ` Keir Fraser
  2009-10-22 21:11                                           ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 55+ messages in thread
From: Keir Fraser @ 2009-10-22  6:25 UTC (permalink / raw)
  To: Zhang, Xiantao, Cinco, Dante, He, Qing; +Cc: xen-devel, Ian Jackson

On 22/10/2009 03:42, "Zhang, Xiantao" <xiantao.zhang@intel.com> wrote:

>> This should be written by dom0(likely to be Qemu).  And if it does
>> exist, we may have to prohibit such unsafe writings about MSI in
>> Qemu.  
> 
> Another issue may exist which leads to the issue.  Currenlty, both Qemu and
> hypervisor can program MSI but Xen lacks synchronization mechnism between them
> to avoid race.  As said in the last mail,  Qemu shouldn't be allowed to do the
> unsafe writing about MSI Info, and insteadly,  it should resort to hypervisor
> through hypercall for MSI programing, otherwise, Qemu may write staled MSI
> info to PCI devices  and leads to the strange issues.
> Keir/Ian
> What's your opinion about the potential issue ?  Maybe we need to add a lock
> between them or just allow hypervisor to do the writing ?

In general, having qemu make pci updates via the cf8/cfc method is clearly
unsafe, and cannot be made safe. I would certainly be happy to see some of
the low-level PCI management pushed into pciback (and/or pci-stub, depending
on whether pciback is to be ported to pv_ops).

 -- Keir

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-20  7:51                             ` Zhang, Xiantao
  2009-10-20 17:26                               ` Cinco, Dante
@ 2009-10-22  6:46                               ` Jan Beulich
  2009-10-22  7:11                                 ` Zhang, Xiantao
  1 sibling, 1 reply; 55+ messages in thread
From: Jan Beulich @ 2009-10-22  6:46 UTC (permalink / raw)
  To: Xiantao Zhang; +Cc: Dante Cinco, xen-devel, Fraser, Qing He

>>> "Zhang, Xiantao" <xiantao.zhang@intel.com> 20.10.09 09:51 >>>
>Attached two patches should fix the issues. For the issue which complains
>"(XEN) do_IRQ: 1.187 No irq handler for vector (irq -1),", I root-caused it.
>Currenlty, when programs MSI address & data, Xen doesn't perform the
>mask/unmask logic to avoid inconsistent interrupt genernation. In this
>case, according to spec, the interrupt generation behavior is undfined,
>and device may generate MSI interrupts with the expected vector and
>incorrect destination ID, so leads to the issue.  The attached two patches
>should address it. 

What about the case of MSI not having a mask bit? Shouldn't movement
(i.e. vector or affinity changes) be disallowed for non-maskable ones?

Jan

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  6:46                               ` Jan Beulich
@ 2009-10-22  7:11                                 ` Zhang, Xiantao
  2009-10-22  7:31                                   ` Jan Beulich
  0 siblings, 1 reply; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-22  7:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Dante Cinco, xen-devel, Fraser, He, Qing

Jan Beulich wrote:
>>>> "Zhang, Xiantao" <xiantao.zhang@intel.com> 20.10.09 09:51 >>>
>> Attached two patches should fix the issues. For the issue which
>> complains "(XEN) do_IRQ: 1.187 No irq handler for vector (irq -1),",
>> I root-caused it. Currenlty, when programs MSI address & data, Xen
>> doesn't perform the mask/unmask logic to avoid inconsistent
>> interrupt genernation. In this case, according to spec, the
>> interrupt generation behavior is undfined, 
>> and device may generate MSI interrupts with the expected vector and
>> incorrect destination ID, so leads to the issue.  The attached two
>> patches should address it.
> 
> What about the case of MSI not having a mask bit? Shouldn't movement
> (i.e. vector or affinity changes) be disallowed for non-maskable ones?

IRQ migration shouldn't depend on the interrupt status (masked/unmasked), and the hypervisor can handle a non-masked IRQ during the migration. 
Xiantao

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  7:11                                 ` Zhang, Xiantao
@ 2009-10-22  7:31                                   ` Jan Beulich
  2009-10-22  8:41                                     ` Zhang, Xiantao
  0 siblings, 1 reply; 55+ messages in thread
From: Jan Beulich @ 2009-10-22  7:31 UTC (permalink / raw)
  To: Xiantao Zhang; +Cc: Dante Cinco, xen-devel, Fraser, Qing He

>>> "Zhang, Xiantao" <xiantao.zhang@intel.com> 22.10.09 09:11 >>>
>Jan Beulich wrote:
>>>>> "Zhang, Xiantao" <xiantao.zhang@intel.com> 20.10.09 09:51 >>>
>>> Attached two patches should fix the issues. For the issue which
>>> complains "(XEN) do_IRQ: 1.187 No irq handler for vector (irq -1),",
>>> I root-caused it. Currenlty, when programs MSI address & data, Xen
>>> doesn't perform the mask/unmask logic to avoid inconsistent
>>> interrupt genernation. In this case, according to spec, the
>>> interrupt generation behavior is undfined, 
>>> and device may generate MSI interrupts with the expected vector and
>>> incorrect destination ID, so leads to the issue.  The attached two
>>> patches should address it.
>> 
>> What about the case of MSI not having a mask bit? Shouldn't movement
>> (i.e. vector or affinity changes) be disallowed for non-maskable ones?
>
>IRQ migration shouldn't depend on the interrupt status(mask/unmask),
>and hyperviosr can handle non-masked irq during the migration. 

Hmm, then I don't understand which case your patch was a fix for: I
understood that it addresses an issue when the affinity of an interrupt
gets changed (requiring a re-write of the address/data pair). If the
hypervisor can deal with it without masking, then why did you add it?

Jan

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus >  4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  7:31                                   ` Jan Beulich
@ 2009-10-22  8:41                                     ` Zhang, Xiantao
  2009-10-22  9:42                                       ` Keir Fraser
  0 siblings, 1 reply; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-22  8:41 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Dante Cinco, xen-devel, Fraser, He, Qing

Jan Beulich wrote:
>>>> "Zhang, Xiantao" <xiantao.zhang@intel.com> 22.10.09 09:11 >>>
>> Jan Beulich wrote:
>>>>>> "Zhang, Xiantao" <xiantao.zhang@intel.com> 20.10.09 09:51 >>>
>>>> Attached two patches should fix the issues. For the issue which
>>>> complains "(XEN) do_IRQ: 1.187 No irq handler for vector (irq
>>>> -1),", I root-caused it. Currenlty, when programs MSI address &
>>>> data, Xen doesn't perform the mask/unmask logic to avoid
>>>> inconsistent interrupt genernation. In this case, according to
>>>> spec, the interrupt generation behavior is undfined,
>>>> and device may generate MSI interrupts with the expected vector and
>>>> incorrect destination ID, so leads to the issue.  The attached two
>>>> patches should address it.
>>> 
>>> What about the case of MSI not having a mask bit? Shouldn't movement
>>> (i.e. vector or affinity changes) be disallowed for non-maskable
>>> ones? 
>> 
>> IRQ migration shouldn't depend on the interrupt status(mask/unmask),
>> and hyperviosr can handle non-masked irq during the migration.
> 
> Hmm, then I don't understand which case your patch was a fix for: I
> understood that it addresses an issue when the affinity of an
> interrupt gets changed (requiring a re-write of the address/data
> pair). If the hypervisor can deal with it without masking, then why
> did you add it?

Hmm, sorry, it seems I misunderstood your question. If the MSI doesn't support a mask bit (clearing the MSI enable bit doesn't help in this case), the issue may still exist. I just checked the Linux side; it seems Linux doesn't perform the mask operation when programming MSI, but I don't know why Linux doesn't have such issues.  Actually, we do see inconsistent interrupt messages from the device without this patch, and after applying the patch the issue is gone.  It may need further investigation why Linux doesn't need the mask operation.   
Xiantao  

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  8:41                                     ` Zhang, Xiantao
@ 2009-10-22  9:42                                       ` Keir Fraser
  2009-10-22 16:32                                         ` Zhang, Xiantao
                                                           ` (2 more replies)
  0 siblings, 3 replies; 55+ messages in thread
From: Keir Fraser @ 2009-10-22  9:42 UTC (permalink / raw)
  To: Zhang, Xiantao, Jan Beulich; +Cc: Dante Cinco, xen-devel, He, Qing

On 22/10/2009 09:41, "Zhang, Xiantao" <xiantao.zhang@intel.com> wrote:

>> Hmm, then I don't understand which case your patch was a fix for: I
>> understood that it addresses an issue when the affinity of an
>> interrupt gets changed (requiring a re-write of the address/data
>> pair). If the hypervisor can deal with it without masking, then why
>> did you add it?
> 
> Hmm, sorry, seems I misunderstood your question. If the msi doesn't support
> mask bit(clearing MSI enable bit doesn't help in this case), the issue may
> still exist. Just checked Linux side, seems it doesn't perform mask operation
> when program MSI, but don't know why Linux hasn't such issues.  Actaully, we
> do see inconsisten interrupt message from the device without this patch, and
> after applying the patch, the issue is gone.  May need further investigation
> why Linux doesn't need the mask operation.

Linux is quite careful about when it will reprogram vector/affinity info
isn't it? Doesn't it mark such an update pending and only flush it through
during next interrupt delivery, or something like that? Do we need some of
the upstream Linux patches for this?

 -- Keir

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  9:42                                       ` Keir Fraser
@ 2009-10-22 16:32                                         ` Zhang, Xiantao
  2009-10-22 16:33                                         ` Cinco, Dante
  2009-10-26 13:02                                         ` Zhang, Xiantao
  2 siblings, 0 replies; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-22 16:32 UTC (permalink / raw)
  To: Keir Fraser, Jan Beulich; +Cc: Dante Cinco, xen-devel, He, Qing

Keir Fraser wrote:
> On 22/10/2009 09:41, "Zhang, Xiantao" <xiantao.zhang@intel.com> wrote:
> 
>>> Hmm, then I don't understand which case your patch was a fix for: I
>>> understood that it addresses an issue when the affinity of an
>>> interrupt gets changed (requiring a re-write of the address/data
>>> pair). If the hypervisor can deal with it without masking, then why
>>> did you add it?
>> 
>> Hmm, sorry, seems I misunderstood your question. If the msi doesn't
>> support mask bit(clearing MSI enable bit doesn't help in this case),
>> the issue may still exist. Just checked Linux side, seems it doesn't
>> perform mask operation when program MSI, but don't know why Linux
>> hasn't such issues.  Actaully, we do see inconsisten interrupt
>> message from the device without this patch, and after applying the
>> patch, the issue is gone.  May need further investigation why Linux
>> doesn't need the mask operation. 
> 
> Linux is quite careful about when it will reprogram vector/affinity
> info isn't it? Doesn't it mark such an update pending and only flush
> it through during next interrupt delivery, or something like that? Do
> we need some of the upstream Linux patches for this?
Yeah, after checking the related logic in Linux, I think we need to port more of its IRQ-migration logic to avoid the races reported in this thread.   To set the affinity for a specific IRQ, the first step is to mark the move as pending, and then do the real setting just before acking the IRQ on the next interrupt delivery; at that point a normal device shouldn't generate new interrupts before the ack.  I will post the backport patch later. 
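
A condensed sketch of that scheme is below, using the field and handler names from the patch posted later in this thread; it is an illustration of the idea, not the actual backport.

    /* Condensed sketch of the deferred-migration scheme described above
     * (the full backport attached later in this thread also handles
     * level-triggered IO-APIC interrupts and more corner cases). */

    /* 1. Changing affinity only records the request on the descriptor. */
    static void irq_set_affinity_deferred(struct irq_desc *desc, cpumask_t mask)
    {
        desc->status |= IRQ_MOVE_PENDING;   /* a move is wanted ... */
        desc->pending_mask = mask;          /* ... to this CPU set */
    }

    /* 2. The ack path applies it, with the source briefly disabled so the
     *    MSI address/data pair is never observed half-updated. */
    static void ack_and_apply_pending_move(struct irq_desc *desc, int irq)
    {
        if ( desc->status & IRQ_MOVE_PENDING )
        {
            desc->handler->disable(irq);
            desc->handler->set_affinity(irq, desc->pending_mask);  /* rewrite addr+data */
            desc->handler->enable(irq);
            desc->status &= ~IRQ_MOVE_PENDING;
        }
        ack_APIC_irq();
    }
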
Xiantao

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus >  4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  9:42                                       ` Keir Fraser
  2009-10-22 16:32                                         ` Zhang, Xiantao
@ 2009-10-22 16:33                                         ` Cinco, Dante
  2009-10-23  1:06                                           ` Zhang, Xiantao
  2009-10-26 13:02                                         ` Zhang, Xiantao
  2 siblings, 1 reply; 55+ messages in thread
From: Cinco, Dante @ 2009-10-22 16:33 UTC (permalink / raw)
  To: Keir Fraser, Zhang, Xiantao, Jan Beulich; +Cc: xen-devel, He, Qing

Xiantao,

I'm sorry, I forgot to mention that I did apply your two patches, but they didn't have any effect (interrupts are still lost after changing smp_affinity, and the "No irq handler for vector" message still appears). I added a dprintk in msi_set_mask_bit() and realized that MSI on my device does not have a mask bit (MSI-X does). My PCI device uses MSI, not MSI-X. I placed my dprintk inside the condition below and it never triggered.

    switch (entry->msi_attrib.type) {
    case PCI_CAP_ID_MSI:
        if (entry->msi_attrib.maskbit) {

While debugging this problem, I thought about the potential problem of an interrupt firing between the writes of the MSI message address and the MSI message data. I noticed that pci_conf_write() uses spin_lock_irqsave() to disable interrupts before issuing the "out" instruction, but the writes of the address and data are two separate pci_conf_write() calls. To me, it would be safer to write the address and data in a single call preceded by spin_lock_irqsave(). That way, by the time interrupts are re-enabled, both the address and the data have been updated.
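
A rough sketch of that suggestion, with a hypothetical helper and lock name (neither exists in Xen as written):

    /* Hypothetical combined helper: write the MSI address and data
     * back-to-back inside a single interrupts-off critical section,
     * as suggested above. */
    static DEFINE_SPINLOCK(msi_pcicfg_lock);

    static void pci_conf_write_msi(u8 bus, u8 slot, u8 func, unsigned int pos,
                                   u32 address_lo, u16 data)
    {
        unsigned long flags;

        spin_lock_irqsave(&msi_pcicfg_lock, flags);
        pci_conf_write32(bus, slot, func, msi_lower_address_reg(pos), address_lo);
        pci_conf_write16(bus, slot, func, msi_data_reg(pos, 0), data);
        spin_unlock_irqrestore(&msi_pcicfg_lock, flags);
    }

Note this only serializes writers on the local CPU; the device itself can still raise an MSI between the two config-space writes, which is why the device-side masking (or the deferred reprogramming discussed elsewhere in the thread) is still relevant.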

Dante

-----Original Message-----
From: Keir Fraser [mailto:keir.fraser@eu.citrix.com] 
Sent: Thursday, October 22, 2009 2:42 AM
To: Zhang, Xiantao; Jan Beulich
Cc: He, Qing; xen-devel@lists.xensource.com; Cinco, Dante
Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

On 22/10/2009 09:41, "Zhang, Xiantao" <xiantao.zhang@intel.com> wrote:

>> Hmm, then I don't understand which case your patch was a fix for: I 
>> understood that it addresses an issue when the affinity of an 
>> interrupt gets changed (requiring a re-write of the address/data 
>> pair). If the hypervisor can deal with it without masking, then why 
>> did you add it?
> 
> Hmm, sorry, seems I misunderstood your question. If the msi doesn't 
> support mask bit(clearing MSI enable bit doesn't help in this case), 
> the issue may still exist. Just checked Linux side, seems it doesn't 
> perform mask operation when program MSI, but don't know why Linux 
> hasn't such issues.  Actaully, we do see inconsisten interrupt message 
> from the device without this patch, and after applying the patch, the 
> issue is gone.  May need further investigation why Linux doesn't need the mask operation.

Linux is quite careful about when it will reprogram vector/affinity info isn't it? Doesn't it mark such an update pending and only flush it through during next interrupt delivery, or something like that? Do we need some of the upstream Linux patches for this?

 -- Keir

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  6:25                                         ` Keir Fraser
@ 2009-10-22 21:11                                           ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 55+ messages in thread
From: Jeremy Fitzhardinge @ 2009-10-22 21:11 UTC (permalink / raw)
  To: Keir Fraser
  Cc: xen-devel, He, Qing, Konrad Rzeszutek Wilk, Ian Jackson, Cinco,
	Dante, Zhang, Xiantao

On 10/21/09 23:25, Keir Fraser wrote:
> In general, having qemu make pci updates via the cf8/cfc method is clearly
> unsafe, and cannot be made safe. I would certainly be happy to see some of
> the low-level PCI management pushed into pciback (and/or pci-stub, depending
> on whether pciback is to be ported to pv_ops).
>   

I've got Konrad's forward-port of pciback in xen/master at the moment.

    J

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  5:10                                       ` Qing He
@ 2009-10-23  0:10                                         ` Cinco, Dante
  0 siblings, 0 replies; 55+ messages in thread
From: Cinco, Dante @ 2009-10-23  0:10 UTC (permalink / raw)
  To: Qing He, Zhang, Xiantao; +Cc: xen-devel, keir.fraser

Qing,

Your patch worked. It suppressed the extra write that previously overwrote the MSI message data with the old vector. There are no more "No irq handler for vector" messages, and the interrupts were successfully migrated to the new CPU. I still experienced a hang on both domU and dom0 when I changed the smp_affinity of all 4 PCI devices (I have a 4-function PCI device) simultaneously (the "echo <new_smp_affinity> > /proc/irq/<irq#>/smp_affinity" commands are in a shell script), but I didn't get a chance to pursue this today.

Dante

-----Original Message-----
From: Qing He [mailto:qing.he@intel.com] 
Sent: Wednesday, October 21, 2009 10:11 PM
To: Zhang, Xiantao
Cc: Cinco, Dante; xen-devel@lists.xensource.com; keir.fraser@eu.citrix.com
Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

On Thu, 2009-10-22 at 09:58 +0800, Zhang, Xiantao wrote:
> > (XEN) traps.c:1626: guest_io_write::pci_conf_write data=0x40ba
> 
> This should be written by dom0(likely to be Qemu).  And if it does 
> exist, we may have to prohibit such unsafe writings about MSI in Qemu.

Yes, it is the case, the problem happens in Qemu, the algorithm looks like below:

    pt_pci_write_config(new_value)
    {
        dev_value = pci_read_block();

        value = msi_write_handler(dev_value, new_value);

        pci_write_block(value);

    }

    msi_write_handler(dev_value, new_value)
    {
        HYPERVISOR_bind_pt_irq(); // updates MSI binding

	return dev_value;   // it decides not to change it
    }

The problem lies here, when bind_pt_irq is called, the real physical data/address is updated by the hypervisor. There were no problem exposed before because at that time hypervisor uses a universal vector , the data/address of msi remains unchanged. But this isn't the case when per-CPU vector is there, the pci_write_block is undesirable in QEmu now, it writes stale value back into the register and invalidate any modifications.

Clearly, if QEmu decides to hand the management of these registers to the hypervisor, it shouldn't touch them again. Here is a patch to fix this by introducing a no_wb flag. Can you have a try?

Thanks,
Qing

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus >  4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22 16:33                                         ` Cinco, Dante
@ 2009-10-23  1:06                                           ` Zhang, Xiantao
  0 siblings, 0 replies; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-23  1:06 UTC (permalink / raw)
  To: Cinco, Dante, Keir Fraser, Jan Beulich; +Cc: xen-devel, He, Qing

Dante, 
   If the device doesn't support the MSI mask bit, the second patch should have no effect for it. I am working on backporting more IRQ migration logic from Linux, which should ensure the address and vector are both written to the device before new interrupts fire.   But as I mentioned before, if you want to solve the guest affinity setting issue, you still have to apply the first patch I sent out (fix-irq-affinity-msi3.patch). :-)
Xiantao

Cinco, Dante wrote:
> Xiantao,
> 
> I'm sorry I forgot to mention that I did apply your two patches but
> it didn't have any effect (interrupts still lost after changing
> smp_affinity and "No handler for irq vector" message). I added a
> dprintk in msi_set_mask_bit() and realized that MSI does not have a
> mask bit (MSIX does). My PCI device uses MSI not MSIX. I placed my
> dprintk inside the condition below and it never triggered.     
> 
>     switch (entry->msi_attrib.type) {
>     case PCI_CAP_ID_MSI:
>         if (entry->msi_attrib.maskbit) {
> 
> While debugging this problem, I thought about the potential problem
> of an interrupt firing between the writes for the MSI message address
> and MSI message data. I noticed that pci_conf_write() uses
> spin_lock_irqsave() to disable interrupts before issuing the "out"
> instruction but the writes for the address and data are two separate
> pci_conf_write() calls. To me, it would be safer to write the address
> and data in a single call and preceded by spin_lock_irqsave(). This
> way, when the interrupts are enabled, the address and data have both
> been updated.        
> 
> Dante
> 
> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: Thursday, October 22, 2009 2:42 AM
> To: Zhang, Xiantao; Jan Beulich
> Cc: He, Qing; xen-devel@lists.xensource.com; Cinco, Dante
> Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus
> > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem) 
> 
> On 22/10/2009 09:41, "Zhang, Xiantao" <xiantao.zhang@intel.com> wrote:
> 
>>> Hmm, then I don't understand which case your patch was a fix for: I
>>> understood that it addresses an issue when the affinity of an
>>> interrupt gets changed (requiring a re-write of the address/data
>>> pair). If the hypervisor can deal with it without masking, then why
>>> did you add it?
>> 
>> Hmm, sorry, seems I misunderstood your question. If the msi doesn't
>> support mask bit(clearing MSI enable bit doesn't help in this case),
>> the issue may still exist. Just checked Linux side, seems it doesn't
>> perform mask operation when program MSI, but don't know why Linux
>> hasn't such issues.  Actaully, we do see inconsisten interrupt
>> message 
>> from the device without this patch, and after applying the patch, the
>> issue is gone.  May need further investigation why Linux doesn't
>> need the mask operation. 
> 
> Linux is quite careful about when it will reprogram vector/affinity
> info isn't it? Doesn't it mark such an update pending and only flush
> it through during next interrupt delivery, or something like that? Do
> we need some of the upstream Linux patches for this?   
> 
>  -- Keir

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-22  9:42                                       ` Keir Fraser
  2009-10-22 16:32                                         ` Zhang, Xiantao
  2009-10-22 16:33                                         ` Cinco, Dante
@ 2009-10-26 13:02                                         ` Zhang, Xiantao
  2009-10-26 13:34                                           ` Keir Fraser
  2 siblings, 1 reply; 55+ messages in thread
From: Zhang, Xiantao @ 2009-10-26 13:02 UTC (permalink / raw)
  To: Keir Fraser, Jan Beulich; +Cc: Dante Cinco, xen-devel, He, Qing

[-- Attachment #1: Type: text/plain, Size: 1784 bytes --]

Keir, 
   The attached patch (irq-migration-enhancement.patch) enhances the IRQ migration logic; most of the logic is ported from Linux and tailored for Xen.  Please apply it; it should eliminate the race between writing the MSI vector and address.  In addition, to fix the guest interrupt affinity issue, we also need to apply the patch fix-irq-affinity-msi3.patch. 
Xiantao


Keir Fraser wrote:
> On 22/10/2009 09:41, "Zhang, Xiantao" <xiantao.zhang@intel.com> wrote:
> 
>>> Hmm, then I don't understand which case your patch was a fix for: I
>>> understood that it addresses an issue when the affinity of an
>>> interrupt gets changed (requiring a re-write of the address/data
>>> pair). If the hypervisor can deal with it without masking, then why
>>> did you add it?
>> 
>> Hmm, sorry, seems I misunderstood your question. If the msi doesn't
>> support mask bit(clearing MSI enable bit doesn't help in this case),
>> the issue may still exist. Just checked Linux side, seems it doesn't
>> perform mask operation when program MSI, but don't know why Linux
>> hasn't such issues.  Actaully, we do see inconsisten interrupt
>> message from the device without this patch, and after applying the
>> patch, the issue is gone.  May need further investigation why Linux
>> doesn't need the mask operation. 
> 
> Linux is quite careful about when it will reprogram vector/affinity
> info isn't it? Doesn't it mark such an update pending and only flush
> it through during next interrupt delivery, or something like that? Do
> we need some of the upstream Linux patches for this?
> 
>  -- Keir
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel


[-- Attachment #2: irq-migration-enhancement.patch --]
[-- Type: application/octet-stream, Size: 9008 bytes --]

x86: IRQ Migration logic enhancement. 

To program the MSI address/vector safely, delay the IRQ migration
operation until just before acking the next interrupt. This avoids
inconsistent interrupt generation due to non-atomic writes of the
MSI address and data registers.

The logic is ported from Linux and tailored for Xen.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>

diff -r dcfa6155b692 -r 1f960f20a33f xen/arch/x86/hpet.c
--- a/xen/arch/x86/hpet.c	Tue Oct 20 15:20:07 2009 +0800
+++ b/xen/arch/x86/hpet.c	Mon Oct 26 15:54:00 2009 +0800
@@ -289,6 +289,7 @@ static void hpet_msi_ack(unsigned int ir
     struct irq_desc *desc = irq_to_desc(irq);
 
     irq_complete_move(&desc);
+    move_native_irq(irq);
     ack_APIC_irq();
 }
 
diff -r dcfa6155b692 -r 1f960f20a33f xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c	Tue Oct 20 15:20:07 2009 +0800
+++ b/xen/arch/x86/hvm/hvm.c	Mon Oct 26 15:54:00 2009 +0800
@@ -243,7 +243,7 @@ void hvm_migrate_pirqs(struct vcpu *v)
             continue;
         irq = desc - irq_desc;
         ASSERT(MSI_IRQ(irq));
-        desc->handler->set_affinity(irq, *cpumask_of(v->processor));
+        irq_set_affinity(irq, *cpumask_of(v->processor));
         spin_unlock_irq(&desc->lock);
     }
     spin_unlock(&d->event_lock);
diff -r dcfa6155b692 -r 1f960f20a33f xen/arch/x86/io_apic.c
--- a/xen/arch/x86/io_apic.c	Tue Oct 20 15:20:07 2009 +0800
+++ b/xen/arch/x86/io_apic.c	Mon Oct 26 15:54:00 2009 +0800
@@ -1379,6 +1379,7 @@ static void ack_edge_ioapic_irq(unsigned
     struct irq_desc *desc = irq_to_desc(irq);
     
     irq_complete_move(&desc);
+    move_native_irq(irq);
 
     if ((desc->status & (IRQ_PENDING | IRQ_DISABLED))
         == (IRQ_PENDING | IRQ_DISABLED))
@@ -1418,6 +1419,38 @@ static void setup_ioapic_ack(char *s)
         printk("Unknown ioapic_ack value specified: '%s'\n", s);
 }
 custom_param("ioapic_ack", setup_ioapic_ack);
+
+static bool_t io_apic_level_ack_pending(unsigned int irq)
+{
+    struct irq_pin_list *entry;
+    unsigned long flags;
+
+    spin_lock_irqsave(&ioapic_lock, flags);
+    entry = &irq_2_pin[irq];
+    for (;;) {
+        unsigned int reg;
+        int pin;
+
+        if (!entry)
+            break;
+
+        pin = entry->pin;
+        if (pin == -1)
+            continue;
+        reg = io_apic_read(entry->apic, 0x10 + pin*2);
+        /* Is the remote IRR bit set? */
+        if (reg & IO_APIC_REDIR_REMOTE_IRR) {
+            spin_unlock_irqrestore(&ioapic_lock, flags);
+            return 1;
+        }
+        if (!entry->next)
+            break;
+        entry = irq_2_pin + entry->next;
+    }
+    spin_unlock_irqrestore(&ioapic_lock, flags);
+
+    return 0;
+}
 
 static void mask_and_ack_level_ioapic_irq (unsigned int irq)
 {
@@ -1456,6 +1489,10 @@ static void mask_and_ack_level_ioapic_ir
     v = apic_read(APIC_TMR + ((i & ~0x1f) >> 1));
 
     ack_APIC_irq();
+    
+    if ((irq_desc[irq].status & IRQ_MOVE_PENDING) &&
+       !io_apic_level_ack_pending(irq))
+        move_native_irq(irq);
 
     if (!(v & (1 << (i & 0x1f)))) {
         atomic_inc(&irq_mis_count);
@@ -1503,6 +1540,10 @@ static void end_level_ioapic_irq (unsign
 
     ack_APIC_irq();
 
+    if ((irq_desc[irq].status & IRQ_MOVE_PENDING) &&
+            !io_apic_level_ack_pending(irq))
+        move_native_irq(irq);
+
     if (!(v & (1 << (i & 0x1f)))) {
         atomic_inc(&irq_mis_count);
         spin_lock(&ioapic_lock);
@@ -1564,6 +1605,7 @@ static void ack_msi_irq(unsigned int irq
     struct irq_desc *desc = irq_to_desc(irq);
 
     irq_complete_move(&desc);
+    move_native_irq(irq);
 
     if ( msi_maskable_irq(desc->msi_desc) )
         ack_APIC_irq(); /* ACKTYPE_NONE */
diff -r dcfa6155b692 -r 1f960f20a33f xen/arch/x86/irq.c
--- a/xen/arch/x86/irq.c	Tue Oct 20 15:20:07 2009 +0800
+++ b/xen/arch/x86/irq.c	Mon Oct 26 15:54:00 2009 +0800
@@ -450,6 +450,67 @@ void __setup_vector_irq(int cpu)
         vector = irq_to_vector(irq);
         per_cpu(vector_irq, cpu)[vector] = irq;
     }
+}
+
+void move_masked_irq(int irq)
+{
+	struct irq_desc *desc = irq_to_desc(irq);
+
+	if (likely(!(desc->status & IRQ_MOVE_PENDING)))
+		return;
+    
+    desc->status &= ~IRQ_MOVE_PENDING;
+
+    if (unlikely(cpus_empty(desc->pending_mask)))
+        return;
+
+    if (!desc->handler->set_affinity)
+        return;
+
+	/*
+	 * If there was a valid mask to work with, please
+	 * do the disable, re-program, enable sequence.
+	 * This is *not* particularly important for level triggered
+	 * but in a edge trigger case, we might be setting rte
+	 * when an active trigger is comming in. This could
+	 * cause some ioapics to mal-function.
+	 * Being paranoid i guess!
+	 *
+	 * For correct operation this depends on the caller
+	 * masking the irqs.
+	 */
+    if (likely(cpus_intersects(desc->pending_mask, cpu_online_map)))
+        desc->handler->set_affinity(irq, desc->pending_mask);
+
+	cpus_clear(desc->pending_mask);
+}
+
+void move_native_irq(int irq)
+{
+    struct irq_desc *desc = irq_to_desc(irq);
+
+    if (likely(!(desc->status & IRQ_MOVE_PENDING)))
+        return;
+
+    if (unlikely(desc->status & IRQ_DISABLED))
+        return;
+
+    desc->handler->disable(irq);
+    move_masked_irq(irq);
+    desc->handler->enable(irq);
+}
+
+/* For re-setting irq interrupt affinity for specific irq */
+void irq_set_affinity(int irq, cpumask_t mask)
+{
+    struct irq_desc *desc = irq_to_desc(irq);
+    
+    if (!desc->handler->set_affinity)
+        return;
+    
+    ASSERT(spin_is_locked(&desc->lock));
+    desc->status |= IRQ_MOVE_PENDING;
+    cpus_copy(desc->pending_mask, mask);
 }
 
 asmlinkage void do_IRQ(struct cpu_user_regs *regs)
diff -r dcfa6155b692 -r 1f960f20a33f xen/arch/x86/msi.c
--- a/xen/arch/x86/msi.c	Tue Oct 20 15:20:07 2009 +0800
+++ b/xen/arch/x86/msi.c	Mon Oct 26 15:54:00 2009 +0800
@@ -231,7 +231,6 @@ static void write_msi_msg(struct msi_des
         u8 slot = PCI_SLOT(dev->devfn);
         u8 func = PCI_FUNC(dev->devfn);
 
-		mask_msi_irq(entry->irq);
         pci_conf_write32(bus, slot, func, msi_lower_address_reg(pos),
                          msg->address_lo);
         if ( entry->msi_attrib.is_64 )
@@ -244,7 +243,6 @@ static void write_msi_msg(struct msi_des
         else
             pci_conf_write16(bus, slot, func, msi_data_reg(pos, 0),
                              msg->data);
-		unmask_msi_irq(entry->irq);
         break;
     }
     case PCI_CAP_ID_MSIX:
@@ -252,13 +250,11 @@ static void write_msi_msg(struct msi_des
         void __iomem *base;
         base = entry->mask_base;
 
-		mask_msi_irq(entry->irq);
         writel(msg->address_lo,
                base + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
         writel(msg->address_hi,
                base + PCI_MSIX_ENTRY_UPPER_ADDR_OFFSET);
         writel(msg->data, base + PCI_MSIX_ENTRY_DATA_OFFSET);
-		unmask_msi_irq(entry->irq);
         break;
     }
     default:
diff -r dcfa6155b692 -r 1f960f20a33f xen/include/asm-x86/io_apic.h
--- a/xen/include/asm-x86/io_apic.h	Tue Oct 20 15:20:07 2009 +0800
+++ b/xen/include/asm-x86/io_apic.h	Mon Oct 26 15:54:00 2009 +0800
@@ -21,6 +21,15 @@
 		+ (mp_ioapics[idx].mpc_apicaddr & ~PAGE_MASK)))
 
 #define IO_APIC_ID(idx) (mp_ioapics[idx].mpc_apicid)
+
+/* I/O Unit Redirection Table */
+#define IO_APIC_REDIR_VECTOR_MASK   0x000FF
+#define IO_APIC_REDIR_DEST_LOGICAL  0x00800
+#define IO_APIC_REDIR_DEST_PHYSICAL 0x00000
+#define IO_APIC_REDIR_SEND_PENDING  (1 << 12)
+#define IO_APIC_REDIR_REMOTE_IRR    (1 << 14)
+#define IO_APIC_REDIR_LEVEL_TRIGGER (1 << 15)
+#define IO_APIC_REDIR_MASKED        (1 << 16)
 
 /*
  * The structure of the IO-APIC:
diff -r dcfa6155b692 -r 1f960f20a33f xen/include/asm-x86/irq.h
--- a/xen/include/asm-x86/irq.h	Tue Oct 20 15:20:07 2009 +0800
+++ b/xen/include/asm-x86/irq.h	Mon Oct 26 15:54:00 2009 +0800
@@ -138,6 +138,12 @@ int __assign_irq_vector(int irq, struct 
 
 int bind_irq_vector(int irq, int vector, cpumask_t domain);
 
+void move_native_irq(int irq);
+
+void move_masked_irq(int irq);
+
+void irq_set_affinity(int irq, cpumask_t mask);
+
 #define domain_pirq_to_irq(d, pirq) ((d)->arch.pirq_irq[pirq])
 #define domain_irq_to_pirq(d, irq) ((d)->arch.irq_pirq[irq])
 
diff -r dcfa6155b692 -r 1f960f20a33f xen/include/xen/irq.h
--- a/xen/include/xen/irq.h	Tue Oct 20 15:20:07 2009 +0800
+++ b/xen/include/xen/irq.h	Mon Oct 26 15:54:00 2009 +0800
@@ -24,6 +24,7 @@ struct irqaction {
 #define IRQ_REPLAY	8	/* IRQ has been replayed but not acked yet */
 #define IRQ_GUEST       16      /* IRQ is handled by guest OS(es) */
 #define IRQ_GUEST_EOI_PENDING 32 /* IRQ was disabled, pending a guest EOI */
+#define IRQ_MOVE_PENDING      64  /* IRQ is migrating to another CPUs */
 #define IRQ_PER_CPU     256     /* IRQ is per CPU */
 
 /* Special IRQ numbers. */
@@ -75,6 +76,7 @@ typedef struct irq_desc {
     int irq;
     spinlock_t lock;
     cpumask_t affinity;
+    cpumask_t pending_mask;  /* IRQ migration pending mask */
 
     /* irq ratelimit */
     s_time_t rl_quantum_start;

[-- Attachment #3: fix-irq-affinity-msi3.patch --]
[-- Type: application/octet-stream, Size: 5997 bytes --]

# HG changeset patch
# User Xiantao Zhang <xiantao.zhang@intel.com>
# Date 1255684803 -28800
# Node ID d1b3cb3fe044285093c923761d4bc40c7af4d199
# Parent  2eba302831c4534ac40283491f887263c7197b4a
x86: vMSI: Fix msi irq affinity issue for hvm guest.

There is a race between the guest setting a new vector and doing EOI on the old vector.
If the guest sets the new vector before it has EOIed the old one, then when the guest
does the EOI the hypervisor may fail to find the related pirq, miss the EOI of the real
vector, and hang the system.  We may need to add a timer for each pirq interrupt
source to avoid the host hang, but that is another topic and will be addressed later.

Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>

diff -r 2eba302831c4 xen/arch/x86/hvm/vmsi.c
--- a/xen/arch/x86/hvm/vmsi.c	Thu Oct 15 16:49:21 2009 +0100
+++ b/xen/arch/x86/hvm/vmsi.c	Fri Oct 16 22:10:36 2009 +0800
@@ -92,8 +92,11 @@ int vmsi_deliver(struct domain *d, int p
     case dest_LowestPrio:
     {
         target = vlapic_lowest_prio(d, NULL, 0, dest, dest_mode);
-        if ( target != NULL )
+        if ( target != NULL ) {
             vmsi_inj_irq(d, target, vector, trig_mode, delivery_mode);
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gvec =
+                                    hvm_irq_dpci->mirq[pirq].gmsi.gvec;
+        }
         else
             HVM_DBG_LOG(DBG_LEVEL_IOAPIC, "null round robin: "
                         "vector=%x delivery_mode=%x\n",
@@ -106,9 +109,12 @@ int vmsi_deliver(struct domain *d, int p
     {
         for_each_vcpu ( d, v )
             if ( vlapic_match_dest(vcpu_vlapic(v), NULL,
-                                   0, dest, dest_mode) )
+                                   0, dest, dest_mode) ) {
                 vmsi_inj_irq(d, vcpu_vlapic(v),
                              vector, trig_mode, delivery_mode);
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gvec =
+                                    hvm_irq_dpci->mirq[pirq].gmsi.gvec;
+            }
         break;
     }
 
diff -r 2eba302831c4 xen/drivers/passthrough/io.c
--- a/xen/drivers/passthrough/io.c	Thu Oct 15 16:49:21 2009 +0100
+++ b/xen/drivers/passthrough/io.c	Fri Oct 16 21:54:55 2009 +0800
@@ -164,7 +164,9 @@ int pt_irq_create_bind_vtd(
         {
             hvm_irq_dpci->mirq[pirq].flags = HVM_IRQ_DPCI_MACH_MSI |
                                              HVM_IRQ_DPCI_GUEST_MSI;
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gvec = pt_irq_bind->u.msi.gvec;
             hvm_irq_dpci->mirq[pirq].gmsi.gvec = pt_irq_bind->u.msi.gvec;
+            hvm_irq_dpci->mirq[pirq].gmsi.old_gflags = pt_irq_bind->u.msi.gflags;
             hvm_irq_dpci->mirq[pirq].gmsi.gflags = pt_irq_bind->u.msi.gflags;
             /* bind after hvm_irq_dpci is setup to avoid race with irq handler*/
             rc = pirq_guest_bind(d->vcpu[0], pirq, 0);
@@ -178,6 +180,8 @@ int pt_irq_create_bind_vtd(
             {
                 hvm_irq_dpci->mirq[pirq].gmsi.gflags = 0;
                 hvm_irq_dpci->mirq[pirq].gmsi.gvec = 0;
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gvec = 0;
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gflags = 0;
                 hvm_irq_dpci->mirq[pirq].flags = 0;
                 clear_bit(pirq, hvm_irq_dpci->mapping);
                 spin_unlock(&d->event_lock);
@@ -195,8 +199,14 @@ int pt_irq_create_bind_vtd(
             }
  
             /* if pirq is already mapped as vmsi, update the guest data/addr */
-            hvm_irq_dpci->mirq[pirq].gmsi.gvec = pt_irq_bind->u.msi.gvec;
-            hvm_irq_dpci->mirq[pirq].gmsi.gflags = pt_irq_bind->u.msi.gflags;
+            if ( hvm_irq_dpci->mirq[pirq].gmsi.gvec != pt_irq_bind->u.msi.gvec ) {
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gvec =
+                                    hvm_irq_dpci->mirq[pirq].gmsi.gvec;
+                hvm_irq_dpci->mirq[pirq].gmsi.old_gflags =
+                                    hvm_irq_dpci->mirq[pirq].gmsi.gflags;
+                hvm_irq_dpci->mirq[pirq].gmsi.gvec = pt_irq_bind->u.msi.gvec;
+                hvm_irq_dpci->mirq[pirq].gmsi.gflags = pt_irq_bind->u.msi.gflags;
+            }
         }
         /* Caculate dest_vcpu_id for MSI-type pirq migration */
         dest = hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DEST_ID_MASK;
@@ -424,14 +434,21 @@ void hvm_dpci_msi_eoi(struct domain *d, 
           pirq = find_next_bit(hvm_irq_dpci->mapping, d->nr_pirqs, pirq + 1) )
     {
         if ( (!(hvm_irq_dpci->mirq[pirq].flags & HVM_IRQ_DPCI_MACH_MSI)) ||
-                (hvm_irq_dpci->mirq[pirq].gmsi.gvec != vector) )
+                (hvm_irq_dpci->mirq[pirq].gmsi.gvec != vector &&
+                 hvm_irq_dpci->mirq[pirq].gmsi.old_gvec != vector) )
             continue;
 
-        dest = hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DEST_ID_MASK;
-        dest_mode = !!(hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DM_MASK);
+        if ( hvm_irq_dpci->mirq[pirq].gmsi.gvec == vector ) {
+            dest = hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DEST_ID_MASK;
+            dest_mode = !!(hvm_irq_dpci->mirq[pirq].gmsi.gflags & VMSI_DM_MASK);
+        } else {
+            dest = hvm_irq_dpci->mirq[pirq].gmsi.old_gflags & VMSI_DEST_ID_MASK;
+            dest_mode = !!(hvm_irq_dpci->mirq[pirq].gmsi.old_gflags & VMSI_DM_MASK);
+        }
         if ( vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest, dest_mode) )
             break;
     }
+
     if ( pirq < d->nr_pirqs )
         __msi_pirq_eoi(d, pirq);
     spin_unlock(&d->event_lock);
diff -r 2eba302831c4 xen/include/xen/hvm/irq.h
--- a/xen/include/xen/hvm/irq.h	Thu Oct 15 16:49:21 2009 +0100
+++ b/xen/include/xen/hvm/irq.h	Fri Oct 16 21:48:04 2009 +0800
@@ -58,8 +58,10 @@ struct dev_intx_gsi_link {
 #define GLFAGS_SHIFT_TRG_MODE       15
 
 struct hvm_gmsi_info {
-    uint32_t gvec;
+    uint16_t gvec;
+    uint16_t old_gvec;
     uint32_t gflags;
+    uint32_t old_gflags;
     int dest_vcpu_id; /* -1 :multi-dest, non-negative: dest_vcpu_id */
 };
 

[-- Attachment #4: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-26 13:02                                         ` Zhang, Xiantao
@ 2009-10-26 13:34                                           ` Keir Fraser
  0 siblings, 0 replies; 55+ messages in thread
From: Keir Fraser @ 2009-10-26 13:34 UTC (permalink / raw)
  To: Zhang, Xiantao, Jan Beulich; +Cc: Dante Cinco, xen-devel, He, Qing

Thanks, applied as c/s 20370. I think fix-irq-affinity-msi3.patch is already
applied as c/s 20334.

 -- Keir

On 26/10/2009 13:02, "Zhang, Xiantao" <xiantao.zhang@intel.com> wrote:

> Keir, 
>    The attached patch (irq-migration-enhancement.patch) aims to enhance the irq
> migration logic; most of the logic is ported from Linux and tailored for Xen.
> Please apply it; it should eliminate the race between writing an MSI's vector
> and address. In addition, to fix the guest's interrupt affinity issue, we also need
> to apply the patch fix-irq-affinity-msi3.patch.
> Xiantao
> 
> 
> Keir Fraser wrote:
>> On 22/10/2009 09:41, "Zhang, Xiantao" <xiantao.zhang@intel.com> wrote:
>> 
>>>> Hmm, then I don't understand which case your patch was a fix for: I
>>>> understood that it addresses an issue when the affinity of an
>>>> interrupt gets changed (requiring a re-write of the address/data
>>>> pair). If the hypervisor can deal with it without masking, then why
>>>> did you add it?
>>> 
>>> Hmm, sorry, it seems I misunderstood your question. If the MSI doesn't
>>> support a mask bit (clearing the MSI enable bit doesn't help in this case),
>>> the issue may still exist. I just checked the Linux side; it doesn't seem
>>> to perform the mask operation when programming MSI, but I don't know why
>>> Linux doesn't have such issues.  Actually, we do see inconsistent interrupt
>>> messages from the device without this patch, and after applying the
>>> patch the issue is gone.  We may need to investigate further why Linux
>>> doesn't need the mask operation.
>> 
>> Linux is quite careful about when it will reprogram vector/affinity
>> info, isn't it? Doesn't it mark such an update pending and only flush
>> it through during the next interrupt delivery, or something like that? Do
>> we need some of the upstream Linux patches for this?
>> 
>>  -- Keir
>> 
>> 
>> 
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-16  0:09                   ` Konrad Rzeszutek Wilk
@ 2009-10-16  1:40                     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 55+ messages in thread
From: Konrad Rzeszutek Wilk @ 2009-10-16  1:40 UTC (permalink / raw)
  To: Cinco, Dante; +Cc: xen-devel, Keir Fraser, Qing He, xiantao.zhang

On Thu, Oct 15, 2009 at 08:09:42PM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Oct 14, 2009 at 01:54:33PM -0600, Cinco, Dante wrote:
> > I switched over to Xen 3.5-unstable (changeset 20303) and pv_ops dom0 2.6.31.1 hoping that this would resolve the IRQ SMP affinity problem. I had to use pci-stub to hide the PCI devices since pciback wasn't working. With vcpus=16 (APIC routing is physical flat), the interrupts were working in domU and being routed to CPU0 with the default smp_affinity (ffff), but changing it to any 16-bit one-hot value, or even setting it to the same default value, resulted in a complete loss of interrupts (even on the devices whose smp_affinity I hadn't changed). With vcpus=4 (APIC routing is logical flat), I can see the interrupts being load balanced across all CPUs, but as soon as I changed smp_affinity to any value, the interrupts stopped. This used to work reliably with the non-pv_ops kernel. I attached the logs in case anyone wants to take a look.
> > 
> > I did see the MSI message address/data change in both domU and dom0 (using "lspci -vv"):
> > 
> > vcpus=16:
> > 
> > domU MSI message address/data with default smp_affinity: Address: 00000000fee00000  Data: 40a9
> > domU MSI message address/data after smp_affinity=0010:   Address: 00000000fee08000  Data: 40b1 (8 is APIC ID of CPU4)
> 
> What does Xen tell you (hit Ctrl-A three times and then 'z')? Specifically, look for vector 169 (a9) and 177 (b1).
> Do those values match what you see in domU and dom0? Mainly, that 177 has a dest_id of 8.
> Oh, and also check the guest interrupt information, to see if those values match.

N/m. I was thinking that maybe your IOAPIC has those vectors programmed in it. But
that would not make any sense.

> > 
> > dom0 MSI message address/data with default smp_affinity: Address: 00000000fee00000  Data: 4094
> > dom0 MSI message address/data after smp_affinity=0010:   Address: 00000000fee00000  Data: 409c

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-14 19:54                 ` Cinco, Dante
@ 2009-10-16  0:09                   ` Konrad Rzeszutek Wilk
  2009-10-16  1:40                     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 55+ messages in thread
From: Konrad Rzeszutek Wilk @ 2009-10-16  0:09 UTC (permalink / raw)
  To: Cinco, Dante; +Cc: xen-devel, Keir Fraser, Qing He, xiantao.zhang

On Wed, Oct 14, 2009 at 01:54:33PM -0600, Cinco, Dante wrote:
> I switched over to Xen 3.5-unstable (changeset 20303) and pv_ops dom0 2.6.31.1 hoping that this would resolve the IRQ SMP affinity problem. I had to use pci-stub to hide the PCI devices since pciback wasn't working. With vcpus=16 (APIC routing is physical flat), the interrupts were working in domU and being routed to CPU0 with the default smp_affinity (ffff), but changing it to any 16-bit one-hot value, or even setting it to the same default value, resulted in a complete loss of interrupts (even on the devices whose smp_affinity I hadn't changed). With vcpus=4 (APIC routing is logical flat), I can see the interrupts being load balanced across all CPUs, but as soon as I changed smp_affinity to any value, the interrupts stopped. This used to work reliably with the non-pv_ops kernel. I attached the logs in case anyone wants to take a look.
> 
> I did see the MSI message address/data change in both domU and dom0 (using "lspci -vv"):
> 
> vcpus=16:
> 
> domU MSI message address/data with default smp_affinity: Address: 00000000fee00000  Data: 40a9
> domU MSI message address/data after smp_affinity=0010:   Address: 00000000fee08000  Data: 40b1 (8 is APIC ID of CPU4)

What does Xen tell you (hit Ctrl-A three times and then 'z')? Specifically, look for vector 169 (a9) and 177 (b1).
Do those values match what you see in domU and dom0? Mainly, that 177 has a dest_id of 8.
Oh, and also check the guest interrupt information, to see if those values match.
> 
> dom0 MSI message address/data with default smp_affinity: Address: 00000000fee00000  Data: 4094
> dom0 MSI message address/data after smp_affinity=0010:   Address: 00000000fee00000  Data: 409c
> 
> Aside from "lspci -vv" what other means are there to track down this problem? Is there some way to print the interrupt vector table? I'm considering adding printk's to the code that Qing mentioned in his previous email (see below). Any suggestions on where in the code to add the printk's?

Hit Ctrl-A three times and you can get a wealth of information. Of interest might also
be the IO APIC area: you can see whether the vector in question is masked.
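
If it helps, a minimal sketch of checking that bit with the IO_APIC_REDIR_* constants added by the patch earlier in the thread (illustrative only; io_apic_read() is the usual accessor, and entry n's low dword sits at register 0x10 + 2*n):

/* Illustrative sketch only: report whether redirection entry 'pin' of
 * IO-APIC 'apic' has its mask bit (bit 16) set, using the
 * IO_APIC_REDIR_MASKED constant from the patch in this thread. */
static int sketch_ioapic_pin_is_masked(unsigned int apic, unsigned int pin)
{
    unsigned int lo = io_apic_read(apic, 0x10 + 2 * pin);

    return (lo & IO_APIC_REDIR_MASKED) != 0;
}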

> 
> Thanks.
> 
> Dante
> 
> -----Original Message-----
> From: Qing He [mailto:qing.he@intel.com] 
> Sent: Sunday, October 11, 2009 10:55 PM
> To: Cinco, Dante
> Cc: Keir Fraser; xen-devel@lists.xensource.com; xiantao.zhang@intel.com
> Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
> 
> On Mon, 2009-10-12 at 13:25 +0800, Cinco, Dante wrote:
> > With vcpus < 4, logical flat mode works fine (no error message). I can 
> > change smp_affinity to any value > 0 and < 16 and the interrupts go to 
> > the proper CPU(s). Could you point me to the code that handles MSI so 
> > that I can better understand the MSI implementation?
> 
> There are two parts:
>   1) init or changing the data and address of MSI:
>     1) qemu-xen: hw/passthrough.c: pt_msg.*_write, MSI accesses are
>                  trapped here first. And then pt_update_msi in
>                  hw/pt-msi.c is called to update the MSI binding.
>     2) xen:      drivers/passthrough/io.c: pt_irq_create_bind_vtd,
>                  where MSI is actually bound to the guest.
> 
>   2) on MSI reception:
>     In drivers/passthrough/io.c, hvm_do_IRQ_dpci and hvm_dirq_assist
>     are the routines responsible for handling all assigned irqs
>     (including MSI), and if an MSI is received, vmsi_deliver in
>     arch/x86/vmsi.c gets called to deliver the MSI to the corresponding
>     vlapic.
> 
> And I just learned from Xiantao Zhang that the guest Linux kernel enables per-cpu vectors when it's in physical mode, and that looks more likely to be relevant to this problem. Older Xen had a problem handling this, and changeset 20253 is supposed to fix it, although I noticed your Xen version is 20270.
> 
> Thanks,
> Qing


> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-12  5:54               ` Qing He
@ 2009-10-14 19:54                 ` Cinco, Dante
  2009-10-16  0:09                   ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 55+ messages in thread
From: Cinco, Dante @ 2009-10-14 19:54 UTC (permalink / raw)
  To: Qing He; +Cc: xen-devel, Keir Fraser, xiantao.zhang

[-- Attachment #1: Type: text/plain, Size: 3404 bytes --]

I switched over to Xen 3.5-unstable (changeset 20303) and pv_ops dom0 2.6.31.1 hoping that this would resolve the IRQ SMP affinity problem. I had to use pci-stub to hide the PCI devices since pciback wasn't working. With vcpus=16 (APIC routing is physical flat), the interrupts were working in domU and being routed to CPU0 with the default smp_affinity (ffff), but changing it to any 16-bit one-hot value, or even setting it to the same default value, resulted in a complete loss of interrupts (even on the devices whose smp_affinity I hadn't changed). With vcpus=4 (APIC routing is logical flat), I can see the interrupts being load balanced across all CPUs, but as soon as I changed smp_affinity to any value, the interrupts stopped. This used to work reliably with the non-pv_ops kernel. I attached the logs in case anyone wants to take a look.

I did see the MSI message address/data change in both domU and dom0 (using "lspci -vv"):

vcpus=16:

domU MSI message address/data with default smp_affinity: Address: 00000000fee00000  Data: 40a9
domU MSI message address/data after smp_affinity=0010:   Address: 00000000fee08000  Data: 40b1 (8 is APIC ID of CPU4)

dom0 MSI message address/data with default smp_affinity: Address: 00000000fee00000  Data: 4094
dom0 MSI message address/data after smp_affinity=0010:   Address: 00000000fee00000  Data: 409c
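
For cross-checking such pairs by hand, a small stand-alone decoder (illustrative only) for the standard x86 MSI layout used in the annotations throughout this thread, with dest ID in address bits 19:12, RH in bit 3, DM in bit 2, delivery mode in data bits 10:8 and the vector in data bits 7:0:

/* Stand-alone helper (illustration only) to decode the lspci
 * Address/Data pairs quoted in this thread into dest ID, RH, DM,
 * delivery mode and vector, per the standard x86 MSI layout. */
#include <stdint.h>
#include <stdio.h>

static void decode_msi(uint64_t addr, uint16_t data)
{
    unsigned int dest_id  = (unsigned int)(addr >> 12) & 0xff; /* APIC ID */
    unsigned int rh       = (unsigned int)(addr >> 3) & 1;     /* redirection hint */
    unsigned int dm       = (unsigned int)(addr >> 2) & 1;     /* 0=physical, 1=logical */
    unsigned int delivery = (data >> 8) & 0x7;                 /* 0=fixed, 1=lowest prio */
    unsigned int vector   = data & 0xff;

    printf("dest ID=%#x RH=%u DM=%u delivery=%u vector=%#x (%u)\n",
           dest_id, rh, dm, delivery, vector, vector);
}

int main(void)
{
    decode_msi(0x00000000fee08000ULL, 0x40b1); /* domU, smp_affinity=0010 */
    decode_msi(0x00000000fee00000ULL, 0x409c); /* dom0, smp_affinity=0010 */
    return 0;
}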

Aside from "lspci -vv" what other means are there to track down this problem? Is there some way to print the interrupt vector table? I'm considering adding printk's to the code that Qing mentioned in his previous email (see below). Any suggestions on where in the code to add the printk's?

Thanks.

Dante

-----Original Message-----
From: Qing He [mailto:qing.he@intel.com] 
Sent: Sunday, October 11, 2009 10:55 PM
To: Cinco, Dante
Cc: Keir Fraser; xen-devel@lists.xensource.com; xiantao.zhang@intel.com
Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

On Mon, 2009-10-12 at 13:25 +0800, Cinco, Dante wrote:
> With vcpus < 4, logical flat mode works fine (no error message). I can 
> change smp_affinity to any value > 0 and < 16 and the interrupts go to 
> the proper CPU(s). Could you point me to the code that handles MSI so 
> that I can better understand the MSI implementation?

There are two parts:
  1) init or changing the data and address of MSI:
    1) qemu-xen: hw/passthrough.c: pt_msg.*_write, MSI accesses are
                 trapped here first. And then pt_update_msi in
                 hw/pt-msi.c is called to update the MSI binding.
    2) xen:      drivers/passthrough/io.c: pt_irq_create_bind_vtd,
                 where MSI is actually bound to the guest.

  2) on MSI reception:
    In drivers/passthrough/io.c, hvm_do_IRQ_dpci and hvm_dirq_assist
    are the routines responsible for handling all assigned irqs
    (including MSI), and if an MSI is received, vmsi_deliver in
    arch/x86/vmsi.c gets called to deliver the MSI to the corresponding
    vlapic.

And I just learned from Xiantao Zhang that the guest Linux kernel enables per-cpu vectors when it's in physical mode, and that looks more likely to be relevant to this problem. Older Xen had a problem handling this, and changeset 20253 is supposed to fix it, although I noticed your Xen version is 20270.

Thanks,
Qing

[-- Attachment #2: irq_smp_affinity_problem_pv_ops_dom0_2.6.31.1.tar.gz --]
[-- Type: application/x-gzip, Size: 69208 bytes --]

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-12  5:25             ` Cinco, Dante
@ 2009-10-12  5:54               ` Qing He
  2009-10-14 19:54                 ` Cinco, Dante
  0 siblings, 1 reply; 55+ messages in thread
From: Qing He @ 2009-10-12  5:54 UTC (permalink / raw)
  To: Cinco, Dante; +Cc: xen-devel, Keir Fraser, xiantao.zhang

On Mon, 2009-10-12 at 13:25 +0800, Cinco, Dante wrote:
> With vcpus < 4, logical flat mode works fine (no error message). I can
> change smp_affinity to any value > 0 and < 16 and the interrupts go to
> the proper CPU(s). Could you point me to the code that handles MSI so
> that I can better understand the MSI implementation?

There are two parts:
  1) init or changing the data and address of MSI:
    1) qemu-xen: hw/passthrough.c: pt_msg.*_write, MSI accesses are
                 trapped here first. And then pt_update_msi in
                 hw/pt-msi.c is called to update the MSI binding.
    2) xen:      drivers/passthrough/io.c: pt_irq_create_bind_vtd,
                 where MSI is actually bound to the guest.

  2) on MSI reception:
    In drivers/passthrough/io.c, hvm_do_IRQ_dpci and hvm_dirq_assist
    are the routines responsible for handling all assigned irqs
    (including MSI), and if an MSI is received, vmsi_deliver in
    arch/x86/vmsi.c gets called to deliver the MSI to the corresponding
    vlapic (a condensed sketch of this step follows below).
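
A condensed paraphrase of that reception step, based on the vmsi_deliver() code visible in the patches earlier in this thread (locking, error paths and the old_gvec bookkeeping are omitted; treat it as a sketch, not the complete function):

/* Condensed paraphrase of the reception path described in 2) above,
 * based on the vmsi_deliver() code shown in the patches in this thread.
 * Sketch only, not the complete function. */
static void vmsi_deliver_sketch(struct domain *d, uint8_t vector,
                                uint8_t dest, uint8_t dest_mode,
                                uint8_t delivery_mode, uint8_t trig_mode)
{
    struct vlapic *target;
    struct vcpu *v;

    switch ( delivery_mode )
    {
    case dest_LowestPrio:
        /* Pick a single vLAPIC by lowest-priority arbitration. */
        target = vlapic_lowest_prio(d, NULL, 0, dest, dest_mode);
        if ( target != NULL )
            vmsi_inj_irq(d, target, vector, trig_mode, delivery_mode);
        break;

    default:
        /* Fixed (and other) delivery modes in the real code; simplified
         * here: inject into every vCPU whose vLAPIC matches the
         * destination ID / destination mode pair. */
        for_each_vcpu ( d, v )
            if ( vlapic_match_dest(vcpu_vlapic(v), NULL, 0, dest, dest_mode) )
                vmsi_inj_irq(d, vcpu_vlapic(v), vector, trig_mode,
                             delivery_mode);
        break;
    }
}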

And I just learned from Xiantao Zhang that the guest Linux kernel
enables per-cpu vectors when it's in physical mode, and that looks more
likely to be relevant to this problem. Older Xen had a problem handling
this, and changeset 20253 is supposed to fix it, although I noticed
your Xen version is 20270.

Thanks,
Qing

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-10  9:43           ` Qing He
  2009-10-10 10:10             ` Keir Fraser
@ 2009-10-12  5:25             ` Cinco, Dante
  2009-10-12  5:54               ` Qing He
  1 sibling, 1 reply; 55+ messages in thread
From: Cinco, Dante @ 2009-10-12  5:25 UTC (permalink / raw)
  To: Qing He; +Cc: xen-devel, Keir Fraser

With vcpus < 4, logical flat mode works fine (no error message). I can change smp_affinity to any value > 0 and < 16 and the interrupts go to the proper CPU(s). Could you point me to the code that handles MSI so that I can better understand the MSI implementation?

Thanks.

Dante
________________________________________
From: Qing He [qing.he@intel.com]
Sent: Saturday, October 10, 2009 2:43 AM
To: Cinco, Dante
Cc: Keir Fraser; xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4       on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

On Sat, 2009-10-10 at 07:39 +0800, Cinco, Dante wrote:
> When I tried adding "hvm_debug=0x200" in the Xen command line, the domU
> became inaccessible on boot up, with the Xen console constantly printing
> this message: "(XEN) [HVM:1.0] <vioapic_irq_positive_edge> irq 2."

So this is useless; maybe one-time setups should be split out from
those that fire every time, or a separate debug level should be used for MSI operations.

> Change /proc/irq/48/smp_affinity from 1 to 2
> - Xen console: (XEN) do_IRQ: 8.211 No irq handler for vector (irq -1)

This is weird. Although there is no other confirmation, I guess this
vector 211 (0xd3) is the MSI vector. That would explain why the MSI
doesn't fire any more.

However, this error message is not expected. A physical MSI at the Xen
level always goes to vcpu 0 when it is first bound, and the affinity
doesn't change after that. Furthermore, logical flat mode works fine;
do you observe this error message when vcpus=4?

I'll continue to investigate and try to reproduce the problem on
my side.

Thanks,
Qing

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-10  9:43           ` Qing He
@ 2009-10-10 10:10             ` Keir Fraser
  2009-10-12  5:25             ` Cinco, Dante
  1 sibling, 0 replies; 55+ messages in thread
From: Keir Fraser @ 2009-10-10 10:10 UTC (permalink / raw)
  To: Qing He, Cinco, Dante; +Cc: xen-devel

On 10/10/2009 10:43, "Qing He" <qing.he@intel.com> wrote:

> On Sat, 2009-10-10 at 07:39 +0800, Cinco, Dante wrote:
>> When I tried adding "hvm_debug=0x200" in the Xen command line, the domU
>> became inaccessible on boot up, with the Xen console constantly printing
>> this message: "(XEN) [HVM:1.0] <vioapic_irq_positive_edge> irq 2."
> 
> So this is useless; maybe one-time setups should be split out from
> those that fire every time, or a separate debug level should be used for MSI operations.

Well, indeed. Messages that print on every interrupt are typically useless!
I tend to kill them when I find them, but they keep creeping in.

 -- Keir

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-09 23:39         ` Cinco, Dante
@ 2009-10-10  9:43           ` Qing He
  2009-10-10 10:10             ` Keir Fraser
  2009-10-12  5:25             ` Cinco, Dante
  0 siblings, 2 replies; 55+ messages in thread
From: Qing He @ 2009-10-10  9:43 UTC (permalink / raw)
  To: Cinco, Dante; +Cc: xen-devel, Keir Fraser

On Sat, 2009-10-10 at 07:39 +0800, Cinco, Dante wrote:
> When I tried adding "hvm_debug=0x200" in the Xen command line, the domU
> became inaccessible on boot up, with the Xen console constantly printing
> this message: "(XEN) [HVM:1.0] <vioapic_irq_positive_edge> irq 2."

So this is useless; maybe one-time setups should be split out from
those that fire every time, or a separate debug level should be used for MSI operations.

> Change /proc/irq/48/smp_affinity from 1 to 2
> - Xen console: (XEN) do_IRQ: 8.211 No irq handler for vector (irq -1)

This is weird. Although there is no other confirmation, I guess this
vector 211 (0xd3) is the MSI vector. That would explain why the MSI
doesn't fire any more.

However, this error message is not expected. A physical MSI at the Xen
level always goes to vcpu 0 when it is first bound, and the affinity
doesn't change after that. Furthermore, logical flat mode works fine;
do you observe this error message when vcpus=4?

I'll continue to investigate and try to reproduce the problem on
my side.

Thanks,
Qing

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-09  9:07       ` Qing He
  2009-10-09 15:59         ` Cinco, Dante
@ 2009-10-09 23:39         ` Cinco, Dante
  2009-10-10  9:43           ` Qing He
  1 sibling, 1 reply; 55+ messages in thread
From: Cinco, Dante @ 2009-10-09 23:39 UTC (permalink / raw)
  To: Qing He, Keir Fraser; +Cc: xen-devel

[-- Attachment #1: Type: text/plain, Size: 3622 bytes --]

Qing,

I'm attaching a tar'd directory that contains the various log files I gathered from my system. When I tried adding "hvm_debug=0x200" in the Xen command line, the domU became inaccessible on boot up, with the Xen console constantly printing this message: "(XEN) [HVM:1.0] <vioapic_irq_positive_edge> irq 2." So I backed out the hvm_debug, but hopefully there is still enough logging to provide some clues. Here's a summary of the events leading to the lost interrupts:

Boot Xen 3.5-unstable with 2.6.30.3
- command line: /xen-3.5-unstable.gz com1=115200,8n1 console=com1 acpi=force apic=on iommu=1,no-intremap,passthrough loglvl=all loglvl_guest=all
- command line: module /vmlinuz-2.6.31.1 root=UUID=xxx ro pciback.hide=(07:00.0)(07:00.1)(07:00.2)(07:00.3) acpi=force console=ttyS0
- dom0: lspci -vv shows device at IRQ 32 with MSI message address, data = 0x0, 0x0

Bringup domU with vcpus=5, hap=0, pci=['07:00.0@8','07:00.1@9','07:00.2@a','07:00.3@b'] (device driver not yet loaded)
- dom0: lspci -vv shows device at IRQ 32 (07:00.0) with MSI message address, data = 0xfee01000, 0x407b
- config: 

Load kernel module that contains device driver
- dom0: no change in lspci -vv
- domU: lspci -vv shows device at IRQ 48 (00:08.0) with MSI message address, data = 0xfee00000, 0x4059
- domU: /proc/interrupts show interrupts for IRQ 48 going to CPU0

Change /proc/irq/48/smp_affinity from 1f to 1
- dom0: no change to lspci -vv
- domU: no change to lspci -vv
- domU: /proc/interrupts show interrupts for IRQ 48 going to CPU0

Change /proc/irq/48/smp_affinity from 1 to 2
- dom0: lspci -vv shows MSI message data changed from 0x407b to 0x40d3, address the same
- domU: lspci -vv shows new MSI message address, data = 0xfee02000, 0x4079
- domU: no more interrupts from IRQ 48
- Xen console: (XEN) do_IRQ: 8.211 No irq handler for vector (irq -1)

Dante

-----Original Message-----
From: Qing He [mailto:qing.he@intel.com] 
Sent: Friday, October 09, 2009 2:08 AM
To: Keir Fraser
Cc: Cinco, Dante; xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

On Fri, 2009-10-09 at 05:35 +0800, Keir Fraser wrote:
> On 08/10/2009 19:11, "Cinco, Dante" <Dante.Cinco@lsi.com> wrote:
> 
> > The IRQ SMP affinity problem happens on just the passthrough one using MSI.
> > 
> > I've only used Xen 3.4.1. Are you aware of recent code changes that 
> > may address this issue?
> 
> No, but it might be worth a try. Unfortunately I'm not so familiar 
> with the MSI passthru code as I am with the rest of the irq emulation 
> layer. Qing He
> (cc'ed) may be able to assist, as I think he did much of the 
> development of MSI support for passthru devices.
> 

MSI passthru uses emulation; there is no direct connection between the guest affinity and the physical affinity. When an MSI is received, the vmsi logic calculates the destination and sets the virtual local APIC of that VCPU.

But after checking the code, the part handling DM=0 is there and I haven't found big problems at first glance; maybe there is some glitch that causes the MSI failure in physical mode.

Some debug logs could help track down the problem. Can you add 'hvm_debug=0x200' to the xen command line and post the xm dmesg result?
This will print HVM debug level DBG_LEVEL_IOAPIC, which includes the vmsi delivery logic.

There are two patches between 3.4.1 and unstable (20084 and 20140). These are mainly cleanup patches, but the related code does change; I don't know if they fix this issue.

Thanks,
Qing

[-- Attachment #2: irq_smp_affinity_problem.tar.gz --]
[-- Type: application/x-gzip, Size: 70540 bytes --]

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-09  9:07       ` Qing He
@ 2009-10-09 15:59         ` Cinco, Dante
  2009-10-09 23:39         ` Cinco, Dante
  1 sibling, 0 replies; 55+ messages in thread
From: Cinco, Dante @ 2009-10-09 15:59 UTC (permalink / raw)
  To: Qing He, Keir Fraser; +Cc: xen-devel

Thanks for the suggestions, Qing. I will send you the log with "hvm_debug=0x200" and try Xen 3.5-unstable.

Dante 

-----Original Message-----
From: Qing He [mailto:qing.he@intel.com] 
Sent: Friday, October 09, 2009 2:08 AM
To: Keir Fraser
Cc: Cinco, Dante; xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

On Fri, 2009-10-09 at 05:35 +0800, Keir Fraser wrote:
> On 08/10/2009 19:11, "Cinco, Dante" <Dante.Cinco@lsi.com> wrote:
> 
> > The IRQ SMP affinity problem happens on just the passthrough one using MSI.
> > 
> > I've only used Xen 3.4.1. Are you aware of recent code changes that 
> > may address this issue?
> 
> No, but it might be worth a try. Unfortunately I'm not so familiar 
> with the MSI passthru code as I am with the rest of the irq emulation 
> layer. Qing He
> (cc'ed) may be able to assist, as I think he did much of the 
> development of MSI support for passthru devices.
> 

MSI passthru uses emulation; there is no direct connection between the guest affinity and the physical affinity. When an MSI is received, the vmsi logic calculates the destination and sets the virtual local APIC of that VCPU.

But after checking the code, the part handling DM=0 is there and I haven't found big problems at first glance; maybe there is some glitch that causes the MSI failure in physical mode.

Some debug logs could help track down the problem. Can you add 'hvm_debug=0x200' to the xen command line and post the xm dmesg result?
This will print HVM debug level DBG_LEVEL_IOAPIC, which includes the vmsi delivery logic.

There are two patches between 3.4.1 and unstable (20084 and 20140). These are mainly cleanup patches, but the related code does change; I don't know if they fix this issue.

Thanks,
Qing

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-08 21:35     ` Keir Fraser
@ 2009-10-09  9:07       ` Qing He
  2009-10-09 15:59         ` Cinco, Dante
  2009-10-09 23:39         ` Cinco, Dante
  0 siblings, 2 replies; 55+ messages in thread
From: Qing He @ 2009-10-09  9:07 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Cinco, Dante, xen-devel

On Fri, 2009-10-09 at 05:35 +0800, Keir Fraser wrote:
> On 08/10/2009 19:11, "Cinco, Dante" <Dante.Cinco@lsi.com> wrote:
> 
> > The IRQ SMP affinity problem happens on just the passthrough one using MSI.
> > 
> > I've only used Xen 3.4.1. Are you aware of recent code changes that may
> > address this issue?
> 
> No, but it might be worth a try. Unfortunately I'm not so familiar with the
> MSI passthru code as I am with the rest of the irq emulation layer. Qing He
> (cc'ed) may be able to assist, as I think he did much of the development of
> MSI support for passthru devices.
> 

MSI passthru uses emulation; there is no direct connection between the guest
affinity and the physical affinity. When an MSI is received, the vmsi logic
calculates the destination and sets the virtual local APIC of that VCPU.

But after checking the code, the part handling DM=0 is there and I haven't
found big problems at first glance; maybe there is some glitch that
causes the MSI failure in physical mode.

Some debug logs could help track down the problem. Can you add
'hvm_debug=0x200' to the xen command line and post the xm dmesg result?
This will print HVM debug level DBG_LEVEL_IOAPIC, which includes the vmsi
delivery logic.

There are two patches between 3.4.1 and unstable (20084 and 20140). These
are mainly cleanup patches, but the related code does change; I don't know
if they fix this issue.

Thanks,
Qing

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-08 18:11   ` Cinco, Dante
@ 2009-10-08 21:35     ` Keir Fraser
  2009-10-09  9:07       ` Qing He
  0 siblings, 1 reply; 55+ messages in thread
From: Keir Fraser @ 2009-10-08 21:35 UTC (permalink / raw)
  To: Cinco, Dante, xen-devel; +Cc: Qing He

On 08/10/2009 19:11, "Cinco, Dante" <Dante.Cinco@lsi.com> wrote:

> The IRQ SMP affinity problem happens on just the passthrough one using MSI.
> 
> I've only used Xen 3.4.1. Are you aware of recent code changes that may
> address this issue?

No, but it might be worth a try. Unfortunately I'm not so familiar with the
MSI passthru code as I am with the rest of the irq emulation layer. Qing He
(cc'ed) may be able to assist, as I think he did much of the development of
MSI support for passthru devices.

 -- Keir

> Dante
> 
> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: Thursday, October 08, 2009 11:06 AM
> To: Cinco, Dante; xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on
> HP ProLiant G6 with dual Xeon 5540 (Nehalem)
> 
> On 08/10/2009 01:08, "Cinco, Dante" <Dante.Cinco@lsi.com> wrote:
> 
>> One of my questions is "Why does domU use only even numbered APIC
>> IDs?" If it used odd numbers, then physical flat APIC routing will
>> only trigger when vcpus
>>> 7.
> 
> It's just the mapping we use. Local APICs get even numbers, IOAPIC gets id 1.
> 
>> I welcome any suggestions on how to pursue this problem or hopefully,
>> someone will say that a patch for this already exists.
> 
> Is this true for all interrupts, or just the passthrough one using MSI?
> 
> What Xen version are you using? You say '3.4 unstable' - do you mean tip of
> xen-3.4-testing.hg? Have you tried xen-unstable.hg (current development tree)?
> 
>  -- Keir
> 
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-08 18:05 ` Keir Fraser
@ 2009-10-08 18:11   ` Cinco, Dante
  2009-10-08 21:35     ` Keir Fraser
  0 siblings, 1 reply; 55+ messages in thread
From: Cinco, Dante @ 2009-10-08 18:11 UTC (permalink / raw)
  To: Keir Fraser, xen-devel

The IRQ SMP affinity problem happens on just the passthrough one using MSI.

I've only used Xen 3.4.1. Are you aware of recent code changes that may address this issue?

Dante

-----Original Message-----
From: Keir Fraser [mailto:keir.fraser@eu.citrix.com] 
Sent: Thursday, October 08, 2009 11:06 AM
To: Cinco, Dante; xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

On 08/10/2009 01:08, "Cinco, Dante" <Dante.Cinco@lsi.com> wrote:

> One of my questions is "Why does domU use only even numbered APIC 
> IDs?" If it used odd numbers, then physical flat APIC routing will 
> only trigger when vcpus
> > 7.

It's just the mapping we use. Local APICs get even numbers, IOAPIC gets id 1.

> I welcome any suggestions on how to pursue this problem or hopefully, 
> someone will say that a patch for this already exists.

Is this true for all interrupts, or just the passthrough one using MSI?

> What Xen version are you using? You say '3.4 unstable' - do you mean tip of xen-3.4-testing.hg? Have you tried xen-unstable.hg (current development tree)?

 -- Keir

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-08  0:08 Cinco, Dante
  2009-10-08 16:07 ` Bruce Edge
@ 2009-10-08 18:05 ` Keir Fraser
  2009-10-08 18:11   ` Cinco, Dante
  1 sibling, 1 reply; 55+ messages in thread
From: Keir Fraser @ 2009-10-08 18:05 UTC (permalink / raw)
  To: Cinco, Dante, xen-devel

On 08/10/2009 01:08, "Cinco, Dante" <Dante.Cinco@lsi.com> wrote:

> One of my questions is "Why does domU use only even numbered APIC IDs?" If it
> used odd numbers, then physical flat APIC routing will only trigger when vcpus
> > 7.

It's just the mapping we use. Local APICs get even numbers, IOAPIC gets id
1.

> I welcome any suggestions on how to pursue this problem or hopefully, someone
> will say that a patch for this already exists.

Is this true for all interrupts, or just the passthrough one using MSI?

What Xen version are you using? You say '3.4 unstable' - do you mean tip of
xen-3.4-testing.hg? Have you tried xen-unstable.hg (current development
tree)?

 -- Keir

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
  2009-10-08  0:08 Cinco, Dante
@ 2009-10-08 16:07 ` Bruce Edge
  2009-10-08 18:05 ` Keir Fraser
  1 sibling, 0 replies; 55+ messages in thread
From: Bruce Edge @ 2009-10-08 16:07 UTC (permalink / raw)
  Cc: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 7041 bytes --]

More info on the version...

It's actually the 3.4.1 release.
Also the dom0 is 2.6.30.3 with Andrew Lyon's patch set.

Built from Boris's HOWTO:
http://bderzhavets.wordpress.com/2009/08/14/attempt-of-prevu-xen-3-4-1-hypervisor-on-ubuntu-jaunty-server-64-bit/

-Bruce

On Wed, Oct 7, 2009 at 5:08 PM, Cinco, Dante <Dante.Cinco@lsi.com> wrote:

>  I need help tracking down an IRQ SMP affinity problem.
>
> Xen version: 3.4 unstable
> dom0: Linux 2.6.30.3 (Debian)
> domU: Linux 2.6.30.1 (Debian)
> Hardware platform: HP ProLiant G6, dual-socket Xeon 5540, hyperthreading
> enabled in BIOS and kernel (total of 16 CPUs: 2 sockets * 4 cores per socket
> * 2 threads per core)
>
> With vcpus < 5, I can change /proc/irq/<irq#>/smp_affinity and see the
> interrupts get routed to the proper CPU(s) by checking /proc/interrupts.
> With vcpus > 4, any change to /proc/irq/<irq#>/smp_affinity results in a
> complete loss of interrupts for <irq#>.
>
> I noticed in the domU /var/log/kern.log that APIC routing changes from
> "flat" for vcpus=4 to "physical flat" for vcpus=5. Looking at the source
> code for linux-2.6.30.1/arch/x86/kernel/apic/probe_64.c, this switch occurs
> when "max_physical_apicid >= 8." In the domU /var/log/kern.log and
> /proc/cpuinfo, only even numbered APIC IDs (starting from 0) are used so
> when it gets to the 5th CPU, it is already at APIC ID 8 which triggers the
> physical flat APIC routing.
>
> dom0 has all 16 CPUs available to it. The mapping between CPU numbers and
> APIC ID is 1-to-1 (CPU0:APIC ID0 ... CPU15:APIC ID15). domU is configured
> with either vcpus=4 or vcpus=5. In both cases, the mapping uses even number
> only for the APIC IDs (CPU0:APIC ID0 ... CPU5:APIC ID8).
>
> I'm using an ATTO/PMC Tachyon-based Fibre Channel PCIe card on this
> platform. It uses PCI-MSI-edge for its interrupt. I use pciback.hide in my
> dom0 Xen 3.5 kernel stanza to pass the device directly to domU. I'm also
> using "iommu=1,no-intremap,passthrough" in the stanza. I'm able to see the
> device in dom0 via "lspci -vv" and see the MSI message address and data that
> have been programmed into the Tachyon registers and using IRQ 32. Regardless
> of changes to IRQ 32's SMP affinity in domU, the MSI message address and
> data as seen from dom0 does not change. I can only conclude that domU is
> running some sort of IRQ emulation.
>
> # lspci -vv in dom0
> 07:00.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 05)
>         Subsystem: Atto Technology Device 003c
>         Interrupt: pin A routed to IRQ 32
>         Capabilities: [60] Message Signalled Interrupts: Mask- 64bit+
> Queue=0/1 Enable+
>                 Address: 00000000fee00000  Data: 40ba (dest ID=0, RH=DM=0,
> fixed interrupt, vector=0xba)
>         Kernel driver in use: pciback
>
> In domU, the device has been remapped (intentionally in the dom0 config
> file) to bus 0, device 8 and can also be seen via "lspci -vv" with the same
> MSI message address but different data and using IRQ 48.
>
> # lspci -vv in domU with vcpus=5
> 00:08.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 05)
>         Subsystem: Atto Technology Device 003c
>         Interrupt: pin A routed to IRQ 48
>         Capabilities: [60] Message Signalled Interrupts: Mask- 64bit+
> Queue=0/0 Enable+
>                 Address: 00000000fee00000  Data: 4059 (dest ID=0, RH=DM=0,
> fixed interrupt, vector=0x59)
>         Kernel driver in use: hwdrv
>         Kernel modules: hbas-hw
>
> At this point, the kernel driver for the device has been loaded and the
> number of interrupts can be seen in /proc/interrupts. The default IRQ SMP
> affinity has not been changed, and yet the interrupts are all being routed to CPU0.
> This is for vcpus=5 (physical flat APIC routing). Changing IRQ 48's SMP
> affinity to any value will result in a complete loss of all interrupts. domU
> and dom0 need to be rebooted to restore normal operation.
> # cat /proc/irq/48/smp_affinity
> 1f
> # cat /proc/interrupts
>             CPU0       CPU1       CPU2       CPU3       CPU4
>   48:      60920          0          0          0          0
> PCI-MSI-edge      HW_TACHYON
>
> With vcpus=4 (flat APIC routing), IRQ 48's SMP affinity behaves as expected
> (each of the 4 bits in /proc/irq/48/smp_affinity correspond to a CPU or CPUs
> where the interrupts will be routed). The MSI message address and data have
> different attributes compared to vcpus=5. The address has dest ID=f (matches
> default /proc/irq/48/smp_affinity), RH=DM=1 and uses lowest priority instead
> of fixed interrupt.
>
> # lspci -vv in domU with vcpus=4
> 00:08.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 05)
>         Subsystem: Atto Technology Device 003c
>         Interrupt: pin A routed to IRQ 48
>         Capabilities: [60] Message Signalled Interrupts: Mask- 64bit+
> Queue=0/0 Enable+
>                 Address: 00000000fee0f00c  Data: 4159 (dest ID=f, RH=DM=1,
> lowest priority interrupt, vector=0x59)
>         Kernel driver in use: hwdrv
>         Kernel modules: hbas-hw
>
> # cat /proc/irq/48/smp_affinity
> f
> # cat /proc/interrupts
>             CPU0       CPU1       CPU2       CPU3
>   48:      14082      19052      15337      14645   PCI-MSI-edge
> HW_TACHYON
>
> Changing IRQ 48's SMP affinity to 8 shows that all the interrupts are being
> routed to CPU3 as expected and the MSI message address has changed to
> reflect the new dest ID while the vector stays the same.
>
> # echo 8 > /proc/irq/48/smp_affinity
> # cat /proc/interrupts
>   48:      14082      19052      15338     351361   PCI-MSI-edge
> HW_TACHYON
>
> # lspci -vv in domU with vcpus=4
> 00:08.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 05)
>         Subsystem: Atto Technology Device 003c
>         Interrupt: pin A routed to IRQ 48
>         Capabilities: [60] Message Signalled Interrupts: Mask- 64bit+
> Queue=0/0 Enable+
>                 Address: 00000000fee0800c  Data: 4159 (dest ID=8, RH=DM=1,
> lowest priority interrupt, vector=0x59)
>         Kernel driver in use: hwdrv
>         Kernel modules: hbas-hw
>
> My hunch is there is something wrong with physical flat APIC routing in
> domU. If I boot this same platform to straight Linux 2.6.30.1 (no Xen),
> /var/log/kern.log shows that it too is using physical flat APIC routing
> which is expected since it has a total of 16 CPUs. Unlike domU though,
> changing the IRQ SMP affinity to any one-hot value (only one bit out of 16
> is set to 1) behaves as expected. A non-one hot value results in all
> interrupts being routed to CPU0 but at least the interrupts are not lost.
>
> One of my questions is "Why does domU use only even numbered APIC IDs?" If
> it used odd numbers, then physical flat APIC routing will only trigger when
> vcpus > 7.
>
> I welcome any suggestions on how to pursue this problem or hopefully,
> someone will say that a patch for this already exists.
>
> Thanks.
>
> Dante Cinco
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>
>

[-- Attachment #1.2: Type: text/html, Size: 8603 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

* IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
@ 2009-10-08  0:08 Cinco, Dante
  2009-10-08 16:07 ` Bruce Edge
  2009-10-08 18:05 ` Keir Fraser
  0 siblings, 2 replies; 55+ messages in thread
From: Cinco, Dante @ 2009-10-08  0:08 UTC (permalink / raw)
  To: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 6383 bytes --]

I need help tracking down an IRQ SMP affinity problem.

Xen version: 3.4 unstable
dom0: Linux 2.6.30.3 (Debian)
domU: Linux 2.6.30.1 (Debian)
Hardware platform: HP ProLiant G6, dual-socket Xeon 5540, hyperthreading enabled in BIOS and kernel (total of 16 CPUs: 2 sockets * 4 cores per socket * 2 threads per core)

With vcpus < 5, I can change /proc/irq/<irq#>/smp_affinity and see the interrupts get routed to the proper CPU(s) by checking /proc/interrupts. With vcpus > 4, any change to /proc/irq/<irq#>/smp_affinity results in a complete loss of interrupts for <irq#>.

I noticed in the domU /var/log/kern.log that APIC routing changes from "flat" for vcpus=4 to "physical flat" for vcpus=5. Looking at the source code for linux-2.6.30.1/arch/x86/kernel/apic/probe_64.c, this switch occurs when "max_physical_apicid >= 8." In the domU /var/log/kern.log and /proc/cpuinfo, only even numbered APIC IDs (starting from 0) are used, so when it gets to the 5th CPU, it is already at APIC ID 8, which triggers the physical flat APIC routing.
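
For reference, the check being referred to looks roughly like this in the 2.6.30-era probe_64.c (paraphrased and simplified; a sketch rather than the exact kernel source):

/* Paraphrase of the 2.6.30-era logic in arch/x86/kernel/apic/probe_64.c:
 * once any APIC ID of 8 or more has been seen, the 64-bit kernel drops
 * logical flat routing and switches to physical flat.  Sketch only. */
void setup_apic_routing_sketch(void)
{
    if (apic == &apic_flat && max_physical_apicid >= 8)
        apic = &apic_physflat;

    printk(KERN_INFO "Setting APIC routing to %s\n", apic->name);
}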

dom0 has all 16 CPUs available to it. The mapping between CPU numbers and APIC ID is 1-to-1 (CPU0:APIC ID0 ... CPU15:APIC ID15). domU is configured with either vcpus=4 or vcpus=5. In both cases, the mapping uses even number only for the APIC IDs (CPU0:APIC ID0 ... CPU5:APIC ID8).

I'm using an ATTO/PMC Tachyon-based Fibre Channel PCIe card on this platform. It uses PCI-MSI-edge for its interrupt. I use pciback.hide in my dom0 Xen 3.5 kernel stanza to pass the device directly to domU. I'm also using "iommu=1,no-intremap,passthrough" in the stanza. I'm able to see the device in dom0 via "lspci -vv" and see the MSI message address and data that have been programmed into the Tachyon registers and using IRQ 32. Regardless of changes to IRQ 32's SMP affinity in domU, the MSI message address and data as seen from dom0 does not change. I can only conclude that domU is running some sort of IRQ emulation.

# lspci -vv in dom0
07:00.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 05)
        Subsystem: Atto Technology Device 003c
        Interrupt: pin A routed to IRQ 32
        Capabilities: [60] Message Signalled Interrupts: Mask- 64bit+ Queue=0/1 Enable+
                Address: 00000000fee00000  Data: 40ba (dest ID=0, RH=DM=0, fixed interrupt, vector=0xba)
        Kernel driver in use: pciback

In domU, the device has been remapped (intentionally in the dom0 config file) to bus 0, device 8 and can also be seen via "lspci -vv" with the same MSI message address but different data and using IRQ 48.

# lspci -vv in domU with vcpus=5
00:08.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 05)
        Subsystem: Atto Technology Device 003c
        Interrupt: pin A routed to IRQ 48
        Capabilities: [60] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
                Address: 00000000fee00000  Data: 4059 (dest ID=0, RH=DM=0, fixed interrupt, vector=0x59)
        Kernel driver in use: hwdrv
        Kernel modules: hbas-hw

At this point, the kernel driver for the device has been loaded and the number of interrupts can be seen in /proc/interrupts. The default IRQ SMP affinity has not been changed, and yet the interrupts are all being routed to CPU0. This is for vcpus=5 (physical flat APIC routing). Changing IRQ 48's SMP affinity to any value will result in a complete loss of all interrupts. domU and dom0 need to be rebooted to restore normal operation.
# cat /proc/irq/48/smp_affinity
1f
# cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3       CPU4
  48:      60920          0          0          0          0   PCI-MSI-edge      HW_TACHYON

With vcpus=4 (flat APIC routing), IRQ 48's SMP affinity behaves as expected (each of the 4 bits in /proc/irq/48/smp_affinity correspond to a CPU or CPUs where the interrupts will be routed). The MSI message address and data have different attributes compared to vcpus=5. The address has dest ID=f (matches default /proc/irq/48/smp_affinity), RH=DM=1 and uses lowest priority instead of fixed interrupt.

# lspci -vv in domU with vcpus=4
00:08.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 05)
        Subsystem: Atto Technology Device 003c
        Interrupt: pin A routed to IRQ 48
        Capabilities: [60] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
                Address: 00000000fee0f00c  Data: 4159 (dest ID=f, RH=DM=1, lowest priority interrupt, vector=0x59)
        Kernel driver in use: hwdrv
        Kernel modules: hbas-hw

# cat /proc/irq/48/smp_affinity
f
# cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3
  48:      14082      19052      15337      14645   PCI-MSI-edge      HW_TACHYON

Changing IRQ 48's SMP affinity to 8 shows that all the interrupts are being routed to CPU3 as expected and the MSI message address has changed to reflect the new dest ID while the vector stays the same.

# echo 8 > /proc/irq/48/smp_affinity
# cat /proc/interrupts
  48:      14082      19052      15338     351361   PCI-MSI-edge      HW_TACHYON

# lspci -vv in domU with vcpus=4
00:08.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 05)
        Subsystem: Atto Technology Device 003c
        Interrupt: pin A routed to IRQ 48
        Capabilities: [60] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
                Address: 00000000fee0800c  Data: 4159 (dest ID=8, RH=DM=1, lowest priority interrupt, vector=0x59)
        Kernel driver in use: hwdrv
        Kernel modules: hbas-hw

My hunch is there is something wrong with physical flat APIC routing in domU. If I boot this same platform to straight Linux 2.6.30.1 (no Xen), /var/log/kern.log shows that it too is using physical flat APIC routing which is expected since it has a total of 16 CPUs. Unlike domU though, changing the IRQ SMP affinity to any one-hot value (only one bit out of 16 is set to 1) behaves as expected. A non-one hot value results in all interrupts being routed to CPU0 but at least the interrupts are not lost.

One of my questions is "Why does domU use only even numbered APIC IDs?" If it used odd numbers, then physical flat APIC routing will only trigger when vcpus > 7.

I welcome any suggestions on how to pursue this problem or hopefully, someone will say that a patch for this already exists.

Thanks.

Dante Cinco


[-- Attachment #1.2: Type: text/html, Size: 9768 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2009-10-26 13:34 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-16  1:38 IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem) Cinco, Dante
2009-10-16  2:34 ` Qing He
2009-10-16  6:37   ` Keir Fraser
2009-10-16  7:32     ` Zhang, Xiantao
2009-10-16  8:24       ` Qing He
2009-10-16  8:22         ` Zhang, Xiantao
2009-10-16  8:34           ` Qing He
2009-10-16  8:35             ` Zhang, Xiantao
2009-10-16  9:01               ` Qing He
2009-10-16  9:42                 ` Qing He
2009-10-16  9:49                 ` Zhang, Xiantao
2009-10-16 14:54                   ` Zhang, Xiantao
2009-10-16 18:24                     ` Cinco, Dante
2009-10-17  0:59                       ` Zhang, Xiantao
2009-10-20  0:19                         ` Cinco, Dante
2009-10-20  5:46                           ` Zhang, Xiantao
2009-10-20  7:51                             ` Zhang, Xiantao
2009-10-20 17:26                               ` Cinco, Dante
2009-10-21  1:10                                 ` Zhang, Xiantao
2009-10-22  1:00                                   ` Cinco, Dante
2009-10-22  1:58                                     ` Zhang, Xiantao
2009-10-22  2:42                                       ` Zhang, Xiantao
2009-10-22  6:25                                         ` Keir Fraser
2009-10-22 21:11                                           ` Jeremy Fitzhardinge
2009-10-22  5:10                                       ` Qing He
2009-10-23  0:10                                         ` Cinco, Dante
2009-10-22  6:46                               ` Jan Beulich
2009-10-22  7:11                                 ` Zhang, Xiantao
2009-10-22  7:31                                   ` Jan Beulich
2009-10-22  8:41                                     ` Zhang, Xiantao
2009-10-22  9:42                                       ` Keir Fraser
2009-10-22 16:32                                         ` Zhang, Xiantao
2009-10-22 16:33                                         ` Cinco, Dante
2009-10-23  1:06                                           ` Zhang, Xiantao
2009-10-26 13:02                                         ` Zhang, Xiantao
2009-10-26 13:34                                           ` Keir Fraser
2009-10-16  9:41               ` Keir Fraser
2009-10-16  9:57                 ` Qing He
2009-10-16  9:58                 ` Zhang, Xiantao
2009-10-16 10:21                   ` Jan Beulich
  -- strict thread matches above, loose matches on Subject: below --
2009-10-08  0:08 Cinco, Dante
2009-10-08 16:07 ` Bruce Edge
2009-10-08 18:05 ` Keir Fraser
2009-10-08 18:11   ` Cinco, Dante
2009-10-08 21:35     ` Keir Fraser
2009-10-09  9:07       ` Qing He
2009-10-09 15:59         ` Cinco, Dante
2009-10-09 23:39         ` Cinco, Dante
2009-10-10  9:43           ` Qing He
2009-10-10 10:10             ` Keir Fraser
2009-10-12  5:25             ` Cinco, Dante
2009-10-12  5:54               ` Qing He
2009-10-14 19:54                 ` Cinco, Dante
2009-10-16  0:09                   ` Konrad Rzeszutek Wilk
2009-10-16  1:40                     ` Konrad Rzeszutek Wilk
