Xen-Devel Archive on lore.kernel.org
* [Xen-devel] HPET interrupt remapping during boot
@ 2019-10-08 18:30 Andrew Cooper
  2019-10-09  9:31 ` Jan Beulich
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Cooper @ 2019-10-08 18:30 UTC (permalink / raw)
  To: xen-devel, Jan Beulich, Wei Liu, Roger Pau Monne

Hello,

I have no idea if this is a regression or not.  I suspect it might not
be, and has always been broken.

Either way, I'm seeing occasional single interrupt remapping errors when
booting a range of Intel systems:

(XEN) x2APIC mode is already enabled by BIOS.
(XEN) Using APIC driver x2apic_cluster
...
(XEN) Platform timer is 23.999MHz HPET
(XEN) Detected 2194.922 MHz processor.
...
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) alt table ffff82d08047a070 -> ffff82d080486c6c
(XEN) [VT-D]INTR-REMAP: Request device [0000:f0:1f.0] fault index 0,
iommu reg = ffff82c00072d000
(XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear
(XEN) microcode: CPU2 updated from revision 0x5000021 to 0x500002b, date
= 2019-08-12

From other debugging, I know that this happens after CPU 1 (which is a
hyperthread) has passed through start_secondary().

f0:1f.0 is one of the IO-APICs, and if I've cross referenced the DMAR
and APIC tables properly, is the IO-APIC on the PCH, making the
problematic IRQ GSI0.
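
For readers matching the fault message against the DMAR table: the "Request device" is a 16-bit PCI requester ID printed as bus:device.function. A minimal sketch of that split follows; `struct bdf` and `sid_to_bdf` are hypothetical names used only for illustration, and 0xf0f8 is simply f0:1f.0 re-encoded.

```c
#include <stdint.h>

/* Illustrative only: the three fields of a 16-bit PCI requester ID. */
struct bdf {
    uint8_t bus; /* bits 15:8 */
    uint8_t dev; /* bits 7:3  */
    uint8_t fn;  /* bits 2:0  */
};

/* Split a requester ID such as 0xf0f8 into bus f0, device 1f, function 0. */
static struct bdf sid_to_bdf(uint16_t sid)
{
    struct bdf r = {
        .bus = (uint8_t)(sid >> 8),
        .dev = (uint8_t)((sid >> 3) & 0x1f),
        .fn  = (uint8_t)(sid & 0x7),
    };

    return r;
}
```

The result can then be compared with the device scope entries that the DMAR table lists for IO-APICs.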

This suggests that we have an error setting up the timer IRQ (as the
HPET isn't MSI-capable), but we have already allegedly used it
successfully earlier on boot.

I haven't investigated further yet, but it is an intermittent issue
(i.e. doesn't reproduce on each boot).  My gut feeling is that we have
something which corrects itself as a side effect of a later action.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [Xen-devel] HPET interrupt remapping during boot
  2019-10-08 18:30 [Xen-devel] HPET interrupt remapping during boot Andrew Cooper
@ 2019-10-09  9:31 ` Jan Beulich
  2019-10-09 10:11   ` Roger Pau Monné
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2019-10-09  9:31 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monne

On 08.10.2019 20:30, Andrew Cooper wrote:
> Hello,
> 
> I have no idea if this is a regression or not.  I suspect it might not
> be, and has always been broken.
> 
> Either way, I'm seeing occasional single interrupt remapping errors when
> booting a range of Intel systems
> 
> (XEN) x2APIC mode is already enabled by BIOS.
> (XEN) Using APIC driver x2apic_cluster
> ...
> (XEN) Platform timer is 23.999MHz HPET
> (XEN) Detected 2194.922 MHz processor.
> ...
> (XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
> (XEN) alt table ffff82d08047a070 -> ffff82d080486c6c
> (XEN) [VT-D]INTR-REMAP: Request device [0000:f0:1f.0] fault index 0,
> iommu reg = ffff82c00072d000
> (XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear
> (XEN) microcode: CPU2 updated from revision 0x5000021 to 0x500002b, date
> = 2019-08-12
> 
> From other debugging, I know that this happens after CPU 1 (which is a
> hyperthread) has passed through start_secondary().
> 
> f0:1f.0 is one of the IO-APICs, and if I've cross referenced the DMAR
> and APIC tables properly, is the IO-APIC on the PCH, making the
> problematic IRQ GSI0.
> 
> This suggests that we have an error setting up the timer IRQ (as the
> HPET isn't MSI-capable), but we have already allegedly used it
> successfully earlier on boot.

Wait - is this really a system with the timer at GSI 0, i.e. without
an interrupt source override like this (as seen in Linux logs):

ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)

Otherwise isn't this rather an ExtInt arriving through the PIC? We
mask all PIC interrupts first thing, but I'm not sure there aren't
paths where we'd unmask any. Additionally I seem to vaguely recall
that the spurious interrupt can't be masked at the PIC, and I do
recall seeing (randomly, like you) spurious interrupts during boot
(can't tell offhand whether this was on AMD and/or Intel, and
perhaps on IOMMU-less systems only). I've never seen though what
you describe here.

Then again the log message saying "fault index 0" doesn't tell us
what the GSI is, or what IO-APIC pin the IRQ came through. All not
yet set up IO-APIC RTEs would specify a remapping index of zero.
Of course all such IO-APIC entries ought to have their mask bit
set - question is whether the system comes out of boot with one
(perhaps indeed an ExtInt one) not masked.
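
To illustrate why index 0 says little about the pin, here is a hedged sketch of how a 64-bit IO-APIC RTE in remappable format could be decoded. The bit positions follow my reading of the VT-d specification (format bit 48, index[14:0] in bits 63:49, index[15] in bit 11, mask in bit 16); the helper names are hypothetical and this is not Xen's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 48 selects the format: 1 = remappable, 0 = compatibility. */
static bool rte_is_remappable(uint64_t rte)
{
    return (rte >> 48) & 1;
}

/* The remapping index is split: bits 63:49 hold index[14:0], bit 11 holds index[15]. */
static unsigned int rte_remap_index(uint64_t rte)
{
    unsigned int low  = (unsigned int)((rte >> 49) & 0x7fff);
    unsigned int high = (unsigned int)((rte >> 11) & 1);

    return low | (high << 15);
}

/* Bit 16 is the mask bit in either format. */
static bool rte_is_masked(uint64_t rte)
{
    return (rte >> 16) & 1;
}
```

Any entry whose index bits were never programmed decodes to index 0, so every such pin would report the same fault index if it happened to be unmasked.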

Jan


* Re: [Xen-devel] HPET interrupt remapping during boot
  2019-10-09  9:31 ` Jan Beulich
@ 2019-10-09 10:11   ` Roger Pau Monné
  2019-10-09 10:41     ` Jan Beulich
  0 siblings, 1 reply; 8+ messages in thread
From: Roger Pau Monné @ 2019-10-09 10:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Wed, Oct 09, 2019 at 11:31:59AM +0200, Jan Beulich wrote:
> On 08.10.2019 20:30, Andrew Cooper wrote:
> > Hello,
> > 
> > I have no idea if this is a regression or not.  I suspect it might not
> > be, and has always been broken.
> > 
> > Either way, I'm seeing occasional single interrupt remapping errors when
> > booting a range of Intel systems
> > 
> > (XEN) x2APIC mode is already enabled by BIOS.
> > (XEN) Using APIC driver x2apic_cluster
> > ...
> > (XEN) Platform timer is 23.999MHz HPET
> > (XEN) Detected 2194.922 MHz processor.
> > ...
> > (XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
> > (XEN) alt table ffff82d08047a070 -> ffff82d080486c6c
> > (XEN) [VT-D]INTR-REMAP: Request device [0000:f0:1f.0] fault index 0,
> > iommu reg = ffff82c00072d000
> > (XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear
> > (XEN) microcode: CPU2 updated from revision 0x5000021 to 0x500002b, date
> > = 2019-08-12
> > 
> > From other debugging, I know that this happens after CPU 1 (which is a
> > hyperthread) has passed through start_secondary().
> > 
> > f0:1f.0 is one of the IO-APICs, and if I've cross referenced the DMAR
> > and APIC tables properly, is the IO-APIC on the PCH, making the
> > problematic IRQ GSI0.
> > 
> > This suggests that we have an error setting up the timer IRQ (as the
> > HPET isn't MSI-capable), but we have already allegedly used it
> > successfully earlier on boot.
> 
> Wait - is this really a system with the timer at GSI 0, i.e. without
> an interrupt source override like this (as seen in Linux logs):
> 
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)

I also have such a system, and I do have an entry like:

(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)

in the Xen log. I've added some more debug info to my build with the
following patch:

diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index f08eec070d..0d7abcd8fa 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -993,6 +993,7 @@ static void iommu_page_fault(int irq, void *dev_id,
      * specs since a new interrupt won't be generated until we clear all
      * the faults that caused this one to happen.
      */
+    WARN();
     tasklet_schedule(&vtd_fault_tasklet);
 }
 
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 59905629e1..2dbc7a3607 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -23,12 +23,59 @@
 #include <asm/hvm/io.h>
 #include <asm/setup.h>
 
+#include <asm/io_apic.h>
+
 const struct iommu_init_ops *__initdata iommu_init_ops;
 struct iommu_ops __read_mostly iommu_ops;
 
+static const char * delivery_mode_2_str(
+    const enum ioapic_irq_destination_types mode)
+{
+    switch ( mode )
+    {
+    case dest_Fixed: return "Fixed";
+    case dest_LowestPrio: return "LoPri";
+    case dest_SMI: return "SMI";
+    case dest_NMI: return "NMI";
+    case dest_INIT: return "INIT";
+    case dest_ExtINT: return "ExINT";
+    case dest__reserved_1:
+    case dest__reserved_2: return "Resvd";
+    default: return "INVAL";
+    }
+}
+
 int __init iommu_hardware_setup(void)
 {
     int rc;
+    int apic;
+
+    for ( apic = 0; apic < nr_ioapics; apic++ )
+    {
+        int pin;
+        /* See if any of the pins is left unmasked */
+        for (pin = 0; pin < nr_ioapic_entries[apic]; pin++) {
+            struct IO_APIC_route_entry rte = __ioapic_read_entry(apic, pin, 0);
+
+            /* Report every unmasked pin; one that is also in ExtInt
+             * mode is likely the pin where the i8259 is connected.
+             */
+            if (!rte.mask)
+            {
+                printk("ioapic %d pin %d not masked\n", apic, pin);
+                printk("vec=%02x delivery=%-5s dest=%c status=%d "
+                   "polarity=%d irr=%d trig=%c mask=%d dest_id:%0*x\n",
+                   rte.vector, delivery_mode_2_str(rte.delivery_mode),
+                   rte.dest_mode ? 'L' : 'P',
+                   rte.delivery_status, rte.polarity, rte.irr,
+                   rte.trigger ? 'L' : 'E', rte.mask,
+                   x2apic_enabled ? 8 : 2,
+                   x2apic_enabled ? rte.dest.dest32
+                                  : rte.dest.logical.logical_dest);
+            }
+        }
+    }
+
 
     if ( !iommu_init_ops )
         return -ENODEV;

And it does print the following when setting up the iommu:

(XEN) ioapic 0 pin 0 not masked
(XEN) vec=00 delivery=ExINT dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:00010000

I wonder, shouldn't all pins of all the io-apics be masked at boot?

Full Xen boot log below.

Thanks, Roger.
---
(XEN) Xen version 4.13-unstable (root@xenrtcloud) (gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516) debug=y  Wed Oct  9 12:03:37 CEST 2019
(XEN) Latest ChangeSet:   Copyright (C) 1994-2013 H. Peter Anvin et al
(XEN) build-id: 52e9ffb079497b12ad075c4d09e1c9070151bfdf
(XEN) Console output is synchronous.
(XEN) Bootloader: PXELINUX 4.07 2013-07-25
(XEN) Command line: cpuinfo dom0=verbose watchdog noreboot dom0_mem=8196M com2=115200,8n1 console=com2 guest_loglvl=all loglvl=all sync_console iommu=debug,verbose
(XEN) Xen image load base address: 01:255.255.248.0
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: none; EDID transfer time: 1 seconds
(XEN)  EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN)  Found 1 MBR signatures
(XEN)  Found 1 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 0000000000098c00 (usable)
(XEN)  0000000000098c00 - 00000000000a0000 (reserved)
(XEN)  00000000000e0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 0000000088dd5000 (usable)
(XEN)  0000000088dd5000 - 0000000088dd6000 (ACPI NVS)
(XEN)  0000000088dd6000 - 0000000088e20000 (reserved)
(XEN)  0000000088e20000 - 000000008cba2000 (usable)
(XEN)  000000008cba2000 - 000000008cf30000 (reserved)
(XEN)  000000008cf30000 - 000000008d0ef000 (usable)
(XEN)  000000008d0ef000 - 000000008d893000 (ACPI NVS)
(XEN)  000000008d893000 - 000000008faff000 (reserved)
(XEN)  000000008faff000 - 000000008fb00000 (usable)
(XEN)  000000008fb00000 - 000000008fc00000 (reserved)
(XEN)  00000000e0000000 - 00000000f0000000 (reserved)
(XEN)  00000000fe000000 - 00000000fe011000 (reserved)
(XEN)  00000000fec00000 - 00000000fec01000 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000ff000000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000000870000000 (usable)
(XEN) New Xen image base address: 0x8c400000
(XEN) ACPI: RSDP 000F05B0, 0024 (r2 SUPERM)
(XEN) ACPI: XSDT 8D2F70B0, 00DC (r1 SUPERM   SUPERM  1072009 AMI     10013)
(XEN) ACPI: FACP 8D3191C8, 010C (r5                  1072009 AMI     10013)
(XEN) ACPI: DSDT 8D2F7220, 21FA5 (r2 SUPERM SMCI--MB  1072009 INTL 20120913)
(XEN) ACPI: FACS 8D891F80, 0040
(XEN) ACPI: APIC 8D3192D8, 00BC (r3                  1072009 AMI     10013)
(XEN) ACPI: FPDT 8D319398, 0044 (r1                  1072009 AMI     10013)
(XEN) ACPI: FIDT 8D3193E0, 009C (r1                  1072009 AMI     10013)
(XEN) ACPI: SPMI 8D319480, 0041 (r5 SUPERM SMCI--MB        0 AMI.        0)
(XEN) ACPI: MCFG 8D3194C8, 003C (r1 SUPERM SMCI--MB  1072009 MSFT       97)
(XEN) ACPI: HPET 8D319508, 0038 (r1 SUPERM SMCI--MB  1072009 AMI.    5000B)
(XEN) ACPI: LPIT 8D319540, 0094 (r1 INTEL      GNLR        0 MSFT       5F)
(XEN) ACPI: SSDT 8D3195D8, 0248 (r2 INTEL  sensrhub        0 INTL 20120913)
(XEN) ACPI: SSDT 8D319820, 2BAE (r2 INTEL  PtidDevc     1000 INTL 20120913)
(XEN) ACPI: SSDT 8D31C3D0, 0BE3 (r2 INTEL  Ther_Rvp     1000 INTL 20120913)
(XEN) ACPI: DBGP 8D31CFB8, 0034 (r1 INTEL                  0 MSFT       5F)
(XEN) ACPI: DBG2 8D31CFF0, 0054 (r0 INTEL                  0 MSFT       5F)
(XEN) ACPI: SSDT 8D31D048, 06EB (r2  INTEL xh_Zumba        0 INTL 20120913)
(XEN) ACPI: PRAD 8D31D738, 00CA (r2 PRADID  PRADTID        1 INTL 20120913)
(XEN) ACPI: SSDT 8D31D808, 547E (r2 SaSsdt  SaSsdt      3000 INTL 20120913)
(XEN) ACPI: UEFI 8D322C88, 0042 (r1                        0             0)
(XEN) ACPI: SSDT 8D322CD0, 0E73 (r2 CpuRef  CpuSsdt     3000 INTL 20120913)
(XEN) ACPI: DMAR 8D323B48, 0070 (r1 INTEL      SKL         1 INTL        1)
(XEN) ACPI: EINJ 8D323BB8, 0130 (r1    AMI AMI.EINJ        0 AMI.        0)
(XEN) ACPI: ERST 8D323CE8, 0230 (r1  AMIER AMI.ERST        0 AMI.        0)
(XEN) ACPI: BERT 8D323F18, 0030 (r1    AMI AMI.BERT        0 AMI.        0)
(XEN) ACPI: HEST 8D323F48, 027C (r1    AMI AMI.HEST        0 AMI.        0)
(XEN) System RAM: 32716MB (33501884kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-0000000870000000
(XEN) Domain heap initialised
(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 158 (0x9e), Stepping 9 (raw 000906e9)
(XEN) found SMP MP-table at 000fcc40
(XEN) DMI 3.0 present.
(XEN) Using APIC driver default
(XEN) ACPI: PM-Timer IO Port: 0x1808 (32 bits)
(XEN) ACPI: v5 SLEEP INFO: control[0:0], status[0:0]
(XEN) ACPI: SLEEP INFO: pm1x_cnt[1:1804,1:0], pm1x_evt[1:1800,1:0]
(XEN) ACPI: 32/64X FACS address mismatch in FADT - 8d891f80/0000000000000000, using 32
(XEN) ACPI:             wakeup_vec[8d891f8c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x01] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x03] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x05] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x08] lapic_id[0x07] enabled)
(XEN) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x08] high edge lint[0x1])
(XEN) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) Enabling APIC mode:  Flat.  Using 1 I/O APICs
(XEN) ACPI: HPET id: 0x8086a701 base: 0xfed00000
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0000 buses 00 - ff
(XEN) PCI: MCFG area at e0000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-ff
(XEN) [VT-D]Host address width 39
(XEN) [VT-D]found ACPI_DMAR_DRHD:
(XEN) [VT-D]  dmaru->address = fed90000
(XEN) [VT-D]drhd->address = fed90000 iommu->reg = ffff82c00021d000
(XEN) [VT-D]cap = d2008c40660462 ecap = f050da
(XEN) [VT-D] IOAPIC: 0000:f0:1f.0
(XEN) [VT-D] MSI HPET: 0000:00:1f.0
(XEN) [VT-D]  flags: INCLUDE_ALL
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:00:14.0
(XEN) [VT-D]dmar.c:600:   RMRR region: base_addr 8cc87000 end_addr 8cca6fff
(XEN) Xen ERST support is initialized.
(XEN) HEST: Table parsing has been initialized
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 8 CPUs (0 hotplug CPUs)
(XEN) IRQ limits: 24 GSI, 1528 MSI/MSI-X
(XEN) Switched to APIC driver x2apic_cluster
(XEN) CPU: Physical Processor ID: 0
(XEN) CPU: Processor Core ID: 0
(XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
(XEN) CPU: L2 cache: 256K
(XEN) CPU: L3 cache: 8192K
(XEN) xstate: size: 0x440 and states: 0x1f
(XEN) mce_intel.c:778: MCA Capability: firstbank 0, extended MCE MSR 0, BCAST, CMCI
(XEN) CPU0: Thermal monitoring enabled (TM1)
(XEN) CPU0: Intel machine check reporting enabled
(XEN) Speculative mitigation facilities:
(XEN)   Hardware features: IBRS/IBPB STIBP
(XEN)   Compiled-in support: INDIRECT_THUNK SHADOW_PAGING
(XEN)   Xen settings: BTI-Thunk JMP, SPEC_CTRL: IBRS+, Other: IBPB L1TF_BARRIER
(XEN)   L1TF: believed vulnerable, maxphysaddr L1D 46, CPUID 39, Safe address 8000000000
(XEN)   Support for HVM VMs: MSR_SPEC_CTRL RSB EAGER_FPU
(XEN)   Support for PV VMs: MSR_SPEC_CTRL RSB EAGER_FPU
(XEN)   XPTI (64-bit PV only): Dom0 enabled, DomU enabled (with PCID)
(XEN)   PV L1TF shadowing: Dom0 disabled, DomU enabled
(XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Initializing Credit2 scheduler
(XEN)  load_precision_shift: 18
(XEN)  load_window_shift: 30
(XEN)  underload_balance_tolerance: 0
(XEN)  overload_balance_tolerance: -3
(XEN)  runqueues arrangement: socket
(XEN)  cap enforcement granularity: 10ms
(XEN) load tracking window length 1073741824 ns
(XEN) Initializing CPU#0
(XEN) Platform timer is 23.999MHz HPET
(XEN) Detected 3504.313 MHz processor.
(XEN) alt table ffff82d0804781b0 -> ffff82d080484a10
(XEN) spurious 8259A interrupt: IRQ7.
(XEN) ioapic 0 pin 0 not masked
(XEN) vec=00 delivery=ExINT dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:00010000
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB.
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Posted Interrupt not enabled.
(XEN) Intel VT-d Shared EPT tables enabled.
(XEN) I/O virtualisation enabled
(XEN) Xen WARN at iommu.c:996
(XEN) ----[ Xen-4.13-unstable  x86_64  debug=y   Tainted:  C   ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d08025e6df>] iommu.c#iommu_page_fault+0x4/0x14
(XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor
(XEN) rax: ffff82d08025e6db   rbx: ffff83086c601800   rcx: ffff82d0804a7bbc
(XEN) rdx: ffff82d0804a7c08   rsi: ffff83086c6b2a40   rdi: 0000000000000018
(XEN) rbp: ffff82d0804a7b78   rsp: ffff82d0804a7b78   r8:  0000000000000000
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: ffff82d0804a7c08   r13: 0000000000000028   r14: ffff83086c666860
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000003506e0
(XEN) cr3: 000000008c899000   cr2: 0000000000000000
(XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen code around <ffff82d08025e6df> (iommu.c#iommu_page_fault+0x4/0x14):
(XEN)  df fc ff ff 55 48 89 e5 <0f> 0b 48 8d 3d 58 60 36 00 e8 98 44 fe ff 5d c3
(XEN) Xen stack trace from rsp=ffff82d0804a7b78:
(XEN)    ffff82d0804a7bf8 ffff82d080285fe2 ffff82d080388851 ffff82d080388845
(XEN)    ffff82d0804a7bbc ffff83086c601824 0000000000000000 0000001880388845
(XEN)    ffff82d080388851 ffff82d080388845 ffff82d080388851 0000000000000000
(XEN)    0000000000000000 0000000000000000 ffff82d0804a7fff 0000000000000000
(XEN)    00007d2f7fb583d7 ffff82d0803888ba ffff82d0804a7cd0 0000000000000200
(XEN)    ffff82d0805bdf1b ffff82d0805af3a0 ffff82d0804a7d08 0000000000000000
(XEN)    0000000000000001 ffff82d080489300 0000000000000000 ffff82d0804546a0
(XEN)    ffff82d0804a7fff 0000000000000000 0000000000000000 000000000000000a
(XEN)    ffff82d08049c698 0000002800000000 ffff82d0802509a9 000000000000e008
(XEN)    0000000000000206 ffff82d0804a7cb8 0000000000000000 0000000000000206
(XEN)    0000000000000018 ffff82d0803f3e70 ffff82d0804a7d18 ffff82d0805bdf1b
(XEN)    ffff82d0804a7ce8 0000000000000001 0000000000000000 ffff82d08048d6e0
(XEN)    0000000000000007 ffff82d08048d6a0 ffff82d0804a7d68 ffff82d080250a26
(XEN)    ffff82d0804a7d68 ffff82d000000010 ffff82d0804a7d78 ffff82d0804a7d38
(XEN)    ffff82d080425911 ffff82d0803f3ee3 ffff82d0803f4632 0000000000000001
(XEN)    ffff82d080485a00 0000000000000102 ffff82d0804a7d88 ffff82d080417a1b
(XEN)    ffff82d0805affb0 ffff82d0805affb0 ffff82d0804a7ee8 ffff82d08042cf6e
(XEN)    00000000003e0df8 0000000000000002 0000000000000002 0000000000000002
(XEN)    0000000000000001 0000000000000001 0000000000000001 0000000000000001
(XEN)    0000000000000000 00000000000001ff 0000000001f18000 0000000001f18fff
(XEN) Xen call trace:
(XEN)    [<ffff82d08025e6df>] R iommu.c#iommu_page_fault+0x4/0x14
(XEN)    [<ffff82d080285fe2>] F do_IRQ+0x654/0x6c9
(XEN)    [<ffff82d0803888ba>] F common_interrupt+0x10a/0x120
(XEN)    [<ffff82d0802509a9>] F console.c#vprintk_common+0x121/0x158
(XEN)    [<ffff82d080250a26>] F printk+0x46/0x48
(XEN)    [<ffff82d080417a1b>] F iommu_setup+0xaa/0x18f
(XEN)    [<ffff82d08042cf6e>] F __start_xen+0x227e/0x28cf
(XEN)    [<ffff82d0802000f3>] F __high_start+0x53/0x55
(XEN)
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) CPU0: Intel(R) Xeon(R) CPU E3-1230 v6 @ 3.50GHz stepping 09
(XEN) nr_sockets: 1
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=0 pin2=0
(XEN) TSC deadline timer enabled
(XEN) Allocated console ring of 64 KiB.
(XEN) mwait-idle: MWAIT substates: 0x142120
(XEN) mwait-idle: v0.4.1 model 0x9e
(XEN) mwait-idle: lapic_timer_reliable_states 0xffffffff
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN)  - VMCS shadowing
(XEN)  - VM Functions
(XEN)  - Virtualisation Exceptions
(XEN)  - Page Modification Logging
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) alt table ffff82d0804781b0 -> ffff82d080484a10
(XEN) Booting processor 1/1 eip 72000
(XEN) Initializing CPU#1
(XEN) CPU: Physical Processor ID: 0
(XEN) CPU: Processor Core ID: 0
(XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
(XEN) CPU: L2 cache: 256K
(XEN) CPU: L3 cache: 8192K
(XEN) CPU1: Thermal monitoring enabled (TM1)
(XEN) CPU1: Intel(R) Xeon(R) CPU E3-1230 v6 @ 3.50GHz stepping 09
(XEN) [VT-D]iommu.c:882: iommu_fault_status: Fault Overflow
(XEN) [VT-D]iommu.c:884: iommu_fault_status: Primary Pending Fault
(XEN) [VT-D]INTR-REMAP: Request device [0000:f0:1f.0] fault index 0, iommu reg = ffff82c00021d000
(XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear
(XEN) Booting processor 2/2 eip 72000
(XEN) Initializing CPU#2
(XEN) CPU: Physical Processor ID: 0
(XEN) CPU: Processor Core ID: 1
(XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
(XEN) CPU: L2 cache: 256K
(XEN) CPU: L3 cache: 8192K
(XEN) CPU2: Thermal monitoring enabled (TM1)
(XEN) CPU2: Intel(R) Xeon(R) CPU E3-1230 v6 @ 3.50GHz stepping 09
(XEN) Booting processor 3/3 eip 72000
(XEN) Initializing CPU#3
(XEN) CPU: Physical Processor ID: 0
(XEN) CPU: Processor Core ID: 1
(XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
(XEN) CPU: L2 cache: 256K
(XEN) CPU: L3 cache: 8192K
(XEN) CPU3: Thermal monitoring enabled (TM1)
(XEN) CPU3: Intel(R) Xeon(R) CPU E3-1230 v6 @ 3.50GHz stepping 09
(XEN) Booting processor 4/4 eip 72000
(XEN) Initializing CPU#4
(XEN) CPU: Physical Processor ID: 0
(XEN) CPU: Processor Core ID: 2
(XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
(XEN) CPU: L2 cache: 256K
(XEN) CPU: L3 cache: 8192K
(XEN) CPU4: Thermal monitoring enabled (TM1)
(XEN) CPU4: Intel(R) Xeon(R) CPU E3-1230 v6 @ 3.50GHz stepping 09
(XEN) Booting processor 5/5 eip 72000
(XEN) Initializing CPU#5
(XEN) CPU: Physical Processor ID: 0
(XEN) CPU: Processor Core ID: 2
(XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
(XEN) CPU: L2 cache: 256K
(XEN) CPU: L3 cache: 8192K
(XEN) CPU5: Thermal monitoring enabled (TM1)
(XEN) CPU5: Intel(R) Xeon(R) CPU E3-1230 v6 @ 3.50GHz stepping 09
(XEN) Booting processor 6/6 eip 72000
(XEN) Initializing CPU#6
(XEN) CPU: Physical Processor ID: 0
(XEN) CPU: Processor Core ID: 3
(XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
(XEN) CPU: L2 cache: 256K
(XEN) CPU: L3 cache: 8192K
(XEN) CPU6: Thermal monitoring enabled (TM1)
(XEN) CPU6: Intel(R) Xeon(R) CPU E3-1230 v6 @ 3.50GHz stepping 09
(XEN) Booting processor 7/7 eip 72000
(XEN) Initializing CPU#7
(XEN) CPU: Physical Processor ID: 0
(XEN) CPU: Processor Core ID: 3
(XEN) CPU: L1 I cache: 32K, L1 D cache: 32K
(XEN) CPU: L2 cache: 256K
(XEN) CPU: L3 cache: 8192K
(XEN) CPU7: Thermal monitoring enabled (TM1)
(XEN) CPU7: Intel(R) Xeon(R) CPU E3-1230 v6 @ 3.50GHz stepping 09
(XEN) Brought up 8 CPUs
(XEN) Testing NMI watchdog on all CPUs: ok
(XEN) Adding cpu 0 to runqueue 0
(XEN)  First cpu on runqueue, activating
(XEN) Adding cpu 1 to runqueue 0
(XEN) Adding cpu 2 to runqueue 0
(XEN) Adding cpu 3 to runqueue 0
(XEN) Adding cpu 4 to runqueue 0
(XEN) Adding cpu 5 to runqueue 0
(XEN) Adding cpu 6 to runqueue 0
(XEN) Adding cpu 7 to runqueue 0
(XEN) Running stub recovery selftests...
(XEN) traps.c:1589: GPF (0000): ffff82d0bffff041 [ffff82d0bffff041] -> ffff82d08038a3f3
(XEN) traps.c:784: Trap 12: ffff82d0bffff040 [ffff82d0bffff040] -> ffff82d08038a3f3
(XEN) traps.c:1123: Trap 3: ffff82d0bffff041 [ffff82d0bffff041] -> ffff82d08038a3f3
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) Dom0 has maximum 792 PIRQs
(XEN) NX (Execute Disable) protection active
(XEN) *** Building a PV Dom0 ***
(XEN) ELF: phdr: paddr=0x1000000 memsz=0xf2d000
(XEN) ELF: phdr: paddr=0x2000000 memsz=0x41a000
(XEN) ELF: phdr: paddr=0x241a000 memsz=0x22d98
(XEN) ELF: phdr: paddr=0x243d000 memsz=0x238000
(XEN) ELF: memory: 0x1000000 -> 0x2675000
(XEN) ELF: note: GUEST_OS = "linux"
(XEN) ELF: note: GUEST_VERSION = "2.6"
(XEN) ELF: note: XEN_VERSION = "xen-3.0"
(XEN) ELF: note: VIRT_BASE = 0xffffffff80000000
(XEN) ELF: note: INIT_P2M = 0x8000000000
(XEN) ELF: note: ENTRY = 0xffffffff8243d180
(XEN) ELF: note: HYPERCALL_PAGE = 0xffffffff81001000
(XEN) ELF: note: FEATURES = "!writable_page_tables|pae_pgdir_above_4gb"
(XEN) ELF: note: SUPPORTED_FEATURES = 0x8801
(XEN) ELF: note: PAE_MODE = "yes"
(XEN) ELF: note: LOADER = "generic"
(XEN) ELF: note: unknown (0xd)
(XEN) ELF: note: SUSPEND_CANCEL = 0x1
(XEN) ELF: note: MOD_START_PFN = 0x1
(XEN) ELF: note: HV_START_LOW = 0xffff800000000000
(XEN) ELF: note: PADDR_OFFSET = 0
(XEN) ELF: note: PHYS32_ENTRY = 0x1000340
(XEN) ELF: Found PVH image
(XEN) ELF: addresses:
(XEN)     virt_base        = 0xffffffff80000000
(XEN)     elf_paddr_offset = 0x0
(XEN)     virt_offset      = 0xffffffff80000000
(XEN)     virt_kstart      = 0xffffffff81000000
(XEN)     virt_kend        = 0xffffffff82675000
(XEN)     virt_entry       = 0xffffffff8243d180
(XEN)     p2m_base         = 0x8000000000
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x2675000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000000850000000->0000000854000000 (2076348 pages to be allocated)
(XEN)  Init. ramdisk: 000000086eabc000->000000086ffff5ad
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff82675000
(XEN)  Init. ramdisk: 0000000000000000->0000000000000000
(XEN)  Phys-Mach map: 0000008000000000->0000008001002000
(XEN)  Start info:    ffffffff82675000->ffffffff826754b8
(XEN)  Xenstore ring: 0000000000000000->0000000000000000
(XEN)  Console ring:  0000000000000000->0000000000000000
(XEN)  Page tables:   ffffffff82676000->ffffffff8268d000
(XEN)  Boot stack:    ffffffff8268d000->ffffffff8268e000
(XEN)  TOTAL:         ffffffff80000000->ffffffff82800000
(XEN)  ENTRY ADDRESS: ffffffff8243d180
(XEN) Dom0 has maximum 8 VCPUs
(XEN) ELF: phdr 0 at 0xffffffff81000000 -> 0xffffffff81f2d000
(XEN) ELF: phdr 1 at 0xffffffff82000000 -> 0xffffffff8241a000
(XEN) ELF: phdr 2 at 0xffffffff8241a000 -> 0xffffffff8243cd98
(XEN) ELF: phdr 3 at 0xffffffff8243d000 -> 0xffffffff825a5000
(XEN) [VT-D]d0:Hostbridge: skip 0000:00:00.0 map
(XEN) [VT-D]d0:PCI: map 0000:00:13.0
(XEN) [VT-D]d0:PCI: map 0000:00:14.0
(XEN) [VT-D]d0:PCI: map 0000:00:14.2
(XEN) [VT-D]d0:PCI: map 0000:00:16.0
(XEN) [VT-D]d0:PCI: map 0000:00:16.1
(XEN) [VT-D]d0:PCI: map 0000:00:17.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.2
(XEN) [VT-D]d0:PCI: map 0000:00:1f.4
(XEN) [VT-D]d0:PCIe: map 0000:02:00.0
(XEN) [VT-D]d0:PCIe: map 0000:02:00.1
(XEN) [VT-D]d0:PCI: map 0000:06:00.0
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021d000
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM in background
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***************************************************
(XEN) WARNING: CONSOLE OUTPUT IS SYNCHRONOUS
(XEN) This option is intended to aid debugging of Xen by ensuring
(XEN) that all output is synchronously delivered on the serial line.
(XEN) However it can introduce SIGNIFICANT latencies and affect
(XEN) timekeeping. It is NOT recommended for production use!
(XEN) ***************************************************
(XEN) Booted on L1TF-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Please assess your configuration and choose an
(XEN) explicit 'smt=<bool>' setting.  See XSA-273.
(XEN) ***************************************************
(XEN) Booted on MLPDS/MFBDS-vulnerable hardware with SMT/Hyperthreading
(XEN) enabled.  Mitigations will not be fully effective.  Please
(XEN) choose an explicit smt=<bool> setting.  See XSA-297.
(XEN) ***************************************************
(XEN) 3... 2... 1...
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) Freed 536kB init memory


* Re: [Xen-devel] HPET interrupt remapping during boot
  2019-10-09 10:11   ` Roger Pau Monné
@ 2019-10-09 10:41     ` Jan Beulich
  2019-10-09 11:29       ` Roger Pau Monné
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2019-10-09 10:41 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 09.10.2019 12:11, Roger Pau Monné  wrote:
> And it does print the following when setting up the iommu:
> 
> (XEN) ioapic 0 pin 0 not masked
> (XEN) vec=00 delivery=ExINT dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:00010000
> 
> I wonder, shouldn't all pins of all the io-apics be masked at boot?

I think you might get different answers here depending on whether
you ask firmware or OS people. In fact there are cases where the
IO-APIC needs to be left in this state, I think, but such would
likely need properly reflecting in ACPI tables (albeit I don't
know/recall how this would be done; looking at the code ). This goes back to times
when IO-APICs were new and OSes would not even know about them,
yet they wouldn't get any interrupts to work if fiddling with
only the PIC (sitting behind IO-APIC pin 0).

See enable_IO_APIC(), where we actually use this property to
determine the pin behind which the 8259 sits.
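
[Editorial sketch, not Xen's actual code: the check described above amounts
to scanning the redirection table for an unmasked entry whose delivery mode
is ExtINT. The helper names and the flat array of raw 64-bit RTEs are
assumptions made for illustration; the bit positions (delivery mode in bits
8-10, mask in bit 16, ExtINT encoded as 7) follow the usual IO-APIC RTE
layout.]

```c
#include <stdbool.h>
#include <stdint.h>

/* Delivery mode encodings used in IO-APIC redirection table entries. */
enum { DM_FIXED = 0, DM_EXTINT = 7 };

/* Hypothetical helper: delivery mode lives in bits 8-10 of the RTE. */
static unsigned int rte_delivery_mode(uint64_t rte)
{
    return (rte >> 8) & 0x7;
}

/* Hypothetical helper: the mask bit is bit 16 of the RTE. */
static bool rte_masked(uint64_t rte)
{
    return (rte >> 16) & 0x1;
}

/*
 * Return the first pin holding an unmasked ExtINT entry, or -1 if there
 * is none.  Operates on an in-memory copy of the RTEs; real code would
 * read them through the IO-APIC register window.
 */
static int find_extint_pin(const uint64_t *rtes, unsigned int nr_pins)
{
    for ( unsigned int pin = 0; pin < nr_pins; pin++ )
        if ( !rte_masked(rtes[pin]) &&
             rte_delivery_mode(rtes[pin]) == DM_EXTINT )
            return (int)pin;

    return -1;
}
```

[With the entry logged above (vec=00, delivery=ExINT, mask=0) in pin 0 and
every other pin masked, such a scan would report pin 0.]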

I've seen quite a few systems where in the BIOS setup you have an
option to select whether you have an "ACPI OS" (wording of course
varies). I've never checked whether this may e.g. be reflected
in the handover state of the GSI 0 RTE.

In your testing patch, could you also log the PIC mask bytes?
There ought to be at least one unmasked; or wait - there actually
is a spurious interrupt there (right before IOMMU initialization):

(XEN) spurious 8259A interrupt: IRQ7.

Hence I wonder if there's not possibly a 2nd one once the IOMMU
has been set up.

Jan

* Re: [Xen-devel] HPET interrupt remapping during boot
  2019-10-09 10:41     ` Jan Beulich
@ 2019-10-09 11:29       ` Roger Pau Monné
  2019-10-09 12:03         ` Jan Beulich
  0 siblings, 1 reply; 8+ messages in thread
From: Roger Pau Monné @ 2019-10-09 11:29 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Wed, Oct 09, 2019 at 12:41:05PM +0200, Jan Beulich wrote:
> On 09.10.2019 12:11, Roger Pau Monné  wrote:
> > And it does print the following when setting up the iommu:
> > 
> > (XEN) ioapic 0 pin 0 not masked
> > (XEN) vec=00 delivery=ExINT dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:00010000
> > 
> > I wonder, shouldn't all pins of all the io-apics be masked at boot?
> 
> I think you might get different answers here depending on whether
> you ask firmware or OS people. In fact there are cases where the
> IO-APIC needs to be left in this state, I think, but such would
> likely need properly reflecting in ACPI tables (albeit I don't
> know/recall how this would be done; looking at the code ). This goes back to times
> when IO-APICs were new and OSes would not even know about them,
> yet they wouldn't get any interrupts to work if fiddling with
> only the PIC (sitting behind IO-APIC pin 0).
> 
> See enable_IO_APIC(), where we actually use this property to
> determine the pin behind which the 8259 sits.
> 
> I've seen quite many systems where in the BIOS setup you have an
> option to select whether you have an "ACPI OS" (wording of course
> varies). I've never checked whether this may e.g. reflect itself
> in the handover state of the GSI 0 RTE.
> 
> In your testing patch, could you also log the PIC mask bytes?
> There ought to be at least one unmasked; or wait - there actually
> is a spurious interrupt there (right before IOMMU initialization):
> 
> (XEN) spurious 8259A interrupt: IRQ7.

So I've added a log of the PIC masks just before checking the ioapic
masks:

(XEN) 8259A-1 mask: fe 8259A-2 mask: ff

AFAICT IRQ7 seems to be unmasked? Sorry, my knowledge of PICs is quite
limited since I've never had to deal with them.

The line I've added is:

printk("8259A-1 mask: %x 8259A-2 mask: %x\n", inb(0x21), inb(0xA1));

I wonder why Xen even has any code to deal with the PICs;
shouldn't we rely on io-apics only for legacy delivery?

> Hence I wonder if there's not possibly a 2nd one once the IOMMU
> has been set up.

Right, then I guess we either mask all the io-apic pins or we set up
proper remapping entries for non-masked pins? (in order to avoid iommu
faults)

Thanks, Roger.

* Re: [Xen-devel] HPET interrupt remapping during boot
  2019-10-09 11:29       ` Roger Pau Monné
@ 2019-10-09 12:03         ` Jan Beulich
  2019-10-09 13:56           ` Roger Pau Monné
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2019-10-09 12:03 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 09.10.2019 13:29, Roger Pau Monné  wrote:
> On Wed, Oct 09, 2019 at 12:41:05PM +0200, Jan Beulich wrote:
>> On 09.10.2019 12:11, Roger Pau Monné  wrote:
>>> And it does print the following when setting up the iommu:
>>>
>>> (XEN) ioapic 0 pin 0 not masked
>>> (XEN) vec=00 delivery=ExINT dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:00010000
>>>
>>> I wonder, shouldn't all pins of all the io-apics be masked at boot?
>>
>> I think you might get different answers here depending on whether
>> you ask firmware or OS people. In fact there are cases where the
>> IO-APIC needs to be left in this state, I think, but such would
>> likely need properly reflecting in ACPI tables (albeit I don't
>> know/recall how this would be done; looking at the code ). This goes back to times
>> when IO-APICs were new and OSes would not even know about them,
>> yet they wouldn't get any interrupts to work if fiddling with
>> only the PIC (sitting behind IO-APIC pin 0).
>>
>> See enable_IO_APIC(), where we actually use this property to
>> determine the pin behind which the 8259 sits.
>>
>> I've seen quite many systems where in the BIOS setup you have an
>> option to select whether you have an "ACPI OS" (wording of course
>> varies). I've never checked whether this may e.g. reflect itself
>> in the handover state of the GSI 0 RTE.
>>
>> In your testing patch, could you also log the PIC mask bytes?
>> There ought to be at least one unmasked; or wait - there actually
>> is a spurious interrupt there (right before IOMMU initialization):
>>
>> (XEN) spurious 8259A interrupt: IRQ7.
> 
> So I've added a log of the PIC masks just before checking the ioapic
> masks:
> 
> (XEN) 8259A-1 mask: fe 8259A-2 mask: ff
> 
> AFAICT IRQ7 seems to be unmasked? Sorry my knowledge of PICs is quite
> limited since I've never had to deal with them.

That's IRQ0 then which is unmasked. As said the spurious one
(IRQ7) can't be masked (at the PIC); only the "normal" IRQ7 can
be.
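
[Editorial sketch: a set bit in an 8259A mask byte means the corresponding
IRQ line is masked, so 0xfe on the master leaves only bit 0 clear, i.e. IRQ0
unmasked. The function name below is hypothetical; it merely combines the
two bytes read from ports 0x21 and 0xA1 into a bitmap of unmasked legacy
IRQs.]

```c
#include <stdint.h>

/*
 * Combine the master (port 0x21) and slave (port 0xA1) mask bytes and
 * return a bitmap in which bit N is set if legacy IRQ N is unmasked.
 * Master bits cover IRQ0-7, slave bits cover IRQ8-15.
 */
static uint16_t pic_unmasked_irqs(uint8_t master_mask, uint8_t slave_mask)
{
    uint16_t masked = (uint16_t)(master_mask | ((uint16_t)slave_mask << 8));

    return (uint16_t)~masked;
}
```

[For the logged values, pic_unmasked_irqs(0xfe, 0xff) yields 0x0001: only
IRQ0 is unmasked. An unmasked IRQ7 would instead show up as a master mask
with bit 7 clear, e.g. 0x7f.]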

> The line I've added is:
> 
> printk("8259A-1 mask: %x 8259A-2 mask: %x\n", inb(0x21), inb(0xA1));
> 
> I wonder why does Xen even has any code to deal with the PICs,
> shouldn't we rely on io-apics only for legacy delivery?

There are (were?) systems where things wouldn't work without.

>> Hence I wonder if there's not possibly a 2nd one once the IOMMU
>> has been set up.
> 
> Right, then I guess we either mask all the io-apic pins or we setup
> proper remapping entries for non-masked pins? (in order to avoid iommu
> faults)

Masking the ExtInt entry is at least worth an experiment, to
(hopefully) confirm that this would take care of the IOMMU
fault. But I'm afraid (as per above) it's not an option in
general. What I could see us doing is mask the entry if all
legacy IRQs are handled through the IO-APIC. This would take
care of spurious interrupts, as these are the only ones
which can make it through when the PIC mask bits are all set.
However, maybe it is legitimate to mask the ExtInt entry
when an IOMMU comes into play.

As to "proper" remapping entries: I'll have to look at what the
spec says about this. There's only one IRT index
that we can put in the RTE, yet this would need to serve all
15 IRQs potentially coming through the PIC. Recall that the
vector gets supplied by the PIC in the ExtInt case, not by
the IO-APIC RTE.

Jan

* Re: [Xen-devel] HPET interrupt remapping during boot
  2019-10-09 12:03         ` Jan Beulich
@ 2019-10-09 13:56           ` Roger Pau Monné
  2019-10-09 14:59             ` Jan Beulich
  0 siblings, 1 reply; 8+ messages in thread
From: Roger Pau Monné @ 2019-10-09 13:56 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Wed, Oct 09, 2019 at 02:03:12PM +0200, Jan Beulich wrote:
> On 09.10.2019 13:29, Roger Pau Monné  wrote:
> > On Wed, Oct 09, 2019 at 12:41:05PM +0200, Jan Beulich wrote:
> >> On 09.10.2019 12:11, Roger Pau Monné  wrote:
> >>> And it does print the following when setting up the iommu:
> >>>
> >>> (XEN) ioapic 0 pin 0 not masked
> >>> (XEN) vec=00 delivery=ExINT dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:00010000
> >>>
> >>> I wonder, shouldn't all pins of all the io-apics be masked at boot?
> >>
> >> I think you might get different answers here depending on whether
> >> you ask firmware or OS people. In fact there are cases where the
> >> IO-APIC needs to be left in this state, I think, but such would
> >> likely need properly reflecting in ACPI tables (albeit I don't
> >> know/recall how this would be done; looking at the code ). This goes back to times
> >> when IO-APICs were new and OSes would not even know about them,
> >> yet they wouldn't get any interrupts to work if fiddling with
> >> only the PIC (sitting behind IO-APIC pin 0).
> >>
> >> See enable_IO_APIC(), where we actually use this property to
> >> determine the pin behind which the 8259 sits.
> >>
> >> I've seen quite many systems where in the BIOS setup you have an
> >> option to select whether you have an "ACPI OS" (wording of course
> >> varies). I've never checked whether this may e.g. reflect itself
> >> in the handover state of the GSI 0 RTE.
> >>
> >> In your testing patch, could you also log the PIC mask bytes?
> >> There ought to be at least one unmasked; or wait - there actually
> >> is a spurious interrupt there (right before IOMMU initialization):
> >>
> >> (XEN) spurious 8259A interrupt: IRQ7.
> > 
> > So I've added a log of the PIC masks just before checking the ioapic
> > masks:
> > 
> > (XEN) 8259A-1 mask: fe 8259A-2 mask: ff
> > 
> > AFAICT IRQ7 seems to be unmasked? Sorry my knowledge of PICs is quite
> > limited since I've never had to deal with them.
> 
> That's IRQ0 then which is unmasked. As said the spurious one
> (IRQ7) can't be masked (at the PIC); only the "normal" IRQ7 can
> be.
> 
> > The line I've added is:
> > 
> > printk("8259A-1 mask: %x 8259A-2 mask: %x\n", inb(0x21), inb(0xA1));
> > 
> > I wonder why does Xen even has any code to deal with the PICs,
> > shouldn't we rely on io-apics only for legacy delivery?
> 
> There are (were?) systems where things wouldn't work without.
> 
> >> Hence I wonder if there's not possibly a 2nd one once the IOMMU
> >> has been set up.
> > 
> > Right, then I guess we either mask all the io-apic pins or we setup
> > proper remapping entries for non-masked pins? (in order to avoid iommu
> > faults)
> 
> Masking the ExtInt entry is at least worth an experiment, to
> (hopefully) confirm that this would take care of the IOMMU
> fault. But I'm afraid (as per above) it's not an option in
> general. What I could see us doing is mask the entry if all
> legacy IRQs are handled through the IO-APIC. This would take
> care of spurious interrupts, as these are the only ones
> which can make it through when the PIC mask bits are all set.
> However, maybe it is legitimate to mask the ExtInt entry
> when an IOMMU comes into play.

That was my thinking, i.e.: make sure every io-apic pin is masked before
enabling iommu interrupt remapping. Nothing useful can come of
having io-apic pins unmasked, as the remapping is not set up anyway.
If/when those pins get used, a proper remapping entry is going to be
set up, and the pin would then be unmasked.
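
[Editorial sketch of the idea, under stated assumptions: walk every pin and
set the mask bit on any entry firmware left unmasked, before interrupt
remapping is switched on. The function name and the in-memory array of raw
RTEs are hypothetical; actual code would read and write the entries through
the IO-APIC register accessors, and, as noted earlier in the thread, masking
the ExtINT entry may not be acceptable on every system.]

```c
#include <stdint.h>

/* The mask bit is bit 16 of an IO-APIC redirection table entry. */
#define RTE_MASK_BIT (1ULL << 16)

/*
 * Mask every pin that is currently unmasked and return how many entries
 * had to be changed.  Works on an in-memory copy of the RTEs only.
 */
static unsigned int mask_all_pins(uint64_t *rtes, unsigned int nr_pins)
{
    unsigned int changed = 0;

    for ( unsigned int pin = 0; pin < nr_pins; pin++ )
    {
        if ( rtes[pin] & RTE_MASK_BIT )
            continue;               /* already masked, leave untouched */

        rtes[pin] |= RTE_MASK_BIT;
        changed++;
    }

    return changed;
}
```

[The return value gives a caller something to log, much like the
"ioapic 0 pin 0 not masked" message from the testing patch.]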

> As to "proper" remapping entries: I'll have to look at the
> spec what they say about this. There's only one IRT index
> that we can put in the RTE, yet this would need to serve all
> 15 IRQs potentially coming through the PIC. Recall that the
> vector gets supplied by the PIC in the ExtInt case, not by
> the IO-APIC RTE.

You can set the delivery mode of the IRTE to ExtINT, much like how this
is done on the io-apic, and then poke the PIC to figure out which IRQ
triggered?

Roger.

* Re: [Xen-devel] HPET interrupt remapping during boot
  2019-10-09 13:56           ` Roger Pau Monné
@ 2019-10-09 14:59             ` Jan Beulich
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2019-10-09 14:59 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 09.10.2019 15:56, Roger Pau Monné  wrote:
> On Wed, Oct 09, 2019 at 02:03:12PM +0200, Jan Beulich wrote:
>> On 09.10.2019 13:29, Roger Pau Monné  wrote:
>>> Right, then I guess we either mask all the io-apic pins or we setup
>>> proper remapping entries for non-masked pins? (in order to avoid iommu
>>> faults)
>>
>> Masking the ExtInt entry is at least worth an experiment, to
>> (hopefully) confirm that this would take care of the IOMMU
>> fault. But I'm afraid (as per above) it's not an option in
>> general. What I could see us doing is mask the entry if all
>> legacy IRQs are handled through the IO-APIC. This would take
>> care of spurious interrupts, as these are the only ones
>> which can make it through when the PIC mask bits are all set.
>> However, maybe it is legitimate to mask the ExtInt entry
>> when an IOMMU comes into play.
> 
> That was my thinking, ie: make sure every io-apic pin is masked before
> enabling iommu interrupt remapping. Nothing useful can happen of
> having io-apic pins unmasked, as the remapping is not setup anyway.
> If/when those pins get used a proper remapping entry is going to be
> setup, and the pin would then be unmasked.

Well, this isn't the only option. Another would be to transform
all unmasked entries to be translated.

>> As to "proper" remapping entries: I'll have to look at the
>> spec what they say about this. There's only one IRT index
>> that we can put in the RTE, yet this would need to serve all
>> 15 IRQs potentially coming through the PIC. Recall that the
>> vector gets supplied by the PIC in the ExtInt case, not by
>> the IO-APIC RTE.
> 
> You can set the delivery mode of the IRTE to ExtINT, much like how this
> is done on the io-apic, and then poke the PIC to figure out which IRQ
> triggered?

Hmm, yes - it didn't even occur to me that VT-d might allow
ExtInt as delivery mode; too much AMD IOMMU work recently,
where the only way to deal with ExtInt is a "don't remap"
flag in the device table entry.

Jan

Thread overview: 8+ messages
2019-10-08 18:30 [Xen-devel] HPET interrupt remapping during boot Andrew Cooper
2019-10-09  9:31 ` Jan Beulich
2019-10-09 10:11   ` Roger Pau Monné
2019-10-09 10:41     ` Jan Beulich
2019-10-09 11:29       ` Roger Pau Monné
2019-10-09 12:03         ` Jan Beulich
2019-10-09 13:56           ` Roger Pau Monné
2019-10-09 14:59             ` Jan Beulich