xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [Xen-devel] [PATCH v2 0/2] iommu: fixes for interrupt remapping enabling
@ 2019-10-10 11:33 Roger Pau Monne
  2019-10-10 11:33 ` [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU Roger Pau Monne
  2019-10-10 11:33 ` [Xen-devel] [PATCH v2 2/2] iommu: translate IO-APIC pins when enabling interrupt remapping Roger Pau Monne
  0 siblings, 2 replies; 16+ messages in thread
From: Roger Pau Monne @ 2019-10-10 11:33 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Jan Beulich, Wei Liu, Andrew Cooper,
	Suravee Suthikulpanit, Roger Pau Monne

Hello,

The following series contain two fixes for issues found when enabling
interrupt remapping and the IO-APIC already has unmasked pins. While I'm
not aware of any system malfunctioning (apart from reporting IOMMU
interrupt remapping faults) I think it would be nice to have those in
4.13.

The series can also be found at:

git://xenbits.xen.org/people/royger/xen.git iommu_intr_v2

Thanks, Roger.

Roger Pau Monne (2):
  x2APIC: translate IO-APIC entries when enabling the IOMMU
  iommu: translate IO-APIC pins when enabling interrupt remapping

 xen/arch/x86/apic.c                           |  9 +-
 xen/arch/x86/io_apic.c                        |  5 +-
 xen/drivers/passthrough/amd/iommu_init.c      | 11 ++-
 xen/drivers/passthrough/amd/iommu_intr.c      | 90 +------------------
 xen/drivers/passthrough/x86/iommu.c           | 34 ++++++-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  1 +
 xen/include/asm-x86/io_apic.h                 |  3 +-
 7 files changed, 52 insertions(+), 101 deletions(-)

-- 
2.23.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-10 11:33 [Xen-devel] [PATCH v2 0/2] iommu: fixes for interrupt remapping enabling Roger Pau Monne
@ 2019-10-10 11:33 ` Roger Pau Monne
  2019-10-10 11:54   ` Jan Beulich
  2019-10-10 11:33 ` [Xen-devel] [PATCH v2 2/2] iommu: translate IO-APIC pins when enabling interrupt remapping Roger Pau Monne
  1 sibling, 1 reply; 16+ messages in thread
From: Roger Pau Monne @ 2019-10-10 11:33 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

When interrupt remapping is enabled as part of enabling x2APIC the
IO-APIC entries also need to be translated to the new format and added
to the interrupt remapping table.

This prevents IOMMU interrupt remapping faults when booting on
hardware that has unmasked IO-APIC pins.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Juergen Gross <jgross@suse.com>
---
Changes since v1:
 - Remove the unneeded iommu_enabled local variable.
---
 xen/arch/x86/apic.c           | 9 +++++++--
 xen/arch/x86/io_apic.c        | 5 +++--
 xen/include/asm-x86/io_apic.h | 3 ++-
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index 6cdb50cf41..8415d9f787 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -515,7 +515,7 @@ static void resume_x2apic(void)
     iommu_enable_x2apic();
     __enable_x2apic();
 
-    restore_IO_APIC_setup(ioapic_entries);
+    restore_IO_APIC_setup(ioapic_entries, true);
     unmask_8259A();
 
 out:
@@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
         printk("Switched to APIC driver %s\n", genapic.name);
 
 restore_out:
-    restore_IO_APIC_setup(ioapic_entries);
+    /*
+     * NB: do not use raw mode when restoring entries if the iommu has been
+     * enabled during the process, because the entries need to be translated
+     * and added to the remapping table in that case.
+     */
+    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);
     unmask_8259A();
 
 out:
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index 5d25862bd8..37eabc16c9 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -379,7 +379,8 @@ void mask_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries)
 /*
  * Restore IO APIC entries which was saved in ioapic_entries.
  */
-int restore_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries)
+int restore_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries,
+                          bool raw)
 {
     int apic, pin;
 
@@ -394,7 +395,7 @@ int restore_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries)
             return -ENOMEM;
 
         for (pin = 0; pin < nr_ioapic_entries[apic]; pin++)
-	    ioapic_write_entry(apic, pin, 1, ioapic_entries[apic][pin]);
+	    ioapic_write_entry(apic, pin, raw, ioapic_entries[apic][pin]);
     }
 
     return 0;
diff --git a/xen/include/asm-x86/io_apic.h b/xen/include/asm-x86/io_apic.h
index 0b041f0565..998905186b 100644
--- a/xen/include/asm-x86/io_apic.h
+++ b/xen/include/asm-x86/io_apic.h
@@ -197,7 +197,8 @@ extern struct IO_APIC_route_entry **alloc_ioapic_entries(void);
 extern void free_ioapic_entries(struct IO_APIC_route_entry **ioapic_entries);
 extern int save_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries);
 extern void mask_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries);
-extern int restore_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries);
+extern int restore_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries,
+                                 bool raw);
 
 unsigned highest_gsi(void);
 
-- 
2.23.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Xen-devel] [PATCH v2 2/2] iommu: translate IO-APIC pins when enabling interrupt remapping
  2019-10-10 11:33 [Xen-devel] [PATCH v2 0/2] iommu: fixes for interrupt remapping enabling Roger Pau Monne
  2019-10-10 11:33 ` [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU Roger Pau Monne
@ 2019-10-10 11:33 ` Roger Pau Monne
  2019-10-10 11:59   ` Jan Beulich
  1 sibling, 1 reply; 16+ messages in thread
From: Roger Pau Monne @ 2019-10-10 11:33 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Suravee Suthikulpanit, Wei Liu, Andrew Cooper,
	Jan Beulich, Roger Pau Monne

On Intel hardware there's currently no translation of already enabled
IO-APIC pins when interrupt remapping is enabled on the IOMMU, hence
introduce a logic similar to the one used in x2apic_bsp_setup in order
to save and mask all IO-APIC pins, and then translate and restore them
after interrupt remapping has been enabled.

With this change the AMD specific logic to deal with enabled pins
(amd_iommu_setup_ioapic_remapping) can be removed, thus unifying the
handling of IO-APIC when enabling interrupt remapping regardless of
the IOMMU vendor.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Juergen Gross <jgross@suse.com>
---
 xen/drivers/passthrough/amd/iommu_init.c      | 11 ++-
 xen/drivers/passthrough/amd/iommu_intr.c      | 90 +------------------
 xen/drivers/passthrough/x86/iommu.c           | 34 ++++++-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  1 +
 4 files changed, 40 insertions(+), 96 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
index 6f53c7ec08..3c244619b9 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -19,6 +19,7 @@
 
 #include <xen/errno.h>
 #include <xen/acpi.h>
+#include <xen/keyhandler.h>
 #include <xen/pci.h>
 #include <xen/pci_regs.h>
 #include <xen/irq.h>
@@ -1435,12 +1436,6 @@ int __init amd_iommu_init(bool xt)
     if ( rc )
         goto error_out;
 
-    /* initialize io-apic interrupt remapping entries */
-    if ( iommu_intremap )
-        rc = amd_iommu_setup_ioapic_remapping();
-    if ( rc )
-        goto error_out;
-
     /* Allocate and initialize device table(s). */
     pci_init = !xt;
     rc = iterate_ivrs_mappings(amd_iommu_setup_device_table);
@@ -1469,6 +1464,10 @@ int __init amd_iommu_init(bool xt)
             goto error_out;
     }
 
+    if ( iommu_intremap )
+        register_keyhandler('V', &amd_iommu_dump_intremap_tables,
+                            "dump IOMMU intremap tables", 0);
+
     return 0;
 
 error_out:
diff --git a/xen/drivers/passthrough/amd/iommu_intr.c b/xen/drivers/passthrough/amd/iommu_intr.c
index fb71073c84..1eed60f265 100644
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -21,7 +21,6 @@
 #include <asm/amd-iommu.h>
 #include <asm/hvm/svm/amd-iommu-proto.h>
 #include <asm/io_apic.h>
-#include <xen/keyhandler.h>
 #include <xen/softirq.h>
 
 union irte32 {
@@ -79,8 +78,6 @@ unsigned long *shared_intremap_inuse;
 static DEFINE_SPINLOCK(shared_intremap_lock);
 unsigned int nr_ioapic_sbdf;
 
-static void dump_intremap_tables(unsigned char key);
-
 #define intremap_page_order(irt) PFN_ORDER(virt_to_page(irt))
 
 unsigned int amd_iommu_intremap_table_order(
@@ -354,91 +351,6 @@ static int update_intremap_entry_from_ioapic(
     return 0;
 }
 
-int __init amd_iommu_setup_ioapic_remapping(void)
-{
-    struct IO_APIC_route_entry rte;
-    unsigned long flags;
-    union irte_ptr entry;
-    int apic, pin;
-    u8 delivery_mode, dest, vector, dest_mode;
-    u16 seg, bdf, req_id;
-    struct amd_iommu *iommu;
-    spinlock_t *lock;
-    unsigned int offset;
-
-    /* Read ioapic entries and update interrupt remapping table accordingly */
-    for ( apic = 0; apic < nr_ioapics; apic++ )
-    {
-        for ( pin = 0; pin < nr_ioapic_entries[apic]; pin++ )
-        {
-            unsigned int idx;
-
-            rte = __ioapic_read_entry(apic, pin, 1);
-            if ( rte.mask == 1 )
-                continue;
-
-            /* get device id of ioapic devices */
-            idx = ioapic_id_to_index(IO_APIC_ID(apic));
-            if ( idx == MAX_IO_APICS )
-                return -EINVAL;
-
-            bdf = ioapic_sbdf[idx].bdf;
-            seg = ioapic_sbdf[idx].seg;
-            iommu = find_iommu_for_device(seg, bdf);
-            if ( !iommu )
-            {
-                AMD_IOMMU_DEBUG("Fail to find iommu for ioapic "
-                                "device id = %04x:%04x\n", seg, bdf);
-                continue;
-            }
-
-            req_id = get_intremap_requestor_id(iommu->seg, bdf);
-            lock = get_intremap_lock(iommu->seg, req_id);
-
-            delivery_mode = rte.delivery_mode;
-            vector = rte.vector;
-            dest_mode = rte.dest_mode;
-            dest = rte.dest.logical.logical_dest;
-
-            if ( iommu->ctrl.xt_en )
-            {
-                /*
-                 * In x2APIC mode we have no way of discovering the high 24
-                 * bits of the destination of an already enabled interrupt.
-                 * We come here earlier than for xAPIC mode, so no interrupts
-                 * should have been set up before.
-                 */
-                AMD_IOMMU_DEBUG("Unmasked IO-APIC#%u entry %u in x2APIC mode\n",
-                                IO_APIC_ID(apic), pin);
-            }
-
-            spin_lock_irqsave(lock, flags);
-            offset = alloc_intremap_entry(iommu, req_id, 1);
-            BUG_ON(offset >= INTREMAP_MAX_ENTRIES);
-            entry = get_intremap_entry(iommu, req_id, offset);
-            update_intremap_entry(iommu, entry, vector,
-                                  delivery_mode, dest_mode, dest);
-            spin_unlock_irqrestore(lock, flags);
-
-            set_rte_index(&rte, offset);
-            ioapic_sbdf[idx].pin_2_idx[pin] = offset;
-            __ioapic_write_entry(apic, pin, 1, rte);
-
-            if ( iommu->enabled )
-            {
-                spin_lock_irqsave(&iommu->lock, flags);
-                amd_iommu_flush_intremap(iommu, req_id);
-                spin_unlock_irqrestore(&iommu->lock, flags);
-            }
-        }
-    }
-
-    register_keyhandler('V', &dump_intremap_tables,
-                        "dump IOMMU intremap tables", 0);
-
-    return 0;
-}
-
 void amd_iommu_ioapic_update_ire(
     unsigned int apic, unsigned int reg, unsigned int value)
 {
@@ -982,7 +894,7 @@ static int dump_intremap_mapping(const struct amd_iommu *iommu,
     return 0;
 }
 
-static void dump_intremap_tables(unsigned char key)
+void amd_iommu_dump_intremap_tables(unsigned char key)
 {
     if ( !shared_intremap_table )
     {
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 59905629e1..2cf528e760 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -21,6 +21,7 @@
 #include <xsm/xsm.h>
 
 #include <asm/hvm/io.h>
+#include <asm/io_apic.h>
 #include <asm/setup.h>
 
 const struct iommu_init_ops *__initdata iommu_init_ops;
@@ -28,6 +29,7 @@ struct iommu_ops __read_mostly iommu_ops;
 
 int __init iommu_hardware_setup(void)
 {
+    struct IO_APIC_route_entry **ioapic_entries = NULL;
     int rc;
 
     if ( !iommu_init_ops )
@@ -43,7 +45,37 @@ int __init iommu_hardware_setup(void)
         /* x2apic setup may have previously initialised the struct. */
         ASSERT(iommu_ops.init == iommu_init_ops->ops->init);
 
-    return iommu_init_ops->setup();
+    if ( !x2apic_enabled && iommu_intremap )
+    {
+        /*
+         * If x2APIC is enabled interrupt remapping is already enabled, so
+         * there's no need to mess with the IO-APIC because the remapping
+         * entries are already correctly setup by x2apic_bsp_setup.
+         */
+        ioapic_entries = alloc_ioapic_entries();
+        if ( !ioapic_entries )
+            return -ENOMEM;
+        rc = save_IO_APIC_setup(ioapic_entries);
+        if ( rc )
+        {
+            free_ioapic_entries(ioapic_entries);
+            return rc;
+        }
+
+        mask_8259A();
+        mask_IO_APIC_setup(ioapic_entries);
+    }
+
+    rc = iommu_init_ops->setup();
+
+    if ( ioapic_entries )
+    {
+        restore_IO_APIC_setup(ioapic_entries, rc);
+        unmask_8259A();
+        free_ioapic_entries(ioapic_entries);
+    }
+
+    return rc;
 }
 
 int iommu_enable_x2apic(void)
diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
index 07d25a585d..8ed9482791 100644
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -114,6 +114,7 @@ int amd_iommu_msi_msg_update_ire(
 void amd_iommu_read_msi_from_ire(
     struct msi_desc *msi_desc, struct msi_msg *msg);
 int amd_setup_hpet_msi(struct msi_desc *msi_desc);
+void amd_iommu_dump_intremap_tables(unsigned char key);
 
 extern struct ioapic_sbdf {
     u16 bdf, seg;
-- 
2.23.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-10 11:33 ` [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU Roger Pau Monne
@ 2019-10-10 11:54   ` Jan Beulich
  2019-10-10 12:13     ` Roger Pau Monné
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2019-10-10 11:54 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Juergen Gross, xen-devel, Wei Liu, Andrew Cooper

On 10.10.2019 13:33, Roger Pau Monne wrote:
> When interrupt remapping is enabled as part of enabling x2APIC the

Perhaps "unmasked" instead of "the"?

> IO-APIC entries also need to be translated to the new format and added
> to the interrupt remapping table.
> 
> This prevents IOMMU interrupt remapping faults when booting on
> hardware that has unmasked IO-APIC pins.

But in the end it only papers over whatever the spurious interrupts
result form, doesn't it? Which isn't to say this isn't an
improvement. Calling out the ExtInt case here may be worthwhile as
well, as would be pointing out that this case still won't work on
AMD IOMMUs.

> --- a/xen/arch/x86/apic.c
> +++ b/xen/arch/x86/apic.c
> @@ -515,7 +515,7 @@ static void resume_x2apic(void)
>      iommu_enable_x2apic();
>      __enable_x2apic();
>  
> -    restore_IO_APIC_setup(ioapic_entries);
> +    restore_IO_APIC_setup(ioapic_entries, true);
>      unmask_8259A();
>  
>  out:
> @@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
>          printk("Switched to APIC driver %s\n", genapic.name);
>  
>  restore_out:
> -    restore_IO_APIC_setup(ioapic_entries);
> +    /*
> +     * NB: do not use raw mode when restoring entries if the iommu has been
> +     * enabled during the process, because the entries need to be translated
> +     * and added to the remapping table in that case.
> +     */
> +    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);

How is this different in the resume_x2apic() case? The IOMMU gets
enabled in the course of that as well. I.e. I'd expect you want
to pass "false" there, not "true".

Also how would "x2apic_enabled" reflect the transition? It may
have been "true" already before entering the function here.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 2/2] iommu: translate IO-APIC pins when enabling interrupt remapping
  2019-10-10 11:33 ` [Xen-devel] [PATCH v2 2/2] iommu: translate IO-APIC pins when enabling interrupt remapping Roger Pau Monne
@ 2019-10-10 11:59   ` Jan Beulich
  0 siblings, 0 replies; 16+ messages in thread
From: Jan Beulich @ 2019-10-10 11:59 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Juergen Gross, xen-devel, Wei Liu, Suravee Suthikulpanit, AndrewCooper

On 10.10.2019 13:33, Roger Pau Monne wrote:
> On Intel hardware there's currently no translation of already enabled
> IO-APIC pins when interrupt remapping is enabled on the IOMMU, hence
> introduce a logic similar to the one used in x2apic_bsp_setup in order
> to save and mask all IO-APIC pins, and then translate and restore them
> after interrupt remapping has been enabled.
> 
> With this change the AMD specific logic to deal with enabled pins
> (amd_iommu_setup_ioapic_remapping) can be removed, thus unifying the
> handling of IO-APIC when enabling interrupt remapping regardless of
> the IOMMU vendor.
> 
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

The actual code change
Reviewed-by: Jan Beulich <jbeulich@suse.com>
but please mention here as well that the ExtInt case continues to be
broken in the AMD case.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-10 11:54   ` Jan Beulich
@ 2019-10-10 12:13     ` Roger Pau Monné
  2019-10-10 12:55       ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Roger Pau Monné @ 2019-10-10 12:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Juergen Gross, xen-devel, Wei Liu, Andrew Cooper

On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
> On 10.10.2019 13:33, Roger Pau Monne wrote:
> > When interrupt remapping is enabled as part of enabling x2APIC the
> 
> Perhaps "unmasked" instead of "the"?
> 
> > IO-APIC entries also need to be translated to the new format and added
> > to the interrupt remapping table.
> > 
> > This prevents IOMMU interrupt remapping faults when booting on
> > hardware that has unmasked IO-APIC pins.
> 
> But in the end it only papers over whatever the spurious interrupts
> result form, doesn't it? Which isn't to say this isn't an
> improvement. Calling out the ExtInt case here may be worthwhile as
> well, as would be pointing out that this case still won't work on
> AMD IOMMUs.

But the fix for the ExtINT AMD issue should be done in
amd_iommu_ioapic_update_ire then, so that it can properly handle
ExtINT delivery mode, not to this part of the code. I will look
into it, but I think it's kind of tangential to the issue here.

> > --- a/xen/arch/x86/apic.c
> > +++ b/xen/arch/x86/apic.c
> > @@ -515,7 +515,7 @@ static void resume_x2apic(void)
> >      iommu_enable_x2apic();
> >      __enable_x2apic();
> >  
> > -    restore_IO_APIC_setup(ioapic_entries);
> > +    restore_IO_APIC_setup(ioapic_entries, true);
> >      unmask_8259A();
> >  
> >  out:
> > @@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
> >          printk("Switched to APIC driver %s\n", genapic.name);
> >  
> >  restore_out:
> > -    restore_IO_APIC_setup(ioapic_entries);
> > +    /*
> > +     * NB: do not use raw mode when restoring entries if the iommu has been
> > +     * enabled during the process, because the entries need to be translated
> > +     * and added to the remapping table in that case.
> > +     */
> > +    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);
> 
> How is this different in the resume_x2apic() case? The IOMMU gets
> enabled in the course of that as well. I.e. I'd expect you want
> to pass "false" there, not "true".

In the resume_x2apic case interrupt remapping should already be
enabled or not, but that function is not going to enable interrupt
remapping if it wasn't enabled before, hence the IO-APIC entries
should already be using the interrupt remapping table and no
translation is needed.

> Also how would "x2apic_enabled" reflect the transition? It may
> have been "true" already before entering the function here.

If x2apic_enabled == true at the point where restore_IO_APIC_setup
is called interrupt remapping must be enabled, because AFAICT at this
point it's not possible to have x2apic_enabled == true and interrupt
remapping disabled.

The issue I can see here is what happens if interrupt remapping was
already enabled by the hardware, and entries in the IO-APIC are
already using the remapping table. I would have to look into how to
detect that case properly, but I think the proposed change is an
improvement overall.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-10 12:13     ` Roger Pau Monné
@ 2019-10-10 12:55       ` Jan Beulich
  2019-10-10 13:12         ` Roger Pau Monné
  2019-10-10 13:29         ` Roger Pau Monné
  0 siblings, 2 replies; 16+ messages in thread
From: Jan Beulich @ 2019-10-10 12:55 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

On 10.10.2019 14:13, Roger Pau Monné  wrote:
> On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
>> On 10.10.2019 13:33, Roger Pau Monne wrote:
>>> When interrupt remapping is enabled as part of enabling x2APIC the
>>
>> Perhaps "unmasked" instead of "the"?
>>
>>> IO-APIC entries also need to be translated to the new format and added
>>> to the interrupt remapping table.
>>>
>>> This prevents IOMMU interrupt remapping faults when booting on
>>> hardware that has unmasked IO-APIC pins.
>>
>> But in the end it only papers over whatever the spurious interrupts
>> result form, doesn't it? Which isn't to say this isn't an
>> improvement. Calling out the ExtInt case here may be worthwhile as
>> well, as would be pointing out that this case still won't work on
>> AMD IOMMUs.
> 
> But the fix for the ExtINT AMD issue should be done in
> amd_iommu_ioapic_update_ire then, so that it can properly handle
> ExtINT delivery mode, not to this part of the code. I will look
> into it, but I think it's kind of tangential to the issue here.

I'm not talking of you working on fixing this right away. I'm merely
asking that you mention here (a) the ExtInt special case and (b)
that this special case will (continue to) not work in the AMD case.

>>> --- a/xen/arch/x86/apic.c
>>> +++ b/xen/arch/x86/apic.c
>>> @@ -515,7 +515,7 @@ static void resume_x2apic(void)
>>>      iommu_enable_x2apic();
>>>      __enable_x2apic();
>>>  
>>> -    restore_IO_APIC_setup(ioapic_entries);
>>> +    restore_IO_APIC_setup(ioapic_entries, true);
>>>      unmask_8259A();
>>>  
>>>  out:
>>> @@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
>>>          printk("Switched to APIC driver %s\n", genapic.name);
>>>  
>>>  restore_out:
>>> -    restore_IO_APIC_setup(ioapic_entries);
>>> +    /*
>>> +     * NB: do not use raw mode when restoring entries if the iommu has been
>>> +     * enabled during the process, because the entries need to be translated
>>> +     * and added to the remapping table in that case.
>>> +     */
>>> +    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);
>>
>> How is this different in the resume_x2apic() case? The IOMMU gets
>> enabled in the course of that as well. I.e. I'd expect you want
>> to pass "false" there, not "true".
> 
> In the resume_x2apic case interrupt remapping should already be
> enabled or not, but that function is not going to enable interrupt
> remapping if it wasn't enabled before, hence the IO-APIC entries
> should already be using the interrupt remapping table and no
> translation is needed.

Who / what would have enabled the IOMMU in the resume case?

>> Also how would "x2apic_enabled" reflect the transition? It may
>> have been "true" already before entering the function here.
> 
> If x2apic_enabled == true at the point where restore_IO_APIC_setup
> is called interrupt remapping must be enabled, because AFAICT at this
> point it's not possible to have x2apic_enabled == true and interrupt
> remapping disabled.
> 
> The issue I can see here is what happens if interrupt remapping was
> already enabled by the hardware, and entries in the IO-APIC are
> already using the remapping table. I would have to look into how to
> detect that case properly, but I think the proposed change is an
> improvement overall.

Firmware handing off with the IOMMU (and hence interrupt remapping)
already enabled is a case which - afaict - we can't currently cope
with. Firmware handing off in x2APIC enabled state is typically
with the IOMMU (and hence interrupt remapping) still disabled. This
is not a forbidden mode, it's just that in such a configuration
interrupts can't be delivered to certain CPUs.

In any event you need to properly distinguish x2APIC enabled state
(or the transition thereof) from IOMMU / interrupt remapping
enabled state (or the transition thereof). I.e. you want to avoid
raw mode restore if interrupt remapping state transitioned from
off to on in the process.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-10 12:55       ` Jan Beulich
@ 2019-10-10 13:12         ` Roger Pau Monné
  2019-10-10 13:46           ` Jan Beulich
  2019-10-10 13:29         ` Roger Pau Monné
  1 sibling, 1 reply; 16+ messages in thread
From: Roger Pau Monné @ 2019-10-10 13:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

On Thu, Oct 10, 2019 at 02:55:02PM +0200, Jan Beulich wrote:
> On 10.10.2019 14:13, Roger Pau Monné  wrote:
> > On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
> >> On 10.10.2019 13:33, Roger Pau Monne wrote:
> >>> When interrupt remapping is enabled as part of enabling x2APIC the
> >>
> >> Perhaps "unmasked" instead of "the"?
> >>
> >>> IO-APIC entries also need to be translated to the new format and added
> >>> to the interrupt remapping table.
> >>>
> >>> This prevents IOMMU interrupt remapping faults when booting on
> >>> hardware that has unmasked IO-APIC pins.
> >>
> >> But in the end it only papers over whatever the spurious interrupts
> >> result form, doesn't it? Which isn't to say this isn't an
> >> improvement. Calling out the ExtInt case here may be worthwhile as
> >> well, as would be pointing out that this case still won't work on
> >> AMD IOMMUs.
> > 
> > But the fix for the ExtINT AMD issue should be done in
> > amd_iommu_ioapic_update_ire then, so that it can properly handle
> > ExtINT delivery mode, not to this part of the code. I will look
> > into it, but I think it's kind of tangential to the issue here.
> 
> I'm not talking of you working on fixing this right away. I'm merely
> asking that you mention here (a) the ExtInt special case and (b)
> that this special case will (continue to) not work in the AMD case.
> 
> >>> --- a/xen/arch/x86/apic.c
> >>> +++ b/xen/arch/x86/apic.c
> >>> @@ -515,7 +515,7 @@ static void resume_x2apic(void)
> >>>      iommu_enable_x2apic();
> >>>      __enable_x2apic();
> >>>  
> >>> -    restore_IO_APIC_setup(ioapic_entries);
> >>> +    restore_IO_APIC_setup(ioapic_entries, true);
> >>>      unmask_8259A();
> >>>  
> >>>  out:
> >>> @@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
> >>>          printk("Switched to APIC driver %s\n", genapic.name);
> >>>  
> >>>  restore_out:
> >>> -    restore_IO_APIC_setup(ioapic_entries);
> >>> +    /*
> >>> +     * NB: do not use raw mode when restoring entries if the iommu has been
> >>> +     * enabled during the process, because the entries need to be translated
> >>> +     * and added to the remapping table in that case.
> >>> +     */
> >>> +    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);
> >>
> >> How is this different in the resume_x2apic() case? The IOMMU gets
> >> enabled in the course of that as well. I.e. I'd expect you want
> >> to pass "false" there, not "true".
> > 
> > In the resume_x2apic case interrupt remapping should already be
> > enabled or not, but that function is not going to enable interrupt
> > remapping if it wasn't enabled before, hence the IO-APIC entries
> > should already be using the interrupt remapping table and no
> > translation is needed.
> 
> Who / what would have enabled the IOMMU in the resume case?

I don't think the question is who enables interrupt remapping in the
resume case (which is resume_x2apic when calling iommu_enable_x2apic
AFAICT), the point here is that on resume the entries in the IO-APIC
will already match the state of interrupt remapping, so they shouldn't
be translated. If interrupt remapping was off before suspend it will
still be off after resume, and there won't be any translation needed.
The same is true if interrupt remapping is on before suspend.

> >> Also how would "x2apic_enabled" reflect the transition? It may
> >> have been "true" already before entering the function here.
> > 
> > If x2apic_enabled == true at the point where restore_IO_APIC_setup
> > is called interrupt remapping must be enabled, because AFAICT at this
> > point it's not possible to have x2apic_enabled == true and interrupt
> > remapping disabled.
> > 
> > The issue I can see here is what happens if interrupt remapping was
> > already enabled by the hardware, and entries in the IO-APIC are
> > already using the remapping table. I would have to look into how to
> > detect that case properly, but I think the proposed change is an
> > improvement overall.
> 
> Firmware handing off with the IOMMU (and hence interrupt remapping)
> already enabled is a case which - afaict - we can't currently cope
> with. Firmware handing off in x2APIC enabled state is typically
> with the IOMMU (and hence interrupt remapping) still disabled. This
> is not a forbidden mode, it's just that in such a configuration
> interrupts can't be delivered to certain CPUs.
> 
> In any event you need to properly distinguish x2APIC enabled state
> (or the transition thereof) from IOMMU / interrupt remapping
> enabled state (or the transition thereof). I.e. you want to avoid
> raw mode restore if interrupt remapping state transitioned from
> off to on in the process.

Right, and that's why the call to restore_IO_APIC_setup in
x2apic_bsp_setup uses !x2apic_enabled as it's second parameter. If
interrupt remapping has been enabled by the call to
iommu_enable_x2apic x2apic_enabled must be true, and hence the entries
in the IO-APIC need to be translated to use the interrupt remapping
table. There's no path that can lead to restore_IO_APIC_setup with
interrupt remapping enabled and x2APIC mode disabled (or with x2APIC
enabled and interrupt remapping disabled).

Hence if interrupt remapping is off before calling x2apic_bsp_setup
(which is what Xen expects to function properly) and x2apic_enabled ==
true when calling restore_IO_APIC_setup it means interrupt remapping
got enabled, and the IO-APIC entries need translation.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-10 12:55       ` Jan Beulich
  2019-10-10 13:12         ` Roger Pau Monné
@ 2019-10-10 13:29         ` Roger Pau Monné
  2019-10-10 13:53           ` Jan Beulich
  1 sibling, 1 reply; 16+ messages in thread
From: Roger Pau Monné @ 2019-10-10 13:29 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

On Thu, Oct 10, 2019 at 02:55:02PM +0200, Jan Beulich wrote:
> On 10.10.2019 14:13, Roger Pau Monné  wrote:
> > On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
> >> On 10.10.2019 13:33, Roger Pau Monne wrote:
> >>> When interrupt remapping is enabled as part of enabling x2APIC the
> >>
> >> Perhaps "unmasked" instead of "the"?
> >>
> >>> IO-APIC entries also need to be translated to the new format and added
> >>> to the interrupt remapping table.
> >>>
> >>> This prevents IOMMU interrupt remapping faults when booting on
> >>> hardware that has unmasked IO-APIC pins.
> >>
> >> But in the end it only papers over whatever the spurious interrupts
> >> result form, doesn't it? Which isn't to say this isn't an
> >> improvement. Calling out the ExtInt case here may be worthwhile as
> >> well, as would be pointing out that this case still won't work on
> >> AMD IOMMUs.
> > 
> > But the fix for the ExtINT AMD issue should be done in
> > amd_iommu_ioapic_update_ire then, so that it can properly handle
> > ExtINT delivery mode, not to this part of the code. I will look
> > into it, but I think it's kind of tangential to the issue here.
> 
> I'm not talking of you working on fixing this right away. I'm merely
> asking that you mention here (a) the ExtInt special case and (b)
> that this special case will (continue to) not work in the AMD case.

Please bear with me, I've taken a look at the ExtINT handling for AMD
and I'm not sure I can spot what's missing. Xen seems to handle both
the EIntPass and IV fields of the DTE (see iommu_dte_add_device_entry
and amd_iommu_set_intremap_table), and that should be enough in order
to passthrough such interrupts.

Is there some other handling that I'm missing? (maybe in the handling
of the interrupt itself?)

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-10 13:12         ` Roger Pau Monné
@ 2019-10-10 13:46           ` Jan Beulich
  2019-10-10 15:19             ` Roger Pau Monné
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2019-10-10 13:46 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

On 10.10.2019 15:12, Roger Pau Monné  wrote:
> On Thu, Oct 10, 2019 at 02:55:02PM +0200, Jan Beulich wrote:
>> On 10.10.2019 14:13, Roger Pau Monné  wrote:
>>> On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
>>>> On 10.10.2019 13:33, Roger Pau Monne wrote:
>>>>> When interrupt remapping is enabled as part of enabling x2APIC the
>>>>
>>>> Perhaps "unmasked" instead of "the"?
>>>>
>>>>> IO-APIC entries also need to be translated to the new format and added
>>>>> to the interrupt remapping table.
>>>>>
>>>>> This prevents IOMMU interrupt remapping faults when booting on
>>>>> hardware that has unmasked IO-APIC pins.
>>>>
>>>> But in the end it only papers over whatever the spurious interrupts
>>>> result form, doesn't it? Which isn't to say this isn't an
>>>> improvement. Calling out the ExtInt case here may be worthwhile as
>>>> well, as would be pointing out that this case still won't work on
>>>> AMD IOMMUs.
>>>
>>> But the fix for the ExtINT AMD issue should be done in
>>> amd_iommu_ioapic_update_ire then, so that it can properly handle
>>> ExtINT delivery mode, not to this part of the code. I will look
>>> into it, but I think it's kind of tangential to the issue here.
>>
>> I'm not talking of you working on fixing this right away. I'm merely
>> asking that you mention here (a) the ExtInt special case and (b)
>> that this special case will (continue to) not work in the AMD case.
>>
>>>>> --- a/xen/arch/x86/apic.c
>>>>> +++ b/xen/arch/x86/apic.c
>>>>> @@ -515,7 +515,7 @@ static void resume_x2apic(void)
>>>>>      iommu_enable_x2apic();
>>>>>      __enable_x2apic();
>>>>>  
>>>>> -    restore_IO_APIC_setup(ioapic_entries);
>>>>> +    restore_IO_APIC_setup(ioapic_entries, true);
>>>>>      unmask_8259A();
>>>>>  
>>>>>  out:
>>>>> @@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
>>>>>          printk("Switched to APIC driver %s\n", genapic.name);
>>>>>  
>>>>>  restore_out:
>>>>> -    restore_IO_APIC_setup(ioapic_entries);
>>>>> +    /*
>>>>> +     * NB: do not use raw mode when restoring entries if the iommu has been
>>>>> +     * enabled during the process, because the entries need to be translated
>>>>> +     * and added to the remapping table in that case.
>>>>> +     */
>>>>> +    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);
>>>>
>>>> How is this different in the resume_x2apic() case? The IOMMU gets
>>>> enabled in the course of that as well. I.e. I'd expect you want
>>>> to pass "false" there, not "true".
>>>
>>> In the resume_x2apic case interrupt remapping should already be
>>> enabled or not, but that function is not going to enable interrupt
>>> remapping if it wasn't enabled before, hence the IO-APIC entries
>>> should already be using the interrupt remapping table and no
>>> translation is needed.
>>
>> Who / what would have enabled the IOMMU in the resume case?
> 
> I don't think the question is who enables interrupt remapping in the
> resume case (which is resume_x2apic when calling iommu_enable_x2apic
> AFAICT), the point here is that on resume the entries in the IO-APIC
> will already match the state of interrupt remapping, so they shouldn't
> be translated. If interrupt remapping was off before suspend it will
> still be off after resume, and there won't be any translation needed.
> The same is true if interrupt remapping is on before suspend.

I disagree: save_IO_APIC_setup() gets called from resume_x2apic(),
not prior to suspend.

>>>> Also how would "x2apic_enabled" reflect the transition? It may
>>>> have been "true" already before entering the function here.
>>>
>>> If x2apic_enabled == true at the point where restore_IO_APIC_setup
>>> is called interrupt remapping must be enabled, because AFAICT at this
>>> point it's not possible to have x2apic_enabled == true and interrupt
>>> remapping disabled.
>>>
>>> The issue I can see here is what happens if interrupt remapping was
>>> already enabled by the hardware, and entries in the IO-APIC are
>>> already using the remapping table. I would have to look into how to
>>> detect that case properly, but I think the proposed change is an
>>> improvement overall.
>>
>> Firmware handing off with the IOMMU (and hence interrupt remapping)
>> already enabled is a case which - afaict - we can't currently cope
>> with. Firmware handing off in x2APIC enabled state is typically
>> with the IOMMU (and hence interrupt remapping) still disabled. This
>> is not a forbidden mode, it's just that in such a configuration
>> interrupts can't be delivered to certain CPUs.
>>
>> In any event you need to properly distinguish x2APIC enabled state
>> (or the transition thereof) from IOMMU / interrupt remapping
>> enabled state (or the transition thereof). I.e. you want to avoid
>> raw mode restore if interrupt remapping state transitioned from
>> off to on in the process.
> 
> Right, and that's why the call to restore_IO_APIC_setup in
> x2apic_bsp_setup uses !x2apic_enabled as it's second parameter. If
> interrupt remapping has been enabled by the call to
> iommu_enable_x2apic x2apic_enabled must be true, and hence the entries
> in the IO-APIC need to be translated to use the interrupt remapping
> table. There's no path that can lead to restore_IO_APIC_setup with
> interrupt remapping enabled and x2APIC mode disabled (or with x2APIC
> enabled and interrupt remapping disabled).
> 
> Hence if interrupt remapping is off before calling x2apic_bsp_setup
> (which is what Xen expects to function properly) and x2apic_enabled ==
> true when calling restore_IO_APIC_setup it means interrupt remapping
> got enabled, and the IO-APIC entries need translation.

But the code in question sits on a shared success/error path, and
in the error case it matters whether x2apic_enabled was true already
on entry. I realize that io_apic_write() would suitably avoid going
the remapping path, but I think it would be more clear if the
distinction was already made properly at the call site.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-10 13:29         ` Roger Pau Monné
@ 2019-10-10 13:53           ` Jan Beulich
  0 siblings, 0 replies; 16+ messages in thread
From: Jan Beulich @ 2019-10-10 13:53 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

On 10.10.2019 15:29, Roger Pau Monné  wrote:
> On Thu, Oct 10, 2019 at 02:55:02PM +0200, Jan Beulich wrote:
>> On 10.10.2019 14:13, Roger Pau Monné  wrote:
>>> On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
>>>> On 10.10.2019 13:33, Roger Pau Monne wrote:
>>>>> When interrupt remapping is enabled as part of enabling x2APIC the
>>>>
>>>> Perhaps "unmasked" instead of "the"?
>>>>
>>>>> IO-APIC entries also need to be translated to the new format and added
>>>>> to the interrupt remapping table.
>>>>>
>>>>> This prevents IOMMU interrupt remapping faults when booting on
>>>>> hardware that has unmasked IO-APIC pins.
>>>>
>>>> But in the end it only papers over whatever the spurious interrupts
>>>> result form, doesn't it? Which isn't to say this isn't an
>>>> improvement. Calling out the ExtInt case here may be worthwhile as
>>>> well, as would be pointing out that this case still won't work on
>>>> AMD IOMMUs.
>>>
>>> But the fix for the ExtINT AMD issue should be done in
>>> amd_iommu_ioapic_update_ire then, so that it can properly handle
>>> ExtINT delivery mode, not to this part of the code. I will look
>>> into it, but I think it's kind of tangential to the issue here.
>>
>> I'm not talking of you working on fixing this right away. I'm merely
>> asking that you mention here (a) the ExtInt special case and (b)
>> that this special case will (continue to) not work in the AMD case.
> 
> Please bear with me, I've taken a look at the ExtINT handling for AMD
> and I'm not sure I can spot what's missing. Xen seems to handle both
> the EIntPass and IV fields of the DTE (see iommu_dte_add_device_entry
> and amd_iommu_set_intremap_table), and that should be enough in order
> to passthrough such interrupts.

EIntPass gets set based on ACPI table information, not what we found
in a particular RTE. That's hopefully fine, but you know how reliable
firmware is especially in corner cases.

> Is there some other handling that I'm missing? (maybe in the handling
> of the interrupt itself?)

Look at the update_intremap_entry_from_ioapic() ->
update_intremap_entry() path: This converts the 3-bit delivery mode
field into a 1-bit int_type one (the field is really 3 bits wide,
but values 010..111 (binary) are documented as reserved; I can't
exclude though that the documentation is wrong here).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-10 13:46           ` Jan Beulich
@ 2019-10-10 15:19             ` Roger Pau Monné
  2019-10-11  7:52               ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Roger Pau Monné @ 2019-10-10 15:19 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

On Thu, Oct 10, 2019 at 03:46:45PM +0200, Jan Beulich wrote:
> On 10.10.2019 15:12, Roger Pau Monné  wrote:
> > On Thu, Oct 10, 2019 at 02:55:02PM +0200, Jan Beulich wrote:
> >> On 10.10.2019 14:13, Roger Pau Monné  wrote:
> >>> On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
> >>>> On 10.10.2019 13:33, Roger Pau Monne wrote:
> >>>>> When interrupt remapping is enabled as part of enabling x2APIC the
> >>>>
> >>>> Perhaps "unmasked" instead of "the"?
> >>>>
> >>>>> IO-APIC entries also need to be translated to the new format and added
> >>>>> to the interrupt remapping table.
> >>>>>
> >>>>> This prevents IOMMU interrupt remapping faults when booting on
> >>>>> hardware that has unmasked IO-APIC pins.
> >>>>
> >>>> But in the end it only papers over whatever the spurious interrupts
> >>>> result form, doesn't it? Which isn't to say this isn't an
> >>>> improvement. Calling out the ExtInt case here may be worthwhile as
> >>>> well, as would be pointing out that this case still won't work on
> >>>> AMD IOMMUs.
> >>>
> >>> But the fix for the ExtINT AMD issue should be done in
> >>> amd_iommu_ioapic_update_ire then, so that it can properly handle
> >>> ExtINT delivery mode, not to this part of the code. I will look
> >>> into it, but I think it's kind of tangential to the issue here.
> >>
> >> I'm not talking of you working on fixing this right away. I'm merely
> >> asking that you mention here (a) the ExtInt special case and (b)
> >> that this special case will (continue to) not work in the AMD case.
> >>
> >>>>> --- a/xen/arch/x86/apic.c
> >>>>> +++ b/xen/arch/x86/apic.c
> >>>>> @@ -515,7 +515,7 @@ static void resume_x2apic(void)
> >>>>>      iommu_enable_x2apic();
> >>>>>      __enable_x2apic();
> >>>>>  
> >>>>> -    restore_IO_APIC_setup(ioapic_entries);
> >>>>> +    restore_IO_APIC_setup(ioapic_entries, true);
> >>>>>      unmask_8259A();
> >>>>>  
> >>>>>  out:
> >>>>> @@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
> >>>>>          printk("Switched to APIC driver %s\n", genapic.name);
> >>>>>  
> >>>>>  restore_out:
> >>>>> -    restore_IO_APIC_setup(ioapic_entries);
> >>>>> +    /*
> >>>>> +     * NB: do not use raw mode when restoring entries if the iommu has been
> >>>>> +     * enabled during the process, because the entries need to be translated
> >>>>> +     * and added to the remapping table in that case.
> >>>>> +     */
> >>>>> +    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);
> >>>>
> >>>> How is this different in the resume_x2apic() case? The IOMMU gets
> >>>> enabled in the course of that as well. I.e. I'd expect you want
> >>>> to pass "false" there, not "true".
> >>>
> >>> In the resume_x2apic case interrupt remapping should already be
> >>> enabled or not, but that function is not going to enable interrupt
> >>> remapping if it wasn't enabled before, hence the IO-APIC entries
> >>> should already be using the interrupt remapping table and no
> >>> translation is needed.
> >>
> >> Who / what would have enabled the IOMMU in the resume case?
> > 
> > I don't think the question is who enables interrupt remapping in the
> > resume case (which is resume_x2apic when calling iommu_enable_x2apic
> > AFAICT), the point here is that on resume the entries in the IO-APIC
> > will already match the state of interrupt remapping, so they shouldn't
> > be translated. If interrupt remapping was off before suspend it will
> > still be off after resume, and there won't be any translation needed.
> > The same is true if interrupt remapping is on before suspend.
> 
> I disagree: save_IO_APIC_setup() gets called from resume_x2apic(),
> not prior to suspend.

Oh, so maybe that's a misunderstanding on my side. I don't seem to be
able to find a statement about the contents of the IO-APIC registers
(and more specifically the entries) when getting back from
suspension. Are all entries cleared and masked?

Are the values previous to suspension stored?

> >>>> Also how would "x2apic_enabled" reflect the transition? It may
> >>>> have been "true" already before entering the function here.
> >>>
> >>> If x2apic_enabled == true at the point where restore_IO_APIC_setup
> >>> is called interrupt remapping must be enabled, because AFAICT at this
> >>> point it's not possible to have x2apic_enabled == true and interrupt
> >>> remapping disabled.
> >>>
> >>> The issue I can see here is what happens if interrupt remapping was
> >>> already enabled by the hardware, and entries in the IO-APIC are
> >>> already using the remapping table. I would have to look into how to
> >>> detect that case properly, but I think the proposed change is an
> >>> improvement overall.
> >>
> >> Firmware handing off with the IOMMU (and hence interrupt remapping)
> >> already enabled is a case which - afaict - we can't currently cope
> >> with. Firmware handing off in x2APIC enabled state is typically
> >> with the IOMMU (and hence interrupt remapping) still disabled. This
> >> is not a forbidden mode, it's just that in such a configuration
> >> interrupts can't be delivered to certain CPUs.
> >>
> >> In any event you need to properly distinguish x2APIC enabled state
> >> (or the transition thereof) from IOMMU / interrupt remapping
> >> enabled state (or the transition thereof). I.e. you want to avoid
> >> raw mode restore if interrupt remapping state transitioned from
> >> off to on in the process.
> > 
> > Right, and that's why the call to restore_IO_APIC_setup in
> > x2apic_bsp_setup uses !x2apic_enabled as it's second parameter. If
> > interrupt remapping has been enabled by the call to
> > iommu_enable_x2apic x2apic_enabled must be true, and hence the entries
> > in the IO-APIC need to be translated to use the interrupt remapping
> > table. There's no path that can lead to restore_IO_APIC_setup with
> > interrupt remapping enabled and x2APIC mode disabled (or with x2APIC
> > enabled and interrupt remapping disabled).
> > 
> > Hence if interrupt remapping is off before calling x2apic_bsp_setup
> > (which is what Xen expects to function properly) and x2apic_enabled ==
> > true when calling restore_IO_APIC_setup it means interrupt remapping
> > got enabled, and the IO-APIC entries need translation.
> 
> But the code in question sits on a shared success/error path, and
> in the error case it matters whether x2apic_enabled was true already
> on entry.

But it's simply not possible to reach the call to
restore_IO_APIC_setup with x2apic_enabled == true and interrupt
remapping disabled, regardless of the initial value of
x2apic_enabled.

All the paths that could lead to this scenario are short-circuited
above with a panic.

> I realize that io_apic_write() would suitably avoid going
> the remapping path, but I think it would be more clear if the
> distinction was already made properly at the call site.

I'm afraid I'm slightly loss, do you mean to replace the
ioapic_write_entry with an io_apic_write in restore_IO_APIC_setup?

That would be the same as always passing raw == false AFAICT.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-10 15:19             ` Roger Pau Monné
@ 2019-10-11  7:52               ` Jan Beulich
  2019-10-11  9:28                 ` Roger Pau Monné
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2019-10-11  7:52 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

On 10.10.2019 17:19, Roger Pau Monné  wrote:
> On Thu, Oct 10, 2019 at 03:46:45PM +0200, Jan Beulich wrote:
>> On 10.10.2019 15:12, Roger Pau Monné  wrote:
>>> On Thu, Oct 10, 2019 at 02:55:02PM +0200, Jan Beulich wrote:
>>>> On 10.10.2019 14:13, Roger Pau Monné  wrote:
>>>>> On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
>>>>>> On 10.10.2019 13:33, Roger Pau Monne wrote:
>>>>>>> When interrupt remapping is enabled as part of enabling x2APIC the
>>>>>>
>>>>>> Perhaps "unmasked" instead of "the"?
>>>>>>
>>>>>>> IO-APIC entries also need to be translated to the new format and added
>>>>>>> to the interrupt remapping table.
>>>>>>>
>>>>>>> This prevents IOMMU interrupt remapping faults when booting on
>>>>>>> hardware that has unmasked IO-APIC pins.
>>>>>>
>>>>>> But in the end it only papers over whatever the spurious interrupts
>>>>>> result form, doesn't it? Which isn't to say this isn't an
>>>>>> improvement. Calling out the ExtInt case here may be worthwhile as
>>>>>> well, as would be pointing out that this case still won't work on
>>>>>> AMD IOMMUs.
>>>>>
>>>>> But the fix for the ExtINT AMD issue should be done in
>>>>> amd_iommu_ioapic_update_ire then, so that it can properly handle
>>>>> ExtINT delivery mode, not to this part of the code. I will look
>>>>> into it, but I think it's kind of tangential to the issue here.
>>>>
>>>> I'm not talking of you working on fixing this right away. I'm merely
>>>> asking that you mention here (a) the ExtInt special case and (b)
>>>> that this special case will (continue to) not work in the AMD case.
>>>>
>>>>>>> --- a/xen/arch/x86/apic.c
>>>>>>> +++ b/xen/arch/x86/apic.c
>>>>>>> @@ -515,7 +515,7 @@ static void resume_x2apic(void)
>>>>>>>      iommu_enable_x2apic();
>>>>>>>      __enable_x2apic();
>>>>>>>  
>>>>>>> -    restore_IO_APIC_setup(ioapic_entries);
>>>>>>> +    restore_IO_APIC_setup(ioapic_entries, true);
>>>>>>>      unmask_8259A();
>>>>>>>  
>>>>>>>  out:
>>>>>>> @@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
>>>>>>>          printk("Switched to APIC driver %s\n", genapic.name);
>>>>>>>  
>>>>>>>  restore_out:
>>>>>>> -    restore_IO_APIC_setup(ioapic_entries);
>>>>>>> +    /*
>>>>>>> +     * NB: do not use raw mode when restoring entries if the iommu has been
>>>>>>> +     * enabled during the process, because the entries need to be translated
>>>>>>> +     * and added to the remapping table in that case.
>>>>>>> +     */
>>>>>>> +    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);
>>>>>>
>>>>>> How is this different in the resume_x2apic() case? The IOMMU gets
>>>>>> enabled in the course of that as well. I.e. I'd expect you want
>>>>>> to pass "false" there, not "true".
>>>>>
>>>>> In the resume_x2apic case interrupt remapping should already be
>>>>> enabled or not, but that function is not going to enable interrupt
>>>>> remapping if it wasn't enabled before, hence the IO-APIC entries
>>>>> should already be using the interrupt remapping table and no
>>>>> translation is needed.
>>>>
>>>> Who / what would have enabled the IOMMU in the resume case?
>>>
>>> I don't think the question is who enables interrupt remapping in the
>>> resume case (which is resume_x2apic when calling iommu_enable_x2apic
>>> AFAICT), the point here is that on resume the entries in the IO-APIC
>>> will already match the state of interrupt remapping, so they shouldn't
>>> be translated. If interrupt remapping was off before suspend it will
>>> still be off after resume, and there won't be any translation needed.
>>> The same is true if interrupt remapping is on before suspend.
>>
>> I disagree: save_IO_APIC_setup() gets called from resume_x2apic(),
>> not prior to suspend.
> 
> Oh, so maybe that's a misunderstanding on my side. I don't seem to be
> able to find a statement about the contents of the IO-APIC registers
> (and more specifically the entries) when getting back from
> suspension. Are all entries cleared and masked?
> 
> Are the values previous to suspension stored?

See ioapic_suspend() / ioapic_resume(): Looks like there's some
redundancy here - I don't think it makes sense for the LAPIC
code to fiddle with all the RTEs if subsequently (I assume;
didn't check) they'll all be overwritten anyway. I would seem
more logical to me if they'd just all get masked for IOMMU
enabling, deferring to ioapic_resume() for everything else.

>>>>>> Also how would "x2apic_enabled" reflect the transition? It may
>>>>>> have been "true" already before entering the function here.
>>>>>
>>>>> If x2apic_enabled == true at the point where restore_IO_APIC_setup
>>>>> is called interrupt remapping must be enabled, because AFAICT at this
>>>>> point it's not possible to have x2apic_enabled == true and interrupt
>>>>> remapping disabled.
>>>>>
>>>>> The issue I can see here is what happens if interrupt remapping was
>>>>> already enabled by the hardware, and entries in the IO-APIC are
>>>>> already using the remapping table. I would have to look into how to
>>>>> detect that case properly, but I think the proposed change is an
>>>>> improvement overall.
>>>>
>>>> Firmware handing off with the IOMMU (and hence interrupt remapping)
>>>> already enabled is a case which - afaict - we can't currently cope
>>>> with. Firmware handing off in x2APIC enabled state is typically
>>>> with the IOMMU (and hence interrupt remapping) still disabled. This
>>>> is not a forbidden mode, it's just that in such a configuration
>>>> interrupts can't be delivered to certain CPUs.
>>>>
>>>> In any event you need to properly distinguish x2APIC enabled state
>>>> (or the transition thereof) from IOMMU / interrupt remapping
>>>> enabled state (or the transition thereof). I.e. you want to avoid
>>>> raw mode restore if interrupt remapping state transitioned from
>>>> off to on in the process.
>>>
>>> Right, and that's why the call to restore_IO_APIC_setup in
>>> x2apic_bsp_setup uses !x2apic_enabled as it's second parameter. If
>>> interrupt remapping has been enabled by the call to
>>> iommu_enable_x2apic x2apic_enabled must be true, and hence the entries
>>> in the IO-APIC need to be translated to use the interrupt remapping
>>> table. There's no path that can lead to restore_IO_APIC_setup with
>>> interrupt remapping enabled and x2APIC mode disabled (or with x2APIC
>>> enabled and interrupt remapping disabled).
>>>
>>> Hence if interrupt remapping is off before calling x2apic_bsp_setup
>>> (which is what Xen expects to function properly) and x2apic_enabled ==
>>> true when calling restore_IO_APIC_setup it means interrupt remapping
>>> got enabled, and the IO-APIC entries need translation.
>>
>> But the code in question sits on a shared success/error path, and
>> in the error case it matters whether x2apic_enabled was true already
>> on entry.
> 
> But it's simply not possible to reach the call to
> restore_IO_APIC_setup with x2apic_enabled == true and interrupt
> remapping disabled, regardless of the initial value of
> x2apic_enabled.
> 
> All the paths that could lead to this scenario are short-circuited
> above with a panic.

Hmm, true. Nevertheless it would feel better if the conditionals
were using what actually matters, rather than something derived.

>> I realize that io_apic_write() would suitably avoid going
>> the remapping path, but I think it would be more clear if the
>> distinction was already made properly at the call site.
> 
> I'm afraid I'm slightly loss, do you mean to replace the
> ioapic_write_entry with an io_apic_write in restore_IO_APIC_setup?

No, because of ...

> That would be the same as always passing raw == false AFAICT.

... this. I'm asking to pass in the argument for "raw" based
on what you want, without relying on io_apic_write()'s
behavior. That's more for code clarity than actual correctness,
since - as said - io_apic_write() would invoke __io_apic_write()
anyway. By setting "raw" correctly you can simply avoid going
through io_apic_write() altogether.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-11  7:52               ` Jan Beulich
@ 2019-10-11  9:28                 ` Roger Pau Monné
  2019-10-11 10:01                   ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Roger Pau Monné @ 2019-10-11  9:28 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

On Fri, Oct 11, 2019 at 09:52:35AM +0200, Jan Beulich wrote:
> On 10.10.2019 17:19, Roger Pau Monné  wrote:
> > On Thu, Oct 10, 2019 at 03:46:45PM +0200, Jan Beulich wrote:
> >> On 10.10.2019 15:12, Roger Pau Monné  wrote:
> >>> On Thu, Oct 10, 2019 at 02:55:02PM +0200, Jan Beulich wrote:
> >>>> On 10.10.2019 14:13, Roger Pau Monné  wrote:
> >>>>> On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
> >>>>>> On 10.10.2019 13:33, Roger Pau Monne wrote:
> >>>>>>> When interrupt remapping is enabled as part of enabling x2APIC the
> >>>>>>
> >>>>>> Perhaps "unmasked" instead of "the"?
> >>>>>>
> >>>>>>> IO-APIC entries also need to be translated to the new format and added
> >>>>>>> to the interrupt remapping table.
> >>>>>>>
> >>>>>>> This prevents IOMMU interrupt remapping faults when booting on
> >>>>>>> hardware that has unmasked IO-APIC pins.
> >>>>>>
> >>>>>> But in the end it only papers over whatever the spurious interrupts
> >>>>>> result form, doesn't it? Which isn't to say this isn't an
> >>>>>> improvement. Calling out the ExtInt case here may be worthwhile as
> >>>>>> well, as would be pointing out that this case still won't work on
> >>>>>> AMD IOMMUs.
> >>>>>
> >>>>> But the fix for the ExtINT AMD issue should be done in
> >>>>> amd_iommu_ioapic_update_ire then, so that it can properly handle
> >>>>> ExtINT delivery mode, not to this part of the code. I will look
> >>>>> into it, but I think it's kind of tangential to the issue here.
> >>>>
> >>>> I'm not talking of you working on fixing this right away. I'm merely
> >>>> asking that you mention here (a) the ExtInt special case and (b)
> >>>> that this special case will (continue to) not work in the AMD case.
> >>>>
> >>>>>>> --- a/xen/arch/x86/apic.c
> >>>>>>> +++ b/xen/arch/x86/apic.c
> >>>>>>> @@ -515,7 +515,7 @@ static void resume_x2apic(void)
> >>>>>>>      iommu_enable_x2apic();
> >>>>>>>      __enable_x2apic();
> >>>>>>>  
> >>>>>>> -    restore_IO_APIC_setup(ioapic_entries);
> >>>>>>> +    restore_IO_APIC_setup(ioapic_entries, true);
> >>>>>>>      unmask_8259A();
> >>>>>>>  
> >>>>>>>  out:
> >>>>>>> @@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
> >>>>>>>          printk("Switched to APIC driver %s\n", genapic.name);
> >>>>>>>  
> >>>>>>>  restore_out:
> >>>>>>> -    restore_IO_APIC_setup(ioapic_entries);
> >>>>>>> +    /*
> >>>>>>> +     * NB: do not use raw mode when restoring entries if the iommu has been
> >>>>>>> +     * enabled during the process, because the entries need to be translated
> >>>>>>> +     * and added to the remapping table in that case.
> >>>>>>> +     */
> >>>>>>> +    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);
> >>>>>>
> >>>>>> How is this different in the resume_x2apic() case? The IOMMU gets
> >>>>>> enabled in the course of that as well. I.e. I'd expect you want
> >>>>>> to pass "false" there, not "true".
> >>>>>
> >>>>> In the resume_x2apic case interrupt remapping should already be
> >>>>> enabled or not, but that function is not going to enable interrupt
> >>>>> remapping if it wasn't enabled before, hence the IO-APIC entries
> >>>>> should already be using the interrupt remapping table and no
> >>>>> translation is needed.
> >>>>
> >>>> Who / what would have enabled the IOMMU in the resume case?
> >>>
> >>> I don't think the question is who enables interrupt remapping in the
> >>> resume case (which is resume_x2apic when calling iommu_enable_x2apic
> >>> AFAICT), the point here is that on resume the entries in the IO-APIC
> >>> will already match the state of interrupt remapping, so they shouldn't
> >>> be translated. If interrupt remapping was off before suspend it will
> >>> still be off after resume, and there won't be any translation needed.
> >>> The same is true if interrupt remapping is on before suspend.
> >>
> >> I disagree: save_IO_APIC_setup() gets called from resume_x2apic(),
> >> not prior to suspend.
> > 
> > Oh, so maybe that's a misunderstanding on my side. I don't seem to be
> > able to find a statement about the contents of the IO-APIC registers
> > (and more specifically the entries) when getting back from
> > suspension. Are all entries cleared and masked?
> > 
> > Are the values previous to suspension stored?
> 
> See ioapic_suspend() / ioapic_resume(): Looks like there's some
> redundancy here - I don't think it makes sense for the LAPIC
> code to fiddle with all the RTEs if subsequently (I assume;
> didn't check) they'll all be overwritten anyway. I would seem
> more logical to me if they'd just all get masked for IOMMU
> enabling, deferring to ioapic_resume() for everything else.

Yes, it seems like resume_x2apic shouldn't need to play with the
entries at all TBH, just enabling interrupt remapping should be enough
given that the entries are already save and restored by
ioapic_{suspend/resume}.

Would you agree to leave that as-is in this patch, ie: always pass
true to restore_IO_APIC_setup in resume_x2apic?

> >>>>>> Also how would "x2apic_enabled" reflect the transition? It may
> >>>>>> have been "true" already before entering the function here.
> >>>>>
> >>>>> If x2apic_enabled == true at the point where restore_IO_APIC_setup
> >>>>> is called interrupt remapping must be enabled, because AFAICT at this
> >>>>> point it's not possible to have x2apic_enabled == true and interrupt
> >>>>> remapping disabled.
> >>>>>
> >>>>> The issue I can see here is what happens if interrupt remapping was
> >>>>> already enabled by the hardware, and entries in the IO-APIC are
> >>>>> already using the remapping table. I would have to look into how to
> >>>>> detect that case properly, but I think the proposed change is an
> >>>>> improvement overall.
> >>>>
> >>>> Firmware handing off with the IOMMU (and hence interrupt remapping)
> >>>> already enabled is a case which - afaict - we can't currently cope
> >>>> with. Firmware handing off in x2APIC enabled state is typically
> >>>> with the IOMMU (and hence interrupt remapping) still disabled. This
> >>>> is not a forbidden mode, it's just that in such a configuration
> >>>> interrupts can't be delivered to certain CPUs.
> >>>>
> >>>> In any event you need to properly distinguish x2APIC enabled state
> >>>> (or the transition thereof) from IOMMU / interrupt remapping
> >>>> enabled state (or the transition thereof). I.e. you want to avoid
> >>>> raw mode restore if interrupt remapping state transitioned from
> >>>> off to on in the process.
> >>>
> >>> Right, and that's why the call to restore_IO_APIC_setup in
> >>> x2apic_bsp_setup uses !x2apic_enabled as it's second parameter. If
> >>> interrupt remapping has been enabled by the call to
> >>> iommu_enable_x2apic x2apic_enabled must be true, and hence the entries
> >>> in the IO-APIC need to be translated to use the interrupt remapping
> >>> table. There's no path that can lead to restore_IO_APIC_setup with
> >>> interrupt remapping enabled and x2APIC mode disabled (or with x2APIC
> >>> enabled and interrupt remapping disabled).
> >>>
> >>> Hence if interrupt remapping is off before calling x2apic_bsp_setup
> >>> (which is what Xen expects to function properly) and x2apic_enabled ==
> >>> true when calling restore_IO_APIC_setup it means interrupt remapping
> >>> got enabled, and the IO-APIC entries need translation.
> >>
> >> But the code in question sits on a shared success/error path, and
> >> in the error case it matters whether x2apic_enabled was true already
> >> on entry.
> > 
> > But it's simply not possible to reach the call to
> > restore_IO_APIC_setup with x2apic_enabled == true and interrupt
> > remapping disabled, regardless of the initial value of
> > x2apic_enabled.
> > 
> > All the paths that could lead to this scenario are short-circuited
> > above with a panic.
> 
> Hmm, true. Nevertheless it would feel better if the conditionals
> were using what actually matters, rather than something derived.

Would you be OK to using something like v1:

https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg00805.html

See the usage of iommu_enabled in x2apic_bsp_setup. I think using
x2apic_enabled is fine given the logic in the function, and the fact
that x2APIC mandates interrupt remapping, but I could live with that
extra local variable.

> >> I realize that io_apic_write() would suitably avoid going
> >> the remapping path, but I think it would be more clear if the
> >> distinction was already made properly at the call site.
> > 
> > I'm afraid I'm slightly loss, do you mean to replace the
> > ioapic_write_entry with an io_apic_write in restore_IO_APIC_setup?
> 
> No, because of ...
> 
> > That would be the same as always passing raw == false AFAICT.
> 
> ... this. I'm asking to pass in the argument for "raw" based
> on what you want, without relying on io_apic_write()'s
> behavior. That's more for code clarity than actual correctness,
> since - as said - io_apic_write() would invoke __io_apic_write()
> anyway.

Sorry, but I'm afraid I still don't fully understand what you mean
with this. restore_IO_APIC_setup uses ioapic_write_entry which in turn
uses __io_apic_write or io_apic_write. I could get rid of the usage of
ioapic_write_entry and just call __io_apic_write or io_apic_write (or
even directly call iommu_update_ire_from_apic), but I'm not sure
what's the benefit of any of this, it's just open-coding logic that's
done in the helpers.

Would you rather mean to rename the parameter of restore_IO_APIC_setup
from 'raw' to 'translate' or some such in order to clearly denote
there's a transformation done before writing the entry?

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-11  9:28                 ` Roger Pau Monné
@ 2019-10-11 10:01                   ` Jan Beulich
  2019-10-11 10:36                     ` Roger Pau Monné
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2019-10-11 10:01 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

On 11.10.2019 11:28, Roger Pau Monné  wrote:
> On Fri, Oct 11, 2019 at 09:52:35AM +0200, Jan Beulich wrote:
>> On 10.10.2019 17:19, Roger Pau Monné  wrote:
>>> On Thu, Oct 10, 2019 at 03:46:45PM +0200, Jan Beulich wrote:
>>>> On 10.10.2019 15:12, Roger Pau Monné  wrote:
>>>>> On Thu, Oct 10, 2019 at 02:55:02PM +0200, Jan Beulich wrote:
>>>>>> On 10.10.2019 14:13, Roger Pau Monné  wrote:
>>>>>>> On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
>>>>>>>> On 10.10.2019 13:33, Roger Pau Monne wrote:
>>>>>>>>> When interrupt remapping is enabled as part of enabling x2APIC the
>>>>>>>>
>>>>>>>> Perhaps "unmasked" instead of "the"?
>>>>>>>>
>>>>>>>>> IO-APIC entries also need to be translated to the new format and added
>>>>>>>>> to the interrupt remapping table.
>>>>>>>>>
>>>>>>>>> This prevents IOMMU interrupt remapping faults when booting on
>>>>>>>>> hardware that has unmasked IO-APIC pins.
>>>>>>>>
>>>>>>>> But in the end it only papers over whatever the spurious interrupts
>>>>>>>> result form, doesn't it? Which isn't to say this isn't an
>>>>>>>> improvement. Calling out the ExtInt case here may be worthwhile as
>>>>>>>> well, as would be pointing out that this case still won't work on
>>>>>>>> AMD IOMMUs.
>>>>>>>
>>>>>>> But the fix for the ExtINT AMD issue should be done in
>>>>>>> amd_iommu_ioapic_update_ire then, so that it can properly handle
>>>>>>> ExtINT delivery mode, not to this part of the code. I will look
>>>>>>> into it, but I think it's kind of tangential to the issue here.
>>>>>>
>>>>>> I'm not talking of you working on fixing this right away. I'm merely
>>>>>> asking that you mention here (a) the ExtInt special case and (b)
>>>>>> that this special case will (continue to) not work in the AMD case.
>>>>>>
>>>>>>>>> --- a/xen/arch/x86/apic.c
>>>>>>>>> +++ b/xen/arch/x86/apic.c
>>>>>>>>> @@ -515,7 +515,7 @@ static void resume_x2apic(void)
>>>>>>>>>      iommu_enable_x2apic();
>>>>>>>>>      __enable_x2apic();
>>>>>>>>>  
>>>>>>>>> -    restore_IO_APIC_setup(ioapic_entries);
>>>>>>>>> +    restore_IO_APIC_setup(ioapic_entries, true);
>>>>>>>>>      unmask_8259A();
>>>>>>>>>  
>>>>>>>>>  out:
>>>>>>>>> @@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
>>>>>>>>>          printk("Switched to APIC driver %s\n", genapic.name);
>>>>>>>>>  
>>>>>>>>>  restore_out:
>>>>>>>>> -    restore_IO_APIC_setup(ioapic_entries);
>>>>>>>>> +    /*
>>>>>>>>> +     * NB: do not use raw mode when restoring entries if the iommu has been
>>>>>>>>> +     * enabled during the process, because the entries need to be translated
>>>>>>>>> +     * and added to the remapping table in that case.
>>>>>>>>> +     */
>>>>>>>>> +    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);
>>>>>>>>
>>>>>>>> How is this different in the resume_x2apic() case? The IOMMU gets
>>>>>>>> enabled in the course of that as well. I.e. I'd expect you want
>>>>>>>> to pass "false" there, not "true".
>>>>>>>
>>>>>>> In the resume_x2apic case interrupt remapping should already be
>>>>>>> enabled or not, but that function is not going to enable interrupt
>>>>>>> remapping if it wasn't enabled before, hence the IO-APIC entries
>>>>>>> should already be using the interrupt remapping table and no
>>>>>>> translation is needed.
>>>>>>
>>>>>> Who / what would have enabled the IOMMU in the resume case?
>>>>>
>>>>> I don't think the question is who enables interrupt remapping in the
>>>>> resume case (which is resume_x2apic when calling iommu_enable_x2apic
>>>>> AFAICT), the point here is that on resume the entries in the IO-APIC
>>>>> will already match the state of interrupt remapping, so they shouldn't
>>>>> be translated. If interrupt remapping was off before suspend it will
>>>>> still be off after resume, and there won't be any translation needed.
>>>>> The same is true if interrupt remapping is on before suspend.
>>>>
>>>> I disagree: save_IO_APIC_setup() gets called from resume_x2apic(),
>>>> not prior to suspend.
>>>
>>> Oh, so maybe that's a misunderstanding on my side. I don't seem to be
>>> able to find a statement about the contents of the IO-APIC registers
>>> (and more specifically the entries) when getting back from
>>> suspension. Are all entries cleared and masked?
>>>
>>> Are the values previous to suspension stored?
>>
>> See ioapic_suspend() / ioapic_resume(): Looks like there's some
>> redundancy here - I don't think it makes sense for the LAPIC
>> code to fiddle with all the RTEs if subsequently (I assume;
>> didn't check) they'll all be overwritten anyway. I would seem
>> more logical to me if they'd just all get masked for IOMMU
>> enabling, deferring to ioapic_resume() for everything else.
> 
> Yes, it seems like resume_x2apic shouldn't need to play with the
> entries at all TBH, just enabling interrupt remapping should be enough
> given that the entries are already save and restored by
> ioapic_{suspend/resume}.
> 
> Would you agree to leave that as-is in this patch, ie: always pass
> true to restore_IO_APIC_setup in resume_x2apic?

Properly explained in the description, that's an option. Even better
would of course be to do away with the unnecessary save/restore in
a prereq patch, at which point the question disappears as to what to
pass to the function at that point.

>>> But it's simply not possible to reach the call to
>>> restore_IO_APIC_setup with x2apic_enabled == true and interrupt
>>> remapping disabled, regardless of the initial value of
>>> x2apic_enabled.
>>>
>>> All the paths that could lead to this scenario are short-circuited
>>> above with a panic.
>>
>> Hmm, true. Nevertheless it would feel better if the conditionals
>> were using what actually matters, rather than something derived.
> 
> Would you be OK to using something like v1:
> 
> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg00805.html
> 
> See the usage of iommu_enabled in x2apic_bsp_setup. I think using
> x2apic_enabled is fine given the logic in the function, and the fact
> that x2APIC mandates interrupt remapping, but I could live with that
> extra local variable.

Conceptually this looks better to me. I'd prefer to avoid shadowing
the global "iommu_enabled" though, and set the local variable to
"true" only one we actually enable the IOMMU. Also strictly speaking
you care about the enabling of interrupt remapping, not that of the
IOMMU (which implicitly means DMA remapping). I wonder whether the
code couldn't easily be made cope with IOMMU already being enabled,
assuming that this would lead to {iommu,intremap}_enabled to already
be true on entry. I.e. you'd request a "raw" restore unless
intremap_enabled changed from false to true.

>>>> I realize that io_apic_write() would suitably avoid going
>>>> the remapping path, but I think it would be more clear if the
>>>> distinction was already made properly at the call site.
>>>
>>> I'm afraid I'm slightly loss, do you mean to replace the
>>> ioapic_write_entry with an io_apic_write in restore_IO_APIC_setup?
>>
>> No, because of ...
>>
>>> That would be the same as always passing raw == false AFAICT.
>>
>> ... this. I'm asking to pass in the argument for "raw" based
>> on what you want, without relying on io_apic_write()'s
>> behavior. That's more for code clarity than actual correctness,
>> since - as said - io_apic_write() would invoke __io_apic_write()
>> anyway.
> 
> Sorry, but I'm afraid I still don't fully understand what you mean
> with this. restore_IO_APIC_setup uses ioapic_write_entry which in turn
> uses __io_apic_write or io_apic_write. I could get rid of the usage of
> ioapic_write_entry and just call __io_apic_write or io_apic_write (or
> even directly call iommu_update_ire_from_apic), but I'm not sure
> what's the benefit of any of this, it's just open-coding logic that's
> done in the helpers.
> 
> Would you rather mean to rename the parameter of restore_IO_APIC_setup
> from 'raw' to 'translate' or some such in order to clearly denote
> there's a transformation done before writing the entry?

Whether it's "raw" or "translate" doesn't really matter to me.
What I'm after is for you to not pass raw=true when you really
mean raw=false (or the other way around), relying on the
intremap_enabled check down the call chain in io_apic_write().

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
  2019-10-11 10:01                   ` Jan Beulich
@ 2019-10-11 10:36                     ` Roger Pau Monné
  0 siblings, 0 replies; 16+ messages in thread
From: Roger Pau Monné @ 2019-10-11 10:36 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Juergen Gross, Andrew Cooper, Wei Liu, xen-devel

On Fri, Oct 11, 2019 at 12:01:36PM +0200, Jan Beulich wrote:
> On 11.10.2019 11:28, Roger Pau Monné  wrote:
> > On Fri, Oct 11, 2019 at 09:52:35AM +0200, Jan Beulich wrote:
> >> On 10.10.2019 17:19, Roger Pau Monné  wrote:
> >>> On Thu, Oct 10, 2019 at 03:46:45PM +0200, Jan Beulich wrote:
> >>>> On 10.10.2019 15:12, Roger Pau Monné  wrote:
> >>>>> On Thu, Oct 10, 2019 at 02:55:02PM +0200, Jan Beulich wrote:
> >>>>>> On 10.10.2019 14:13, Roger Pau Monné  wrote:
> >>>>>>> On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
> >>>>>>>> On 10.10.2019 13:33, Roger Pau Monne wrote:
> >>>>>>>>> When interrupt remapping is enabled as part of enabling x2APIC the
> >>>>>>>>
> >>>>>>>> Perhaps "unmasked" instead of "the"?
> >>>>>>>>
> >>>>>>>>> IO-APIC entries also need to be translated to the new format and added
> >>>>>>>>> to the interrupt remapping table.
> >>>>>>>>>
> >>>>>>>>> This prevents IOMMU interrupt remapping faults when booting on
> >>>>>>>>> hardware that has unmasked IO-APIC pins.
> >>>>>>>>
> >>>>>>>> But in the end it only papers over whatever the spurious interrupts
> >>>>>>>> result form, doesn't it? Which isn't to say this isn't an
> >>>>>>>> improvement. Calling out the ExtInt case here may be worthwhile as
> >>>>>>>> well, as would be pointing out that this case still won't work on
> >>>>>>>> AMD IOMMUs.
> >>>>>>>
> >>>>>>> But the fix for the ExtINT AMD issue should be done in
> >>>>>>> amd_iommu_ioapic_update_ire then, so that it can properly handle
> >>>>>>> ExtINT delivery mode, not to this part of the code. I will look
> >>>>>>> into it, but I think it's kind of tangential to the issue here.
> >>>>>>
> >>>>>> I'm not talking of you working on fixing this right away. I'm merely
> >>>>>> asking that you mention here (a) the ExtInt special case and (b)
> >>>>>> that this special case will (continue to) not work in the AMD case.
> >>>>>>
> >>>>>>>>> --- a/xen/arch/x86/apic.c
> >>>>>>>>> +++ b/xen/arch/x86/apic.c
> >>>>>>>>> @@ -515,7 +515,7 @@ static void resume_x2apic(void)
> >>>>>>>>>      iommu_enable_x2apic();
> >>>>>>>>>      __enable_x2apic();
> >>>>>>>>>  
> >>>>>>>>> -    restore_IO_APIC_setup(ioapic_entries);
> >>>>>>>>> +    restore_IO_APIC_setup(ioapic_entries, true);
> >>>>>>>>>      unmask_8259A();
> >>>>>>>>>  
> >>>>>>>>>  out:
> >>>>>>>>> @@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
> >>>>>>>>>          printk("Switched to APIC driver %s\n", genapic.name);
> >>>>>>>>>  
> >>>>>>>>>  restore_out:
> >>>>>>>>> -    restore_IO_APIC_setup(ioapic_entries);
> >>>>>>>>> +    /*
> >>>>>>>>> +     * NB: do not use raw mode when restoring entries if the iommu has been
> >>>>>>>>> +     * enabled during the process, because the entries need to be translated
> >>>>>>>>> +     * and added to the remapping table in that case.
> >>>>>>>>> +     */
> >>>>>>>>> +    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);
> >>>>>>>>
> >>>>>>>> How is this different in the resume_x2apic() case? The IOMMU gets
> >>>>>>>> enabled in the course of that as well. I.e. I'd expect you want
> >>>>>>>> to pass "false" there, not "true".
> >>>>>>>
> >>>>>>> In the resume_x2apic case interrupt remapping should already be
> >>>>>>> enabled or not, but that function is not going to enable interrupt
> >>>>>>> remapping if it wasn't enabled before, hence the IO-APIC entries
> >>>>>>> should already be using the interrupt remapping table and no
> >>>>>>> translation is needed.
> >>>>>>
> >>>>>> Who / what would have enabled the IOMMU in the resume case?
> >>>>>
> >>>>> I don't think the question is who enables interrupt remapping in the
> >>>>> resume case (which is resume_x2apic when calling iommu_enable_x2apic
> >>>>> AFAICT), the point here is that on resume the entries in the IO-APIC
> >>>>> will already match the state of interrupt remapping, so they shouldn't
> >>>>> be translated. If interrupt remapping was off before suspend it will
> >>>>> still be off after resume, and there won't be any translation needed.
> >>>>> The same is true if interrupt remapping is on before suspend.
> >>>>
> >>>> I disagree: save_IO_APIC_setup() gets called from resume_x2apic(),
> >>>> not prior to suspend.
> >>>
> >>> Oh, so maybe that's a misunderstanding on my side. I don't seem to be
> >>> able to find a statement about the contents of the IO-APIC registers
> >>> (and more specifically the entries) when getting back from
> >>> suspension. Are all entries cleared and masked?
> >>>
> >>> Are the values previous to suspension stored?
> >>
> >> See ioapic_suspend() / ioapic_resume(): Looks like there's some
> >> redundancy here - I don't think it makes sense for the LAPIC
> >> code to fiddle with all the RTEs if subsequently (I assume;
> >> didn't check) they'll all be overwritten anyway. I would seem
> >> more logical to me if they'd just all get masked for IOMMU
> >> enabling, deferring to ioapic_resume() for everything else.
> > 
> > Yes, it seems like resume_x2apic shouldn't need to play with the
> > entries at all TBH, just enabling interrupt remapping should be enough
> > given that the entries are already save and restored by
> > ioapic_{suspend/resume}.
> > 
> > Would you agree to leave that as-is in this patch, ie: always pass
> > true to restore_IO_APIC_setup in resume_x2apic?
> 
> Properly explained in the description, that's an option. Even better
> would of course be to do away with the unnecessary save/restore in
> a prereq patch, at which point the question disappears as to what to
> pass to the function at that point.

OK, I can create a pre-patch, but I don't think I have any hardware
that does suspend/resume since it's all server-grade. Someone would
have to test it and assert it actually works correctly, hence my
reluctance to touch any of the suspend/resume logic.

> >>> But it's simply not possible to reach the call to
> >>> restore_IO_APIC_setup with x2apic_enabled == true and interrupt
> >>> remapping disabled, regardless of the initial value of
> >>> x2apic_enabled.
> >>>
> >>> All the paths that could lead to this scenario are short-circuited
> >>> above with a panic.
> >>
> >> Hmm, true. Nevertheless it would feel better if the conditionals
> >> were using what actually matters, rather than something derived.
> > 
> > Would you be OK to using something like v1:
> > 
> > https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg00805.html
> > 
> > See the usage of iommu_enabled in x2apic_bsp_setup. I think using
> > x2apic_enabled is fine given the logic in the function, and the fact
> > that x2APIC mandates interrupt remapping, but I could live with that
> > extra local variable.
> 
> Conceptually this looks better to me. I'd prefer to avoid shadowing
> the global "iommu_enabled" though, and set the local variable to
> "true" only one we actually enable the IOMMU. Also strictly speaking
> you care about the enabling of interrupt remapping, not that of the
> IOMMU (which implicitly means DMA remapping). I wonder whether the
> code couldn't easily be made cope with IOMMU already being enabled,
> assuming that this would lead to {iommu,intremap}_enabled to already
> be true on entry. I.e. you'd request a "raw" restore unless
> intremap_enabled changed from false to true.

I'm not sure it's that easy. A raw restore wouldn't fully work, since
Xen would have replaced the interrupt remapping table with an empty
one, so Xen would have to add the entries for the unmasked IO-APIC
pins to the IRT and then likely also update the IO-APIC pin entry to
match the newly created IRTE.

I think adding a comment at the top of the function to note that this
won't work properly if there are unmasked IO-APIC pins and interrupt
remapping enabled should be enough for now, as the patch doesn't make
this case worse than it's current state.

> >>>> I realize that io_apic_write() would suitably avoid going
> >>>> the remapping path, but I think it would be more clear if the
> >>>> distinction was already made properly at the call site.
> >>>
> >>> I'm afraid I'm slightly loss, do you mean to replace the
> >>> ioapic_write_entry with an io_apic_write in restore_IO_APIC_setup?
> >>
> >> No, because of ...
> >>
> >>> That would be the same as always passing raw == false AFAICT.
> >>
> >> ... this. I'm asking to pass in the argument for "raw" based
> >> on what you want, without relying on io_apic_write()'s
> >> behavior. That's more for code clarity than actual correctness,
> >> since - as said - io_apic_write() would invoke __io_apic_write()
> >> anyway.
> > 
> > Sorry, but I'm afraid I still don't fully understand what you mean
> > with this. restore_IO_APIC_setup uses ioapic_write_entry which in turn
> > uses __io_apic_write or io_apic_write. I could get rid of the usage of
> > ioapic_write_entry and just call __io_apic_write or io_apic_write (or
> > even directly call iommu_update_ire_from_apic), but I'm not sure
> > what's the benefit of any of this, it's just open-coding logic that's
> > done in the helpers.
> > 
> > Would you rather mean to rename the parameter of restore_IO_APIC_setup
> > from 'raw' to 'translate' or some such in order to clearly denote
> > there's a transformation done before writing the entry?
> 
> Whether it's "raw" or "translate" doesn't really matter to me.
> What I'm after is for you to not pass raw=true when you really
> mean raw=false (or the other way around), relying on the
> intremap_enabled check down the call chain in io_apic_write().

I don't think that's the case with my proposed code (or at least not
the intention), when raw=false is passed that's because the entries
_must_ be translated, not because x2apic_bsp_setup is relying on the
value of intremap_enabled.

Let me prepare a new series with the proposed changes and we can
continue from there.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-10-11 10:37 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-10 11:33 [Xen-devel] [PATCH v2 0/2] iommu: fixes for interrupt remapping enabling Roger Pau Monne
2019-10-10 11:33 ` [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU Roger Pau Monne
2019-10-10 11:54   ` Jan Beulich
2019-10-10 12:13     ` Roger Pau Monné
2019-10-10 12:55       ` Jan Beulich
2019-10-10 13:12         ` Roger Pau Monné
2019-10-10 13:46           ` Jan Beulich
2019-10-10 15:19             ` Roger Pau Monné
2019-10-11  7:52               ` Jan Beulich
2019-10-11  9:28                 ` Roger Pau Monné
2019-10-11 10:01                   ` Jan Beulich
2019-10-11 10:36                     ` Roger Pau Monné
2019-10-10 13:29         ` Roger Pau Monné
2019-10-10 13:53           ` Jan Beulich
2019-10-10 11:33 ` [Xen-devel] [PATCH v2 2/2] iommu: translate IO-APIC pins when enabling interrupt remapping Roger Pau Monne
2019-10-10 11:59   ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).