All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] x86/intr: guest interrupt handling fixes/cleanup
@ 2021-01-15 14:28 Roger Pau Monne
  2021-01-15 14:28 ` [PATCH v2 1/4] x86/vioapic: check IRR before attempting to inject interrupt after EOI Roger Pau Monne
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Roger Pau Monne @ 2021-01-15 14:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu, Kevin Tian,
	Paul Durrant

Hello,

The following series aims to fix some shortcomings of guest interrupt
handling when using passthrough devices. The first 3 patches are
bugfixes, which I think should solve the issue(s) that caused the dpci
EOI timer to be introduced. However neither me nor others seem to be able
to reproduce the original issue, so it's hard to tell.

It's my opinion that we should remove the timer and see what explodes
(if anything). That's the only way we will be able to figure out what
the original issue was, and how to fix it without introducing yet
another per-guest-irq related timer.

Thanks, Roger.

Roger Pau Monne (4):
  x86/vioapic: check IRR before attempting to inject interrupt after EOI
  x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode
  x86/vpic: issue dpci EOI for cleared pins at ICW1
  x86/dpci: remove the dpci EOI timer

 xen/arch/x86/hvm/vioapic.c            | 13 +++-
 xen/arch/x86/hvm/vpic.c               | 21 ++++++
 xen/drivers/passthrough/vtd/x86/hvm.c |  3 -
 xen/drivers/passthrough/x86/hvm.c     | 95 +--------------------------
 xen/include/asm-x86/hvm/irq.h         |  3 -
 xen/include/xen/iommu.h               |  5 --
 6 files changed, 35 insertions(+), 105 deletions(-)

-- 
2.29.2



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/4] x86/vioapic: check IRR before attempting to inject interrupt after EOI
  2021-01-15 14:28 [PATCH v2 0/4] x86/intr: guest interrupt handling fixes/cleanup Roger Pau Monne
@ 2021-01-15 14:28 ` Roger Pau Monne
  2021-01-21 16:03   ` Jan Beulich
  2021-01-15 14:28 ` [PATCH v2 2/4] x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode Roger Pau Monne
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Roger Pau Monne @ 2021-01-15 14:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu

In vioapic_update_EOI the irq_lock will be dropped in order to forward
the EOI to the dpci handler, so there's a window between clearing IRR
and checking if the line is asserted where IRR can change behind our
back.

Fix this by checking whether IRR is set before attempting to inject a
new interrupt.

Fixes: 06e3f8f2766 ('vt-d: Do dpci eoi outside of irq_lock.')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/vioapic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index eb6c143f74..804bc77279 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -526,7 +526,7 @@ void vioapic_update_EOI(struct domain *d, u8 vector)
             }
 
             if ( (ent->fields.trig_mode == VIOAPIC_LEVEL_TRIG) &&
-                 !ent->fields.mask &&
+                 !ent->fields.mask && !ent->fields.remote_irr &&
                  hvm_irq->gsi_assert_count[vioapic->base_gsi + pin] )
             {
                 ent->fields.remote_irr = 1;
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/4] x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode
  2021-01-15 14:28 [PATCH v2 0/4] x86/intr: guest interrupt handling fixes/cleanup Roger Pau Monne
  2021-01-15 14:28 ` [PATCH v2 1/4] x86/vioapic: check IRR before attempting to inject interrupt after EOI Roger Pau Monne
@ 2021-01-15 14:28 ` Roger Pau Monne
  2021-01-21 16:23   ` Jan Beulich
  2021-01-15 14:28 ` [PATCH v2 3/4] x86/vpic: issue dpci EOI for cleared pins at ICW1 Roger Pau Monne
  2021-01-15 14:28 ` [PATCH v2 4/4] x86/dpci: remove the dpci EOI timer Roger Pau Monne
  3 siblings, 1 reply; 15+ messages in thread
From: Roger Pau Monne @ 2021-01-15 14:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu

When an IO-APIC pin is switched from level to edge trigger mode the
IRR bit is cleared, so it can be used as a way to EOI an interrupt at
the IO-APIC level.

Such EOI however does not get forwarded to the dpci code like it's
done for the local APIC initiated EOI. This change adds the code in
order to notify dpci of such EOI, so that dpci and the interrupt
controller are in sync.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/vioapic.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 804bc77279..dc35347afa 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -268,6 +268,17 @@ static void vioapic_write_redirent(
 
     spin_unlock(&d->arch.hvm.irq_lock);
 
+    if ( ent.fields.trig_mode == VIOAPIC_EDGE_TRIG &&
+         ent.fields.remote_irr && is_iommu_enabled(d) )
+            /*
+             * Since IRR has been cleared and further interrupts can be
+             * injected also attempt to deassert any virtual line of passed
+             * through devices using this pin. Switching a pin from level to
+             * trigger mode can be used as a way to EOI an interrupt at the
+             * IO-APIC level.
+             */
+            hvm_dpci_eoi(d, gsi);
+
     if ( is_hardware_domain(d) && unmasked )
     {
         /*
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 3/4] x86/vpic: issue dpci EOI for cleared pins at ICW1
  2021-01-15 14:28 [PATCH v2 0/4] x86/intr: guest interrupt handling fixes/cleanup Roger Pau Monne
  2021-01-15 14:28 ` [PATCH v2 1/4] x86/vioapic: check IRR before attempting to inject interrupt after EOI Roger Pau Monne
  2021-01-15 14:28 ` [PATCH v2 2/4] x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode Roger Pau Monne
@ 2021-01-15 14:28 ` Roger Pau Monne
  2021-01-22  9:02   ` Jan Beulich
  2021-01-15 14:28 ` [PATCH v2 4/4] x86/dpci: remove the dpci EOI timer Roger Pau Monne
  3 siblings, 1 reply; 15+ messages in thread
From: Roger Pau Monne @ 2021-01-15 14:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu

When pins are cleared from either ISR or IRR as part of the
initialization sequence forward the clearing of those pins to the dpci
EOI handler, as it is equivalent to an EOI. Not doing so can bring the
interrupt controller state out of sync with the dpci handling logic,
that expects a notification when a pin has been EOI'ed.

Fixes: 7b3cb5e5416 ('IRQ injection changes for HVM PCI passthru.')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/vpic.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
index c1c1de7fd0..522cacdc4b 100644
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -193,6 +193,8 @@ static void vpic_ioport_write(
     {
         if ( val & 0x10 )
         {
+            unsigned int pending = vpic->isr | (vpic->irr & ~vpic->elcr);
+
             /* ICW1 */
             /* Clear edge-sensing logic. */
             vpic->irr &= vpic->elcr;
@@ -217,6 +219,24 @@ static void vpic_ioport_write(
             }
 
             vpic->init_state = ((val & 3) << 2) | 1;
+            vpic_update_int_output(vpic);
+            vpic_unlock(vpic);
+
+            /*
+             * Forward the EOI of any pending or in service interrupt that has
+             * been cleared from IRR or ISR, or else the dpci logic will get
+             * out of sync with the state of the interrupt controller.
+             */
+            while ( pending )
+            {
+                unsigned int pin = __scanbit(pending, 8);
+
+                ASSERT(pin < 8);
+                hvm_dpci_eoi(current->domain,
+                             hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : pin));
+                __clear_bit(pin, &pending);
+            }
+            goto unmask;
         }
         else if ( val & 0x08 )
         {
@@ -306,6 +326,7 @@ static void vpic_ioport_write(
 
     vpic_unlock(vpic);
 
+ unmask:
     if ( unmasked )
         pt_may_unmask_irq(vpic_domain(vpic), NULL);
 }
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 4/4] x86/dpci: remove the dpci EOI timer
  2021-01-15 14:28 [PATCH v2 0/4] x86/intr: guest interrupt handling fixes/cleanup Roger Pau Monne
                   ` (2 preceding siblings ...)
  2021-01-15 14:28 ` [PATCH v2 3/4] x86/vpic: issue dpci EOI for cleared pins at ICW1 Roger Pau Monne
@ 2021-01-15 14:28 ` Roger Pau Monne
  2021-01-15 14:56   ` Roger Pau Monné
  2021-01-22 11:37   ` Jan Beulich
  3 siblings, 2 replies; 15+ messages in thread
From: Roger Pau Monne @ 2021-01-15 14:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Roger Pau Monne, Kevin Tian, Jan Beulich, Paul Durrant,
	Andrew Cooper, Wei Liu

Current interrupt pass though code will setup a timer for each
interrupt injected to the guest that requires an EOI from the guest.
Such timer would perform two actions if the guest doesn't EOI the
interrupt before a given period of time. The first one is deasserting
the virtual line, the second is perform an EOI of the physical
interrupt source if it requires such.

The deasserting of the guest virtual line is wrong, since it messes
with the interrupt status of the guest. This seems to have been done
in order to compensate for missing deasserts when certain interrupt
controller actions are performed. The original motivation of the
introduction of the timer was to fix issues when a GSI was shared
between different guests. We believe that other changes in the
interrupt handling code (ie: proper propagation of EOI related actions
to dpci) will have fixed such errors now.

Performing an EOI of the physical interrupt source is redundant, since
there's already a timer that takes care of this for all interrupts,
not just the HVM dpci ones, see irq_guest_action_t struct eoi_timer
field.

Since both of the actions performed by the dpci timer are not
required, remove it altogether.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Add parentheses.
---
 xen/drivers/passthrough/vtd/x86/hvm.c |  3 -
 xen/drivers/passthrough/x86/hvm.c     | 95 +--------------------------
 xen/include/asm-x86/hvm/irq.h         |  3 -
 xen/include/xen/iommu.h               |  5 --
 4 files changed, 2 insertions(+), 104 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/x86/hvm.c b/xen/drivers/passthrough/vtd/x86/hvm.c
index f77b35815c..b531fe907a 100644
--- a/xen/drivers/passthrough/vtd/x86/hvm.c
+++ b/xen/drivers/passthrough/vtd/x86/hvm.c
@@ -36,10 +36,7 @@ static int _hvm_dpci_isairq_eoi(struct domain *d,
         {
             hvm_pci_intx_deassert(d, digl->device, digl->intx);
             if ( --pirq_dpci->pending == 0 )
-            {
-                stop_timer(&pirq_dpci->timer);
                 pirq_guest_eoi(dpci_pirq(pirq_dpci));
-            }
         }
     }
 
diff --git a/xen/drivers/passthrough/x86/hvm.c b/xen/drivers/passthrough/x86/hvm.c
index edc8059518..5d901db50c 100644
--- a/xen/drivers/passthrough/x86/hvm.c
+++ b/xen/drivers/passthrough/x86/hvm.c
@@ -136,77 +136,6 @@ static void pt_pirq_softirq_reset(struct hvm_pirq_dpci *pirq_dpci)
     pirq_dpci->masked = 0;
 }
 
-bool pt_irq_need_timer(uint32_t flags)
-{
-    return !(flags & (HVM_IRQ_DPCI_GUEST_MSI | HVM_IRQ_DPCI_TRANSLATE |
-                      HVM_IRQ_DPCI_NO_EOI));
-}
-
-static int pt_irq_guest_eoi(struct domain *d, struct hvm_pirq_dpci *pirq_dpci,
-                            void *arg)
-{
-    if ( __test_and_clear_bit(_HVM_IRQ_DPCI_EOI_LATCH_SHIFT,
-                              &pirq_dpci->flags) )
-    {
-        pirq_dpci->masked = 0;
-        pirq_dpci->pending = 0;
-        pirq_guest_eoi(dpci_pirq(pirq_dpci));
-    }
-
-    return 0;
-}
-
-static void pt_irq_time_out(void *data)
-{
-    struct hvm_pirq_dpci *irq_map = data;
-    const struct hvm_irq_dpci *dpci;
-    const struct dev_intx_gsi_link *digl;
-
-    spin_lock(&irq_map->dom->event_lock);
-
-    if ( irq_map->flags & HVM_IRQ_DPCI_IDENTITY_GSI )
-    {
-        ASSERT(is_hardware_domain(irq_map->dom));
-        /*
-         * Identity mapped, no need to iterate over the guest GSI list to find
-         * other pirqs sharing the same guest GSI.
-         *
-         * In the identity mapped case the EOI can also be done now, this way
-         * the iteration over the list of domain pirqs is avoided.
-         */
-        hvm_gsi_deassert(irq_map->dom, dpci_pirq(irq_map)->pirq);
-        irq_map->flags |= HVM_IRQ_DPCI_EOI_LATCH;
-        pt_irq_guest_eoi(irq_map->dom, irq_map, NULL);
-        spin_unlock(&irq_map->dom->event_lock);
-        return;
-    }
-
-    dpci = domain_get_irq_dpci(irq_map->dom);
-    if ( unlikely(!dpci) )
-    {
-        ASSERT_UNREACHABLE();
-        spin_unlock(&irq_map->dom->event_lock);
-        return;
-    }
-    list_for_each_entry ( digl, &irq_map->digl_list, list )
-    {
-        unsigned int guest_gsi = hvm_pci_intx_gsi(digl->device, digl->intx);
-        const struct hvm_girq_dpci_mapping *girq;
-
-        list_for_each_entry ( girq, &dpci->girq[guest_gsi], list )
-        {
-            struct pirq *pirq = pirq_info(irq_map->dom, girq->machine_gsi);
-
-            pirq_dpci(pirq)->flags |= HVM_IRQ_DPCI_EOI_LATCH;
-        }
-        hvm_pci_intx_deassert(irq_map->dom, digl->device, digl->intx);
-    }
-
-    pt_pirq_iterate(irq_map->dom, pt_irq_guest_eoi, NULL);
-
-    spin_unlock(&irq_map->dom->event_lock);
-}
-
 struct hvm_irq_dpci *domain_get_irq_dpci(const struct domain *d)
 {
     if ( !d || !is_hvm_domain(d) )
@@ -568,15 +497,10 @@ int pt_irq_create_bind(
                 }
             }
 
-            /* Init timer before binding */
-            if ( pt_irq_need_timer(pirq_dpci->flags) )
-                init_timer(&pirq_dpci->timer, pt_irq_time_out, pirq_dpci, 0);
             /* Deal with gsi for legacy devices */
             rc = pirq_guest_bind(d->vcpu[0], info, share);
             if ( unlikely(rc) )
             {
-                if ( pt_irq_need_timer(pirq_dpci->flags) )
-                    kill_timer(&pirq_dpci->timer);
                 /*
                  * There is no path for __do_IRQ to schedule softirq as
                  * IRQ_GUEST is not set. As such we can reset 'dom' directly.
@@ -743,8 +667,6 @@ int pt_irq_destroy_bind(
     {
         pirq_guest_unbind(d, pirq);
         msixtbl_pt_unregister(d, pirq);
-        if ( pt_irq_need_timer(pirq_dpci->flags) )
-            kill_timer(&pirq_dpci->timer);
         pirq_dpci->flags = 0;
         /*
          * See comment in pt_irq_create_bind's PT_IRQ_TYPE_MSI before the
@@ -934,16 +856,6 @@ static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci)
             __msi_pirq_eoi(pirq_dpci);
             goto out;
         }
-
-        /*
-         * Set a timer to see if the guest can finish the interrupt or not. For
-         * example, the guest OS may unmask the PIC during boot, before the
-         * guest driver is loaded. hvm_pci_intx_assert() may succeed, but the
-         * guest will never deal with the irq, then the physical interrupt line
-         * will never be deasserted.
-         */
-        ASSERT(pt_irq_need_timer(pirq_dpci->flags));
-        set_timer(&pirq_dpci->timer, NOW() + PT_IRQ_TIME_OUT);
     }
 
  out:
@@ -967,10 +879,10 @@ static void hvm_pirq_eoi(struct pirq *pirq)
      * since interrupt is still not EOIed
      */
     if ( --pirq_dpci->pending ||
-         !pt_irq_need_timer(pirq_dpci->flags) )
+         /* When the interrupt source is MSI no Ack should be performed. */
+         pirq_dpci->flags & HVM_IRQ_DPCI_TRANSLATE )
         return;
 
-    stop_timer(&pirq_dpci->timer);
     pirq_guest_eoi(pirq);
 }
 
@@ -1038,9 +950,6 @@ static int pci_clean_dpci_irq(struct domain *d,
 
     pirq_guest_unbind(d, dpci_pirq(pirq_dpci));
 
-    if ( pt_irq_need_timer(pirq_dpci->flags) )
-        kill_timer(&pirq_dpci->timer);
-
     list_for_each_entry_safe ( digl, tmp, &pirq_dpci->digl_list, list )
     {
         list_del(&digl->list);
diff --git a/xen/include/asm-x86/hvm/irq.h b/xen/include/asm-x86/hvm/irq.h
index 532880d497..d40e13de6e 100644
--- a/xen/include/asm-x86/hvm/irq.h
+++ b/xen/include/asm-x86/hvm/irq.h
@@ -117,7 +117,6 @@ struct dev_intx_gsi_link {
 #define _HVM_IRQ_DPCI_MACH_PCI_SHIFT            0
 #define _HVM_IRQ_DPCI_MACH_MSI_SHIFT            1
 #define _HVM_IRQ_DPCI_MAPPED_SHIFT              2
-#define _HVM_IRQ_DPCI_EOI_LATCH_SHIFT           3
 #define _HVM_IRQ_DPCI_GUEST_PCI_SHIFT           4
 #define _HVM_IRQ_DPCI_GUEST_MSI_SHIFT           5
 #define _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT        6
@@ -126,7 +125,6 @@ struct dev_intx_gsi_link {
 #define HVM_IRQ_DPCI_MACH_PCI        (1u << _HVM_IRQ_DPCI_MACH_PCI_SHIFT)
 #define HVM_IRQ_DPCI_MACH_MSI        (1u << _HVM_IRQ_DPCI_MACH_MSI_SHIFT)
 #define HVM_IRQ_DPCI_MAPPED          (1u << _HVM_IRQ_DPCI_MAPPED_SHIFT)
-#define HVM_IRQ_DPCI_EOI_LATCH       (1u << _HVM_IRQ_DPCI_EOI_LATCH_SHIFT)
 #define HVM_IRQ_DPCI_GUEST_PCI       (1u << _HVM_IRQ_DPCI_GUEST_PCI_SHIFT)
 #define HVM_IRQ_DPCI_GUEST_MSI       (1u << _HVM_IRQ_DPCI_GUEST_MSI_SHIFT)
 #define HVM_IRQ_DPCI_IDENTITY_GSI    (1u << _HVM_IRQ_DPCI_IDENTITY_GSI_SHIFT)
@@ -173,7 +171,6 @@ struct hvm_pirq_dpci {
     struct list_head digl_list;
     struct domain *dom;
     struct hvm_gmsi_info gmsi;
-    struct timer timer;
     struct list_head softirq_list;
 };
 
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index f0295fd6c3..4f3098b6ed 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -184,11 +184,6 @@ int pt_irq_destroy_bind(struct domain *, const struct xen_domctl_bind_pt_irq *);
 void hvm_dpci_isairq_eoi(struct domain *d, unsigned int isairq);
 struct hvm_irq_dpci *domain_get_irq_dpci(const struct domain *);
 void free_hvm_irq_dpci(struct hvm_irq_dpci *dpci);
-#ifdef CONFIG_HVM
-bool pt_irq_need_timer(uint32_t flags);
-#else
-static inline bool pt_irq_need_timer(unsigned int flags) { return false; }
-#endif
 
 struct msi_desc;
 struct msi_msg;
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/4] x86/dpci: remove the dpci EOI timer
  2021-01-15 14:28 ` [PATCH v2 4/4] x86/dpci: remove the dpci EOI timer Roger Pau Monne
@ 2021-01-15 14:56   ` Roger Pau Monné
  2021-01-22 11:37   ` Jan Beulich
  1 sibling, 0 replies; 15+ messages in thread
From: Roger Pau Monné @ 2021-01-15 14:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Kevin Tian, Jan Beulich, Paul Durrant, Andrew Cooper, Wei Liu

On Fri, Jan 15, 2021 at 03:28:20PM +0100, Roger Pau Monne wrote:
> Current interrupt pass though code will setup a timer for each
> interrupt injected to the guest that requires an EOI from the guest.
> Such timer would perform two actions if the guest doesn't EOI the
> interrupt before a given period of time. The first one is deasserting
> the virtual line, the second is perform an EOI of the physical
> interrupt source if it requires such.
> 
> The deasserting of the guest virtual line is wrong, since it messes
> with the interrupt status of the guest. This seems to have been done
> in order to compensate for missing deasserts when certain interrupt
> controller actions are performed. The original motivation of the
> introduction of the timer was to fix issues when a GSI was shared
> between different guests. We believe that other changes in the
> interrupt handling code (ie: proper propagation of EOI related actions
> to dpci) will have fixed such errors now.
> 
> Performing an EOI of the physical interrupt source is redundant, since
> there's already a timer that takes care of this for all interrupts,
> not just the HVM dpci ones, see irq_guest_action_t struct eoi_timer
> field.
> 
> Since both of the actions performed by the dpci timer are not
> required, remove it altogether.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v1:
>  - Add parentheses.

This is a lie because I forgot to refresh the patch before sending.
Will send a new version if required.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/4] x86/vioapic: check IRR before attempting to inject interrupt after EOI
  2021-01-15 14:28 ` [PATCH v2 1/4] x86/vioapic: check IRR before attempting to inject interrupt after EOI Roger Pau Monne
@ 2021-01-21 16:03   ` Jan Beulich
  2021-01-21 17:27     ` Roger Pau Monné
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2021-01-21 16:03 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 15.01.2021 15:28, Roger Pau Monne wrote:
> In vioapic_update_EOI the irq_lock will be dropped in order to forward
> the EOI to the dpci handler, so there's a window between clearing IRR
> and checking if the line is asserted where IRR can change behind our
> back.
> 
> Fix this by checking whether IRR is set before attempting to inject a
> new interrupt.
> 
> Fixes: 06e3f8f2766 ('vt-d: Do dpci eoi outside of irq_lock.')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

It's fine this way, so
Reviewed-by: Jan Beulich <jbeulich@suse.com>
but how about a slightly different change:

> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -526,7 +526,7 @@ void vioapic_update_EOI(struct domain *d, u8 vector)
>              }
>  
>              if ( (ent->fields.trig_mode == VIOAPIC_LEVEL_TRIG) &&
> -                 !ent->fields.mask &&
> +                 !ent->fields.mask && !ent->fields.remote_irr &&
>                   hvm_irq->gsi_assert_count[vioapic->base_gsi + pin] )
>              {
>                  ent->fields.remote_irr = 1;

The check is only needed if the lock was dropped intermediately,
which happens only conditionally. So an alternative would seem
to be

            if ( is_iommu_enabled(d) )
            {
                spin_unlock(&d->arch.hvm.irq_lock);
                hvm_dpci_eoi(d, vioapic->base_gsi + pin, ent);
                spin_lock(&d->arch.hvm.irq_lock);

                if ( ent->fields.remote_irr )
                    continue;
            }

in the code immediately above. Thoughts?

Jan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/4] x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode
  2021-01-15 14:28 ` [PATCH v2 2/4] x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode Roger Pau Monne
@ 2021-01-21 16:23   ` Jan Beulich
  2021-01-21 17:45     ` Roger Pau Monné
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2021-01-21 16:23 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 15.01.2021 15:28, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -268,6 +268,17 @@ static void vioapic_write_redirent(
>  
>      spin_unlock(&d->arch.hvm.irq_lock);
>  
> +    if ( ent.fields.trig_mode == VIOAPIC_EDGE_TRIG &&
> +         ent.fields.remote_irr && is_iommu_enabled(d) )
> +            /*
> +             * Since IRR has been cleared and further interrupts can be
> +             * injected also attempt to deassert any virtual line of passed
> +             * through devices using this pin. Switching a pin from level to
> +             * trigger mode can be used as a way to EOI an interrupt at the
> +             * IO-APIC level.
> +             */
> +            hvm_dpci_eoi(d, gsi);
> +
>      if ( is_hardware_domain(d) && unmasked )
>      {
>          /*

I assume in the comment you mean "... from level to edge
mode ...". But this isn't reflected in the if() you add -
you do the same also when the mode doesn't change. Or do
you build on the assumption that ent.fields.remote_irr can
only be set if the prior mode was "level" (in which case
an assertion may be warranted, as I don't think this is
overly obvious)?

Also, looking at this code, is it correct to trigger an IRQ
upon the guest writing the upper half of an unmasked RTE
with remote_irr clear? I'd assume this needs to be strictly
limited to a 1->0 transition of the mask bit. If other code
indeed guarantees this in all cases, perhaps another place
where an assertion would be warranted?

Jan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/4] x86/vioapic: check IRR before attempting to inject interrupt after EOI
  2021-01-21 16:03   ` Jan Beulich
@ 2021-01-21 17:27     ` Roger Pau Monné
  0 siblings, 0 replies; 15+ messages in thread
From: Roger Pau Monné @ 2021-01-21 17:27 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Thu, Jan 21, 2021 at 05:03:51PM +0100, Jan Beulich wrote:
> On 15.01.2021 15:28, Roger Pau Monne wrote:
> > In vioapic_update_EOI the irq_lock will be dropped in order to forward
> > the EOI to the dpci handler, so there's a window between clearing IRR
> > and checking if the line is asserted where IRR can change behind our
> > back.
> > 
> > Fix this by checking whether IRR is set before attempting to inject a
> > new interrupt.
> > 
> > Fixes: 06e3f8f2766 ('vt-d: Do dpci eoi outside of irq_lock.')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> It's fine this way, so
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> but how about a slightly different change:
> 
> > --- a/xen/arch/x86/hvm/vioapic.c
> > +++ b/xen/arch/x86/hvm/vioapic.c
> > @@ -526,7 +526,7 @@ void vioapic_update_EOI(struct domain *d, u8 vector)
> >              }
> >  
> >              if ( (ent->fields.trig_mode == VIOAPIC_LEVEL_TRIG) &&
> > -                 !ent->fields.mask &&
> > +                 !ent->fields.mask && !ent->fields.remote_irr &&
> >                   hvm_irq->gsi_assert_count[vioapic->base_gsi + pin] )
> >              {
> >                  ent->fields.remote_irr = 1;
> 
> The check is only needed if the lock was dropped intermediately,
> which happens only conditionally. So an alternative would seem
> to be
> 
>             if ( is_iommu_enabled(d) )
>             {
>                 spin_unlock(&d->arch.hvm.irq_lock);
>                 hvm_dpci_eoi(d, vioapic->base_gsi + pin, ent);
>                 spin_lock(&d->arch.hvm.irq_lock);
> 
>                 if ( ent->fields.remote_irr )
>                     continue;
>             }
> 
> in the code immediately above. Thoughts?

IMO that seems more dangerous as if new code is added below that chunk
that wouldn't depend on the value of IRR it might get skipped
unintentionally, as it's possible to oversight that the loop is
short-circuited here.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/4] x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode
  2021-01-21 16:23   ` Jan Beulich
@ 2021-01-21 17:45     ` Roger Pau Monné
  2021-01-22  9:13       ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Roger Pau Monné @ 2021-01-21 17:45 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Thu, Jan 21, 2021 at 05:23:04PM +0100, Jan Beulich wrote:
> On 15.01.2021 15:28, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/vioapic.c
> > +++ b/xen/arch/x86/hvm/vioapic.c
> > @@ -268,6 +268,17 @@ static void vioapic_write_redirent(
> >  
> >      spin_unlock(&d->arch.hvm.irq_lock);
> >  
> > +    if ( ent.fields.trig_mode == VIOAPIC_EDGE_TRIG &&
> > +         ent.fields.remote_irr && is_iommu_enabled(d) )
> > +            /*
> > +             * Since IRR has been cleared and further interrupts can be
> > +             * injected also attempt to deassert any virtual line of passed
> > +             * through devices using this pin. Switching a pin from level to
> > +             * trigger mode can be used as a way to EOI an interrupt at the
> > +             * IO-APIC level.
> > +             */
> > +            hvm_dpci_eoi(d, gsi);
> > +
> >      if ( is_hardware_domain(d) && unmasked )
> >      {
> >          /*
> 
> I assume in the comment you mean "... from level to edge
> mode ...".

Yes, that's right, I completely missed it, sorry.

> But this isn't reflected in the if() you add -
> you do the same also when the mode doesn't change. Or do
> you build on the assumption that ent.fields.remote_irr can
> only be set if the prior mode was "level" (in which case
> an assertion may be warranted, as I don't think this is
> overly obvious)?

Yes, IRR is only set for level triggered interrupts, so it's indeed
build on the assumption that a pin can only have had IRR set when in
edge mode when it's being switched from level to edge.

I can add an assertion.

> Also, looking at this code, is it correct to trigger an IRQ
> upon the guest writing the upper half of an unmasked RTE
> with remote_irr clear? I'd assume this needs to be strictly
> limited to a 1->0 transition of the mask bit. If other code
> indeed guarantees this in all cases, perhaps another place
> where an assertion would be warranted?

Indeed. I don't think it should be possible for a write to the upper
half to trigger the injection of an interrupt, as having
gsi_assert_count > 0 would imply that either IRR is already set, or
that the pin is masked when processing an upper write.

I can add that a pre-patch if you agree.

In fact we could almost short-circuit the logic after the *pent = ent;
line for upper writes if it wasn't for the call to
vlapic_adjust_i8259_target, the rest of the code there shouldn't
matter for upper writes. And the i8259 target logic that we have is
very dodgy I would say. I have plans to fix it at some point, but
that requires fixing the virtual periodic timers logic first, which I
didn't get around to re-posting.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/4] x86/vpic: issue dpci EOI for cleared pins at ICW1
  2021-01-15 14:28 ` [PATCH v2 3/4] x86/vpic: issue dpci EOI for cleared pins at ICW1 Roger Pau Monne
@ 2021-01-22  9:02   ` Jan Beulich
  2021-01-22  9:53     ` Roger Pau Monné
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2021-01-22  9:02 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 15.01.2021 15:28, Roger Pau Monne wrote:
> When pins are cleared from either ISR or IRR as part of the
> initialization sequence forward the clearing of those pins to the dpci
> EOI handler, as it is equivalent to an EOI. Not doing so can bring the
> interrupt controller state out of sync with the dpci handling logic,
> that expects a notification when a pin has been EOI'ed.

The question though is what this clearing of ISR and some of
IRR during ICW1 is based upon: Going through various manuals
(up-to-date from Nov 2020, intermediate, and all the way
through to an old hard copy from 1993) I can't find a single
mention of ICW1 having any effect on ISR or IRR, despite both
soft copy ones being detailed about other state getting
changed.

There is "Following initialization, an interrupt request (IRQ)
input must make a low-to-high transition to generate an
interrupt", but I'm afraid this can be read to mean different
things. And if it was meant to describe an effect on ISR
and/or IRR, it would imo be in conflict with us keeping IRR
bits of level triggered interrupts.

> @@ -217,6 +219,24 @@ static void vpic_ioport_write(
>              }
>  
>              vpic->init_state = ((val & 3) << 2) | 1;
> +            vpic_update_int_output(vpic);
> +            vpic_unlock(vpic);
> +
> +            /*
> +             * Forward the EOI of any pending or in service interrupt that has
> +             * been cleared from IRR or ISR, or else the dpci logic will get
> +             * out of sync with the state of the interrupt controller.
> +             */
> +            while ( pending )
> +            {
> +                unsigned int pin = __scanbit(pending, 8);
> +
> +                ASSERT(pin < 8);
> +                hvm_dpci_eoi(current->domain,
> +                             hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : pin));
> +                __clear_bit(pin, &pending);
> +            }
> +            goto unmask;

A similar consideration applies here (albeit just as an
observation, as being orthogonal to your change): A PIC
becomes ready for handling interrupts only at the end of the
ICWx sequence. Hence it would seem to me that invoking
pt_may_unmask_irq() (maybe also vpic_update_int_output()) at
ICW1 is premature. To me this seems particularly relevant
considering that ICW1 clears IMR, and hence an interrupt
becoming pending between ICW1 and ICW2 wouldn't know which
vector to use.

Or maybe on that side of things, vpic_update_int_output()
should really do

    if ( vpic->int_output == (!vpic->init_state && irq >= 0) )
        return;

    /* INT line transition L->H or H->L. */
    vpic->int_output = !vpic->init_state && !vpic->int_output;

?

Jan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/4] x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode
  2021-01-21 17:45     ` Roger Pau Monné
@ 2021-01-22  9:13       ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2021-01-22  9:13 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 21.01.2021 18:45, Roger Pau Monné wrote:
> On Thu, Jan 21, 2021 at 05:23:04PM +0100, Jan Beulich wrote:
>> On 15.01.2021 15:28, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/hvm/vioapic.c
>>> +++ b/xen/arch/x86/hvm/vioapic.c
>>> @@ -268,6 +268,17 @@ static void vioapic_write_redirent(
>>>  
>>>      spin_unlock(&d->arch.hvm.irq_lock);
>>>  
>>> +    if ( ent.fields.trig_mode == VIOAPIC_EDGE_TRIG &&
>>> +         ent.fields.remote_irr && is_iommu_enabled(d) )
>>> +            /*
>>> +             * Since IRR has been cleared and further interrupts can be
>>> +             * injected also attempt to deassert any virtual line of passed
>>> +             * through devices using this pin. Switching a pin from level to
>>> +             * trigger mode can be used as a way to EOI an interrupt at the
>>> +             * IO-APIC level.
>>> +             */
>>> +            hvm_dpci_eoi(d, gsi);
>>> +
>>>      if ( is_hardware_domain(d) && unmasked )
>>>      {
>>>          /*
>>
>> I assume in the comment you mean "... from level to edge
>> mode ...".
> 
> Yes, that's right, I completely missed it, sorry.
> 
>> But this isn't reflected in the if() you add -
>> you do the same also when the mode doesn't change. Or do
>> you build on the assumption that ent.fields.remote_irr can
>> only be set if the prior mode was "level" (in which case
>> an assertion may be warranted, as I don't think this is
>> overly obvious)?
> 
> Yes, IRR is only set for level triggered interrupts, so it's indeed
> build on the assumption that a pin can only have had IRR set when in
> edge mode when it's being switched from level to edge.
> 
> I can add an assertion.
> 
>> Also, looking at this code, is it correct to trigger an IRQ
>> upon the guest writing the upper half of an unmasked RTE
>> with remote_irr clear? I'd assume this needs to be strictly
>> limited to a 1->0 transition of the mask bit. If other code
>> indeed guarantees this in all cases, perhaps another place
>> where an assertion would be warranted?
> 
> Indeed. I don't think it should be possible for a write to the upper
> half to trigger the injection of an interrupt, as having
> gsi_assert_count > 0 would imply that either IRR is already set, or
> that the pin is masked when processing an upper write.
> 
> I can add that a pre-patch if you agree.

I do, yes.

> In fact we could almost short-circuit the logic after the *pent = ent;
> line for upper writes if it wasn't for the call to
> vlapic_adjust_i8259_target, the rest of the code there shouldn't
> matter for upper writes. And the i8259 target logic that we have is
> very dodgy I would say. I have plans to fix it at some point, but
> that requires fixing the virtual periodic timers logic first, which I
> didn't get around to re-posting.

True. Looking forward to it.

Jan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/4] x86/vpic: issue dpci EOI for cleared pins at ICW1
  2021-01-22  9:02   ` Jan Beulich
@ 2021-01-22  9:53     ` Roger Pau Monné
  2021-01-22 10:06       ` Jan Beulich
  0 siblings, 1 reply; 15+ messages in thread
From: Roger Pau Monné @ 2021-01-22  9:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Fri, Jan 22, 2021 at 10:02:15AM +0100, Jan Beulich wrote:
> On 15.01.2021 15:28, Roger Pau Monne wrote:
> > When pins are cleared from either ISR or IRR as part of the
> > initialization sequence forward the clearing of those pins to the dpci
> > EOI handler, as it is equivalent to an EOI. Not doing so can bring the
> > interrupt controller state out of sync with the dpci handling logic,
> > that expects a notification when a pin has been EOI'ed.
> 
> The question though is what this clearing of ISR and some of
> IRR during ICW1 is based upon: Going through various manuals
> (up-to-date from Nov 2020, intermediate, and all the way
> through to an old hard copy from 1993) I can't find a single
> mention of ICW1 having any effect on ISR or IRR, despite both
> soft copy ones being detailed about other state getting
> changed.
> 
> There is "Following initialization, an interrupt request (IRQ)
> input must make a low-to-high transition to generate an
> interrupt", but I'm afraid this can be read to mean different
> things. And if it was meant to describe an effect on ISR
> and/or IRR, it would imo be in conflict with us keeping IRR
> bits of level triggered interrupts.

I have an old copy of the 8259A spec, and it does mention the same
quote that you have above. I also wondered while I was adjusting this
code whether what we do is fine. I have to admit I haven't considered
changing this logic much because I don't have an effective way to test
it.

I've also taken a look at what others do, QEMU for example will do
exactly the same logic as Xen during ICW1, bhyve OTOH will fully zero
IRR and leave ISR as is.

Doing a bit of archaeology in QEMU I've found the following change:

commit aa24822bdc7c4e74afbc6fa1324b01cf067da7cb
Author: Jan Kiszka <jan.kiszka@siemens.com>
Date:   Tue Jan 24 16:29:29 2012 +0100

    i8259: Do not clear level-triggered lines in IRR on init

    When an input line is handled as level-triggered, it will immediately
    raise an IRQ on the output of a PIC again that goes through an init
    reset. So only clear the edge-triggered inputs from IRR in that
    scenario.

    Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

Which seems to point out there's a reasoning behind what it's
currently implemented. This seems to be against the spec as there's no
low-to-high transition?

> > @@ -217,6 +219,24 @@ static void vpic_ioport_write(
> >              }
> >  
> >              vpic->init_state = ((val & 3) << 2) | 1;
> > +            vpic_update_int_output(vpic);
> > +            vpic_unlock(vpic);
> > +
> > +            /*
> > +             * Forward the EOI of any pending or in service interrupt that has
> > +             * been cleared from IRR or ISR, or else the dpci logic will get
> > +             * out of sync with the state of the interrupt controller.
> > +             */
> > +            while ( pending )
> > +            {
> > +                unsigned int pin = __scanbit(pending, 8);
> > +
> > +                ASSERT(pin < 8);
> > +                hvm_dpci_eoi(current->domain,
> > +                             hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : pin));
> > +                __clear_bit(pin, &pending);
> > +            }
> > +            goto unmask;
> 
> A similar consideration applies here (albeit just as an
> observation, as being orthogonal to your change): A PIC
> becomes ready for handling interrupts only at the end of the
> ICWx sequence. Hence it would seem to me that invoking
> pt_may_unmask_irq()

Right, it might be best to force unmask = 1 when init_state gets set
to 0. A spurious call to pt_may_unmask_irq won't be harmful anyway,
even if no pins have been actually unmasked.

> (maybe also vpic_update_int_output()) at
> ICW1 is premature. To me this seems particularly relevant
> considering that ICW1 clears IMR, and hence an interrupt
> becoming pending between ICW1 and ICW2 wouldn't know which
> vector to use.
> 
> Or maybe on that side of things, vpic_update_int_output()
> should really do
> 
>     if ( vpic->int_output == (!vpic->init_state && irq >= 0) )
>         return;
> 
>     /* INT line transition L->H or H->L. */
>     vpic->int_output = !vpic->init_state && !vpic->int_output;
> 
> ?

So to force int_output = false when in init state. Seems reasonable, I
can implement this either as a pre-patch or a followup, but again I'm
not sure I have means to properly test it.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/4] x86/vpic: issue dpci EOI for cleared pins at ICW1
  2021-01-22  9:53     ` Roger Pau Monné
@ 2021-01-22 10:06       ` Jan Beulich
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2021-01-22 10:06 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu, xen-devel

On 22.01.2021 10:53, Roger Pau Monné wrote:
> On Fri, Jan 22, 2021 at 10:02:15AM +0100, Jan Beulich wrote:
>> On 15.01.2021 15:28, Roger Pau Monne wrote:
>>> When pins are cleared from either ISR or IRR as part of the
>>> initialization sequence forward the clearing of those pins to the dpci
>>> EOI handler, as it is equivalent to an EOI. Not doing so can bring the
>>> interrupt controller state out of sync with the dpci handling logic,
>>> that expects a notification when a pin has been EOI'ed.
>>
>> The question though is what this clearing of ISR and some of
>> IRR during ICW1 is based upon: Going through various manuals
>> (up-to-date from Nov 2020, intermediate, and all the way
>> through to an old hard copy from 1993) I can't find a single
>> mention of ICW1 having any effect on ISR or IRR, despite both
>> soft copy ones being detailed about other state getting
>> changed.
>>
>> There is "Following initialization, an interrupt request (IRQ)
>> input must make a low-to-high transition to generate an
>> interrupt", but I'm afraid this can be read to mean different
>> things. And if it was meant to describe an effect on ISR
>> and/or IRR, it would imo be in conflict with us keeping IRR
>> bits of level triggered interrupts.
> 
> I have an old copy of the 8259A spec, and it does mention the same
> quote that you have above. I also wondered while I was adjusting this
> code whether what we do is fine. I have to admit I haven't considered
> changing this logic much because I don't have an effective way to test
> it.
> 
> I've also taken a look at what others do, QEMU for example will do
> exactly the same logic as Xen during ICW1, bhyve OTOH will fully zero
> IRR and leave ISR as is.
> 
> Doing a bit of archaeology in QEMU I've found the following change:
> 
> commit aa24822bdc7c4e74afbc6fa1324b01cf067da7cb
> Author: Jan Kiszka <jan.kiszka@siemens.com>
> Date:   Tue Jan 24 16:29:29 2012 +0100
> 
>     i8259: Do not clear level-triggered lines in IRR on init
> 
>     When an input line is handled as level-triggered, it will immediately
>     raise an IRQ on the output of a PIC again that goes through an init
>     reset. So only clear the edge-triggered inputs from IRR in that
>     scenario.
> 
>     Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
> 
> Which seems to point out there's a reasoning behind what it's
> currently implemented. This seems to be against the spec as there's no
> low-to-high transition?

Would seem so to me, yes. Knowing how hard it was back at the time
to find any doc on the 8259A at all, and knowing that the doc that
initially became available wasn't very complete, I could very well
imagine the present doc still being incomplete. Let me see if I
can get something out of Intel.

>>> @@ -217,6 +219,24 @@ static void vpic_ioport_write(
>>>              }
>>>  
>>>              vpic->init_state = ((val & 3) << 2) | 1;
>>> +            vpic_update_int_output(vpic);
>>> +            vpic_unlock(vpic);
>>> +
>>> +            /*
>>> +             * Forward the EOI of any pending or in service interrupt that has
>>> +             * been cleared from IRR or ISR, or else the dpci logic will get
>>> +             * out of sync with the state of the interrupt controller.
>>> +             */
>>> +            while ( pending )
>>> +            {
>>> +                unsigned int pin = __scanbit(pending, 8);
>>> +
>>> +                ASSERT(pin < 8);
>>> +                hvm_dpci_eoi(current->domain,
>>> +                             hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : pin));
>>> +                __clear_bit(pin, &pending);
>>> +            }
>>> +            goto unmask;
>>
>> A similar consideration applies here (albeit just as an
>> observation, as being orthogonal to your change): A PIC
>> becomes ready for handling interrupts only at the end of the
>> ICWx sequence. Hence it would seem to me that invoking
>> pt_may_unmask_irq()
> 
> Right, it might be best to force unmask = 1 when init_state gets set
> to 0. A spurious call to pt_may_unmask_irq won't be harmful anyway,
> even if no pins have been actually unmasked.

Right, that's what I was considering too. (Really I was
thinking to propagate the "unmasked" value from ICW1
processing to when init_state gets cleared, but you're
right about a stray call to pt_may_unmask_irq() not doing
any harm.)

>> (maybe also vpic_update_int_output()) at
>> ICW1 is premature. To me this seems particularly relevant
>> considering that ICW1 clears IMR, and hence an interrupt
>> becoming pending between ICW1 and ICW2 wouldn't know which
>> vector to use.
>>
>> Or maybe on that side of things, vpic_update_int_output()
>> should really do
>>
>>     if ( vpic->int_output == (!vpic->init_state && irq >= 0) )
>>         return;
>>
>>     /* INT line transition L->H or H->L. */
>>     vpic->int_output = !vpic->init_state && !vpic->int_output;
>>
>> ?
> 
> So to force int_output = false when in init state. Seems reasonable, I
> can implement this either as a pre-patch or a followup, but again I'm
> not sure I have means to properly test it.

That testing consideration goes as well for the correctness
of what we have right now. It functions for the very limited
set of modes typical OSes use, and likely nobody really
knows whether it would also function correctly when used in
"exotic" ways.

As to pre-patch or follow-up: For this latter aspect it
probably doesn't matter much. If any adjustment is made
affecting the code further up, it may want to be a pre-patch
though, as it may change how you want to exit (e.g. I'd
expect the "goto" to possibly go away).

Jan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/4] x86/dpci: remove the dpci EOI timer
  2021-01-15 14:28 ` [PATCH v2 4/4] x86/dpci: remove the dpci EOI timer Roger Pau Monne
  2021-01-15 14:56   ` Roger Pau Monné
@ 2021-01-22 11:37   ` Jan Beulich
  1 sibling, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2021-01-22 11:37 UTC (permalink / raw)
  To: Roger Pau Monne, Ian Jackson
  Cc: Kevin Tian, Paul Durrant, Andrew Cooper, Wei Liu, xen-devel

On 15.01.2021 15:28, Roger Pau Monne wrote:
> Current interrupt pass though code will setup a timer for each
> interrupt injected to the guest that requires an EOI from the guest.
> Such timer would perform two actions if the guest doesn't EOI the
> interrupt before a given period of time. The first one is deasserting
> the virtual line, the second is perform an EOI of the physical
> interrupt source if it requires such.
> 
> The deasserting of the guest virtual line is wrong, since it messes
> with the interrupt status of the guest. This seems to have been done
> in order to compensate for missing deasserts when certain interrupt
> controller actions are performed. The original motivation of the
> introduction of the timer was to fix issues when a GSI was shared
> between different guests. We believe that other changes in the
> interrupt handling code (ie: proper propagation of EOI related actions
> to dpci) will have fixed such errors now.
> 
> Performing an EOI of the physical interrupt source is redundant, since
> there's already a timer that takes care of this for all interrupts,
> not just the HVM dpci ones, see irq_guest_action_t struct eoi_timer
> field.
> 
> Since both of the actions performed by the dpci timer are not
> required, remove it altogether.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with ...

> Changes since v1:
>  - Add parentheses.

... this, as you've already clarified, indeed having happened.
I understand this change is independent of the earlier ones in
this series. The minor adjustment could be taken care of while
committing, and I'm inclined to not wait ...

>  xen/drivers/passthrough/vtd/x86/hvm.c |  3 -

... for Kevin's ack on this obvious part of the change,
considering his prior input on the discussion we've had (where
he did signal agreement with the removal).

However, despite this series having been posted before the
deadline, it feels to me like a change that doesn't come
without risk (and hence would have been better to have earlier
in the release cycle). Hence, Ian, I'd like to gather your
release manager opinion on taking this now vs postponing for
4.16 (and then still being a likely backporting candidate). My
vote is "pro", in case it matters.

Jan


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-01-22 11:37 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-15 14:28 [PATCH v2 0/4] x86/intr: guest interrupt handling fixes/cleanup Roger Pau Monne
2021-01-15 14:28 ` [PATCH v2 1/4] x86/vioapic: check IRR before attempting to inject interrupt after EOI Roger Pau Monne
2021-01-21 16:03   ` Jan Beulich
2021-01-21 17:27     ` Roger Pau Monné
2021-01-15 14:28 ` [PATCH v2 2/4] x86/vioapic: issue EOI to dpci when switching pin to edge trigger mode Roger Pau Monne
2021-01-21 16:23   ` Jan Beulich
2021-01-21 17:45     ` Roger Pau Monné
2021-01-22  9:13       ` Jan Beulich
2021-01-15 14:28 ` [PATCH v2 3/4] x86/vpic: issue dpci EOI for cleared pins at ICW1 Roger Pau Monne
2021-01-22  9:02   ` Jan Beulich
2021-01-22  9:53     ` Roger Pau Monné
2021-01-22 10:06       ` Jan Beulich
2021-01-15 14:28 ` [PATCH v2 4/4] x86/dpci: remove the dpci EOI timer Roger Pau Monne
2021-01-15 14:56   ` Roger Pau Monné
2021-01-22 11:37   ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.