xen-devel.lists.xenproject.org archive mirror
* [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT
@ 2020-09-30 10:40 Roger Pau Monne
  2020-09-30 10:40 ` [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks Roger Pau Monne
                   ` (10 more replies)
  0 siblings, 11 replies; 40+ messages in thread
From: Roger Pau Monne @ 2020-09-30 10:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu,
	Paul Durrant, Jun Nakajima, Kevin Tian

Hello,

The following series introduces EOI callbacks and switches a number of
subsystems to use them. The first ones are vMSI and dpci, which are quite
straightforward to switch since they used to open-code hooks in the EOI
handlers.

Finally, HVM virtual timers are also switched to a different model where
EOI callbacks are used instead of checking at every return to guest
whether a timer interrupt is being injected. Note that this change also
fixes a bug in virtual periodic timers that prevented injecting to a vCPU
different from the one the timer is assigned to (and that would
currently match the i8259 target).

Those changes are paired together so that practical applications of
having EOI callbacks can be seen in action.
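
Purely as an illustration of the model (none of this is Xen code; every name
below is invented for the sketch), the core idea of an EOI callback mechanism
can be captured in a few lines of plain C: a per-vector table where a
subsystem registers a one-shot callback that runs when the guest acknowledges
the interrupt:

```c
/* Stand-alone model of a per-vector EOI callback table; illustrative only,
 * names and table layout do not match the actual Xen implementation. */
#include <assert.h>
#include <stddef.h>

#define NR_VECTORS 256

typedef void eoi_callback_t(unsigned int vector, void *data);

static struct {
    eoi_callback_t *callback;
    void *data;
} callbacks[NR_VECTORS];

/* Register a callback to run once when the guest EOIs 'vector'. */
static void set_eoi_callback(unsigned int vector, eoi_callback_t *cb,
                             void *data)
{
    callbacks[vector].callback = cb;
    callbacks[vector].data = data;
}

/* Called on guest EOI: consume the registered callback and run it. */
static void handle_eoi(unsigned int vector)
{
    eoi_callback_t *cb = callbacks[vector].callback;
    void *data = callbacks[vector].data;

    callbacks[vector].callback = NULL; /* one-shot, re-armed on next injection */
    if ( cb )
        cb(vector, data);
}

static unsigned int fired;

static void count_eoi(unsigned int vector, void *data)
{
    (void)vector;
    (void)data;
    fired++;
}
```

Registering an action per injected vector is what removes the need to poll
every possible EOI consumer on each acknowledgment, which in turn enables the
vPT rework in the later patches.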

Note that further cleanup can be done, but I think the series is already
big enough and could benefit from some review in order to know whether
the direction taken is acceptable. Not posting as RFC since part of the
series has already been posted before, but some patches could be
considered RFC.

Thanks, Roger.

Roger Pau Monne (11):
  x86/hvm: drop vcpu parameter from vlapic EOI callbacks
  x86/hvm: drop domain parameter from vioapic/vpic EOI callbacks
  x86/vlapic: introduce an EOI callback mechanism
  x86/vmsi: use the newly introduced EOI callbacks
  x86/vioapic: switch to use the EOI callback mechanism
  x86/hvm: allow registering EOI callbacks for GSIs
  x86/dpci: move code
  x86/dpci: switch to use a GSI EOI callback
  x86/vpt: switch interrupt injection model
  x86/vpt: remove vPT timers per-vCPU lists
  x86/vpt: introduce a per-vPT lock

 xen/arch/x86/domain.c             |   2 +-
 xen/arch/x86/emul-i8254.c         |   1 +
 xen/arch/x86/hvm/hpet.c           |   8 +-
 xen/arch/x86/hvm/hvm.c            |  19 +-
 xen/arch/x86/hvm/irq.c            |  60 ++++
 xen/arch/x86/hvm/rtc.c            |   1 +
 xen/arch/x86/hvm/svm/intr.c       |   3 -
 xen/arch/x86/hvm/vioapic.c        | 139 +++++----
 xen/arch/x86/hvm/vlapic.c         |  66 ++++-
 xen/arch/x86/hvm/vmsi.c           |  36 ++-
 xen/arch/x86/hvm/vmx/intr.c       |  59 ----
 xen/arch/x86/hvm/vpic.c           |   8 +-
 xen/arch/x86/hvm/vpt.c            | 465 +++++++++---------------------
 xen/drivers/passthrough/io.c      | 215 ++++++++------
 xen/include/asm-x86/hvm/io.h      |   4 +-
 xen/include/asm-x86/hvm/irq.h     |  21 ++
 xen/include/asm-x86/hvm/vcpu.h    |   3 +-
 xen/include/asm-x86/hvm/vioapic.h |   2 +-
 xen/include/asm-x86/hvm/vlapic.h  |  18 +-
 xen/include/asm-x86/hvm/vpt.h     |  23 +-
 20 files changed, 574 insertions(+), 579 deletions(-)

-- 
2.28.0



^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks
  2020-09-30 10:40 [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT Roger Pau Monne
@ 2020-09-30 10:40 ` Roger Pau Monne
  2020-09-30 11:30   ` Paul Durrant
  2020-10-02  8:48   ` Jan Beulich
  2020-09-30 10:40 ` [PATCH v2 02/11] x86/hvm: drop domain parameter from vioapic/vpic " Roger Pau Monne
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 40+ messages in thread
From: Roger Pau Monne @ 2020-09-30 10:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu,
	Paul Durrant, Paul Durrant

EOIs are always executed in guest vCPU context, so there's no reason to
pass a vCPU parameter around, as it can be fetched from current.

While there, make the vector parameter of both callbacks unsigned int.

No functional change intended.
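
A hedged sketch of the pattern this patch applies (generic C, not Xen's
actual definitions — struct layouts and names here are invented): a context
pointer that is always valid when the handler runs replaces an explicit
parameter:

```c
/* Illustrative model of replacing an explicit vcpu argument with an
 * implicitly available 'current' pointer; EOI handlers always run in the
 * context of the vCPU performing the EOI, so the argument was redundant. */
#include <assert.h>
#include <stddef.h>

struct domain { int domain_id; };
struct vcpu { struct domain *domain; };

/* Stand-in for Xen's per-CPU 'current' macro. */
static struct vcpu *current;

/* Before: the caller passed a struct vcpu * (or struct domain *) down.
 * After: derive it from context at the point of use. */
static int eoi_domain_id(unsigned int vector)
{
    struct domain *d = current->domain;

    (void)vector;
    return d->domain_id;
}
```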

Suggested-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/hvm/vioapic.c        | 5 +++--
 xen/arch/x86/hvm/vlapic.c         | 7 ++-----
 xen/drivers/passthrough/io.c      | 4 +++-
 xen/include/asm-x86/hvm/io.h      | 2 +-
 xen/include/asm-x86/hvm/vioapic.h | 2 +-
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 67d4a6237f..0fb9147d99 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -353,7 +353,7 @@ static int vioapic_write(
 
 #if VIOAPIC_VERSION_ID >= 0x20
     case VIOAPIC_REG_EOI:
-        vioapic_update_EOI(v->domain, val);
+        vioapic_update_EOI(val);
         break;
 #endif
 
@@ -495,8 +495,9 @@ void vioapic_irq_positive_edge(struct domain *d, unsigned int irq)
     }
 }
 
-void vioapic_update_EOI(struct domain *d, u8 vector)
+void vioapic_update_EOI(unsigned int vector)
 {
+    struct domain *d = current->domain;
     struct hvm_irq *hvm_irq = hvm_domain_irq(d);
     union vioapic_redir_entry *ent;
     unsigned int i;
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 4e3861eb7d..ae737403f3 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -459,13 +459,10 @@ void vlapic_EOI_set(struct vlapic *vlapic)
 
 void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
 {
-    struct vcpu *v = vlapic_vcpu(vlapic);
-    struct domain *d = v->domain;
-
     if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
-        vioapic_update_EOI(d, vector);
+        vioapic_update_EOI(vector);
 
-    hvm_dpci_msi_eoi(d, vector);
+    hvm_dpci_msi_eoi(vector);
 }
 
 static bool_t is_multicast_dest(struct vlapic *vlapic, unsigned int short_hand,
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 6b1305a3e5..54f3e7b540 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -874,8 +874,10 @@ static int _hvm_dpci_msi_eoi(struct domain *d,
     return 0;
 }
 
-void hvm_dpci_msi_eoi(struct domain *d, int vector)
+void hvm_dpci_msi_eoi(unsigned int vector)
 {
+    struct domain *d = current->domain;
+
     if ( !is_iommu_enabled(d) ||
          (!hvm_domain_irq(d)->dpci && !is_hardware_domain(d)) )
        return;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 558426b772..adec0f566a 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -159,7 +159,7 @@ struct hvm_hw_stdvga {
 void stdvga_init(struct domain *d);
 void stdvga_deinit(struct domain *d);
 
-extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
+extern void hvm_dpci_msi_eoi(unsigned int vector);
 
 /* Decode a PCI port IO access into a bus/slot/func/reg. */
 unsigned int hvm_pci_decode_addr(unsigned int cf8, unsigned int addr,
diff --git a/xen/include/asm-x86/hvm/vioapic.h b/xen/include/asm-x86/hvm/vioapic.h
index d6f4e12d54..fd602f8830 100644
--- a/xen/include/asm-x86/hvm/vioapic.h
+++ b/xen/include/asm-x86/hvm/vioapic.h
@@ -64,7 +64,7 @@ int vioapic_init(struct domain *d);
 void vioapic_deinit(struct domain *d);
 void vioapic_reset(struct domain *d);
 void vioapic_irq_positive_edge(struct domain *d, unsigned int irq);
-void vioapic_update_EOI(struct domain *d, u8 vector);
+void vioapic_update_EOI(unsigned int vector);
 
 int vioapic_get_mask(const struct domain *d, unsigned int gsi);
 int vioapic_get_vector(const struct domain *d, unsigned int gsi);
-- 
2.28.0




* [PATCH v2 02/11] x86/hvm: drop domain parameter from vioapic/vpic EOI callbacks
  2020-09-30 10:40 [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT Roger Pau Monne
  2020-09-30 10:40 ` [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks Roger Pau Monne
@ 2020-09-30 10:40 ` Roger Pau Monne
  2020-09-30 11:33   ` Paul Durrant
  2020-10-02  9:02   ` Jan Beulich
  2020-09-30 10:41 ` [PATCH v2 03/11] x86/vlapic: introduce an EOI callback mechanism Roger Pau Monne
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 40+ messages in thread
From: Roger Pau Monne @ 2020-09-30 10:40 UTC (permalink / raw)
  To: xen-devel
  Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu, Paul Durrant

EOIs are always executed in guest vCPU context, so there's no reason to
pass a domain parameter around, as it can be fetched from current->domain.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/hvm/vioapic.c   | 2 +-
 xen/arch/x86/hvm/vpic.c      | 3 +--
 xen/drivers/passthrough/io.c | 4 ++--
 xen/include/asm-x86/hvm/io.h | 3 +--
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 0fb9147d99..752fc410db 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -522,7 +522,7 @@ void vioapic_update_EOI(unsigned int vector)
             if ( is_iommu_enabled(d) )
             {
                 spin_unlock(&d->arch.hvm.irq_lock);
-                hvm_dpci_eoi(d, vioapic->base_gsi + pin, ent);
+                hvm_dpci_eoi(vioapic->base_gsi + pin, ent);
                 spin_lock(&d->arch.hvm.irq_lock);
             }
 
diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
index 3cf12581e9..26f74f4471 100644
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -262,8 +262,7 @@ static void vpic_ioport_write(
                 /* Release lock and EOI the physical interrupt (if any). */
                 vpic_update_int_output(vpic);
                 vpic_unlock(vpic);
-                hvm_dpci_eoi(current->domain,
-                             hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : pin),
+                hvm_dpci_eoi(hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : pin),
                              NULL);
                 return; /* bail immediately */
             case 6: /* Set Priority                */
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 54f3e7b540..536e91ad76 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -1003,9 +1003,9 @@ static void hvm_gsi_eoi(struct domain *d, unsigned int gsi,
     hvm_pirq_eoi(pirq, ent);
 }
 
-void hvm_dpci_eoi(struct domain *d, unsigned int guest_gsi,
-                  const union vioapic_redir_entry *ent)
+void hvm_dpci_eoi(unsigned int guest_gsi, const union vioapic_redir_entry *ent)
 {
+    struct domain *d = current->domain;
     const struct hvm_irq_dpci *hvm_irq_dpci;
     const struct hvm_girq_dpci_mapping *girq;
 
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index adec0f566a..b05f619435 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -118,8 +118,7 @@ bool handle_mmio_with_translation(unsigned long gla, unsigned long gpfn,
                                   struct npfec);
 bool handle_pio(uint16_t port, unsigned int size, int dir);
 void hvm_interrupt_post(struct vcpu *v, int vector, int type);
-void hvm_dpci_eoi(struct domain *d, unsigned int guest_irq,
-                  const union vioapic_redir_entry *ent);
+void hvm_dpci_eoi(unsigned int guest_irq, const union vioapic_redir_entry *ent);
 void msix_write_completion(struct vcpu *);
 
 #ifdef CONFIG_HVM
-- 
2.28.0




* [PATCH v2 03/11] x86/vlapic: introduce an EOI callback mechanism
  2020-09-30 10:40 [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT Roger Pau Monne
  2020-09-30 10:40 ` [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks Roger Pau Monne
  2020-09-30 10:40 ` [PATCH v2 02/11] x86/hvm: drop domain parameter from vioapic/vpic " Roger Pau Monne
@ 2020-09-30 10:41 ` Roger Pau Monne
  2020-09-30 11:49   ` Paul Durrant
  2020-10-02  9:39   ` Jan Beulich
  2020-09-30 10:41 ` [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks Roger Pau Monne
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 40+ messages in thread
From: Roger Pau Monne @ 2020-09-30 10:41 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu

Add a new vlapic_set_irq_callback helper in order to inject a vector
and set a callback to be executed when the guest performs the end of
interrupt acknowledgment.

Such functionality will be used to migrate the current ad hoc handling
done in vlapic_handle_EOI for the vectors that require some logic to
be executed when the end of interrupt is performed.

No current users are migrated to use this new functionality yet, so no
functional change is expected as a result.
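
The registration sanity checks from this patch can be modelled stand-alone
(illustrative code, not the actual Xen sources): vectors 0-15 are reserved,
so the callback table is shrunk by 16 entries and indexed by (vec - 16):

```c
/* Sketch of the vlapic_set_callback() bounds checks and table indexing;
 * names are made up for the example. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define X86_NR_VECTORS 256

typedef void eoi_callback_t(unsigned int vector, void *data);

static struct {
    eoi_callback_t *callback;
    void *data;
} callbacks[X86_NR_VECTORS - 16];

static bool set_callback(unsigned int vec, eoi_callback_t *cb, void *data)
{
    if ( !cb || vec < 16 || vec >= X86_NR_VECTORS )
        return false;           /* ASSERT_UNREACHABLE() in the real code */

    callbacks[vec - 16].callback = cb;
    callbacks[vec - 16].data = data;
    return true;
}

static void dummy_cb(unsigned int vector, void *data)
{
    (void)vector;
    (void)data;
}
```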

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Make vlapic_set_irq an inline function on the header.
 - Clear the callback hook in vlapic_handle_EOI.
 - Introduce a helper to set the callback without injecting a vector.
 - Remove unneeded parentheses.
 - Reduce callback table by 16.
 - Use %pv to print domain/vcpu ID.
---
RFC: should callbacks also be executed in vlapic_do_init (which is
called by vlapic_reset)? We would need to make sure ISR and IRR
are cleared using some kind of atomic test-and-clear functionality to
make this race-free.
---
 xen/arch/x86/hvm/vlapic.c        | 62 ++++++++++++++++++++++++++++++--
 xen/include/asm-x86/hvm/vlapic.h | 18 +++++++++-
 2 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index ae737403f3..38c62a02e6 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -144,7 +144,32 @@ bool vlapic_test_irq(const struct vlapic *vlapic, uint8_t vec)
     return vlapic_test_vector(vec, &vlapic->regs->data[APIC_IRR]);
 }
 
-void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
+void vlapic_set_callback(struct vlapic *vlapic, unsigned int vec,
+                         vlapic_eoi_callback_t *callback, void *data)
+{
+    unsigned long flags;
+    unsigned int index = vec - 16;
+
+    if ( !callback || vec < 16 || vec >= X86_NR_VECTORS )
+    {
+        ASSERT_UNREACHABLE();
+        return;
+    }
+
+    spin_lock_irqsave(&vlapic->callback_lock, flags);
+    if ( vlapic->callbacks[index].callback &&
+         vlapic->callbacks[index].callback != callback )
+        printk(XENLOG_G_WARNING
+               "%pv overriding vector %#x callback %ps (%p) with %ps (%p)\n",
+               vlapic_vcpu(vlapic), vec, vlapic->callbacks[index].callback,
+               vlapic->callbacks[index].callback, callback, callback);
+    vlapic->callbacks[index].callback = callback;
+    vlapic->callbacks[index].data = data;
+    spin_unlock_irqrestore(&vlapic->callback_lock, flags);
+}
+
+void vlapic_set_irq_callback(struct vlapic *vlapic, uint8_t vec, uint8_t trig,
+                             vlapic_eoi_callback_t *callback, void *data)
 {
     struct vcpu *target = vlapic_vcpu(vlapic);
 
@@ -159,8 +184,12 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
     else
         vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]);
 
+    if ( callback )
+        vlapic_set_callback(vlapic, vec, callback, data);
+
     if ( hvm_funcs.update_eoi_exit_bitmap )
-        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig);
+        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec,
+                          trig || callback);
 
     if ( hvm_funcs.deliver_posted_intr )
         alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec);
@@ -459,10 +488,24 @@ void vlapic_EOI_set(struct vlapic *vlapic)
 
 void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
 {
+    vlapic_eoi_callback_t *callback;
+    void *data;
+    unsigned long flags;
+    unsigned int index = vector - 16;
+
     if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
         vioapic_update_EOI(vector);
 
     hvm_dpci_msi_eoi(vector);
+
+    spin_lock_irqsave(&vlapic->callback_lock, flags);
+    callback = vlapic->callbacks[index].callback;
+    vlapic->callbacks[index].callback = NULL;
+    data = vlapic->callbacks[index].data;
+    spin_unlock_irqrestore(&vlapic->callback_lock, flags);
+
+    if ( callback )
+        callback(vector, data);
 }
 
 static bool_t is_multicast_dest(struct vlapic *vlapic, unsigned int short_hand,
@@ -1629,9 +1672,23 @@ int vlapic_init(struct vcpu *v)
     }
     clear_page(vlapic->regs);
 
+    if ( !vlapic->callbacks )
+    {
+        vlapic->callbacks = xmalloc_array(typeof(*vlapic->callbacks),
+                                          X86_NR_VECTORS - 16);
+        if ( !vlapic->callbacks )
+        {
+            dprintk(XENLOG_ERR, "%pv: alloc vlapic callbacks error\n", v);
+            return -ENOMEM;
+        }
+    }
+    memset(vlapic->callbacks, 0, sizeof(*vlapic->callbacks) *
+                                 (X86_NR_VECTORS - 16));
+
     vlapic_reset(vlapic);
 
     spin_lock_init(&vlapic->esr_lock);
+    spin_lock_init(&vlapic->callback_lock);
 
     tasklet_init(&vlapic->init_sipi.tasklet, vlapic_init_sipi_action, v);
 
@@ -1653,6 +1710,7 @@ void vlapic_destroy(struct vcpu *v)
     destroy_periodic_time(&vlapic->pt);
     unmap_domain_page_global(vlapic->regs);
     free_domheap_page(vlapic->regs_page);
+    XFREE(vlapic->callbacks);
 }
 
 /*
diff --git a/xen/include/asm-x86/hvm/vlapic.h b/xen/include/asm-x86/hvm/vlapic.h
index 8f908928c3..c380127a71 100644
--- a/xen/include/asm-x86/hvm/vlapic.h
+++ b/xen/include/asm-x86/hvm/vlapic.h
@@ -73,6 +73,8 @@
 #define vlapic_clear_vector(vec, bitmap)                                \
     clear_bit(VEC_POS(vec), (uint32_t *)((bitmap) + REG_POS(vec)))
 
+typedef void vlapic_eoi_callback_t(unsigned int vector, void *data);
+
 struct vlapic {
     struct hvm_hw_lapic      hw;
     struct hvm_hw_lapic_regs *regs;
@@ -89,6 +91,11 @@ struct vlapic {
         uint32_t             icr, dest;
         struct tasklet       tasklet;
     } init_sipi;
+    struct {
+        vlapic_eoi_callback_t *callback;
+        void                 *data;
+    } *callbacks;
+    spinlock_t               callback_lock;
 };
 
 /* vlapic's frequence is 100 MHz */
@@ -111,7 +118,16 @@ void vlapic_reg_write(struct vcpu *v, unsigned int reg, uint32_t val);
 bool_t is_vlapic_lvtpc_enabled(struct vlapic *vlapic);
 
 bool vlapic_test_irq(const struct vlapic *vlapic, uint8_t vec);
-void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig);
+void vlapic_set_callback(struct vlapic *vlapic, unsigned int vec,
+                         vlapic_eoi_callback_t *callback, void *data);
+void vlapic_set_irq_callback(struct vlapic *vlapic, uint8_t vec, uint8_t trig,
+                             vlapic_eoi_callback_t *callback, void *data);
+
+static inline void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec,
+                                  uint8_t trig)
+{
+    vlapic_set_irq_callback(vlapic, vec, trig, NULL, NULL);
+}
 
 int vlapic_has_pending_irq(struct vcpu *v);
 int vlapic_ack_pending_irq(struct vcpu *v, int vector, bool_t force_ack);
-- 
2.28.0




* [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks
  2020-09-30 10:40 [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT Roger Pau Monne
                   ` (2 preceding siblings ...)
  2020-09-30 10:41 ` [PATCH v2 03/11] x86/vlapic: introduce an EOI callback mechanism Roger Pau Monne
@ 2020-09-30 10:41 ` Roger Pau Monne
  2020-09-30 11:57   ` Paul Durrant
  2020-10-02 15:25   ` Jan Beulich
  2020-09-30 10:41 ` [PATCH v2 05/11] x86/vioapic: switch to use the EOI callback mechanism Roger Pau Monne
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 40+ messages in thread
From: Roger Pau Monne @ 2020-09-30 10:41 UTC (permalink / raw)
  To: xen-devel
  Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu, Paul Durrant

Remove the unconditional call to hvm_dpci_msi_eoi in vlapic_handle_EOI
and instead use the newly introduced EOI callback mechanism in order
to register a callback for MSI vectors injected from passed-through
devices.
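
The refactoring pattern used here can be sketched generically (these are not
Xen's signatures; all names are invented for the illustration): the existing
entry point becomes a thin wrapper around a new _callback variant, so current
callers are unaffected while passthrough MSI can attach its EOI action:

```c
/* Generic sketch of the wrapper pattern: the original entry point passes a
 * NULL callback, while callers needing an EOI action use the extended one. */
#include <assert.h>
#include <stddef.h>

typedef void eoi_callback_t(unsigned int vector, void *data);

static eoi_callback_t *registered;

static int deliver_callback(unsigned int vector, eoi_callback_t *cb,
                            void *data)
{
    (void)data;
    registered = cb;      /* stands in for the callback registration */
    return (int)vector;
}

/* Existing interface, behaviour unchanged: no EOI callback. */
static int deliver(unsigned int vector)
{
    return deliver_callback(vector, NULL, NULL);
}

static void msi_eoi(unsigned int vector, void *data)
{
    (void)vector;
    (void)data;
}
```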

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/vlapic.c    |  2 --
 xen/arch/x86/hvm/vmsi.c      | 36 ++++++++++++++++++++++--------------
 xen/drivers/passthrough/io.c |  2 +-
 xen/include/asm-x86/hvm/io.h |  2 +-
 4 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 38c62a02e6..8a18b33428 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -496,8 +496,6 @@ void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
     if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
         vioapic_update_EOI(vector);
 
-    hvm_dpci_msi_eoi(vector);
-
     spin_lock_irqsave(&vlapic->callback_lock, flags);
     callback = vlapic->callbacks[index].callback;
     vlapic->callbacks[index].callback = NULL;
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 7ca19353ab..e192c4c6da 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -44,11 +44,9 @@
 #include <asm/event.h>
 #include <asm/io_apic.h>
 
-static void vmsi_inj_irq(
-    struct vlapic *target,
-    uint8_t vector,
-    uint8_t trig_mode,
-    uint8_t delivery_mode)
+static void vmsi_inj_irq(struct vlapic *target, uint8_t vector,
+                         uint8_t trig_mode, uint8_t delivery_mode,
+                         vlapic_eoi_callback_t *callback, void *data)
 {
     HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "vmsi_inj_irq: vec %02x trig %d dm %d\n",
                 vector, trig_mode, delivery_mode);
@@ -57,17 +55,17 @@ static void vmsi_inj_irq(
     {
     case dest_Fixed:
     case dest_LowestPrio:
-        vlapic_set_irq(target, vector, trig_mode);
+        vlapic_set_irq_callback(target, vector, trig_mode, callback, data);
         break;
     default:
         BUG();
     }
 }
 
-int vmsi_deliver(
-    struct domain *d, int vector,
-    uint8_t dest, uint8_t dest_mode,
-    uint8_t delivery_mode, uint8_t trig_mode)
+static int vmsi_deliver_callback(struct domain *d, int vector, uint8_t dest,
+                                 uint8_t dest_mode, uint8_t delivery_mode,
+                                 uint8_t trig_mode,
+                                 vlapic_eoi_callback_t *callback, void *data)
 {
     struct vlapic *target;
     struct vcpu *v;
@@ -78,7 +76,8 @@ int vmsi_deliver(
         target = vlapic_lowest_prio(d, NULL, 0, dest, dest_mode);
         if ( target != NULL )
         {
-            vmsi_inj_irq(target, vector, trig_mode, delivery_mode);
+            vmsi_inj_irq(target, vector, trig_mode, delivery_mode, callback,
+                         data);
             break;
         }
         HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "null MSI round robin: vector=%02x\n",
@@ -89,8 +88,8 @@ int vmsi_deliver(
         for_each_vcpu ( d, v )
             if ( vlapic_match_dest(vcpu_vlapic(v), NULL,
                                    0, dest, dest_mode) )
-                vmsi_inj_irq(vcpu_vlapic(v), vector,
-                             trig_mode, delivery_mode);
+                vmsi_inj_irq(vcpu_vlapic(v), vector, trig_mode, delivery_mode,
+                             callback, data);
         break;
 
     default:
@@ -103,6 +102,14 @@ int vmsi_deliver(
     return 0;
 }
 
+
+int vmsi_deliver(struct domain *d, int vector, uint8_t dest, uint8_t dest_mode,
+                 uint8_t delivery_mode, uint8_t trig_mode)
+{
+    return vmsi_deliver_callback(d, vector, dest, dest_mode, delivery_mode,
+                                 trig_mode, NULL, NULL);
+}
+
 void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
 {
     uint32_t flags = pirq_dpci->gmsi.gflags;
@@ -119,7 +126,8 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
 
     ASSERT(pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI);
 
-    vmsi_deliver(d, vector, dest, dest_mode, delivery_mode, trig_mode);
+    vmsi_deliver_callback(d, vector, dest, dest_mode, delivery_mode, trig_mode,
+                          hvm_dpci_msi_eoi, NULL);
 }
 
 /* Return value, -1 : multi-dests, non-negative value: dest_vcpu_id */
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 536e91ad76..bff0f6628a 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -874,7 +874,7 @@ static int _hvm_dpci_msi_eoi(struct domain *d,
     return 0;
 }
 
-void hvm_dpci_msi_eoi(unsigned int vector)
+void hvm_dpci_msi_eoi(unsigned int vector, void *data)
 {
     struct domain *d = current->domain;
 
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index b05f619435..759ee486af 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -158,7 +158,7 @@ struct hvm_hw_stdvga {
 void stdvga_init(struct domain *d);
 void stdvga_deinit(struct domain *d);
 
-extern void hvm_dpci_msi_eoi(unsigned int vector);
+void hvm_dpci_msi_eoi(unsigned int vector, void *data);
 
 /* Decode a PCI port IO access into a bus/slot/func/reg. */
 unsigned int hvm_pci_decode_addr(unsigned int cf8, unsigned int addr,
-- 
2.28.0




* [PATCH v2 05/11] x86/vioapic: switch to use the EOI callback mechanism
  2020-09-30 10:40 [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT Roger Pau Monne
                   ` (3 preceding siblings ...)
  2020-09-30 10:41 ` [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks Roger Pau Monne
@ 2020-09-30 10:41 ` Roger Pau Monne
  2020-09-30 12:09   ` Paul Durrant
  2020-10-22 16:12   ` Jan Beulich
  2020-09-30 10:41 ` [PATCH v2 06/11] x86/hvm: allow registering EOI callbacks for GSIs Roger Pau Monne
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 40+ messages in thread
From: Roger Pau Monne @ 2020-09-30 10:41 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu

Switch the emulated IO-APIC code to use the local APIC EOI callback
mechanism. This allows removing the last hardcoded callback from
vlapic_handle_EOI. Removing the hardcoded vIO-APIC callback also
allows getting rid of setting the EOI exit bitmap based on the
trigger mode, as now all users that require an EOI action use the
newly introduced callback mechanism.

Move and rename vioapic_update_EOI now that it can be made static.
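
The behaviour the relocated callback implements can be modelled stand-alone
(a toy, with invented names and a single redirection entry rather than Xen's
per-domain vIO-APIC state): on EOI, clear remote_irr for the matching entry,
then re-deliver a still-asserted, unmasked level-triggered line:

```c
/* Toy model of the vIO-APIC EOI callback after this patch. */
#include <assert.h>
#include <stdbool.h>

struct redir_entry {
    unsigned int vector;
    bool level;
    bool masked;
    bool remote_irr;
};

static unsigned int deliveries;

static void vioapic_eoi(struct redir_entry *ent, unsigned int vector,
                        bool line_asserted)
{
    if ( ent->vector != vector )
        return;

    ent->remote_irr = false;

    if ( ent->level && !ent->masked && line_asserted )
    {
        ent->remote_irr = true;   /* vioapic_deliver() in the real code */
        deliveries++;
    }
}
```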

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Remove the triggering check in the update_eoi_exit_bitmap call.
 - Register the vlapic callbacks when loading the vIO-APIC state.
 - Reduce scope of ent.
---
 xen/arch/x86/hvm/vioapic.c | 131 ++++++++++++++++++++++++-------------
 xen/arch/x86/hvm/vlapic.c  |   5 +-
 2 files changed, 86 insertions(+), 50 deletions(-)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 752fc410db..dce98b1479 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -375,6 +375,50 @@ static const struct hvm_mmio_ops vioapic_mmio_ops = {
     .write = vioapic_write
 };
 
+static void eoi_callback(unsigned int vector, void *data)
+{
+    struct domain *d = current->domain;
+    struct hvm_irq *hvm_irq = hvm_domain_irq(d);
+    unsigned int i;
+
+    ASSERT(has_vioapic(d));
+
+    spin_lock(&d->arch.hvm.irq_lock);
+
+    for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
+    {
+        struct hvm_vioapic *vioapic = domain_vioapic(d, i);
+        unsigned int pin;
+
+        for ( pin = 0; pin < vioapic->nr_pins; pin++ )
+        {
+            union vioapic_redir_entry *ent = &vioapic->redirtbl[pin];
+
+            if ( ent->fields.vector != vector )
+                continue;
+
+            ent->fields.remote_irr = 0;
+
+            if ( is_iommu_enabled(d) )
+            {
+                spin_unlock(&d->arch.hvm.irq_lock);
+                hvm_dpci_eoi(vioapic->base_gsi + pin, ent);
+                spin_lock(&d->arch.hvm.irq_lock);
+            }
+
+            if ( (ent->fields.trig_mode == VIOAPIC_LEVEL_TRIG) &&
+                 !ent->fields.mask &&
+                 hvm_irq->gsi_assert_count[vioapic->base_gsi + pin] )
+            {
+                ent->fields.remote_irr = 1;
+                vioapic_deliver(vioapic, pin);
+            }
+        }
+    }
+
+    spin_unlock(&d->arch.hvm.irq_lock);
+}
+
 static void ioapic_inj_irq(
     struct hvm_vioapic *vioapic,
     struct vlapic *target,
@@ -388,7 +432,8 @@ static void ioapic_inj_irq(
     ASSERT((delivery_mode == dest_Fixed) ||
            (delivery_mode == dest_LowestPrio));
 
-    vlapic_set_irq(target, vector, trig_mode);
+    vlapic_set_irq_callback(target, vector, trig_mode,
+                            trig_mode ? eoi_callback : NULL, NULL);
 }
 
 static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
@@ -495,50 +540,6 @@ void vioapic_irq_positive_edge(struct domain *d, unsigned int irq)
     }
 }
 
-void vioapic_update_EOI(unsigned int vector)
-{
-    struct domain *d = current->domain;
-    struct hvm_irq *hvm_irq = hvm_domain_irq(d);
-    union vioapic_redir_entry *ent;
-    unsigned int i;
-
-    ASSERT(has_vioapic(d));
-
-    spin_lock(&d->arch.hvm.irq_lock);
-
-    for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
-    {
-        struct hvm_vioapic *vioapic = domain_vioapic(d, i);
-        unsigned int pin;
-
-        for ( pin = 0; pin < vioapic->nr_pins; pin++ )
-        {
-            ent = &vioapic->redirtbl[pin];
-            if ( ent->fields.vector != vector )
-                continue;
-
-            ent->fields.remote_irr = 0;
-
-            if ( is_iommu_enabled(d) )
-            {
-                spin_unlock(&d->arch.hvm.irq_lock);
-                hvm_dpci_eoi(vioapic->base_gsi + pin, ent);
-                spin_lock(&d->arch.hvm.irq_lock);
-            }
-
-            if ( (ent->fields.trig_mode == VIOAPIC_LEVEL_TRIG) &&
-                 !ent->fields.mask &&
-                 hvm_irq->gsi_assert_count[vioapic->base_gsi + pin] )
-            {
-                ent->fields.remote_irr = 1;
-                vioapic_deliver(vioapic, pin);
-            }
-        }
-    }
-
-    spin_unlock(&d->arch.hvm.irq_lock);
-}
-
 int vioapic_get_mask(const struct domain *d, unsigned int gsi)
 {
     unsigned int pin = 0; /* See gsi_vioapic */
@@ -592,6 +593,8 @@ static int ioapic_save(struct vcpu *v, hvm_domain_context_t *h)
 static int ioapic_load(struct domain *d, hvm_domain_context_t *h)
 {
     struct hvm_vioapic *s;
+    unsigned int i;
+    int rc;
 
     if ( !has_vioapic(d) )
         return -ENODEV;
@@ -602,7 +605,43 @@ static int ioapic_load(struct domain *d, hvm_domain_context_t *h)
          d->arch.hvm.nr_vioapics != 1 )
         return -EOPNOTSUPP;
 
-    return hvm_load_entry(IOAPIC, h, &s->domU);
+    rc = hvm_load_entry(IOAPIC, h, &s->domU);
+    if ( rc )
+        return rc;
+
+    for ( i = 0; i < ARRAY_SIZE(s->domU.redirtbl); i++ )
+    {
+        const union vioapic_redir_entry *ent = &s->domU.redirtbl[i];
+        unsigned int vector = ent->fields.vector;
+        unsigned int delivery_mode = ent->fields.delivery_mode;
+        struct vcpu *v;
+
+        /*
+         * Add a callback for each possible vector injected by a redirection
+         * entry.
+         */
+        if ( vector < 16 || !ent->fields.remote_irr ||
+             (delivery_mode != dest_LowestPrio && delivery_mode != dest_Fixed) )
+            continue;
+
+        for_each_vcpu ( d, v )
+        {
+            struct vlapic *vlapic = vcpu_vlapic(v);
+
+            /*
+             * NB: if the vlapic registers were restored before the vio-apic
+             * ones we could test whether the vector is set in the vlapic IRR
+             * or ISR registers before unconditionally setting the callback.
+             * This is harmless as eoi_callback is capable of dealing with
+             * spurious callbacks.
+             */
+            if ( vlapic_match_dest(vlapic, NULL, 0, ent->fields.dest_id,
+                                   ent->fields.dest_mode) )
+                vlapic_set_callback(vlapic, vector, eoi_callback, NULL);
+        }
+    }
+
+    return 0;
 }
 
 HVM_REGISTER_SAVE_RESTORE(IOAPIC, ioapic_save, ioapic_load, 1, HVMSR_PER_DOM);
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 8a18b33428..35f12e0909 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -189,7 +189,7 @@ void vlapic_set_irq_callback(struct vlapic *vlapic, uint8_t vec, uint8_t trig,
 
     if ( hvm_funcs.update_eoi_exit_bitmap )
         alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec,
-                          trig || callback);
+                          callback);
 
     if ( hvm_funcs.deliver_posted_intr )
         alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec);
@@ -493,9 +493,6 @@ void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
     unsigned long flags;
     unsigned int index = vector - 16;
 
-    if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
-        vioapic_update_EOI(vector);
-
     spin_lock_irqsave(&vlapic->callback_lock, flags);
     callback = vlapic->callbacks[index].callback;
     vlapic->callbacks[index].callback = NULL;
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 06/11] x86/hvm: allowing registering EOI callbacks for GSIs
  2020-09-30 10:40 [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT Roger Pau Monne
                   ` (4 preceding siblings ...)
  2020-09-30 10:41 ` [PATCH v2 05/11] x86/vioapic: switch to use the EOI callback mechanism Roger Pau Monne
@ 2020-09-30 10:41 ` Roger Pau Monne
  2020-10-23 12:29   ` Jan Beulich
  2020-09-30 10:41 ` [PATCH v2 07/11] x86/dpci: move code Roger Pau Monne
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 40+ messages in thread
From: Roger Pau Monne @ 2020-09-30 10:41 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu

Such callbacks will be executed once an EOI is performed by the guest,
regardless of whether the interrupts are injected from the vIO-APIC or
the vPIC, as ISA IRQs are translated to GSIs and then the
corresponding callback is executed at EOI.

The vIO-APIC infrastructure for handling EOIs is built on top of the
existing vlapic EOI callback functionality, while the vPIC one is
handled when writing to the vPIC EOI register.

Note that such callbacks need to be registered and de-registered, and
that a single GSI can have multiple callbacks associated with it.
That's because GSIs can be level triggered and shared, as is the case
with legacy PCI interrupts shared between several devices.
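As a rough illustration of the registration model described above, here
is a minimal self-contained user-space sketch. The names, types and the
singly-linked list are simplified stand-ins, not the Xen API, and
locking is omitted:

```c
/*
 * Minimal sketch of a per-GSI EOI callback list; illustrative
 * stand-ins only, not the Xen interface.
 */
#include <assert.h>
#include <stddef.h>

#define NR_GSIS 4

typedef void gsi_eoi_cb_t(unsigned int gsi, void *data);

struct gsi_eoi_callback {
    gsi_eoi_cb_t *callback;
    void *data;
    struct gsi_eoi_callback *next; /* one singly-linked list per GSI */
};

static struct gsi_eoi_callback *gsi_callbacks[NR_GSIS];

/* Register a callback for a GSI; multiple callbacks per GSI are allowed. */
static int gsi_register_callback(unsigned int gsi, struct gsi_eoi_callback *cb)
{
    if ( gsi >= NR_GSIS )
        return -1;
    cb->next = gsi_callbacks[gsi];
    gsi_callbacks[gsi] = cb;
    return 0;
}

/* Execute every callback registered for a GSI, as would happen at EOI. */
static void gsi_execute_callbacks(unsigned int gsi, void *data)
{
    struct gsi_eoi_callback *cb;

    for ( cb = gsi_callbacks[gsi]; cb; cb = cb->next )
        cb->callback(gsi, cb->data ? cb->data : data);
}

static unsigned int eoi_count;

static void count_eoi(unsigned int gsi, void *data)
{
    (void)gsi;
    (void)data;
    eoi_count++;
}
```

A shared level-triggered line simply ends up with several entries on
the same per-GSI list, each executed at every EOI of that GSI.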

Strictly speaking this is a non-functional change, since no users of
the new interface are introduced here.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/hvm/hvm.c        | 15 ++++++++-
 xen/arch/x86/hvm/irq.c        | 60 +++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/vioapic.c    | 22 +++++++++----
 xen/arch/x86/hvm/vpic.c       |  5 +++
 xen/include/asm-x86/hvm/irq.h | 20 ++++++++++++
 5 files changed, 114 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 2dfda93e09..9636ac6bf1 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -608,7 +608,7 @@ static int hvm_print_line(
 
 int hvm_domain_initialise(struct domain *d)
 {
-    unsigned int nr_gsis;
+    unsigned int nr_gsis, i;
     int rc;
 
     if ( !hvm_enabled )
@@ -655,6 +655,14 @@ int hvm_domain_initialise(struct domain *d)
     BUILD_BUG_ON(NR_HVM_DOMU_IRQS < NR_ISAIRQS);
     ASSERT(hvm_domain_irq(d)->nr_gsis >= NR_ISAIRQS);
 
+    /* Initialize the EOI callback list. */
+    hvm_domain_irq(d)->gsi_callbacks = xmalloc_array(struct list_head, nr_gsis);
+    if ( !hvm_domain_irq(d)->gsi_callbacks )
+        goto fail1;
+    rwlock_init(&hvm_domain_irq(d)->gsi_callbacks_lock);
+    for ( i = 0; i < nr_gsis; i++ )
+        INIT_LIST_HEAD(&hvm_domain_irq(d)->gsi_callbacks[i]);
+
     /* need link to containing domain */
     d->arch.hvm.pl_time->domain = d;
 
@@ -714,6 +722,8 @@ int hvm_domain_initialise(struct domain *d)
  fail1:
     if ( is_hardware_domain(d) )
         xfree(d->arch.hvm.io_bitmap);
+    if ( hvm_domain_irq(d) )
+        XFREE(hvm_domain_irq(d)->gsi_callbacks);
     XFREE(d->arch.hvm.io_handler);
     XFREE(d->arch.hvm.params);
     XFREE(d->arch.hvm.pl_time);
@@ -776,6 +786,9 @@ void hvm_domain_destroy(struct domain *d)
     vioapic_deinit(d);
 
     XFREE(d->arch.hvm.pl_time);
+
+    if ( hvm_domain_irq(d) )
+        XFREE(hvm_domain_irq(d)->gsi_callbacks);
     XFREE(d->arch.hvm.irq);
 
     list_for_each_safe ( ioport_list, tmp, &d->arch.hvm.g2m_ioport_list )
diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
index 38ac5fb6c7..aedd390163 100644
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -595,6 +595,66 @@ int hvm_local_events_need_delivery(struct vcpu *v)
     return !hvm_interrupt_blocked(v, intack);
 }
 
+int hvm_gsi_register_callback(struct domain *d, unsigned int gsi,
+                              struct hvm_gsi_eoi_callback *cb)
+{
+    if ( gsi >= hvm_domain_irq(d)->nr_gsis )
+    {
+        ASSERT_UNREACHABLE();
+        return -EINVAL;
+    }
+
+    write_lock(&hvm_domain_irq(d)->gsi_callbacks_lock);
+    list_add(&cb->list, &hvm_domain_irq(d)->gsi_callbacks[gsi]);
+    write_unlock(&hvm_domain_irq(d)->gsi_callbacks_lock);
+
+    return 0;
+}
+
+void hvm_gsi_unregister_callback(struct domain *d, unsigned int gsi,
+                                 struct hvm_gsi_eoi_callback *cb)
+{
+    struct list_head *tmp;
+
+    if ( gsi >= hvm_domain_irq(d)->nr_gsis )
+    {
+        ASSERT_UNREACHABLE();
+        return;
+    }
+
+    write_lock(&hvm_domain_irq(d)->gsi_callbacks_lock);
+    list_for_each ( tmp, &hvm_domain_irq(d)->gsi_callbacks[gsi] )
+        if ( tmp == &cb->list )
+        {
+            list_del(tmp);
+            break;
+        }
+    write_unlock(&hvm_domain_irq(d)->gsi_callbacks_lock);
+}
+
+void hvm_gsi_execute_callbacks(unsigned int gsi, void *data)
+{
+    struct domain *currd = current->domain;
+    struct hvm_gsi_eoi_callback *cb;
+
+    read_lock(&hvm_domain_irq(currd)->gsi_callbacks_lock);
+    list_for_each_entry ( cb, &hvm_domain_irq(currd)->gsi_callbacks[gsi],
+                          list )
+        cb->callback(gsi, cb->data ?: data);
+    read_unlock(&hvm_domain_irq(currd)->gsi_callbacks_lock);
+}
+
+bool hvm_gsi_has_callbacks(struct domain *d, unsigned int gsi)
+{
+    bool has_callbacks;
+
+    read_lock(&hvm_domain_irq(d)->gsi_callbacks_lock);
+    has_callbacks = !list_empty(&hvm_domain_irq(d)->gsi_callbacks[gsi]);
+    read_unlock(&hvm_domain_irq(d)->gsi_callbacks_lock);
+
+    return has_callbacks;
+}
+
 static void irq_dump(struct domain *d)
 {
     struct hvm_irq *hvm_irq = hvm_domain_irq(d);
diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index dce98b1479..03b1350c04 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -393,6 +393,7 @@ static void eoi_callback(unsigned int vector, void *data)
         for ( pin = 0; pin < vioapic->nr_pins; pin++ )
         {
             union vioapic_redir_entry *ent = &vioapic->redirtbl[pin];
+            unsigned int gsi = vioapic->base_gsi + pin;
 
             if ( ent->fields.vector != vector )
                 continue;
@@ -402,13 +403,17 @@ static void eoi_callback(unsigned int vector, void *data)
             if ( is_iommu_enabled(d) )
             {
                 spin_unlock(&d->arch.hvm.irq_lock);
-                hvm_dpci_eoi(vioapic->base_gsi + pin, ent);
+                hvm_dpci_eoi(gsi, ent);
                 spin_lock(&d->arch.hvm.irq_lock);
             }
 
+            spin_unlock(&d->arch.hvm.irq_lock);
+            hvm_gsi_execute_callbacks(gsi, ent);
+            spin_lock(&d->arch.hvm.irq_lock);
+
             if ( (ent->fields.trig_mode == VIOAPIC_LEVEL_TRIG) &&
                  !ent->fields.mask &&
-                 hvm_irq->gsi_assert_count[vioapic->base_gsi + pin] )
+                 hvm_irq->gsi_assert_count[gsi] )
             {
                 ent->fields.remote_irr = 1;
                 vioapic_deliver(vioapic, pin);
@@ -424,7 +429,8 @@ static void ioapic_inj_irq(
     struct vlapic *target,
     uint8_t vector,
     uint8_t trig_mode,
-    uint8_t delivery_mode)
+    uint8_t delivery_mode,
+    bool callback)
 {
     HVM_DBG_LOG(DBG_LEVEL_IOAPIC, "irq %d trig %d deliv %d",
                 vector, trig_mode, delivery_mode);
@@ -433,7 +439,7 @@ static void ioapic_inj_irq(
            (delivery_mode == dest_LowestPrio));
 
     vlapic_set_irq_callback(target, vector, trig_mode,
-                            trig_mode ? eoi_callback : NULL, NULL);
+                            callback ? eoi_callback : NULL, NULL);
 }
 
 static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
@@ -447,6 +453,7 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
     struct vlapic *target;
     struct vcpu *v;
     unsigned int irq = vioapic->base_gsi + pin;
+    bool callback = trig_mode || hvm_gsi_has_callbacks(d, irq);
 
     ASSERT(spin_is_locked(&d->arch.hvm.irq_lock));
 
@@ -473,7 +480,8 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
             target = vlapic_lowest_prio(d, NULL, 0, dest, dest_mode);
         if ( target != NULL )
         {
-            ioapic_inj_irq(vioapic, target, vector, trig_mode, delivery_mode);
+            ioapic_inj_irq(vioapic, target, vector, trig_mode, delivery_mode,
+                           callback);
         }
         else
         {
@@ -488,7 +496,7 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
         for_each_vcpu ( d, v )
             if ( vlapic_match_dest(vcpu_vlapic(v), NULL, 0, dest, dest_mode) )
                 ioapic_inj_irq(vioapic, vcpu_vlapic(v), vector, trig_mode,
-                               delivery_mode);
+                               delivery_mode, callback);
         break;
 
     case dest_NMI:
@@ -620,7 +628,7 @@ static int ioapic_load(struct domain *d, hvm_domain_context_t *h)
          * Add a callback for each possible vector injected by a redirection
          * entry.
          */
-        if ( vector < 16 || !ent->fields.remote_irr ||
+        if ( vector < 16 ||
              (delivery_mode != dest_LowestPrio && delivery_mode != dest_Fixed) )
             continue;
 
diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
index 26f74f4471..09c937c322 100644
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -262,9 +262,14 @@ static void vpic_ioport_write(
                 /* Release lock and EOI the physical interrupt (if any). */
                 vpic_update_int_output(vpic);
                 vpic_unlock(vpic);
+
+                hvm_gsi_execute_callbacks(
+                        hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : pin),
+                        NULL);
                 hvm_dpci_eoi(hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : pin),
                              NULL);
                 return; /* bail immediately */
+
             case 6: /* Set Priority                */
                 vpic->priority_add = (val + 1) & 7;
                 break;
diff --git a/xen/include/asm-x86/hvm/irq.h b/xen/include/asm-x86/hvm/irq.h
index 532880d497..db38c3e119 100644
--- a/xen/include/asm-x86/hvm/irq.h
+++ b/xen/include/asm-x86/hvm/irq.h
@@ -21,6 +21,7 @@
 #ifndef __ASM_X86_HVM_IRQ_H__
 #define __ASM_X86_HVM_IRQ_H__
 
+#include <xen/rwlock.h>
 #include <xen/timer.h>
 
 #include <asm/hvm/hvm.h>
@@ -79,6 +80,10 @@ struct hvm_irq {
 
     struct hvm_irq_dpci *dpci;
 
+    /* List of callbacks for GSI EOI events. Protected by gsi_callbacks_lock. */
+    struct list_head  *gsi_callbacks;
+    rwlock_t gsi_callbacks_lock;
+
     /*
      * Number of wires asserting each GSI.
      *
@@ -140,6 +145,13 @@ struct hvm_gmsi_info {
     bool posted; /* directly deliver to guest via VT-d PI? */
 };
 
+typedef void hvm_gsi_eoi_callback_t(unsigned int gsi, void *data);
+struct hvm_gsi_eoi_callback {
+    hvm_gsi_eoi_callback_t *callback;
+    void *data;
+    struct list_head list;
+};
+
 struct hvm_girq_dpci_mapping {
     struct list_head list;
     uint8_t bus;
@@ -230,4 +242,12 @@ void hvm_set_callback_via(struct domain *d, uint64_t via);
 struct pirq;
 bool hvm_domain_use_pirq(const struct domain *, const struct pirq *);
 
+int hvm_gsi_register_callback(struct domain *d, unsigned int gsi,
+                              struct hvm_gsi_eoi_callback *cb);
+void hvm_gsi_unregister_callback(struct domain *d, unsigned int gsi,
+                                 struct hvm_gsi_eoi_callback *cb);
+/* data is an opaque blob to pass to the callback if it has no private data. */
+void hvm_gsi_execute_callbacks(unsigned int gsi, void *data);
+bool hvm_gsi_has_callbacks(struct domain *d, unsigned int gsi);
+
 #endif /* __ASM_X86_HVM_IRQ_H__ */
-- 
2.28.0




* [PATCH v2 07/11] x86/dpci: move code
  2020-09-30 10:40 [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT Roger Pau Monne
                   ` (5 preceding siblings ...)
  2020-09-30 10:41 ` [PATCH v2 06/11] x86/hvm: allowing registering EOI callbacks for GSIs Roger Pau Monne
@ 2020-09-30 10:41 ` Roger Pau Monne
  2020-10-23 12:32   ` Jan Beulich
  2020-09-30 10:41 ` [PATCH v2 08/11] x86/dpci: switch to use a GSI EOI callback Roger Pau Monne
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 40+ messages in thread
From: Roger Pau Monne @ 2020-09-30 10:41 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Paul Durrant

This is code movement in order to simplify further changes.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/drivers/passthrough/io.c | 172 +++++++++++++++++------------------
 1 file changed, 86 insertions(+), 86 deletions(-)

diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index bff0f6628a..770a5cce6b 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -276,6 +276,92 @@ static struct vcpu *vector_hashing_dest(const struct domain *d,
     return dest;
 }
 
+static void hvm_pirq_eoi(struct pirq *pirq,
+                         const union vioapic_redir_entry *ent)
+{
+    struct hvm_pirq_dpci *pirq_dpci;
+
+    if ( !pirq )
+    {
+        ASSERT_UNREACHABLE();
+        return;
+    }
+
+    pirq_dpci = pirq_dpci(pirq);
+
+    /*
+     * No need to get vector lock for timer
+     * since interrupt is still not EOIed
+     */
+    if ( --pirq_dpci->pending ||
+         (ent && ent->fields.mask) ||
+         !pt_irq_need_timer(pirq_dpci->flags) )
+        return;
+
+    stop_timer(&pirq_dpci->timer);
+    pirq_guest_eoi(pirq);
+}
+
+static void __hvm_dpci_eoi(struct domain *d,
+                           const struct hvm_girq_dpci_mapping *girq,
+                           const union vioapic_redir_entry *ent)
+{
+    struct pirq *pirq = pirq_info(d, girq->machine_gsi);
+
+    if ( !hvm_domain_use_pirq(d, pirq) )
+        hvm_pci_intx_deassert(d, girq->device, girq->intx);
+
+    hvm_pirq_eoi(pirq, ent);
+}
+
+static void hvm_gsi_eoi(struct domain *d, unsigned int gsi,
+                        const union vioapic_redir_entry *ent)
+{
+    struct pirq *pirq = pirq_info(d, gsi);
+
+    /* Check if GSI is actually mapped. */
+    if ( !pirq_dpci(pirq) )
+        return;
+
+    hvm_gsi_deassert(d, gsi);
+    hvm_pirq_eoi(pirq, ent);
+}
+
+void hvm_dpci_eoi(unsigned int guest_gsi, const union vioapic_redir_entry *ent)
+{
+    struct domain *d = current->domain;
+    const struct hvm_irq_dpci *hvm_irq_dpci;
+    const struct hvm_girq_dpci_mapping *girq;
+
+    if ( !is_iommu_enabled(d) )
+        return;
+
+    if ( is_hardware_domain(d) )
+    {
+        spin_lock(&d->event_lock);
+        hvm_gsi_eoi(d, guest_gsi, ent);
+        goto unlock;
+    }
+
+    if ( guest_gsi < NR_ISAIRQS )
+    {
+        hvm_dpci_isairq_eoi(d, guest_gsi);
+        return;
+    }
+
+    spin_lock(&d->event_lock);
+    hvm_irq_dpci = domain_get_irq_dpci(d);
+
+    if ( !hvm_irq_dpci )
+        goto unlock;
+
+    list_for_each_entry ( girq, &hvm_irq_dpci->girq[guest_gsi], list )
+        __hvm_dpci_eoi(d, girq, ent);
+
+unlock:
+    spin_unlock(&d->event_lock);
+}
+
 int pt_irq_create_bind(
     struct domain *d, const struct xen_domctl_bind_pt_irq *pt_irq_bind)
 {
@@ -952,92 +1038,6 @@ static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci)
     spin_unlock(&d->event_lock);
 }
 
-static void hvm_pirq_eoi(struct pirq *pirq,
-                         const union vioapic_redir_entry *ent)
-{
-    struct hvm_pirq_dpci *pirq_dpci;
-
-    if ( !pirq )
-    {
-        ASSERT_UNREACHABLE();
-        return;
-    }
-
-    pirq_dpci = pirq_dpci(pirq);
-
-    /*
-     * No need to get vector lock for timer
-     * since interrupt is still not EOIed
-     */
-    if ( --pirq_dpci->pending ||
-         (ent && ent->fields.mask) ||
-         !pt_irq_need_timer(pirq_dpci->flags) )
-        return;
-
-    stop_timer(&pirq_dpci->timer);
-    pirq_guest_eoi(pirq);
-}
-
-static void __hvm_dpci_eoi(struct domain *d,
-                           const struct hvm_girq_dpci_mapping *girq,
-                           const union vioapic_redir_entry *ent)
-{
-    struct pirq *pirq = pirq_info(d, girq->machine_gsi);
-
-    if ( !hvm_domain_use_pirq(d, pirq) )
-        hvm_pci_intx_deassert(d, girq->device, girq->intx);
-
-    hvm_pirq_eoi(pirq, ent);
-}
-
-static void hvm_gsi_eoi(struct domain *d, unsigned int gsi,
-                        const union vioapic_redir_entry *ent)
-{
-    struct pirq *pirq = pirq_info(d, gsi);
-
-    /* Check if GSI is actually mapped. */
-    if ( !pirq_dpci(pirq) )
-        return;
-
-    hvm_gsi_deassert(d, gsi);
-    hvm_pirq_eoi(pirq, ent);
-}
-
-void hvm_dpci_eoi(unsigned int guest_gsi, const union vioapic_redir_entry *ent)
-{
-    struct domain *d = current->domain;
-    const struct hvm_irq_dpci *hvm_irq_dpci;
-    const struct hvm_girq_dpci_mapping *girq;
-
-    if ( !is_iommu_enabled(d) )
-        return;
-
-    if ( is_hardware_domain(d) )
-    {
-        spin_lock(&d->event_lock);
-        hvm_gsi_eoi(d, guest_gsi, ent);
-        goto unlock;
-    }
-
-    if ( guest_gsi < NR_ISAIRQS )
-    {
-        hvm_dpci_isairq_eoi(d, guest_gsi);
-        return;
-    }
-
-    spin_lock(&d->event_lock);
-    hvm_irq_dpci = domain_get_irq_dpci(d);
-
-    if ( !hvm_irq_dpci )
-        goto unlock;
-
-    list_for_each_entry ( girq, &hvm_irq_dpci->girq[guest_gsi], list )
-        __hvm_dpci_eoi(d, girq, ent);
-
-unlock:
-    spin_unlock(&d->event_lock);
-}
-
 /*
  * Note: 'pt_pirq_softirq_reset' can clear the STATE_SCHED before we get to
  * doing it. If that is the case we let 'pt_pirq_softirq_reset' do ref-counting.
-- 
2.28.0




* [PATCH v2 08/11] x86/dpci: switch to use a GSI EOI callback
  2020-09-30 10:40 [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT Roger Pau Monne
                   ` (6 preceding siblings ...)
  2020-09-30 10:41 ` [PATCH v2 07/11] x86/dpci: move code Roger Pau Monne
@ 2020-09-30 10:41 ` Roger Pau Monne
  2020-10-23 12:47   ` Jan Beulich
  2020-09-30 10:41 ` [PATCH v2 09/11] x86/vpt: switch interrupt injection model Roger Pau Monne
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 40+ messages in thread
From: Roger Pau Monne @ 2020-09-30 10:41 UTC (permalink / raw)
  To: xen-devel
  Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu, Paul Durrant

Switch the dpci GSI EOI callback hooks to use the newly introduced
generic callback functionality, and remove the custom dpci calls found
in the vPIC and vIO-APIC implementations.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/hvm/vioapic.c    |  7 ------
 xen/arch/x86/hvm/vpic.c       |  2 --
 xen/drivers/passthrough/io.c  | 41 ++++++++++++++++++++++++++++++++---
 xen/include/asm-x86/hvm/io.h  |  1 -
 xen/include/asm-x86/hvm/irq.h |  1 +
 5 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 03b1350c04..ea4d60d33e 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -400,13 +400,6 @@ static void eoi_callback(unsigned int vector, void *data)
 
             ent->fields.remote_irr = 0;
 
-            if ( is_iommu_enabled(d) )
-            {
-                spin_unlock(&d->arch.hvm.irq_lock);
-                hvm_dpci_eoi(gsi, ent);
-                spin_lock(&d->arch.hvm.irq_lock);
-            }
-
             spin_unlock(&d->arch.hvm.irq_lock);
             hvm_gsi_execute_callbacks(gsi, ent);
             spin_lock(&d->arch.hvm.irq_lock);
diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
index 09c937c322..3c01c638fa 100644
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -266,8 +266,6 @@ static void vpic_ioport_write(
                 hvm_gsi_execute_callbacks(
                         hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : pin),
                         NULL);
-                hvm_dpci_eoi(hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : pin),
-                             NULL);
                 return; /* bail immediately */
 
             case 6: /* Set Priority                */
diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index 770a5cce6b..6908438a94 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -327,9 +327,10 @@ static void hvm_gsi_eoi(struct domain *d, unsigned int gsi,
     hvm_pirq_eoi(pirq, ent);
 }
 
-void hvm_dpci_eoi(unsigned int guest_gsi, const union vioapic_redir_entry *ent)
+static void dpci_eoi(unsigned int guest_gsi, void *data)
 {
     struct domain *d = current->domain;
+    const union vioapic_redir_entry *ent = data;
     const struct hvm_irq_dpci *hvm_irq_dpci;
     const struct hvm_girq_dpci_mapping *girq;
 
@@ -565,7 +566,7 @@ int pt_irq_create_bind(
             unsigned int link;
 
             digl = xmalloc(struct dev_intx_gsi_link);
-            girq = xmalloc(struct hvm_girq_dpci_mapping);
+            girq = xzalloc(struct hvm_girq_dpci_mapping);
 
             if ( !digl || !girq )
             {
@@ -578,11 +579,22 @@ int pt_irq_create_bind(
             girq->bus = digl->bus = pt_irq_bind->u.pci.bus;
             girq->device = digl->device = pt_irq_bind->u.pci.device;
             girq->intx = digl->intx = pt_irq_bind->u.pci.intx;
-            list_add_tail(&digl->list, &pirq_dpci->digl_list);
+            girq->cb.callback = dpci_eoi;
 
             guest_gsi = hvm_pci_intx_gsi(digl->device, digl->intx);
             link = hvm_pci_intx_link(digl->device, digl->intx);
 
+            rc = hvm_gsi_register_callback(d, guest_gsi, &girq->cb);
+            if ( rc )
+            {
+                spin_unlock(&d->event_lock);
+                xfree(girq);
+                xfree(digl);
+                return rc;
+            }
+
+            list_add_tail(&digl->list, &pirq_dpci->digl_list);
+
             hvm_irq_dpci->link_cnt[link]++;
 
             girq->machine_gsi = pirq;
@@ -590,8 +602,17 @@ int pt_irq_create_bind(
         }
         else
         {
+            struct hvm_gsi_eoi_callback *cb =
+                xzalloc(struct hvm_gsi_eoi_callback);
+
             ASSERT(is_hardware_domain(d));
 
+            if ( !cb )
+            {
+                spin_unlock(&d->event_lock);
+                return -ENOMEM;
+            }
+
             /* MSI_TRANSLATE is not supported for the hardware domain. */
             if ( pt_irq_bind->irq_type != PT_IRQ_TYPE_PCI ||
                  pirq >= hvm_domain_irq(d)->nr_gsis )
@@ -601,6 +622,19 @@ int pt_irq_create_bind(
                 return -EINVAL;
             }
             guest_gsi = pirq;
+
+            cb->callback = dpci_eoi;
+            /*
+             * IRQ binds created for the hardware domain are never destroyed,
+             * so it's fine to not keep a reference to cb here.
+             */
+            rc = hvm_gsi_register_callback(d, guest_gsi, cb);
+            if ( rc )
+            {
+                spin_unlock(&d->event_lock);
+                xfree(cb);
+                return rc;
+            }
         }
 
         /* Bind the same mirq once in the same domain */
@@ -789,6 +823,7 @@ int pt_irq_destroy_bind(
                  girq->machine_gsi == machine_gsi )
             {
                 list_del(&girq->list);
+                hvm_gsi_unregister_callback(d, guest_gsi, &girq->cb);
                 xfree(girq);
                 girq = NULL;
                 break;
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 759ee486af..e1bc556613 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -118,7 +118,6 @@ bool handle_mmio_with_translation(unsigned long gla, unsigned long gpfn,
                                   struct npfec);
 bool handle_pio(uint16_t port, unsigned int size, int dir);
 void hvm_interrupt_post(struct vcpu *v, int vector, int type);
-void hvm_dpci_eoi(unsigned int guest_irq, const union vioapic_redir_entry *ent);
 void msix_write_completion(struct vcpu *);
 
 #ifdef CONFIG_HVM
diff --git a/xen/include/asm-x86/hvm/irq.h b/xen/include/asm-x86/hvm/irq.h
index db38c3e119..2f01d8fe64 100644
--- a/xen/include/asm-x86/hvm/irq.h
+++ b/xen/include/asm-x86/hvm/irq.h
@@ -158,6 +158,7 @@ struct hvm_girq_dpci_mapping {
     uint8_t device;
     uint8_t intx;
     uint8_t machine_gsi;
+    struct hvm_gsi_eoi_callback cb;
 };
 
 #define NR_ISAIRQS  16
-- 
2.28.0




* [PATCH v2 09/11] x86/vpt: switch interrupt injection model
  2020-09-30 10:40 [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT Roger Pau Monne
                   ` (7 preceding siblings ...)
  2020-09-30 10:41 ` [PATCH v2 08/11] x86/dpci: switch to use a GSI EOI callback Roger Pau Monne
@ 2020-09-30 10:41 ` Roger Pau Monne
  2020-10-23 14:59   ` Jan Beulich
  2020-09-30 10:41 ` [PATCH v2 10/11] x86/vpt: remove vPT timers per-vCPU lists Roger Pau Monne
  2020-09-30 10:41 ` [PATCH v2 11/11] x86/vpt: introduce a per-vPT lock Roger Pau Monne
  10 siblings, 1 reply; 40+ messages in thread
From: Roger Pau Monne @ 2020-09-30 10:41 UTC (permalink / raw)
  To: xen-devel
  Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu,
	Jun Nakajima, Kevin Tian

Currently vPT relies on timers being assigned to a vCPU and on checks
performed on every return to HVM guest context in order to determine
whether an interrupt from a vPT timer assigned to that vCPU is
currently being injected.

This model doesn't work properly since the interrupt destination vCPU
of a vPT timer can be different from the vCPU where the timer is
currently assigned, in which case the timer would get stuck because it
never sees the interrupt as being injected.

Knowing when a vPT interrupt is injected is relevant for the guest
timer modes where missed vPT interrupts are not discarded and instead
are accumulated and injected when possible.

This change aims to modify the logic described above, so that vPT
doesn't need to check on every return to HVM guest if a vPT interrupt
is being injected. In order to achieve this the vPT code is modified
to make use of the new EOI callbacks, so that virtual timers can
detect when an interrupt has been serviced by the guest, as signalled
by the execution of the EOI callback.

This model also simplifies some of the logic, as when executing the
timer EOI callback Xen can try to inject another interrupt if the
timer has interrupts pending for delivery.
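As a rough sketch of that model, the EOI callback drains accumulated
ticks by reinjecting. The types and names below are simplified
stand-ins, not the actual Xen vPT code:

```c
/*
 * Illustrative sketch of EOI-driven timer interrupt injection:
 * inject when no interrupt is in flight, accumulate otherwise,
 * and reinject from the EOI callback while ticks are pending.
 */
#include <assert.h>
#include <stdbool.h>

struct vtimer {
    unsigned int pending; /* accumulated missed ticks */
    bool injected;        /* an interrupt is in flight */
};

static unsigned int injections;

static void inject(struct vtimer *t)
{
    injections++;
    t->injected = true;
}

/* Timer fired: inject immediately if possible, else accumulate. */
static void timer_fired(struct vtimer *t)
{
    if ( !t->injected )
        inject(t);
    else
        t->pending++;
}

/*
 * EOI callback: the guest serviced the interrupt; if ticks accumulated
 * while it was in flight, inject the next one right away.
 */
static void timer_eoi_callback(struct vtimer *t)
{
    t->injected = false;
    if ( t->pending )
    {
        t->pending--;
        inject(t);
    }
}
```

Note how no check on return to guest context is needed: the decision to
inject again is taken entirely at EOI time.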

Note that timers are still bound to a vCPU for the time being; this
relation however no longer limits the interrupt destination, and will
be removed by further patches.

This model has been tested with Windows 7 guests without showing any
timer delay, even when the guest was limited to very little CPU
capacity and pending virtual timer interrupts accumulated.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
Sorry, this is a big change, but I'm having issues splitting it into
smaller pieces as the functionality needs to be changed in one go, or
else timers would be broken.

If this approach seems sensible I can try to split it up.
---
 xen/arch/x86/hvm/svm/intr.c   |   3 -
 xen/arch/x86/hvm/vmx/intr.c   |  59 ------
 xen/arch/x86/hvm/vpt.c        | 326 ++++++++++++++--------------------
 xen/include/asm-x86/hvm/vpt.h |   5 +-
 4 files changed, 135 insertions(+), 258 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/intr.c b/xen/arch/x86/hvm/svm/intr.c
index 7f815d2307..2ee2332253 100644
--- a/xen/arch/x86/hvm/svm/intr.c
+++ b/xen/arch/x86/hvm/svm/intr.c
@@ -146,8 +146,6 @@ void svm_intr_assist(void)
         return;
 
     /* Crank the handle on interrupt state. */
-    pt_update_irq(v);
-
     do {
         intack = hvm_vcpu_has_pending_irq(v);
         if ( likely(intack.source == hvm_intsrc_none) )
@@ -219,7 +217,6 @@ void svm_intr_assist(void)
     {
         HVMTRACE_2D(INJ_VIRQ, intack.vector, /*fake=*/ 0);
         svm_inject_extint(v, intack.vector);
-        pt_intr_post(v, intack);
     }
 
     /* Is there another IRQ to queue up behind this one? */
diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
index 80bfbb4787..3fcc7073db 100644
--- a/xen/arch/x86/hvm/vmx/intr.c
+++ b/xen/arch/x86/hvm/vmx/intr.c
@@ -203,7 +203,6 @@ static int nvmx_intr_intercept(struct vcpu *v, struct hvm_intack intack)
             {
                 /* for now, duplicate the ack path in vmx_intr_assist */
                 hvm_vcpu_ack_pending_irq(v, intack);
-                pt_intr_post(v, intack);
 
                 intack = hvm_vcpu_has_pending_irq(v);
                 if ( unlikely(intack.source != hvm_intsrc_none) )
@@ -242,7 +241,6 @@ void vmx_intr_assist(void)
     struct vcpu *v = current;
     unsigned int tpr_threshold = 0;
     enum hvm_intblk intblk;
-    int pt_vector;
 
     /* Block event injection when single step with MTF. */
     if ( unlikely(v->arch.hvm.single_step) )
@@ -263,8 +261,6 @@ void vmx_intr_assist(void)
 #endif
 
     /* Crank the handle on interrupt state. */
-    pt_vector = pt_update_irq(v);
-
     do {
         unsigned long intr_info;
 
@@ -337,58 +333,6 @@ void vmx_intr_assist(void)
     {
         unsigned long status;
 
-       /*
-        * intack.vector is the highest priority vector. So we set eoi_exit_bitmap
-        * for intack.vector - give a chance to post periodic time interrupts when
-        * periodic time interrupts become the highest one
-        */
-        if ( pt_vector != -1 )
-        {
-#ifndef NDEBUG
-            /*
-             * We assert that intack.vector is the highest priority vector for
-             * only an interrupt from vlapic can reach this point and the
-             * highest vector is chosen in hvm_vcpu_has_pending_irq().
-             * But, in fact, the assertion failed sometimes. It is suspected
-             * that PIR is not synced to vIRR which makes pt_vector is left in
-             * PIR. In order to verify this suspicion, dump some information
-             * when the assertion fails.
-             */
-            if ( unlikely(intack.vector < pt_vector) )
-            {
-                const struct vlapic *vlapic;
-                const struct pi_desc *pi_desc;
-                const uint32_t *word;
-                unsigned int i;
-
-                printk(XENLOG_ERR "%pv: intack: %u:%02x pt: %02x\n",
-                       current, intack.source, intack.vector, pt_vector);
-
-                vlapic = vcpu_vlapic(v);
-                if ( vlapic && vlapic->regs )
-                {
-                    word = (const void *)&vlapic->regs->data[APIC_IRR];
-                    printk(XENLOG_ERR "vIRR:");
-                    for ( i = X86_NR_VECTORS / 32; i-- ; )
-                        printk(" %08x", word[i*4]);
-                    printk("\n");
-                }
-
-                pi_desc = &v->arch.hvm.vmx.pi_desc;
-                if ( pi_desc )
-                {
-                    word = (const void *)&pi_desc->pir;
-                    printk(XENLOG_ERR " PIR:");
-                    for ( i = X86_NR_VECTORS / 32; i-- ; )
-                        printk(" %08x", word[i]);
-                    printk("\n");
-                }
-            }
-#endif
-            ASSERT(intack.vector >= pt_vector);
-            vmx_set_eoi_exit_bitmap(v, intack.vector);
-        }
-
         /* we need update the RVI field */
         __vmread(GUEST_INTR_STATUS, &status);
         status &= ~VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK;
@@ -397,14 +341,11 @@ void vmx_intr_assist(void)
         __vmwrite(GUEST_INTR_STATUS, status);
 
         vmx_sync_exit_bitmap(v);
-
-        pt_intr_post(v, intack);
     }
     else
     {
         HVMTRACE_2D(INJ_VIRQ, intack.vector, /*fake=*/ 0);
         vmx_inject_extint(intack.vector, intack.source);
-        pt_intr_post(v, intack);
     }
 
     /* Is there another IRQ to queue up behind this one? */
diff --git a/xen/arch/x86/hvm/vpt.c b/xen/arch/x86/hvm/vpt.c
index 867deb4da5..787931d7bb 100644
--- a/xen/arch/x86/hvm/vpt.c
+++ b/xen/arch/x86/hvm/vpt.c
@@ -27,6 +27,8 @@
 #define mode_is(d, name) \
     ((d)->arch.hvm.params[HVM_PARAM_TIMER_MODE] == HVMPTM_##name)
 
+static bool inject_interrupt(struct periodic_time *pt);
+
 void hvm_init_guest_time(struct domain *d)
 {
     struct pl_time *pl = d->arch.hvm.pl_time;
@@ -76,35 +78,6 @@ void hvm_set_guest_time(struct vcpu *v, u64 guest_time)
     }
 }
 
-static int pt_irq_vector(struct periodic_time *pt, enum hvm_intsrc src)
-{
-    struct vcpu *v = pt->vcpu;
-    unsigned int gsi, isa_irq;
-    int vector;
-
-    if ( pt->source == PTSRC_lapic )
-        return pt->irq;
-
-    isa_irq = pt->irq;
-
-    if ( src == hvm_intsrc_pic )
-        return (v->domain->arch.hvm.vpic[isa_irq >> 3].irq_base
-                + (isa_irq & 7));
-
-    ASSERT(src == hvm_intsrc_lapic);
-    gsi = pt->source == PTSRC_isa ? hvm_isa_irq_to_gsi(isa_irq) : pt->irq;
-    vector = vioapic_get_vector(v->domain, gsi);
-    if ( vector < 0 )
-    {
-        dprintk(XENLOG_WARNING, "d%u: invalid GSI (%u) for platform timer\n",
-                v->domain->domain_id, gsi);
-        domain_crash(v->domain);
-        return -1;
-    }
-
-    return vector;
-}
-
 static int pt_irq_masked(struct periodic_time *pt)
 {
     struct vcpu *v = pt->vcpu;
@@ -247,34 +220,14 @@ void pt_restore_timer(struct vcpu *v)
     pt_vcpu_lock(v);
 
     list_for_each_entry ( pt, head, list )
-    {
         if ( pt->pending_intr_nr == 0 )
-        {
-            pt_process_missed_ticks(pt);
             set_timer(&pt->timer, pt->scheduled);
-        }
-    }
 
     pt_thaw_time(v);
 
     pt_vcpu_unlock(v);
 }
 
-static void pt_timer_fn(void *data)
-{
-    struct periodic_time *pt = data;
-
-    pt_lock(pt);
-
-    pt->pending_intr_nr++;
-    pt->scheduled += pt->period;
-    pt->do_not_freeze = 0;
-
-    vcpu_kick(pt->vcpu);
-
-    pt_unlock(pt);
-}
-
 static void pt_irq_fired(struct vcpu *v, struct periodic_time *pt)
 {
     pt->irq_issued = false;
@@ -285,89 +238,108 @@ static void pt_irq_fired(struct vcpu *v, struct periodic_time *pt)
             list_del(&pt->list);
         pt->on_list = false;
         pt->pending_intr_nr = 0;
+
+        return;
     }
-    else if ( mode_is(v->domain, one_missed_tick_pending) ||
-              mode_is(v->domain, no_missed_ticks_pending) )
+
+    if ( mode_is(v->domain, one_missed_tick_pending) ||
+         mode_is(v->domain, no_missed_ticks_pending) )
     {
-        pt->last_plt_gtime = hvm_get_guest_time(v);
         pt_process_missed_ticks(pt);
         pt->pending_intr_nr = 0; /* 'collapse' all missed ticks */
+    }
+    else if ( !pt->pending_intr_nr )
+        pt_process_missed_ticks(pt);
+
+    if ( !pt->pending_intr_nr )
         set_timer(&pt->timer, pt->scheduled);
+}
+
+static void pt_timer_fn(void *data)
+{
+    struct periodic_time *pt = data;
+    struct vcpu *v;
+    time_cb *cb = NULL;
+    void *cb_priv;
+    unsigned int irq;
+
+    pt_lock(pt);
+
+    v = pt->vcpu;
+    irq = pt->irq;
+
+    if ( inject_interrupt(pt) )
+    {
+        pt->scheduled += pt->period;
+        pt->do_not_freeze = 0;
+        cb = pt->cb;
+        cb_priv = pt->priv;
     }
     else
     {
-        pt->last_plt_gtime += pt->period;
-        if ( --pt->pending_intr_nr == 0 )
-        {
-            pt_process_missed_ticks(pt);
-            if ( pt->pending_intr_nr == 0 )
-                set_timer(&pt->timer, pt->scheduled);
-        }
+        /* Masked. */
+        if ( pt->on_list )
+            list_del(&pt->list);
+        pt->on_list = false;
+        pt->pending_intr_nr++;
     }
 
-    if ( mode_is(v->domain, delay_for_missed_ticks) &&
-         (hvm_get_guest_time(v) < pt->last_plt_gtime) )
-        hvm_set_guest_time(v, pt->last_plt_gtime);
+    pt_unlock(pt);
+
+    if ( cb )
+        cb(v, cb_priv);
 }
 
-int pt_update_irq(struct vcpu *v)
+/*
+ * The same callback is shared between LAPIC and PIC/IO-APIC based timers, as
+ * we ignore the first parameter that's different between them.
+ */
+static void eoi_callback(unsigned int unused, void *data)
 {
-    struct list_head *head = &v->arch.hvm.tm_list;
-    struct periodic_time *pt, *temp, *earliest_pt;
-    uint64_t max_lag;
-    int irq, pt_vector = -1;
-    bool level;
+    struct periodic_time *pt = data;
+    struct vcpu *v;
+    time_cb *cb = NULL;
+    void *cb_priv;
 
-    pt_vcpu_lock(v);
+    pt_lock(pt);
 
-    earliest_pt = NULL;
-    max_lag = -1ULL;
-    list_for_each_entry_safe ( pt, temp, head, list )
+    pt_irq_fired(pt->vcpu, pt);
+    if ( pt->pending_intr_nr )
     {
-        if ( pt->pending_intr_nr )
+        if ( inject_interrupt(pt) )
+        {
+            pt->pending_intr_nr--;
+            cb = pt->cb;
+            cb_priv = pt->priv;
+            v = pt->vcpu;
+        }
+        else
         {
-            /* RTC code takes care of disabling the timer itself. */
-            if ( (pt->irq != RTC_IRQ || !pt->priv) && pt_irq_masked(pt) &&
-                 /* Level interrupts should be asserted even if masked. */
-                 !pt->level )
-            {
-                /* suspend timer emulation */
+            /* Masked. */
+            if ( pt->on_list )
                 list_del(&pt->list);
-                pt->on_list = 0;
-            }
-            else
-            {
-                if ( (pt->last_plt_gtime + pt->period) < max_lag )
-                {
-                    max_lag = pt->last_plt_gtime + pt->period;
-                    earliest_pt = pt;
-                }
-            }
+            pt->on_list = false;
         }
     }
 
-    if ( earliest_pt == NULL )
-    {
-        pt_vcpu_unlock(v);
-        return -1;
-    }
+    pt_unlock(pt);
 
-    earliest_pt->irq_issued = 1;
-    irq = earliest_pt->irq;
-    level = earliest_pt->level;
+    if ( cb != NULL )
+        cb(v, cb_priv);
+}
 
-    pt_vcpu_unlock(v);
+static bool inject_interrupt(struct periodic_time *pt)
+{
+    struct vcpu *v = pt->vcpu;
+    unsigned int irq = pt->irq;
 
-    switch ( earliest_pt->source )
+    if ( pt_irq_masked(pt) )
+        return false;
+
+    switch ( pt->source )
     {
     case PTSRC_lapic:
-        /*
-         * If periodic timer interrupt is handled by lapic, its vector in
-         * IRR is returned and used to set eoi_exit_bitmap for virtual
-         * interrupt delivery case. Otherwise return -1 to do nothing.
-         */
-        vlapic_set_irq(vcpu_vlapic(v), irq, 0);
-        pt_vector = irq;
+        vlapic_set_irq_callback(vcpu_vlapic(v), pt->irq, 0, eoi_callback, pt);
         break;
 
     case PTSRC_isa:
@@ -376,98 +348,26 @@ int pt_update_irq(struct vcpu *v)
              v->domain->arch.hvm.vpic[irq >> 3].int_output )
             hvm_isa_irq_assert(v->domain, irq, NULL);
         else
-        {
-            pt_vector = hvm_isa_irq_assert(v->domain, irq, vioapic_get_vector);
-            /*
-             * hvm_isa_irq_assert may not set the corresponding bit in vIRR
-             * when mask field of IOAPIC RTE is set. Check it again.
-             */
-            if ( pt_vector < 0 || !vlapic_test_irq(vcpu_vlapic(v), pt_vector) )
-                pt_vector = -1;
-        }
+            hvm_isa_irq_assert(v->domain, irq, vioapic_get_vector);
         break;
 
     case PTSRC_ioapic:
-        pt_vector = hvm_ioapic_assert(v->domain, irq, level);
-        if ( pt_vector < 0 || !vlapic_test_irq(vcpu_vlapic(v), pt_vector) )
-        {
-            pt_vector = -1;
-            if ( level )
-            {
-                /*
-                 * Level interrupts are always asserted because the pin assert
-                 * count is incremented regardless of whether the pin is masked
-                 * or the vector latched in IRR, so also execute the callback
-                 * associated with the timer.
-                 */
-                time_cb *cb = NULL;
-                void *cb_priv;
-
-                pt_vcpu_lock(v);
-                /* Make sure the timer is still on the list. */
-                list_for_each_entry ( pt, &v->arch.hvm.tm_list, list )
-                    if ( pt == earliest_pt )
-                    {
-                        pt_irq_fired(v, pt);
-                        cb = pt->cb;
-                        cb_priv = pt->priv;
-                        break;
-                    }
-                pt_vcpu_unlock(v);
-
-                if ( cb != NULL )
-                    cb(v, cb_priv);
-            }
-        }
+        hvm_ioapic_assert(v->domain, irq, pt->level);
         break;
     }
 
-    return pt_vector;
-}
-
-static struct periodic_time *is_pt_irq(
-    struct vcpu *v, struct hvm_intack intack)
-{
-    struct list_head *head = &v->arch.hvm.tm_list;
-    struct periodic_time *pt;
-
-    list_for_each_entry ( pt, head, list )
-    {
-        if ( pt->pending_intr_nr && pt->irq_issued &&
-             (intack.vector == pt_irq_vector(pt, intack.source)) )
-            return pt;
-    }
-
-    return NULL;
-}
-
-void pt_intr_post(struct vcpu *v, struct hvm_intack intack)
-{
-    struct periodic_time *pt;
-    time_cb *cb;
-    void *cb_priv;
-
-    if ( intack.source == hvm_intsrc_vector )
-        return;
-
-    pt_vcpu_lock(v);
-
-    pt = is_pt_irq(v, intack);
-    if ( pt == NULL )
-    {
-        pt_vcpu_unlock(v);
-        return;
-    }
-
-    pt_irq_fired(v, pt);
-
-    cb = pt->cb;
-    cb_priv = pt->priv;
+    /* Update time when an interrupt is injected. */
+    if ( mode_is(v->domain, one_missed_tick_pending) ||
+         mode_is(v->domain, no_missed_ticks_pending) )
+        pt->last_plt_gtime = hvm_get_guest_time(v);
+    else
+        pt->last_plt_gtime += pt->period;
 
-    pt_vcpu_unlock(v);
+    if ( mode_is(v->domain, delay_for_missed_ticks) &&
+         hvm_get_guest_time(v) < pt->last_plt_gtime )
+        hvm_set_guest_time(v, pt->last_plt_gtime);
 
-    if ( cb != NULL )
-        cb(v, cb_priv);
+    return true;
 }
 
 void pt_migrate(struct vcpu *v)
@@ -543,6 +443,24 @@ void create_periodic_time(
     pt->cb = cb;
     pt->priv = data;
 
+    switch ( pt->source )
+    {
+        int rc;
+
+    case PTSRC_isa:
+        irq = hvm_isa_irq_to_gsi(irq);
+        /* fallthrough */
+    case PTSRC_ioapic:
+        pt->eoi_cb.callback = eoi_callback;
+        pt->eoi_cb.data = pt;
+        rc = hvm_gsi_register_callback(v->domain, irq, &pt->eoi_cb);
+        if ( rc )
+            gdprintk(XENLOG_WARNING,
+                     "unable to register callback for timer GSI %u: %d\n",
+                     irq, rc);
+        break;
+    }
+
     pt->on_list = 1;
     list_add(&pt->list, &v->arch.hvm.tm_list);
 
@@ -554,6 +472,8 @@ void create_periodic_time(
 
 void destroy_periodic_time(struct periodic_time *pt)
 {
+    unsigned int gsi;
+
     /* Was this structure previously initialised by create_periodic_time()? */
     if ( pt->vcpu == NULL )
         return;
@@ -563,6 +483,17 @@ void destroy_periodic_time(struct periodic_time *pt)
         list_del(&pt->list);
     pt->on_list = 0;
     pt->pending_intr_nr = 0;
+
+    gsi = pt->irq;
+    switch ( pt->source )
+    {
+    case PTSRC_isa:
+        gsi = hvm_isa_irq_to_gsi(pt->irq);
+        /* fallthrough */
+    case PTSRC_ioapic:
+        hvm_gsi_unregister_callback(pt->vcpu->domain, gsi, &pt->eoi_cb);
+        break;
+    }
     pt_unlock(pt);
 
     /*
@@ -617,20 +548,29 @@ void pt_adjust_global_vcpu_target(struct vcpu *v)
     write_unlock(&pl_time->vhpet.lock);
 }
 
-
 static void pt_resume(struct periodic_time *pt)
 {
+    struct vcpu *v;
+    time_cb *cb = NULL;
+    void *cb_priv;
+
     if ( pt->vcpu == NULL )
         return;
 
     pt_lock(pt);
-    if ( pt->pending_intr_nr && !pt->on_list )
+    if ( pt->pending_intr_nr && !pt->on_list && inject_interrupt(pt) )
     {
+        pt->pending_intr_nr--;
+        cb = pt->cb;
+        cb_priv = pt->priv;
+        v = pt->vcpu;
         pt->on_list = 1;
         list_add(&pt->list, &pt->vcpu->arch.hvm.tm_list);
-        vcpu_kick(pt->vcpu);
     }
     pt_unlock(pt);
+
+    if ( cb )
+        cb(v, cb_priv);
 }
 
 void pt_may_unmask_irq(struct domain *d, struct periodic_time *vlapic_pt)
diff --git a/xen/include/asm-x86/hvm/vpt.h b/xen/include/asm-x86/hvm/vpt.h
index 39d26cbda4..9440fe4ac7 100644
--- a/xen/include/asm-x86/hvm/vpt.h
+++ b/xen/include/asm-x86/hvm/vpt.h
@@ -23,6 +23,7 @@
 #include <xen/list.h>
 #include <xen/rwlock.h>
 #include <asm/hvm/hvm.h>
+#include <asm/hvm/irq.h>
 
 /*
  * Abstract layer of periodic time, one short time.
@@ -50,6 +51,7 @@ struct periodic_time {
     struct timer timer;         /* ac_timer */
     time_cb *cb;
     void *priv;                 /* point back to platform time source */
+    struct hvm_gsi_eoi_callback eoi_cb; /* EOI callback registration data */
 };
 
 
@@ -145,9 +147,6 @@ struct pl_time {    /* platform time */
 
 void pt_save_timer(struct vcpu *v);
 void pt_restore_timer(struct vcpu *v);
-int pt_update_irq(struct vcpu *v);
-struct hvm_intack;
-void pt_intr_post(struct vcpu *v, struct hvm_intack intack);
 void pt_migrate(struct vcpu *v);
 
 void pt_adjust_global_vcpu_target(struct vcpu *v);
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 10/11] x86/vpt: remove vPT timers per-vCPU lists
  2020-09-30 10:40 [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT Roger Pau Monne
                   ` (8 preceding siblings ...)
  2020-09-30 10:41 ` [PATCH v2 09/11] x86/vpt: switch interrupt injection model Roger Pau Monne
@ 2020-09-30 10:41 ` Roger Pau Monne
  2020-10-23 15:34   ` Jan Beulich
  2020-09-30 10:41 ` [PATCH v2 11/11] x86/vpt: introduce a per-vPT lock Roger Pau Monne
  10 siblings, 1 reply; 40+ messages in thread
From: Roger Pau Monne @ 2020-09-30 10:41 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu

No longer add vPT timers to lists on specific vCPUs, since there's no
need anymore to check whether timer interrupts have been injected on
return to the HVM guest.

Such a change allows getting rid of virtual timer vCPU migration, and
also allows cleaning up some of the virtual timer fields that are no
longer required.

The model is also slightly different now in that timers are not
stopped when a vCPU is de-scheduled. Such timers will continue
running, and when triggered the timer function will try to inject the
corresponding interrupt into the guest (which might be a different
vCPU than the currently running one). Note that a timer triggering
when the guest is no longer running can only happen once, as the timer
callback will not reset the interrupt to fire again. Such resetting,
if required, will be done by the EOI callback.

Since virtual timers are no longer added to per-vCPU lists when
active, a new 'masked' field is added to the structure to signal
whether a timer currently has its interrupt source masked.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/domain.c          |   2 +-
 xen/arch/x86/hvm/hvm.c         |   2 -
 xen/arch/x86/hvm/vlapic.c      |   1 -
 xen/arch/x86/hvm/vpt.c         | 153 ++++-----------------------------
 xen/include/asm-x86/hvm/vcpu.h |   3 +-
 xen/include/asm-x86/hvm/vpt.h  |   9 +-
 6 files changed, 23 insertions(+), 147 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index e8e91cf080..f373431105 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1964,7 +1964,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
     vpmu_switch_from(prev);
     np2m_schedule(NP2M_SCHEDLE_OUT);
 
-    if ( is_hvm_domain(prevd) && !list_empty(&prev->arch.hvm.tm_list) )
+    if ( is_hvm_domain(prevd) )
         pt_save_timer(prev);
 
     local_irq_disable();
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 9636ac6bf1..5a0448aa13 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -489,7 +489,6 @@ void hvm_set_info_guest(struct vcpu *v)
 void hvm_migrate_timers(struct vcpu *v)
 {
     rtc_migrate_timers(v);
-    pt_migrate(v);
 }
 
 void hvm_migrate_pirq(struct hvm_pirq_dpci *pirq_dpci, const struct vcpu *v)
@@ -1558,7 +1557,6 @@ int hvm_vcpu_initialise(struct vcpu *v)
     hvm_asid_flush_vcpu(v);
 
     spin_lock_init(&v->arch.hvm.tm_lock);
-    INIT_LIST_HEAD(&v->arch.hvm.tm_list);
 
     rc = hvm_vcpu_cacheattr_init(v); /* teardown: vcpu_cacheattr_destroy */
     if ( rc != 0 )
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 35f12e0909..9afcb239af 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -1330,7 +1330,6 @@ void vlapic_adjust_i8259_target(struct domain *d)
     if ( d->arch.hvm.i8259_target == v )
         return;
     d->arch.hvm.i8259_target = v;
-    pt_adjust_global_vcpu_target(v);
 }
 
 int vlapic_has_pending_irq(struct vcpu *v)
diff --git a/xen/arch/x86/hvm/vpt.c b/xen/arch/x86/hvm/vpt.c
index 787931d7bb..76ace8da80 100644
--- a/xen/arch/x86/hvm/vpt.c
+++ b/xen/arch/x86/hvm/vpt.c
@@ -126,18 +126,6 @@ static int pt_irq_masked(struct periodic_time *pt)
     return 1;
 }
 
-static void pt_vcpu_lock(struct vcpu *v)
-{
-    read_lock(&v->domain->arch.hvm.pl_time->pt_migrate);
-    spin_lock(&v->arch.hvm.tm_lock);
-}
-
-static void pt_vcpu_unlock(struct vcpu *v)
-{
-    spin_unlock(&v->arch.hvm.tm_lock);
-    read_unlock(&v->domain->arch.hvm.pl_time->pt_migrate);
-}
-
 static void pt_lock(struct periodic_time *pt)
 {
     /*
@@ -151,7 +139,8 @@ static void pt_lock(struct periodic_time *pt)
 
 static void pt_unlock(struct periodic_time *pt)
 {
-    pt_vcpu_unlock(pt->vcpu);
+    spin_unlock(&pt->vcpu->arch.hvm.tm_lock);
+    read_unlock(&pt->vcpu->domain->arch.hvm.pl_time->pt_migrate);
 }
 
 static void pt_process_missed_ticks(struct periodic_time *pt)
@@ -166,9 +155,7 @@ static void pt_process_missed_ticks(struct periodic_time *pt)
         return;
 
     missed_ticks = missed_ticks / (s_time_t) pt->period + 1;
-    if ( mode_is(pt->vcpu->domain, no_missed_ticks_pending) )
-        pt->do_not_freeze = !pt->pending_intr_nr;
-    else
+    if ( !mode_is(pt->vcpu->domain, no_missed_ticks_pending) )
         pt->pending_intr_nr += missed_ticks;
     pt->scheduled += missed_ticks * pt->period;
 }
@@ -195,50 +182,20 @@ static void pt_thaw_time(struct vcpu *v)
 
 void pt_save_timer(struct vcpu *v)
 {
-    struct list_head *head = &v->arch.hvm.tm_list;
-    struct periodic_time *pt;
-
-    if ( v->pause_flags & VPF_blocked )
-        return;
-
-    pt_vcpu_lock(v);
-
-    list_for_each_entry ( pt, head, list )
-        if ( !pt->do_not_freeze )
-            stop_timer(&pt->timer);
 
     pt_freeze_time(v);
-
-    pt_vcpu_unlock(v);
 }
 
 void pt_restore_timer(struct vcpu *v)
 {
-    struct list_head *head = &v->arch.hvm.tm_list;
-    struct periodic_time *pt;
-
-    pt_vcpu_lock(v);
-
-    list_for_each_entry ( pt, head, list )
-        if ( pt->pending_intr_nr == 0 )
-            set_timer(&pt->timer, pt->scheduled);
-
     pt_thaw_time(v);
-
-    pt_vcpu_unlock(v);
 }
 
 static void pt_irq_fired(struct vcpu *v, struct periodic_time *pt)
 {
-    pt->irq_issued = false;
-
     if ( pt->one_shot )
     {
-        if ( pt->on_list )
-            list_del(&pt->list);
-        pt->on_list = false;
         pt->pending_intr_nr = 0;
-
         return;
     }
 
@@ -252,7 +209,11 @@ static void pt_irq_fired(struct vcpu *v, struct periodic_time *pt)
         pt_process_missed_ticks(pt);
 
     if ( !pt->pending_intr_nr )
+    {
+        /* Make sure timer follows vCPU. */
+        migrate_timer(&pt->timer, current->processor);
         set_timer(&pt->timer, pt->scheduled);
+    }
 }
 
 static void pt_timer_fn(void *data)
@@ -268,21 +229,16 @@ static void pt_timer_fn(void *data)
     v = pt->vcpu;
     irq = pt->irq;
 
-    if ( inject_interrupt(pt) )
+    pt->masked = !inject_interrupt(pt);
+    pt->scheduled += pt->period;
+
+    if ( pt->masked )
+        pt->pending_intr_nr++;
+    else
     {
-        pt->scheduled += pt->period;
-        pt->do_not_freeze = 0;
         cb = pt->cb;
         cb_priv = pt->priv;
     }
-    else
-    {
-        /* Masked. */
-        if ( pt->on_list )
-            list_del(&pt->list);
-        pt->on_list = false;
-        pt->pending_intr_nr++;
-    }
 
     pt_unlock(pt);
 
@@ -306,20 +262,14 @@ static void eoi_callback(unsigned int unused, void *data)
     pt_irq_fired(pt->vcpu, pt);
     if ( pt->pending_intr_nr )
     {
-        if ( inject_interrupt(pt) )
+        pt->masked = !inject_interrupt(pt);
+        if ( !pt->masked )
         {
             pt->pending_intr_nr--;
             cb = pt->cb;
             cb_priv = pt->priv;
             v = pt->vcpu;
         }
-        else
-        {
-            /* Masked. */
-            if ( pt->on_list )
-                list_del(&pt->list);
-            pt->on_list = false;
-        }
     }
 
     pt_unlock(pt);
@@ -370,19 +320,6 @@ static bool inject_interrupt(struct periodic_time *pt)
     return true;
 }
 
-void pt_migrate(struct vcpu *v)
-{
-    struct list_head *head = &v->arch.hvm.tm_list;
-    struct periodic_time *pt;
-
-    pt_vcpu_lock(v);
-
-    list_for_each_entry ( pt, head, list )
-        migrate_timer(&pt->timer, v->processor);
-
-    pt_vcpu_unlock(v);
-}
-
 void create_periodic_time(
     struct vcpu *v, struct periodic_time *pt, uint64_t delta,
     uint64_t period, uint8_t irq, time_cb *cb, void *data, bool level)
@@ -402,8 +339,7 @@ void create_periodic_time(
     write_lock(&v->domain->arch.hvm.pl_time->pt_migrate);
 
     pt->pending_intr_nr = 0;
-    pt->do_not_freeze = 0;
-    pt->irq_issued = 0;
+    pt->masked = false;
 
     /* Periodic timer must be at least 0.1ms. */
     if ( (period < 100000) && period )
@@ -461,9 +397,6 @@ void create_periodic_time(
         break;
     }
 
-    pt->on_list = 1;
-    list_add(&pt->list, &v->arch.hvm.tm_list);
-
     init_timer(&pt->timer, pt_timer_fn, pt, v->processor);
     set_timer(&pt->timer, pt->scheduled);
 
@@ -479,10 +412,8 @@ void destroy_periodic_time(struct periodic_time *pt)
         return;
 
     pt_lock(pt);
-    if ( pt->on_list )
-        list_del(&pt->list);
-    pt->on_list = 0;
     pt->pending_intr_nr = 0;
+    pt->masked = false;
 
     gsi = pt->irq;
     switch ( pt->source )
@@ -503,51 +434,6 @@ void destroy_periodic_time(struct periodic_time *pt)
     kill_timer(&pt->timer);
 }
 
-static void pt_adjust_vcpu(struct periodic_time *pt, struct vcpu *v)
-{
-    ASSERT(pt->source == PTSRC_isa || pt->source == PTSRC_ioapic);
-
-    if ( pt->vcpu == NULL )
-        return;
-
-    write_lock(&pt->vcpu->domain->arch.hvm.pl_time->pt_migrate);
-    pt->vcpu = v;
-    if ( pt->on_list )
-    {
-        list_del(&pt->list);
-        list_add(&pt->list, &v->arch.hvm.tm_list);
-        migrate_timer(&pt->timer, v->processor);
-    }
-    write_unlock(&pt->vcpu->domain->arch.hvm.pl_time->pt_migrate);
-}
-
-void pt_adjust_global_vcpu_target(struct vcpu *v)
-{
-    struct PITState *vpit;
-    struct pl_time *pl_time;
-    int i;
-
-    if ( !v || !has_vpit(v->domain) )
-        return;
-
-    vpit = &v->domain->arch.vpit;
-
-    spin_lock(&vpit->lock);
-    pt_adjust_vcpu(&vpit->pt0, v);
-    spin_unlock(&vpit->lock);
-
-    pl_time = v->domain->arch.hvm.pl_time;
-
-    spin_lock(&pl_time->vrtc.lock);
-    pt_adjust_vcpu(&pl_time->vrtc.pt, v);
-    spin_unlock(&pl_time->vrtc.lock);
-
-    write_lock(&pl_time->vhpet.lock);
-    for ( i = 0; i < HPET_TIMER_NUM; i++ )
-        pt_adjust_vcpu(&pl_time->vhpet.pt[i], v);
-    write_unlock(&pl_time->vhpet.lock);
-}
-
 static void pt_resume(struct periodic_time *pt)
 {
     struct vcpu *v;
@@ -558,14 +444,13 @@ static void pt_resume(struct periodic_time *pt)
         return;
 
     pt_lock(pt);
-    if ( pt->pending_intr_nr && !pt->on_list && inject_interrupt(pt) )
+    if ( pt->pending_intr_nr && pt->masked && inject_interrupt(pt) )
     {
         pt->pending_intr_nr--;
         cb = pt->cb;
         cb_priv = pt->priv;
         v = pt->vcpu;
-        pt->on_list = 1;
-        list_add(&pt->list, &pt->vcpu->arch.hvm.tm_list);
+        pt->masked = false;
     }
     pt_unlock(pt);
 
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 5ccd075815..07a9890ed3 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -166,9 +166,8 @@ struct hvm_vcpu {
     s64                 cache_tsc_offset;
     u64                 guest_time;
 
-    /* Lock and list for virtual platform timers. */
+    /* Lock for virtual platform timers. */
     spinlock_t          tm_lock;
-    struct list_head    tm_list;
 
     bool                flag_dr_dirty;
     bool                debug_state_latch;
diff --git a/xen/include/asm-x86/hvm/vpt.h b/xen/include/asm-x86/hvm/vpt.h
index 9440fe4ac7..7c0322727b 100644
--- a/xen/include/asm-x86/hvm/vpt.h
+++ b/xen/include/asm-x86/hvm/vpt.h
@@ -31,13 +31,10 @@
 typedef void time_cb(struct vcpu *v, void *opaque);
 
 struct periodic_time {
-    struct list_head list;
-    bool on_list;
     bool one_shot;
-    bool do_not_freeze;
-    bool irq_issued;
     bool warned_timeout_too_short;
     bool level;
+    bool masked;
 #define PTSRC_isa    1 /* ISA time source */
 #define PTSRC_lapic  2 /* LAPIC time source */
 #define PTSRC_ioapic 3 /* IOAPIC time source */
@@ -147,9 +144,7 @@ struct pl_time {    /* platform time */
 
 void pt_save_timer(struct vcpu *v);
 void pt_restore_timer(struct vcpu *v);
-void pt_migrate(struct vcpu *v);
 
-void pt_adjust_global_vcpu_target(struct vcpu *v);
 #define pt_global_vcpu_target(d) \
     (is_hvm_domain(d) && (d)->arch.hvm.i8259_target ? \
      (d)->arch.hvm.i8259_target : \
@@ -158,7 +153,7 @@ void pt_adjust_global_vcpu_target(struct vcpu *v);
 void pt_may_unmask_irq(struct domain *d, struct periodic_time *vlapic_pt);
 
 /* Is given periodic timer active? */
-#define pt_active(pt) ((pt)->on_list || (pt)->pending_intr_nr)
+#define pt_active(pt) !(pt)->masked
 
 /*
  * Create/destroy a periodic (or one-shot!) timer.
-- 
2.28.0




* [PATCH v2 11/11] x86/vpt: introduce a per-vPT lock
  2020-09-30 10:40 [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT Roger Pau Monne
                   ` (9 preceding siblings ...)
  2020-09-30 10:41 ` [PATCH v2 10/11] x86/vpt: remove vPT timers per-vCPU lists Roger Pau Monne
@ 2020-09-30 10:41 ` Roger Pau Monne
  2020-09-30 13:30   ` Roger Pau Monné
  10 siblings, 1 reply; 40+ messages in thread
From: Roger Pau Monne @ 2020-09-30 10:41 UTC (permalink / raw)
  To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Wei Liu

Introduce a per-virtual-timer lock that replaces the existing per-vCPU
and per-domain vPT locks. Since virtual timers are no longer assigned
or migrated between vCPUs, the locking can be simplified to an
in-structure spinlock that protects all the fields.

This requires introducing a helper to initialize the spinlock, which
could also be used to initialize other virtual timer fields in the
future.
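The shape of the change can be sketched as follows (a hypothetical,
simplified C model, not the actual Xen code: `toy_periodic_time` and
its helpers are invented stand-ins for `struct periodic_time`,
`init_periodic_timer()` and Xen's spinlock primitives, here modelled
with a C11 atomic flag):

```c
#include <stdatomic.h>

/* Stand-in for struct periodic_time: the lock is now a member, so
 * users need no per-vCPU or per-domain lock ordering to touch the
 * timer state. */
struct toy_periodic_time {
    atomic_flag lock;
    unsigned int pending_intr_nr;
};

/* Mirror of init_periodic_timer(): a single place to initialise
 * in-structure state before the timer is first armed. */
static void toy_init_periodic_timer(struct toy_periodic_time *pt)
{
    atomic_flag_clear(&pt->lock);
    pt->pending_intr_nr = 0;
}

static void toy_lock(struct toy_periodic_time *pt)
{
    while ( atomic_flag_test_and_set(&pt->lock) )
        ; /* spin until the flag was previously clear */
}

static void toy_unlock(struct toy_periodic_time *pt)
{
    atomic_flag_clear(&pt->lock);
}

/* Any accessor simply takes the timer's own lock. */
static void toy_account_tick(struct toy_periodic_time *pt)
{
    toy_lock(pt);
    pt->pending_intr_nr++;
    toy_unlock(pt);
}
```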

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/emul-i8254.c     |  1 +
 xen/arch/x86/hvm/hpet.c       |  8 +++++-
 xen/arch/x86/hvm/hvm.c        |  2 --
 xen/arch/x86/hvm/rtc.c        |  1 +
 xen/arch/x86/hvm/vlapic.c     |  1 +
 xen/arch/x86/hvm/vpt.c        | 48 +++++++++++++++--------------------
 xen/include/asm-x86/hvm/vpt.h |  9 ++-----
 7 files changed, 33 insertions(+), 37 deletions(-)

diff --git a/xen/arch/x86/emul-i8254.c b/xen/arch/x86/emul-i8254.c
index 73be4188ad..a47138cbab 100644
--- a/xen/arch/x86/emul-i8254.c
+++ b/xen/arch/x86/emul-i8254.c
@@ -484,6 +484,7 @@ void pit_init(struct domain *d, unsigned long cpu_khz)
     {
         register_portio_handler(d, PIT_BASE, 4, handle_pit_io);
         register_portio_handler(d, 0x61, 1, handle_speaker_io);
+        init_periodic_timer(&pit->pt0);
     }
 
     pit_reset(d);
diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index ca94e8b453..20593c3862 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -739,12 +739,18 @@ static void hpet_set(HPETState *h)
 
 void hpet_init(struct domain *d)
 {
+    HPETState *h = domain_vhpet(d);
+    unsigned int i;
+
     if ( !has_vhpet(d) )
         return;
 
-    hpet_set(domain_vhpet(d));
+    hpet_set(h);
     register_mmio_handler(d, &hpet_mmio_ops);
     d->arch.hvm.params[HVM_PARAM_HPET_ENABLED] = 1;
+
+    for ( i = 0; i < HPET_TIMER_NUM; i++ )
+        init_periodic_timer(&h->pt[i]);
 }
 
 void hpet_deinit(struct domain *d)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 5a0448aa13..7cb4511b60 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -665,8 +665,6 @@ int hvm_domain_initialise(struct domain *d)
     /* need link to containing domain */
     d->arch.hvm.pl_time->domain = d;
 
-    rwlock_init(&d->arch.hvm.pl_time->pt_migrate);
-
     /* Set the default IO Bitmap. */
     if ( is_hardware_domain(d) )
     {
diff --git a/xen/arch/x86/hvm/rtc.c b/xen/arch/x86/hvm/rtc.c
index 3150f5f147..2d540b16ac 100644
--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -846,6 +846,7 @@ void rtc_init(struct domain *d)
     init_timer(&s->update_timer, rtc_update_timer, s, smp_processor_id());
     init_timer(&s->update_timer2, rtc_update_timer2, s, smp_processor_id());
     init_timer(&s->alarm_timer, rtc_alarm_cb, s, smp_processor_id());
+    init_periodic_timer(&s->pt);
 
     register_portio_handler(d, RTC_PORT(0), 2, handle_rtc_io);
 
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 9afcb239af..fa40fca6c9 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -1642,6 +1642,7 @@ int vlapic_init(struct vcpu *v)
         return 0;
     }
 
+    init_periodic_timer(&vlapic->pt);
     vlapic->pt.source = PTSRC_lapic;
 
     if (vlapic->regs_page == NULL)
diff --git a/xen/arch/x86/hvm/vpt.c b/xen/arch/x86/hvm/vpt.c
index 76ace8da80..47bd3285e1 100644
--- a/xen/arch/x86/hvm/vpt.c
+++ b/xen/arch/x86/hvm/vpt.c
@@ -126,23 +126,6 @@ static int pt_irq_masked(struct periodic_time *pt)
     return 1;
 }
 
-static void pt_lock(struct periodic_time *pt)
-{
-    /*
-     * We cannot use pt_vcpu_lock here, because we need to acquire the
-     * per-domain lock first and then (re-)fetch the value of pt->vcpu, or
-     * else we might be using a stale value of pt->vcpu.
-     */
-    read_lock(&pt->vcpu->domain->arch.hvm.pl_time->pt_migrate);
-    spin_lock(&pt->vcpu->arch.hvm.tm_lock);
-}
-
-static void pt_unlock(struct periodic_time *pt)
-{
-    spin_unlock(&pt->vcpu->arch.hvm.tm_lock);
-    read_unlock(&pt->vcpu->domain->arch.hvm.pl_time->pt_migrate);
-}
-
 static void pt_process_missed_ticks(struct periodic_time *pt)
 {
     s_time_t missed_ticks, now = NOW();
@@ -224,7 +207,7 @@ static void pt_timer_fn(void *data)
     void *cb_priv;
     unsigned int irq;
 
-    pt_lock(pt);
+    spin_lock(&pt->lock);
 
     v = pt->vcpu;
     irq = pt->irq;
@@ -240,7 +223,7 @@ static void pt_timer_fn(void *data)
         cb_priv = pt->priv;
     }
 
-    pt_unlock(pt);
+    spin_unlock(&pt->lock);
 
     if ( cb )
         cb(v, cb_priv);
@@ -257,7 +240,7 @@ static void eoi_callback(unsigned int unused, void *data)
     time_cb *cb = NULL;
     void *cb_priv;
 
-    pt_lock(pt);
+    spin_lock(&pt->lock);
 
     pt_irq_fired(pt->vcpu, pt);
     if ( pt->pending_intr_nr )
@@ -272,7 +255,7 @@ static void eoi_callback(unsigned int unused, void *data)
         }
     }
 
-    pt_unlock(pt);
+    spin_unlock(&pt->lock);
 
     if ( cb != NULL )
         cb(v, cb_priv);
@@ -320,6 +303,11 @@ static bool inject_interrupt(struct periodic_time *pt)
     return true;
 }
 
+void init_periodic_timer(struct periodic_time *pt)
+{
+    spin_lock_init(&pt->lock);
+}
+
 void create_periodic_time(
     struct vcpu *v, struct periodic_time *pt, uint64_t delta,
     uint64_t period, uint8_t irq, time_cb *cb, void *data, bool level)
@@ -336,7 +324,7 @@ void create_periodic_time(
 
     destroy_periodic_time(pt);
 
-    write_lock(&v->domain->arch.hvm.pl_time->pt_migrate);
+    spin_lock(&pt->lock);
 
     pt->pending_intr_nr = 0;
     pt->masked = false;
@@ -400,18 +388,21 @@ void create_periodic_time(
     init_timer(&pt->timer, pt_timer_fn, pt, v->processor);
     set_timer(&pt->timer, pt->scheduled);
 
-    write_unlock(&v->domain->arch.hvm.pl_time->pt_migrate);
+    spin_unlock(&pt->lock);
 }
 
 void destroy_periodic_time(struct periodic_time *pt)
 {
     unsigned int gsi;
 
+    spin_lock(&pt->lock);
     /* Was this structure previously initialised by create_periodic_time()? */
     if ( pt->vcpu == NULL )
+    {
+        spin_unlock(&pt->lock);
         return;
+    }
 
-    pt_lock(pt);
     pt->pending_intr_nr = 0;
     pt->masked = false;
 
@@ -425,7 +416,7 @@ void destroy_periodic_time(struct periodic_time *pt)
         hvm_gsi_unregister_callback(pt->vcpu->domain, gsi, &pt->eoi_cb);
         break;
     }
-    pt_unlock(pt);
+    spin_unlock(&pt->lock);
 
     /*
      * pt_timer_fn() can run until this kill_timer() returns. We must do this
@@ -440,10 +431,13 @@ static void pt_resume(struct periodic_time *pt)
     time_cb *cb = NULL;
     void *cb_priv;
 
+    spin_lock(&pt->lock);
     if ( pt->vcpu == NULL )
+    {
+        spin_unlock(&pt->lock);
         return;
+    }
 
-    pt_lock(pt);
     if ( pt->pending_intr_nr && pt->masked && inject_interrupt(pt) )
     {
         pt->pending_intr_nr--;
@@ -452,7 +446,7 @@ static void pt_resume(struct periodic_time *pt)
         v = pt->vcpu;
         pt->masked = false;
     }
-    pt_unlock(pt);
+    spin_unlock(&pt->lock);
 
     if ( cb )
         cb(v, cb_priv);
diff --git a/xen/include/asm-x86/hvm/vpt.h b/xen/include/asm-x86/hvm/vpt.h
index 7c0322727b..75e0526b17 100644
--- a/xen/include/asm-x86/hvm/vpt.h
+++ b/xen/include/asm-x86/hvm/vpt.h
@@ -49,6 +49,7 @@ struct periodic_time {
     time_cb *cb;
     void *priv;                 /* point back to platform time source */
     struct hvm_gsi_eoi_callback eoi_cb; /* EOI callback registration data */
+    spinlock_t lock;
 };
 
 
@@ -127,13 +128,6 @@ struct pl_time {    /* platform time */
     struct RTCState  vrtc;
     struct HPETState vhpet;
     struct PMTState  vpmt;
-    /*
-     * rwlock to prevent periodic_time vCPU migration. Take the lock in read
-     * mode in order to prevent the vcpu field of periodic_time from changing.
-     * Lock must be taken in write mode when changes to the vcpu field are
-     * performed, as it allows exclusive access to all the timers of a domain.
-     */
-    rwlock_t pt_migrate;
     /* guest_time = Xen sys time + stime_offset */
     int64_t stime_offset;
     /* Ensures monotonicity in appropriate timer modes. */
@@ -168,6 +162,7 @@ void create_periodic_time(
     struct vcpu *v, struct periodic_time *pt, uint64_t delta,
     uint64_t period, uint8_t irq, time_cb *cb, void *data, bool level);
 void destroy_periodic_time(struct periodic_time *pt);
+void init_periodic_timer(struct periodic_time *pt);
 
 int pv_pit_handler(int port, int data, int write);
 void pit_reset(struct domain *d);
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* RE: [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks
  2020-09-30 10:40 ` [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks Roger Pau Monne
@ 2020-09-30 11:30   ` Paul Durrant
  2020-10-02  8:48   ` Jan Beulich
  1 sibling, 0 replies; 40+ messages in thread
From: Paul Durrant @ 2020-09-30 11:30 UTC (permalink / raw)
  To: 'Roger Pau Monne', xen-devel
  Cc: 'Jan Beulich', 'Andrew Cooper', 'Wei Liu',
	'Paul Durrant'

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 30 September 2020 11:41
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>; Paul Durrant <paul@xen.org>; Paul Durrant
> <pdurrant@amazon.com>
> Subject: [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks
> 
> EOIs are always executed in guest vCPU context, so there's no reason to
> pass a vCPU parameter around as can be fetched from current.
> 
> While there make the vector parameter of both callbacks unsigned int.
> 
> No functional change intended.
> 
> Suggested-by: Paul Durrant <pdurrant@amazon.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Paul Durrant <paul@xen.org>

> ---
> Changes since v1:
>  - New in this version.
> ---
>  xen/arch/x86/hvm/vioapic.c        | 5 +++--
>  xen/arch/x86/hvm/vlapic.c         | 7 ++-----
>  xen/drivers/passthrough/io.c      | 4 +++-
>  xen/include/asm-x86/hvm/io.h      | 2 +-
>  xen/include/asm-x86/hvm/vioapic.h | 2 +-
>  5 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> index 67d4a6237f..0fb9147d99 100644
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -353,7 +353,7 @@ static int vioapic_write(
> 
>  #if VIOAPIC_VERSION_ID >= 0x20
>      case VIOAPIC_REG_EOI:
> -        vioapic_update_EOI(v->domain, val);
> +        vioapic_update_EOI(val);
>          break;
>  #endif
> 
> @@ -495,8 +495,9 @@ void vioapic_irq_positive_edge(struct domain *d, unsigned int irq)
>      }
>  }
> 
> -void vioapic_update_EOI(struct domain *d, u8 vector)
> +void vioapic_update_EOI(unsigned int vector)
>  {
> +    struct domain *d = current->domain;
>      struct hvm_irq *hvm_irq = hvm_domain_irq(d);
>      union vioapic_redir_entry *ent;
>      unsigned int i;
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 4e3861eb7d..ae737403f3 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -459,13 +459,10 @@ void vlapic_EOI_set(struct vlapic *vlapic)
> 
>  void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
>  {
> -    struct vcpu *v = vlapic_vcpu(vlapic);
> -    struct domain *d = v->domain;
> -
>      if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
> -        vioapic_update_EOI(d, vector);
> +        vioapic_update_EOI(vector);
> 
> -    hvm_dpci_msi_eoi(d, vector);
> +    hvm_dpci_msi_eoi(vector);
>  }
> 
>  static bool_t is_multicast_dest(struct vlapic *vlapic, unsigned int short_hand,
> diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
> index 6b1305a3e5..54f3e7b540 100644
> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -874,8 +874,10 @@ static int _hvm_dpci_msi_eoi(struct domain *d,
>      return 0;
>  }
> 
> -void hvm_dpci_msi_eoi(struct domain *d, int vector)
> +void hvm_dpci_msi_eoi(unsigned int vector)
>  {
> +    struct domain *d = current->domain;
> +
>      if ( !is_iommu_enabled(d) ||
>           (!hvm_domain_irq(d)->dpci && !is_hardware_domain(d)) )
>         return;
> diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
> index 558426b772..adec0f566a 100644
> --- a/xen/include/asm-x86/hvm/io.h
> +++ b/xen/include/asm-x86/hvm/io.h
> @@ -159,7 +159,7 @@ struct hvm_hw_stdvga {
>  void stdvga_init(struct domain *d);
>  void stdvga_deinit(struct domain *d);
> 
> -extern void hvm_dpci_msi_eoi(struct domain *d, int vector);
> +extern void hvm_dpci_msi_eoi(unsigned int vector);
> 
>  /* Decode a PCI port IO access into a bus/slot/func/reg. */
>  unsigned int hvm_pci_decode_addr(unsigned int cf8, unsigned int addr,
> diff --git a/xen/include/asm-x86/hvm/vioapic.h b/xen/include/asm-x86/hvm/vioapic.h
> index d6f4e12d54..fd602f8830 100644
> --- a/xen/include/asm-x86/hvm/vioapic.h
> +++ b/xen/include/asm-x86/hvm/vioapic.h
> @@ -64,7 +64,7 @@ int vioapic_init(struct domain *d);
>  void vioapic_deinit(struct domain *d);
>  void vioapic_reset(struct domain *d);
>  void vioapic_irq_positive_edge(struct domain *d, unsigned int irq);
> -void vioapic_update_EOI(struct domain *d, u8 vector);
> +void vioapic_update_EOI(unsigned int vector);
> 
>  int vioapic_get_mask(const struct domain *d, unsigned int gsi);
>  int vioapic_get_vector(const struct domain *d, unsigned int gsi);
> --
> 2.28.0






* RE: [PATCH v2 02/11] x86/hvm: drop domain parameter from vioapic/vpic EOI callbacks
  2020-09-30 10:40 ` [PATCH v2 02/11] x86/hvm: drop domain parameter from vioapic/vpic " Roger Pau Monne
@ 2020-09-30 11:33   ` Paul Durrant
  2020-10-02  9:02   ` Jan Beulich
  1 sibling, 0 replies; 40+ messages in thread
From: Paul Durrant @ 2020-09-30 11:33 UTC (permalink / raw)
  To: 'Roger Pau Monne', xen-devel
  Cc: 'Jan Beulich', 'Andrew Cooper', 'Wei Liu'

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 30 September 2020 11:41
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>; Paul Durrant <paul@xen.org>
> Subject: [PATCH v2 02/11] x86/hvm: drop domain parameter from vioapic/vpic EOI callbacks
> 
> EOIs are always executed in guest vCPU context, so there's no reason to
> pass a domain parameter around as can be fetched from current->domain.
> 
> No functional change intended.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v1:
>  - New in this version.

You could even squash this with the previous patch I think...

Reviewed-by: Paul Durrant <paul@xen.org>

> ---
>  xen/arch/x86/hvm/vioapic.c   | 2 +-
>  xen/arch/x86/hvm/vpic.c      | 3 +--
>  xen/drivers/passthrough/io.c | 4 ++--
>  xen/include/asm-x86/hvm/io.h | 3 +--
>  4 files changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> index 0fb9147d99..752fc410db 100644
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -522,7 +522,7 @@ void vioapic_update_EOI(unsigned int vector)
>              if ( is_iommu_enabled(d) )
>              {
>                  spin_unlock(&d->arch.hvm.irq_lock);
> -                hvm_dpci_eoi(d, vioapic->base_gsi + pin, ent);
> +                hvm_dpci_eoi(vioapic->base_gsi + pin, ent);
>                  spin_lock(&d->arch.hvm.irq_lock);
>              }
> 
> diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
> index 3cf12581e9..26f74f4471 100644
> --- a/xen/arch/x86/hvm/vpic.c
> +++ b/xen/arch/x86/hvm/vpic.c
> @@ -262,8 +262,7 @@ static void vpic_ioport_write(
>                  /* Release lock and EOI the physical interrupt (if any). */
>                  vpic_update_int_output(vpic);
>                  vpic_unlock(vpic);
> -                hvm_dpci_eoi(current->domain,
> -                             hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : pin),
> +                hvm_dpci_eoi(hvm_isa_irq_to_gsi((addr >> 7) ? (pin | 8) : pin),
>                               NULL);
>                  return; /* bail immediately */
>              case 6: /* Set Priority                */
> diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
> index 54f3e7b540..536e91ad76 100644
> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -1003,9 +1003,9 @@ static void hvm_gsi_eoi(struct domain *d, unsigned int gsi,
>      hvm_pirq_eoi(pirq, ent);
>  }
> 
> -void hvm_dpci_eoi(struct domain *d, unsigned int guest_gsi,
> -                  const union vioapic_redir_entry *ent)
> +void hvm_dpci_eoi(unsigned int guest_gsi, const union vioapic_redir_entry *ent)
>  {
> +    struct domain *d = current->domain;
>      const struct hvm_irq_dpci *hvm_irq_dpci;
>      const struct hvm_girq_dpci_mapping *girq;
> 
> diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
> index adec0f566a..b05f619435 100644
> --- a/xen/include/asm-x86/hvm/io.h
> +++ b/xen/include/asm-x86/hvm/io.h
> @@ -118,8 +118,7 @@ bool handle_mmio_with_translation(unsigned long gla, unsigned long gpfn,
>                                    struct npfec);
>  bool handle_pio(uint16_t port, unsigned int size, int dir);
>  void hvm_interrupt_post(struct vcpu *v, int vector, int type);
> -void hvm_dpci_eoi(struct domain *d, unsigned int guest_irq,
> -                  const union vioapic_redir_entry *ent);
> +void hvm_dpci_eoi(unsigned int guest_irq, const union vioapic_redir_entry *ent);
>  void msix_write_completion(struct vcpu *);
> 
>  #ifdef CONFIG_HVM
> --
> 2.28.0





* RE: [PATCH v2 03/11] x86/vlapic: introduce an EOI callback mechanism
  2020-09-30 10:41 ` [PATCH v2 03/11] x86/vlapic: introduce an EOI callback mechanism Roger Pau Monne
@ 2020-09-30 11:49   ` Paul Durrant
  2020-10-02  9:22     ` Jan Beulich
  2020-10-02  9:39   ` Jan Beulich
  1 sibling, 1 reply; 40+ messages in thread
From: Paul Durrant @ 2020-09-30 11:49 UTC (permalink / raw)
  To: 'Roger Pau Monne', xen-devel
  Cc: 'Jan Beulich', 'Andrew Cooper', 'Wei Liu'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Roger Pau Monne
> Sent: 30 September 2020 11:41
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>
> Subject: [PATCH v2 03/11] x86/vlapic: introduce an EOI callback mechanism
> 
> Add a new vlapic_set_irq_callback helper in order to inject a vector
> and set a callback to be executed when the guest performs the end of
> interrupt acknowledgment.
> 
> Such functionality will be used to migrate the current ad hoc handling
> done in vlapic_handle_EOI for the vectors that require some logic to
> be executed when the end of interrupt is performed.
> 
> No current users are migrated to use this new functionality yet, so
> not functional change expected as a result.

s/not/no

> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v1:
>  - Make vlapic_set_irq an inline function on the header.
>  - Clear the callback hook in vlapic_handle_EOI.
>  - Introduce a helper to set the callback without injecting a vector.
>  - Remove unneeded parentheses.
>  - Reduce callback table by 16.
>  - Use %pv to print domain/vcpu ID.
> ---
> RFC: should callbacks also be executed in vlapic_do_init (which is
> called by vlapic_reset). We would need to make sure ISR and IRR
> are cleared using some kind of test and clear atomic functionality to
> make this race free.
> ---
>  xen/arch/x86/hvm/vlapic.c        | 62 ++++++++++++++++++++++++++++++--
>  xen/include/asm-x86/hvm/vlapic.h | 18 +++++++++-
>  2 files changed, 77 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index ae737403f3..38c62a02e6 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -144,7 +144,32 @@ bool vlapic_test_irq(const struct vlapic *vlapic, uint8_t vec)
>      return vlapic_test_vector(vec, &vlapic->regs->data[APIC_IRR]);
>  }
> 
> -void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
> +void vlapic_set_callback(struct vlapic *vlapic, unsigned int vec,
> +                         vlapic_eoi_callback_t *callback, void *data)
> +{
> +    unsigned long flags;
> +    unsigned int index = vec - 16;
> +
> +    if ( !callback || vec < 16 || vec >= X86_NR_VECTORS )
> +    {
> +        ASSERT_UNREACHABLE();
> +        return;
> +    }
> +
> +    spin_lock_irqsave(&vlapic->callback_lock, flags);
> +    if ( vlapic->callbacks[index].callback &&
> +         vlapic->callbacks[index].callback != callback )
> +        printk(XENLOG_G_WARNING
> +               "%pv overriding vector %#x callback %ps (%p) with %ps (%p)\n",
> +               vlapic_vcpu(vlapic), vec, vlapic->callbacks[index].callback,
> +               vlapic->callbacks[index].callback, callback, callback);
> +    vlapic->callbacks[index].callback = callback;
> +    vlapic->callbacks[index].data = data;
> +    spin_unlock_irqrestore(&vlapic->callback_lock, flags);
> +}
> +
> +void vlapic_set_irq_callback(struct vlapic *vlapic, uint8_t vec, uint8_t trig,
> +                             vlapic_eoi_callback_t *callback, void *data)
>  {
>      struct vcpu *target = vlapic_vcpu(vlapic);
> 
> @@ -159,8 +184,12 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
>      else
>          vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]);
> 
> +    if ( callback )
> +        vlapic_set_callback(vlapic, vec, callback, data);
> +

Can this not happen several times before an EOI? I.e. the vector could already be set in IRR, right?

  Paul

>      if ( hvm_funcs.update_eoi_exit_bitmap )
> -        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig);
> +        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec,
> +                          trig || callback);
> 
>      if ( hvm_funcs.deliver_posted_intr )
>          alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec);
> @@ -459,10 +488,24 @@ void vlapic_EOI_set(struct vlapic *vlapic)
> 
>  void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
>  {
> +    vlapic_eoi_callback_t *callback;
> +    void *data;
> +    unsigned long flags;
> +    unsigned int index = vector - 16;
> +
>      if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
>          vioapic_update_EOI(vector);
> 
>      hvm_dpci_msi_eoi(vector);
> +
> +    spin_lock_irqsave(&vlapic->callback_lock, flags);
> +    callback = vlapic->callbacks[index].callback;
> +    vlapic->callbacks[index].callback = NULL;
> +    data = vlapic->callbacks[index].data;
> +    spin_unlock_irqrestore(&vlapic->callback_lock, flags);
> +
> +    if ( callback )
> +        callback(vector, data);
>  }
> 
>  static bool_t is_multicast_dest(struct vlapic *vlapic, unsigned int short_hand,
> @@ -1629,9 +1672,23 @@ int vlapic_init(struct vcpu *v)
>      }
>      clear_page(vlapic->regs);
> 
> +    if ( !vlapic->callbacks )
> +    {
> +        vlapic->callbacks = xmalloc_array(typeof(*vlapic->callbacks),
> +                                          X86_NR_VECTORS - 16);
> +        if ( !vlapic->callbacks )
> +        {
> +            dprintk(XENLOG_ERR, "%pv: alloc vlapic callbacks error\n", v);
> +            return -ENOMEM;
> +        }
> +    }
> +    memset(vlapic->callbacks, 0, sizeof(*vlapic->callbacks) *
> +                                 (X86_NR_VECTORS - 16));
> +
>      vlapic_reset(vlapic);
> 
>      spin_lock_init(&vlapic->esr_lock);
> +    spin_lock_init(&vlapic->callback_lock);
> 
>      tasklet_init(&vlapic->init_sipi.tasklet, vlapic_init_sipi_action, v);
> 
> @@ -1653,6 +1710,7 @@ void vlapic_destroy(struct vcpu *v)
>      destroy_periodic_time(&vlapic->pt);
>      unmap_domain_page_global(vlapic->regs);
>      free_domheap_page(vlapic->regs_page);
> +    XFREE(vlapic->callbacks);
>  }
> 
>  /*
> diff --git a/xen/include/asm-x86/hvm/vlapic.h b/xen/include/asm-x86/hvm/vlapic.h
> index 8f908928c3..c380127a71 100644
> --- a/xen/include/asm-x86/hvm/vlapic.h
> +++ b/xen/include/asm-x86/hvm/vlapic.h
> @@ -73,6 +73,8 @@
>  #define vlapic_clear_vector(vec, bitmap)                                \
>      clear_bit(VEC_POS(vec), (uint32_t *)((bitmap) + REG_POS(vec)))
> 
> +typedef void vlapic_eoi_callback_t(unsigned int vector, void *data);
> +
>  struct vlapic {
>      struct hvm_hw_lapic      hw;
>      struct hvm_hw_lapic_regs *regs;
> @@ -89,6 +91,11 @@ struct vlapic {
>          uint32_t             icr, dest;
>          struct tasklet       tasklet;
>      } init_sipi;
> +    struct {
> +        vlapic_eoi_callback_t *callback;
> +        void                 *data;
> +    } *callbacks;
> +    spinlock_t               callback_lock;
>  };
> 
>  /* vlapic's frequence is 100 MHz */
> @@ -111,7 +118,16 @@ void vlapic_reg_write(struct vcpu *v, unsigned int reg, uint32_t val);
>  bool_t is_vlapic_lvtpc_enabled(struct vlapic *vlapic);
> 
>  bool vlapic_test_irq(const struct vlapic *vlapic, uint8_t vec);
> -void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig);
> +void vlapic_set_callback(struct vlapic *vlapic, unsigned int vec,
> +                         vlapic_eoi_callback_t *callback, void *data);
> +void vlapic_set_irq_callback(struct vlapic *vlapic, uint8_t vec, uint8_t trig,
> +                             vlapic_eoi_callback_t *callback, void *data);
> +
> +static inline void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec,
> +                                  uint8_t trig)
> +{
> +    vlapic_set_irq_callback(vlapic, vec, trig, NULL, NULL);
> +}
> 
>  int vlapic_has_pending_irq(struct vcpu *v);
>  int vlapic_ack_pending_irq(struct vcpu *v, int vector, bool_t force_ack);
> --
> 2.28.0
> 





* RE: [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks
  2020-09-30 10:41 ` [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks Roger Pau Monne
@ 2020-09-30 11:57   ` Paul Durrant
  2020-09-30 13:37     ` Roger Pau Monné
  2020-10-02 15:25   ` Jan Beulich
  1 sibling, 1 reply; 40+ messages in thread
From: Paul Durrant @ 2020-09-30 11:57 UTC (permalink / raw)
  To: 'Roger Pau Monne', xen-devel
  Cc: 'Jan Beulich', 'Andrew Cooper', 'Wei Liu'

> -----Original Message-----
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: 30 September 2020 11:41
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>; Paul Durrant <paul@xen.org>
> Subject: [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks
> 
> Remove the unconditional call to hvm_dpci_msi_eoi in vlapic_handle_EOI
> and instead use the newly introduced EOI callback mechanism in order
> to register a callback for MSI vectors injected from passed through
> devices.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
>  xen/arch/x86/hvm/vlapic.c    |  2 --
>  xen/arch/x86/hvm/vmsi.c      | 36 ++++++++++++++++++++++--------------
>  xen/drivers/passthrough/io.c |  2 +-
>  xen/include/asm-x86/hvm/io.h |  2 +-
>  4 files changed, 24 insertions(+), 18 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 38c62a02e6..8a18b33428 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -496,8 +496,6 @@ void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
>      if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
>          vioapic_update_EOI(vector);
> 
> -    hvm_dpci_msi_eoi(vector);
> -
>      spin_lock_irqsave(&vlapic->callback_lock, flags);
>      callback = vlapic->callbacks[index].callback;
>      vlapic->callbacks[index].callback = NULL;
> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
> index 7ca19353ab..e192c4c6da 100644
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -44,11 +44,9 @@
>  #include <asm/event.h>
>  #include <asm/io_apic.h>
> 
> -static void vmsi_inj_irq(
> -    struct vlapic *target,
> -    uint8_t vector,
> -    uint8_t trig_mode,
> -    uint8_t delivery_mode)
> +static void vmsi_inj_irq(struct vlapic *target, uint8_t vector,
> +                         uint8_t trig_mode, uint8_t delivery_mode,
> +                         vlapic_eoi_callback_t *callback, void *data)
>  {
>      HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "vmsi_inj_irq: vec %02x trig %d dm %d\n",
>                  vector, trig_mode, delivery_mode);
> @@ -57,17 +55,17 @@ static void vmsi_inj_irq(
>      {
>      case dest_Fixed:
>      case dest_LowestPrio:
> -        vlapic_set_irq(target, vector, trig_mode);
> +        vlapic_set_irq_callback(target, vector, trig_mode, callback, data);
>          break;
>      default:
>          BUG();
>      }
>  }
> 
> -int vmsi_deliver(
> -    struct domain *d, int vector,
> -    uint8_t dest, uint8_t dest_mode,
> -    uint8_t delivery_mode, uint8_t trig_mode)
> +static int vmsi_deliver_callback(struct domain *d, int vector, uint8_t dest,
> +                                 uint8_t dest_mode, uint8_t delivery_mode,
> +                                 uint8_t trig_mode,
> +                                 vlapic_eoi_callback_t *callback, void *data)
>  {
>      struct vlapic *target;
>      struct vcpu *v;
> @@ -78,7 +76,8 @@ int vmsi_deliver(
>          target = vlapic_lowest_prio(d, NULL, 0, dest, dest_mode);
>          if ( target != NULL )
>          {
> -            vmsi_inj_irq(target, vector, trig_mode, delivery_mode);
> +            vmsi_inj_irq(target, vector, trig_mode, delivery_mode, callback,
> +                         data);
>              break;
>          }
>          HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "null MSI round robin: vector=%02x\n",
> @@ -89,8 +88,8 @@ int vmsi_deliver(
>          for_each_vcpu ( d, v )
>              if ( vlapic_match_dest(vcpu_vlapic(v), NULL,
>                                     0, dest, dest_mode) )
> -                vmsi_inj_irq(vcpu_vlapic(v), vector,
> -                             trig_mode, delivery_mode);
> +                vmsi_inj_irq(vcpu_vlapic(v), vector, trig_mode, delivery_mode,
> +                             callback, data);
>          break;
> 
>      default:
> @@ -103,6 +102,14 @@ int vmsi_deliver(
>      return 0;
>  }
> 
> +

Stray blank line

> +int vmsi_deliver(struct domain *d, int vector, uint8_t dest, uint8_t dest_mode,
> +                 uint8_t delivery_mode, uint8_t trig_mode)
> +{
> +    return vmsi_deliver_callback(d, vector, dest, dest_mode, delivery_mode,
> +                                 trig_mode, NULL, NULL);
> +}
> +
>  void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
>  {
>      uint32_t flags = pirq_dpci->gmsi.gflags;
> @@ -119,7 +126,8 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
> 
>      ASSERT(pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI);
> 
> -    vmsi_deliver(d, vector, dest, dest_mode, delivery_mode, trig_mode);
> +    vmsi_deliver_callback(d, vector, dest, dest_mode, delivery_mode, trig_mode,
> +                          hvm_dpci_msi_eoi, NULL);

I think there are more efficiencies possible here. E.g. if is_hardware_domain(d) is true then hvm_dpci_msi_eoi() will do nothing, so no point in setting up the callback in that case.

  Paul

>  }
> 
>  /* Return value, -1 : multi-dests, non-negative value: dest_vcpu_id */
> diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
> index 536e91ad76..bff0f6628a 100644
> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -874,7 +874,7 @@ static int _hvm_dpci_msi_eoi(struct domain *d,
>      return 0;
>  }
> 
> -void hvm_dpci_msi_eoi(unsigned int vector)
> +void hvm_dpci_msi_eoi(unsigned int vector, void *data)
>  {
>      struct domain *d = current->domain;
> 
> diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
> index b05f619435..759ee486af 100644
> --- a/xen/include/asm-x86/hvm/io.h
> +++ b/xen/include/asm-x86/hvm/io.h
> @@ -158,7 +158,7 @@ struct hvm_hw_stdvga {
>  void stdvga_init(struct domain *d);
>  void stdvga_deinit(struct domain *d);
> 
> -extern void hvm_dpci_msi_eoi(unsigned int vector);
> +void hvm_dpci_msi_eoi(unsigned int vector, void *data);
> 
>  /* Decode a PCI port IO access into a bus/slot/func/reg. */
>  unsigned int hvm_pci_decode_addr(unsigned int cf8, unsigned int addr,
> --
> 2.28.0





* RE: [PATCH v2 05/11] x86/vioapic: switch to use the EOI callback mechanism
  2020-09-30 10:41 ` [PATCH v2 05/11] x86/vioapic: switch to use the EOI callback mechanism Roger Pau Monne
@ 2020-09-30 12:09   ` Paul Durrant
  2020-09-30 13:29     ` Roger Pau Monné
  2020-10-22 16:12   ` Jan Beulich
  1 sibling, 1 reply; 40+ messages in thread
From: Paul Durrant @ 2020-09-30 12:09 UTC (permalink / raw)
  To: 'Roger Pau Monne', xen-devel
  Cc: 'Jan Beulich', 'Andrew Cooper', 'Wei Liu'

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Roger Pau Monne
> Sent: 30 September 2020 11:41
> To: xen-devel@lists.xenproject.org
> Cc: Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>
> Subject: [PATCH v2 05/11] x86/vioapic: switch to use the EOI callback mechanism
> 
> Switch the emulated IO-APIC code to use the local APIC EOI callback
> mechanism. This allows to remove the last hardcoded callback from
> vlapic_handle_EOI. Removing the hardcoded vIO-APIC callback also
> allows to getting rid of setting the EOI exit bitmap based on the
> triggering mode, as now all users that require an EOI action use the
> newly introduced callback mechanism.
> 
> Move and rename the vioapic_update_EOI now that it can be made static.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v1:
>  - Remove the triggering check in the update_eoi_exit_bitmap call.
>  - Register the vlapic callbacks when loading the vIO-APIC state.
>  - Reduce scope of ent.
> ---
>  xen/arch/x86/hvm/vioapic.c | 131 ++++++++++++++++++++++++-------------
>  xen/arch/x86/hvm/vlapic.c  |   5 +-
>  2 files changed, 86 insertions(+), 50 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> index 752fc410db..dce98b1479 100644
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -375,6 +375,50 @@ static const struct hvm_mmio_ops vioapic_mmio_ops = {
>      .write = vioapic_write
>  };
> 
> +static void eoi_callback(unsigned int vector, void *data)
> +{
> +    struct domain *d = current->domain;
> +    struct hvm_irq *hvm_irq = hvm_domain_irq(d);
> +    unsigned int i;
> +
> +    ASSERT(has_vioapic(d));
> +
> +    spin_lock(&d->arch.hvm.irq_lock);
> +
> +    for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
> +    {
> +        struct hvm_vioapic *vioapic = domain_vioapic(d, i);
> +        unsigned int pin;
> +
> +        for ( pin = 0; pin < vioapic->nr_pins; pin++ )
> +        {
> +            union vioapic_redir_entry *ent = &vioapic->redirtbl[pin];
> +
> +            if ( ent->fields.vector != vector )
> +                continue;
> +
> +            ent->fields.remote_irr = 0;
> +
> +            if ( is_iommu_enabled(d) )
> +            {
> +                spin_unlock(&d->arch.hvm.irq_lock);
> +                hvm_dpci_eoi(vioapic->base_gsi + pin, ent);
> +                spin_lock(&d->arch.hvm.irq_lock);

Is this safe? If so, why lock around the whole loop in the first place?

  Paul

> +            }
> +
> +            if ( (ent->fields.trig_mode == VIOAPIC_LEVEL_TRIG) &&
> +                 !ent->fields.mask &&
> +                 hvm_irq->gsi_assert_count[vioapic->base_gsi + pin] )
> +            {
> +                ent->fields.remote_irr = 1;
> +                vioapic_deliver(vioapic, pin);
> +            }
> +        }
> +    }
> +
> +    spin_unlock(&d->arch.hvm.irq_lock);
> +}
> +
>  static void ioapic_inj_irq(
>      struct hvm_vioapic *vioapic,
>      struct vlapic *target,
> @@ -388,7 +432,8 @@ static void ioapic_inj_irq(
>      ASSERT((delivery_mode == dest_Fixed) ||
>             (delivery_mode == dest_LowestPrio));
> 
> -    vlapic_set_irq(target, vector, trig_mode);
> +    vlapic_set_irq_callback(target, vector, trig_mode,
> +                            trig_mode ? eoi_callback : NULL, NULL);
>  }
> 
>  static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
> @@ -495,50 +540,6 @@ void vioapic_irq_positive_edge(struct domain *d, unsigned int irq)
>      }
>  }
> 
> -void vioapic_update_EOI(unsigned int vector)
> -{
> -    struct domain *d = current->domain;
> -    struct hvm_irq *hvm_irq = hvm_domain_irq(d);
> -    union vioapic_redir_entry *ent;
> -    unsigned int i;
> -
> -    ASSERT(has_vioapic(d));
> -
> -    spin_lock(&d->arch.hvm.irq_lock);
> -
> -    for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
> -    {
> -        struct hvm_vioapic *vioapic = domain_vioapic(d, i);
> -        unsigned int pin;
> -
> -        for ( pin = 0; pin < vioapic->nr_pins; pin++ )
> -        {
> -            ent = &vioapic->redirtbl[pin];
> -            if ( ent->fields.vector != vector )
> -                continue;
> -
> -            ent->fields.remote_irr = 0;
> -
> -            if ( is_iommu_enabled(d) )
> -            {
> -                spin_unlock(&d->arch.hvm.irq_lock);
> -                hvm_dpci_eoi(vioapic->base_gsi + pin, ent);
> -                spin_lock(&d->arch.hvm.irq_lock);
> -            }
> -
> -            if ( (ent->fields.trig_mode == VIOAPIC_LEVEL_TRIG) &&
> -                 !ent->fields.mask &&
> -                 hvm_irq->gsi_assert_count[vioapic->base_gsi + pin] )
> -            {
> -                ent->fields.remote_irr = 1;
> -                vioapic_deliver(vioapic, pin);
> -            }
> -        }
> -    }
> -
> -    spin_unlock(&d->arch.hvm.irq_lock);
> -}
> -
>  int vioapic_get_mask(const struct domain *d, unsigned int gsi)
>  {
>      unsigned int pin = 0; /* See gsi_vioapic */
> @@ -592,6 +593,8 @@ static int ioapic_save(struct vcpu *v, hvm_domain_context_t *h)
>  static int ioapic_load(struct domain *d, hvm_domain_context_t *h)
>  {
>      struct hvm_vioapic *s;
> +    unsigned int i;
> +    int rc;
> 
>      if ( !has_vioapic(d) )
>          return -ENODEV;
> @@ -602,7 +605,43 @@ static int ioapic_load(struct domain *d, hvm_domain_context_t *h)
>           d->arch.hvm.nr_vioapics != 1 )
>          return -EOPNOTSUPP;
> 
> -    return hvm_load_entry(IOAPIC, h, &s->domU);
> +    rc = hvm_load_entry(IOAPIC, h, &s->domU);
> +    if ( rc )
> +        return rc;
> +
> +    for ( i = 0; i < ARRAY_SIZE(s->domU.redirtbl); i++ )
> +    {
> +        const union vioapic_redir_entry *ent = &s->domU.redirtbl[i];
> +        unsigned int vector = ent->fields.vector;
> +        unsigned int delivery_mode = ent->fields.delivery_mode;
> +        struct vcpu *v;
> +
> +        /*
> +         * Add a callback for each possible vector injected by a redirection
> +         * entry.
> +         */
> +        if ( vector < 16 || !ent->fields.remote_irr ||
> +             (delivery_mode != dest_LowestPrio && delivery_mode != dest_Fixed) )
> +            continue;
> +
> +        for_each_vcpu ( d, v )
> +        {
> +            struct vlapic *vlapic = vcpu_vlapic(v);
> +
> +            /*
> +             * NB: if the vlapic registers were restored before the vio-apic
> +             * ones we could test whether the vector is set in the vlapic IRR
> +             * or ISR registers before unconditionally setting the callback.
> +             * This is harmless as eoi_callback is capable of dealing with
> +             * spurious callbacks.
> +             */
> +            if ( vlapic_match_dest(vlapic, NULL, 0, ent->fields.dest_id,
> +                                   ent->fields.dest_mode) )
> +                vlapic_set_callback(vlapic, vector, eoi_callback, NULL);
> +        }
> +    }
> +
> +    return 0;
>  }
> 
>  HVM_REGISTER_SAVE_RESTORE(IOAPIC, ioapic_save, ioapic_load, 1, HVMSR_PER_DOM);
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 8a18b33428..35f12e0909 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -189,7 +189,7 @@ void vlapic_set_irq_callback(struct vlapic *vlapic, uint8_t vec, uint8_t trig,
> 
>      if ( hvm_funcs.update_eoi_exit_bitmap )
>          alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec,
> -                          trig || callback);
> +                          callback);
> 
>      if ( hvm_funcs.deliver_posted_intr )
>          alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec);
> @@ -493,9 +493,6 @@ void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
>      unsigned long flags;
>      unsigned int index = vector - 16;
> 
> -    if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
> -        vioapic_update_EOI(vector);
> -
>      spin_lock_irqsave(&vlapic->callback_lock, flags);
>      callback = vlapic->callbacks[index].callback;
>      vlapic->callbacks[index].callback = NULL;
> --
> 2.28.0
> 




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 05/11] x86/vioapic: switch to use the EOI callback mechanism
  2020-09-30 12:09   ` Paul Durrant
@ 2020-09-30 13:29     ` Roger Pau Monné
  0 siblings, 0 replies; 40+ messages in thread
From: Roger Pau Monné @ 2020-09-30 13:29 UTC (permalink / raw)
  To: paul
  Cc: xen-devel, 'Jan Beulich', 'Andrew Cooper',
	'Wei Liu'

On Wed, Sep 30, 2020 at 01:09:21PM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Roger Pau Monne
> > Sent: 30 September 2020 11:41
> > To: xen-devel@lists.xenproject.org
> > Cc: Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> > <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>
> > Subject: [PATCH v2 05/11] x86/vioapic: switch to use the EOI callback mechanism
> > 
> > Switch the emulated IO-APIC code to use the local APIC EOI callback
> > mechanism. This allows removing the last hardcoded callback from
> > vlapic_handle_EOI. Removing the hardcoded vIO-APIC callback also
> > allows getting rid of setting the EOI exit bitmap based on the
> > trigger mode, as now all users that require an EOI action use the
> > newly introduced callback mechanism.
> > 
> > Move and rename vioapic_update_EOI now that it can be made static.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Changes since v1:
> >  - Remove the triggering check in the update_eoi_exit_bitmap call.
> >  - Register the vlapic callbacks when loading the vIO-APIC state.
> >  - Reduce scope of ent.
> > ---
> >  xen/arch/x86/hvm/vioapic.c | 131 ++++++++++++++++++++++++-------------
> >  xen/arch/x86/hvm/vlapic.c  |   5 +-
> >  2 files changed, 86 insertions(+), 50 deletions(-)
> > 
> > diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> > index 752fc410db..dce98b1479 100644
> > --- a/xen/arch/x86/hvm/vioapic.c
> > +++ b/xen/arch/x86/hvm/vioapic.c
> > @@ -375,6 +375,50 @@ static const struct hvm_mmio_ops vioapic_mmio_ops = {
> >      .write = vioapic_write
> >  };
> > 
> > +static void eoi_callback(unsigned int vector, void *data)
> > +{
> > +    struct domain *d = current->domain;
> > +    struct hvm_irq *hvm_irq = hvm_domain_irq(d);
> > +    unsigned int i;
> > +
> > +    ASSERT(has_vioapic(d));
> > +
> > +    spin_lock(&d->arch.hvm.irq_lock);
> > +
> > +    for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
> > +    {
> > +        struct hvm_vioapic *vioapic = domain_vioapic(d, i);
> > +        unsigned int pin;
> > +
> > +        for ( pin = 0; pin < vioapic->nr_pins; pin++ )
> > +        {
> > +            union vioapic_redir_entry *ent = &vioapic->redirtbl[pin];
> > +
> > +            if ( ent->fields.vector != vector )
> > +                continue;
> > +
> > +            ent->fields.remote_irr = 0;
> > +
> > +            if ( is_iommu_enabled(d) )
> > +            {
> > +                spin_unlock(&d->arch.hvm.irq_lock);
> > +                hvm_dpci_eoi(vioapic->base_gsi + pin, ent);
> > +                spin_lock(&d->arch.hvm.irq_lock);
> 
> Is this safe? If so, why lock around the whole loop in the first place?

It's my understanding that locking around the whole loop is mostly
done for convenience, as vioapic entries cannot go away after
initialization.

The lock here is taken to ensure consistency of the contents of the
vioapic_redir_entry, so that concurrent reads/writes don't change the
entry while it is being processed here, but the entry itself can never
disappear under our feet.

Jan expressed similar concerns on the previous version, but I'm afraid
I didn't look much at whether hvm_dpci_eoi could be called with the
lock taken, or whether we could move the call of the EOI hooks out of
the locked region. I was mostly leaving this part for later, since
the series is already fairly long and this didn't seem critical.

Thanks, Roger.



* Re: [PATCH v2 11/11] x86/vpt: introduce a per-vPT lock
  2020-09-30 10:41 ` [PATCH v2 11/11] x86/vpt: introduce a per-vPT lock Roger Pau Monne
@ 2020-09-30 13:30   ` Roger Pau Monné
  2020-10-23 15:42     ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Roger Pau Monné @ 2020-09-30 13:30 UTC (permalink / raw)
  To: xen-devel; +Cc: Jan Beulich, Andrew Cooper, Wei Liu

On Wed, Sep 30, 2020 at 12:41:08PM +0200, Roger Pau Monne wrote:
> Introduce a per virtual timer lock that replaces the existing per-vCPU
> and per-domain vPT locks. Since virtual timers are no longer assigned
> or migrated between vCPUs, the locking can be simplified to an
> in-structure spinlock that protects all the fields.
> 
> This requires introducing a helper to initialize the spinlock, and
> that could be used to initialize other virtual timer fields in the
> future.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Just realized I had the following uncommitted chunk that should have
been part of this patch. Nothing critical, but the tm_lock can now be
removed.

---8<---
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7cb4511b60..7daca3f85c 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1554,8 +1554,6 @@ int hvm_vcpu_initialise(struct vcpu *v)
 
     hvm_asid_flush_vcpu(v);
 
-    spin_lock_init(&v->arch.hvm.tm_lock);
-
     rc = hvm_vcpu_cacheattr_init(v); /* teardown: vcpu_cacheattr_destroy */
     if ( rc != 0 )
         goto fail1;
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 07a9890ed3..6e3bdef5bc 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -166,9 +166,6 @@ struct hvm_vcpu {
     s64                 cache_tsc_offset;
     u64                 guest_time;
 
-    /* Lock for virtual platform timers. */
-    spinlock_t          tm_lock;
-
     bool                flag_dr_dirty;
     bool                debug_state_latch;
     bool                single_step;




* Re: [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks
  2020-09-30 11:57   ` Paul Durrant
@ 2020-09-30 13:37     ` Roger Pau Monné
  0 siblings, 0 replies; 40+ messages in thread
From: Roger Pau Monné @ 2020-09-30 13:37 UTC (permalink / raw)
  To: paul
  Cc: xen-devel, 'Jan Beulich', 'Andrew Cooper',
	'Wei Liu'

On Wed, Sep 30, 2020 at 12:57:31PM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Roger Pau Monne <roger.pau@citrix.com>
> > Sent: 30 September 2020 11:41
> > To: xen-devel@lists.xenproject.org
> > Cc: Roger Pau Monne <roger.pau@citrix.com>; Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> > <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>; Paul Durrant <paul@xen.org>
> > Subject: [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks
> > 
> > Remove the unconditional call to hvm_dpci_msi_eoi in vlapic_handle_EOI
> > and instead use the newly introduced EOI callback mechanism in order
> > to register a callback for MSI vectors injected from passed through
> > devices.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> >  xen/arch/x86/hvm/vlapic.c    |  2 --
> >  xen/arch/x86/hvm/vmsi.c      | 36 ++++++++++++++++++++++--------------
> >  xen/drivers/passthrough/io.c |  2 +-
> >  xen/include/asm-x86/hvm/io.h |  2 +-
> >  4 files changed, 24 insertions(+), 18 deletions(-)
> > 
> > diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> > index 38c62a02e6..8a18b33428 100644
> > --- a/xen/arch/x86/hvm/vlapic.c
> > +++ b/xen/arch/x86/hvm/vlapic.c
> > @@ -496,8 +496,6 @@ void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
> >      if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
> >          vioapic_update_EOI(vector);
> > 
> > -    hvm_dpci_msi_eoi(vector);
> > -
> >      spin_lock_irqsave(&vlapic->callback_lock, flags);
> >      callback = vlapic->callbacks[index].callback;
> >      vlapic->callbacks[index].callback = NULL;
> > diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
> > index 7ca19353ab..e192c4c6da 100644
> > --- a/xen/arch/x86/hvm/vmsi.c
> > +++ b/xen/arch/x86/hvm/vmsi.c
> > @@ -44,11 +44,9 @@
> >  #include <asm/event.h>
> >  #include <asm/io_apic.h>
> > 
> > -static void vmsi_inj_irq(
> > -    struct vlapic *target,
> > -    uint8_t vector,
> > -    uint8_t trig_mode,
> > -    uint8_t delivery_mode)
> > +static void vmsi_inj_irq(struct vlapic *target, uint8_t vector,
> > +                         uint8_t trig_mode, uint8_t delivery_mode,
> > +                         vlapic_eoi_callback_t *callback, void *data)
> >  {
> >      HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "vmsi_inj_irq: vec %02x trig %d dm %d\n",
> >                  vector, trig_mode, delivery_mode);
> > @@ -57,17 +55,17 @@ static void vmsi_inj_irq(
> >      {
> >      case dest_Fixed:
> >      case dest_LowestPrio:
> > -        vlapic_set_irq(target, vector, trig_mode);
> > +        vlapic_set_irq_callback(target, vector, trig_mode, callback, data);
> >          break;
> >      default:
> >          BUG();
> >      }
> >  }
> > 
> > -int vmsi_deliver(
> > -    struct domain *d, int vector,
> > -    uint8_t dest, uint8_t dest_mode,
> > -    uint8_t delivery_mode, uint8_t trig_mode)
> > +static int vmsi_deliver_callback(struct domain *d, int vector, uint8_t dest,
> > +                                 uint8_t dest_mode, uint8_t delivery_mode,
> > +                                 uint8_t trig_mode,
> > +                                 vlapic_eoi_callback_t *callback, void *data)
> >  {
> >      struct vlapic *target;
> >      struct vcpu *v;
> > @@ -78,7 +76,8 @@ int vmsi_deliver(
> >          target = vlapic_lowest_prio(d, NULL, 0, dest, dest_mode);
> >          if ( target != NULL )
> >          {
> > -            vmsi_inj_irq(target, vector, trig_mode, delivery_mode);
> > +            vmsi_inj_irq(target, vector, trig_mode, delivery_mode, callback,
> > +                         data);
> >              break;
> >          }
> >          HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "null MSI round robin: vector=%02x\n",
> > @@ -89,8 +88,8 @@ int vmsi_deliver(
> >          for_each_vcpu ( d, v )
> >              if ( vlapic_match_dest(vcpu_vlapic(v), NULL,
> >                                     0, dest, dest_mode) )
> > -                vmsi_inj_irq(vcpu_vlapic(v), vector,
> > -                             trig_mode, delivery_mode);
> > +                vmsi_inj_irq(vcpu_vlapic(v), vector, trig_mode, delivery_mode,
> > +                             callback, data);
> >          break;
> > 
> >      default:
> > @@ -103,6 +102,14 @@ int vmsi_deliver(
> >      return 0;
> >  }
> > 
> > +
> 
> Stray blank line
> 
> > +int vmsi_deliver(struct domain *d, int vector, uint8_t dest, uint8_t dest_mode,
> > +                 uint8_t delivery_mode, uint8_t trig_mode)
> > +{
> > +    return vmsi_deliver_callback(d, vector, dest, dest_mode, delivery_mode,
> > +                                 trig_mode, NULL, NULL);
> > +}
> > +
> >  void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
> >  {
> >      uint32_t flags = pirq_dpci->gmsi.gflags;
> > @@ -119,7 +126,8 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
> > 
> >      ASSERT(pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI);
> > 
> > -    vmsi_deliver(d, vector, dest, dest_mode, delivery_mode, trig_mode);
> > +    vmsi_deliver_callback(d, vector, dest, dest_mode, delivery_mode, trig_mode,
> > +                          hvm_dpci_msi_eoi, NULL);
> 
> I think there are more efficiencies possible here. E.g. if is_hardware_domain(d) is true then hvm_dpci_msi_eoi() will do nothing, so no point in setting up the callback in that case.

No, I don't think that's true: hvm_dpci_msi_eoi will execute the call
to pt_pirq_iterate and _hvm_dpci_msi_eoi for the hardware domain,
because MSI vectors need to be acked on the physical lapic even in
the hardware domain case. This was fixed by commit 7b653a245ff.

Thanks, Roger.



* Re: [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks
  2020-09-30 10:40 ` [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks Roger Pau Monne
  2020-09-30 11:30   ` Paul Durrant
@ 2020-10-02  8:48   ` Jan Beulich
  2020-10-02  9:24     ` Durrant, Paul
  2020-10-13 14:08     ` Roger Pau Monné
  1 sibling, 2 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-02  8:48 UTC (permalink / raw)
  To: Roger Pau Monne, Wei Liu, Paul Durrant
  Cc: xen-devel, Andrew Cooper, Paul Durrant

On 30.09.2020 12:40, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -459,13 +459,10 @@ void vlapic_EOI_set(struct vlapic *vlapic)
>  
>  void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
>  {
> -    struct vcpu *v = vlapic_vcpu(vlapic);
> -    struct domain *d = v->domain;
> -
>      if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
> -        vioapic_update_EOI(d, vector);
> +        vioapic_update_EOI(vector);
>  
> -    hvm_dpci_msi_eoi(d, vector);
> +    hvm_dpci_msi_eoi(vector);
>  }

What about viridian_synic_wrmsr() -> vlapic_EOI_set() ->
vlapic_handle_EOI()? You'd probably have noticed this if you
had tried to (consistently) drop the respective parameters from
the intermediate functions as well.

Question of course is in how far viridian_synic_wrmsr() for
HV_X64_MSR_EOI makes much sense when v != current. Paul, Wei?

A secondary question of course is whether passing around the
pointers isn't really cheaper than the obtaining of 'current'.

Jan



* Re: [PATCH v2 02/11] x86/hvm: drop domain parameter from vioapic/vpic EOI callbacks
  2020-09-30 10:40 ` [PATCH v2 02/11] x86/hvm: drop domain parameter from vioapic/vpic " Roger Pau Monne
  2020-09-30 11:33   ` Paul Durrant
@ 2020-10-02  9:02   ` Jan Beulich
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-02  9:02 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Andrew Cooper, Wei Liu, Paul Durrant

On 30.09.2020 12:40, Roger Pau Monne wrote:
> EOIs are always executed in guest vCPU context, so there's no reason to
> pass a domain parameter around as can be fetched from current->domain.

FAOD whether this is correct depends on what adjustments get made
to patch 1.

Jan



* Re: [PATCH v2 03/11] x86/vlapic: introduce an EOI callback mechanism
  2020-09-30 11:49   ` Paul Durrant
@ 2020-10-02  9:22     ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-02  9:22 UTC (permalink / raw)
  To: paul
  Cc: 'Roger Pau Monne', xen-devel, 'Andrew Cooper',
	'Wei Liu'

On 30.09.2020 13:49, Paul Durrant wrote:
>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Roger Pau Monne
>> Sent: 30 September 2020 11:41
>>
>> @@ -159,8 +184,12 @@ void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
>>      else
>>          vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]);
>>
>> +    if ( callback )
>> +        vlapic_set_callback(vlapic, vec, callback, data);
>> +
> 
> Can this not happen several times before an EOI? I.e. the vector could
> already be set in IRR, right?

Yes, but I take it the assumption is that it'll always be the same
callback that ought to get set here. Hence the warning printk() in
that function in case it isn't.

What I wonder while looking at this function is whether the TMR
handling is correct. The SDM says "Upon acceptance of an interrupt
into the IRR, ..." which I read as "when the IRR bit transitions
from 0 to 1" (but I can see room for reading this differently).

Jan



* RE: [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks
  2020-10-02  8:48   ` Jan Beulich
@ 2020-10-02  9:24     ` Durrant, Paul
  2020-10-02 10:54       ` Wei Liu
  2020-10-13 14:08     ` Roger Pau Monné
  1 sibling, 1 reply; 40+ messages in thread
From: Durrant, Paul @ 2020-10-02  9:24 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monne, Wei Liu, Paul Durrant
  Cc: xen-devel, Andrew Cooper

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 02 October 2020 09:48
> To: Roger Pau Monne <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; Paul Durrant <paul@xen.org>
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <andrew.cooper3@citrix.com>; Durrant, Paul
> <pdurrant@amazon.co.uk>
> Subject: RE: [EXTERNAL] [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks
> 
> On 30.09.2020 12:40, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/vlapic.c
> > +++ b/xen/arch/x86/hvm/vlapic.c
> > @@ -459,13 +459,10 @@ void vlapic_EOI_set(struct vlapic *vlapic)
> >
> >  void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
> >  {
> > -    struct vcpu *v = vlapic_vcpu(vlapic);
> > -    struct domain *d = v->domain;
> > -
> >      if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
> > -        vioapic_update_EOI(d, vector);
> > +        vioapic_update_EOI(vector);
> >
> > -    hvm_dpci_msi_eoi(d, vector);
> > +    hvm_dpci_msi_eoi(vector);
> >  }
> 
> What about viridian_synic_wrmsr() -> vlapic_EOI_set() ->
> vlapic_handle_EOI()? You'd probably have noticed this if you
> had tried to (consistently) drop the respective parameters from
> the intermediate functions as well.
> 
> Question of course is in how far viridian_synic_wrmsr() for
> HV_X64_MSR_EOI makes much sense when v != current. Paul, Wei?
> 

I don't think it makes any sense. I think it would be fine to only do it if v == current.

  Paul

> A secondary question of course is whether passing around the
> pointers isn't really cheaper than the obtaining of 'current'.
> 
> Jan


* Re: [PATCH v2 03/11] x86/vlapic: introduce an EOI callback mechanism
  2020-09-30 10:41 ` [PATCH v2 03/11] x86/vlapic: introduce an EOI callback mechanism Roger Pau Monne
  2020-09-30 11:49   ` Paul Durrant
@ 2020-10-02  9:39   ` Jan Beulich
  2020-10-13 14:30     ` Roger Pau Monné
  1 sibling, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2020-10-02  9:39 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 30.09.2020 12:41, Roger Pau Monne wrote:
> Add a new vlapic_set_irq_callback helper in order to inject a vector
> and set a callback to be executed when the guest performs the end of
> interrupt acknowledgment.

On v1 I did ask

"One thing I don't understand at all for now is how these
 callbacks are going to be re-instated after migration for
 not-yet-EOIed interrupts."

Afaics I didn't get an answer on this.

> ---
> RFC: should callbacks also be executed in vlapic_do_init (which is
> called by vlapic_reset). We would need to make sure ISR and IRR
> are cleared using some kind of test and clear atomic functionality to
> make this race free.

I guess this can't be decided at this point of the series, as it
may depend on what exactly the callbacks mean to do. It may even
be that whether a callback wants to do something depends on
whether it gets called "normally" or from vlapic_do_init().

> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -144,7 +144,32 @@ bool vlapic_test_irq(const struct vlapic *vlapic, uint8_t vec)
>      return vlapic_test_vector(vec, &vlapic->regs->data[APIC_IRR]);
>  }
>  
> -void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
> +void vlapic_set_callback(struct vlapic *vlapic, unsigned int vec,
> +                         vlapic_eoi_callback_t *callback, void *data)
> +{
> +    unsigned long flags;
> +    unsigned int index = vec - 16;
> +
> +    if ( !callback || vec < 16 || vec >= X86_NR_VECTORS )
> +    {
> +        ASSERT_UNREACHABLE();
> +        return;
> +    }
> +
> +    spin_lock_irqsave(&vlapic->callback_lock, flags);
> +    if ( vlapic->callbacks[index].callback &&
> +         vlapic->callbacks[index].callback != callback )
> +        printk(XENLOG_G_WARNING
> +               "%pv overriding vector %#x callback %ps (%p) with %ps (%p)\n",
> +               vlapic_vcpu(vlapic), vec, vlapic->callbacks[index].callback,
> +               vlapic->callbacks[index].callback, callback, callback);
> +    vlapic->callbacks[index].callback = callback;
> +    vlapic->callbacks[index].data = data;

Should "data" perhaps also be compared in the override check above?

> @@ -1629,9 +1672,23 @@ int vlapic_init(struct vcpu *v)
>      }
>      clear_page(vlapic->regs);
>  
> +    if ( !vlapic->callbacks )
> +    {
> +        vlapic->callbacks = xmalloc_array(typeof(*vlapic->callbacks),
> +                                          X86_NR_VECTORS - 16);
> +        if ( !vlapic->callbacks )
> +        {
> +            dprintk(XENLOG_ERR, "%pv: alloc vlapic callbacks error\n", v);
> +            return -ENOMEM;
> +        }
> +    }
> +    memset(vlapic->callbacks, 0, sizeof(*vlapic->callbacks) *
> +                                 (X86_NR_VECTORS - 16));

While it resembles code earlier in this function, it widens an
existing memory leak (I'll make a patch for that one) and also
makes it appear as if this function could be called more than
once for a vCPU (maybe I'll also make a patch for this).

Jan



* Re: [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks
  2020-10-02  9:24     ` Durrant, Paul
@ 2020-10-02 10:54       ` Wei Liu
  0 siblings, 0 replies; 40+ messages in thread
From: Wei Liu @ 2020-10-02 10:54 UTC (permalink / raw)
  To: Durrant, Paul
  Cc: Jan Beulich, Roger Pau Monne, Wei Liu, Paul Durrant, xen-devel,
	Andrew Cooper

On Fri, Oct 02, 2020 at 09:24:57AM +0000, Durrant, Paul wrote:
> > -----Original Message-----
> > From: Jan Beulich <jbeulich@suse.com>
> > Sent: 02 October 2020 09:48
> > To: Roger Pau Monne <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; Paul Durrant <paul@xen.org>
> > Cc: xen-devel@lists.xenproject.org; Andrew Cooper <andrew.cooper3@citrix.com>; Durrant, Paul
> > <pdurrant@amazon.co.uk>
> > Subject: RE: [EXTERNAL] [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks
> > 
> > On 30.09.2020 12:40, Roger Pau Monne wrote:
> > > --- a/xen/arch/x86/hvm/vlapic.c
> > > +++ b/xen/arch/x86/hvm/vlapic.c
> > > @@ -459,13 +459,10 @@ void vlapic_EOI_set(struct vlapic *vlapic)
> > >
> > >  void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
> > >  {
> > > -    struct vcpu *v = vlapic_vcpu(vlapic);
> > > -    struct domain *d = v->domain;
> > > -
> > >      if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
> > > -        vioapic_update_EOI(d, vector);
> > > +        vioapic_update_EOI(vector);
> > >
> > > -    hvm_dpci_msi_eoi(d, vector);
> > > +    hvm_dpci_msi_eoi(vector);
> > >  }
> > 
> > What about viridian_synic_wrmsr() -> vlapic_EOI_set() ->
> > vlapic_handle_EOI()? You'd probably have noticed this if you
> > had tried to (consistently) drop the respective parameters from
> > the intermediate functions as well.
> > 
> > Question of course is in how far viridian_synic_wrmsr() for
> > HV_X64_MSR_EOI makes much sense when v != current. Paul, Wei?
> > 
> 
> I don't think it makes any sense. I think it would be fine to only do it if v == current.

Yes, I agree.

Wei.



* Re: [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks
  2020-09-30 10:41 ` [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks Roger Pau Monne
  2020-09-30 11:57   ` Paul Durrant
@ 2020-10-02 15:25   ` Jan Beulich
  2020-10-13 14:47     ` Roger Pau Monné
  1 sibling, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2020-10-02 15:25 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Andrew Cooper, Wei Liu, Paul Durrant

On 30.09.2020 12:41, Roger Pau Monne wrote:
> Remove the unconditional call to hvm_dpci_msi_eoi in vlapic_handle_EOI
> and instead use the newly introduced EOI callback mechanism in order
> to register a callback for MSI vectors injected from passed through
> devices.

What I'm kind of missing here is a word on why this is an improvement:
After all ...

> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -496,8 +496,6 @@ void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
>      if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
>          vioapic_update_EOI(vector);
>  
> -    hvm_dpci_msi_eoi(vector);

... you're exchanging this direct call for a more complex model with
an indirect one (to the same function).

> @@ -119,7 +126,8 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
>  
>      ASSERT(pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI);
>  
> -    vmsi_deliver(d, vector, dest, dest_mode, delivery_mode, trig_mode);
> +    vmsi_deliver_callback(d, vector, dest, dest_mode, delivery_mode, trig_mode,
> +                          hvm_dpci_msi_eoi, NULL);
>  }

While I agree with your reply to Paul regarding Dom0, I still think
the entire if() in hvm_dpci_msi_eoi() should be converted into a
conditional here. There's no point registering the callback if it's
not going to do anything.

However, looking further, the "!hvm_domain_irq(d)->dpci &&
!is_hardware_domain(d)" can be simply dropped altogether, right away.
It's now fulfilled by the identical check at the top of
hvm_dirq_assist(), thus guarding the sole call site of this function.

The !is_iommu_enabled(d) is slightly more involved to prove, but it
should also be possible to simply drop. What might help here is a
separate change to suppress opening of HVM_DPCI_SOFTIRQ when there's
no IOMMU in the system, as then it becomes obvious that this part of
the condition is guaranteed by hvm_do_IRQ_dpci(), being the only
site where the softirq can get raised (apart from the softirq
handler itself).

To sum up - the call above can probably stay as is, but the callback
can be simplified as a result of the change.

> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -874,7 +874,7 @@ static int _hvm_dpci_msi_eoi(struct domain *d,
>      return 0;
>  }
>  
> -void hvm_dpci_msi_eoi(unsigned int vector)
> +void hvm_dpci_msi_eoi(unsigned int vector, void *data)
>  {
>      struct domain *d = current->domain;

Instead of passing NULL for data and latching d from current, how
about you make the registration pass d to more easily use here?

Jan


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks
  2020-10-02  8:48   ` Jan Beulich
  2020-10-02  9:24     ` Durrant, Paul
@ 2020-10-13 14:08     ` Roger Pau Monné
  2020-10-13 14:13       ` Jan Beulich
  1 sibling, 1 reply; 40+ messages in thread
From: Roger Pau Monné @ 2020-10-13 14:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Wei Liu, Paul Durrant, xen-devel, Andrew Cooper, Paul Durrant

On Fri, Oct 02, 2020 at 10:48:07AM +0200, Jan Beulich wrote:
> On 30.09.2020 12:40, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/vlapic.c
> > +++ b/xen/arch/x86/hvm/vlapic.c
> > @@ -459,13 +459,10 @@ void vlapic_EOI_set(struct vlapic *vlapic)
> >  
> >  void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
> >  {
> > -    struct vcpu *v = vlapic_vcpu(vlapic);
> > -    struct domain *d = v->domain;
> > -
> >      if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
> > -        vioapic_update_EOI(d, vector);
> > +        vioapic_update_EOI(vector);
> >  
> > -    hvm_dpci_msi_eoi(d, vector);
> > +    hvm_dpci_msi_eoi(vector);
> >  }
> 
> What about viridian_synic_wrmsr() -> vlapic_EOI_set() ->
> vlapic_handle_EOI()? You'd probably have noticed this if you
> had tried to (consistently) drop the respective parameters from
> the intermediate functions as well.
> 
> Question of course is in how far viridian_synic_wrmsr() for
> HV_X64_MSR_EOI makes much sense when v != current. Paul, Wei?

There's already an assert at the top of viridian_synic_wrmsr that v ==
current, which I assume is why I thought this change was fine. I can
purge the passing of v (current) further, but it wasn't really needed
for the rest of the series.

> A secondary question of course is whether passing around the
> pointers isn't really cheaper than the obtaining of 'current'.

Well, while there's indeed a performance aspect here, I think
using current is much clearer than passing a vcpu around. I could
rename the parameter to curr or some such, but I'd rather just
use current and drop the vcpu parameter.

Thanks, Roger.



* Re: [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks
  2020-10-13 14:08     ` Roger Pau Monné
@ 2020-10-13 14:13       ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-13 14:13 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Wei Liu, Paul Durrant, xen-devel, Andrew Cooper, Paul Durrant

On 13.10.2020 16:08, Roger Pau Monné wrote:
> On Fri, Oct 02, 2020 at 10:48:07AM +0200, Jan Beulich wrote:
>> On 30.09.2020 12:40, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/hvm/vlapic.c
>>> +++ b/xen/arch/x86/hvm/vlapic.c
>>> @@ -459,13 +459,10 @@ void vlapic_EOI_set(struct vlapic *vlapic)
>>>  
>>>  void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
>>>  {
>>> -    struct vcpu *v = vlapic_vcpu(vlapic);
>>> -    struct domain *d = v->domain;
>>> -
>>>      if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
>>> -        vioapic_update_EOI(d, vector);
>>> +        vioapic_update_EOI(vector);
>>>  
>>> -    hvm_dpci_msi_eoi(d, vector);
>>> +    hvm_dpci_msi_eoi(vector);
>>>  }
>>
>> What about viridian_synic_wrmsr() -> vlapic_EOI_set() ->
>> vlapic_handle_EOI()? You'd probably have noticed this if you
>> had tried to (consistently) drop the respective parameters from
>> the intermediate functions as well.
>>
>> Question of course is in how far viridian_synic_wrmsr() for
>> HV_X64_MSR_EOI makes much sense when v != current. Paul, Wei?
> 
> There's already an assert at the top of viridian_synic_wrmsr that v ==
> current, which I assume is why I thought this change was fine. I can
> purge the passing of v (current) further, but it wasn't really needed
> for the rest of the series.

To a large degree that's up to you. It's just that, as said, if
you had done so, you'd likely have noticed the issue, and hence
doing so here and elsewhere may provide reassurance that there's
no further similar case lurking anywhere.

>> A secondary question of course is whether passing around the
>> pointers isn't really cheaper than the obtaining of 'current'.
> 
> Well, while there's indeed a performance aspect here, I think
> using current is much clearer than passing a vcpu around. I could
> rename the parameter to curr or some such, but I'd rather just
> use current and drop the vcpu parameter.

Personally I'd prefer "curr" named function parameters. But if
Andrew or Wei agree with your approach, I'm not going to object.

Jan



* Re: [PATCH v2 03/11] x86/vlapic: introduce an EOI callback mechanism
  2020-10-02  9:39   ` Jan Beulich
@ 2020-10-13 14:30     ` Roger Pau Monné
  2020-10-13 15:41       ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Roger Pau Monné @ 2020-10-13 14:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Fri, Oct 02, 2020 at 11:39:50AM +0200, Jan Beulich wrote:
> On 30.09.2020 12:41, Roger Pau Monne wrote:
> > Add a new vlapic_set_irq_callback helper in order to inject a vector
> > and set a callback to be executed when the guest performs the end of
> > interrupt acknowledgment.
> 
> On v1 I did ask
> 
> "One thing I don't understand at all for now is how these
>  callbacks are going to be re-instated after migration for
>  not-yet-EOIed interrupts."
> 
> Afaics I didn't get an answer on this.

Oh sorry, I remember your comment and I've changed further patches to
address this.

The setter of the callback will be in charge of setting the callback
again on resume. That's why vlapic_set_callback is not a static
function, and is added to the header.

Patch 5/11 [0] contains an example of how the vIO-APIC resets the callbacks
on load.

> 
> > ---
> > RFC: should callbacks also be executed in vlapic_do_init (which is
> > called by vlapic_reset). We would need to make sure ISR and IRR
> > are cleared using some kind of test and clear atomic functionality to
> > make this race free.
> 
> I guess this can't be decided at this point of the series, as it
> may depend on what exactly the callbacks mean to do. It may even
> be that whether a callback wants to do something depends on
> whether it gets called "normally" or from vlapic_do_init().

Well, let's try to get some progress on the other questions and we will
eventually get here I think.

> > --- a/xen/arch/x86/hvm/vlapic.c
> > +++ b/xen/arch/x86/hvm/vlapic.c
> > @@ -144,7 +144,32 @@ bool vlapic_test_irq(const struct vlapic *vlapic, uint8_t vec)
> >      return vlapic_test_vector(vec, &vlapic->regs->data[APIC_IRR]);
> >  }
> >  
> > -void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
> > +void vlapic_set_callback(struct vlapic *vlapic, unsigned int vec,
> > +                         vlapic_eoi_callback_t *callback, void *data)
> > +{
> > +    unsigned long flags;
> > +    unsigned int index = vec - 16;
> > +
> > +    if ( !callback || vec < 16 || vec >= X86_NR_VECTORS )
> > +    {
> > +        ASSERT_UNREACHABLE();
> > +        return;
> > +    }
> > +
> > +    spin_lock_irqsave(&vlapic->callback_lock, flags);
> > +    if ( vlapic->callbacks[index].callback &&
> > +         vlapic->callbacks[index].callback != callback )
> > +        printk(XENLOG_G_WARNING
> > +               "%pv overriding vector %#x callback %ps (%p) with %ps (%p)\n",
> > +               vlapic_vcpu(vlapic), vec, vlapic->callbacks[index].callback,
> > +               vlapic->callbacks[index].callback, callback, callback);
> > +    vlapic->callbacks[index].callback = callback;
> > +    vlapic->callbacks[index].data = data;
> 
> Should "data" perhaps also be compared in the override check above?

Could do, there might indeed be cases where the callback is the same
but data has changed and should be interesting to log.

Thanks, Roger.

[0] https://lore.kernel.org/xen-devel/20200930104108.35969-6-roger.pau@citrix.com/



* Re: [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks
  2020-10-02 15:25   ` Jan Beulich
@ 2020-10-13 14:47     ` Roger Pau Monné
  2020-10-13 15:42       ` Jan Beulich
  0 siblings, 1 reply; 40+ messages in thread
From: Roger Pau Monné @ 2020-10-13 14:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu, Paul Durrant

On Fri, Oct 02, 2020 at 05:25:34PM +0200, Jan Beulich wrote:
> On 30.09.2020 12:41, Roger Pau Monne wrote:
> > Remove the unconditional call to hvm_dpci_msi_eoi in vlapic_handle_EOI
> > and instead use the newly introduced EOI callback mechanism in order
> > to register a callback for MSI vectors injected from passed through
> > devices.
> 
> What I'm kind of missing here is a word on why this is an improvement:
> After all ...
> 
> > --- a/xen/arch/x86/hvm/vlapic.c
> > +++ b/xen/arch/x86/hvm/vlapic.c
> > @@ -496,8 +496,6 @@ void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
> >      if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) )
> >          vioapic_update_EOI(vector);
> >  
> > -    hvm_dpci_msi_eoi(vector);
> 
> ... you're exchanging this direct call for a more complex model with
> an indirect one (to the same function).

Sure. But this direct call will be made for each vlapic EOI, while my
added callback will only be executed if the vector was injected by
the vmsi code, and hence will remove pointless calls to
hvm_dpci_msi_eoi.

It's IMO not feasible to keep adding hardcoded calls in
vlapic_handle_EOI for each possible subsystem or emulated device that
wants to be notified of EOIs, hence we need some kind of generic
framework to achieve this.

> > @@ -119,7 +126,8 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
> >  
> >      ASSERT(pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI);
> >  
> > -    vmsi_deliver(d, vector, dest, dest_mode, delivery_mode, trig_mode);
> > +    vmsi_deliver_callback(d, vector, dest, dest_mode, delivery_mode, trig_mode,
> > +                          hvm_dpci_msi_eoi, NULL);
> >  }
> 
> While I agree with your reply to Paul regarding Dom0, I still think
> the entire if() in hvm_dpci_msi_eoi() should be converted into a
> conditional here. There's no point registering the callback if it's
> not going to do anything.
> 
> However, looking further, the "!hvm_domain_irq(d)->dpci &&
> !is_hardware_domain(d)" can be simply dropped altogether, right away.
> It's now fulfilled by the identical check at the top of
> hvm_dirq_assist(), thus guarding the sole call site of this function.
> 
> The !is_iommu_enabled(d) is slightly more involved to prove, but it
> should also be possible to simply drop. What might help here is a
> separate change to suppress opening of HVM_DPCI_SOFTIRQ when there's
> no IOMMU in the system, as then it becomes obvious that this part of
> the condition is guaranteed by hvm_do_IRQ_dpci(), being the only
> site where the softirq can get raised (apart from the softirq
> handler itself).
> 
> To sum up - the call above can probably stay as is, but the callback
> can be simplified as a result of the change.

Yes, I agree. Would you be fine with converting the check in the
callback into an assert, or would you rather have it removed
completely?

> > --- a/xen/drivers/passthrough/io.c
> > +++ b/xen/drivers/passthrough/io.c
> > @@ -874,7 +874,7 @@ static int _hvm_dpci_msi_eoi(struct domain *d,
> >      return 0;
> >  }
> >  
> > -void hvm_dpci_msi_eoi(unsigned int vector)
> > +void hvm_dpci_msi_eoi(unsigned int vector, void *data)
> >  {
> >      struct domain *d = current->domain;
> 
> Instead of passing NULL for data and latching d from current, how
> about you make the registration pass d to more easily use here?

Yes, I think that's fine - we already have the domain pointer in
vmsi_deliver_callback so it could be passed as the callback private
data.

Thanks, Roger.



* Re: [PATCH v2 03/11] x86/vlapic: introduce an EOI callback mechanism
  2020-10-13 14:30     ` Roger Pau Monné
@ 2020-10-13 15:41       ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-13 15:41 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 13.10.2020 16:30, Roger Pau Monné wrote:
> On Fri, Oct 02, 2020 at 11:39:50AM +0200, Jan Beulich wrote:
>> On 30.09.2020 12:41, Roger Pau Monne wrote:
>>> Add a new vlapic_set_irq_callback helper in order to inject a vector
>>> and set a callback to be executed when the guest performs the end of
>>> interrupt acknowledgment.
>>
>> On v1 I did ask
>>
>> "One thing I don't understand at all for now is how these
>>  callbacks are going to be re-instated after migration for
>>  not-yet-EOIed interrupts."
>>
>> Afaics I didn't get an answer on this.
> 
> Oh sorry, I remember your comment and I've changed further patches to
> address this.
> 
> The setter of the callback will be in charge of setting the callback
> again on resume. That's why vlapic_set_callback is not a static
> function, and is added to the header.
> 
> Patch 5/11 [0] contains an example of how the vIO-APIC resets the callbacks
> on load.

Ah, I see - I didn't get there yet. Could you mention this behavior in
the description here, or maybe in a code comment next to the declaration
(or definition) of the function?

Jan



* Re: [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks
  2020-10-13 14:47     ` Roger Pau Monné
@ 2020-10-13 15:42       ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-13 15:42 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Wei Liu, Paul Durrant

On 13.10.2020 16:47, Roger Pau Monné wrote:
> On Fri, Oct 02, 2020 at 05:25:34PM +0200, Jan Beulich wrote:
>> On 30.09.2020 12:41, Roger Pau Monne wrote:
>>> @@ -119,7 +126,8 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
>>>  
>>>      ASSERT(pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI);
>>>  
>>> -    vmsi_deliver(d, vector, dest, dest_mode, delivery_mode, trig_mode);
>>> +    vmsi_deliver_callback(d, vector, dest, dest_mode, delivery_mode, trig_mode,
>>> +                          hvm_dpci_msi_eoi, NULL);
>>>  }
>>
>> While I agree with your reply to Paul regarding Dom0, I still think
>> the entire if() in hvm_dpci_msi_eoi() should be converted into a
>> conditional here. There's no point registering the callback if it's
>> not going to do anything.
>>
>> However, looking further, the "!hvm_domain_irq(d)->dpci &&
>> !is_hardware_domain(d)" can be simply dropped altogether, right away.
>> It's now fulfilled by the identical check at the top of
>> hvm_dirq_assist(), thus guarding the sole call site of this function.
>>
>> The !is_iommu_enabled(d) is slightly more involved to prove, but it
>> should also be possible to simply drop. What might help here is a
>> separate change to suppress opening of HVM_DPCI_SOFTIRQ when there's
>> no IOMMU in the system, as then it becomes obvious that this part of
>> the condition is guaranteed by hvm_do_IRQ_dpci(), being the only
>> site where the softirq can get raised (apart from the softirq
>> handler itself).
>>
>> To sum up - the call above can probably stay as is, but the callback
>> can be simplified as a result of the change.
> 
> Yes, I agree. Would you be fine with converting the check in the
> callback into an assert, or would you rather have it removed
> completely?

Either way is fine with me, I think.

Jan



* Re: [PATCH v2 05/11] x86/vioapic: switch to use the EOI callback mechanism
  2020-09-30 10:41 ` [PATCH v2 05/11] x86/vioapic: switch to use the EOI callback mechanism Roger Pau Monne
  2020-09-30 12:09   ` Paul Durrant
@ 2020-10-22 16:12   ` Jan Beulich
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-22 16:12 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 30.09.2020 12:41, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -189,7 +189,7 @@ void vlapic_set_irq_callback(struct vlapic *vlapic, uint8_t vec, uint8_t trig,
>  
>      if ( hvm_funcs.update_eoi_exit_bitmap )
>          alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec,
> -                          trig || callback);
> +                          callback);

There's a shortcoming in the alternative call framework which I
see no way to eliminate but which makes it necessary to use
!!callback here. Otherwise, if the callback happens to sit on a
256-byte boundary (low address byte zero), you'll pass false
when you mean true. (The original use, i.e. prior to patch 3,
of just "trig" was sufficiently okay, because the parameter
- despite being u8 - is effectively used as a boolean by the
callers iirc.)

Or perhaps the best thing is to require wrappers for all hooks
taking bool parameters, because then the necessary conversion
will be done when calling the wrapper.

Jan



* Re: [PATCH v2 06/11] x86/hvm: allowing registering EOI callbacks for GSIs
  2020-09-30 10:41 ` [PATCH v2 06/11] x86/hvm: allowing registering EOI callbacks for GSIs Roger Pau Monne
@ 2020-10-23 12:29   ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-23 12:29 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 30.09.2020 12:41, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/irq.c
> +++ b/xen/arch/x86/hvm/irq.c
> @@ -595,6 +595,66 @@ int hvm_local_events_need_delivery(struct vcpu *v)
>      return !hvm_interrupt_blocked(v, intack);
>  }
>  
> +int hvm_gsi_register_callback(struct domain *d, unsigned int gsi,
> +                              struct hvm_gsi_eoi_callback *cb)
> +{
> +    if ( gsi >= hvm_domain_irq(d)->nr_gsis )
> +    {
> +        ASSERT_UNREACHABLE();
> +        return -EINVAL;
> +    }
> +
> +    write_lock(&hvm_domain_irq(d)->gsi_callbacks_lock);
> +    list_add(&cb->list, &hvm_domain_irq(d)->gsi_callbacks[gsi]);
> +    write_unlock(&hvm_domain_irq(d)->gsi_callbacks_lock);
> +
> +    return 0;
> +}
> +
> +void hvm_gsi_unregister_callback(struct domain *d, unsigned int gsi,
> +                                 struct hvm_gsi_eoi_callback *cb)
> +{
> +    struct list_head *tmp;

This could be const if you used ...

> +    if ( gsi >= hvm_domain_irq(d)->nr_gsis )
> +    {
> +        ASSERT_UNREACHABLE();
> +        return;
> +    }
> +
> +    write_lock(&hvm_domain_irq(d)->gsi_callbacks_lock);
> +    list_for_each ( tmp, &hvm_domain_irq(d)->gsi_callbacks[gsi] )
> +        if ( tmp == &cb->list )
> +        {
> +            list_del(tmp);

... &cb->list here.

> +            break;
> +        }
> +    write_unlock(&hvm_domain_irq(d)->gsi_callbacks_lock);
> +}
> +
> +void hvm_gsi_execute_callbacks(unsigned int gsi, void *data)
> +{
> +    struct domain *currd = current->domain;
> +    struct hvm_gsi_eoi_callback *cb;
> +
> +    read_lock(&hvm_domain_irq(currd)->gsi_callbacks_lock);
> +    list_for_each_entry ( cb, &hvm_domain_irq(currd)->gsi_callbacks[gsi],
> +                          list )
> +        cb->callback(gsi, cb->data ?: data);

Are callback functions in principle permitted to unregister
themselves? If so, you'd need to use list_for_each_entry_safe()
here.

What's the idea of passing cb->data _or_ data?

Finally here and maybe in a few more places latch hvm_domain_irq()
into a local variable?

> +    read_unlock(&hvm_domain_irq(currd)->gsi_callbacks_lock);
> +}
> +
> +bool hvm_gsi_has_callbacks(struct domain *d, unsigned int gsi)

I think a function like this would want to have all const inputs,
and it looks to be possible thanks to hvm_domain_irq() yielding
a pointer.

> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -393,6 +393,7 @@ static void eoi_callback(unsigned int vector, void *data)
>          for ( pin = 0; pin < vioapic->nr_pins; pin++ )
>          {
>              union vioapic_redir_entry *ent = &vioapic->redirtbl[pin];
> +            unsigned int gsi = vioapic->base_gsi + pin;
>  
>              if ( ent->fields.vector != vector )
>                  continue;
> @@ -402,13 +403,17 @@ static void eoi_callback(unsigned int vector, void *data)
>              if ( is_iommu_enabled(d) )
>              {
>                  spin_unlock(&d->arch.hvm.irq_lock);
> -                hvm_dpci_eoi(vioapic->base_gsi + pin, ent);
> +                hvm_dpci_eoi(gsi, ent);
>                  spin_lock(&d->arch.hvm.irq_lock);
>              }
>  
> +            spin_unlock(&d->arch.hvm.irq_lock);
> +            hvm_gsi_execute_callbacks(gsi, ent);
> +            spin_lock(&d->arch.hvm.irq_lock);

Iirc on an earlier patch Paul has already expressed concern about such
transient unlocking. At the very least I'd expect the description to
say why this is safe. One particular question would be in how far what
ent points to can't change across this window, disconnecting the uses
of it in the 1st locked section from those in the 2nd one.

> @@ -620,7 +628,7 @@ static int ioapic_load(struct domain *d, hvm_domain_context_t *h)
>           * Add a callback for each possible vector injected by a redirection
>           * entry.
>           */
> -        if ( vector < 16 || !ent->fields.remote_irr ||
> +        if ( vector < 16 ||
>               (delivery_mode != dest_LowestPrio && delivery_mode != dest_Fixed) )
>              continue;

I'm having trouble identifying what this gets replaced by.

Jan



* Re: [PATCH v2 07/11] x86/dpci: move code
  2020-09-30 10:41 ` [PATCH v2 07/11] x86/dpci: move code Roger Pau Monne
@ 2020-10-23 12:32   ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-23 12:32 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Paul Durrant

On 30.09.2020 12:41, Roger Pau Monne wrote:
> This is code movement in order to simplify further changes.
> 
> No functional change intended.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
albeit ...

> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -276,6 +276,92 @@ static struct vcpu *vector_hashing_dest(const struct domain *d,
>      return dest;
>  }
>  
> +static void hvm_pirq_eoi(struct pirq *pirq,
> +                         const union vioapic_redir_entry *ent)
> +{
> +    struct hvm_pirq_dpci *pirq_dpci;
> +
> +    if ( !pirq )
> +    {
> +        ASSERT_UNREACHABLE();
> +        return;
> +    }
> +
> +    pirq_dpci = pirq_dpci(pirq);
> +
> +    /*
> +     * No need to get vector lock for timer
> +     * since interrupt is still not EOIed
> +     */
> +    if ( --pirq_dpci->pending ||
> +         (ent && ent->fields.mask) ||
> +         !pt_irq_need_timer(pirq_dpci->flags) )
> +        return;
> +
> +    stop_timer(&pirq_dpci->timer);
> +    pirq_guest_eoi(pirq);
> +}
> +
> +static void __hvm_dpci_eoi(struct domain *d,

... could I talk you into dropping one of the two leading underscores
while moving the thing?

Jan



* Re: [PATCH v2 08/11] x86/dpci: switch to use a GSI EOI callback
  2020-09-30 10:41 ` [PATCH v2 08/11] x86/dpci: switch to use a GSI EOI callback Roger Pau Monne
@ 2020-10-23 12:47   ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-23 12:47 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Andrew Cooper, Wei Liu, Paul Durrant

On 30.09.2020 12:41, Roger Pau Monne wrote:
> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -327,9 +327,10 @@ static void hvm_gsi_eoi(struct domain *d, unsigned int gsi,
>      hvm_pirq_eoi(pirq, ent);
>  }
>  
> -void hvm_dpci_eoi(unsigned int guest_gsi, const union vioapic_redir_entry *ent)
> +static void dpci_eoi(unsigned int guest_gsi, void *data)
>  {
>      struct domain *d = current->domain;
> +    const union vioapic_redir_entry *ent = data;
>      const struct hvm_irq_dpci *hvm_irq_dpci;
>      const struct hvm_girq_dpci_mapping *girq;
>  
> @@ -565,7 +566,7 @@ int pt_irq_create_bind(
>              unsigned int link;
>  
>              digl = xmalloc(struct dev_intx_gsi_link);
> -            girq = xmalloc(struct hvm_girq_dpci_mapping);
> +            girq = xzalloc(struct hvm_girq_dpci_mapping);
>  
>              if ( !digl || !girq )
>              {
> @@ -578,11 +579,22 @@ int pt_irq_create_bind(
>              girq->bus = digl->bus = pt_irq_bind->u.pci.bus;
>              girq->device = digl->device = pt_irq_bind->u.pci.device;
>              girq->intx = digl->intx = pt_irq_bind->u.pci.intx;
> -            list_add_tail(&digl->list, &pirq_dpci->digl_list);
> +            girq->cb.callback = dpci_eoi;
>  
>              guest_gsi = hvm_pci_intx_gsi(digl->device, digl->intx);
>              link = hvm_pci_intx_link(digl->device, digl->intx);
>  
> +            rc = hvm_gsi_register_callback(d, guest_gsi, &girq->cb);

So this is where my question on the earlier patch gets answered:
You utilize passing NULL data to the callback to actually get
passed the IO-APIC redir entry pointer into the callback. This is
perhaps okay in principle if it was half way visible. May I ask
that at the very least instead of switching to xzalloc above you
set ->data to NULL here explicitly, accompanied by a comment on
the effect?

However, I wonder whether it wouldn't be better to have the
callback be passed const union vioapic_redir_entry * right away.
Albeit I haven't looked at the later patches yet, where it may
well be I'd find arguments against.

> @@ -590,8 +602,17 @@ int pt_irq_create_bind(
>          }
>          else
>          {
> +            struct hvm_gsi_eoi_callback *cb =
> +                xzalloc(struct hvm_gsi_eoi_callback);

I can't seem to be able to spot anywhere that this would get freed
(except on an error path in this function).

>              ASSERT(is_hardware_domain(d));
>  
> +            if ( !cb )
> +            {
> +                spin_unlock(&d->event_lock);
> +                return -ENOMEM;
> +            }
> +
>              /* MSI_TRANSLATE is not supported for the hardware domain. */
>              if ( pt_irq_bind->irq_type != PT_IRQ_TYPE_PCI ||
>                   pirq >= hvm_domain_irq(d)->nr_gsis )
> @@ -601,6 +622,19 @@ int pt_irq_create_bind(
>                  return -EINVAL;
>              }

There's an error path here where you don't free cb, and I think
one or two more further down (where you then also may need to
unregister it first).

Jan



* Re: [PATCH v2 09/11] x86/vpt: switch interrupt injection model
  2020-09-30 10:41 ` [PATCH v2 09/11] x86/vpt: switch interrupt injection model Roger Pau Monne
@ 2020-10-23 14:59   ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-23 14:59 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: xen-devel, Andrew Cooper, Wei Liu, Jun Nakajima, Kevin Tian

On 30.09.2020 12:41, Roger Pau Monne wrote:
> Currently vPT relies on timers being assigned to a vCPU and performing
> checks on every return to HVM guest in order to check if an interrupt
> from a vPT timer assigned to the vCPU is currently being injected.
> 
> This model doesn't work properly since the interrupt destination vCPU
> of a vPT timer can be different from the vCPU where the timer is
> currently assigned, in which case the timer would get stuck because it
> never sees the interrupt as being injected.
> 
> Knowing when a vPT interrupt is injected is relevant for the guest
> timer modes where missed vPT interrupts are not discarded and instead
> are accumulated and injected when possible.
> 
> This change aims to modify the logic described above, so that vPT
> doesn't need to check on every return to HVM guest if a vPT interrupt
> is being injected. In order to achieve this the vPT code is modified
> to make use of the new EOI callbacks, so that virtual timers can
> detect when an interrupt has been serviced by the guest by waiting for
> the EOI callback to execute.
> 
> This model also simplifies some of the logic, as when executing the
> timer EOI callback Xen can try to inject another interrupt if the
> timer has interrupts pending for delivery.
> 
> Note that timers are still bound to a vCPU for the time being, this
> relation however doesn't limit the interrupt destination anymore, and
> will be removed by further patches.
> 
> This model has been tested with Windows 7 guests without showing any
> timer delay, even when the guest was limited to have very little CPU
> capacity and pending virtual timer interrupts accumulate.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v1:
>  - New in this version.
> ---
> Sorry, this is a big change, but I'm having issues splitting it into
> smaller pieces as the functionality needs to be changed in one go, or
> else timers would be broken.
> 
> If this approach seems sensible I can try to split it up.

If it can't sensibly be split, so be it, I would say. And yes, the
approach does look sensible to me, supported by ...

> ---
>  xen/arch/x86/hvm/svm/intr.c   |   3 -
>  xen/arch/x86/hvm/vmx/intr.c   |  59 ------
>  xen/arch/x86/hvm/vpt.c        | 326 ++++++++++++++--------------------
>  xen/include/asm-x86/hvm/vpt.h |   5 +-
>  4 files changed, 135 insertions(+), 258 deletions(-)

... this diffstat. Good work!

Just a couple of nits, but before giving this my ack I may need to
go through it a 2nd time.

> +/*
> + * The same callback is shared between LAPIC and PIC/IO-APIC based timers, as
> + * we ignore the first parameter that's different between them.
> + */
> +static void eoi_callback(unsigned int unused, void *data)
>  {
> -    struct list_head *head = &v->arch.hvm.tm_list;
> -    struct periodic_time *pt, *temp, *earliest_pt;
> -    uint64_t max_lag;
> -    int irq, pt_vector = -1;
> -    bool level;
> +    struct periodic_time *pt = data;
> +    struct vcpu *v;
> +    time_cb *cb = NULL;
> +    void *cb_priv;
>  
> -    pt_vcpu_lock(v);
> +    pt_lock(pt);
>  
> -    earliest_pt = NULL;
> -    max_lag = -1ULL;
> -    list_for_each_entry_safe ( pt, temp, head, list )
> +    pt_irq_fired(pt->vcpu, pt);
> +    if ( pt->pending_intr_nr )
>      {
> -        if ( pt->pending_intr_nr )
> +        if ( inject_interrupt(pt) )
> +        {
> +            pt->pending_intr_nr--;
> +            cb = pt->cb;
> +            cb_priv = pt->priv;
> +            v = pt->vcpu;
> +        }
> +        else
>          {
> -            /* RTC code takes care of disabling the timer itself. */
> -            if ( (pt->irq != RTC_IRQ || !pt->priv) && pt_irq_masked(pt) &&
> -                 /* Level interrupts should be asserted even if masked. */
> -                 !pt->level )
> -            {
> -                /* suspend timer emulation */
> +            /* Masked. */
> +            if ( pt->on_list )
>                  list_del(&pt->list);
> -                pt->on_list = 0;
> -            }
> -            else
> -            {
> -                if ( (pt->last_plt_gtime + pt->period) < max_lag )
> -                {
> -                    max_lag = pt->last_plt_gtime + pt->period;
> -                    earliest_pt = pt;
> -                }
> -            }
> +            pt->on_list = false;
>          }
>      }
>  
> -    if ( earliest_pt == NULL )
> -    {
> -        pt_vcpu_unlock(v);
> -        return -1;
> -    }
> +    pt_unlock(pt);
>  
> -    earliest_pt->irq_issued = 1;
> -    irq = earliest_pt->irq;
> -    level = earliest_pt->level;
> +    if ( cb != NULL )
> +        cb(v, cb_priv);

Nit: Like done elsewhere, omit the " != NULL"?

> +    /* Update time when an interrupt is injected. */
> +    if ( mode_is(v->domain, one_missed_tick_pending) ||
> +         mode_is(v->domain, no_missed_ticks_pending) )
> +        pt->last_plt_gtime = hvm_get_guest_time(v);
> +    else
> +        pt->last_plt_gtime += pt->period;
>  
> -    pt_vcpu_unlock(v);
> +    if ( mode_is(v->domain, delay_for_missed_ticks) &&

This looks to be possible to move into the "else" above, but on the
whole maybe everything together would best be handled by switch()?

> @@ -543,6 +443,24 @@ void create_periodic_time(
>      pt->cb = cb;
>      pt->priv = data;
>  
> +    switch ( pt->source )
> +    {
> +        int rc;
> +
> +    case PTSRC_isa:
> +        irq = hvm_isa_irq_to_gsi(irq);
> +        /* fallthrough */
> +    case PTSRC_ioapic:
> +        pt->eoi_cb.callback = eoi_callback;
> +        pt->eoi_cb.data = pt;
> +        rc = hvm_gsi_register_callback(v->domain, irq, &pt->eoi_cb);
> +        if ( rc )
> +            gdprintk(XENLOG_WARNING,
> +                     "unable to register callback for timer GSI %u: %d\n",
> +                     irq, rc);

If this triggers, would it be helpful to also log pt->source?

Jan


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 10/11] x86/vpt: remove vPT timers per-vCPU lists
  2020-09-30 10:41 ` [PATCH v2 10/11] x86/vpt: remove vPT timers per-vCPU lists Roger Pau Monne
@ 2020-10-23 15:34   ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-23 15:34 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 30.09.2020 12:41, Roger Pau Monne wrote:
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1964,7 +1964,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>      vpmu_switch_from(prev);
>      np2m_schedule(NP2M_SCHEDLE_OUT);
>  
> -    if ( is_hvm_domain(prevd) && !list_empty(&prev->arch.hvm.tm_list) )
> +    if ( is_hvm_domain(prevd) )
>          pt_save_timer(prev);

While most of the function goes away, pt_freeze_time() now will get
called in cases it previously wasn't called - is this benign?

> @@ -195,50 +182,20 @@ static void pt_thaw_time(struct vcpu *v)
>  
>  void pt_save_timer(struct vcpu *v)
>  {
> -    struct list_head *head = &v->arch.hvm.tm_list;
> -    struct periodic_time *pt;
> -
> -    if ( v->pause_flags & VPF_blocked )
> -        return;
> -
> -    pt_vcpu_lock(v);
> -
> -    list_for_each_entry ( pt, head, list )
> -        if ( !pt->do_not_freeze )
> -            stop_timer(&pt->timer);
>  
>      pt_freeze_time(v);
> -
> -    pt_vcpu_unlock(v);
>  }
>  
>  void pt_restore_timer(struct vcpu *v)
>  {
> -    struct list_head *head = &v->arch.hvm.tm_list;
> -    struct periodic_time *pt;
> -
> -    pt_vcpu_lock(v);
> -
> -    list_for_each_entry ( pt, head, list )
> -        if ( pt->pending_intr_nr == 0 )
> -            set_timer(&pt->timer, pt->scheduled);
> -
>      pt_thaw_time(v);
> -
> -    pt_vcpu_unlock(v);
>  }

In both functions the single function called is also the only
place it is used anywhere, so I guess the extra layer could be
removed.

> @@ -402,8 +339,7 @@ void create_periodic_time(
>      write_lock(&v->domain->arch.hvm.pl_time->pt_migrate);
>  
>      pt->pending_intr_nr = 0;
> -    pt->do_not_freeze = 0;
> -    pt->irq_issued = 0;
> +    pt->masked = false;

I agree here, but ...

> @@ -479,10 +412,8 @@ void destroy_periodic_time(struct periodic_time *pt)
>          return;
>  
>      pt_lock(pt);
> -    if ( pt->on_list )
> -        list_del(&pt->list);
> -    pt->on_list = 0;
>      pt->pending_intr_nr = 0;
> +    pt->masked = false;

... why not "true" here, at the very least for pt_active()'s sake?

> --- a/xen/include/asm-x86/hvm/vpt.h
> +++ b/xen/include/asm-x86/hvm/vpt.h
> @@ -31,13 +31,10 @@
>  typedef void time_cb(struct vcpu *v, void *opaque);
>  
>  struct periodic_time {
> -    struct list_head list;
> -    bool on_list;
>      bool one_shot;
> -    bool do_not_freeze;
> -    bool irq_issued;
>      bool warned_timeout_too_short;
>      bool level;
> +    bool masked;

"masked" aiui doesn't say anything about the present state of a
timer, but about its state the last time an interrupt was
attempted to be injected. If this is right, a name change might
be helpful ("last_seen_masked" is somewhat longish); at the very
least I'd like to ask for a comment to this effect.

> @@ -158,7 +153,7 @@ void pt_adjust_global_vcpu_target(struct vcpu *v);
>  void pt_may_unmask_irq(struct domain *d, struct periodic_time *vlapic_pt);
>  
>  /* Is given periodic timer active? */
> -#define pt_active(pt) ((pt)->on_list || (pt)->pending_intr_nr)
> +#define pt_active(pt) !(pt)->masked

This wants parentheses around it. And why does the right side of the
|| go away?

Jan



* Re: [PATCH v2 11/11] x86/vpt: introduce a per-vPT lock
  2020-09-30 13:30   ` Roger Pau Monné
@ 2020-10-23 15:42     ` Jan Beulich
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2020-10-23 15:42 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 30.09.2020 15:30, Roger Pau Monné wrote:
> On Wed, Sep 30, 2020 at 12:41:08PM +0200, Roger Pau Monne wrote:
>> Introduce a per virtual timer lock that replaces the existing per-vCPU
>> and per-domain vPT locks. Since virtual timers are no longer assigned
>> or migrated between vCPUs the locking can be simplified to a
>> in-structure spinlock that protects all the fields.
>>
>> This requires introducing a helper to initialize the spinlock, and
>> that could be used to initialize other virtual timer fields in the
>> future.
>>
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Just realized I had the following uncommitted chunk that should have
> been part of this patch, nothing critical but the tm_lock can now be
> removed.

And then
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



end of thread, other threads:[~2020-10-23 15:42 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-30 10:40 [PATCH v2 00/11] x86/intr: introduce EOI callbacks and fix vPT Roger Pau Monne
2020-09-30 10:40 ` [PATCH v2 01/11] x86/hvm: drop vcpu parameter from vlapic EOI callbacks Roger Pau Monne
2020-09-30 11:30   ` Paul Durrant
2020-10-02  8:48   ` Jan Beulich
2020-10-02  9:24     ` Durrant, Paul
2020-10-02 10:54       ` Wei Liu
2020-10-13 14:08     ` Roger Pau Monné
2020-10-13 14:13       ` Jan Beulich
2020-09-30 10:40 ` [PATCH v2 02/11] x86/hvm: drop domain parameter from vioapic/vpic " Roger Pau Monne
2020-09-30 11:33   ` Paul Durrant
2020-10-02  9:02   ` Jan Beulich
2020-09-30 10:41 ` [PATCH v2 03/11] x86/vlapic: introduce an EOI callback mechanism Roger Pau Monne
2020-09-30 11:49   ` Paul Durrant
2020-10-02  9:22     ` Jan Beulich
2020-10-02  9:39   ` Jan Beulich
2020-10-13 14:30     ` Roger Pau Monné
2020-10-13 15:41       ` Jan Beulich
2020-09-30 10:41 ` [PATCH v2 04/11] x86/vmsi: use the newly introduced EOI callbacks Roger Pau Monne
2020-09-30 11:57   ` Paul Durrant
2020-09-30 13:37     ` Roger Pau Monné
2020-10-02 15:25   ` Jan Beulich
2020-10-13 14:47     ` Roger Pau Monné
2020-10-13 15:42       ` Jan Beulich
2020-09-30 10:41 ` [PATCH v2 05/11] x86/vioapic: switch to use the EOI callback mechanism Roger Pau Monne
2020-09-30 12:09   ` Paul Durrant
2020-09-30 13:29     ` Roger Pau Monné
2020-10-22 16:12   ` Jan Beulich
2020-09-30 10:41 ` [PATCH v2 06/11] x86/hvm: allowing registering EOI callbacks for GSIs Roger Pau Monne
2020-10-23 12:29   ` Jan Beulich
2020-09-30 10:41 ` [PATCH v2 07/11] x86/dpci: move code Roger Pau Monne
2020-10-23 12:32   ` Jan Beulich
2020-09-30 10:41 ` [PATCH v2 08/11] x86/dpci: switch to use a GSI EOI callback Roger Pau Monne
2020-10-23 12:47   ` Jan Beulich
2020-09-30 10:41 ` [PATCH v2 09/11] x86/vpt: switch interrupt injection model Roger Pau Monne
2020-10-23 14:59   ` Jan Beulich
2020-09-30 10:41 ` [PATCH v2 10/11] x86/vpt: remove vPT timers per-vCPU lists Roger Pau Monne
2020-10-23 15:34   ` Jan Beulich
2020-09-30 10:41 ` [PATCH v2 11/11] x86/vpt: introduce a per-vPT lock Roger Pau Monne
2020-09-30 13:30   ` Roger Pau Monné
2020-10-23 15:42     ` Jan Beulich
