All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list
@ 2016-10-28  2:37 Feng Wu
  2016-10-28  2:37 ` [PATCH v6 1/7] VMX: Permanently assign PI hook vmx_pi_switch_to() Feng Wu
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Feng Wu @ 2016-10-28  2:37 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

The current VT-d PI related code may operate incorrectly in the
following scenarios:
1. When the last assigned device is dettached from the domain, all
the PI related hooks are removed then, however, the vCPU can be
blocked, switched to another pCPU, etc, all without the aware of
PI. After the next time we attach another device to the domain,
which makes the PI realted hooks avaliable again, the status
of the pi descriptor is not true. Beside that, the blocking vcpu
may still remain in the per-cpu blocking in this case. Patch [1/7]
and [2/7] handle this.

2. After the domain is destroyed, the the blocking vcpu may also
remain in the per-cpu blocking. Handled in patch [3/7].

3. When IRTE is in posted mode, we don't need to set the irq
affinity for it, since the destination of these interrupts is
vCPU and the vCPU affinity is set during vCPU scheduling. Patch
[5/7] handles this.

4. [6/7] is a cleanup patch

5. When a pCPU is unplugged, and there might be vCPUs on its
list. Since the pCPU is offline, those vCPUs might not be woken
up again. [7/7] addresses it.

Feng Wu (7):
  VMX: Permanently assign PI hook vmx_pi_switch_to()
  VMX: Properly handle pi when all the assigned devices are removed
  VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed
  VMX: Make sure PI is in proper state before install the hooks
  VT-d: No need to set irq affinity for posted format IRTE
  VT-d: Some cleanups
  VMX: Fixup PI descriptor when cpu is offline

 xen/arch/x86/hvm/vmx/vmcs.c            |  14 ++--
 xen/arch/x86/hvm/vmx/vmx.c             | 135 +++++++++++++++++++++++++++++++--
 xen/drivers/passthrough/vtd/intremap.c |  74 +++++++++++++++---
 xen/include/asm-x86/hvm/vmx/vmx.h      |   3 +
 4 files changed, 202 insertions(+), 24 deletions(-)

-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v6 1/7] VMX: Permanently assign PI hook vmx_pi_switch_to()
  2016-10-28  2:37 [PATCH v6 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
@ 2016-10-28  2:37 ` Feng Wu
  2016-10-28 13:04   ` Jan Beulich
  2016-10-28  2:37 ` [PATCH v6 2/7] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Feng Wu @ 2016-10-28  2:37 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

PI hook vmx_pi_switch_to() is needed even when any previously
assigned device is detached from the domain. Since 'SN' bit is
also used to control the CPU side PI and we change the state of
SN bit in these two functions, then evaluate this bit in
vmx_deliver_posted_intr() when trying to deliver the interrupt
in posted way via software. The problem is if we deassign the
hooks while the vCPU is runnable in the runqueue with 'SN' set,
all the furture notificaton event will be suppressed. This patch
makes these two hooks statically assigned.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v6:
- Adjust the comments and wording.

v5:
- Zap "pi_switch_from" hook

v4:
- Don't zap vmx_pi_switch_from() and vmx_pi_switch_to() when
any previously assigned device is detached from the domain.
- Comments changes.

 xen/arch/x86/hvm/vmx/vmx.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 3d330b6..faaa987 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -221,9 +221,14 @@ void vmx_pi_hooks_deassign(struct domain *d)
     ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
 
     d->arch.hvm_domain.vmx.vcpu_block = NULL;
-    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
-    d->arch.hvm_domain.vmx.pi_switch_to = NULL;
     d->arch.hvm_domain.vmx.pi_do_resume = NULL;
+    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
+
+    /*
+     * In fact, we can remove 'vmx_pi_switch_to' inside itself if no new device
+     * is in the process of getting assigned and "from" hook is NULL. However,
+     * it is not straightforward to find a clear solution, so just leave it here.
+     */
 }
 
 static int vmx_domain_initialise(struct domain *d)
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v6 2/7] VMX: Properly handle pi when all the assigned devices are removed
  2016-10-28  2:37 [PATCH v6 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
  2016-10-28  2:37 ` [PATCH v6 1/7] VMX: Permanently assign PI hook vmx_pi_switch_to() Feng Wu
@ 2016-10-28  2:37 ` Feng Wu
  2016-10-28 13:19   ` Jan Beulich
  2016-10-28  2:37 ` [PATCH v6 3/7] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed Feng Wu
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Feng Wu @ 2016-10-28  2:37 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

This patch handles some corner cases when the last assigned device
is removed from the domain. In this case we should carefully handle
pi descriptor and the per-cpu blocking list, to make sure:
- all the PI descriptor are in the right state when next time a
devices is assigned to the domain again.
- No remaining vcpus of the domain in the per-cpu blocking list.

Here we call vmx_pi_list_remove() to remove the vCPU from the blocking list
if it is on the list. However, this could happen when vmx_vcpu_block() is
being called, hence we might incorrectly add the vCPU to the blocking list
while the last devcie is detached from the domain. Consider that the situation
can only occur when detaching the last device from the domain and it is not
a frequent operation, so we use domain_pause before that, which is considered
as an clean and maintainable solution for the situation.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v6:
- Comments changes
- Rename vmx_pi_list_remove() to vmx_pi_unblock_vcpu()

v5:
- Remove a no-op wrapper

v4:
- Rename some functions:
	vmx_pi_remove_vcpu_from_blocking_list() -> vmx_pi_list_remove()
	vmx_pi_blocking_cleanup() -> vmx_pi_list_cleanup()
- Remove the check in vmx_pi_list_cleanup()
- Comments adjustment

 xen/arch/x86/hvm/vmx/vmx.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index faaa987..508be7c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -158,14 +158,12 @@ static void vmx_pi_switch_to(struct vcpu *v)
     pi_clear_sn(pi_desc);
 }
 
-static void vmx_pi_do_resume(struct vcpu *v)
+static void vmx_pi_unblock_vcpu(struct vcpu *v)
 {
     unsigned long flags;
     spinlock_t *pi_blocking_list_lock;
     struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
 
-    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
-
     /*
      * Set 'NV' field back to posted_intr_vector, so the
      * Posted-Interrupts can be delivered to the vCPU when
@@ -173,12 +171,12 @@ static void vmx_pi_do_resume(struct vcpu *v)
      */
     write_atomic(&pi_desc->nv, posted_intr_vector);
 
-    /* The vCPU is not on any blocking list. */
     pi_blocking_list_lock = v->arch.hvm_vmx.pi_blocking.lock;
 
     /* Prevent the compiler from eliminating the local variable.*/
     smp_rmb();
 
+    /* The vCPU is not on any blocking list. */
     if ( pi_blocking_list_lock == NULL )
         return;
 
@@ -198,6 +196,13 @@ static void vmx_pi_do_resume(struct vcpu *v)
     spin_unlock_irqrestore(pi_blocking_list_lock, flags);
 }
 
+static void vmx_pi_do_resume(struct vcpu *v)
+{
+    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
+
+    vmx_pi_unblock_vcpu(v);
+}
+
 /* This function is called when pcidevs_lock is held */
 void vmx_pi_hooks_assign(struct domain *d)
 {
@@ -215,11 +220,21 @@ void vmx_pi_hooks_assign(struct domain *d)
 /* This function is called when pcidevs_lock is held */
 void vmx_pi_hooks_deassign(struct domain *d)
 {
+    struct vcpu *v;
+
     if ( !iommu_intpost || !has_hvm_container_domain(d) )
         return;
 
     ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
 
+    /*
+     * Pausing the domain can make sure the vCPU is not
+     * running and hence not calling the hooks simultaneously
+     * when deassigning the PI hooks and removing the vCPU
+     * from the blocking list.
+     */
+    domain_pause(d);
+
     d->arch.hvm_domain.vmx.vcpu_block = NULL;
     d->arch.hvm_domain.vmx.pi_do_resume = NULL;
     d->arch.hvm_domain.vmx.pi_switch_from = NULL;
@@ -229,6 +244,11 @@ void vmx_pi_hooks_deassign(struct domain *d)
      * is in the process of getting assigned and "from" hook is NULL. However,
      * it is not straightforward to find a clear solution, so just leave it here.
      */
+
+    for_each_vcpu ( d, v )
+        vmx_pi_unblock_vcpu(v);
+
+    domain_unpause(d);
 }
 
 static int vmx_domain_initialise(struct domain *d)
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v6 3/7] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed
  2016-10-28  2:37 [PATCH v6 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
  2016-10-28  2:37 ` [PATCH v6 1/7] VMX: Permanently assign PI hook vmx_pi_switch_to() Feng Wu
  2016-10-28  2:37 ` [PATCH v6 2/7] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
@ 2016-10-28  2:37 ` Feng Wu
  2016-10-28 13:23   ` Jan Beulich
  2016-10-28  2:37 ` [PATCH v6 4/7] VMX: Make sure PI is in proper state before install the hooks Feng Wu
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Feng Wu @ 2016-10-28  2:37 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

We should remove the vCPU from the per-cpu blocking list
if it is going to be destroyed.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v6:
- Use vmx_pi_unblock_vcpu() instead of vmx_pi_list_remove()

v5:
- Use vmx_pi_list_remove() instead of vmx_pi_list_cleanup()

v4:
- Call vmx_pi_list_cleanup() before vmx_destroy_vmcs()

 xen/arch/x86/hvm/vmx/vmx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 508be7c..eb4770c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -338,6 +338,7 @@ static void vmx_vcpu_destroy(struct vcpu *v)
      * separately here.
      */
     vmx_vcpu_disable_pml(v);
+    vmx_pi_unblock_vcpu(v);
     vmx_destroy_vmcs(v);
     vpmu_destroy(v);
     passive_domain_destroy(v);
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v6 4/7] VMX: Make sure PI is in proper state before install the hooks
  2016-10-28  2:37 [PATCH v6 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
                   ` (2 preceding siblings ...)
  2016-10-28  2:37 ` [PATCH v6 3/7] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed Feng Wu
@ 2016-10-28  2:37 ` Feng Wu
  2016-10-28 13:24   ` Jan Beulich
  2016-10-28  2:37 ` [PATCH v6 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Feng Wu @ 2016-10-28  2:37 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

We may hit the last ASSERT() in vmx_vcpu_block in the current code,
since vmx_vcpu_block() may get called before vmx_pi_switch_to()
has been installed or executed. Here We use cmpxchg to update
the NDST field, this can make sure we only update the NDST when
vmx_pi_switch_to() has not been called. So the NDST is in a
proper state in vmx_vcpu_block().

Suggested-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v6:
- Comments changes
- Define macro 'APIC_INVALID_DEST' for '0xffffffff'

v5:
- Use 0xffffffff as the invalid value for NDST field.

v4:
- This patch is previously called "Pause/Unpause the domain before/after assigning PI hooks"
- Remove the pause/unpause method
- Use cmpxchg to update NDST

 xen/arch/x86/hvm/vmx/vmcs.c       | 13 +++++--------
 xen/arch/x86/hvm/vmx/vmx.c        | 27 ++++++++++++++++++++++++++-
 xen/include/asm-x86/hvm/vmx/vmx.h |  2 ++
 3 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 1bd875a..e8e3616 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -956,16 +956,13 @@ void virtual_vmcs_vmwrite(const struct vcpu *v, u32 vmcs_encoding, u64 val)
  */
 static void pi_desc_init(struct vcpu *v)
 {
-    uint32_t dest;
-
     v->arch.hvm_vmx.pi_desc.nv = posted_intr_vector;
 
-    dest = cpu_physical_id(v->processor);
-
-    if ( x2apic_enabled )
-        v->arch.hvm_vmx.pi_desc.ndst = dest;
-    else
-        v->arch.hvm_vmx.pi_desc.ndst = MASK_INSR(dest, PI_xAPIC_NDST_MASK);
+    /*
+     * Mark NDST as invalid, then we can use this invalid value as a
+     * marker to whether update NDST or not in vmx_pi_hooks_assign().
+     */
+    v->arch.hvm_vmx.pi_desc.ndst = APIC_INVALID_DEST;
 }
 
 static int construct_vmcs(struct vcpu *v)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index eb4770c..58694ce 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -206,14 +206,39 @@ static void vmx_pi_do_resume(struct vcpu *v)
 /* This function is called when pcidevs_lock is held */
 void vmx_pi_hooks_assign(struct domain *d)
 {
+    struct vcpu *v;
+
     if ( !iommu_intpost || !has_hvm_container_domain(d) )
         return;
 
     ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
 
-    d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
+    /*
+     * We carefully handle the timing here:
+     * - Install the context switch first
+     * - Then set the NDST field
+     * - Install the block and resume hooks in the end
+     *
+     * This can make sure the PI (especially the NDST feild) is
+     * in proper state when we call vmx_vcpu_block().
+     */
     d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
     d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
+
+    for_each_vcpu ( d, v )
+    {
+        unsigned int dest = cpu_physical_id(v->processor);
+        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+
+        /*
+         * We don't need to update NDST if vmx_pi_switch_to()
+         * has already got called.
+         */
+        (void)cmpxchg(&pi_desc->ndst, APIC_INVALID_DEST,
+                x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
+    }
+
+    d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
     d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
 }
 
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 4cdd9b1..2f0435c 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -573,6 +573,8 @@ void vmx_pi_per_cpu_init(unsigned int cpu);
 void vmx_pi_hooks_assign(struct domain *d);
 void vmx_pi_hooks_deassign(struct domain *d);
 
+#define APIC_INVALID_DEST           0xffffffff
+
 /* EPT violation qualifications definitions */
 #define _EPT_READ_VIOLATION         0
 #define EPT_READ_VIOLATION          (1UL<<_EPT_READ_VIOLATION)
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v6 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-28  2:37 [PATCH v6 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
                   ` (3 preceding siblings ...)
  2016-10-28  2:37 ` [PATCH v6 4/7] VMX: Make sure PI is in proper state before install the hooks Feng Wu
@ 2016-10-28  2:37 ` Feng Wu
  2016-10-31 14:57   ` Jan Beulich
  2016-10-28  2:37 ` [PATCH v6 6/7] VT-d: Some cleanups Feng Wu
  2016-10-28  2:37 ` [PATCH v6 7/7] VMX: Fixup PI descriptor when cpu is offline Feng Wu
  6 siblings, 1 reply; 22+ messages in thread
From: Feng Wu @ 2016-10-28  2:37 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

We don't set the affinity for posted format IRTE, since the
destination of these interrupts is vCPU and the vCPU affinity
is set during vCPU scheduling.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v6:
- Make pi_can_suppress_irte_update() a check-only function
- Introduce another function pi_get_new_irte() to update the 'new_ire' if needed

v5:
- Only suppress affinity related IRTE updates for PI

v4:
- Keep the construction of new_ire and only modify the hardware
IRTE when it is not in posted mode.

 xen/drivers/passthrough/vtd/intremap.c | 61 ++++++++++++++++++++++++++++++++--
 1 file changed, 58 insertions(+), 3 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
index bfd468b..97e80a6 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -547,6 +547,54 @@ static int remap_entry_to_msi_msg(
     return 0;
 }
 
+static bool pi_can_suppress_irte_update(const struct iremap_entry *new,
+    const struct iremap_entry *old)
+{
+    ASSERT( old && new );
+
+    if ( !old->remap.p || !old->remap.im || !new->remap.p || new->remap.im )
+        return false;
+
+    /*
+     * We are updating posted IRTE to remapped one, check whether
+     * the common fields are going to be changed.
+     */
+    if ( ( new->remap.fpd != old->post.fpd ) ||
+         ( new->remap.sid != old->post.sid ) ||
+         ( new->remap.sq != old->post.sq ) ||
+         ( new->remap.svt != old->post.svt ) )
+        return false;
+
+    return true;
+}
+
+static void pi_get_new_irte(struct iremap_entry *new,
+    const struct iremap_entry *old)
+{
+    u16 fpd, sid, sq, svt;
+
+    ASSERT( old && new );
+
+    fpd = new->remap.fpd;
+    sid = new->remap.sid;
+    sq = new->remap.sq;
+    svt = new->remap.svt;
+
+    *new = *old;
+
+    if ( fpd != old->post.fpd )
+        new->post.fpd = fpd;
+
+    if ( sid != old->post.sid )
+        new->post.sid = sid;
+
+    if ( sq != old->post.sq )
+        new->post.sq = sq;
+
+    if ( svt != old->post.svt )
+        new->post.svt = svt;
+}
+
 static int msi_msg_to_remap_entry(
     struct iommu *iommu, struct pci_dev *pdev,
     struct msi_desc *msi_desc, struct msi_msg *msg)
@@ -637,9 +685,16 @@ static int msi_msg_to_remap_entry(
     remap_rte->address_hi = 0;
     remap_rte->data = index - i;
 
-    memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
-    iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
-    iommu_flush_iec_index(iommu, 0, index);
+    if ( !pi_can_suppress_irte_update(&new_ire, iremap_entry) )
+    {
+        if ( iremap_entry->remap.p && iremap_entry->remap.im &&
+             new_ire.remap.p && !new_ire.remap.im )
+            pi_get_new_irte(&new_ire, iremap_entry);
+
+        memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
+        iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
+        iommu_flush_iec_index(iommu, 0, index);
+    }
 
     unmap_vtd_domain_page(iremap_entries);
     spin_unlock_irqrestore(&ir_ctrl->iremap_lock, flags);
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v6 6/7] VT-d: Some cleanups
  2016-10-28  2:37 [PATCH v6 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
                   ` (4 preceding siblings ...)
  2016-10-28  2:37 ` [PATCH v6 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
@ 2016-10-28  2:37 ` Feng Wu
  2016-10-31 15:00   ` Jan Beulich
  2016-10-28  2:37 ` [PATCH v6 7/7] VMX: Fixup PI descriptor when cpu is offline Feng Wu
  6 siblings, 1 reply; 22+ messages in thread
From: Feng Wu @ 2016-10-28  2:37 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

Use type-safe structure assignment instead of memcpy()
Use sizeof(*iremap_entry). Besides that, this patch also
handle another cleanup, in msi_msg_to_remap_entry() we don't
need to copy all the content of old IRTE to the new IRE, we
only need to save the 'IM' field to 'new_ire' if the entry
is present.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v6:
- More descripion about the patch

 xen/drivers/passthrough/vtd/intremap.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
index 97e80a6..39dcb3a 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -183,8 +183,8 @@ static void free_remap_entry(struct iommu *iommu, int index)
     GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
                      iremap_entries, iremap_entry);
 
-    memset(iremap_entry, 0, sizeof(struct iremap_entry));
-    iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
+    memset(iremap_entry, 0, sizeof(*iremap_entry));
+    iommu_flush_cache_entry(iremap_entry, sizeof(*iremap_entry));
     iommu_flush_iec_index(iommu, 0, index);
 
     unmap_vtd_domain_page(iremap_entries);
@@ -310,7 +310,7 @@ static int ioapic_rte_to_remap_entry(struct iommu *iommu,
     GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
                      iremap_entries, iremap_entry);
 
-    memcpy(&new_ire, iremap_entry, sizeof(struct iremap_entry));
+    new_ire = *iremap_entry;
 
     if ( rte_upper )
     {
@@ -353,8 +353,8 @@ static int ioapic_rte_to_remap_entry(struct iommu *iommu,
         remap_rte->format = 1;    /* indicate remap format */
     }
 
-    memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
-    iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
+    *iremap_entry = new_ire;
+    iommu_flush_cache_entry(iremap_entry, sizeof(*iremap_entry));
     iommu_flush_iec_index(iommu, 0, index);
 
     unmap_vtd_domain_page(iremap_entries);
@@ -643,7 +643,8 @@ static int msi_msg_to_remap_entry(
     GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
                      iremap_entries, iremap_entry);
 
-    memcpy(&new_ire, iremap_entry, sizeof(struct iremap_entry));
+    if ( iremap_entry->remap.p )
+        new_ire.remap.im = iremap_entry->remap.im;
 
     /* Set interrupt remapping table entry */
     new_ire.remap.fpd = 0;
@@ -691,8 +692,8 @@ static int msi_msg_to_remap_entry(
              new_ire.remap.p && !new_ire.remap.im )
             pi_get_new_irte(&new_ire, iremap_entry);
 
-        memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
-        iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
+        *iremap_entry = new_ire;
+        iommu_flush_cache_entry(iremap_entry, sizeof(*iremap_entry));
         iommu_flush_iec_index(iommu, 0, index);
     }
 
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v6 7/7] VMX: Fixup PI descriptor when cpu is offline
  2016-10-28  2:37 [PATCH v6 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
                   ` (5 preceding siblings ...)
  2016-10-28  2:37 ` [PATCH v6 6/7] VT-d: Some cleanups Feng Wu
@ 2016-10-28  2:37 ` Feng Wu
  2016-10-31 15:50   ` Jan Beulich
  6 siblings, 1 reply; 22+ messages in thread
From: Feng Wu @ 2016-10-28  2:37 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

When cpu is offline, we need to move all the vcpus in its blocking
list to another online cpu, this patch handles it.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v6:
- Carefully suppress 'SN' to avoid missing notification event
during moving the vcpu to the new list

v5:
- Add some comments to explain why it doesn't cause deadlock
for the ABBA deadlock scenario.

v4:
- Remove the pointless check since we are in machine stop
context and no other cpus go down in parallel.

 xen/arch/x86/hvm/vmx/vmcs.c       |  1 +
 xen/arch/x86/hvm/vmx/vmx.c        | 70 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/vmx/vmx.h |  1 +
 3 files changed, 72 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index e8e3616..1846e25 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -578,6 +578,7 @@ void vmx_cpu_dead(unsigned int cpu)
     vmx_free_vmcs(per_cpu(vmxon_region, cpu));
     per_cpu(vmxon_region, cpu) = 0;
     nvmx_cpu_dead(cpu);
+    vmx_pi_desc_fixup(cpu);
 }
 
 int vmx_cpu_up(void)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 58694ce..a1dc5fa 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -203,6 +203,76 @@ static void vmx_pi_do_resume(struct vcpu *v)
     vmx_pi_unblock_vcpu(v);
 }
 
+void vmx_pi_desc_fixup(int cpu)
+{
+    unsigned int new_cpu, dest;
+    unsigned long flags;
+    struct arch_vmx_struct *vmx, *tmp;
+    spinlock_t *new_lock, *old_lock = &per_cpu(vmx_pi_blocking, cpu).lock;
+    struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
+
+    if ( !iommu_intpost )
+        return;
+
+    /*
+     * We are in the context of CPU_DEAD or CPU_UP_CANCELED notification,
+     * and it is impossible for a second CPU go down in parallel. So we
+     * can safely acquire the old cpu's lock and then acquire the new_cpu's
+     * lock after that.
+     */
+    spin_lock_irqsave(old_lock, flags);
+
+    list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
+    {
+        /*
+         * Suppress notification or we may miss an interrupt when the
+         * target cpu is dying.
+         */
+        pi_set_sn(&vmx->pi_desc);
+
+        /*
+         * Check whether a notification is pending before doing the
+         * movement, if that is the case we need to wake up it directly
+         * other than moving it to the new cpu's list.
+         */
+        if ( pi_test_on(&vmx->pi_desc) )
+        {
+            list_del(&vmx->pi_blocking.list);
+            vmx->pi_blocking.lock = NULL;
+            vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
+        }
+        else
+        {
+            /*
+             * We need to find an online cpu as the NDST of the PI descriptor, it
+             * doesn't matter whether it is within the cpupool of the domain or
+             * not. As long as it is online, the vCPU will be woken up once the
+             * notification event arrives.
+             */
+            new_cpu = cpumask_any(&cpu_online_map);
+            new_lock = &per_cpu(vmx_pi_blocking, new_cpu).lock;
+
+            spin_lock(new_lock);
+
+            ASSERT(vmx->pi_blocking.lock == old_lock);
+
+            dest = cpu_physical_id(new_cpu);
+            write_atomic(&vmx->pi_desc.ndst,
+                         x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
+
+            list_move(&vmx->pi_blocking.list,
+                      &per_cpu(vmx_pi_blocking, new_cpu).list);
+            vmx->pi_blocking.lock = new_lock;
+
+            spin_unlock(new_lock);
+        }
+
+        pi_clear_sn(&vmx->pi_desc);
+    }
+
+    spin_unlock_irqrestore(old_lock, flags);
+}
+
 /* This function is called when pcidevs_lock is held */
 void vmx_pi_hooks_assign(struct domain *d)
 {
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 2f0435c..7aed01d 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -569,6 +569,7 @@ void free_p2m_hap_data(struct p2m_domain *p2m);
 void p2m_init_hap_data(struct p2m_domain *p2m);
 
 void vmx_pi_per_cpu_init(unsigned int cpu);
+void vmx_pi_desc_fixup(int cpu);
 
 void vmx_pi_hooks_assign(struct domain *d);
 void vmx_pi_hooks_deassign(struct domain *d);
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 1/7] VMX: Permanently assign PI hook vmx_pi_switch_to()
  2016-10-28  2:37 ` [PATCH v6 1/7] VMX: Permanently assign PI hook vmx_pi_switch_to() Feng Wu
@ 2016-10-28 13:04   ` Jan Beulich
  2016-10-28 13:18     ` Jan Beulich
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2016-10-28 13:04 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
> PI hook vmx_pi_switch_to() is needed even when any previously
> assigned device is detached from the domain. Since 'SN' bit is
> also used to control the CPU side PI and we change the state of
> SN bit in these two functions,

There's just one function you've mentioned up to here.

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -221,9 +221,14 @@ void vmx_pi_hooks_deassign(struct domain *d)
>      ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
>  
>      d->arch.hvm_domain.vmx.vcpu_block = NULL;
> -    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
> -    d->arch.hvm_domain.vmx.pi_switch_to = NULL;

And with the commit message as it is right now, it is completely
unclear why the from hook also gets zapped. In fact, with the
description pointing out a problem with just the SN=1 case, I
don't see what you need the from hook for without devices
assigned, as that's where SN gets set (and you want to avoid
that).

>      d->arch.hvm_domain.vmx.pi_do_resume = NULL;
> +    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
> +
> +    /*
> +     * In fact, we can remove 'vmx_pi_switch_to' inside itself if no new device

s/can/could/

> +     * is in the process of getting assigned and "from" hook is NULL. However,
> +     * it is not straightforward to find a clear solution, so just leave it here.
> +     */
>  }
>  
>  static int vmx_domain_initialise(struct domain *d)

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 1/7] VMX: Permanently assign PI hook vmx_pi_switch_to()
  2016-10-28 13:04   ` Jan Beulich
@ 2016-10-28 13:18     ` Jan Beulich
  2016-10-31  1:11       ` Wu, Feng
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2016-10-28 13:18 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 28.10.16 at 15:04, <JBeulich@suse.com> wrote:
>>>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -221,9 +221,14 @@ void vmx_pi_hooks_deassign(struct domain *d)
>>      ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
>>  
>>      d->arch.hvm_domain.vmx.vcpu_block = NULL;
>> -    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
>> -    d->arch.hvm_domain.vmx.pi_switch_to = NULL;
> 
> And with the commit message as it is right now, it is completely
> unclear why the from hook also gets zapped. In fact, with the
> description pointing out a problem with just the SN=1 case, I
> don't see what you need the from hook for without devices
> assigned, as that's where SN gets set (and you want to avoid
> that).
> 
>>      d->arch.hvm_domain.vmx.pi_do_resume = NULL;
>> +    d->arch.hvm_domain.vmx.pi_switch_from = NULL;

Oh, I'm sorry - I paid attention only to the lines getting removed
above, not the one line getting re-inserted here. That addresses
my comment, of course; I just can't see why the line needs to be
moved.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/7] VMX: Properly handle pi when all the assigned devices are removed
  2016-10-28  2:37 ` [PATCH v6 2/7] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
@ 2016-10-28 13:19   ` Jan Beulich
  2016-11-03  7:45     ` Wu, Feng
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2016-10-28 13:19 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
> @@ -215,11 +220,21 @@ void vmx_pi_hooks_assign(struct domain *d)
>  /* This function is called when pcidevs_lock is held */
>  void vmx_pi_hooks_deassign(struct domain *d)
>  {
> +    struct vcpu *v;
> +
>      if ( !iommu_intpost || !has_hvm_container_domain(d) )
>          return;
>  
>      ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
>  
> +    /*
> +     * Pausing the domain can make sure the vCPU is not
> +     * running and hence not calling the hooks simultaneously
> +     * when deassigning the PI hooks and removing the vCPU
> +     * from the blocking list.
> +     */
> +    domain_pause(d);

There's one additional caveat here which no-one of us so far thought
of: Currently there's nothing preventing the domctl-s under which
this sits from being issued by the control domain for itself. Various
other domctl-s, however, guard against this case when intending
to pause the target domain. The same needs to be done for the
ones leading here.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 3/7] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed
  2016-10-28  2:37 ` [PATCH v6 3/7] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed Feng Wu
@ 2016-10-28 13:23   ` Jan Beulich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Beulich @ 2016-10-28 13:23 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
> We should remove the vCPU from the per-cpu blocking list
> if it is going to be destroyed.

I think I did raise this question at least once before: Why can it
still be on the list in the first place, after patch 2? This is even
more odd with ...

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -338,6 +338,7 @@ static void vmx_vcpu_destroy(struct vcpu *v)
>       * separately here.
>       */
>      vmx_vcpu_disable_pml(v);
> +    vmx_pi_unblock_vcpu(v);

... the renamed function: A vCPU being destroyed can hardly get
unblocked.

If the apparently proper solution (removing devices from a domain
before destroying its vCPU-s) is not possible or feasible for some
reason, then the commit message should say so (and explain why).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 4/7] VMX: Make sure PI is in proper state before install the hooks
  2016-10-28  2:37 ` [PATCH v6 4/7] VMX: Make sure PI is in proper state before install the hooks Feng Wu
@ 2016-10-28 13:24   ` Jan Beulich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Beulich @ 2016-10-28 13:24 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
> We may hit the last ASSERT() in vmx_vcpu_block in the current code,
> since vmx_vcpu_block() may get called before vmx_pi_switch_to()
> has been installed or executed. Here We use cmpxchg to update
> the NDST field, this can make sure we only update the NDST when
> vmx_pi_switch_to() has not been called. So the NDST is in a
> proper state in vmx_vcpu_block().
> 
> Suggested-by: Jan Beulich <JBeulich@suse.com>
> Signed-off-by: Feng Wu <feng.wu@intel.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 1/7] VMX: Permanently assign PI hook vmx_pi_switch_to()
  2016-10-28 13:18     ` Jan Beulich
@ 2016-10-31  1:11       ` Wu, Feng
  0 siblings, 0 replies; 22+ messages in thread
From: Wu, Feng @ 2016-10-31  1:11 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, October 28, 2016 9:18 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH v6 1/7] VMX: Permanently assign PI hook
> vmx_pi_switch_to()
> 
> >>> On 28.10.16 at 15:04, <JBeulich@suse.com> wrote:
> >>>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
> >> --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> @@ -221,9 +221,14 @@ void vmx_pi_hooks_deassign(struct domain *d)
> >>      ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
> >>
> >>      d->arch.hvm_domain.vmx.vcpu_block = NULL;
> >> -    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
> >> -    d->arch.hvm_domain.vmx.pi_switch_to = NULL;
> >
> > And with the commit message as it is right now, it is completely
> > unclear why the from hook also gets zapped. In fact, with the
> > description pointing out a problem with just the SN=1 case, I
> > don't see what you need the from hook for without devices
> > assigned, as that's where SN gets set (and you want to avoid
> > that).
> >
> >>      d->arch.hvm_domain.vmx.pi_do_resume = NULL;
> >> +    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
> 
> Oh, I'm sorry - I paid attention only to the lines getting removed
> above, not the one line getting re-inserted here. That addresses
> my comment, of course; I just can't see why the line needs to be
> moved.

Yes, we could just reuse the original line. Thanks!

Thanks,
Feng

> 
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-28  2:37 ` [PATCH v6 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
@ 2016-10-31 14:57   ` Jan Beulich
  2016-11-03  7:46     ` Wu, Feng
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2016-10-31 14:57 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
> v6:
> - Make pi_can_suppress_irte_update() a check-only function
> - Introduce another function pi_get_new_irte() to update the 'new_ire' if 
> needed

I don't see what you need this function for. My earlier comments
were not meant to make you split the function, but to drop all this
secondary modification (unless there's actually a reason for this,
which so far I haven't seen any proof of). Apart from that there
already is setup_posted_irte(), which seems to do most if not all
of what you really need.

In any event, the sequence of operations should be
1) create full new entry
2) check whether update is needed (i.e. whether old and new entries
   have meaningful differences)
3) do update
4) flush.

> --- a/xen/drivers/passthrough/vtd/intremap.c
> +++ b/xen/drivers/passthrough/vtd/intremap.c
> @@ -547,6 +547,54 @@ static int remap_entry_to_msi_msg(
>      return 0;
>  }
>  
> +static bool pi_can_suppress_irte_update(const struct iremap_entry *new,
> +    const struct iremap_entry *old)

The two parameter declarations should be aligned with one another.

> +{
> +    ASSERT( old && new );
> +
> +    if ( !old->remap.p || !old->remap.im || !new->remap.p || new->remap.im )
> +        return false;

The asymmetry regarding the IM bit is confusing, and imo indicates
a problem with the logic.

> +    /*
> +     * We are updating posted IRTE to remapped one, check whether
> +     * the common fields are going to be changed.
> +     */
> +    if ( ( new->remap.fpd != old->post.fpd ) ||
> +         ( new->remap.sid != old->post.sid ) ||
> +         ( new->remap.sq != old->post.sq ) ||
> +         ( new->remap.svt != old->post.svt ) )

Stray blanks inside the inner parentheses.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 6/7] VT-d: Some cleanups
  2016-10-28  2:37 ` [PATCH v6 6/7] VT-d: Some cleanups Feng Wu
@ 2016-10-31 15:00   ` Jan Beulich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Beulich @ 2016-10-31 15:00 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
> @@ -643,7 +643,8 @@ static int msi_msg_to_remap_entry(
>      GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
>                       iremap_entries, iremap_entry);
>  
> -    memcpy(&new_ire, iremap_entry, sizeof(struct iremap_entry));
> +    if ( iremap_entry->remap.p )
> +        new_ire.remap.im = iremap_entry->remap.im;

I don't understand this change, but I also don't think the change is
going to be wanted/needed once the issues with the previous patch
got sorted out.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 7/7] VMX: Fixup PI descriptor when cpu is offline
  2016-10-28  2:37 ` [PATCH v6 7/7] VMX: Fixup PI descriptor when cpu is offline Feng Wu
@ 2016-10-31 15:50   ` Jan Beulich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Beulich @ 2016-10-31 15:50 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -203,6 +203,76 @@ static void vmx_pi_do_resume(struct vcpu *v)
>      vmx_pi_unblock_vcpu(v);
>  }
>  
> +void vmx_pi_desc_fixup(int cpu)

unsigned int

With that fixed,
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/7] VMX: Properly handle pi when all the assigned devices are removed
  2016-10-28 13:19   ` Jan Beulich
@ 2016-11-03  7:45     ` Wu, Feng
  2016-11-03  9:07       ` Jan Beulich
  0 siblings, 1 reply; 22+ messages in thread
From: Wu, Feng @ 2016-11-03  7:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, October 28, 2016 9:19 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: Re: [PATCH v6 2/7] VMX: Properly handle pi when all the assigned
> devices are removed
> 
> >>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
> > @@ -215,11 +220,21 @@ void vmx_pi_hooks_assign(struct domain *d)
> >  /* This function is called when pcidevs_lock is held */
> >  void vmx_pi_hooks_deassign(struct domain *d)
> >  {
> > +    struct vcpu *v;
> > +
> >      if ( !iommu_intpost || !has_hvm_container_domain(d) )
> >          return;
> >
> >      ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
> >
> > +    /*
> > +     * Pausing the domain can make sure the vCPU is not
> > +     * running and hence not calling the hooks simultaneously
> > +     * when deassigning the PI hooks and removing the vCPU
> > +     * from the blocking list.
> > +     */
> > +    domain_pause(d);
> 
> There's one additional caveat here which no-one of us so far thought
> of: Currently there's nothing preventing the domctl-s under which
> this sits from being issued by the control domain for itself. Various
> other domctl-s, however, guard against this case when intending
> to pause the target domain. The same needs to be done for the
> ones leading here.

Thanks for the comments! Could you share in which case can a domain
attach/detach a device for itself? Thanks!

Thanks,
Feng

> 
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-31 14:57   ` Jan Beulich
@ 2016-11-03  7:46     ` Wu, Feng
  2016-11-03  9:15       ` Jan Beulich
  0 siblings, 1 reply; 22+ messages in thread
From: Wu, Feng @ 2016-11-03  7:46 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, October 31, 2016 10:57 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: Re: [PATCH v6 5/7] VT-d: No need to set irq affinity for posted format
> IRTE
> 
> >>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
> > v6:
> > - Make pi_can_suppress_irte_update() a check-only function
> > - Introduce another function pi_get_new_irte() to update the 'new_ire' if
> > needed
> 
> I don't see what you need this function for. My earlier comments
> were not meant to make you split the function, but to drop all this
> secondary modification (unless there's actually a reason for this,
> which so far I haven't seen any proof of). Apart from that there
> already is setup_posted_irte(), which seems to do most if not all
> of what you really need.
> 
> In any event, the sequence of operations should be
> 1) create full new entry
> 2) check whether update is needed (i.e. whether old and new entries
>    have meaningful differences)
> 3) do update
> 4) flush.

Your early comment suggest we only suppress the affinity related
field. So the modification here is to handle the following case:

When the IRTE is currently in posted mode and this function tries
to update it he remapped mode, then we need to suppress
the affinity related bit, however, if the common fields (fpd, sid
sq, svt) in the new remapped IRTE is different from the value
in the old posted IRTE, what do you suggest we should do?
Here I copy these fields to the new IRTE and update them
accordingly.

> 
> > --- a/xen/drivers/passthrough/vtd/intremap.c
> > +++ b/xen/drivers/passthrough/vtd/intremap.c
> > @@ -547,6 +547,54 @@ static int remap_entry_to_msi_msg(
> >      return 0;
> >  }
> >
> > +static bool pi_can_suppress_irte_update(const struct iremap_entry *new,
> > +    const struct iremap_entry *old)
> 
> The two parameter declarations should be aligned with one another.
> 
> > +{
> > +    ASSERT( old && new );
> > +
> > +    if ( !old->remap.p || !old->remap.im || !new->remap.p || new->remap.im )
> > +        return false;
> 
> The asymmetry regarding the IM bit is confusing, and imo indicates
> a problem with the logic.

Would you please share what the problem is here? If we get here, that means
we are updating a posted IRTE to remapped one.

Thanks,
Feng

> 
> > +    /*
> > +     * We are updating posted IRTE to remapped one, check whether
> > +     * the common fields are going to be changed.
> > +     */
> > +    if ( ( new->remap.fpd != old->post.fpd ) ||
> > +         ( new->remap.sid != old->post.sid ) ||
> > +         ( new->remap.sq != old->post.sq ) ||
> > +         ( new->remap.svt != old->post.svt ) )
> 
> Stray blanks inside the inner parentheses.
> 
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/7] VMX: Properly handle pi when all the assigned devices are removed
  2016-11-03  7:45     ` Wu, Feng
@ 2016-11-03  9:07       ` Jan Beulich
  2016-11-03  9:10         ` Wu, Feng
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2016-11-03  9:07 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 03.11.16 at 08:45, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Friday, October 28, 2016 9:19 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org 
>> Subject: Re: [PATCH v6 2/7] VMX: Properly handle pi when all the assigned
>> devices are removed
>> 
>> >>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
>> > @@ -215,11 +220,21 @@ void vmx_pi_hooks_assign(struct domain *d)
>> >  /* This function is called when pcidevs_lock is held */
>> >  void vmx_pi_hooks_deassign(struct domain *d)
>> >  {
>> > +    struct vcpu *v;
>> > +
>> >      if ( !iommu_intpost || !has_hvm_container_domain(d) )
>> >          return;
>> >
>> >      ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
>> >
>> > +    /*
>> > +     * Pausing the domain can make sure the vCPU is not
>> > +     * running and hence not calling the hooks simultaneously
>> > +     * when deassigning the PI hooks and removing the vCPU
>> > +     * from the blocking list.
>> > +     */
>> > +    domain_pause(d);
>> 
>> There's one additional caveat here which no-one of us so far thought
>> of: Currently there's nothing preventing the domctl-s under which
>> this sits from being issued by the control domain for itself. Various
>> other domctl-s, however, guard against this case when intending
>> to pause the target domain. The same needs to be done for the
>> ones leading here.
> 
> Thanks for the comments! Could you share in which case can a domain
> attach/detach a device for itself? Thanks!

That's not the question. The code path needs to be safe, just like
all others (which we also expect Dom0 - or any other eligible domain
- to not use with itself as the target domain).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 2/7] VMX: Properly handle pi when all the assigned devices are removed
  2016-11-03  9:07       ` Jan Beulich
@ 2016-11-03  9:10         ` Wu, Feng
  0 siblings, 0 replies; 22+ messages in thread
From: Wu, Feng @ 2016-11-03  9:10 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, November 3, 2016 5:08 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v6 2/7] VMX: Properly handle pi when all the assigned
> devices are removed
> 
> >>> On 03.11.16 at 08:45, <feng.wu@intel.com> wrote:
> 
> >
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Friday, October 28, 2016 9:19 PM
> >> To: Wu, Feng <feng.wu@intel.com>
> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> >> devel@lists.xen.org
> >> Subject: Re: [PATCH v6 2/7] VMX: Properly handle pi when all the assigned
> >> devices are removed
> >>
> >> >>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
> >> > @@ -215,11 +220,21 @@ void vmx_pi_hooks_assign(struct domain *d)
> >> >  /* This function is called when pcidevs_lock is held */
> >> >  void vmx_pi_hooks_deassign(struct domain *d)
> >> >  {
> >> > +    struct vcpu *v;
> >> > +
> >> >      if ( !iommu_intpost || !has_hvm_container_domain(d) )
> >> >          return;
> >> >
> >> >      ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
> >> >
> >> > +    /*
> >> > +     * Pausing the domain can make sure the vCPU is not
> >> > +     * running and hence not calling the hooks simultaneously
> >> > +     * when deassigning the PI hooks and removing the vCPU
> >> > +     * from the blocking list.
> >> > +     */
> >> > +    domain_pause(d);
> >>
> >> There's one additional caveat here which no-one of us so far thought
> >> of: Currently there's nothing preventing the domctl-s under which
> >> this sits from being issued by the control domain for itself. Various
> >> other domctl-s, however, guard against this case when intending
> >> to pause the target domain. The same needs to be done for the
> >> ones leading here.
> >
> > Thanks for the comments! Could you share in which case can a domain
> > attach/detach a device for itself? Thanks!
> 
> That's not the question. The code path needs to be safe, just like
> all others (which we also expect Dom0 - or any other eligible domain
> - to not use with itself as the target domain).

Sure, got it!

Thanks,
Feng

> 
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v6 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-11-03  7:46     ` Wu, Feng
@ 2016-11-03  9:15       ` Jan Beulich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Beulich @ 2016-11-03  9:15 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 03.11.16 at 08:46, <feng.wu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Monday, October 31, 2016 10:57 PM
>> >>> On 28.10.16 at 04:37, <feng.wu@intel.com> wrote:
>> > v6:
>> > - Make pi_can_suppress_irte_update() a check-only function
>> > - Introduce another function pi_get_new_irte() to update the 'new_ire' if
>> > needed
>> 
>> I don't see what you need this function for. My earlier comments
>> were not meant to make you split the function, but to drop all this
>> secondary modification (unless there's actually a reason for this,
>> which so far I haven't seen any proof of). Apart from that there
>> already is setup_posted_irte(), which seems to do most if not all
>> of what you really need.
>> 
>> In any event, the sequence of operations should be
>> 1) create full new entry
>> 2) check whether update is needed (i.e. whether old and new entries
>>    have meaningful differences)
>> 3) do update
>> 4) flush.
> 
> Your early comment suggest we only suppress the affinity related
> field. So the modification here is to handle the following case:
> 
> When the IRTE is currently in posted mode and this function tries
> to update it he remapped mode, then we need to suppress
> the affinity related bit, however, if the common fields (fpd, sid
> sq, svt) in the new remapped IRTE is different from the value
> in the old posted IRTE, what do you suggest we should do?
> Here I copy these fields to the new IRTE and update them
> accordingly.

You continue to look at this from what I think is the wrong angle:
msi_msg_to_remap_entry() should not make assumptions or
special case either of the formats. It should construct the new
entry (whichever format that's supposed to be) and compare
all relevant (for the given format) fields with those of the old
entry. If they all match, skip the flush.

>> > --- a/xen/drivers/passthrough/vtd/intremap.c
>> > +++ b/xen/drivers/passthrough/vtd/intremap.c
>> > @@ -547,6 +547,54 @@ static int remap_entry_to_msi_msg(
>> >      return 0;
>> >  }
>> >
>> > +static bool pi_can_suppress_irte_update(const struct iremap_entry *new,
>> > +    const struct iremap_entry *old)
>> 
>> The two parameter declarations should be aligned with one another.
>> 
>> > +{
>> > +    ASSERT( old && new );

Btw, stray blanks here (in case I didn't point this out before).

>> > +    if ( !old->remap.p || !old->remap.im || !new->remap.p || new->remap.im )
>> > +        return false;
>> 
>> The asymmetry regarding the IM bit is confusing, and imo indicates
>> a problem with the logic.
> 
> Would you please share what the problem is here? If we get here, that means
> we are updating a posted IRTE to remapped one.

It's the assumption expressed by the second sentence of your reply
which is the problem: You're making an assumption here which you
shouldn't make - I don't see the caller guaranteeing that assumption,
and even if this one caller did, a function like the one here should be
generic enough so that future callers won't need to worry about
preconditions.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2016-11-03  9:15 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-28  2:37 [PATCH v6 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
2016-10-28  2:37 ` [PATCH v6 1/7] VMX: Permanently assign PI hook vmx_pi_switch_to() Feng Wu
2016-10-28 13:04   ` Jan Beulich
2016-10-28 13:18     ` Jan Beulich
2016-10-31  1:11       ` Wu, Feng
2016-10-28  2:37 ` [PATCH v6 2/7] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
2016-10-28 13:19   ` Jan Beulich
2016-11-03  7:45     ` Wu, Feng
2016-11-03  9:07       ` Jan Beulich
2016-11-03  9:10         ` Wu, Feng
2016-10-28  2:37 ` [PATCH v6 3/7] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed Feng Wu
2016-10-28 13:23   ` Jan Beulich
2016-10-28  2:37 ` [PATCH v6 4/7] VMX: Make sure PI is in proper state before install the hooks Feng Wu
2016-10-28 13:24   ` Jan Beulich
2016-10-28  2:37 ` [PATCH v6 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
2016-10-31 14:57   ` Jan Beulich
2016-11-03  7:46     ` Wu, Feng
2016-11-03  9:15       ` Jan Beulich
2016-10-28  2:37 ` [PATCH v6 6/7] VT-d: Some cleanups Feng Wu
2016-10-31 15:00   ` Jan Beulich
2016-10-28  2:37 ` [PATCH v6 7/7] VMX: Fixup PI descriptor when cpu is offline Feng Wu
2016-10-31 15:50   ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.