All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list
@ 2016-10-11  0:57 Feng Wu
  2016-10-11  0:57 ` [PATCH v5 1/7] VMX: Statically assign two PI hooks Feng Wu
                   ` (7 more replies)
  0 siblings, 8 replies; 35+ messages in thread
From: Feng Wu @ 2016-10-11  0:57 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

The current VT-d PI related code may operate incorrectly in the
following scenarios:
1. When the last assigned device is dettached from the domain, all
the PI related hooks are removed then, however, the vCPU can be
blocked, switched to another pCPU, etc, all without the aware of
PI. After the next time we attach another device to the domain,
which makes the PI realted hooks avaliable again, the status
of the pi descriptor is not true. Beside that, the blocking vcpu
may still remain in the per-cpu blocking in this case. Patch [1/6]
and [2/6] handle this.

2. After the domain is destroyed, the the blocking vcpu may also
remain in the per-cpu blocking. Handled in patch [3/6].

3. When IRTE is in posted mode, we don't need to set the irq
affinity for it, since the destination of these interrupts is
vCPU and the vCPU affinity is set during vCPU scheduling. Patch
[5/6] handles this.

4. When a pCPU is unplugged, and there might be vCPUs on its
list. Since the pCPU is offline, those vCPUs might not be woken
up again. [6/6] addresses it.

Feng Wu (7):
  VMX: Statically assign two PI hooks
  VMX: Properly handle pi when all the assigned devices are removed
  VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed
  VMX: Make sure PI is in proper state before install the hooks
  VT-d: No need to set irq affinity for posted format IRTE
  VT-d: Some cleanups
  VMX: Fixup PI descriptor when cpu is offline

 xen/arch/x86/hvm/vmx/vmcs.c            |  14 ++---
 xen/arch/x86/hvm/vmx/vmx.c             | 112 ++++++++++++++++++++++++++++++---
 xen/drivers/passthrough/vtd/intremap.c |  65 ++++++++++++++++---
 xen/include/asm-x86/hvm/vmx/vmx.h      |   1 +
 4 files changed, 168 insertions(+), 24 deletions(-)

-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v5 1/7] VMX: Statically assign two PI hooks
  2016-10-11  0:57 [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
@ 2016-10-11  0:57 ` Feng Wu
  2016-10-11  8:11   ` Tian, Kevin
  2016-10-12 13:25   ` Jan Beulich
  2016-10-11  0:57 ` [PATCH v5 2/7] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 35+ messages in thread
From: Feng Wu @ 2016-10-11  0:57 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

PI hooks: vmx_pi_switch_from() and vmx_pi_switch_to() are
needed even when any previously assigned device is detached
from the domain. Since 'SN' bit is also used to control the
CPU side PI and we change the state of SN bit in these two
functions, then evaluate this bit in vmx_deliver_posted_intr()
when trying to deliver the interrupt in posted way via software.
The problem is if we deassign the hooks while the vCPU is runnable
in the runqueue with 'SN' set, all the furture notificaton event
will be suppressed. This patch makes these two hooks statically
assigned.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v5:
- Zap "pi_switch_from" hook

 xen/arch/x86/hvm/vmx/vmx.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 3d330b6..623d5bc 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -147,6 +147,11 @@ static void vmx_pi_switch_from(struct vcpu *v)
     pi_set_sn(pi_desc);
 }
 
+/*
+ * In fact, we can remove this hooks inside itself if no new device is in the
+ * process of getting assigned and "from" hook is NULL. However, it is not
+ * straightforward to find a clear solution, so just leave it here.
+ */
 static void vmx_pi_switch_to(struct vcpu *v)
 {
     struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
@@ -221,9 +226,8 @@ void vmx_pi_hooks_deassign(struct domain *d)
     ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
 
     d->arch.hvm_domain.vmx.vcpu_block = NULL;
-    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
-    d->arch.hvm_domain.vmx.pi_switch_to = NULL;
     d->arch.hvm_domain.vmx.pi_do_resume = NULL;
+    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
 }
 
 static int vmx_domain_initialise(struct domain *d)
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v5 2/7] VMX: Properly handle pi when all the assigned devices are removed
  2016-10-11  0:57 [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
  2016-10-11  0:57 ` [PATCH v5 1/7] VMX: Statically assign two PI hooks Feng Wu
@ 2016-10-11  0:57 ` Feng Wu
  2016-10-11  8:20   ` Tian, Kevin
  2016-10-11  0:57 ` [PATCH v5 3/7] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed Feng Wu
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2016-10-11  0:57 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

This patch handles some concern cases when the last assigned device
is removed from the domain. In this case we should carefully handle
pi descriptor and the per-cpu blocking list, to make sure:
- all the PI descriptor are in the right state when next time a
devices is assigned to the domain again.
- No remaining vcpus of the domain in the per-cpu blocking list.

Here we call vmx_pi_list_remove() to remove the vCPU from the blocking list
if it is on the list. However, this could happen when vmx_vcpu_block() is
being called, hence we might incorrectly add the vCPU to the blocking list
while the last devcie is detached from the domain. Consider that the situation
can only occur when detaching the last device from the domain and it is not
a frequent operation, so we use domain_pause before that, which is considered
as an clean and maintainable solution for the situation.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v5:
- Remove a no-op wrapper

 xen/arch/x86/hvm/vmx/vmx.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 623d5bc..d210516 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -163,14 +163,12 @@ static void vmx_pi_switch_to(struct vcpu *v)
     pi_clear_sn(pi_desc);
 }
 
-static void vmx_pi_do_resume(struct vcpu *v)
+static void vmx_pi_list_remove(struct vcpu *v)
 {
     unsigned long flags;
     spinlock_t *pi_blocking_list_lock;
     struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
 
-    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
-
     /*
      * Set 'NV' field back to posted_intr_vector, so the
      * Posted-Interrupts can be delivered to the vCPU when
@@ -178,12 +176,12 @@ static void vmx_pi_do_resume(struct vcpu *v)
      */
     write_atomic(&pi_desc->nv, posted_intr_vector);
 
-    /* The vCPU is not on any blocking list. */
     pi_blocking_list_lock = v->arch.hvm_vmx.pi_blocking.lock;
 
     /* Prevent the compiler from eliminating the local variable.*/
     smp_rmb();
 
+    /* The vCPU is not on any blocking list. */
     if ( pi_blocking_list_lock == NULL )
         return;
 
@@ -203,6 +201,13 @@ static void vmx_pi_do_resume(struct vcpu *v)
     spin_unlock_irqrestore(pi_blocking_list_lock, flags);
 }
 
+static void vmx_pi_do_resume(struct vcpu *v)
+{
+    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
+
+    vmx_pi_list_remove(v);
+}
+
 /* This function is called when pcidevs_lock is held */
 void vmx_pi_hooks_assign(struct domain *d)
 {
@@ -220,14 +225,29 @@ void vmx_pi_hooks_assign(struct domain *d)
 /* This function is called when pcidevs_lock is held */
 void vmx_pi_hooks_deassign(struct domain *d)
 {
+    struct vcpu *v;
+
     if ( !iommu_intpost || !has_hvm_container_domain(d) )
         return;
 
     ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
 
+    /*
+     * Pausing the domain can make sure the vCPU is not
+     * running and hence calling the hooks simultaneously
+     * when deassigning the PI hooks and removing the vCPU
+     * from the blocking list.
+     */
+    domain_pause(d);
+
     d->arch.hvm_domain.vmx.vcpu_block = NULL;
     d->arch.hvm_domain.vmx.pi_do_resume = NULL;
     d->arch.hvm_domain.vmx.pi_switch_from = NULL;
+
+    for_each_vcpu ( d, v )
+        vmx_pi_list_remove(v);
+
+    domain_unpause(d);
 }
 
 static int vmx_domain_initialise(struct domain *d)
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v5 3/7] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed
  2016-10-11  0:57 [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
  2016-10-11  0:57 ` [PATCH v5 1/7] VMX: Statically assign two PI hooks Feng Wu
  2016-10-11  0:57 ` [PATCH v5 2/7] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
@ 2016-10-11  0:57 ` Feng Wu
  2016-10-11  0:57 ` [PATCH v5 4/7] VMX: Make sure PI is in proper state before install the hooks Feng Wu
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 35+ messages in thread
From: Feng Wu @ 2016-10-11  0:57 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

We should remove the vCPU from the per-cpu blocking list
if it is going to be destroyed.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v5:
- Use vmx_pi_list_remove() instead of vmx_pi_list_cleanup()

 xen/arch/x86/hvm/vmx/vmx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index d210516..b8b152a 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -337,6 +337,7 @@ static void vmx_vcpu_destroy(struct vcpu *v)
      * separately here.
      */
     vmx_vcpu_disable_pml(v);
+    vmx_pi_list_remove(v);
     vmx_destroy_vmcs(v);
     vpmu_destroy(v);
     passive_domain_destroy(v);
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v5 4/7] VMX: Make sure PI is in proper state before install the hooks
  2016-10-11  0:57 [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
                   ` (2 preceding siblings ...)
  2016-10-11  0:57 ` [PATCH v5 3/7] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed Feng Wu
@ 2016-10-11  0:57 ` Feng Wu
  2016-10-12 13:45   ` Jan Beulich
  2016-10-11  0:57 ` [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2016-10-11  0:57 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

We may hit the last ASSERT() in vmx_vcpu_block in the current code,
since vmx_vcpu_block() may get called before vmx_pi_switch_to()
has been installed or executed. Here We use cmpxchg to update
the NDST field, this can make sure we only update the NDST when
vmx_pi_switch_to() has not been called. So the NDST is in a
proper state in vmx_vcpu_block().

Suggested-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v5:
- Use 0xffffffff as the invalid value for NDST field.

 xen/arch/x86/hvm/vmx/vmcs.c | 13 +++++--------
 xen/arch/x86/hvm/vmx/vmx.c  | 27 ++++++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 1bd875a..10976bd 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -956,16 +956,13 @@ void virtual_vmcs_vmwrite(const struct vcpu *v, u32 vmcs_encoding, u64 val)
  */
 static void pi_desc_init(struct vcpu *v)
 {
-    uint32_t dest;
-
     v->arch.hvm_vmx.pi_desc.nv = posted_intr_vector;
 
-    dest = cpu_physical_id(v->processor);
-
-    if ( x2apic_enabled )
-        v->arch.hvm_vmx.pi_desc.ndst = dest;
-    else
-        v->arch.hvm_vmx.pi_desc.ndst = MASK_INSR(dest, PI_xAPIC_NDST_MASK);
+    /*
+     * Mark NDST as invalid, then we can use this invalid value as a
+     * marker to whether update NDST or not in vmx_pi_hooks_assign().
+     */
+    v->arch.hvm_vmx.pi_desc.ndst = 0xffffffff;
 }
 
 static int construct_vmcs(struct vcpu *v)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b8b152a..b14c84e 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -211,14 +211,39 @@ static void vmx_pi_do_resume(struct vcpu *v)
 /* This function is called when pcidevs_lock is held */
 void vmx_pi_hooks_assign(struct domain *d)
 {
+    struct vcpu *v;
+
     if ( !iommu_intpost || !has_hvm_container_domain(d) )
         return;
 
     ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
 
-    d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
+    /*
+     * We carefully handle the timing here:
+     * - Install the context switch first
+     * - Then set the NDST field
+     * - Install the block and resume hooks in the end
+     *
+     * This can make sure the PI (especially the NDST feild) is
+     * in proper state when we call vmx_vcpu_block().
+     */
     d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
     d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
+
+    for_each_vcpu ( d, v )
+    {
+        unsigned int dest = cpu_physical_id(v->processor);
+        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+
+        /*
+         * We don't need to update NDST if 'arch.hvm_domain.vmx.pi_switch_to'
+         * already gets called
+         */
+        (void)cmpxchg(&pi_desc->ndst, 0xffffffff,
+                x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
+    }
+
+    d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
     d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
 }
 
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-11  0:57 [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
                   ` (3 preceding siblings ...)
  2016-10-11  0:57 ` [PATCH v5 4/7] VMX: Make sure PI is in proper state before install the hooks Feng Wu
@ 2016-10-11  0:57 ` Feng Wu
  2016-10-12 13:56   ` Jan Beulich
  2016-10-11  0:57 ` [PATCH v5 6/7] VT-d: Some cleanups Feng Wu
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2016-10-11  0:57 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

We don't set the affinity for posted format IRTE, since the
destination of these interrupts is vCPU and the vCPU affinity
is set during vCPU scheduling.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v5:
- Only suppress affinity related IRTE updates for PI

 xen/drivers/passthrough/vtd/intremap.c | 52 ++++++++++++++++++++++++++++++++--
 1 file changed, 49 insertions(+), 3 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
index bfd468b..8dd0373 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -547,6 +547,49 @@ static int remap_entry_to_msi_msg(
     return 0;
 }
 
+static bool_t pi_can_suppress_irte_update(struct iremap_entry *new,
+    const struct iremap_entry *old)
+{
+    bool_t ret = 1;
+    u16 fpd, sid, sq, svt;
+
+    if ( !old->remap.p || !old->remap.im )
+        return 0;
+
+    fpd = new->post.fpd;
+    sid = new->post.sid;
+    sq = new->post.sq;
+    svt = new->post.svt;
+
+    *new = *old;
+
+    if ( fpd != old->post.fpd )
+    {
+        new->post.fpd = fpd;
+        ret = 0;
+    }
+
+    if ( sid != old->post.sid )
+    {
+        new->post.sid = sid;
+        ret = 0;
+    }
+
+    if ( sq != old->post.sq )
+    {
+        new->post.sq = sq;
+        ret = 0;
+    }
+
+    if ( svt != old->post.svt )
+    {
+        new->post.svt = svt;
+        ret = 0;
+    }
+
+    return ret;
+}
+
 static int msi_msg_to_remap_entry(
     struct iommu *iommu, struct pci_dev *pdev,
     struct msi_desc *msi_desc, struct msi_msg *msg)
@@ -637,9 +680,12 @@ static int msi_msg_to_remap_entry(
     remap_rte->address_hi = 0;
     remap_rte->data = index - i;
 
-    memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
-    iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
-    iommu_flush_iec_index(iommu, 0, index);
+    if ( !pi_can_suppress_irte_update(&new_ire, iremap_entry) )
+    {
+        memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
+        iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
+        iommu_flush_iec_index(iommu, 0, index);
+    }
 
     unmap_vtd_domain_page(iremap_entries);
     spin_unlock_irqrestore(&ir_ctrl->iremap_lock, flags);
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v5 6/7] VT-d: Some cleanups
  2016-10-11  0:57 [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
                   ` (4 preceding siblings ...)
  2016-10-11  0:57 ` [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
@ 2016-10-11  0:57 ` Feng Wu
  2016-10-12 14:01   ` Jan Beulich
  2016-10-11  0:57 ` [PATCH v5 7/7] VMX: Fixup PI descriptor when cpu is offline Feng Wu
  2016-10-11  8:08 ` [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Tian, Kevin
  7 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2016-10-11  0:57 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

Use type-safe structure assignment instead of memcpy()
Use sizeof(*iremap_entry)

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/drivers/passthrough/vtd/intremap.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
index 8dd0373..b6f8d3d 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -183,8 +183,8 @@ static void free_remap_entry(struct iommu *iommu, int index)
     GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
                      iremap_entries, iremap_entry);
 
-    memset(iremap_entry, 0, sizeof(struct iremap_entry));
-    iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
+    memset(iremap_entry, 0, sizeof(*iremap_entry));
+    iommu_flush_cache_entry(iremap_entry, sizeof(*iremap_entry));
     iommu_flush_iec_index(iommu, 0, index);
 
     unmap_vtd_domain_page(iremap_entries);
@@ -310,7 +310,7 @@ static int ioapic_rte_to_remap_entry(struct iommu *iommu,
     GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
                      iremap_entries, iremap_entry);
 
-    memcpy(&new_ire, iremap_entry, sizeof(struct iremap_entry));
+    new_ire = *iremap_entry;
 
     if ( rte_upper )
     {
@@ -353,8 +353,8 @@ static int ioapic_rte_to_remap_entry(struct iommu *iommu,
         remap_rte->format = 1;    /* indicate remap format */
     }
 
-    memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
-    iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
+    *iremap_entry = new_ire;
+    iommu_flush_cache_entry(iremap_entry, sizeof(*iremap_entry));
     iommu_flush_iec_index(iommu, 0, index);
 
     unmap_vtd_domain_page(iremap_entries);
@@ -638,7 +638,8 @@ static int msi_msg_to_remap_entry(
     GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
                      iremap_entries, iremap_entry);
 
-    memcpy(&new_ire, iremap_entry, sizeof(struct iremap_entry));
+    if ( iremap_entry->remap.p )
+        new_ire.remap.im = iremap_entry->remap.im;
 
     /* Set interrupt remapping table entry */
     new_ire.remap.fpd = 0;
@@ -682,8 +683,8 @@ static int msi_msg_to_remap_entry(
 
     if ( !pi_can_suppress_irte_update(&new_ire, iremap_entry) )
     {
-        memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
-        iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
+        *iremap_entry = new_ire;
+        iommu_flush_cache_entry(iremap_entry, sizeof(*iremap_entry));
         iommu_flush_iec_index(iommu, 0, index);
     }
 
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v5 7/7] VMX: Fixup PI descriptor when cpu is offline
  2016-10-11  0:57 [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
                   ` (5 preceding siblings ...)
  2016-10-11  0:57 ` [PATCH v5 6/7] VT-d: Some cleanups Feng Wu
@ 2016-10-11  0:57 ` Feng Wu
  2016-10-11  8:38   ` Tian, Kevin
  2016-10-11  8:08 ` [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Tian, Kevin
  7 siblings, 1 reply; 35+ messages in thread
From: Feng Wu @ 2016-10-11  0:57 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

When cpu is offline, we need to move all the vcpus in its blocking
list to another online cpu, this patch handles it.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v5:
- Add some comments to explain why it doesn't cause deadlock
for the ABBA deadlock scenario.

 xen/arch/x86/hvm/vmx/vmcs.c       |  1 +
 xen/arch/x86/hvm/vmx/vmx.c        | 48 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/vmx/vmx.h |  1 +
 3 files changed, 50 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 10976bd..5dd68ca 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -578,6 +578,7 @@ void vmx_cpu_dead(unsigned int cpu)
     vmx_free_vmcs(per_cpu(vmxon_region, cpu));
     per_cpu(vmxon_region, cpu) = 0;
     nvmx_cpu_dead(cpu);
+    vmx_pi_desc_fixup(cpu);
 }
 
 int vmx_cpu_up(void)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b14c84e..c71d496 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -208,6 +208,54 @@ static void vmx_pi_do_resume(struct vcpu *v)
     vmx_pi_list_remove(v);
 }
 
+void vmx_pi_desc_fixup(int cpu)
+{
+    unsigned int new_cpu, dest;
+    unsigned long flags;
+    struct arch_vmx_struct *vmx, *tmp;
+    spinlock_t *new_lock, *old_lock = &per_cpu(vmx_pi_blocking, cpu).lock;
+    struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
+
+    if ( !iommu_intpost )
+        return;
+
+    /*
+     * We are in the context of CPU_DEAD or CPU_UP_CANCELED notification,
+     * and it is impossible for a second CPU go down in parallel. So we
+     * can safely acquire the old cpu's lock and then acquire the new_cpu's
+     * lock after that.
+     */
+    spin_lock_irqsave(old_lock, flags);
+
+    list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
+    {
+        /*
+         * We need to find an online cpu as the NDST of the PI descriptor, it
+         * doesn't matter whether it is within the cpupool of the domain or
+         * not. As long as it is online, the vCPU will be woken up once the
+         * notification event arrives.
+         */
+        new_cpu = cpumask_any(&cpu_online_map);
+        new_lock = &per_cpu(vmx_pi_blocking, new_cpu).lock;
+
+        spin_lock(new_lock);
+
+        ASSERT(vmx->pi_blocking.lock == old_lock);
+
+        dest = cpu_physical_id(new_cpu);
+        write_atomic(&vmx->pi_desc.ndst,
+                     x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
+
+        list_move(&vmx->pi_blocking.list,
+                  &per_cpu(vmx_pi_blocking, new_cpu).list);
+        vmx->pi_blocking.lock = new_lock;
+
+        spin_unlock(new_lock);
+    }
+
+    spin_unlock_irqrestore(old_lock, flags);
+}
+
 /* This function is called when pcidevs_lock is held */
 void vmx_pi_hooks_assign(struct domain *d)
 {
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 4cdd9b1..9783c70 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -569,6 +569,7 @@ void free_p2m_hap_data(struct p2m_domain *p2m);
 void p2m_init_hap_data(struct p2m_domain *p2m);
 
 void vmx_pi_per_cpu_init(unsigned int cpu);
+void vmx_pi_desc_fixup(int cpu);
 
 void vmx_pi_hooks_assign(struct domain *d);
 void vmx_pi_hooks_deassign(struct domain *d);
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list
  2016-10-11  0:57 [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
                   ` (6 preceding siblings ...)
  2016-10-11  0:57 ` [PATCH v5 7/7] VMX: Fixup PI descriptor when cpu is offline Feng Wu
@ 2016-10-11  8:08 ` Tian, Kevin
  2016-10-11  8:11   ` Wu, Feng
  7 siblings, 1 reply; 35+ messages in thread
From: Tian, Kevin @ 2016-10-11  8:08 UTC (permalink / raw)
  To: Wu, Feng, xen-devel
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, jbeulich

> From: Wu, Feng
> Sent: Tuesday, October 11, 2016 8:58 AM
> 
> The current VT-d PI related code may operate incorrectly in the
> following scenarios:
> 1. When the last assigned device is dettached from the domain, all
> the PI related hooks are removed then, however, the vCPU can be
> blocked, switched to another pCPU, etc, all without the aware of
> PI. After the next time we attach another device to the domain,
> which makes the PI realted hooks avaliable again, the status
> of the pi descriptor is not true. Beside that, the blocking vcpu
> may still remain in the per-cpu blocking in this case. Patch [1/6]
> and [2/6] handle this.

Your v5 series have 7 patches in total.

> 
> 2. After the domain is destroyed, the the blocking vcpu may also
> remain in the per-cpu blocking. Handled in patch [3/6].
> 
> 3. When IRTE is in posted mode, we don't need to set the irq
> affinity for it, since the destination of these interrupts is
> vCPU and the vCPU affinity is set during vCPU scheduling. Patch
> [5/6] handles this.
> 
> 4. When a pCPU is unplugged, and there might be vCPUs on its
> list. Since the pCPU is offline, those vCPUs might not be woken
> up again. [6/6] addresses it.
> 
> Feng Wu (7):
>   VMX: Statically assign two PI hooks
>   VMX: Properly handle pi when all the assigned devices are removed
>   VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed
>   VMX: Make sure PI is in proper state before install the hooks
>   VT-d: No need to set irq affinity for posted format IRTE
>   VT-d: Some cleanups
>   VMX: Fixup PI descriptor when cpu is offline
> 
>  xen/arch/x86/hvm/vmx/vmcs.c            |  14 ++---
>  xen/arch/x86/hvm/vmx/vmx.c             | 112
> ++++++++++++++++++++++++++++++---
>  xen/drivers/passthrough/vtd/intremap.c |  65 ++++++++++++++++---
>  xen/include/asm-x86/hvm/vmx/vmx.h      |   1 +
>  4 files changed, 168 insertions(+), 24 deletions(-)
> 
> --
> 2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 1/7] VMX: Statically assign two PI hooks
  2016-10-11  0:57 ` [PATCH v5 1/7] VMX: Statically assign two PI hooks Feng Wu
@ 2016-10-11  8:11   ` Tian, Kevin
  2016-10-12 13:25   ` Jan Beulich
  1 sibling, 0 replies; 35+ messages in thread
From: Tian, Kevin @ 2016-10-11  8:11 UTC (permalink / raw)
  To: Wu, Feng, xen-devel
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, jbeulich

> From: Wu, Feng
> Sent: Tuesday, October 11, 2016 8:58 AM
> 
> PI hooks: vmx_pi_switch_from() and vmx_pi_switch_to() are
> needed even when any previously assigned device is detached
> from the domain. Since 'SN' bit is also used to control the
> CPU side PI and we change the state of SN bit in these two
> functions, then evaluate this bit in vmx_deliver_posted_intr()
> when trying to deliver the interrupt in posted way via software.
> The problem is if we deassign the hooks while the vCPU is runnable
> in the runqueue with 'SN' set, all the furture notificaton event
> will be suppressed. This patch makes these two hooks statically
> assigned.

Your patch has only one hook statically assigned...

> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
> v5:
> - Zap "pi_switch_from" hook
> 
>  xen/arch/x86/hvm/vmx/vmx.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 3d330b6..623d5bc 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -147,6 +147,11 @@ static void vmx_pi_switch_from(struct vcpu *v)
>      pi_set_sn(pi_desc);
>  }
> 
> +/*
> + * In fact, we can remove this hooks inside itself if no new device is in the
> + * process of getting assigned and "from" hook is NULL. However, it is not
> + * straightforward to find a clear solution, so just leave it here.
> + */
>  static void vmx_pi_switch_to(struct vcpu *v)
>  {
>      struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> @@ -221,9 +226,8 @@ void vmx_pi_hooks_deassign(struct domain *d)
>      ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
> 
>      d->arch.hvm_domain.vmx.vcpu_block = NULL;
> -    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
> -    d->arch.hvm_domain.vmx.pi_switch_to = NULL;
>      d->arch.hvm_domain.vmx.pi_do_resume = NULL;
> +    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
>  }
> 
>  static int vmx_domain_initialise(struct domain *d)
> --
> 2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list
  2016-10-11  8:08 ` [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Tian, Kevin
@ 2016-10-11  8:11   ` Wu, Feng
  0 siblings, 0 replies; 35+ messages in thread
From: Wu, Feng @ 2016-10-11  8:11 UTC (permalink / raw)
  To: Tian, Kevin, xen-devel
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Wu, Feng, jbeulich



> -----Original Message-----
> From: Tian, Kevin
> Sent: Tuesday, October 11, 2016 4:08 PM
> To: Wu, Feng <feng.wu@intel.com>; xen-devel@lists.xen.org
> Cc: jbeulich@suse.com; andrew.cooper3@citrix.com;
> george.dunlap@eu.citrix.com; dario.faggioli@citrix.com
> Subject: RE: [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu
> blocking list
> 
> > From: Wu, Feng
> > Sent: Tuesday, October 11, 2016 8:58 AM
> >
> > The current VT-d PI related code may operate incorrectly in the
> > following scenarios:
> > 1. When the last assigned device is dettached from the domain, all
> > the PI related hooks are removed then, however, the vCPU can be
> > blocked, switched to another pCPU, etc, all without the aware of
> > PI. After the next time we attach another device to the domain,
> > which makes the PI realted hooks avaliable again, the status
> > of the pi descriptor is not true. Besides that, the blocking vcpu
> > may still remain in the per-cpu blocking in this case. Patch [1/6]
> > and [2/6] handle this.
> 
> Your v5 series have 7 patches in total.

Oh, yes, forgot to update this. Should be [1/7] and [2/7]

> 
> >
> > 2. After the domain is destroyed, the the blocking vcpu may also
> > remain in the per-cpu blocking. Handled in patch [3/6].

[3/7]

> >
> > 3. When IRTE is in posted mode, we don't need to set the irq
> > affinity for it, since the destination of these interrupts is
> > vCPU and the vCPU affinity is set during vCPU scheduling. Patch
> > [5/6] handles this.

[5/7]

> >
> > 4. When a pCPU is unplugged, and there might be vCPUs on its
> > list. Since the pCPU is offline, those vCPUs might not be woken
> > up again. [6/6] addresses it.

[7/7]

Thanks,
Feng


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 2/7] VMX: Properly handle pi when all the assigned devices are removed
  2016-10-11  0:57 ` [PATCH v5 2/7] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
@ 2016-10-11  8:20   ` Tian, Kevin
  0 siblings, 0 replies; 35+ messages in thread
From: Tian, Kevin @ 2016-10-11  8:20 UTC (permalink / raw)
  To: Wu, Feng, xen-devel
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, jbeulich

> From: Wu, Feng
> Sent: Tuesday, October 11, 2016 8:58 AM
> 
> This patch handles some concern cases when the last assigned device

concern -> corner

> is removed from the domain. In this case we should carefully handle
> pi descriptor and the per-cpu blocking list, to make sure:
> - all the PI descriptor are in the right state when next time a
> devices is assigned to the domain again.
> - No remaining vcpus of the domain in the per-cpu blocking list.
> 
> Here we call vmx_pi_list_remove() to remove the vCPU from the blocking list
> if it is on the list. However, this could happen when vmx_vcpu_block() is
> being called, hence we might incorrectly add the vCPU to the blocking list
> while the last devcie is detached from the domain. Consider that the situation
> can only occur when detaching the last device from the domain and it is not
> a frequent operation, so we use domain_pause before that, which is considered
> as an clean and maintainable solution for the situation.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
> v5:
> - Remove a no-op wrapper
> 
>  xen/arch/x86/hvm/vmx/vmx.c | 28 ++++++++++++++++++++++++----
>  1 file changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 623d5bc..d210516 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -163,14 +163,12 @@ static void vmx_pi_switch_to(struct vcpu *v)
>      pi_clear_sn(pi_desc);
>  } 	
> 
> -static void vmx_pi_do_resume(struct vcpu *v)
> +static void vmx_pi_list_remove(struct vcpu *v)

vmx_pi_unblock_vcpu?

>  {
>      unsigned long flags;
>      spinlock_t *pi_blocking_list_lock;
>      struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> 
> -    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
> -
>      /*
>       * Set 'NV' field back to posted_intr_vector, so the
>       * Posted-Interrupts can be delivered to the vCPU when
> @@ -178,12 +176,12 @@ static void vmx_pi_do_resume(struct vcpu *v)
>       */
>      write_atomic(&pi_desc->nv, posted_intr_vector);
> 
> -    /* The vCPU is not on any blocking list. */
>      pi_blocking_list_lock = v->arch.hvm_vmx.pi_blocking.lock;
> 
>      /* Prevent the compiler from eliminating the local variable.*/
>      smp_rmb();
> 
> +    /* The vCPU is not on any blocking list. */
>      if ( pi_blocking_list_lock == NULL )
>          return;
> 
> @@ -203,6 +201,13 @@ static void vmx_pi_do_resume(struct vcpu *v)
>      spin_unlock_irqrestore(pi_blocking_list_lock, flags);
>  }
> 
> +static void vmx_pi_do_resume(struct vcpu *v)
> +{
> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
> +
> +    vmx_pi_list_remove(v);
> +}
> +
>  /* This function is called when pcidevs_lock is held */
>  void vmx_pi_hooks_assign(struct domain *d)
>  {
> @@ -220,14 +225,29 @@ void vmx_pi_hooks_assign(struct domain *d)
>  /* This function is called when pcidevs_lock is held */
>  void vmx_pi_hooks_deassign(struct domain *d)
>  {
> +    struct vcpu *v;
> +
>      if ( !iommu_intpost || !has_hvm_container_domain(d) )
>          return;
> 
>      ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
> 
> +    /*
> +     * Pausing the domain can make sure the vCPU is not
> +     * running and hence calling the hooks simultaneously

and hence 'not' calling

> +     * when deassigning the PI hooks and removing the vCPU
> +     * from the blocking list.
> +     */
> +    domain_pause(d);
> +
>      d->arch.hvm_domain.vmx.vcpu_block = NULL;
>      d->arch.hvm_domain.vmx.pi_do_resume = NULL;
>      d->arch.hvm_domain.vmx.pi_switch_from = NULL;
> +
> +    for_each_vcpu ( d, v )
> +        vmx_pi_list_remove(v);
> +
> +    domain_unpause(d);
>  }
> 
>  static int vmx_domain_initialise(struct domain *d)
> --
> 2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 7/7] VMX: Fixup PI descriptor when cpu is offline
  2016-10-11  0:57 ` [PATCH v5 7/7] VMX: Fixup PI descriptor when cpu is offline Feng Wu
@ 2016-10-11  8:38   ` Tian, Kevin
  2016-10-11 11:46     ` Wu, Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Tian, Kevin @ 2016-10-11  8:38 UTC (permalink / raw)
  To: Wu, Feng, xen-devel
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, jbeulich

> From: Wu, Feng
> Sent: Tuesday, October 11, 2016 8:58 AM
> 
> When cpu is offline, we need to move all the vcpus in its blocking
> list to another online cpu, this patch handles it.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
> v5:
> - Add some comments to explain why it doesn't cause deadlock
> for the ABBA deadlock scenario.
> 
>  xen/arch/x86/hvm/vmx/vmcs.c       |  1 +
>  xen/arch/x86/hvm/vmx/vmx.c        | 48
> +++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/hvm/vmx/vmx.h |  1 +
>  3 files changed, 50 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> index 10976bd..5dd68ca 100644
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -578,6 +578,7 @@ void vmx_cpu_dead(unsigned int cpu)
>      vmx_free_vmcs(per_cpu(vmxon_region, cpu));
>      per_cpu(vmxon_region, cpu) = 0;
>      nvmx_cpu_dead(cpu);
> +    vmx_pi_desc_fixup(cpu);
>  }
> 
>  int vmx_cpu_up(void)
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index b14c84e..c71d496 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -208,6 +208,54 @@ static void vmx_pi_do_resume(struct vcpu *v)
>      vmx_pi_list_remove(v);
>  }
> 
> +void vmx_pi_desc_fixup(int cpu)
> +{
> +    unsigned int new_cpu, dest;
> +    unsigned long flags;
> +    struct arch_vmx_struct *vmx, *tmp;
> +    spinlock_t *new_lock, *old_lock = &per_cpu(vmx_pi_blocking, cpu).lock;
> +    struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
> +
> +    if ( !iommu_intpost )
> +        return;
> +
> +    /*
> +     * We are in the context of CPU_DEAD or CPU_UP_CANCELED notification,
> +     * and it is impossible for a second CPU go down in parallel. So we
> +     * can safely acquire the old cpu's lock and then acquire the new_cpu's
> +     * lock after that.
> +     */
> +    spin_lock_irqsave(old_lock, flags);
> +
> +    list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
> +    {
> +        /*
> +         * We need to find an online cpu as the NDST of the PI descriptor, it
> +         * doesn't matter whether it is within the cpupool of the domain or
> +         * not. As long as it is online, the vCPU will be woken up once the
> +         * notification event arrives.
> +         */
> +        new_cpu = cpumask_any(&cpu_online_map);
> +        new_lock = &per_cpu(vmx_pi_blocking, new_cpu).lock;
> +
> +        spin_lock(new_lock);
> +
> +        ASSERT(vmx->pi_blocking.lock == old_lock);
> +
> +        dest = cpu_physical_id(new_cpu);
> +        write_atomic(&vmx->pi_desc.ndst,
> +                     x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
> +
> +        list_move(&vmx->pi_blocking.list,
> +                  &per_cpu(vmx_pi_blocking, new_cpu).list);
> +        vmx->pi_blocking.lock = new_lock;
> +
> +        spin_unlock(new_lock);

I didn't check the whole flow... but did you suppress notification somewhere
earlier before above list movement happens? Otherwise you may miss an
interrupt when the target cpu is dying...

> +    }
> +
> +    spin_unlock_irqrestore(old_lock, flags);
> +}
> +
>  /* This function is called when pcidevs_lock is held */
>  void vmx_pi_hooks_assign(struct domain *d)
>  {
> diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h
> b/xen/include/asm-x86/hvm/vmx/vmx.h
> index 4cdd9b1..9783c70 100644
> --- a/xen/include/asm-x86/hvm/vmx/vmx.h
> +++ b/xen/include/asm-x86/hvm/vmx/vmx.h
> @@ -569,6 +569,7 @@ void free_p2m_hap_data(struct p2m_domain *p2m);
>  void p2m_init_hap_data(struct p2m_domain *p2m);
> 
>  void vmx_pi_per_cpu_init(unsigned int cpu);
> +void vmx_pi_desc_fixup(int cpu);
> 
>  void vmx_pi_hooks_assign(struct domain *d);
>  void vmx_pi_hooks_deassign(struct domain *d);
> --
> 2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 7/7] VMX: Fixup PI descriptor when cpu is offline
  2016-10-11  8:38   ` Tian, Kevin
@ 2016-10-11 11:46     ` Wu, Feng
  0 siblings, 0 replies; 35+ messages in thread
From: Wu, Feng @ 2016-10-11 11:46 UTC (permalink / raw)
  To: Tian, Kevin, xen-devel
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Wu, Feng, jbeulich



> -----Original Message-----
> From: Tian, Kevin
> Sent: Tuesday, October 11, 2016 4:38 PM
> To: Wu, Feng <feng.wu@intel.com>; xen-devel@lists.xen.org
> Cc: jbeulich@suse.com; andrew.cooper3@citrix.com;
> george.dunlap@eu.citrix.com; dario.faggioli@citrix.com
> Subject: RE: [PATCH v5 7/7] VMX: Fixup PI descriptor when cpu is offline
> 
> > From: Wu, Feng
> > Sent: Tuesday, October 11, 2016 8:58 AM
> >
> > When cpu is offline, we need to move all the vcpus in its blocking
> > list to another online cpu, this patch handles it.
> >
> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> > ---
> > v5:
> > - Add some comments to explain why it doesn't cause deadlock
> > for the ABBA deadlock scenario.
> >
> >  xen/arch/x86/hvm/vmx/vmcs.c       |  1 +
> >  xen/arch/x86/hvm/vmx/vmx.c        | 48
> > +++++++++++++++++++++++++++++++++++++++
> >  xen/include/asm-x86/hvm/vmx/vmx.h |  1 +
> >  3 files changed, 50 insertions(+)
> >
> > diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> > index 10976bd..5dd68ca 100644
> > --- a/xen/arch/x86/hvm/vmx/vmcs.c
> > +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> > @@ -578,6 +578,7 @@ void vmx_cpu_dead(unsigned int cpu)
> >      vmx_free_vmcs(per_cpu(vmxon_region, cpu));
> >      per_cpu(vmxon_region, cpu) = 0;
> >      nvmx_cpu_dead(cpu);
> > +    vmx_pi_desc_fixup(cpu);
> >  }
> >
> >  int vmx_cpu_up(void)
> > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> > index b14c84e..c71d496 100644
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -208,6 +208,54 @@ static void vmx_pi_do_resume(struct vcpu *v)
> >      vmx_pi_list_remove(v);
> >  }
> >
> > +void vmx_pi_desc_fixup(int cpu)
> > +{
> > +    unsigned int new_cpu, dest;
> > +    unsigned long flags;
> > +    struct arch_vmx_struct *vmx, *tmp;
> > +    spinlock_t *new_lock, *old_lock = &per_cpu(vmx_pi_blocking, cpu).lock;
> > +    struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
> > +
> > +    if ( !iommu_intpost )
> > +        return;
> > +
> > +    /*
> > +     * We are in the context of CPU_DEAD or CPU_UP_CANCELED notification,
> > +     * and it is impossible for a second CPU go down in parallel. So we
> > +     * can safely acquire the old cpu's lock and then acquire the new_cpu's
> > +     * lock after that.
> > +     */
> > +    spin_lock_irqsave(old_lock, flags);
> > +
> > +    list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
> > +    {
> > +        /*
> > +         * We need to find an online cpu as the NDST of the PI descriptor, it
> > +         * doesn't matter whether it is within the cpupool of the domain or
> > +         * not. As long as it is online, the vCPU will be woken up once the
> > +         * notification event arrives.
> > +         */
> > +        new_cpu = cpumask_any(&cpu_online_map);
> > +        new_lock = &per_cpu(vmx_pi_blocking, new_cpu).lock;
> > +
> > +        spin_lock(new_lock);
> > +
> > +        ASSERT(vmx->pi_blocking.lock == old_lock);
> > +
> > +        dest = cpu_physical_id(new_cpu);
> > +        write_atomic(&vmx->pi_desc.ndst,
> > +                     x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
> > +
> > +        list_move(&vmx->pi_blocking.list,
> > +                  &per_cpu(vmx_pi_blocking, new_cpu).list);
> > +        vmx->pi_blocking.lock = new_lock;
> > +
> > +        spin_unlock(new_lock);
> 
> I didn't check the whole flow... but did you suppress notification somewhere
> earlier before above list movement happens? Otherwise you may miss an
> interrupt when the target cpu is dying...
> 

Yes, this is really a good point. the wakeup notification event will miss if
it comes in before " write_atomic(&vmx->pi_desc.ndst ...." above. So we
need to suppress it before "spin_lock_irqsave(old_lock, flags);".

Thanks,
Feng

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 1/7] VMX: Statically assign two PI hooks
  2016-10-11  0:57 ` [PATCH v5 1/7] VMX: Statically assign two PI hooks Feng Wu
  2016-10-11  8:11   ` Tian, Kevin
@ 2016-10-12 13:25   ` Jan Beulich
  1 sibling, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2016-10-12 13:25 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
> +/*
> + * In fact, we can remove this hooks inside itself if no new device is in the
> + * process of getting assigned and "from" hook is NULL. However, it is not
> + * straightforward to find a clear solution, so just leave it here.
> + */
>  static void vmx_pi_switch_to(struct vcpu *v)

I think this comment would better go next to where the NULL
assignment is (to be).

> @@ -221,9 +226,8 @@ void vmx_pi_hooks_deassign(struct domain *d)
>      ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
>  
>      d->arch.hvm_domain.vmx.vcpu_block = NULL;
> -    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
> -    d->arch.hvm_domain.vmx.pi_switch_to = NULL;
>      d->arch.hvm_domain.vmx.pi_do_resume = NULL;
> +    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
>  }

Besides the discrepancy of description and implementation which
Kevin has already pointed out, I'd also like to ask to replace the
"statically" in title and description. "Statically" would mean the
hooks get installed unconditionally at domain creation time (or
even at build time). Instead I think you mean "keep two PI hooks
assigned" or "permanently assign two PI hooks".

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 4/7] VMX: Make sure PI is in proper state before install the hooks
  2016-10-11  0:57 ` [PATCH v5 4/7] VMX: Make sure PI is in proper state before install the hooks Feng Wu
@ 2016-10-12 13:45   ` Jan Beulich
  2016-10-17  6:26     ` Wu, Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2016-10-12 13:45 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
>  static void pi_desc_init(struct vcpu *v)
>  {
> -    uint32_t dest;
> -
>      v->arch.hvm_vmx.pi_desc.nv = posted_intr_vector;
>  
> -    dest = cpu_physical_id(v->processor);
> -
> -    if ( x2apic_enabled )
> -        v->arch.hvm_vmx.pi_desc.ndst = dest;
> -    else
> -        v->arch.hvm_vmx.pi_desc.ndst = MASK_INSR(dest, PI_xAPIC_NDST_MASK);
> +    /*
> +     * Mark NDST as invalid, then we can use this invalid value as a
> +     * marker to whether update NDST or not in vmx_pi_hooks_assign().
> +     */
> +    v->arch.hvm_vmx.pi_desc.ndst = 0xffffffff;

I think I had at the same time asked to make this a #define, so the
two (currently) instance can be connected together.

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -211,14 +211,39 @@ static void vmx_pi_do_resume(struct vcpu *v)
>  /* This function is called when pcidevs_lock is held */
>  void vmx_pi_hooks_assign(struct domain *d)
>  {
> +    struct vcpu *v;
> +
>      if ( !iommu_intpost || !has_hvm_container_domain(d) )
>          return;
>  
>      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>  
> -    d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> +    /*
> +     * We carefully handle the timing here:
> +     * - Install the context switch first
> +     * - Then set the NDST field
> +     * - Install the block and resume hooks in the end
> +     *
> +     * This can make sure the PI (especially the NDST feild) is
> +     * in proper state when we call vmx_vcpu_block().
> +     */
>      d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
>      d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
> +
> +    for_each_vcpu ( d, v )
> +    {
> +        unsigned int dest = cpu_physical_id(v->processor);
> +        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> +
> +        /*
> +         * We don't need to update NDST if 'arch.hvm_domain.vmx.pi_switch_to'
> +         * already gets called

I think you mean "got" or "has got". Also I think you'd better refer to
the actual VMX function instead of the hook pointer field.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-11  0:57 ` [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
@ 2016-10-12 13:56   ` Jan Beulich
  2016-10-17  7:02     ` Wu, Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2016-10-12 13:56 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
> --- a/xen/drivers/passthrough/vtd/intremap.c
> +++ b/xen/drivers/passthrough/vtd/intremap.c
> @@ -547,6 +547,49 @@ static int remap_entry_to_msi_msg(
>      return 0;
>  }
>  
> +static bool_t pi_can_suppress_irte_update(struct iremap_entry *new,

bool (and true/false respectively) please.

And then the function name suggests that no modification gets done
here (and hence the first parameter could be const too), yet the
implementation does otherwise (and I don't understand why).

> +    const struct iremap_entry *old)
> +{
> +    bool_t ret = 1;
> +    u16 fpd, sid, sq, svt;
> +
> +    if ( !old->remap.p || !old->remap.im )
> +        return 0;
> +
> +    fpd = new->post.fpd;
> +    sid = new->post.sid;
> +    sq = new->post.sq;
> +    svt = new->post.svt;
> +
> +    *new = *old;
> +
> +    if ( fpd != old->post.fpd )
> +    {
> +        new->post.fpd = fpd;
> +        ret = 0;
> +    }
> +
> +    if ( sid != old->post.sid )
> +    {
> +        new->post.sid = sid;
> +        ret = 0;
> +    }
> +
> +    if ( sq != old->post.sq )
> +    {
> +        new->post.sq = sq;
> +        ret = 0;
> +    }
> +
> +    if ( svt != old->post.svt )
> +    {
> +        new->post.svt = svt;
> +        ret = 0;
> +    }

What's the selection of the fields based on? Namely, what about
vector, pda_l, and pda_h?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 6/7] VT-d: Some cleanups
  2016-10-11  0:57 ` [PATCH v5 6/7] VT-d: Some cleanups Feng Wu
@ 2016-10-12 14:01   ` Jan Beulich
  0 siblings, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2016-10-12 14:01 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
> @@ -638,7 +638,8 @@ static int msi_msg_to_remap_entry(
>      GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
>                       iremap_entries, iremap_entry);
>  
> -    memcpy(&new_ire, iremap_entry, sizeof(struct iremap_entry));
> +    if ( iremap_entry->remap.p )
> +        new_ire.remap.im = iremap_entry->remap.im;

This is certainly more of a change than what title and description
suggest. If you really mean it to be this way, it needs to be
explained in the description.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 4/7] VMX: Make sure PI is in proper state before install the hooks
  2016-10-12 13:45   ` Jan Beulich
@ 2016-10-17  6:26     ` Wu, Feng
  2016-10-24  7:22       ` Jan Beulich
  0 siblings, 1 reply; 35+ messages in thread
From: Wu, Feng @ 2016-10-17  6:26 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Wednesday, October 12, 2016 9:45 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: Re: [PATCH v5 4/7] VMX: Make sure PI is in proper state before install
> the hooks
> 
> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
> >  static void pi_desc_init(struct vcpu *v)
> >  {
> > -    uint32_t dest;
> > -
> >      v->arch.hvm_vmx.pi_desc.nv = posted_intr_vector;
> >
> > -    dest = cpu_physical_id(v->processor);
> > -
> > -    if ( x2apic_enabled )
> > -        v->arch.hvm_vmx.pi_desc.ndst = dest;
> > -    else
> > -        v->arch.hvm_vmx.pi_desc.ndst = MASK_INSR(dest,
> PI_xAPIC_NDST_MASK);
> > +    /*
> > +     * Mark NDST as invalid, then we can use this invalid value as a
> > +     * marker to whether update NDST or not in vmx_pi_hooks_assign().
> > +     */
> > +    v->arch.hvm_vmx.pi_desc.ndst = 0xffffffff;
> 
> I think I had at the same time asked to make this a #define, so the
> two (currently) instance can be connected together.

Sorry, Maybe I didn't get that. Do you mean I need to define a Macro
for 0xffffffff, so we can use it here and in vmx.c?

Thanks,
Feng

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-12 13:56   ` Jan Beulich
@ 2016-10-17  7:02     ` Wu, Feng
  2016-10-24  7:27       ` Jan Beulich
  0 siblings, 1 reply; 35+ messages in thread
From: Wu, Feng @ 2016-10-17  7:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Wednesday, October 12, 2016 9:56 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format
> IRTE
> 
> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
> > --- a/xen/drivers/passthrough/vtd/intremap.c
> > +++ b/xen/drivers/passthrough/vtd/intremap.c
> > @@ -547,6 +547,49 @@ static int remap_entry_to_msi_msg(
> >      return 0;
> >  }
> >
> > +static bool_t pi_can_suppress_irte_update(struct iremap_entry *new,
> 
> bool (and true/false respectively) please.
> 
> And then the function name suggests that no modification gets done
> here (and hence the first parameter could be const too), yet the
> implementation does otherwise (and I don't understand why).

The idea here is that if the old IRTE is in posted format and fields like
'fpd', 'sid', 'sq', or 'svt' is going to be changed , we need to use these
new values for the new_ire, while we still need to use the old values
of other fields in IRTE, so this function returns the new irte in its first
 parameter it we cannot suppress the update. I try to do it in this
function.

> 
> > +    const struct iremap_entry *old)
> > +{
> > +    bool_t ret = 1;
> > +    u16 fpd, sid, sq, svt;
> > +
> > +    if ( !old->remap.p || !old->remap.im )
> > +        return 0;
> > +
> > +    fpd = new->post.fpd;
> > +    sid = new->post.sid;
> > +    sq = new->post.sq;
> > +    svt = new->post.svt;
> > +
> > +    *new = *old;
> > +
> > +    if ( fpd != old->post.fpd )
> > +    {
> > +        new->post.fpd = fpd;
> > +        ret = 0;
> > +    }
> > +
> > +    if ( sid != old->post.sid )
> > +    {
> > +        new->post.sid = sid;
> > +        ret = 0;
> > +    }
> > +
> > +    if ( sq != old->post.sq )
> > +    {
> > +        new->post.sq = sq;
> > +        ret = 0;
> > +    }
> > +
> > +    if ( svt != old->post.svt )
> > +    {
> > +        new->post.svt = svt;
> > +        ret = 0;
> > +    }
> 
> What's the selection of the fields based on? Namely, what about
> vector, pda_l, and pda_h?

These filed are the common field between posted format and remapped format.
'vector' field has different meaning in the two formant, pda_l and pda_h is only
for posted format. As mentioned above, the purpose of this function is to find
whether use need to update this common field in posted format, if it is, we need
to use them and reuse the old value of other fields (pda_l, pda_h, vector, etc.).
since we need to suppress affinity related update for posted format.

Thanks,
Feng

> 
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 4/7] VMX: Make sure PI is in proper state before install the hooks
  2016-10-17  6:26     ` Wu, Feng
@ 2016-10-24  7:22       ` Jan Beulich
  2016-10-24  7:45         ` Wu, Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2016-10-24  7:22 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 17.10.16 at 08:26, <feng.wu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Wednesday, October 12, 2016 9:45 PM
>> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
>> >  static void pi_desc_init(struct vcpu *v)
>> >  {
>> > -    uint32_t dest;
>> > -
>> >      v->arch.hvm_vmx.pi_desc.nv = posted_intr_vector;
>> >
>> > -    dest = cpu_physical_id(v->processor);
>> > -
>> > -    if ( x2apic_enabled )
>> > -        v->arch.hvm_vmx.pi_desc.ndst = dest;
>> > -    else
>> > -        v->arch.hvm_vmx.pi_desc.ndst = MASK_INSR(dest,
>> PI_xAPIC_NDST_MASK);
>> > +    /*
>> > +     * Mark NDST as invalid, then we can use this invalid value as a
>> > +     * marker to whether update NDST or not in vmx_pi_hooks_assign().
>> > +     */
>> > +    v->arch.hvm_vmx.pi_desc.ndst = 0xffffffff;
>> 
>> I think I had at the same time asked to make this a #define, so the
>> two (currently) instance can be connected together.
> 
> Sorry, Maybe I didn't get that. Do you mean I need to define a Macro
> for 0xffffffff, so we can use it here and in vmx.c?

Yes.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-17  7:02     ` Wu, Feng
@ 2016-10-24  7:27       ` Jan Beulich
  2016-10-24  8:57         ` Wu, Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2016-10-24  7:27 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 17.10.16 at 09:02, <feng.wu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Wednesday, October 12, 2016 9:56 PM
>> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
>> > --- a/xen/drivers/passthrough/vtd/intremap.c
>> > +++ b/xen/drivers/passthrough/vtd/intremap.c
>> > @@ -547,6 +547,49 @@ static int remap_entry_to_msi_msg(
>> >      return 0;
>> >  }
>> >
>> > +static bool_t pi_can_suppress_irte_update(struct iremap_entry *new,
>> 
>> bool (and true/false respectively) please.
>> 
>> And then the function name suggests that no modification gets done
>> here (and hence the first parameter could be const too), yet the
>> implementation does otherwise (and I don't understand why).
> 
> The idea here is that if the old IRTE is in posted format and fields like
> 'fpd', 'sid', 'sq', or 'svt' is going to be changed , we need to use these
> new values for the new_ire, while we still need to use the old values
> of other fields in IRTE, so this function returns the new irte in its first
>  parameter it we cannot suppress the update. I try to do it in this
> function.

I don't understand: The caller fully constructs the new entry. Why
would you want to do further modifications here? I continue to
think that this function should solely check whether the changes
between old and new entry are such that the actual update (and
hence the flush) can be bypassed.

>> > +    const struct iremap_entry *old)
>> > +{
>> > +    bool_t ret = 1;
>> > +    u16 fpd, sid, sq, svt;
>> > +
>> > +    if ( !old->remap.p || !old->remap.im )
>> > +        return 0;
>> > +
>> > +    fpd = new->post.fpd;
>> > +    sid = new->post.sid;
>> > +    sq = new->post.sq;
>> > +    svt = new->post.svt;
>> > +
>> > +    *new = *old;
>> > +
>> > +    if ( fpd != old->post.fpd )
>> > +    {
>> > +        new->post.fpd = fpd;
>> > +        ret = 0;
>> > +    }
>> > +
>> > +    if ( sid != old->post.sid )
>> > +    {
>> > +        new->post.sid = sid;
>> > +        ret = 0;
>> > +    }
>> > +
>> > +    if ( sq != old->post.sq )
>> > +    {
>> > +        new->post.sq = sq;
>> > +        ret = 0;
>> > +    }
>> > +
>> > +    if ( svt != old->post.svt )
>> > +    {
>> > +        new->post.svt = svt;
>> > +        ret = 0;
>> > +    }
>> 
>> What's the selection of the fields based on? Namely, what about
>> vector, pda_l, and pda_h?
> 
> These filed are the common field between posted format and remapped format.
> 'vector' field has different meaning in the two formant, pda_l and pda_h is only
> for posted format. As mentioned above, the purpose of this function is to find
> whether use need to update this common field in posted format, if it is, we need
> to use them and reuse the old value of other fields (pda_l, pda_h, vector, etc.).
> since we need to suppress affinity related update for posted format.

If that was the case, then the first thing you'd need to check would
be whether the format actually changes. If it doesn't, all fields need
to be compared, while if it does change, the write (and flush) clearly
can't be suppressed.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 4/7] VMX: Make sure PI is in proper state before install the hooks
  2016-10-24  7:22       ` Jan Beulich
@ 2016-10-24  7:45         ` Wu, Feng
  2016-10-24  7:59           ` Jan Beulich
  0 siblings, 1 reply; 35+ messages in thread
From: Wu, Feng @ 2016-10-24  7:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, October 24, 2016 3:23 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v5 4/7] VMX: Make sure PI is in proper state before install
> the hooks
> 
> >>> On 17.10.16 at 08:26, <feng.wu@intel.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Wednesday, October 12, 2016 9:45 PM
> >> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
> >> >  static void pi_desc_init(struct vcpu *v)
> >> >  {
> >> > -    uint32_t dest;
> >> > -
> >> >      v->arch.hvm_vmx.pi_desc.nv = posted_intr_vector;
> >> >
> >> > -    dest = cpu_physical_id(v->processor);
> >> > -
> >> > -    if ( x2apic_enabled )
> >> > -        v->arch.hvm_vmx.pi_desc.ndst = dest;
> >> > -    else
> >> > -        v->arch.hvm_vmx.pi_desc.ndst = MASK_INSR(dest,
> >> PI_xAPIC_NDST_MASK);
> >> > +    /*
> >> > +     * Mark NDST as invalid, then we can use this invalid value as a
> >> > +     * marker to whether update NDST or not in vmx_pi_hooks_assign().
> >> > +     */
> >> > +    v->arch.hvm_vmx.pi_desc.ndst = 0xffffffff;
> >>
> >> I think I had at the same time asked to make this a #define, so the
> >> two (currently) instance can be connected together.
> >
> > Sorry, Maybe I didn't get that. Do you mean I need to define a Macro
> > for 0xffffffff, so we can use it here and in vmx.c?
> 
> Yes.

Thanks for confirm. Which one do you like? Put the macro in vmx.h
or asm/apic.h?

Thanks,
Feng

> 
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 4/7] VMX: Make sure PI is in proper state before install the hooks
  2016-10-24  7:45         ` Wu, Feng
@ 2016-10-24  7:59           ` Jan Beulich
  0 siblings, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2016-10-24  7:59 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 24.10.16 at 09:45, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Monday, October 24, 2016 3:23 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org 
>> Subject: RE: [PATCH v5 4/7] VMX: Make sure PI is in proper state before 
> install
>> the hooks
>> 
>> >>> On 17.10.16 at 08:26, <feng.wu@intel.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Wednesday, October 12, 2016 9:45 PM
>> >> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
>> >> >  static void pi_desc_init(struct vcpu *v)
>> >> >  {
>> >> > -    uint32_t dest;
>> >> > -
>> >> >      v->arch.hvm_vmx.pi_desc.nv = posted_intr_vector;
>> >> >
>> >> > -    dest = cpu_physical_id(v->processor);
>> >> > -
>> >> > -    if ( x2apic_enabled )
>> >> > -        v->arch.hvm_vmx.pi_desc.ndst = dest;
>> >> > -    else
>> >> > -        v->arch.hvm_vmx.pi_desc.ndst = MASK_INSR(dest,
>> >> PI_xAPIC_NDST_MASK);
>> >> > +    /*
>> >> > +     * Mark NDST as invalid, then we can use this invalid value as a
>> >> > +     * marker to whether update NDST or not in vmx_pi_hooks_assign().
>> >> > +     */
>> >> > +    v->arch.hvm_vmx.pi_desc.ndst = 0xffffffff;
>> >>
>> >> I think I had at the same time asked to make this a #define, so the
>> >> two (currently) instance can be connected together.
>> >
>> > Sorry, Maybe I didn't get that. Do you mean I need to define a Macro
>> > for 0xffffffff, so we can use it here and in vmx.c?
>> 
>> Yes.
> 
> Thanks for confirm. Which one do you like? Put the macro in vmx.h
> or asm/apic.h?

This being (for now at least) VMX-specific, I think the former is to be
preferred.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-24  7:27       ` Jan Beulich
@ 2016-10-24  8:57         ` Wu, Feng
  2016-10-24  9:53           ` Jan Beulich
  0 siblings, 1 reply; 35+ messages in thread
From: Wu, Feng @ 2016-10-24  8:57 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, October 24, 2016 3:28 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format
> IRTE
> 
> >>> On 17.10.16 at 09:02, <feng.wu@intel.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Wednesday, October 12, 2016 9:56 PM
> >> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
> >> > --- a/xen/drivers/passthrough/vtd/intremap.c
> >> > +++ b/xen/drivers/passthrough/vtd/intremap.c
> >> > @@ -547,6 +547,49 @@ static int remap_entry_to_msi_msg(
> >> >      return 0;
> >> >  }
> >> >
> >> > +static bool_t pi_can_suppress_irte_update(struct iremap_entry *new,
> >>
> >> bool (and true/false respectively) please.
> >>
> >> And then the function name suggests that no modification gets done
> >> here (and hence the first parameter could be const too), yet the
> >> implementation does otherwise (and I don't understand why).
> >
> > The idea here is that if the old IRTE is in posted format and fields like
> > 'fpd', 'sid', 'sq', or 'svt' is going to be changed , we need to use these
> > new values for the new_ire, while we still need to use the old values
> > of other fields in IRTE, so this function returns the new irte in its first
> >  parameter it we cannot suppress the update. I try to do it in this
> > function.
> 
> I don't understand: The caller fully constructs the new entry. Why
> would you want to do further modifications here? I continue to
> think that this function should solely check whether the changes
> between old and new entry are such that the actual update (and
> hence the flush) can be bypassed.
> 
> >> > +    const struct iremap_entry *old)
> >> > +{
> >> > +    bool_t ret = 1;
> >> > +    u16 fpd, sid, sq, svt;
> >> > +
> >> > +    if ( !old->remap.p || !old->remap.im )
> >> > +        return 0;
> >> > +
> >> > +    fpd = new->post.fpd;
> >> > +    sid = new->post.sid;
> >> > +    sq = new->post.sq;
> >> > +    svt = new->post.svt;
> >> > +
> >> > +    *new = *old;
> >> > +
> >> > +    if ( fpd != old->post.fpd )
> >> > +    {
> >> > +        new->post.fpd = fpd;
> >> > +        ret = 0;
> >> > +    }
> >> > +
> >> > +    if ( sid != old->post.sid )
> >> > +    {
> >> > +        new->post.sid = sid;
> >> > +        ret = 0;
> >> > +    }
> >> > +
> >> > +    if ( sq != old->post.sq )
> >> > +    {
> >> > +        new->post.sq = sq;
> >> > +        ret = 0;
> >> > +    }
> >> > +
> >> > +    if ( svt != old->post.svt )
> >> > +    {
> >> > +        new->post.svt = svt;
> >> > +        ret = 0;
> >> > +    }
> >>
> >> What's the selection of the fields based on? Namely, what about
> >> vector, pda_l, and pda_h?
> >
> > These filed are the common field between posted format and remapped
> format.
> > 'vector' field has different meaning in the two formant, pda_l and pda_h is
> only
> > for posted format. As mentioned above, the purpose of this function is to find
> > whether use need to update this common field in posted format, if it is, we
> need
> > to use them and reuse the old value of other fields (pda_l, pda_h, vector, etc.).
> > since we need to suppress affinity related update for posted format.
> 
> If that was the case, then the first thing you'd need to check would
> be whether the format actually changes. If it doesn't, all fields need
> to be compared, while if it does change, the write (and flush) clearly
> can't be suppressed.
> 

Let me elaborate a bit more on the function to make things clear:
1. If the old IRTE is present, or it is in remapped mode, we cannot suppress
the update, such as we may create a new IRTE and put it in remapped mode,
or update the remapped mode to posted mode.
2. If the condition in step 1 is false, that means the old IRTE is present and
in posted mode, so we need to suppress the affinity related updates, and
only update the fields:  'fpd', 'sid', 'sq', or 'svt'. (Here maybe we need to check
whether if the new IRTE is in posted mode, if it is we need to update all
the field, but currently if we update posted mode -> posted mode, we don't
go to this function, it is done in pi_update_irte(), so maybe we can add a
 WARN_ON() for that case?)

Does the above sound clear to you, if it doesn't, don't hesitate to point it
out, thanks a lot!

Thanks,
Feng


> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-24  8:57         ` Wu, Feng
@ 2016-10-24  9:53           ` Jan Beulich
  2016-10-24 10:18             ` Wu, Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2016-10-24  9:53 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 24.10.16 at 10:57, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Monday, October 24, 2016 3:28 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org 
>> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted 
> format
>> IRTE
>> 
>> >>> On 17.10.16 at 09:02, <feng.wu@intel.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Wednesday, October 12, 2016 9:56 PM
>> >> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
>> >> > --- a/xen/drivers/passthrough/vtd/intremap.c
>> >> > +++ b/xen/drivers/passthrough/vtd/intremap.c
>> >> > @@ -547,6 +547,49 @@ static int remap_entry_to_msi_msg(
>> >> >      return 0;
>> >> >  }
>> >> >
>> >> > +static bool_t pi_can_suppress_irte_update(struct iremap_entry *new,
>> >>
>> >> bool (and true/false respectively) please.
>> >>
>> >> And then the function name suggests that no modification gets done
>> >> here (and hence the first parameter could be const too), yet the
>> >> implementation does otherwise (and I don't understand why).
>> >
>> > The idea here is that if the old IRTE is in posted format and fields like
>> > 'fpd', 'sid', 'sq', or 'svt' is going to be changed , we need to use these
>> > new values for the new_ire, while we still need to use the old values
>> > of other fields in IRTE, so this function returns the new irte in its first
>> >  parameter it we cannot suppress the update. I try to do it in this
>> > function.
>> 
>> I don't understand: The caller fully constructs the new entry. Why
>> would you want to do further modifications here? I continue to
>> think that this function should solely check whether the changes
>> between old and new entry are such that the actual update (and
>> hence the flush) can be bypassed.
>> 
>> >> > +    const struct iremap_entry *old)
>> >> > +{
>> >> > +    bool_t ret = 1;
>> >> > +    u16 fpd, sid, sq, svt;
>> >> > +
>> >> > +    if ( !old->remap.p || !old->remap.im )
>> >> > +        return 0;
>> >> > +
>> >> > +    fpd = new->post.fpd;
>> >> > +    sid = new->post.sid;
>> >> > +    sq = new->post.sq;
>> >> > +    svt = new->post.svt;
>> >> > +
>> >> > +    *new = *old;
>> >> > +
>> >> > +    if ( fpd != old->post.fpd )
>> >> > +    {
>> >> > +        new->post.fpd = fpd;
>> >> > +        ret = 0;
>> >> > +    }
>> >> > +
>> >> > +    if ( sid != old->post.sid )
>> >> > +    {
>> >> > +        new->post.sid = sid;
>> >> > +        ret = 0;
>> >> > +    }
>> >> > +
>> >> > +    if ( sq != old->post.sq )
>> >> > +    {
>> >> > +        new->post.sq = sq;
>> >> > +        ret = 0;
>> >> > +    }
>> >> > +
>> >> > +    if ( svt != old->post.svt )
>> >> > +    {
>> >> > +        new->post.svt = svt;
>> >> > +        ret = 0;
>> >> > +    }
>> >>
>> >> What's the selection of the fields based on? Namely, what about
>> >> vector, pda_l, and pda_h?
>> >
>> > These filed are the common field between posted format and remapped
>> format.
>> > 'vector' field has different meaning in the two formant, pda_l and pda_h is
>> only
>> > for posted format. As mentioned above, the purpose of this function is to find
>> > whether use need to update this common field in posted format, if it is, we
>> need
>> > to use them and reuse the old value of other fields (pda_l, pda_h, vector, etc.).
>> > since we need to suppress affinity related update for posted format.
>> 
>> If that was the case, then the first thing you'd need to check would
>> be whether the format actually changes. If it doesn't, all fields need
>> to be compared, while if it does change, the write (and flush) clearly
>> can't be suppressed.
>> 
> 
> Let me elaborate a bit more on the function to make things clear:
> 1. If the old IRTE is present, or it is in remapped mode, we cannot suppress
> the update, such as we may create a new IRTE and put it in remapped mode,
> or update the remapped mode to posted mode.
> 2. If the condition in step 1 is false, that means the old IRTE is present and
> in posted mode, so we need to suppress the affinity related updates,

But only if the new entry is in posted mode too - see my earlier reply.

> and
> only update the fields:  'fpd', 'sid', 'sq', or 'svt'. (Here maybe we need to check
> whether if the new IRTE is in posted mode, if it is we need to update all
> the field, but currently if we update posted mode -> posted mode, we don't
> go to this function, it is done in pi_update_irte(),

Which looks like a code flow problem anyway - there shouldn't be
direct calls from vendor independent code to vendor dependent
functions. And then I can't see how the call to pi_update_irte()
prevents execution flow reaching msi_msg_to_remap_entry(); at
best the function would just re-write the same entry unchanged.

In any event - msi_msg_to_remap_entry() should be correct for
all current and future callers, and hence I continue to think you
want to adjust the code as suggested.

> so maybe we can add a WARN_ON() for that case?)

We need to be very careful with such WARN_ON()s - they must
not be guest triggerable (I think this one wouldn't be) and they
should not be raised more than once until a "good" update
happened again (to avoid spamming the log).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-24  9:53           ` Jan Beulich
@ 2016-10-24 10:18             ` Wu, Feng
  2016-10-24 10:57               ` Jan Beulich
  0 siblings, 1 reply; 35+ messages in thread
From: Wu, Feng @ 2016-10-24 10:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, October 24, 2016 5:54 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format
> IRTE
> 
> >>> On 24.10.16 at 10:57, <feng.wu@intel.com> wrote:
> 
> >
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Monday, October 24, 2016 3:28 PM
> >> To: Wu, Feng <feng.wu@intel.com>
> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> >> devel@lists.xen.org
> >> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted
> > format
> >> IRTE
> >>
> >> >>> On 17.10.16 at 09:02, <feng.wu@intel.com> wrote:
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: Wednesday, October 12, 2016 9:56 PM
> >> >> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
> >> >> > --- a/xen/drivers/passthrough/vtd/intremap.c
> >> >> > +++ b/xen/drivers/passthrough/vtd/intremap.c
> >> >> > @@ -547,6 +547,49 @@ static int remap_entry_to_msi_msg(
> >> >> >      return 0;
> >> >> >  }
> >> >> >
> >> >> > +static bool_t pi_can_suppress_irte_update(struct iremap_entry *new,
> >> >>
> >> >> bool (and true/false respectively) please.
> >> >>
> >> >> And then the function name suggests that no modification gets done
> >> >> here (and hence the first parameter could be const too), yet the
> >> >> implementation does otherwise (and I don't understand why).
> >> >
> >> > The idea here is that if the old IRTE is in posted format and fields like
> >> > 'fpd', 'sid', 'sq', or 'svt' is going to be changed , we need to use these
> >> > new values for the new_ire, while we still need to use the old values
> >> > of other fields in IRTE, so this function returns the new irte in its first
> >> >  parameter it we cannot suppress the update. I try to do it in this
> >> > function.
> >>
> >> I don't understand: The caller fully constructs the new entry. Why
> >> would you want to do further modifications here? I continue to
> >> think that this function should solely check whether the changes
> >> between old and new entry are such that the actual update (and
> >> hence the flush) can be bypassed.
> >>
> >> >> > +    const struct iremap_entry *old)
> >> >> > +{
> >> >> > +    bool_t ret = 1;
> >> >> > +    u16 fpd, sid, sq, svt;
> >> >> > +
> >> >> > +    if ( !old->remap.p || !old->remap.im )
> >> >> > +        return 0;
> >> >> > +
> >> >> > +    fpd = new->post.fpd;
> >> >> > +    sid = new->post.sid;
> >> >> > +    sq = new->post.sq;
> >> >> > +    svt = new->post.svt;
> >> >> > +
> >> >> > +    *new = *old;
> >> >> > +
> >> >> > +    if ( fpd != old->post.fpd )
> >> >> > +    {
> >> >> > +        new->post.fpd = fpd;
> >> >> > +        ret = 0;
> >> >> > +    }
> >> >> > +
> >> >> > +    if ( sid != old->post.sid )
> >> >> > +    {
> >> >> > +        new->post.sid = sid;
> >> >> > +        ret = 0;
> >> >> > +    }
> >> >> > +
> >> >> > +    if ( sq != old->post.sq )
> >> >> > +    {
> >> >> > +        new->post.sq = sq;
> >> >> > +        ret = 0;
> >> >> > +    }
> >> >> > +
> >> >> > +    if ( svt != old->post.svt )
> >> >> > +    {
> >> >> > +        new->post.svt = svt;
> >> >> > +        ret = 0;
> >> >> > +    }
> >> >>
> >> >> What's the selection of the fields based on? Namely, what about
> >> >> vector, pda_l, and pda_h?
> >> >
> >> > These filed are the common field between posted format and remapped
> >> format.
> >> > 'vector' field has different meaning in the two formant, pda_l and pda_h is
> >> only
> >> > for posted format. As mentioned above, the purpose of this function is to
> find
> >> > whether use need to update this common field in posted format, if it is, we
> >> need
> >> > to use them and reuse the old value of other fields (pda_l, pda_h, vector,
> etc.).
> >> > since we need to suppress affinity related update for posted format.
> >>
> >> If that was the case, then the first thing you'd need to check would
> >> be whether the format actually changes. If it doesn't, all fields need
> >> to be compared, while if it does change, the write (and flush) clearly
> >> can't be suppressed.
> >>
> >
> > Let me elaborate a bit more on the function to make things clear:
> > 1. If the old IRTE is present, or it is in remapped mode, we cannot suppress
> > the update, such as we may create a new IRTE and put it in remapped mode,
> > or update the remapped mode to posted mode.
> > 2. If the condition in step 1 is false, that means the old IRTE is present and
> > in posted mode, so we need to suppress the affinity related updates,
> 
> But only if the new entry is in posted mode too - see my earlier reply.
> 
> > and
> > only update the fields:  'fpd', 'sid', 'sq', or 'svt'. (Here maybe we need to check
> > whether if the new IRTE is in posted mode, if it is we need to update all
> > the field, but currently if we update posted mode -> posted mode, we don't
> > go to this function, it is done in pi_update_irte(),
> 
> Which looks like a code flow problem anyway - there shouldn't be
> direct calls from vendor independent code to vendor dependent
> functions. And then I can't see how the call to pi_update_irte()
> prevents execution flow reaching msi_msg_to_remap_entry(); at
> best the function would just re-write the same entry unchanged.
> 
> In any event - msi_msg_to_remap_entry() should be correct for
> all current and future callers, and hence I continue to think you
> want to adjust the code as suggested.
> 
> > so maybe we can add a WARN_ON() for that case?)
> 
> We need to be very careful with such WARN_ON()s - they must
> not be guest triggerable (I think this one wouldn't be) and they
> should not be raised more than once until a "good" update
> happened again (to avoid spamming the log).

So based on your comments about, I summarize the handling flow:
1. The same as above
2. If the condition in step 1 is false, that means the old IRTE is present and
in posted mode. If the new IRTE is in posted mode, we just update it, but
if it is in remapped mode, we need to suppress the affinity related updates,
and only update the fields:  'fpd', 'sid', 'sq', and 'svt'.

Does this looks okay to you?

Thanks,
Feng

> 
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-24 10:18             ` Wu, Feng
@ 2016-10-24 10:57               ` Jan Beulich
  2016-10-24 11:10                 ` Wu, Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2016-10-24 10:57 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 24.10.16 at 12:18, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Monday, October 24, 2016 5:54 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org 
>> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted 
> format
>> IRTE
>> 
>> >>> On 24.10.16 at 10:57, <feng.wu@intel.com> wrote:
>> 
>> >
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Monday, October 24, 2016 3:28 PM
>> >> To: Wu, Feng <feng.wu@intel.com>
>> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> >> devel@lists.xen.org 
>> >> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted
>> > format
>> >> IRTE
>> >>
>> >> >>> On 17.10.16 at 09:02, <feng.wu@intel.com> wrote:
>> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> Sent: Wednesday, October 12, 2016 9:56 PM
>> >> >> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
>> >> >> > --- a/xen/drivers/passthrough/vtd/intremap.c
>> >> >> > +++ b/xen/drivers/passthrough/vtd/intremap.c
>> >> >> > @@ -547,6 +547,49 @@ static int remap_entry_to_msi_msg(
>> >> >> >      return 0;
>> >> >> >  }
>> >> >> >
>> >> >> > +static bool_t pi_can_suppress_irte_update(struct iremap_entry *new,
>> >> >>
>> >> >> bool (and true/false respectively) please.
>> >> >>
>> >> >> And then the function name suggests that no modification gets done
>> >> >> here (and hence the first parameter could be const too), yet the
>> >> >> implementation does otherwise (and I don't understand why).
>> >> >
>> >> > The idea here is that if the old IRTE is in posted format and fields like
>> >> > 'fpd', 'sid', 'sq', or 'svt' is going to be changed , we need to use these
>> >> > new values for the new_ire, while we still need to use the old values
>> >> > of other fields in IRTE, so this function returns the new irte in its 
> first
>> >> >  parameter it we cannot suppress the update. I try to do it in this
>> >> > function.
>> >>
>> >> I don't understand: The caller fully constructs the new entry. Why
>> >> would you want to do further modifications here? I continue to
>> >> think that this function should solely check whether the changes
>> >> between old and new entry are such that the actual update (and
>> >> hence the flush) can be bypassed.
>> >>
>> >> >> > +    const struct iremap_entry *old)
>> >> >> > +{
>> >> >> > +    bool_t ret = 1;
>> >> >> > +    u16 fpd, sid, sq, svt;
>> >> >> > +
>> >> >> > +    if ( !old->remap.p || !old->remap.im )
>> >> >> > +        return 0;
>> >> >> > +
>> >> >> > +    fpd = new->post.fpd;
>> >> >> > +    sid = new->post.sid;
>> >> >> > +    sq = new->post.sq;
>> >> >> > +    svt = new->post.svt;
>> >> >> > +
>> >> >> > +    *new = *old;
>> >> >> > +
>> >> >> > +    if ( fpd != old->post.fpd )
>> >> >> > +    {
>> >> >> > +        new->post.fpd = fpd;
>> >> >> > +        ret = 0;
>> >> >> > +    }
>> >> >> > +
>> >> >> > +    if ( sid != old->post.sid )
>> >> >> > +    {
>> >> >> > +        new->post.sid = sid;
>> >> >> > +        ret = 0;
>> >> >> > +    }
>> >> >> > +
>> >> >> > +    if ( sq != old->post.sq )
>> >> >> > +    {
>> >> >> > +        new->post.sq = sq;
>> >> >> > +        ret = 0;
>> >> >> > +    }
>> >> >> > +
>> >> >> > +    if ( svt != old->post.svt )
>> >> >> > +    {
>> >> >> > +        new->post.svt = svt;
>> >> >> > +        ret = 0;
>> >> >> > +    }
>> >> >>
>> >> >> What's the selection of the fields based on? Namely, what about
>> >> >> vector, pda_l, and pda_h?
>> >> >
>> >> > These filed are the common field between posted format and remapped
>> >> format.
>> >> > 'vector' field has different meaning in the two formant, pda_l and pda_h 
> is
>> >> only
>> >> > for posted format. As mentioned above, the purpose of this function is to
>> find
>> >> > whether use need to update this common field in posted format, if it is, 
> we
>> >> need
>> >> > to use them and reuse the old value of other fields (pda_l, pda_h, vector,
>> etc.).
>> >> > since we need to suppress affinity related update for posted format.
>> >>
>> >> If that was the case, then the first thing you'd need to check would
>> >> be whether the format actually changes. If it doesn't, all fields need
>> >> to be compared, while if it does change, the write (and flush) clearly
>> >> can't be suppressed.
>> >>
>> >
>> > Let me elaborate a bit more on the function to make things clear:
>> > 1. If the old IRTE is present, or it is in remapped mode, we cannot 
> suppress
>> > the update, such as we may create a new IRTE and put it in remapped mode,
>> > or update the remapped mode to posted mode.
>> > 2. If the condition in step 1 is false, that means the old IRTE is present 
> and
>> > in posted mode, so we need to suppress the affinity related updates,
>> 
>> But only if the new entry is in posted mode too - see my earlier reply.
>> 
>> > and
>> > only update the fields:  'fpd', 'sid', 'sq', or 'svt'. (Here maybe we need 
> to check
>> > whether if the new IRTE is in posted mode, if it is we need to update all
>> > the field, but currently if we update posted mode -> posted mode, we don't
>> > go to this function, it is done in pi_update_irte(),
>> 
>> Which looks like a code flow problem anyway - there shouldn't be
>> direct calls from vendor independent code to vendor dependent
>> functions. And then I can't see how the call to pi_update_irte()
>> prevents execution flow reaching msi_msg_to_remap_entry(); at
>> best the function would just re-write the same entry unchanged.
>> 
>> In any event - msi_msg_to_remap_entry() should be correct for
>> all current and future callers, and hence I continue to think you
>> want to adjust the code as suggested.
>> 
>> > so maybe we can add a WARN_ON() for that case?)
>> 
>> We need to be very careful with such WARN_ON()s - they must
>> not be guest triggerable (I think this one wouldn't be) and they
>> should not be raised more than once until a "good" update
>> happened again (to avoid spamming the log).
> 
> So based on your comments about, I summarize the handling flow:
> 1. The same as above
> 2. If the condition in step 1 is false, that means the old IRTE is present and
> in posted mode. If the new IRTE is in posted mode, we just update it, but
> if it is in remapped mode, we need to suppress the affinity related updates,
> and only update the fields:  'fpd', 'sid', 'sq', and 'svt'.
> 
> Does this looks okay to you?

No. Just to repeat: "In any event - msi_msg_to_remap_entry()
should be correct for all current and future callers, and hence I
continue to think you want to adjust the code as suggested." IOW
the checking function should really just be checking things, and it
should do so (correctly) for all possible inputs. Its return value
ought to indicate whether the update can be suppressed.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-24 10:57               ` Jan Beulich
@ 2016-10-24 11:10                 ` Wu, Feng
  2016-10-24 11:31                   ` Jan Beulich
  0 siblings, 1 reply; 35+ messages in thread
From: Wu, Feng @ 2016-10-24 11:10 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, October 24, 2016 6:57 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format
> IRTE
> 
> >>> On 24.10.16 at 12:18, <feng.wu@intel.com> wrote:
> 
> >
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Monday, October 24, 2016 5:54 PM
> >> To: Wu, Feng <feng.wu@intel.com>
> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> >> devel@lists.xen.org
> >> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted
> > format
> >> IRTE
> >>
> >> >>> On 24.10.16 at 10:57, <feng.wu@intel.com> wrote:
> >>
> >> >
> >> >> -----Original Message-----
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: Monday, October 24, 2016 3:28 PM
> >> >> To: Wu, Feng <feng.wu@intel.com>
> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> >> >> devel@lists.xen.org
> >> >> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted
> >> > format
> >> >> IRTE
> >> >>
> >> >> >>> On 17.10.16 at 09:02, <feng.wu@intel.com> wrote:
> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> >> Sent: Wednesday, October 12, 2016 9:56 PM
> >> >> >> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
> >> >> >> > --- a/xen/drivers/passthrough/vtd/intremap.c
> >> >> >> > +++ b/xen/drivers/passthrough/vtd/intremap.c
> >> >> >> > @@ -547,6 +547,49 @@ static int remap_entry_to_msi_msg(
> >> >> >> >      return 0;
> >> >> >> >  }
> >> >> >> >
> >> >> >> > +static bool_t pi_can_suppress_irte_update(struct iremap_entry
> *new,
> >> >> >>
> >> >> >> bool (and true/false respectively) please.
> >> >> >>
> >> >> >> And then the function name suggests that no modification gets done
> >> >> >> here (and hence the first parameter could be const too), yet the
> >> >> >> implementation does otherwise (and I don't understand why).
> >> >> >
> >> >> > The idea here is that if the old IRTE is in posted format and fields like
> >> >> > 'fpd', 'sid', 'sq', or 'svt' is going to be changed , we need to use these
> >> >> > new values for the new_ire, while we still need to use the old values
> >> >> > of other fields in IRTE, so this function returns the new irte in its
> > first
> >> >> >  parameter it we cannot suppress the update. I try to do it in this
> >> >> > function.
> >> >>
> >> >> I don't understand: The caller fully constructs the new entry. Why
> >> >> would you want to do further modifications here? I continue to
> >> >> think that this function should solely check whether the changes
> >> >> between old and new entry are such that the actual update (and
> >> >> hence the flush) can be bypassed.
> >> >>
> >> >> >> > +    const struct iremap_entry *old)
> >> >> >> > +{
> >> >> >> > +    bool_t ret = 1;
> >> >> >> > +    u16 fpd, sid, sq, svt;
> >> >> >> > +
> >> >> >> > +    if ( !old->remap.p || !old->remap.im )
> >> >> >> > +        return 0;
> >> >> >> > +
> >> >> >> > +    fpd = new->post.fpd;
> >> >> >> > +    sid = new->post.sid;
> >> >> >> > +    sq = new->post.sq;
> >> >> >> > +    svt = new->post.svt;
> >> >> >> > +
> >> >> >> > +    *new = *old;
> >> >> >> > +
> >> >> >> > +    if ( fpd != old->post.fpd )
> >> >> >> > +    {
> >> >> >> > +        new->post.fpd = fpd;
> >> >> >> > +        ret = 0;
> >> >> >> > +    }
> >> >> >> > +
> >> >> >> > +    if ( sid != old->post.sid )
> >> >> >> > +    {
> >> >> >> > +        new->post.sid = sid;
> >> >> >> > +        ret = 0;
> >> >> >> > +    }
> >> >> >> > +
> >> >> >> > +    if ( sq != old->post.sq )
> >> >> >> > +    {
> >> >> >> > +        new->post.sq = sq;
> >> >> >> > +        ret = 0;
> >> >> >> > +    }
> >> >> >> > +
> >> >> >> > +    if ( svt != old->post.svt )
> >> >> >> > +    {
> >> >> >> > +        new->post.svt = svt;
> >> >> >> > +        ret = 0;
> >> >> >> > +    }
> >> >> >>
> >> >> >> What's the selection of the fields based on? Namely, what about
> >> >> >> vector, pda_l, and pda_h?
> >> >> >
> >> >> > These filed are the common field between posted format and remapped
> >> >> format.
> >> >> > 'vector' field has different meaning in the two formant, pda_l and pda_h
> > is
> >> >> only
> >> >> > for posted format. As mentioned above, the purpose of this function is
> to
> >> find
> >> >> > whether use need to update this common field in posted format, if it is,
> > we
> >> >> need
> >> >> > to use them and reuse the old value of other fields (pda_l, pda_h, vector,
> >> etc.).
> >> >> > since we need to suppress affinity related update for posted format.
> >> >>
> >> >> If that was the case, then the first thing you'd need to check would
> >> >> be whether the format actually changes. If it doesn't, all fields need
> >> >> to be compared, while if it does change, the write (and flush) clearly
> >> >> can't be suppressed.
> >> >>
> >> >
> >> > Let me elaborate a bit more on the function to make things clear:
> >> > 1. If the old IRTE is present, or it is in remapped mode, we cannot
> > suppress
> >> > the update, such as we may create a new IRTE and put it in remapped mode,
> >> > or update the remapped mode to posted mode.
> >> > 2. If the condition in step 1 is false, that means the old IRTE is present
> > and
> >> > in posted mode, so we need to suppress the affinity related updates,
> >>
> >> But only if the new entry is in posted mode too - see my earlier reply.
> >>
> >> > and
> >> > only update the fields:  'fpd', 'sid', 'sq', or 'svt'. (Here maybe we need
> > to check
> >> > whether if the new IRTE is in posted mode, if it is we need to update all
> >> > the field, but currently if we update posted mode -> posted mode, we don't
> >> > go to this function, it is done in pi_update_irte(),
> >>
> >> Which looks like a code flow problem anyway - there shouldn't be
> >> direct calls from vendor independent code to vendor dependent
> >> functions. And then I can't see how the call to pi_update_irte()
> >> prevents execution flow reaching msi_msg_to_remap_entry(); at
> >> best the function would just re-write the same entry unchanged.
> >>
> >> In any event - msi_msg_to_remap_entry() should be correct for
> >> all current and future callers, and hence I continue to think you
> >> want to adjust the code as suggested.
> >>
> >> > so maybe we can add a WARN_ON() for that case?)
> >>
> >> We need to be very careful with such WARN_ON()s - they must
> >> not be guest triggerable (I think this one wouldn't be) and they
> >> should not be raised more than once until a "good" update
> >> happened again (to avoid spamming the log).
> >
> > So based on your comments about, I summarize the handling flow:
> > 1. The same as above
> > 2. If the condition in step 1 is false, that means the old IRTE is present and
> > in posted mode. If the new IRTE is in posted mode, we just update it, but
> > if it is in remapped mode, we need to suppress the affinity related updates,
> > and only update the fields:  'fpd', 'sid', 'sq', and 'svt'.
> >
> > Does this looks okay to you?
> 
> No. Just to repeat: "In any event - msi_msg_to_remap_entry()
> should be correct for all current and future callers, and hence I
> continue to think you want to adjust the code as suggested." IOW
> the checking function should really just be checking things, and it
> should do so (correctly) for all possible inputs. Its return value
> ought to indicate whether the update can be suppressed.
> 

Okay, I can make a checking only function. But I would like to listen
to your advice about how to handle the case: " if it is in remapped
mode, we need to suppress the affinity related updates, and only
update the fields:  'fpd', 'sid', 'sq', and 'svt'". Is this okay?

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-24 11:10                 ` Wu, Feng
@ 2016-10-24 11:31                   ` Jan Beulich
  2016-10-25  1:04                     ` Wu, Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2016-10-24 11:31 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 24.10.16 at 13:10, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Monday, October 24, 2016 6:57 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org 
>> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted 
> format
>> IRTE
>> 
>> >>> On 24.10.16 at 12:18, <feng.wu@intel.com> wrote:
>> 
>> >
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Monday, October 24, 2016 5:54 PM
>> >> To: Wu, Feng <feng.wu@intel.com>
>> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> >> devel@lists.xen.org 
>> >> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted
>> > format
>> >> IRTE
>> >>
>> >> >>> On 24.10.16 at 10:57, <feng.wu@intel.com> wrote:
>> >>
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> Sent: Monday, October 24, 2016 3:28 PM
>> >> >> To: Wu, Feng <feng.wu@intel.com>
>> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> >> >> devel@lists.xen.org 
>> >> >> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted
>> >> > format
>> >> >> IRTE
>> >> >>
>> >> >> >>> On 17.10.16 at 09:02, <feng.wu@intel.com> wrote:
>> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> >> Sent: Wednesday, October 12, 2016 9:56 PM
>> >> >> >> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
>> >> >> >> > --- a/xen/drivers/passthrough/vtd/intremap.c
>> >> >> >> > +++ b/xen/drivers/passthrough/vtd/intremap.c
>> >> >> >> > @@ -547,6 +547,49 @@ static int remap_entry_to_msi_msg(
>> >> >> >> >      return 0;
>> >> >> >> >  }
>> >> >> >> >
>> >> >> >> > +static bool_t pi_can_suppress_irte_update(struct iremap_entry
>> *new,
>> >> >> >>
>> >> >> >> bool (and true/false respectively) please.
>> >> >> >>
>> >> >> >> And then the function name suggests that no modification gets done
>> >> >> >> here (and hence the first parameter could be const too), yet the
>> >> >> >> implementation does otherwise (and I don't understand why).
>> >> >> >
>> >> >> > The idea here is that if the old IRTE is in posted format and fields like
>> >> >> > 'fpd', 'sid', 'sq', or 'svt' is going to be changed , we need to use 
> these
>> >> >> > new values for the new_ire, while we still need to use the old values
>> >> >> > of other fields in IRTE, so this function returns the new irte in its
>> > first
>> >> >> >  parameter it we cannot suppress the update. I try to do it in this
>> >> >> > function.
>> >> >>
>> >> >> I don't understand: The caller fully constructs the new entry. Why
>> >> >> would you want to do further modifications here? I continue to
>> >> >> think that this function should solely check whether the changes
>> >> >> between old and new entry are such that the actual update (and
>> >> >> hence the flush) can be bypassed.
>> >> >>
>> >> >> >> > +    const struct iremap_entry *old)
>> >> >> >> > +{
>> >> >> >> > +    bool_t ret = 1;
>> >> >> >> > +    u16 fpd, sid, sq, svt;
>> >> >> >> > +
>> >> >> >> > +    if ( !old->remap.p || !old->remap.im )
>> >> >> >> > +        return 0;
>> >> >> >> > +
>> >> >> >> > +    fpd = new->post.fpd;
>> >> >> >> > +    sid = new->post.sid;
>> >> >> >> > +    sq = new->post.sq;
>> >> >> >> > +    svt = new->post.svt;
>> >> >> >> > +
>> >> >> >> > +    *new = *old;
>> >> >> >> > +
>> >> >> >> > +    if ( fpd != old->post.fpd )
>> >> >> >> > +    {
>> >> >> >> > +        new->post.fpd = fpd;
>> >> >> >> > +        ret = 0;
>> >> >> >> > +    }
>> >> >> >> > +
>> >> >> >> > +    if ( sid != old->post.sid )
>> >> >> >> > +    {
>> >> >> >> > +        new->post.sid = sid;
>> >> >> >> > +        ret = 0;
>> >> >> >> > +    }
>> >> >> >> > +
>> >> >> >> > +    if ( sq != old->post.sq )
>> >> >> >> > +    {
>> >> >> >> > +        new->post.sq = sq;
>> >> >> >> > +        ret = 0;
>> >> >> >> > +    }
>> >> >> >> > +
>> >> >> >> > +    if ( svt != old->post.svt )
>> >> >> >> > +    {
>> >> >> >> > +        new->post.svt = svt;
>> >> >> >> > +        ret = 0;
>> >> >> >> > +    }
>> >> >> >>
>> >> >> >> What's the selection of the fields based on? Namely, what about
>> >> >> >> vector, pda_l, and pda_h?
>> >> >> >
>> >> >> > These filed are the common field between posted format and remapped
>> >> >> format.
>> >> >> > 'vector' field has different meaning in the two formant, pda_l and pda_h
>> > is
>> >> >> only
>> >> >> > for posted format. As mentioned above, the purpose of this function is
>> to
>> >> find
>> >> >> > whether use need to update this common field in posted format, if it is,
>> > we
>> >> >> need
>> >> >> > to use them and reuse the old value of other fields (pda_l, pda_h, 
> vector,
>> >> etc.).
>> >> >> > since we need to suppress affinity related update for posted format.
>> >> >>
>> >> >> If that was the case, then the first thing you'd need to check would
>> >> >> be whether the format actually changes. If it doesn't, all fields need
>> >> >> to be compared, while if it does change, the write (and flush) clearly
>> >> >> can't be suppressed.
>> >> >>
>> >> >
>> >> > Let me elaborate a bit more on the function to make things clear:
>> >> > 1. If the old IRTE is present, or it is in remapped mode, we cannot
>> > suppress
>> >> > the update, such as we may create a new IRTE and put it in remapped mode,
>> >> > or update the remapped mode to posted mode.
>> >> > 2. If the condition in step 1 is false, that means the old IRTE is present
>> > and
>> >> > in posted mode, so we need to suppress the affinity related updates,
>> >>
>> >> But only if the new entry is in posted mode too - see my earlier reply.
>> >>
>> >> > and
>> >> > only update the fields:  'fpd', 'sid', 'sq', or 'svt'. (Here maybe we need
>> > to check
>> >> > whether if the new IRTE is in posted mode, if it is we need to update all
>> >> > the field, but currently if we update posted mode -> posted mode, we don't
>> >> > go to this function, it is done in pi_update_irte(),
>> >>
>> >> Which looks like a code flow problem anyway - there shouldn't be
>> >> direct calls from vendor independent code to vendor dependent
>> >> functions. And then I can't see how the call to pi_update_irte()
>> >> prevents execution flow reaching msi_msg_to_remap_entry(); at
>> >> best the function would just re-write the same entry unchanged.
>> >>
>> >> In any event - msi_msg_to_remap_entry() should be correct for
>> >> all current and future callers, and hence I continue to think you
>> >> want to adjust the code as suggested.
>> >>
>> >> > so maybe we can add a WARN_ON() for that case?)
>> >>
>> >> We need to be very careful with such WARN_ON()s - they must
>> >> not be guest triggerable (I think this one wouldn't be) and they
>> >> should not be raised more than once until a "good" update
>> >> happened again (to avoid spamming the log).
>> >
>> > So based on your comments about, I summarize the handling flow:
>> > 1. The same as above
>> > 2. If the condition in step 1 is false, that means the old IRTE is present and
>> > in posted mode. If the new IRTE is in posted mode, we just update it, but
>> > if it is in remapped mode, we need to suppress the affinity related updates,
>> > and only update the fields:  'fpd', 'sid', 'sq', and 'svt'.
>> >
>> > Does this looks okay to you?
>> 
>> No. Just to repeat: "In any event - msi_msg_to_remap_entry()
>> should be correct for all current and future callers, and hence I
>> continue to think you want to adjust the code as suggested." IOW
>> the checking function should really just be checking things, and it
>> should do so (correctly) for all possible inputs. Its return value
>> ought to indicate whether the update can be suppressed.
> 
> Okay, I can make a checking only function. But I would like to listen
> to your advice about how to handle the case: " if it is in remapped
> mode, we need to suppress the affinity related updates, and only
> update the fields:  'fpd', 'sid', 'sq', and 'svt'". Is this okay?

First of all I think you mean "if it is in posted mode". But then yes,
of course only fields that are relevant in the respective format
need updating. Yet once again - a fresh IRTE gets prepared anyway,
to it's really just a matter of which fields the checking function should
compare in both modes (of course provided the mode itself doesn't
change). And then - also as said before - I don't think the list you
gave is exhaustive.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-24 11:31                   ` Jan Beulich
@ 2016-10-25  1:04                     ` Wu, Feng
  2016-10-25  5:57                       ` Tian, Kevin
  2016-10-25  8:09                       ` Jan Beulich
  0 siblings, 2 replies; 35+ messages in thread
From: Wu, Feng @ 2016-10-25  1:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, October 24, 2016 7:31 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format
> IRTE
> 
> >>> On 24.10.16 at 13:10, <feng.wu@intel.com> wrote:
> 
> >
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Monday, October 24, 2016 6:57 PM
> >> To: Wu, Feng <feng.wu@intel.com>
> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> >> devel@lists.xen.org
> >> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted
> > format
> >> IRTE
> >>
> >> >>> On 24.10.16 at 12:18, <feng.wu@intel.com> wrote:
> >>
> >> >
> >> >> -----Original Message-----
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: Monday, October 24, 2016 5:54 PM
> >> >> To: Wu, Feng <feng.wu@intel.com>
> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> >> >> devel@lists.xen.org
> >> >> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted
> >> > format
> >> >> IRTE
> >> >>
> >> >> >>> On 24.10.16 at 10:57, <feng.wu@intel.com> wrote:
> >> >>
> >> >> >
> >> >> >> -----Original Message-----
> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> >> Sent: Monday, October 24, 2016 3:28 PM
> >> >> >> To: Wu, Feng <feng.wu@intel.com>
> >> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>;
> xen-
> >> >> >> devel@lists.xen.org
> >> >> >> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted
> >> >> > format
> >> >> >> IRTE
> >> >> >>
> >> >> >> >>> On 17.10.16 at 09:02, <feng.wu@intel.com> wrote:
> >> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> >> >> Sent: Wednesday, October 12, 2016 9:56 PM
> >> >> >> >> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
> >> >> >> >> > --- a/xen/drivers/passthrough/vtd/intremap.c
> >> >> >> >> > +++ b/xen/drivers/passthrough/vtd/intremap.c
> >> >> >> >> > @@ -547,6 +547,49 @@ static int remap_entry_to_msi_msg(
> >> >> >> >> >      return 0;
> >> >> >> >> >  }
> >> >> >> >> >
> >> >> >> >> > +static bool_t pi_can_suppress_irte_update(struct iremap_entry
> >> *new,
> >> >> >> >>
> >> >> >> >> bool (and true/false respectively) please.
> >> >> >> >>
> >> >> >> >> And then the function name suggests that no modification gets done
> >> >> >> >> here (and hence the first parameter could be const too), yet the
> >> >> >> >> implementation does otherwise (and I don't understand why).
> >> >> >> >
> >> >> >> > The idea here is that if the old IRTE is in posted format and fields like
> >> >> >> > 'fpd', 'sid', 'sq', or 'svt' is going to be changed , we need to use
> > these
> >> >> >> > new values for the new_ire, while we still need to use the old values
> >> >> >> > of other fields in IRTE, so this function returns the new irte in its
> >> > first
> >> >> >> >  parameter it we cannot suppress the update. I try to do it in this
> >> >> >> > function.
> >> >> >>
> >> >> >> I don't understand: The caller fully constructs the new entry. Why
> >> >> >> would you want to do further modifications here? I continue to
> >> >> >> think that this function should solely check whether the changes
> >> >> >> between old and new entry are such that the actual update (and
> >> >> >> hence the flush) can be bypassed.
> >> >> >>
> >> >> >> >> > +    const struct iremap_entry *old)
> >> >> >> >> > +{
> >> >> >> >> > +    bool_t ret = 1;
> >> >> >> >> > +    u16 fpd, sid, sq, svt;
> >> >> >> >> > +
> >> >> >> >> > +    if ( !old->remap.p || !old->remap.im )
> >> >> >> >> > +        return 0;
> >> >> >> >> > +
> >> >> >> >> > +    fpd = new->post.fpd;
> >> >> >> >> > +    sid = new->post.sid;
> >> >> >> >> > +    sq = new->post.sq;
> >> >> >> >> > +    svt = new->post.svt;
> >> >> >> >> > +
> >> >> >> >> > +    *new = *old;
> >> >> >> >> > +
> >> >> >> >> > +    if ( fpd != old->post.fpd )
> >> >> >> >> > +    {
> >> >> >> >> > +        new->post.fpd = fpd;
> >> >> >> >> > +        ret = 0;
> >> >> >> >> > +    }
> >> >> >> >> > +
> >> >> >> >> > +    if ( sid != old->post.sid )
> >> >> >> >> > +    {
> >> >> >> >> > +        new->post.sid = sid;
> >> >> >> >> > +        ret = 0;
> >> >> >> >> > +    }
> >> >> >> >> > +
> >> >> >> >> > +    if ( sq != old->post.sq )
> >> >> >> >> > +    {
> >> >> >> >> > +        new->post.sq = sq;
> >> >> >> >> > +        ret = 0;
> >> >> >> >> > +    }
> >> >> >> >> > +
> >> >> >> >> > +    if ( svt != old->post.svt )
> >> >> >> >> > +    {
> >> >> >> >> > +        new->post.svt = svt;
> >> >> >> >> > +        ret = 0;
> >> >> >> >> > +    }
> >> >> >> >>
> >> >> >> >> What's the selection of the fields based on? Namely, what about
> >> >> >> >> vector, pda_l, and pda_h?
> >> >> >> >
> >> >> >> > These filed are the common field between posted format and
> remapped
> >> >> >> format.
> >> >> >> > 'vector' field has different meaning in the two formant, pda_l and
> pda_h
> >> > is
> >> >> >> only
> >> >> >> > for posted format. As mentioned above, the purpose of this function
> is
> >> to
> >> >> find
> >> >> >> > whether use need to update this common field in posted format, if it
> is,
> >> > we
> >> >> >> need
> >> >> >> > to use them and reuse the old value of other fields (pda_l, pda_h,
> > vector,
> >> >> etc.).
> >> >> >> > since we need to suppress affinity related update for posted format.
> >> >> >>
> >> >> >> If that was the case, then the first thing you'd need to check would
> >> >> >> be whether the format actually changes. If it doesn't, all fields need
> >> >> >> to be compared, while if it does change, the write (and flush) clearly
> >> >> >> can't be suppressed.
> >> >> >>
> >> >> >
> >> >> > Let me elaborate a bit more on the function to make things clear:
> >> >> > 1. If the old IRTE is present, or it is in remapped mode, we cannot
> >> > suppress
> >> >> > the update, such as we may create a new IRTE and put it in remapped
> mode,
> >> >> > or update the remapped mode to posted mode.
> >> >> > 2. If the condition in step 1 is false, that means the old IRTE is present
> >> > and
> >> >> > in posted mode, so we need to suppress the affinity related updates,
> >> >>
> >> >> But only if the new entry is in posted mode too - see my earlier reply.
> >> >>
> >> >> > and
> >> >> > only update the fields:  'fpd', 'sid', 'sq', or 'svt'. (Here maybe we need
> >> > to check
> >> >> > whether if the new IRTE is in posted mode, if it is we need to update all
> >> >> > the field, but currently if we update posted mode -> posted mode, we
> don't
> >> >> > go to this function, it is done in pi_update_irte(),
> >> >>
> >> >> Which looks like a code flow problem anyway - there shouldn't be
> >> >> direct calls from vendor independent code to vendor dependent
> >> >> functions. And then I can't see how the call to pi_update_irte()
> >> >> prevents execution flow reaching msi_msg_to_remap_entry(); at
> >> >> best the function would just re-write the same entry unchanged.
> >> >>
> >> >> In any event - msi_msg_to_remap_entry() should be correct for
> >> >> all current and future callers, and hence I continue to think you
> >> >> want to adjust the code as suggested.
> >> >>
> >> >> > so maybe we can add a WARN_ON() for that case?)
> >> >>
> >> >> We need to be very careful with such WARN_ON()s - they must
> >> >> not be guest triggerable (I think this one wouldn't be) and they
> >> >> should not be raised more than once until a "good" update
> >> >> happened again (to avoid spamming the log).
> >> >
> >> > So based on your comments about, I summarize the handling flow:
> >> > 1. The same as above
> >> > 2. If the condition in step 1 is false, that means the old IRTE is present and
> >> > in posted mode. If the new IRTE is in posted mode, we just update it, but
> >> > if it is in remapped mode, we need to suppress the affinity related updates,
> >> > and only update the fields:  'fpd', 'sid', 'sq', and 'svt'.
> >> >
> >> > Does this looks okay to you?
> >>
> >> No. Just to repeat: "In any event - msi_msg_to_remap_entry()
> >> should be correct for all current and future callers, and hence I
> >> continue to think you want to adjust the code as suggested." IOW
> >> the checking function should really just be checking things, and it
> >> should do so (correctly) for all possible inputs. Its return value
> >> ought to indicate whether the update can be suppressed.
> >
> > Okay, I can make a checking only function. But I would like to listen
> > to your advice about how to handle the case: " if it is in remapped
> > mode, we need to suppress the affinity related updates, and only
> > update the fields:  'fpd', 'sid', 'sq', and 'svt'". Is this okay?
> 
> First of all I think you mean "if it is in posted mode". But then yes,

I mean "if it is in remapped mode", here _it_ refers to the old IRTE.
we only need to suppress the affinity related field when we update
a remapped IRTE to posted IRTE. If the old IRTE is in posted mode,
we can just update the new posted mode IRTE.

> of course only fields that are relevant in the respective format
> need updating. Yet once again - a fresh IRTE gets prepared anyway,
> to it's really just a matter of which fields the checking function should
> compare in both modes (of course provided the mode itself doesn't
> change). And then - also as said before - I don't think the list you
> gave is exhaustive.

I really don't get the point why you think the list is not enough. Could
you please explain more, thanks a lot!

Thanks,
Feng

> 
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-25  1:04                     ` Wu, Feng
@ 2016-10-25  5:57                       ` Tian, Kevin
  2016-10-25  8:09                       ` Jan Beulich
  1 sibling, 0 replies; 35+ messages in thread
From: Tian, Kevin @ 2016-10-25  5:57 UTC (permalink / raw)
  To: Wu, Feng, Jan Beulich
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, xen-devel

> From: Wu, Feng
> Sent: Tuesday, October 25, 2016 9:05 AM
> 
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: Monday, October 24, 2016 7:31 PM
> > To: Wu, Feng <feng.wu@intel.com>
> > Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> > george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> > devel@lists.xen.org
> > Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format
> > IRTE
> >
> > >>> On 24.10.16 at 13:10, <feng.wu@intel.com> wrote:
> >
> > >
> > >> -----Original Message-----
> > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> Sent: Monday, October 24, 2016 6:57 PM
> > >> To: Wu, Feng <feng.wu@intel.com>
> > >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> > >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> > >> devel@lists.xen.org
> > >> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted
> > > format
> > >> IRTE
> > >>
> > >> >>> On 24.10.16 at 12:18, <feng.wu@intel.com> wrote:
> > >>
> > >> >
> > >> >> -----Original Message-----
> > >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> >> Sent: Monday, October 24, 2016 5:54 PM
> > >> >> To: Wu, Feng <feng.wu@intel.com>
> > >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> > >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> > >> >> devel@lists.xen.org
> > >> >> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted
> > >> > format
> > >> >> IRTE
> > >> >>
> > >> >> >>> On 24.10.16 at 10:57, <feng.wu@intel.com> wrote:
> > >> >>
> > >> >> >
> > >> >> >> -----Original Message-----
> > >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> >> >> Sent: Monday, October 24, 2016 3:28 PM
> > >> >> >> To: Wu, Feng <feng.wu@intel.com>
> > >> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> > >> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>;
> > xen-
> > >> >> >> devel@lists.xen.org
> > >> >> >> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted
> > >> >> > format
> > >> >> >> IRTE
> > >> >> >>
> > >> >> >> >>> On 17.10.16 at 09:02, <feng.wu@intel.com> wrote:
> > >> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> >> >> >> Sent: Wednesday, October 12, 2016 9:56 PM
> > >> >> >> >> >>> On 11.10.16 at 02:57, <feng.wu@intel.com> wrote:
> > >> >> >> >> > --- a/xen/drivers/passthrough/vtd/intremap.c
> > >> >> >> >> > +++ b/xen/drivers/passthrough/vtd/intremap.c
> > >> >> >> >> > @@ -547,6 +547,49 @@ static int remap_entry_to_msi_msg(
> > >> >> >> >> >      return 0;
> > >> >> >> >> >  }
> > >> >> >> >> >
> > >> >> >> >> > +static bool_t pi_can_suppress_irte_update(struct iremap_entry
> > >> *new,
> > >> >> >> >>
> > >> >> >> >> bool (and true/false respectively) please.
> > >> >> >> >>
> > >> >> >> >> And then the function name suggests that no modification gets done
> > >> >> >> >> here (and hence the first parameter could be const too), yet the
> > >> >> >> >> implementation does otherwise (and I don't understand why).
> > >> >> >> >
> > >> >> >> > The idea here is that if the old IRTE is in posted format and fields like
> > >> >> >> > 'fpd', 'sid', 'sq', or 'svt' is going to be changed , we need to use
> > > these
> > >> >> >> > new values for the new_ire, while we still need to use the old values
> > >> >> >> > of other fields in IRTE, so this function returns the new irte in its
> > >> > first
> > >> >> >> >  parameter it we cannot suppress the update. I try to do it in this
> > >> >> >> > function.
> > >> >> >>
> > >> >> >> I don't understand: The caller fully constructs the new entry. Why
> > >> >> >> would you want to do further modifications here? I continue to
> > >> >> >> think that this function should solely check whether the changes
> > >> >> >> between old and new entry are such that the actual update (and
> > >> >> >> hence the flush) can be bypassed.
> > >> >> >>
> > >> >> >> >> > +    const struct iremap_entry *old)
> > >> >> >> >> > +{
> > >> >> >> >> > +    bool_t ret = 1;
> > >> >> >> >> > +    u16 fpd, sid, sq, svt;
> > >> >> >> >> > +
> > >> >> >> >> > +    if ( !old->remap.p || !old->remap.im )
> > >> >> >> >> > +        return 0;
> > >> >> >> >> > +
> > >> >> >> >> > +    fpd = new->post.fpd;
> > >> >> >> >> > +    sid = new->post.sid;
> > >> >> >> >> > +    sq = new->post.sq;
> > >> >> >> >> > +    svt = new->post.svt;
> > >> >> >> >> > +
> > >> >> >> >> > +    *new = *old;
> > >> >> >> >> > +
> > >> >> >> >> > +    if ( fpd != old->post.fpd )
> > >> >> >> >> > +    {
> > >> >> >> >> > +        new->post.fpd = fpd;
> > >> >> >> >> > +        ret = 0;
> > >> >> >> >> > +    }
> > >> >> >> >> > +
> > >> >> >> >> > +    if ( sid != old->post.sid )
> > >> >> >> >> > +    {
> > >> >> >> >> > +        new->post.sid = sid;
> > >> >> >> >> > +        ret = 0;
> > >> >> >> >> > +    }
> > >> >> >> >> > +
> > >> >> >> >> > +    if ( sq != old->post.sq )
> > >> >> >> >> > +    {
> > >> >> >> >> > +        new->post.sq = sq;
> > >> >> >> >> > +        ret = 0;
> > >> >> >> >> > +    }
> > >> >> >> >> > +
> > >> >> >> >> > +    if ( svt != old->post.svt )
> > >> >> >> >> > +    {
> > >> >> >> >> > +        new->post.svt = svt;
> > >> >> >> >> > +        ret = 0;
> > >> >> >> >> > +    }
> > >> >> >> >>
> > >> >> >> >> What's the selection of the fields based on? Namely, what about
> > >> >> >> >> vector, pda_l, and pda_h?
> > >> >> >> >
> > >> >> >> > These filed are the common field between posted format and
> > remapped
> > >> >> >> format.
> > >> >> >> > 'vector' field has different meaning in the two formant, pda_l and
> > pda_h
> > >> > is
> > >> >> >> only
> > >> >> >> > for posted format. As mentioned above, the purpose of this function
> > is
> > >> to
> > >> >> find
> > >> >> >> > whether use need to update this common field in posted format, if it
> > is,
> > >> > we
> > >> >> >> need
> > >> >> >> > to use them and reuse the old value of other fields (pda_l, pda_h,
> > > vector,
> > >> >> etc.).
> > >> >> >> > since we need to suppress affinity related update for posted format.
> > >> >> >>
> > >> >> >> If that was the case, then the first thing you'd need to check would
> > >> >> >> be whether the format actually changes. If it doesn't, all fields need
> > >> >> >> to be compared, while if it does change, the write (and flush) clearly
> > >> >> >> can't be suppressed.
> > >> >> >>
> > >> >> >
> > >> >> > Let me elaborate a bit more on the function to make things clear:
> > >> >> > 1. If the old IRTE is present, or it is in remapped mode, we cannot
> > >> > suppress
> > >> >> > the update, such as we may create a new IRTE and put it in remapped
> > mode,
> > >> >> > or update the remapped mode to posted mode.
> > >> >> > 2. If the condition in step 1 is false, that means the old IRTE is present
> > >> > and
> > >> >> > in posted mode, so we need to suppress the affinity related updates,
> > >> >>
> > >> >> But only if the new entry is in posted mode too - see my earlier reply.
> > >> >>
> > >> >> > and
> > >> >> > only update the fields:  'fpd', 'sid', 'sq', or 'svt'. (Here maybe we need
> > >> > to check
> > >> >> > whether if the new IRTE is in posted mode, if it is we need to update all
> > >> >> > the field, but currently if we update posted mode -> posted mode, we
> > don't
> > >> >> > go to this function, it is done in pi_update_irte(),
> > >> >>
> > >> >> Which looks like a code flow problem anyway - there shouldn't be
> > >> >> direct calls from vendor independent code to vendor dependent
> > >> >> functions. And then I can't see how the call to pi_update_irte()
> > >> >> prevents execution flow reaching msi_msg_to_remap_entry(); at
> > >> >> best the function would just re-write the same entry unchanged.
> > >> >>
> > >> >> In any event - msi_msg_to_remap_entry() should be correct for
> > >> >> all current and future callers, and hence I continue to think you
> > >> >> want to adjust the code as suggested.
> > >> >>
> > >> >> > so maybe we can add a WARN_ON() for that case?)
> > >> >>
> > >> >> We need to be very careful with such WARN_ON()s - they must
> > >> >> not be guest triggerable (I think this one wouldn't be) and they
> > >> >> should not be raised more than once until a "good" update
> > >> >> happened again (to avoid spamming the log).
> > >> >
> > >> > So based on your comments about, I summarize the handling flow:
> > >> > 1. The same as above
> > >> > 2. If the condition in step 1 is false, that means the old IRTE is present and
> > >> > in posted mode. If the new IRTE is in posted mode, we just update it, but
> > >> > if it is in remapped mode, we need to suppress the affinity related updates,
> > >> > and only update the fields:  'fpd', 'sid', 'sq', and 'svt'.
> > >> >
> > >> > Does this looks okay to you?
> > >>
> > >> No. Just to repeat: "In any event - msi_msg_to_remap_entry()
> > >> should be correct for all current and future callers, and hence I
> > >> continue to think you want to adjust the code as suggested." IOW
> > >> the checking function should really just be checking things, and it
> > >> should do so (correctly) for all possible inputs. Its return value
> > >> ought to indicate whether the update can be suppressed.
> > >
> > > Okay, I can make a checking only function. But I would like to listen
> > > to your advice about how to handle the case: " if it is in remapped
> > > mode, we need to suppress the affinity related updates, and only
> > > update the fields:  'fpd', 'sid', 'sq', and 'svt'". Is this okay?
> >
> > First of all I think you mean "if it is in posted mode". But then yes,
> 
> I mean "if it is in remapped mode", here _it_ refers to the old IRTE.
> we only need to suppress the affinity related field when we update
> a remapped IRTE to posted IRTE. If the old IRTE is in posted mode,
> we can just update the new posted mode IRTE.
> 
> > of course only fields that are relevant in the respective format
> > need updating. Yet once again - a fresh IRTE gets prepared anyway,
> > to it's really just a matter of which fields the checking function should
> > compare in both modes (of course provided the mode itself doesn't
> > change). And then - also as said before - I don't think the list you
> > gave is exhaustive.
> 
> I really don't get the point why you think the list is not enough. Could
> you please explain more, thanks a lot!
> 

Not sure whether I understand Jan's point correctly. But after going
through your original patch, I felt it's hard to understand the current
flow - you always prepare new_irte in remappable format regardless of
actual format of old irte, and then relying on this new function to adjust 
and check whether actual updates are required for posted mode. It's not 
very readable w/o checking VT-d spec to know which fields are shared 
between two formats. Would it be clearer if you can prepare new_irte 
explicitly for different modes in msi_msg_to_remap_entry, and then 
use this new check-only function to judge any optimization that you 
intend for posted case?

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-25  1:04                     ` Wu, Feng
  2016-10-25  5:57                       ` Tian, Kevin
@ 2016-10-25  8:09                       ` Jan Beulich
  2016-10-26  9:12                         ` Wu, Feng
  1 sibling, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2016-10-25  8:09 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 25.10.16 at 03:04, <feng.wu@intel.com> wrote:
> I really don't get the point why you think the list is not enough. Could
> you please explain more, thanks a lot!

I have to admit that I don't really know what else to say. Isn't it
quite obvious that for you to suppress the actual update, _all_
meaningful (in the given format) fields have to match? Unless
you indeed compare all of them, you make assumptions about
your callers only covering a subset of all possible inputs.

And then I have to also say that for this simple an operation
(comparing two IRTEs) the discussion has already gone on way
too long.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-25  8:09                       ` Jan Beulich
@ 2016-10-26  9:12                         ` Wu, Feng
  2016-10-26  9:51                           ` Jan Beulich
  0 siblings, 1 reply; 35+ messages in thread
From: Wu, Feng @ 2016-10-26  9:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, October 25, 2016 4:09 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format
> IRTE
> 
> >>> On 25.10.16 at 03:04, <feng.wu@intel.com> wrote:
> > I really don't get the point why you think the list is not enough. Could
> > you please explain more, thanks a lot!
> 
> I have to admit that I don't really know what else to say. Isn't it
> quite obvious that for you to suppress the actual update, _all_
> meaningful (in the given format) fields have to match? Unless
> you indeed compare all of them, you make assumptions about
> your callers only covering a subset of all possible inputs.

Well, the problem is when the current IRTE is in posted mode, and we use
this function to update the IRTE to remapped mode, we cannot compare
all the fields, since they have different format, and that is why I compare
the common field ( other fields are either posted specific or remapped specific),
it doesn't make sense to compare them between the two format.

Basically, we can divide the format into some category:
- Common field
- posted specific (PI related) or remapped specific (affinity related field)

The purpose here is to suppress the affinity related field, so when updating
posted IRTE to remapped IRTE and common field changes, we need update
them with the new value while suppressing the affinity related field.

Thanks,
Feng

> 
> And then I have to also say that for this simple an operation
> (comparing two IRTEs) the discussion has already gone on way
> too long.
> 
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE
  2016-10-26  9:12                         ` Wu, Feng
@ 2016-10-26  9:51                           ` Jan Beulich
  0 siblings, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2016-10-26  9:51 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 26.10.16 at 11:12, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Tuesday, October 25, 2016 4:09 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org 
>> Subject: RE: [PATCH v5 5/7] VT-d: No need to set irq affinity for posted 
> format
>> IRTE
>> 
>> >>> On 25.10.16 at 03:04, <feng.wu@intel.com> wrote:
>> > I really don't get the point why you think the list is not enough. Could
>> > you please explain more, thanks a lot!
>> 
>> I have to admit that I don't really know what else to say. Isn't it
>> quite obvious that for you to suppress the actual update, _all_
>> meaningful (in the given format) fields have to match? Unless
>> you indeed compare all of them, you make assumptions about
>> your callers only covering a subset of all possible inputs.
> 
> Well, the problem is when the current IRTE is in posted mode, and we use
> this function to update the IRTE to remapped mode, we cannot compare
> all the fields, since they have different format, and that is why I compare
> the common field ( other fields are either posted specific or remapped specific),
> it doesn't make sense to compare them between the two format.

Of course, and I've repeatedly said that the first check of course
needs to be whether the two format bits are different - if they
are, no further checking is needed, and the update must not be
bypassed.

> Basically, we can divide the format into some category:
> - Common field
> - posted specific (PI related) or remapped specific (affinity related field)

That's fine of course, but any such checks would go after the
format bit comparison anyway.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2016-10-26  9:51 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-11  0:57 [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
2016-10-11  0:57 ` [PATCH v5 1/7] VMX: Statically assign two PI hooks Feng Wu
2016-10-11  8:11   ` Tian, Kevin
2016-10-12 13:25   ` Jan Beulich
2016-10-11  0:57 ` [PATCH v5 2/7] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
2016-10-11  8:20   ` Tian, Kevin
2016-10-11  0:57 ` [PATCH v5 3/7] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed Feng Wu
2016-10-11  0:57 ` [PATCH v5 4/7] VMX: Make sure PI is in proper state before install the hooks Feng Wu
2016-10-12 13:45   ` Jan Beulich
2016-10-17  6:26     ` Wu, Feng
2016-10-24  7:22       ` Jan Beulich
2016-10-24  7:45         ` Wu, Feng
2016-10-24  7:59           ` Jan Beulich
2016-10-11  0:57 ` [PATCH v5 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
2016-10-12 13:56   ` Jan Beulich
2016-10-17  7:02     ` Wu, Feng
2016-10-24  7:27       ` Jan Beulich
2016-10-24  8:57         ` Wu, Feng
2016-10-24  9:53           ` Jan Beulich
2016-10-24 10:18             ` Wu, Feng
2016-10-24 10:57               ` Jan Beulich
2016-10-24 11:10                 ` Wu, Feng
2016-10-24 11:31                   ` Jan Beulich
2016-10-25  1:04                     ` Wu, Feng
2016-10-25  5:57                       ` Tian, Kevin
2016-10-25  8:09                       ` Jan Beulich
2016-10-26  9:12                         ` Wu, Feng
2016-10-26  9:51                           ` Jan Beulich
2016-10-11  0:57 ` [PATCH v5 6/7] VT-d: Some cleanups Feng Wu
2016-10-12 14:01   ` Jan Beulich
2016-10-11  0:57 ` [PATCH v5 7/7] VMX: Fixup PI descriptor when cpu is offline Feng Wu
2016-10-11  8:38   ` Tian, Kevin
2016-10-11 11:46     ` Wu, Feng
2016-10-11  8:08 ` [PATCH v5 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Tian, Kevin
2016-10-11  8:11   ` Wu, Feng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.