* [PATCH v3 0/6] VMX: Properly handle pi descriptor and per-cpu blocking list
@ 2016-08-31  3:56 Feng Wu
  2016-08-31  3:56 ` [PATCH v3 1/6] VMX: Statically assign two PI hooks Feng Wu
                   ` (5 more replies)
  0 siblings, 6 replies; 50+ messages in thread
From: Feng Wu @ 2016-08-31  3:56 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

The current VT-d PI related code may operate incorrectly in the
following scenarios:
1. When the last assigned device is detached from the domain, all
the PI related hooks are removed, but the vCPU can still be
blocked, switched to another pCPU, etc., all without PI being
aware of it. The next time we attach another device to the domain,
which makes the PI related hooks available again, the state
of the PI descriptor is stale. Besides that, the blocked vCPU
may still remain on the per-cpu blocking list in this case. Patches
[1/6] and [2/6] handle this.

2. After the domain is destroyed, a blocked vCPU may also
remain on the per-cpu blocking list. Handled in patch [3/6].

3. When an IRTE is in posted mode, we don't need to set the irq
affinity for it, since the destination of these interrupts is a
vCPU and the vCPU affinity is set during vCPU scheduling. Patch
[5/6] handles this.

4. When a pCPU is unplugged, there might be vCPUs on its blocking
list. Since the pCPU is offline, those vCPUs might never be woken
up again. Patch [6/6] addresses this.

The previous version of this series is:
https://lists.xen.org/archives/html/xen-devel/2016-05/msg02592.html

Feng Wu (6):
  VMX: Statically assign two PI hooks
  VMX: Properly handle pi when all the assigned devices are removed
  VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed
  Pause/Unpause the domain before/after assigning PI hooks
  VT-d: No need to set irq affinity for posted format IRTE
  VMX: Fixup PI descriptor when cpu is offline

 xen/arch/x86/hvm/vmx/vmcs.c            |   1 +
 xen/arch/x86/hvm/vmx/vmx.c             | 110 ++++++++++++++++++++++++++++++---
 xen/drivers/passthrough/vtd/intremap.c |  62 ++++++++++---------
 xen/include/asm-x86/hvm/vmx/vmx.h      |   1 +
 4 files changed, 139 insertions(+), 35 deletions(-)

-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v3 1/6] VMX: Statically assign two PI hooks
  2016-08-31  3:56 [PATCH v3 0/6] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
@ 2016-08-31  3:56 ` Feng Wu
  2016-09-01  8:16   ` Jan Beulich
  2016-09-06  8:42   ` Dario Faggioli
  2016-08-31  3:56 ` [PATCH v3 2/6] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 50+ messages in thread
From: Feng Wu @ 2016-08-31  3:56 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

The PI hooks vmx_pi_switch_from() and vmx_pi_switch_to() are
needed even after all the assigned devices have been detached
from the domain. We change the state of the SN bit in these two
functions, and evaluate this bit in vmx_deliver_posted_intr()
when trying to deliver the interrupt in posted mode via software.
The problem is that if we deassign the hooks while the vCPU is
runnable in the runqueue with 'SN' set, all future notification
events will be suppressed. This patch makes these two hooks
statically assigned.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 3d330b6..f5d2d3c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -207,8 +207,6 @@ void vmx_pi_hooks_assign(struct domain *d)
     ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
 
     d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
-    d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
-    d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
     d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
 }
 
@@ -221,8 +219,6 @@ void vmx_pi_hooks_deassign(struct domain *d)
     ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
 
     d->arch.hvm_domain.vmx.vcpu_block = NULL;
-    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
-    d->arch.hvm_domain.vmx.pi_switch_to = NULL;
     d->arch.hvm_domain.vmx.pi_do_resume = NULL;
 }
 
@@ -236,6 +232,12 @@ static int vmx_domain_initialise(struct domain *d)
     if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
         return rc;
 
+    if ( iommu_intpost )
+    {
+        d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
+        d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
+    }
+
     return 0;
 }
 
-- 
2.1.0



^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 2/6] VMX: Properly handle pi when all the assigned devices are removed
  2016-08-31  3:56 [PATCH v3 0/6] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
  2016-08-31  3:56 ` [PATCH v3 1/6] VMX: Statically assign two PI hooks Feng Wu
@ 2016-08-31  3:56 ` Feng Wu
  2016-09-01  8:21   ` Jan Beulich
  2016-09-06  8:58   ` Dario Faggioli
  2016-08-31  3:56 ` [PATCH v3 3/6] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed Feng Wu
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 50+ messages in thread
From: Feng Wu @ 2016-08-31  3:56 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

This patch handles some corner cases when the last assigned device
is removed from the domain. In this case we should carefully handle
the PI descriptor and the per-cpu blocking lists, to make sure:
- all the PI descriptors are in the right state the next time a
device is assigned to the domain again;
- no vCPUs of the domain remain on any per-cpu blocking list.

Basically, we pause the domain before zapping the PI hooks and
removing the vCPUs from the blocking lists, then unpause it after
that.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index f5d2d3c..b869728 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -158,14 +158,12 @@ static void vmx_pi_switch_to(struct vcpu *v)
     pi_clear_sn(pi_desc);
 }
 
-static void vmx_pi_do_resume(struct vcpu *v)
+static void vmx_pi_remove_vcpu_from_blocking_list(struct vcpu *v)
 {
     unsigned long flags;
     spinlock_t *pi_blocking_list_lock;
     struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
 
-    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
-
     /*
      * Set 'NV' field back to posted_intr_vector, so the
      * Posted-Interrupts can be delivered to the vCPU when
@@ -198,6 +196,21 @@ static void vmx_pi_do_resume(struct vcpu *v)
     spin_unlock_irqrestore(pi_blocking_list_lock, flags);
 }
 
+static void vmx_pi_do_resume(struct vcpu *v)
+{
+    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
+
+    vmx_pi_remove_vcpu_from_blocking_list(v);
+}
+
+static void vmx_pi_blocking_cleanup(struct vcpu *v)
+{
+    if ( !iommu_intpost )
+        return;
+
+    vmx_pi_remove_vcpu_from_blocking_list(v);
+}
+
 /* This function is called when pcidevs_lock is held */
 void vmx_pi_hooks_assign(struct domain *d)
 {
@@ -213,13 +226,28 @@ void vmx_pi_hooks_assign(struct domain *d)
 /* This function is called when pcidevs_lock is held */
 void vmx_pi_hooks_deassign(struct domain *d)
 {
+    struct vcpu *v;
+
     if ( !iommu_intpost || !has_hvm_container_domain(d) )
         return;
 
     ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
 
+    /*
+     * Pausing the domain makes sure the vCPU is not running,
+     * and hence not calling the hooks simultaneously, while we
+     * are deassigning the PI hooks and removing the vCPU from
+     * the blocking list.
+     */
+    domain_pause(d);
+
     d->arch.hvm_domain.vmx.vcpu_block = NULL;
     d->arch.hvm_domain.vmx.pi_do_resume = NULL;
+
+    for_each_vcpu ( d, v )
+        vmx_pi_blocking_cleanup(v);
+
+    domain_unpause(d);
 }
 
 static int vmx_domain_initialise(struct domain *d)
-- 
2.1.0



^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 3/6] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed
  2016-08-31  3:56 [PATCH v3 0/6] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
  2016-08-31  3:56 ` [PATCH v3 1/6] VMX: Statically assign two PI hooks Feng Wu
  2016-08-31  3:56 ` [PATCH v3 2/6] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
@ 2016-08-31  3:56 ` Feng Wu
  2016-09-06  9:21   ` Dario Faggioli
  2016-08-31  3:56 ` [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks Feng Wu
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 50+ messages in thread
From: Feng Wu @ 2016-08-31  3:56 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

We should remove the vCPU from the per-cpu blocking list
if it is going to be destroyed.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b869728..37fa2f1 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -346,6 +346,7 @@ static void vmx_vcpu_destroy(struct vcpu *v)
     vmx_destroy_vmcs(v);
     vpmu_destroy(v);
     passive_domain_destroy(v);
+    vmx_pi_blocking_cleanup(v);
 }
 
 static DEFINE_PER_CPU(struct vmx_msr_state, host_msr_state);
-- 
2.1.0



^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-08-31  3:56 [PATCH v3 0/6] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
                   ` (2 preceding siblings ...)
  2016-08-31  3:56 ` [PATCH v3 3/6] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed Feng Wu
@ 2016-08-31  3:56 ` Feng Wu
  2016-09-01  8:29   ` Jan Beulich
  2016-08-31  3:56 ` [PATCH v3 5/6] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
  2016-08-31  3:56 ` [PATCH v3 6/6] VMX: Fixup PI descriptor when cpu is offline Feng Wu
  5 siblings, 1 reply; 50+ messages in thread
From: Feng Wu @ 2016-08-31  3:56 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

Pausing the domain makes sure the vCPU is not running, and
hence not calling the hooks simultaneously, when deassigning the
PI hooks. This makes sure that all the appropriate state of the
PI descriptor is actually set up for all vCPUs before leaving
this function.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 37fa2f1..071c063 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain *d)
 
     ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
 
+    /*
+     * Pausing the domain can make sure the vCPU is not
+     * running and hence calling the hooks simultaneously
+     * when deassigning the PI hooks. This makes sure that
+     * all the appropriate state of PI descriptor is actually
+     * set up for all vCPUs before leaving this function.
+     */
+    domain_pause(d);
+
     d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
     d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
+
+    domain_unpause(d);
 }
 
 /* This function is called when pcidevs_lock is held */
-- 
2.1.0



^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 5/6] VT-d: No need to set irq affinity for posted format IRTE
  2016-08-31  3:56 [PATCH v3 0/6] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
                   ` (3 preceding siblings ...)
  2016-08-31  3:56 ` [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks Feng Wu
@ 2016-08-31  3:56 ` Feng Wu
  2016-09-01  8:38   ` Jan Beulich
  2016-08-31  3:56 ` [PATCH v3 6/6] VMX: Fixup PI descriptor when cpu is offline Feng Wu
  5 siblings, 1 reply; 50+ messages in thread
From: Feng Wu @ 2016-08-31  3:56 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

We don't set the affinity for a posted format IRTE, since the
destination of these interrupts is a vCPU and the vCPU affinity
is set during vCPU scheduling.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/drivers/passthrough/vtd/intremap.c | 62 +++++++++++++++++++---------------
 1 file changed, 34 insertions(+), 28 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
index bfd468b..2083ee2 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -595,33 +595,36 @@ static int msi_msg_to_remap_entry(
     GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
                      iremap_entries, iremap_entry);
 
-    memcpy(&new_ire, iremap_entry, sizeof(struct iremap_entry));
+    if ( !iremap_entry->remap.p || !iremap_entry->remap.im )
+    {
+        memcpy(&new_ire, iremap_entry, sizeof(struct iremap_entry));
 
-    /* Set interrupt remapping table entry */
-    new_ire.remap.fpd = 0;
-    new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
-    new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
-    new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
-    /* Hardware require RH = 1 for LPR delivery mode */
-    new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
-    new_ire.remap.avail = 0;
-    new_ire.remap.res_1 = 0;
-    new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
-                            MSI_DATA_VECTOR_MASK;
-    new_ire.remap.res_2 = 0;
-    if ( x2apic_enabled )
-        new_ire.remap.dst = msg->dest32;
-    else
-        new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
-                             & 0xff) << 8;
+        /* Set interrupt remapping table entry */
+        new_ire.remap.fpd = 0;
+        new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
+        new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
+        new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
+        /* Hardware require RH = 1 for LPR delivery mode */
+        new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
+        new_ire.remap.avail = 0;
+        new_ire.remap.res_1 = 0;
+        new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
+                                MSI_DATA_VECTOR_MASK;
+        new_ire.remap.res_2 = 0;
+        if ( x2apic_enabled )
+            new_ire.remap.dst = msg->dest32;
+        else
+            new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
+                                 & 0xff) << 8;
 
-    if ( pdev )
-        set_msi_source_id(pdev, &new_ire);
-    else
-        set_hpet_source_id(msi_desc->hpet_id, &new_ire);
-    new_ire.remap.res_3 = 0;
-    new_ire.remap.res_4 = 0;
-    new_ire.remap.p = 1;    /* finally, set present bit */
+        if ( pdev )
+            set_msi_source_id(pdev, &new_ire);
+        else
+            set_hpet_source_id(msi_desc->hpet_id, &new_ire);
+        new_ire.remap.res_3 = 0;
+        new_ire.remap.res_4 = 0;
+        new_ire.remap.p = 1;    /* finally, set present bit */
+    }
 
     /* now construct new MSI/MSI-X rte entry */
     remap_rte = (struct msi_msg_remap_entry *)msg;
@@ -637,9 +640,12 @@ static int msi_msg_to_remap_entry(
     remap_rte->address_hi = 0;
     remap_rte->data = index - i;
 
-    memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
-    iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
-    iommu_flush_iec_index(iommu, 0, index);
+    if ( !iremap_entry->remap.p || !iremap_entry->remap.im )
+    {
+        memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
+        iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
+        iommu_flush_iec_index(iommu, 0, index);
+    }
 
     unmap_vtd_domain_page(iremap_entries);
     spin_unlock_irqrestore(&ir_ctrl->iremap_lock, flags);
-- 
2.1.0



^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 6/6] VMX: Fixup PI descriptor when cpu is offline
  2016-08-31  3:56 [PATCH v3 0/6] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
                   ` (4 preceding siblings ...)
  2016-08-31  3:56 ` [PATCH v3 5/6] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
@ 2016-08-31  3:56 ` Feng Wu
  2016-09-01  8:48   ` Jan Beulich
  5 siblings, 1 reply; 50+ messages in thread
From: Feng Wu @ 2016-08-31  3:56 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
	dario.faggioli, jbeulich

When a pCPU goes offline, we need to move all the vCPUs on its
blocking list to another online pCPU; this patch handles that.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/hvm/vmx/vmcs.c       |  1 +
 xen/arch/x86/hvm/vmx/vmx.c        | 54 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/vmx/vmx.h |  1 +
 3 files changed, 56 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 1bd875a..f554d4c 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -578,6 +578,7 @@ void vmx_cpu_dead(unsigned int cpu)
     vmx_free_vmcs(per_cpu(vmxon_region, cpu));
     per_cpu(vmxon_region, cpu) = 0;
     nvmx_cpu_dead(cpu);
+    vmx_pi_desc_fixup(cpu);
 }
 
 int vmx_cpu_up(void)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 071c063..5f428b7 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -211,6 +211,60 @@ static void vmx_pi_blocking_cleanup(struct vcpu *v)
     vmx_pi_remove_vcpu_from_blocking_list(v);
 }
 
+void vmx_pi_desc_fixup(int cpu)
+{
+    unsigned int new_cpu, dest;
+    unsigned long flags;
+    struct arch_vmx_struct *vmx, *tmp;
+    spinlock_t *new_lock, *old_lock = &per_cpu(vmx_pi_blocking, cpu).lock;
+    struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
+
+    if ( !iommu_intpost )
+        return;
+
+    spin_lock_irqsave(old_lock, flags);
+
+    list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
+    {
+        /*
+         * We need to find an online cpu as the NDST of the PI descriptor; it
+         * doesn't matter whether it is within the cpupool of the domain or
+         * not. As long as it is online, the vCPU will be woken up once the
+         * notification event arrives.
+         */
+restart:
+        new_cpu = cpumask_any(&cpu_online_map);
+        new_lock = &per_cpu(vmx_pi_blocking, new_cpu).lock;
+
+        spin_lock(new_lock);
+
+        /*
+         * If the new_cpu is not online, that means it became offline between
+         * getting 'new_cpu' and acquiring its lock above, so we need to find
+         * another online cpu instead, e.g. if this function is being called
+         * on 'new_cpu' at the same time. Can this happen??
+         */
+        if ( !cpu_online(new_cpu) )
+        {
+            spin_unlock(new_lock);
+            goto restart;
+        }
+
+        ASSERT(vmx->pi_blocking.lock == old_lock);
+
+        dest = cpu_physical_id(new_cpu);
+        write_atomic(&vmx->pi_desc.ndst,
+                     x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
+
+        list_move(&vmx->pi_blocking.list,
+                  &per_cpu(vmx_pi_blocking, new_cpu).list);
+        vmx->pi_blocking.lock = new_lock;
+        spin_unlock(new_lock);
+    }
+
+    spin_unlock_irqrestore(old_lock, flags);
+}
+
 /* This function is called when pcidevs_lock is held */
 void vmx_pi_hooks_assign(struct domain *d)
 {
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 4cdd9b1..9783c70 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -569,6 +569,7 @@ void free_p2m_hap_data(struct p2m_domain *p2m);
 void p2m_init_hap_data(struct p2m_domain *p2m);
 
 void vmx_pi_per_cpu_init(unsigned int cpu);
+void vmx_pi_desc_fixup(int cpu);
 
 void vmx_pi_hooks_assign(struct domain *d);
 void vmx_pi_hooks_deassign(struct domain *d);
-- 
2.1.0



^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 1/6] VMX: Statically assign two PI hooks
  2016-08-31  3:56 ` [PATCH v3 1/6] VMX: Statically assign two PI hooks Feng Wu
@ 2016-09-01  8:16   ` Jan Beulich
  2016-09-01  9:13     ` Wu, Feng
  2016-09-06  8:42   ` Dario Faggioli
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-01  8:16 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -207,8 +207,6 @@ void vmx_pi_hooks_assign(struct domain *d)
>      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>  
>      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> -    d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
> -    d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
>      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
>  }

I'm sure I've said so before: While I can see why you want the
adjustment to vmx_pi_hooks_deassign(), I don't see why leaving
out this and ...

> @@ -236,6 +232,12 @@ static int vmx_domain_initialise(struct domain *d)
>      if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
>          return rc;
>  
> +    if ( iommu_intpost )
> +    {
> +        d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
> +        d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
> +    }
> +
>      return 0;
>  }

... this would conflict with the goal of the patch.

Jan



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 2/6] VMX: Properly handle pi when all the assigned devices are removed
  2016-08-31  3:56 ` [PATCH v3 2/6] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
@ 2016-09-01  8:21   ` Jan Beulich
  2016-09-01  9:22     ` Wu, Feng
  2016-09-06  8:58   ` Dario Faggioli
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-01  8:21 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> This patch handles some concern cases when the last assigned device
> is removed from the domain. In this case we should carefully handle
> pi descriptor and the per-cpu blocking list, to make sure:
> - all the PI descriptor are in the right state when next time a
> devices is assigned to the domain again.
> - No remaining vcpus of the domain in the per-cpu blocking list.
> 
> Basically, we pause the domain before zapping the PI hooks and
> removing the vCPU from the blocking list, then unpause it after
> that.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>

Looks plausible, but
a) as already for patch 1 I'm missing information on what changed
   since v2 and
b) doesn't this make unnecessary patch 1?

Jan



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-08-31  3:56 ` [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks Feng Wu
@ 2016-09-01  8:29   ` Jan Beulich
  2016-09-02  1:46     ` Wu, Feng
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-01  8:29 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain *d)
>  
>      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>  
> +    /*
> +     * Pausing the domain can make sure the vCPU is not
> +     * running and hence calling the hooks simultaneously
> +     * when deassigning the PI hooks. This makes sure that
> +     * all the appropriate state of PI descriptor is actually
> +     * set up for all vCPus before leaving this function.
> +     */
> +    domain_pause(d);
> +
>      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> +
> +    domain_unpause(d);
>  }

First of all I'm missing a word on whether the race mentioned in
the description and comment can actually happen. Device
(de)assignment should already be pretty much serialized (via
the domctl lock, and maybe also via the pcidevs one).

And then - isn't this overkill? Wouldn't a simple spin lock, taken
here and in the deassign counterpart, do?

Or wait - is the comment perhaps wrongly talking about deassign?

If so the change is still questionable, as the hooks get set before
the first device gets actually assigned to a guest (I remember
that I insisted on things getting done that way when those
original patches had been under review).

Jan



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 5/6] VT-d: No need to set irq affinity for posted format IRTE
  2016-08-31  3:56 ` [PATCH v3 5/6] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
@ 2016-09-01  8:38   ` Jan Beulich
  2016-09-02  1:58     ` Wu, Feng
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-01  8:38 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> We don't set the affinity for posted format IRTE, since the
> destination of these interrupts is vCPU and the vCPU affinity
> is set during vCPU scheduling.

So is this based on the assumption that after initial setup the function
would only ever get called for affinity changes? I'm not even sure
this is the case today, but I don't think we should build in such a
dependency.

Assuming that the main motivation is to avoid ...

> @@ -637,9 +640,12 @@ static int msi_msg_to_remap_entry(
>      remap_rte->address_hi = 0;
>      remap_rte->data = index - i;
>  
> -    memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
> -    iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
> -    iommu_flush_iec_index(iommu, 0, index);
> +    if ( !iremap_entry->remap.p || !iremap_entry->remap.im )
> +    {
> +        memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
> +        iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
> +        iommu_flush_iec_index(iommu, 0, index);
> +    }

... the actual updating here, may I suggest that you keep the
construction of new_ire and modify the if() here to check whether
nothing except affinity related bits changed? That would also take
care of certain (older) compiler versions likely warning about new_ire
potentially being used uninitialized.

Jan



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 6/6] VMX: Fixup PI descriptor when cpu is offline
  2016-08-31  3:56 ` [PATCH v3 6/6] VMX: Fixup PI descriptor when cpu is offline Feng Wu
@ 2016-09-01  8:48   ` Jan Beulich
  2016-09-02  3:25     ` Wu, Feng
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-01  8:48 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian, xen-devel

>>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> +void vmx_pi_desc_fixup(int cpu)

unsigned int

> +{
> +    unsigned int new_cpu, dest;
> +    unsigned long flags;
> +    struct arch_vmx_struct *vmx, *tmp;
> +    spinlock_t *new_lock, *old_lock = &per_cpu(vmx_pi_blocking, cpu).lock;
> +    struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
> +
> +    if ( !iommu_intpost )
> +        return;
> +
> +    spin_lock_irqsave(old_lock, flags);
> +
> +    list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
> +    {
> +        /*
> +         * We need to find an online cpu as the NDST of the PI descriptor, it
> +         * doesn't matter whether it is within the cpupool of the domain or
> +         * not. As long as it is online, the vCPU will be woken up once the
> +         * notification event arrives.
> +         */
> +restart:

I'd prefer if you did this without label and goto, but in any case
labels should be indented by at least one space. Yet ...

> +        new_cpu = cpumask_any(&cpu_online_map);
> +        new_lock = &per_cpu(vmx_pi_blocking, new_cpu).lock;
> +
> +        spin_lock(new_lock);
> +
> +        /*
> +         * If the new_cpu is not online, that means it became offline between
> +         * we got 'new_cpu' and acquiring its lock above, we need to find
> +         * another online cpu instead. Such as, this fucntion is being called
> +         * on 'new_cpu' at the same time. Can this happen??
> +         */
> +        if ( !cpu_online(new_cpu) )
> +        {
> +            spin_unlock(new_lock);
> +            goto restart;
> +        }

... I think this too has been discussed before: Is this case really
possible? You're in the context of a CPU_DEAD or CPU_UP_CANCELED
notification, which both get issued with cpu_add_remove_lock held.
How can a second CPU go down in parallel?

Jan



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 1/6] VMX: Statically assign two PI hooks
  2016-09-01  8:16   ` Jan Beulich
@ 2016-09-01  9:13     ` Wu, Feng
  2016-09-01  9:23       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Wu, Feng @ 2016-09-01  9:13 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, September 1, 2016 4:17 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: Re: [PATCH v3 1/6] VMX: Statically assign two PI hooks
> 
> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -207,8 +207,6 @@ void vmx_pi_hooks_assign(struct domain *d)
> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> >
> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> > -    d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
> > -    d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> >  }
> 
> I'm sure I've said so before: While I can see why you want the
> adjustment to vmx_pi_hooks_deassign(), I don't see why leaving
> out this and ...

As mentioned in the comments of this patch, these two hooks are also
needed for CPU side PI, so we cannot install them here and ...

> 
> > @@ -236,6 +232,12 @@ static int vmx_domain_initialise(struct domain *d)
> >      if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
> >          return rc;
> >
> > +    if ( iommu_intpost )
> > +    {
> > +        d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
> > +        d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
> > +    }
> > +
> >      return 0;
> >  }
> 
> ... this would conflict with the goal of the patch.

... Yes, this is incorrect; I should install these two hooks when CPU side PI
is supported.

Thanks,
Feng

> 
> Jan



* Re: [PATCH v3 2/6] VMX: Properly handle pi when all the assigned devices are removed
  2016-09-01  8:21   ` Jan Beulich
@ 2016-09-01  9:22     ` Wu, Feng
  2016-09-01 10:23       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Wu, Feng @ 2016-09-01  9:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, September 1, 2016 4:21 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: Re: [PATCH v3 2/6] VMX: Properly handle pi when all the assigned
> devices are removed
> 
> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> > This patch handles some concern cases when the last assigned device
> > is removed from the domain. In this case we should carefully handle
> > pi descriptor and the per-cpu blocking list, to make sure:
> > - all the PI descriptor are in the right state when next time a
> > devices is assigned to the domain again.
> > - No remaining vcpus of the domain in the per-cpu blocking list.
> >
> > Basically, we pause the domain before zapping the PI hooks and
> > removing the vCPU from the blocking list, then unpause it after
> > that.
> >
> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> 
> Looks plausible, but
> a) as already for patch 1 I'm missing information on what changed
>    since v2 and

The biggest change since v2 is that we use domain pause/unpause
(suggested by George) to handle the corner case, while v2 used an
ugly and tricky method to do it, which was considered hard to
maintain.

> b) doesn't this make unnecessary patch 1?

The purpose of patch 1 is to make sure the two hooks are installed
whenever CPU side PI is available, even if VT-d PI is not supported, so I
cannot see why this patch would make it unnecessary.

Thanks,
Feng

> 
> Jan



* Re: [PATCH v3 1/6] VMX: Statically assign two PI hooks
  2016-09-01  9:13     ` Wu, Feng
@ 2016-09-01  9:23       ` Jan Beulich
  2016-09-01  9:38         ` Wu, Feng
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-01  9:23 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 01.09.16 at 11:13, <feng.wu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Thursday, September 1, 2016 4:17 PM
>> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
>> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> > @@ -207,8 +207,6 @@ void vmx_pi_hooks_assign(struct domain *d)
>> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>> >
>> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>> > -    d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
>> > -    d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
>> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
>> >  }
>> 
>> I'm sure I've said so before: While I can see why you want the
>> adjustment to vmx_pi_hooks_deassign(), I don't see why leaving
>> out this and ...
> 
> As mentioned in the comments of this patch, these two hooks are also
> needed for CPU side PI, so we cannot install here and ...

I've just read it again, and to me it doesn't say so. I.e. ...

>> > @@ -236,6 +232,12 @@ static int vmx_domain_initialise(struct domain *d)
>> >      if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
>> >          return rc;
>> >
>> > +    if ( iommu_intpost )
>> > +    {
>> > +        d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
>> > +        d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
>> > +    }
>> > +
>> >      return 0;
>> >  }
>> 
>> ... this would conflict with the goal of the patch.
> 
> ... Yes this is incorrect, I should install these two hooks when CPU side PI is
> supported.

... you'd then additionally need to explain what bug there is without
these hooks and with only CPU side PI present (iirc that's functionality
which has been there before the VT-d part).

Jan



* Re: [PATCH v3 1/6] VMX: Statically assign two PI hooks
  2016-09-01  9:23       ` Jan Beulich
@ 2016-09-01  9:38         ` Wu, Feng
  0 siblings, 0 replies; 50+ messages in thread
From: Wu, Feng @ 2016-09-01  9:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, September 1, 2016 5:23 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v3 1/6] VMX: Statically assign two PI hooks
> 
> >>> On 01.09.16 at 11:13, <feng.wu@intel.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Thursday, September 1, 2016 4:17 PM
> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> > @@ -207,8 +207,6 @@ void vmx_pi_hooks_assign(struct domain *d)
> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> >> >
> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> >> > -    d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
> >> > -    d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
> >> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> >> >  }
> >>
> >> I'm sure I've said so before: While I can see why you want the
> >> adjustment to vmx_pi_hooks_deassign(), I don't see why leaving
> >> out this and ...
> >
> > As mentioned in the comments of this patch, these two hooks are also
> > needed for CPU side PI, so we cannot install here and ...
> 
> I've just read it again, and to me it doesn't say so. I.e. ...
> 
> >> > @@ -236,6 +232,12 @@ static int vmx_domain_initialise(struct domain *d)
> >> >      if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
> >> >          return rc;
> >> >
> >> > +    if ( iommu_intpost )
> >> > +    {
> >> > +        d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
> >> > +        d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
> >> > +    }
> >> > +
> >> >      return 0;
> >> >  }
> >>
> >> ... this would conflict with the goal of the patch.
> >
> > ... Yes this is incorrect, I should install these two hooks when CPU side PI is
> > supported.
> 
> ... you'd then additionally need to explain what bug there is without
> these hooks and with only CPU side PI present (iirc that's functionality
> which has been there before the VT-d part).

Oh, my bad, I mixed things up. Then I should install these two hooks when
the first device assignment happens and not zap them after that. Thanks for
the comments!

Thanks,
Feng

> 
> Jan



* Re: [PATCH v3 2/6] VMX: Properly handle pi when all the assigned devices are removed
  2016-09-01  9:22     ` Wu, Feng
@ 2016-09-01 10:23       ` Jan Beulich
  2016-09-01 13:12         ` Wu, Feng
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-01 10:23 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 01.09.16 at 11:22, <feng.wu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Thursday, September 1, 2016 4:21 PM
>> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
>> > This patch handles some concern cases when the last assigned device
>> > is removed from the domain. In this case we should carefully handle
>> > pi descriptor and the per-cpu blocking list, to make sure:
>> > - all the PI descriptor are in the right state when next time a
>> > devices is assigned to the domain again.
>> > - No remaining vcpus of the domain in the per-cpu blocking list.
>> >
>> > Basically, we pause the domain before zapping the PI hooks and
>> > removing the vCPU from the blocking list, then unpause it after
>> > that.
>> >
>> > Signed-off-by: Feng Wu <feng.wu@intel.com>
>> 
>> Looks plausible, but
>> a) as already for patch 1 I'm missing information on what changed
>>    since v2 and
> 
> The biggest changes since v2 is that we use domain pause/unpause
> (suggested by George) to handle the concern case, while v2 was using
> some ugly and tricky method to do it, which was considered as hard
> to maintain.
> 
>> b) doesn't this make unnecessary patch 1?
> 
> The purpose of patch 1 is to make sure the two hooks are installed
> while CPU side PI is available event VT-d PI is not supported, I cannot
> see why this patch will make it unnecessary.

So I guess this doesn't hold anymore with your subsequent reply
to my comments on patch 1?

Jan



* Re: [PATCH v3 2/6] VMX: Properly handle pi when all the assigned devices are removed
  2016-09-01 10:23       ` Jan Beulich
@ 2016-09-01 13:12         ` Wu, Feng
  0 siblings, 0 replies; 50+ messages in thread
From: Wu, Feng @ 2016-09-01 13:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, September 1, 2016 6:24 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v3 2/6] VMX: Properly handle pi when all the assigned
> devices are removed
> 
> >>> On 01.09.16 at 11:22, <feng.wu@intel.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Thursday, September 1, 2016 4:21 PM
> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> >> > This patch handles some concern cases when the last assigned device
> >> > is removed from the domain. In this case we should carefully handle
> >> > pi descriptor and the per-cpu blocking list, to make sure:
> >> > - all the PI descriptor are in the right state when next time a
> >> > devices is assigned to the domain again.
> >> > - No remaining vcpus of the domain in the per-cpu blocking list.
> >> >
> >> > Basically, we pause the domain before zapping the PI hooks and
> >> > removing the vCPU from the blocking list, then unpause it after
> >> > that.
> >> >
> >> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> >>
> >> Looks plausible, but
> >> a) as already for patch 1 I'm missing information on what changed
> >>    since v2 and
> >
> > The biggest changes since v2 is that we use domain pause/unpause
> > (suggested by George) to handle the concern case, while v2 was using
> > some ugly and tricky method to do it, which was considered as hard
> > to maintain.
> >
> >> b) doesn't this make unnecessary patch 1?
> >
> > The purpose of patch 1 is to make sure the two hooks are installed
> > while CPU side PI is available event VT-d PI is not supported, I cannot
> > see why this patch will make it unnecessary.
> 
> So I guess this doesn't hold anymore with your subsequent reply
> to my comments on patch 1?

I think we still need patch 1, since we may pause a runnable vCPU
which is in the run queue (the 'SN' bit is set), and 'SN' continues to be
set while the vCPU is paused. If we zap these two hooks, 'SN' can still
be set after the vCPU is unpaused.

Thanks,
Feng

> 
> Jan



* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-01  8:29   ` Jan Beulich
@ 2016-09-02  1:46     ` Wu, Feng
  2016-09-02  7:04       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Wu, Feng @ 2016-09-02  1:46 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, September 1, 2016 4:30 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain *d)
> >
> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> >
> > +    /*
> > +     * Pausing the domain can make sure the vCPU is not
> > +     * running and hence calling the hooks simultaneously
> > +     * when deassigning the PI hooks. This makes sure that
> > +     * all the appropriate state of PI descriptor is actually
> > +     * set up for all vCPus before leaving this function.
> > +     */
> > +    domain_pause(d);
> > +
> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> > +
> > +    domain_unpause(d);
> >  }
> 
> First of all I'm missing a word on whether the race mentioned in
> the description and comment can actually happen. Device
> (de)assignment should already be pretty much serialized (via
> the domctl lock, and maybe also via the pcidevs one).

The purpose of this patch is to address the race condition where
the _vCPU_ is running while we are installing these hooks. Do you
think this cannot happen? This patch is trying to fix the issue
described at:
http://www.gossamer-threads.com/lists/xen/devel/433229
Given that the other two hooks were installed when the VM
was created, there seems to be no such race condition. However,
according to the discussion about patch 1 and patch 2 of this
series, we need to install the other two hooks here as well, so the
race condition comes back, and we still need to handle it.

> 
> And then - isn't this overkill? Wouldn't a simple spin lock, taken
> here and in the deassign counterpart, do?
> 
> Or wait - is the comment perhaps wrongly talking about deassign?

Oh, yes, there is something wrong in the comment; this patch
has nothing to do with the _deassign_ stuff. The comment should
read as below:

+    /*
+     * Pausing the domain can make sure the vCPU is not
+     * running and hence calling the hooks simultaneously
+     * when _assigning_ the PI hooks. This makes sure that
+     * all the appropriate state of PI descriptor is actually
+     * set up for all vCPus before leaving this function.
+     */

Sorry for that.

> 
> If so the change is still questionable, as the hooks get set before
> the first device gets actually assigned to a guest (I remember
> that I insisted on things getting done that way when those
> original patches had been under review).

Yes, the hooks were installed before the first device gets assigned.
Could you please elaborate on what the question is here?

Thanks,
Feng

> 
> Jan



* Re: [PATCH v3 5/6] VT-d: No need to set irq affinity for posted format IRTE
  2016-09-01  8:38   ` Jan Beulich
@ 2016-09-02  1:58     ` Wu, Feng
  0 siblings, 0 replies; 50+ messages in thread
From: Wu, Feng @ 2016-09-02  1:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, September 1, 2016 4:39 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: Re: [PATCH v3 5/6] VT-d: No need to set irq affinity for posted format
> IRTE
> 
> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> > We don't set the affinity for posted format IRTE, since the
> > destination of these interrupts is vCPU and the vCPU affinity
> > is set during vCPU scheduling.
> 
> So is this based on the assumption that after initial setup the function
> would only ever get called for affinity changes? I'm not even sure
> this is the case today, but I don't think we should build in such a
> dependency.

I don't think we have such a dependency. As you mentioned below, the
purpose of this patch is to avoid changing the affinity related bits when
the IRTE is in posted mode.

> 
> Assuming that the main motivation is to avoid ...
> 
> > @@ -637,9 +640,12 @@ static int msi_msg_to_remap_entry(
> >      remap_rte->address_hi = 0;
> >      remap_rte->data = index - i;
> >
> > -    memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
> > -    iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
> > -    iommu_flush_iec_index(iommu, 0, index);
> > +    if ( !iremap_entry->remap.p || !iremap_entry->remap.im )
> > +    {
> > +        memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
> > +        iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
> > +        iommu_flush_iec_index(iommu, 0, index);
> > +    }
> 
> ... the actual updating here, may I suggest that you keep the
> construction of new_ire and modify the if() here to check whether
> nothing except affinity related bits changed? That would also take
> care of certain (older) compiler versions likely warning about new_ire
> potentially being used uninitialized.

Sure, maybe I can keep the construction of new_ire and only add the
if() part, since if we find the IRTE is in posted mode (the 'IM' bit is set),
we don't need to copy new_ire to iremap_entry.

Thanks,
Feng 

> 
> Jan



* Re: [PATCH v3 6/6] VMX: Fixup PI descritpor when cpu is offline
  2016-09-01  8:48   ` Jan Beulich
@ 2016-09-02  3:25     ` Wu, Feng
  2016-09-02  7:08       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Wu, Feng @ 2016-09-02  3:25 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, September 1, 2016 4:49 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: Re: [PATCH v3 6/6] VMX: Fixup PI descritpor when cpu is offline
> 
> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> > +void vmx_pi_desc_fixup(int cpu)
> 
> unsigned int
> 
> > +{
> > +    unsigned int new_cpu, dest;
> > +    unsigned long flags;
> > +    struct arch_vmx_struct *vmx, *tmp;
> > +    spinlock_t *new_lock, *old_lock = &per_cpu(vmx_pi_blocking, cpu).lock;
> > +    struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
> > +
> > +    if ( !iommu_intpost )
> > +        return;
> > +
> > +    spin_lock_irqsave(old_lock, flags);
> > +
> > +    list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
> > +    {
> > +        /*
> > +         * We need to find an online cpu as the NDST of the PI descriptor, it
> > +         * doesn't matter whether it is within the cpupool of the domain or
> > +         * not. As long as it is online, the vCPU will be woken up once the
> > +         * notification event arrives.
> > +         */
> > +restart:
> 
> I'd prefer if you did this without label and goto, but in any case
> labels should be indented by at least one space. Yet ...
> 
> > +        new_cpu = cpumask_any(&cpu_online_map);
> > +        new_lock = &per_cpu(vmx_pi_blocking, new_cpu).lock;
> > +
> > +        spin_lock(new_lock);
> > +
> > +        /*
> > +         * If the new_cpu is not online, that means it became offline between
> > +         * we got 'new_cpu' and acquiring its lock above, we need to find
> > +         * another online cpu instead. Such as, this fucntion is being called
> > +         * on 'new_cpu' at the same time. Can this happen??
> > +         */
> > +        if ( !cpu_online(new_cpu) )
> > +        {
> > +            spin_unlock(new_lock);
> > +            goto restart;
> > +        }
> 
> ... I think this too has been discussed before: Is this case really
> possible? You're in the context of a CPU_DEAD or CPU_UP_CANCELED
> notification, which both get issued with cpu_add_remove_lock held.
> How can a second CPU go down in parallel?

Here is the call chain:

cpu_down() ->
	stop_machine_run() ->
		get_cpu_maps()	/* Try to hold the cpu_add_remove_lock */
		......
		put_cpu_maps()	/* Release the lock */
	notifier_call_chain(..., CPU_DEAD, ...) ->
		vmx_vcpu_dead() ->
			vmx_pi_desc_fixup()

It seems vmx_pi_desc_fixup() is not called while holding cpu_add_remove_lock?
Or am I missing something? Thanks in advance for further comments!

Thanks,
Feng

> 
> Jan



* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-02  1:46     ` Wu, Feng
@ 2016-09-02  7:04       ` Jan Beulich
  2016-09-02  7:31         ` Wu, Feng
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-02  7:04 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 02.09.16 at 03:46, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Thursday, September 1, 2016 4:30 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org 
>> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
>> PI hooks
>> 
>> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
>> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain *d)
>> >
>> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>> >
>> > +    /*
>> > +     * Pausing the domain can make sure the vCPU is not
>> > +     * running and hence calling the hooks simultaneously
>> > +     * when deassigning the PI hooks. This makes sure that
>> > +     * all the appropriate state of PI descriptor is actually
>> > +     * set up for all vCPus before leaving this function.
>> > +     */
>> > +    domain_pause(d);
>> > +
>> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
>> > +
>> > +    domain_unpause(d);
>> >  }
>> 
>> First of all I'm missing a word on whether the race mentioned in
>> the description and comment can actually happen. Device
>> (de)assignment should already be pretty much serialized (via
>> the domctl lock, and maybe also via the pcidevs one).
> 
> The purpose of this patch is to address the race condition that
> the _vCPU_ is running while we are installing these hooks. Do you
> think this cannot happen?  This patch is trying to fix the issue
> described at:
> http://www.gossamer-threads.com/lists/xen/devel/433229 
> Consider that the other two hooks were installed when the VM
> is created, seems no such race condition. However, according
> to the discussion about patch 1 and patch 2 of series, we need
> to install the other two hooks here as well,

I don't think we've agreed that the creation time installation of
those hooks is actually necessary. In fact your most recent
response to patch 1 makes me think you now agree we don't
need to do so. And hence with that precondition not holding
anymore, I don't think the conclusion does.

> then the race
> condition comes again, so we still need to handle it.
> 
>> 
>> And then - isn't this overkill? Wouldn't a simple spin lock, taken
>> here and in the deassign counterpart, do?
>> 
>> Or wait - is the comment perhaps wrongly talking about deassign?
> 
> Oh, yes, there are something wrong in the comments, this patch
> has nothing to do with the _deassign_ stuff. The comments should
> be like below:
> 
> +    /*
> +     * Pausing the domain can make sure the vCPU is not
> +     * running and hence calling the hooks simultaneously
> +     * when _assigning_ the PI hooks. This makes sure that
> +     * all the appropriate state of PI descriptor is actually
> +     * set up for all vCPus before leaving this function.
> +     */
> 
> Sorry for that.
> 
>> 
>> If so the change is still questionable, as the hooks get set before
>> the first device gets actually assigned to a guest (I remember
>> that I insisted on things getting done that way when those
>> original patches had been under review).
> 
> Yes, the hooks were installed before the first device gets assigned.
> Then could you please elaborate what is the question here?

The question here is whether this patch (taking into consideration
comments on patches earlier in the series) is (a) needed and if so
(b) reasonable in how it achieves the goal. Pausing a domain
shouldn't really become a thing we routinely do.

Jan



* Re: [PATCH v3 6/6] VMX: Fixup PI descritpor when cpu is offline
  2016-09-02  3:25     ` Wu, Feng
@ 2016-09-02  7:08       ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2016-09-02  7:08 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 02.09.16 at 05:25, <feng.wu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Thursday, September 1, 2016 4:49 PM
>> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
>> > +        new_cpu = cpumask_any(&cpu_online_map);
>> > +        new_lock = &per_cpu(vmx_pi_blocking, new_cpu).lock;
>> > +
>> > +        spin_lock(new_lock);
>> > +
>> > +        /*
>> > +         * If the new_cpu is not online, that means it became offline between
>> > +         * we got 'new_cpu' and acquiring its lock above, we need to find
>> > +         * another online cpu instead. Such as, this fucntion is being called
>> > +         * on 'new_cpu' at the same time. Can this happen??
>> > +         */
>> > +        if ( !cpu_online(new_cpu) )
>> > +        {
>> > +            spin_unlock(new_lock);
>> > +            goto restart;
>> > +        }
>> 
>> ... I think this too has been discussed before: Is this case really
>> possible? You're in the context of a CPU_DEAD or CPU_UP_CANCELED
>> notification, which both get issued with cpu_add_remove_lock held.
>> How can a second CPU go down in parallel?
> 
> Here is the call chain:
> 
> cpu_down() ->
> 	stop_machine_run() ->
> 		get_cpu_maps()	/* Try to hold the cpu_add_remove_lock */
> 		......
> 		put_cpu_maps()	/* Release the lock */
> 	notifier_call_chain(..., CPU_DEAD, ...) ->
> 		vmx_vcpu_dead() ->
> 			vmx_pi_desc_fixup()
> 
> Seems vmx_pi_desc_fixup() is not calling with holding cpu_add_remove_lock?
> Or do I miss something? Thanks for further comments in advance!

The only place I see CPU_DEAD being passed to the notifier is
right in cpu_down(), with cpu_hotplug_done() (which is basically an
alias of put_cpu_maps()) clearly called a few lines later.

Jan



* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-02  7:04       ` Jan Beulich
@ 2016-09-02  7:31         ` Wu, Feng
  2016-09-02  8:16           ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Wu, Feng @ 2016-09-02  7:31 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, September 2, 2016 3:04 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> >>> On 02.09.16 at 03:46, <feng.wu@intel.com> wrote:
> 
> >
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Thursday, September 1, 2016 4:30 PM
> >> To: Wu, Feng <feng.wu@intel.com>
> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> >> devel@lists.xen.org
> >> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after
> assigning
> >> PI hooks
> >>
> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain *d)
> >> >
> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> >> >
> >> > +    /*
> >> > +     * Pausing the domain can make sure the vCPU is not
> >> > +     * running and hence calling the hooks simultaneously
> >> > +     * when deassigning the PI hooks. This makes sure that
> >> > +     * all the appropriate state of PI descriptor is actually
> >> > +     * set up for all vCPus before leaving this function.
> >> > +     */
> >> > +    domain_pause(d);
> >> > +
> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> >> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> >> > +
> >> > +    domain_unpause(d);
> >> >  }
> >>
> >> First of all I'm missing a word on whether the race mentioned in
> >> the description and comment can actually happen. Device
> >> (de)assignment should already be pretty much serialized (via
> >> the domctl lock, and maybe also via the pcidevs one).
> >
> > The purpose of this patch is to address the race condition that
> > the _vCPU_ is running while we are installing these hooks. Do you
> > think this cannot happen?  This patch is trying to fix the issue
> > described at:
> > http://www.gossamer-threads.com/lists/xen/devel/433229
> > Consider that the other two hooks were installed when the VM
> > is created, seems no such race condition. However, according
> > to the discussion about patch 1 and patch 2 of series, we need
> > to install the other two hooks here as well,
> 
> I don't think we've agreed that the creation time installation of
> those hooks is actually necessary. In fact your most recent
> response to patch 1 makes me think you now agree we don't
> need to do so. And hence with that precondition not holding
> anymore, I don't think the conclusion does.

I think there might be some confusion. Let me explain what I
am thinking of, to make sure we are on the same page:
1. We need to install all four hooks when the first device is
assigned.
2. If _1_ is true, the issue described in
http://www.gossamer-threads.com/lists/xen/devel/433229
exists.

Let's make the above clear before going forward on this patch.

Thanks,
Feng

> 
> > then the race
> > condition comes again, so we still need to handle it.
> >
> >>
> >> And then - isn't this overkill? Wouldn't a simple spin lock, taken
> >> here and in the deassign counterpart, do?
> >>
> >> Or wait - is the comment perhaps wrongly talking about deassign?
> >
> > Oh, yes, there are something wrong in the comments, this patch
> > has nothing to do with the _deassign_ stuff. The comments should
> > be like below:
> >
> > +    /*
> > +     * Pausing the domain can make sure the vCPU is not
> > +     * running and hence calling the hooks simultaneously
> > +     * when _assigning_ the PI hooks. This makes sure that
> > +     * all the appropriate state of PI descriptor is actually
> > +     * set up for all vCPus before leaving this function.
> > +     */
> >
> > Sorry for that.
> >
> >>
> >> If so the change is still questionable, as the hooks get set before
> >> the first device gets actually assigned to a guest (I remember
> >> that I insisted on things getting done that way when those
> >> original patches had been under review).
> >
> > Yes, the hooks were installed before the first device gets assigned.
> > Then could you please elaborate what is the question here?
> 
> The question here is whether this patch (taking into consideration
> comments on patches earlier in the series) is (a) needed and if so
> (b) reasonable in how it achieves the goal. Pausing a domain
> shouldn't really become a thing we routinely do.
> 
> Jan



* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-02  7:31         ` Wu, Feng
@ 2016-09-02  8:16           ` Jan Beulich
  2016-09-02  8:40             ` Wu, Feng
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-02  8:16 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 02.09.16 at 09:31, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Friday, September 2, 2016 3:04 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org 
>> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
>> PI hooks
>> 
>> >>> On 02.09.16 at 03:46, <feng.wu@intel.com> wrote:
>> 
>> >
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Thursday, September 1, 2016 4:30 PM
>> >> To: Wu, Feng <feng.wu@intel.com>
>> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> >> devel@lists.xen.org 
>> >> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> assigning
>> >> PI hooks
>> >>
>> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
>> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain *d)
>> >> >
>> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>> >> >
>> >> > +    /*
>> >> > +     * Pausing the domain can make sure the vCPU is not
>> >> > +     * running and hence calling the hooks simultaneously
>> >> > +     * when deassigning the PI hooks. This makes sure that
>> >> > +     * all the appropriate state of PI descriptor is actually
>> >> > +     * set up for all vCPus before leaving this function.
>> >> > +     */
>> >> > +    domain_pause(d);
>> >> > +
>> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>> >> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
>> >> > +
>> >> > +    domain_unpause(d);
>> >> >  }
>> >>
>> >> First of all I'm missing a word on whether the race mentioned in
>> >> the description and comment can actually happen. Device
>> >> (de)assignment should already be pretty much serialized (via
>> >> the domctl lock, and maybe also via the pcidevs one).
>> >
>> > The purpose of this patch is to address the race condition that
>> > the _vCPU_ is running while we are installing these hooks. Do you
>> > think this cannot happen?  This patch is trying to fix the issue
>> > described at:
>> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> > Consider that the other two hooks were installed when the VM
>> > is created, seems no such race condition. However, according
>> > to the discussion about patch 1 and patch 2 of series, we need
>> > to install the other two hooks here as well,
>> 
>> I don't think we've agreed that the creation time installation of
>> those hooks is actually necessary. In fact your most recent
>> response to patch 1 makes me think you now agree we don't
>> need to do so. And hence with that precondition not holding
>> anymore, I don't think the conclusion does.
> 
> I think there might be some confusion. Let me explain what I
> am think of to make sure we are on the same page:
> 1. We need install all the four hooks when the first device is
> assigned.
> 2. If _1_ is true, the issue described in
> http://www.gossamer-threads.com/lists/xen/devel/433229 
> exists.

If you mean this

* vcpu 0 starts running on a pcpu
* a device is assigned, causing the hooks to be set
* an interrupt from the device is routed to vcpu 0, but it is not
actually delivered properly, since ndst is not pointing to the right
processor.

raised by George, then I'm not convinced it can happen (after all, the
hooks get set _before_ the device gets assigned, and hence before
the device can raise an interrupt destined at the guest). And if it can
happen, then rather than pausing the guest I don't see why, along
with setting the hooks, any possibly affected NDST field can't be
programmed correctly. ISTR having recommended something like
this already during review of the series originally introducing PI.

Jan


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-02  8:16           ` Jan Beulich
@ 2016-09-02  8:40             ` Wu, Feng
  2016-09-02  9:25               ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Wu, Feng @ 2016-09-02  8:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, September 2, 2016 4:16 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> >>> On 02.09.16 at 09:31, <feng.wu@intel.com> wrote:
> 
> >
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Friday, September 2, 2016 3:04 PM
> >> To: Wu, Feng <feng.wu@intel.com>
> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> >> devel@lists.xen.org
> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
> assigning
> >> PI hooks
> >>
> >> >>> On 02.09.16 at 03:46, <feng.wu@intel.com> wrote:
> >>
> >> >
> >> >> -----Original Message-----
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: Thursday, September 1, 2016 4:30 PM
> >> >> To: Wu, Feng <feng.wu@intel.com>
> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> >> >> devel@lists.xen.org
> >> >> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after
> >> assigning
> >> >> PI hooks
> >> >>
> >> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain *d)
> >> >> >
> >> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> >> >> >
> >> >> > +    /*
> >> >> > +     * Pausing the domain can make sure the vCPU is not
> >> >> > +     * running and hence calling the hooks simultaneously
> >> >> > +     * when deassigning the PI hooks. This makes sure that
> >> >> > +     * all the appropriate state of PI descriptor is actually
> >> >> > +     * set up for all vCPus before leaving this function.
> >> >> > +     */
> >> >> > +    domain_pause(d);
> >> >> > +
> >> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> >> >> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> >> >> > +
> >> >> > +    domain_unpause(d);
> >> >> >  }
> >> >>
> >> >> First of all I'm missing a word on whether the race mentioned in
> >> >> the description and comment can actually happen. Device
> >> >> (de)assignment should already be pretty much serialized (via
> >> >> the domctl lock, and maybe also via the pcidevs one).
> >> >
> >> > The purpose of this patch is to address the race condition that
> >> > the _vCPU_ is running while we are installing these hooks. Do you
> >> > think this cannot happen?  This patch is trying to fix the issue
> >> > described at:
> >> > http://www.gossamer-threads.com/lists/xen/devel/433229
> >> > Consider that the other two hooks were installed when the VM
> >> > is created, seems no such race condition. However, according
> >> > to the discussion about patch 1 and patch 2 of series, we need
> >> > to install the other two hooks here as well,
> >>
> >> I don't think we've agreed that the creation time installation of
> >> those hooks is actually necessary. In fact your most recent
> >> response to patch 1 makes me think you now agree we don't
> >> need to do so. And hence with that precondition not holding
> >> anymore, I don't think the conclusion does.
> >
> > I think there might be some confusion. Let me explain what I
> > am think of to make sure we are on the same page:
> > 1. We need install all the four hooks when the first device is
> > assigned.
> > 2. If _1_ is true, the issue described in
> > http://www.gossamer-threads.com/lists/xen/devel/433229
> > exists.
> 
> If you mean this
> 
> * vcpu 0 starts running on a pcpu
> * a device is assigned, causing the hooks to be set
> * an interrupt from the device is routed to vcpu 0, but it is not
> actually delivered properly, since ndst is not pointing to the right
> processor.
> 
> raised by George, then I'm not convinced it can happen (after all, the
> hooks get set _before_ the device gets assigned, and hence before
> the device can raise an interrupt destined at the guest). And if it can
> happen, then rather than pausing the guest I don't see why, along
> with setting the hooks, any possibly affected NDST field can't be
> programmed correctly. ISTR having recommended something like
> this already during review of the series originally introducing PI.

Actually, here is the scenario I am concerned about:
1. 'vmx_vcpu_block' is installed while the vCPU is running vcpu_block()
and then vmx_vcpu_block().
2. There is an ASSERT() about the 'NDST' field in vmx_vcpu_block(); I think
we may hit that ASSERT(), since 'NDST' may not have been set to the
current processor yet.

My previous solution in v2 was to delete that ASSERT(), but it seems you
didn't like that. So here in v3 I use this new method to make sure
the vCPU is not running while we are installing the hooks.

Thanks,
Feng


> 
> Jan


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-02  8:40             ` Wu, Feng
@ 2016-09-02  9:25               ` Jan Beulich
  2016-09-02 10:30                 ` Wu, Feng
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-02  9:25 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 02.09.16 at 10:40, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Friday, September 2, 2016 4:16 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org 
>> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
>> PI hooks
>> 
>> >>> On 02.09.16 at 09:31, <feng.wu@intel.com> wrote:
>> 
>> >
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Friday, September 2, 2016 3:04 PM
>> >> To: Wu, Feng <feng.wu@intel.com>
>> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> >> devel@lists.xen.org 
>> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> assigning
>> >> PI hooks
>> >>
>> >> >>> On 02.09.16 at 03:46, <feng.wu@intel.com> wrote:
>> >>
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> Sent: Thursday, September 1, 2016 4:30 PM
>> >> >> To: Wu, Feng <feng.wu@intel.com>
>> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> >> >> devel@lists.xen.org 
>> >> >> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> >> assigning
>> >> >> PI hooks
>> >> >>
>> >> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
>> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain *d)
>> >> >> >
>> >> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>> >> >> >
>> >> >> > +    /*
>> >> >> > +     * Pausing the domain can make sure the vCPU is not
>> >> >> > +     * running and hence calling the hooks simultaneously
>> >> >> > +     * when deassigning the PI hooks. This makes sure that
>> >> >> > +     * all the appropriate state of PI descriptor is actually
>> >> >> > +     * set up for all vCPus before leaving this function.
>> >> >> > +     */
>> >> >> > +    domain_pause(d);
>> >> >> > +
>> >> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>> >> >> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
>> >> >> > +
>> >> >> > +    domain_unpause(d);
>> >> >> >  }
>> >> >>
>> >> >> First of all I'm missing a word on whether the race mentioned in
>> >> >> the description and comment can actually happen. Device
>> >> >> (de)assignment should already be pretty much serialized (via
>> >> >> the domctl lock, and maybe also via the pcidevs one).
>> >> >
>> >> > The purpose of this patch is to address the race condition that
>> >> > the _vCPU_ is running while we are installing these hooks. Do you
>> >> > think this cannot happen?  This patch is trying to fix the issue
>> >> > described at:
>> >> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> >> > Consider that the other two hooks were installed when the VM
>> >> > is created, seems no such race condition. However, according
>> >> > to the discussion about patch 1 and patch 2 of series, we need
>> >> > to install the other two hooks here as well,
>> >>
>> >> I don't think we've agreed that the creation time installation of
>> >> those hooks is actually necessary. In fact your most recent
>> >> response to patch 1 makes me think you now agree we don't
>> >> need to do so. And hence with that precondition not holding
>> >> anymore, I don't think the conclusion does.
>> >
>> > I think there might be some confusion. Let me explain what I
>> > am think of to make sure we are on the same page:
>> > 1. We need install all the four hooks when the first device is
>> > assigned.
>> > 2. If _1_ is true, the issue described in
>> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> > exists.
>> 
>> If you mean this
>> 
>> * vcpu 0 starts running on a pcpu
>> * a device is assigned, causing the hooks to be set
>> * an interrupt from the device is routed to vcpu 0, but it is not
>> actually delivered properly, since ndst is not pointing to the right
>> processor.
>> 
>> raised by George, then I'm not convinced it can happen (after all, the
>> hooks get set _before_ the device gets assigned, and hence before
>> the device can raise an interrupt destined at the guest). And if it can
>> happen, then rather than pausing the guest I don't see why, along
>> with setting the hooks, any possibly affected NDST field can't be
>> programmed correctly. ISTR having recommended something like
>> this already during review of the series originally introducing PI.
> 
> Actually here is the scenario I am concerned about:
> 1. ' vmx_vcpu_block ' is installed while vCPU is running vcpu_block()
> and then vmx_vcpu_block().
> 2. There is a ASSERT() about 'NDST' field in vmx_vcpu_block(), I think
> we may hit the ASSERT() since 'NDST' may not have been set to the
> current processor yet.
> 
> My previous solution in v2 is to delete that ASSERT(), but seems you
> guys don't like it. So here I use this new method in v3 to make sure
> the vCPU is running while we are installing the hooks.

Indeed, deleting the assertion doesn't seem right. But then why
can't vmx_vcpu_block() bail early when the domain has no devices
assigned? That would allow for

1) set blocking hook
2) set up PI state
3) actually assign device

Jan


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-02  9:25               ` Jan Beulich
@ 2016-09-02 10:30                 ` Wu, Feng
  2016-09-02 10:45                   ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Wu, Feng @ 2016-09-02 10:30 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, September 2, 2016 5:26 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> >>> On 02.09.16 at 10:40, <feng.wu@intel.com> wrote:
> 
> >
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Friday, September 2, 2016 4:16 PM
> >> To: Wu, Feng <feng.wu@intel.com>
> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> >> devel@lists.xen.org
> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
> assigning
> >> PI hooks
> >>
> >> >>> On 02.09.16 at 09:31, <feng.wu@intel.com> wrote:
> >>
> >> >
> >> >> -----Original Message-----
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: Friday, September 2, 2016 3:04 PM
> >> >> To: Wu, Feng <feng.wu@intel.com>
> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> >> >> devel@lists.xen.org
> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
> >> assigning
> >> >> PI hooks
> >> >>
> >> >> >>> On 02.09.16 at 03:46, <feng.wu@intel.com> wrote:
> >> >>
> >> >> >
> >> >> >> -----Original Message-----
> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> >> Sent: Thursday, September 1, 2016 4:30 PM
> >> >> >> To: Wu, Feng <feng.wu@intel.com>
> >> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> >> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>;
> xen-
> >> >> >> devel@lists.xen.org
> >> >> >> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after
> >> >> assigning
> >> >> >> PI hooks
> >> >> >>
> >> >> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> >> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> >> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain *d)
> >> >> >> >
> >> >> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> >> >> >> >
> >> >> >> > +    /*
> >> >> >> > +     * Pausing the domain can make sure the vCPU is not
> >> >> >> > +     * running and hence calling the hooks simultaneously
> >> >> >> > +     * when deassigning the PI hooks. This makes sure that
> >> >> >> > +     * all the appropriate state of PI descriptor is actually
> >> >> >> > +     * set up for all vCPus before leaving this function.
> >> >> >> > +     */
> >> >> >> > +    domain_pause(d);
> >> >> >> > +
> >> >> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> >> >> >> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> >> >> >> > +
> >> >> >> > +    domain_unpause(d);
> >> >> >> >  }
> >> >> >>
> >> >> >> First of all I'm missing a word on whether the race mentioned in
> >> >> >> the description and comment can actually happen. Device
> >> >> >> (de)assignment should already be pretty much serialized (via
> >> >> >> the domctl lock, and maybe also via the pcidevs one).
> >> >> >
> >> >> > The purpose of this patch is to address the race condition that
> >> >> > the _vCPU_ is running while we are installing these hooks. Do you
> >> >> > think this cannot happen?  This patch is trying to fix the issue
> >> >> > described at:
> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229
> >> >> > Consider that the other two hooks were installed when the VM
> >> >> > is created, seems no such race condition. However, according
> >> >> > to the discussion about patch 1 and patch 2 of series, we need
> >> >> > to install the other two hooks here as well,
> >> >>
> >> >> I don't think we've agreed that the creation time installation of
> >> >> those hooks is actually necessary. In fact your most recent
> >> >> response to patch 1 makes me think you now agree we don't
> >> >> need to do so. And hence with that precondition not holding
> >> >> anymore, I don't think the conclusion does.
> >> >
> >> > I think there might be some confusion. Let me explain what I
> >> > am think of to make sure we are on the same page:
> >> > 1. We need install all the four hooks when the first device is
> >> > assigned.
> >> > 2. If _1_ is true, the issue described in
> >> > http://www.gossamer-threads.com/lists/xen/devel/433229
> >> > exists.
> >>
> >> If you mean this
> >>
> >> * vcpu 0 starts running on a pcpu
> >> * a device is assigned, causing the hooks to be set
> >> * an interrupt from the device is routed to vcpu 0, but it is not
> >> actually delivered properly, since ndst is not pointing to the right
> >> processor.
> >>
> >> raised by George, then I'm not convinced it can happen (after all, the
> >> hooks get set _before_ the device gets assigned, and hence before
> >> the device can raise an interrupt destined at the guest). And if it can
> >> happen, then rather than pausing the guest I don't see why, along
> >> with setting the hooks, any possibly affected NDST field can't be
> >> programmed correctly. ISTR having recommended something like
> >> this already during review of the series originally introducing PI.
> >
> > Actually here is the scenario I am concerned about:
> > 1. ' vmx_vcpu_block ' is installed while vCPU is running vcpu_block()
> > and then vmx_vcpu_block().
> > 2. There is a ASSERT() about 'NDST' field in vmx_vcpu_block(), I think
> > we may hit the ASSERT() since 'NDST' may not have been set to the
> > current processor yet.
> >
> > My previous solution in v2 is to delete that ASSERT(), but seems you
> > guys don't like it. So here I use this new method in v3 to make sure
> > the vCPU is running while we are installing the hooks.
> 
> Indeed, deleting the assertion doesn't seem right. But then why
> can't vmx_vcpu_block() bail early when the domain has no devices
> assigned? That would allow for
> 
> 1) set blocking hook
> 2) set up PI state
> 3) actually assign device

Do you mean we check has_arch_pdev() in vmx_vcpu_block(), and if it is
false, we return early? But has_arch_pdev() needs to hold the
pcidevs lock, right?

Thanks,
Feng

> 
> Jan


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-02 10:30                 ` Wu, Feng
@ 2016-09-02 10:45                   ` Jan Beulich
  2016-09-02 13:15                     ` Wu, Feng
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-02 10:45 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 02.09.16 at 12:30, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Friday, September 2, 2016 5:26 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org 
>> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
>> PI hooks
>> 
>> >>> On 02.09.16 at 10:40, <feng.wu@intel.com> wrote:
>> 
>> >
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Friday, September 2, 2016 4:16 PM
>> >> To: Wu, Feng <feng.wu@intel.com>
>> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> >> devel@lists.xen.org 
>> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> assigning
>> >> PI hooks
>> >>
>> >> >>> On 02.09.16 at 09:31, <feng.wu@intel.com> wrote:
>> >>
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> Sent: Friday, September 2, 2016 3:04 PM
>> >> >> To: Wu, Feng <feng.wu@intel.com>
>> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> >> >> devel@lists.xen.org 
>> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> >> assigning
>> >> >> PI hooks
>> >> >>
>> >> >> >>> On 02.09.16 at 03:46, <feng.wu@intel.com> wrote:
>> >> >>
>> >> >> >
>> >> >> >> -----Original Message-----
>> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> >> Sent: Thursday, September 1, 2016 4:30 PM
>> >> >> >> To: Wu, Feng <feng.wu@intel.com>
>> >> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>;
>> xen-
>> >> >> >> devel@lists.xen.org 
>> >> >> >> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> >> >> assigning
>> >> >> >> PI hooks
>> >> >> >>
>> >> >> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
>> >> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain *d)
>> >> >> >> >
>> >> >> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>> >> >> >> >
>> >> >> >> > +    /*
>> >> >> >> > +     * Pausing the domain can make sure the vCPU is not
>> >> >> >> > +     * running and hence calling the hooks simultaneously
>> >> >> >> > +     * when deassigning the PI hooks. This makes sure that
>> >> >> >> > +     * all the appropriate state of PI descriptor is actually
>> >> >> >> > +     * set up for all vCPus before leaving this function.
>> >> >> >> > +     */
>> >> >> >> > +    domain_pause(d);
>> >> >> >> > +
>> >> >> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>> >> >> >> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
>> >> >> >> > +
>> >> >> >> > +    domain_unpause(d);
>> >> >> >> >  }
>> >> >> >>
>> >> >> >> First of all I'm missing a word on whether the race mentioned in
>> >> >> >> the description and comment can actually happen. Device
>> >> >> >> (de)assignment should already be pretty much serialized (via
>> >> >> >> the domctl lock, and maybe also via the pcidevs one).
>> >> >> >
>> >> >> > The purpose of this patch is to address the race condition that
>> >> >> > the _vCPU_ is running while we are installing these hooks. Do you
>> >> >> > think this cannot happen?  This patch is trying to fix the issue
>> >> >> > described at:
>> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> >> >> > Consider that the other two hooks were installed when the VM
>> >> >> > is created, seems no such race condition. However, according
>> >> >> > to the discussion about patch 1 and patch 2 of series, we need
>> >> >> > to install the other two hooks here as well,
>> >> >>
>> >> >> I don't think we've agreed that the creation time installation of
>> >> >> those hooks is actually necessary. In fact your most recent
>> >> >> response to patch 1 makes me think you now agree we don't
>> >> >> need to do so. And hence with that precondition not holding
>> >> >> anymore, I don't think the conclusion does.
>> >> >
>> >> > I think there might be some confusion. Let me explain what I
>> >> > am think of to make sure we are on the same page:
>> >> > 1. We need install all the four hooks when the first device is
>> >> > assigned.
>> >> > 2. If _1_ is true, the issue described in
>> >> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> >> > exists.
>> >>
>> >> If you mean this
>> >>
>> >> * vcpu 0 starts running on a pcpu
>> >> * a device is assigned, causing the hooks to be set
>> >> * an interrupt from the device is routed to vcpu 0, but it is not
>> >> actually delivered properly, since ndst is not pointing to the right
>> >> processor.
>> >>
>> >> raised by George, then I'm not convinced it can happen (after all, the
>> >> hooks get set _before_ the device gets assigned, and hence before
>> >> the device can raise an interrupt destined at the guest). And if it can
>> >> happen, then rather than pausing the guest I don't see why, along
>> >> with setting the hooks, any possibly affected NDST field can't be
>> >> programmed correctly. ISTR having recommended something like
>> >> this already during review of the series originally introducing PI.
>> >
>> > Actually here is the scenario I am concerned about:
>> > 1. ' vmx_vcpu_block ' is installed while vCPU is running vcpu_block()
>> > and then vmx_vcpu_block().
>> > 2. There is a ASSERT() about 'NDST' field in vmx_vcpu_block(), I think
>> > we may hit the ASSERT() since 'NDST' may not have been set to the
>> > current processor yet.
>> >
>> > My previous solution in v2 is to delete that ASSERT(), but seems you
>> > guys don't like it. So here I use this new method in v3 to make sure
>> the vCPU is not running while we are installing the hooks.
>> 
>> Indeed, deleting the assertion doesn't seem right. But then why
>> can't vmx_vcpu_block() bail early when the domain has no devices
>> assigned? That would allow for
>> 
>> 1) set blocking hook
>> 2) set up PI state
>> 3) actually assign device
> 
> Do you mean we check has_arch_pdev() in vmx_vcpu_block(), if it is
> false, we return early? But has_arch_pdev() needs hold
> _pcidevs_lock, right?

I don't think you strictly need to: list_empty() will reliably return
true in the case of interest here. And possible races when the last
device gets removed are - afaics - benign (i.e. it doesn't matter
what it returns at that point in time).
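The early-bail idea suggested here can be sketched as a small mock in C. Everything below (the trimmed `struct domain`/`struct vcpu` layouts, the `pdev_list` field, the hook's return value) is an illustrative stand-in for the real Xen definitions, not the actual code; it only shows the shape of the check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for Xen's intrusive list (the real one lives in
 * xen/include/xen/list.h). */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

/* Hypothetical, heavily trimmed domain/vcpu structures. */
struct domain { struct list_head pdev_list; /* assigned PCI devices */ };
struct vcpu { struct domain *domain; int processor; unsigned int ndst; };

/* Sketch of the early bail-out: with no device assigned, no posted
 * interrupt can target this domain, so the hook does nothing and the
 * NDST assertion is never reached. As noted above, list_empty() needs
 * no lock here; a stale answer while the last device is being removed
 * is benign. */
static bool vmx_vcpu_block(struct vcpu *v)
{
    if ( list_empty(&v->domain->pdev_list) )
        return false;               /* nothing to do, no ASSERT() hit */

    /* ... the real hook would set up the blocked-on-pCPU state here ... */
    v->ndst = (unsigned int)v->processor;
    return true;
}
```

With this in place, installing the hook before assigning the first device is harmless: calls made in the window simply return early.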

Jan


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-02 10:45                   ` Jan Beulich
@ 2016-09-02 13:15                     ` Wu, Feng
  2016-09-02 13:54                       ` Jan Beulich
  2016-09-23 14:19                       ` Jan Beulich
  0 siblings, 2 replies; 50+ messages in thread
From: Wu, Feng @ 2016-09-02 13:15 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, September 2, 2016 6:46 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> >>> On 02.09.16 at 12:30, <feng.wu@intel.com> wrote:
> 
> >>
> >> >>> On 02.09.16 at 10:40, <feng.wu@intel.com> wrote:
> >>
> >> >>
> >> >> >>> On 02.09.16 at 09:31, <feng.wu@intel.com> wrote:
> >> >>
> >> >> >>
> >> >> >> >>> On 02.09.16 at 03:46, <feng.wu@intel.com> wrote:
> >> >> >>
> >> >> >> >>
> >> >> >> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> >> >> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> >> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> >> >> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain
> *d)
> >> >> >> >> >
> >> >> >> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> >> >> >> >> >
> >> >> >> >> > +    /*
> >> >> >> >> > +     * Pausing the domain can make sure the vCPU is not
> >> >> >> >> > +     * running and hence calling the hooks simultaneously
> >> >> >> >> > +     * when deassigning the PI hooks. This makes sure that
> >> >> >> >> > +     * all the appropriate state of PI descriptor is actually
> >> >> >> >> > +     * set up for all vCPus before leaving this function.
> >> >> >> >> > +     */
> >> >> >> >> > +    domain_pause(d);
> >> >> >> >> > +
> >> >> >> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> >> >> >> >> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> >> >> >> >> > +
> >> >> >> >> > +    domain_unpause(d);
> >> >> >> >> >  }
> >> >> >> >>
> >> >> >> >> First of all I'm missing a word on whether the race mentioned in
> >> >> >> >> the description and comment can actually happen. Device
> >> >> >> >> (de)assignment should already be pretty much serialized (via
> >> >> >> >> the domctl lock, and maybe also via the pcidevs one).
> >> >> >> >
> >> >> >> > The purpose of this patch is to address the race condition that
> >> >> >> > the _vCPU_ is running while we are installing these hooks. Do you
> >> >> >> > think this cannot happen?  This patch is trying to fix the issue
> >> >> >> > described at:
> >> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229
> >> >> >> > Consider that the other two hooks were installed when the VM
> >> >> >> > is created, seems no such race condition. However, according
> >> >> >> > to the discussion about patch 1 and patch 2 of series, we need
> >> >> >> > to install the other two hooks here as well,
> >> >> >>
> >> >> >> I don't think we've agreed that the creation time installation of
> >> >> >> those hooks is actually necessary. In fact your most recent
> >> >> >> response to patch 1 makes me think you now agree we don't
> >> >> >> need to do so. And hence with that precondition not holding
> >> >> >> anymore, I don't think the conclusion does.
> >> >> >
> >> >> > I think there might be some confusion. Let me explain what I
> >> >> > am think of to make sure we are on the same page:
> >> >> > 1. We need install all the four hooks when the first device is
> >> >> > assigned.
> >> >> > 2. If _1_ is true, the issue described in
> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229
> >> >> > exists.
> >> >>
> >> >> If you mean this
> >> >>
> >> >> * vcpu 0 starts running on a pcpu
> >> >> * a device is assigned, causing the hooks to be set
> >> >> * an interrupt from the device is routed to vcpu 0, but it is not
> >> >> actually delivered properly, since ndst is not pointing to the right
> >> >> processor.
> >> >>
> >> >> raised by George, then I'm not convinced it can happen (after all, the
> >> >> hooks get set _before_ the device gets assigned, and hence before
> >> >> the device can raise an interrupt destined at the guest). And if it can
> >> >> happen, then rather than pausing the guest I don't see why, along
> >> >> with setting the hooks, any possibly affected NDST field can't be
> >> >> programmed correctly. ISTR having recommended something like
> >> >> this already during review of the series originally introducing PI.
> >> >
> >> > Actually here is the scenario I am concerned about:
> >> > 1. ' vmx_vcpu_block ' is installed while vCPU is running vcpu_block()
> >> > and then vmx_vcpu_block().
> >> > 2. There is a ASSERT() about 'NDST' field in vmx_vcpu_block(), I think
> >> > we may hit the ASSERT() since 'NDST' may not have been set to the
> >> > current processor yet.
> >> >
> >> > My previous solution in v2 is to delete that ASSERT(), but seems you
> >> > guys don't like it. So here I use this new method in v3 to make sure
> >> > the vCPU is not running while we are installing the hooks.
> >>
> >> Indeed, deleting the assertion doesn't seem right. But then why
> >> can't vmx_vcpu_block() bail early when the domain has no devices
> >> assigned? That would allow for
> >>
> >> 1) set blocking hook
> >> 2) set up PI state
> >> 3) actually assign device
> >
> > Do you mean we check has_arch_pdev() in vmx_vcpu_block(), if it is
> > false, we return early? But has_arch_pdev() needs hold
> > _pcidevs_lock, right?
> 
> I don't think you strictly need to: list_empty() will reliably return
> true in the case of interest here. And possible races when the last
> device gets removed are - afaics - benign (i.e. it doesn't matter
> what it returns at that point in time).

Let me mention another case:
1. The four hooks are installed.
2. vmx_vcpu_block() gets called before the other three hooks get
called, even if a device has already been assigned to the domain. We
may still hit the ASSERT() in vmx_vcpu_block(), since the 'NDST' field
is changed in the other hooks.

And that is another reason I use pause/unpause here: it addresses
all the races. And this is a one-time action (it only occurs when the
first device gets assigned), so do you think it is really unacceptable?

Thanks,
Feng

> 
> Jan


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-02 13:15                     ` Wu, Feng
@ 2016-09-02 13:54                       ` Jan Beulich
  2016-09-05  3:11                         ` Wu, Feng
  2016-09-23 14:19                       ` Jan Beulich
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-02 13:54 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 02.09.16 at 15:15, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Friday, September 2, 2016 6:46 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org 
>> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
>> PI hooks
>> 
>> >>> On 02.09.16 at 12:30, <feng.wu@intel.com> wrote:
>> 
>> >>
>> >> >>> On 02.09.16 at 10:40, <feng.wu@intel.com> wrote:
>> >>
>> >> >>
>> >> >> >>> On 02.09.16 at 09:31, <feng.wu@intel.com> wrote:
>> >> >>
>> >> >> >>
>> >> >> >> >>> On 02.09.16 at 03:46, <feng.wu@intel.com> wrote:
>> >> >> >>
>> >> >> >> >>
>> >> >> >> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
>> >> >> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> >> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain
>> *d)
>> >> >> >> >> >
>> >> >> >> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>> >> >> >> >> >
>> >> >> >> >> > +    /*
>> >> >> >> >> > +     * Pausing the domain can make sure the vCPU is not
>> >> >> >> >> > +     * running and hence calling the hooks simultaneously
>> >> >> >> >> > +     * when deassigning the PI hooks. This makes sure that
>> >> >> >> >> > +     * all the appropriate state of PI descriptor is actually
>> >> >> >> >> > +     * set up for all vCPus before leaving this function.
>> >> >> >> >> > +     */
>> >> >> >> >> > +    domain_pause(d);
>> >> >> >> >> > +
>> >> >> >> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>> >> >> >> >> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
>> >> >> >> >> > +
>> >> >> >> >> > +    domain_unpause(d);
>> >> >> >> >> >  }
>> >> >> >> >>
>> >> >> >> >> First of all I'm missing a word on whether the race mentioned in
>> >> >> >> >> the description and comment can actually happen. Device
>> >> >> >> >> (de)assignment should already be pretty much serialized (via
>> >> >> >> >> the domctl lock, and maybe also via the pcidevs one).
>> >> >> >> >
>> >> >> >> > The purpose of this patch is to address the race condition that
>> >> >> >> > the _vCPU_ is running while we are installing these hooks. Do you
>> >> >> >> > think this cannot happen?  This patch is trying to fix the issue
>> >> >> >> > described at:
>> >> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> >> >> >> > Consider that the other two hooks were installed when the VM
>> >> >> >> > is created, seems no such race condition. However, according
>> >> >> >> > to the discussion about patch 1 and patch 2 of series, we need
>> >> >> >> > to install the other two hooks here as well,
>> >> >> >>
>> >> >> >> I don't think we've agreed that the creation time installation of
>> >> >> >> those hooks is actually necessary. In fact your most recent
>> >> >> >> response to patch 1 makes me think you now agree we don't
>> >> >> >> need to do so. And hence with that precondition not holding
>> >> >> >> anymore, I don't think the conclusion does.
>> >> >> >
>> >> >> > I think there might be some confusion. Let me explain what I
>> >> >> > am think of to make sure we are on the same page:
>> >> >> > 1. We need install all the four hooks when the first device is
>> >> >> > assigned.
>> >> >> > 2. If _1_ is true, the issue described in
>> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> >> >> > exists.
>> >> >>
>> >> >> If you mean this
>> >> >>
>> >> >> * vcpu 0 starts running on a pcpu
>> >> >> * a device is assigned, causing the hooks to be set
>> >> >> * an interrupt from the device is routed to vcpu 0, but it is not
>> >> >> actually delivered properly, since ndst is not pointing to the right
>> >> >> processor.
>> >> >>
>> >> >> raised by George, then I'm not convinced it can happen (after all, the
>> >> >> hooks get set _before_ the device gets assigned, and hence before
>> >> >> the device can raise an interrupt destined at the guest). And if it can
>> >> >> happen, then rather than pausing the guest I don't see why, along
>> >> >> with setting the hooks, any possibly affected NDST field can't be
>> >> >> programmed correctly. ISTR having recommended something like
>> >> >> this already during review of the series originally introducing PI.
>> >> >
>> >> > Actually here is the scenario I am concerned about:
>> >> > 1. ' vmx_vcpu_block ' is installed while vCPU is running vcpu_block()
>> >> > and then vmx_vcpu_block().
>> >> > 2. There is a ASSERT() about 'NDST' field in vmx_vcpu_block(), I think
>> >> > we may hit the ASSERT() since 'NDST' may not have been set to the
>> >> > current processor yet.
>> >> >
>> >> > My previous solution in v2 is to delete that ASSERT(), but seems you
>> >> > guys don't like it. So here I use this new method in v3 to make sure
>> >> > the vCPU is not running while we are installing the hooks.
>> >>
>> >> Indeed, deleting the assertion doesn't seem right. But then why
>> >> can't vmx_vcpu_block() bail early when the domain has no devices
>> >> assigned? That would allow for
>> >>
>> >> 1) set blocking hook
>> >> 2) set up PI state
>> >> 3) actually assign device
>> >
>> > Do you mean we check has_arch_pdev() in vmx_vcpu_block(), if it is
>> > false, we return early? But has_arch_pdev() needs hold
>> > _pcidevs_lock, right?
>> 
>> I don't think you strictly need to: list_empty() will reliably return
>> true in the case of interest here. And possible races when the last
>> device gets removed are - afaics - benign (i.e. it doesn't matter
>> what it returns at that point in time).
> 
> Let me mention another case:
> 1. The four hooks are installed.
> 2. vmx_vcpu_block() gets called before the other three hooks get
> called, even if a device has already been assigned to the domain. We
> may still hit the ASSERT() in vmx_vcpu_block(), since the 'NDST' field
> is changed in the other hooks.

I don't understand: Step 2) in what I've outlined above would make
sure NDST is set correctly. Perhaps one should even reverse 2) and
1).
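The ordering proposed here (set up the PI state for every vCPU first, and only then expose the hook) can be mocked in a few lines of C. The structures and names below are illustrative stand-ins, not the real Xen code, and this single-threaded sketch deliberately ignores the concurrent vCPU-migration window raised later in the thread:

```c
#include <assert.h>
#include <stddef.h>

#define NR_VCPUS 4

/* Illustrative stand-ins for the real Xen structures. */
struct pi_desc { unsigned int ndst; };
struct vcpu { int processor; struct pi_desc pi_desc; };
struct domain {
    struct vcpu vcpu[NR_VCPUS];
    void (*vcpu_block)(struct vcpu *);
};

/* The hook contains the assertion under discussion: by the time it can
 * run, NDST must already name the pCPU the vCPU is running on. */
static void vmx_vcpu_block(struct vcpu *v)
{
    assert(v->pi_desc.ndst == (unsigned int)v->processor);
    /* ... the real hook would add v to the per-pCPU blocking list ... */
}

/* The ordering sketched above: 2) set up PI state for every vCPU
 * first, then 1) install the hook, so no caller can observe the hook
 * before NDST is valid. (A vCPU migrating in between is exactly the
 * remaining hole debated in the thread.) */
static void vmx_pi_hooks_assign(struct domain *d)
{
    for ( int i = 0; i < NR_VCPUS; i++ )
        d->vcpu[i].pi_desc.ndst = (unsigned int)d->vcpu[i].processor;

    d->vcpu_block = vmx_vcpu_block;
}
```

In this single-threaded model the assertion can never fire, which is the point of doing the state setup before the hook becomes visible.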

> And that is another reason I use pause/unpause here: it addresses
> all the races. And this is a one-time action (it only occurs when the
> first device gets assigned), so do you think it is really unacceptable?

No, I've never said it's unacceptable. I just want such to not be
added without good reason.

Jan


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-02 13:54                       ` Jan Beulich
@ 2016-09-05  3:11                         ` Wu, Feng
  2016-09-05  9:27                           ` Jan Beulich
  2016-09-14  2:23                           ` Wu, Feng
  0 siblings, 2 replies; 50+ messages in thread
From: Wu, Feng @ 2016-09-05  3:11 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, September 2, 2016 9:55 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> >>> On 02.09.16 at 15:15, <feng.wu@intel.com> wrote:
> 
> >>
> >> >>> On 02.09.16 at 12:30, <feng.wu@intel.com> wrote:
> >>
> >> >>
> >> >> >>> On 02.09.16 at 10:40, <feng.wu@intel.com> wrote:
> >> >>
> >> >> >>
> >> >> >> >>> On 02.09.16 at 09:31, <feng.wu@intel.com> wrote:
> >> >> >>
> >> >> >> >>
> >> >> >> >> >>> On 02.09.16 at 03:46, <feng.wu@intel.com> wrote:
> >> >> >> >>
> >> >> >> >> >>
> >> >> >> >> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> >> >> >> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> >> >> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> >> >> >> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct
> domain
> >> *d)
> >> >> >> >> >> >
> >> >> >> >> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> >> >> >> >> >> >
> >> >> >> >> >> > +    /*
> >> >> >> >> >> > +     * Pausing the domain can make sure the vCPU is not
> >> >> >> >> >> > +     * running and hence calling the hooks simultaneously
> >> >> >> >> >> > +     * when deassigning the PI hooks. This makes sure that
> >> >> >> >> >> > +     * all the appropriate state of PI descriptor is actually
> >> >> >> >> >> > +     * set up for all vCPus before leaving this function.
> >> >> >> >> >> > +     */
> >> >> >> >> >> > +    domain_pause(d);
> >> >> >> >> >> > +
> >> >> >> >> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> >> >> >> >> >> >      d->arch.hvm_domain.vmx.pi_do_resume =
> vmx_pi_do_resume;
> >> >> >> >> >> > +
> >> >> >> >> >> > +    domain_unpause(d);
> >> >> >> >> >> >  }
> >> >> >> >> >>
> >> >> >> >> >> First of all I'm missing a word on whether the race mentioned in
> >> >> >> >> >> the description and comment can actually happen. Device
> >> >> >> >> >> (de)assignment should already be pretty much serialized (via
> >> >> >> >> >> the domctl lock, and maybe also via the pcidevs one).
> >> >> >> >> >
> >> >> >> >> > The purpose of this patch is to address the race condition that
> >> >> >> >> > the _vCPU_ is running while we are installing these hooks. Do you
> >> >> >> >> > think this cannot happen?  This patch is trying to fix the issue
> >> >> >> >> > described at:
> >> >> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229
> >> >> >> >> > Consider that the other two hooks were installed when the VM
> >> >> >> >> > is created, seems no such race condition. However, according
> >> >> >> >> > to the discussion about patch 1 and patch 2 of series, we need
> >> >> >> >> > to install the other two hooks here as well,
> >> >> >> >>
> >> >> >> >> I don't think we've agreed that the creation time installation of
> >> >> >> >> those hooks is actually necessary. In fact your most recent
> >> >> >> >> response to patch 1 makes me think you now agree we don't
> >> >> >> >> need to do so. And hence with that precondition not holding
> >> >> >> >> anymore, I don't think the conclusion does.
> >> >> >> >
> >> >> >> > I think there might be some confusion. Let me explain what I
> >> >> >> > am think of to make sure we are on the same page:
> >> >> >> > 1. We need install all the four hooks when the first device is
> >> >> >> > assigned.
> >> >> >> > 2. If _1_ is true, the issue described in
> >> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229
> >> >> >> > exists.
> >> >> >>
> >> >> >> If you mean this
> >> >> >>
> >> >> >> * vcpu 0 starts running on a pcpu
> >> >> >> * a device is assigned, causing the hooks to be set
> >> >> >> * an interrupt from the device is routed to vcpu 0, but it is not
> >> >> >> actually delivered properly, since ndst is not pointing to the right
> >> >> >> processor.
> >> >> >>
> >> >> >> raised by George, then I'm not convinced it can happen (after all, the
> >> >> >> hooks get set _before_ the device gets assigned, and hence before
> >> >> >> the device can raise an interrupt destined at the guest). And if it can
> >> >> >> happen, then rather than pausing the guest I don't see why, along
> >> >> >> with setting the hooks, any possibly affected NDST field can't be
> >> >> >> programmed correctly. ISTR having recommended something like
> >> >> >> this already during review of the series originally introducing PI.
> >> >> >
> >> >> > Actually here is the scenario I am concerned about:
> >> >> > 1. ' vmx_vcpu_block ' is installed while vCPU is running vcpu_block()
> >> >> > and then vmx_vcpu_block().
> >> >> > 2. There is a ASSERT() about 'NDST' field in vmx_vcpu_block(), I think
> >> >> > we may hit the ASSERT() since 'NDST' may not have been set to the
> >> >> > current processor yet.
> >> >> >
> >> >> > My previous solution in v2 is to delete that ASSERT(), but seems you
> >> >> > guys don't like it. So here I use this new method in v3 to make sure
> >> >> > the vCPU is not running while we are installing the hooks.
> >> >>
> >> >> Indeed, deleting the assertion doesn't seem right. But then why
> >> >> can't vmx_vcpu_block() bail early when the domain has no devices
> >> >> assigned? That would allow for
> >> >>
> >> >> 1) set blocking hook
> >> >> 2) set up PI state
> >> >> 3) actually assign device
> >> >
> >> > Do you mean we check has_arch_pdev() in vmx_vcpu_block(), if it is
> >> > false, we return early? But has_arch_pdev() needs hold
> >> > _pcidevs_lock, right?
> >>
> >> I don't think you strictly need to: list_empty() will reliably return
> >> true in the case of interest here. And possible races when the last
> >> device gets removed are - afaics - benign (i.e. it doesn't matter
> >> what it returns at that point in time).
> >
> > Let me mention another case:
> > 1. The four hooks are installed.
> > 2. vmx_vcpu_block() gets called before the other three hooks get
> > called, even if a device has already been assigned to the domain. We
> > may still hit the ASSERT() in vmx_vcpu_block(), since the 'NDST' field
> > is changed in the other hooks.
> 
> I don't understand: Step 2) in what I've outlined above would make
> sure NDST is set correctly. Perhaps one should even reverse 2) and
> 1).

I still have trouble fully understanding this; please see the
following scenario:

1) Set 'NDST' to pCPU0 (the pCPU the vCPU is currently running on; but
that may change, since vCPU scheduling happens behind us, so how do we
decide which pCPU to use for 'NDST'?)
2) While the four hooks are being installed, the vCPU is re-scheduled
before 'vmx_pi_switch_to()' gets installed, so the pCPU of the vCPU
might change to pCPU1 without 'NDST' being updated.
3) vmx_vcpu_block() gets called and we hit the ASSERT().

Maybe I missed something; it would be appreciated if you could
correct me if my understanding is wrong.

Thanks,
Feng

> 
> > And that is another reason I use pause/unpause here: it addresses
> > all the races. And this is a one-time action (it only occurs when the
> > first device gets assigned), so do you think it is really unacceptable?
> 
> No, I've never said it's unacceptable. I just want such to not be
> added without good reason.
> 
> Jan


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-05  3:11                         ` Wu, Feng
@ 2016-09-05  9:27                           ` Jan Beulich
  2016-09-14  2:23                           ` Wu, Feng
  1 sibling, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2016-09-05  9:27 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 05.09.16 at 05:11, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Friday, September 2, 2016 9:55 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org 
>> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
>> PI hooks
>> 
>> >>> On 02.09.16 at 15:15, <feng.wu@intel.com> wrote:
>> 
>> >>
>> >> >>> On 02.09.16 at 12:30, <feng.wu@intel.com> wrote:
>> >>
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> Sent: Friday, September 2, 2016 5:26 PM
>> >> >> To: Wu, Feng <feng.wu@intel.com>
>> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
>> >> >> devel@lists.xen.org 
>> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> >> assigning
>> >> >> PI hooks
>> >> >>
>> >> >> >>> On 02.09.16 at 10:40, <feng.wu@intel.com> wrote:
>> >> >>
>> >> >> >
>> >> >> >> -----Original Message-----
>> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> >> Sent: Friday, September 2, 2016 4:16 PM
>> >> >> >> To: Wu, Feng <feng.wu@intel.com>
>> >> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>;
>> xen-
>> >> >> >> devel@lists.xen.org 
>> >> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> >> >> assigning
>> >> >> >> PI hooks
>> >> >> >>
>> >> >> >> >>> On 02.09.16 at 09:31, <feng.wu@intel.com> wrote:
>> >> >> >>
>> >> >> >> >
>> >> >> >> >> -----Original Message-----
>> >> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> >> >> Sent: Friday, September 2, 2016 3:04 PM
>> >> >> >> >> To: Wu, Feng <feng.wu@intel.com>
>> >> >> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> >> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>;
>> >> xen-
>> >> >> >> >> devel@lists.xen.org 
>> >> >> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain
>> before/after
>> >> >> >> assigning
>> >> >> >> >> PI hooks
>> >> >> >> >>
>> >> >> >> >> >>> On 02.09.16 at 03:46, <feng.wu@intel.com> wrote:
>> >> >> >> >>
>> >> >> >> >> >
>> >> >> >> >> >> -----Original Message-----
>> >> >> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> >> >> >> Sent: Thursday, September 1, 2016 4:30 PM
>> >> >> >> >> >> To: Wu, Feng <feng.wu@intel.com>
>> >> >> >> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
>> >> >> >> >> >> george.dunlap@eu.citrix.com; Tian, Kevin
>> <kevin.tian@intel.com>;
>> >> >> xen-
>> >> >> >> >> >> devel@lists.xen.org 
>> >> >> >> >> >> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain
>> >> before/after
>> >> >> >> >> assigning
>> >> >> >> >> >> PI hooks
>> >> >> >> >> >>
>> >> >> >> >> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
>> >> >> >> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> >> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> >> >> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct
>> domain
>> >> *d)
>> >> >> >> >> >> >
>> >> >> >> >> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>> >> >> >> >> >> >
>> >> >> >> >> >> > +    /*
>> >> >> >> >> >> > +     * Pausing the domain can make sure the vCPU is not
>> >> >> >> >> >> > +     * running and hence calling the hooks simultaneously
>> >> >> >> >> >> > +     * when deassigning the PI hooks. This makes sure that
>> >> >> >> >> >> > +     * all the appropriate state of PI descriptor is actually
>> >> >> >> >> >> > +     * set up for all vCPus before leaving this function.
>> >> >> >> >> >> > +     */
>> >> >> >> >> >> > +    domain_pause(d);
>> >> >> >> >> >> > +
>> >> >> >> >> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>> >> >> >> >> >> >      d->arch.hvm_domain.vmx.pi_do_resume =
>> vmx_pi_do_resume;
>> >> >> >> >> >> > +
>> >> >> >> >> >> > +    domain_unpause(d);
>> >> >> >> >> >> >  }
>> >> >> >> >> >>
>> >> >> >> >> >> First of all I'm missing a word on whether the race mentioned in
>> >> >> >> >> >> the description and comment can actually happen. Device
>> >> >> >> >> >> (de)assignment should already be pretty much serialized (via
>> >> >> >> >> >> the domctl lock, and maybe also via the pcidevs one).
>> >> >> >> >> >
>> >> >> >> >> > The purpose of this patch is to address the race condition that
>> >> >> >> >> > the _vCPU_ is running while we are installing these hooks. Do you
>> >> >> >> >> > think this cannot happen?  This patch is trying to fix the issue
>> >> >> >> >> > described at:
>> >> >> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> >> >> >> >> > Consider that the other two hooks were installed when the VM
>> >> >> >> >> > is created, seems no such race condition. However, according
>> >> >> >> >> > to the discussion about patch 1 and patch 2 of series, we need
>> >> >> >> >> > to install the other two hooks here as well,
>> >> >> >> >>
>> >> >> >> >> I don't think we've agreed that the creation time installation of
>> >> >> >> >> those hooks is actually necessary. In fact your most recent
>> >> >> >> >> response to patch 1 makes me think you now agree we don't
>> >> >> >> >> need to do so. And hence with that precondition not holding
>> >> >> >> >> anymore, I don't think the conclusion does.
>> >> >> >> >
>> >> >> > I think there might be some confusion. Let me explain what I
>> >> >> > am thinking of, to make sure we are on the same page:
>> >> >> >> > 1. We need install all the four hooks when the first device is
>> >> >> >> > assigned.
>> >> >> >> > 2. If _1_ is true, the issue described in
>> >> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> >> >> >> > exists.
>> >> >> >>
>> >> >> >> If you mean this
>> >> >> >>
>> >> >> >> * vcpu 0 starts running on a pcpu
>> >> >> >> * a device is assigned, causing the hooks to be set
>> >> >> >> * an interrupt from the device is routed to vcpu 0, but it is not
>> >> >> >> actually delivered properly, since ndst is not pointing to the right
>> >> >> >> processor.
>> >> >> >>
>> >> >> >> raised by George, then I'm not convinced it can happen (after all, the
>> >> >> >> hooks get set _before_ the device gets assigned, and hence before
>> >> >> >> the device can raise an interrupt destined at the guest). And if it can
>> >> >> >> happen, then rather than pausing the guest I don't see why, along
>> >> >> >> with setting the hooks, any possibly affected NDST field can't be
>> >> >> >> programmed correctly. ISTR having recommended something like
>> >> >> >> this already during review of the series originally introducing PI.
>> >> >> >
>> >> >> > Actually here is the scenario I am concerned about:
>> >> >> > 1. ' vmx_vcpu_block ' is installed while vCPU is running vcpu_block()
>> >> >> > and then vmx_vcpu_block().
>> >> >> > 2. There is a ASSERT() about 'NDST' field in vmx_vcpu_block(), I think
>> >> >> > we may hit the ASSERT() since 'NDST' may not have been set to the
>> >> >> > current processor yet.
>> >> >> >
>> >> >> > My previous solution in v2 was to delete that ASSERT(), but it seems
>> >> >> > you guys don't like it. So here I use this new method in v3 to make
>> >> >> > sure the vCPU is not running while we are installing the hooks.
>> >> >>
>> >> >> Indeed, deleting the assertion doesn't seem right. But then why
>> >> >> can't vmx_vcpu_block() bail early when the domain has no devices
>> >> >> assigned? That would allow for
>> >> >>
>> >> >> 1) set blocking hook
>> >> >> 2) set up PI state
>> >> >> 3) actually assign device
>> >> >
>> >> > Do you mean we check has_arch_pdev() in vmx_vcpu_block(), and if it is
>> >> > false, we return early? But has_arch_pdev() needs to hold
>> >> > _pcidevs_lock, right?
>> >>
>> >> I don't think you strictly need to: list_empty() will reliably return
>> >> true in the case of interest here. And possible races when the last
>> >> device gets removed are - afaics - benign (i.e. it doesn't matter
>> >> what it returns at that point in time).
>> >
>> > I am reminded of another case:
>> > 1. The four hooks are installed.
>> > 2. vmx_vcpu_block() gets called before the other three hooks get called,
>> > even if a device has already been assigned to the domain. We may still
>> > hit the ASSERT() in vmx_vcpu_block(), since the 'NDST' field is changed in
>> > the other hooks.
>> 
>> I don't understand: Step 2) in what I've outlined above would make
>> sure NDST is set correctly. Perhaps one should even reverse 2) and
>> 1).
> 
> I still have trouble fully understanding this; please see the following
> scenario:
> 
> 1) Set 'NDST' to pCPU0 (which the vCPU is currently running on, but
> that may change, since vCPU scheduling happens behind us; so how do
> we decide which pCPU to use for 'NDST'?)
> 2) Install all four hooks; the vCPU is re-scheduled before
> 'vmx_pi_switch_to()' gets installed, so the pCPU of the vCPU might
> change to pCPU1 without 'NDST' getting updated.
> 3) vmx_vcpu_block() gets called and we hit the ASSERT().
> 
> Maybe I missed something; it would be appreciated if you could
> correct me if my understanding is wrong.

Well, the problem is that you keep trying to find issues with the
(simplified) sequence of things I'm proposing to investigate as
an alternative. The above scenario _again_ can be dealt with
without having to pause the guest: Install the context switch
hook(s) first, then set NDST, then set the block hook. So what
I'd really like to ask you to do is first try to find a model without
pausing, and only if you figure that's impossible, then explain
why and we'll go with the pausing approach.

Jan
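The ordering Jan sketches above (context-switch hook(s) first, then NDST, then the block hook) can be modelled in a few lines. This is a hypothetical, simplified sketch, not the actual Xen implementation: the struct layout, the single-vCPU domain, and `pi_hooks_assign()` are illustrative inventions.

```c
#include <assert.h>

struct pi_desc { int ndst; };            /* posted-interrupt descriptor */
struct vcpu { struct pi_desc pi_desc; int pcpu; };

struct domain {
    struct vcpu vcpu0;                   /* one vCPU is enough here */
    void (*switch_to)(struct vcpu *);    /* context-switch hook */
    void (*vcpu_block)(struct vcpu *);   /* blocking hook */
};

/* Context-switch hook: keeps NDST tracking whichever pCPU the vCPU
 * is (re)scheduled onto. */
void pi_switch_to(struct vcpu *v)
{
    v->pi_desc.ndst = v->pcpu;
}

/* Blocking hook, containing the equivalent of the ASSERT() in the real
 * vmx_vcpu_block(): NDST must already point at the right pCPU. */
void pi_vcpu_block(struct vcpu *v)
{
    assert(v->pi_desc.ndst == v->pcpu);
}

/* The proposed ordering: context-switch hook, then NDST, then the
 * blocking hook, so the ASSERT() above never observes a stale NDST. */
int pi_hooks_assign(struct domain *d)
{
    d->switch_to = pi_switch_to;            /* 1) migrations covered */
    d->vcpu0.pi_desc.ndst = d->vcpu0.pcpu;  /* 2) current placement */
    d->vcpu_block = pi_vcpu_block;          /* 3) expose block hook */
    return 0;
}
```

With this ordering, by the time the block hook can be invoked, NDST has either been programmed in step 2) or refreshed by the context-switch hook, which is exactly the invariant the ASSERT() checks, and no pausing is required.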


* Re: [PATCH v3 1/6] VMX: Statically assign two PI hooks
  2016-08-31  3:56 ` [PATCH v3 1/6] VMX: Statically assign two PI hooks Feng Wu
  2016-09-01  8:16   ` Jan Beulich
@ 2016-09-06  8:42   ` Dario Faggioli
  2016-09-06  9:53     ` Wu, Feng
  1 sibling, 1 reply; 50+ messages in thread
From: Dario Faggioli @ 2016-09-06  8:42 UTC (permalink / raw)
  To: Feng Wu, xen-devel; +Cc: george.dunlap, andrew.cooper3, kevin.tian, jbeulich



On Wed, 2016-08-31 at 11:56 +0800, Feng Wu wrote:
> PI hooks: vmx_pi_switch_from() and vmx_pi_switch_to() are
> needed even all the assigned devices were dettached from
> the domain. 
>
maybe "are needed even when any previously passed through device is
detached from the domain" (or something like that)?

> We change the state of SN bit in these two
> functions, and evaluate this bit in vmx_deliver_posted_intr()
> when trying to deliver the interrupt in posted way via software.
> The problem is if we deassign the hooks while the vCPU is runnable
> in the runqueue with 'SN' set, all the furture notificaton event
> will be suppressed. This patch makes these two hooks statically
> assigned.
> 
Which, if SN is used only for controlling VT-d PI from passed-thru
devices, does not sound like an issue to me.

What I sort of get from the discussion you had with Jan, however, is
that this is an issue, because SN is also used for other things, i.e.,
it is indeed useful even when there are no passed thru device, is that
the case?

If yes, I think this deserves at least a quick mention in the sentence
above.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v3 2/6] VMX: Properly handle pi when all the assigned devices are removed
  2016-08-31  3:56 ` [PATCH v3 2/6] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
  2016-09-01  8:21   ` Jan Beulich
@ 2016-09-06  8:58   ` Dario Faggioli
  1 sibling, 0 replies; 50+ messages in thread
From: Dario Faggioli @ 2016-09-06  8:58 UTC (permalink / raw)
  To: Feng Wu, xen-devel; +Cc: george.dunlap, andrew.cooper3, kevin.tian, jbeulich



On Wed, 2016-08-31 at 11:56 +0800, Feng Wu wrote:
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -158,14 +158,12 @@ static void vmx_pi_switch_to(struct vcpu *v)
>      pi_clear_sn(pi_desc);
>  }
>  
Not terribly important, but what about calling this:

> -static void vmx_pi_do_resume(struct vcpu *v)
> +static void vmx_pi_remove_vcpu_from_blocking_list(struct vcpu *v)

vmx_pi_list_del() or vmx_pi_list_remove()

> @@ -198,6 +196,21 @@ static void vmx_pi_do_resume(struct vcpu *v)
>      spin_unlock_irqrestore(pi_blocking_list_lock, flags);
>  }
>  
> +static void vmx_pi_do_resume(struct vcpu *v)
> +{
> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
> +
> +    vmx_pi_remove_vcpu_from_blocking_list(v);
> +}
> +
And this:

> +static void vmx_pi_blocking_cleanup(struct vcpu *v)
>
vmx_pi_list_cleanup()

etc.?

I.e., using shorter and more consistent names.

> +{
> +    if ( !iommu_intpost )
> +        return;
> +
At least as far as this patch is concerned, you are only calling this
function from vmx_pi_hooks_deassign(), which already checks at the very
beginning that iommu_intpost is true (or it returns), so you won't get
here.

So you don't need to re-check the same thing here.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v3 3/6] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed
  2016-08-31  3:56 ` [PATCH v3 3/6] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed Feng Wu
@ 2016-09-06  9:21   ` Dario Faggioli
  2016-09-06 23:27     ` Wu, Feng
  0 siblings, 1 reply; 50+ messages in thread
From: Dario Faggioli @ 2016-09-06  9:21 UTC (permalink / raw)
  To: Feng Wu, xen-devel; +Cc: george.dunlap, andrew.cooper3, kevin.tian, jbeulich



On Wed, 2016-08-31 at 11:56 +0800, Feng Wu wrote:
> We should remove the vCPU from the per-cpu blocking list
> if it is going to be destroyed.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  xen/arch/x86/hvm/vmx/vmx.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index b869728..37fa2f1 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -346,6 +346,7 @@ static void vmx_vcpu_destroy(struct vcpu *v)
>      vmx_destroy_vmcs(v);
>      vpmu_destroy(v);
>      passive_domain_destroy(v);
> +    vmx_pi_blocking_cleanup(v);
>
I'm not too much into VMX, so I may be wrong (in which case, sorry),
but is it safe to call this after vmx_destroy_vmcs() ?

Also (even if it is), we're basically calling and executing the
following (called by vmx_pi_blocking_cleanup()):

static void vmx_pi_remove_vcpu_from_blocking_list(struct vcpu *v)
{
    unsigned long flags;
    spinlock_t *pi_blocking_list_lock;
    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;

    /*
     * Set 'NV' field back to posted_intr_vector, so the
     * Posted-Interrupts can be delivered to the vCPU when
     * it is running in non-root mode.
     */
    write_atomic(&pi_desc->nv, posted_intr_vector);

    /* The vCPU is not on any blocking list. */
    pi_blocking_list_lock = v->arch.hvm_vmx.pi_blocking.lock;

    /* Prevent the compiler from eliminating the local variable.*/
    smp_rmb();

    if ( pi_blocking_list_lock == NULL )
        return;

    spin_lock_irqsave(pi_blocking_list_lock, flags);

    /*
     * v->arch.hvm_vmx.pi_blocking.lock == NULL here means the vCPU
     * was removed from the blocking list while we are acquiring the lock.
     */
    if ( v->arch.hvm_vmx.pi_blocking.lock != NULL )
    {
        ASSERT(v->arch.hvm_vmx.pi_blocking.lock == pi_blocking_list_lock);
        list_del(&v->arch.hvm_vmx.pi_blocking.list);
        v->arch.hvm_vmx.pi_blocking.lock = NULL;
    }

    spin_unlock_irqrestore(pi_blocking_list_lock, flags);
}

Considering that we're destroying, isn't this too much? Maybe it's not
a big deal, but I'd have expected that all is needed here is something
like:

 if ( v->arch.hvm_vmx.pi_blocking.lock )
 {
     spin_lock_irqsave(..);
     list_del(..);
     spin_unlock_irqrestore(..);
 }

Maybe the resume, list_remove, and cleanup functions need to be broken
up a bit more/better?

Also, as a side note (which I think would be more appropriate as a
comment to patch 1, but bear with me, I'm just back from vacations, I
have a lot of catch up to do, and I'm in hurry! :-P), now that the
function is called vmx_pi_remove_vcpu_from_blocking_list(), this
comment being part of its body sounds a bit weird:

    ...
    /* The vCPU is not on any blocking list. */
    pi_blocking_list_lock = v->arch.hvm_vmx.pi_blocking.lock;
    ...

I'd take the chance for rephrasing it.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v3 1/6] VMX: Statically assign two PI hooks
  2016-09-06  8:42   ` Dario Faggioli
@ 2016-09-06  9:53     ` Wu, Feng
  0 siblings, 0 replies; 50+ messages in thread
From: Wu, Feng @ 2016-09-06  9:53 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: george.dunlap, andrew.cooper3, Tian, Kevin, Wu, Feng, jbeulich



> -----Original Message-----
> From: Dario Faggioli [mailto:dario.faggioli@citrix.com]
> Sent: Tuesday, September 6, 2016 4:43 PM
> To: Wu, Feng <feng.wu@intel.com>; xen-devel@lists.xen.org
> Cc: Tian, Kevin <kevin.tian@intel.com>; george.dunlap@eu.citrix.com;
> andrew.cooper3@citrix.com; jbeulich@suse.com
> Subject: Re: [Xen-devel] [PATCH v3 1/6] VMX: Statically assign two PI hooks
> 
> On Wed, 2016-08-31 at 11:56 +0800, Feng Wu wrote:
> > PI hooks: vmx_pi_switch_from() and vmx_pi_switch_to() are
> > needed even all the assigned devices were dettached from
> > the domain.
> >
> maybe "are needed even when any previously passed through device is
> detached from the domain" (or something like that)?

Looks good, thanks for improving the wording.

> 
> > We change the state of SN bit in these two
> > functions, and evaluate this bit in vmx_deliver_posted_intr()
> > when trying to deliver the interrupt in posted way via software.
> > The problem is if we deassign the hooks while the vCPU is runnable
> > in the runqueue with 'SN' set, all the furture notificaton event
> > will be suppressed. This patch makes these two hooks statically
> > assigned.
> >
> Which, if SN is used only for controlling VT-d PI from passed thru
> devices does not sound like an issue to me.
> 
> What I sort of get from the discussion you had with Jan, however, is
> that this is an issue, because SN is also used for other things, i.e.,
> it is indeed useful even when there are no passed thru device, is that
> the case?
> 
> If yes, I think this deserves at least a quick mention in the sentence
> above.

Yes, SN controls all PI activity, including the CPU-side PI activity;
sure, I will explicitly add that information.

Thanks,
Feng

> 
> Regards,
> Dario
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: [PATCH v3 3/6] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed
  2016-09-06  9:21   ` Dario Faggioli
@ 2016-09-06 23:27     ` Wu, Feng
  0 siblings, 0 replies; 50+ messages in thread
From: Wu, Feng @ 2016-09-06 23:27 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel
  Cc: george.dunlap, andrew.cooper3, Tian, Kevin, Wu, Feng, jbeulich



> -----Original Message-----
> From: Dario Faggioli [mailto:dario.faggioli@citrix.com]
> Sent: Tuesday, September 6, 2016 5:22 PM
> To: Wu, Feng <feng.wu@intel.com>; xen-devel@lists.xen.org
> Cc: Tian, Kevin <kevin.tian@intel.com>; george.dunlap@eu.citrix.com;
> andrew.cooper3@citrix.com; jbeulich@suse.com
> Subject: Re: [Xen-devel] [PATCH v3 3/6] VMX: Cleanup PI per-cpu blocking list
> when vcpu is destroyed
> 
> On Wed, 2016-08-31 at 11:56 +0800, Feng Wu wrote:
> > We should remove the vCPU from the per-cpu blocking list
> > if it is going to be destroyed.
> >
> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> > ---
> >  xen/arch/x86/hvm/vmx/vmx.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> > index b869728..37fa2f1 100644
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -346,6 +346,7 @@ static void vmx_vcpu_destroy(struct vcpu *v)
> >      vmx_destroy_vmcs(v);
> >      vpmu_destroy(v);
> >      passive_domain_destroy(v);
> > +    vmx_pi_blocking_cleanup(v);
> >
> I'm not too much into VMX, so I may be wrong (in which case, sorry),
> but is it safe to call this after vmx_destroy_vmcs() ?

It is safe, since vmx_pi_blocking_cleanup(v) doesn't deal with any
real VMCS stuff; it just handles the PI blocking queue. But your
doubt here is valid: maybe it looks more reasonable to move it
before vmx_destroy_vmcs().

> 
> Also (even if it is), we're basically calling and executing the
> following (called by vmx_pi_blocking_cleanup()):
> 
> static void vmx_pi_remove_vcpu_from_blocking_list(struct vcpu *v)
> {
>     unsigned long flags;
>     spinlock_t *pi_blocking_list_lock;
>     struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> 
>     /*
>      * Set 'NV' field back to posted_intr_vector, so the
>      * Posted-Interrupts can be delivered to the vCPU when
>      * it is running in non-root mode.
>      */
>     write_atomic(&pi_desc->nv, posted_intr_vector);
> 
>     /* The vCPU is not on any blocking list. */
>     pi_blocking_list_lock = v->arch.hvm_vmx.pi_blocking.lock;
> 
>     /* Prevent the compiler from eliminating the local variable.*/
>     smp_rmb();
> 
>     if ( pi_blocking_list_lock == NULL )
>         return;
> 
>     spin_lock_irqsave(pi_blocking_list_lock, flags);
> 
>     /*
>      * v->arch.hvm_vmx.pi_blocking.lock == NULL here means the vCPU
>      * was removed from the blocking list while we are acquiring the lock.
>      */
>     if ( v->arch.hvm_vmx.pi_blocking.lock != NULL )
>     {
>         ASSERT(v->arch.hvm_vmx.pi_blocking.lock == pi_blocking_list_lock);
>         list_del(&v->arch.hvm_vmx.pi_blocking.list);
>         v->arch.hvm_vmx.pi_blocking.lock = NULL;
>     }
> 
>     spin_unlock_irqrestore(pi_blocking_list_lock, flags);
> }
> 
> Considering that we're destroying, isn't this too much? Maybe it's not
> a big deal, but I'd have expected that all is needed here is something
> like:
> 
>  if ( v->arch.hvm_vmx.pi_blocking.lock )
>  {
>      spin_lock_irqsave(..);
>      list_del(..);
>      spin_unlock_irqrestore(..);
>  }
> 
> Maybe the resume, list_remove, and cleanup functions need to be broken
> up a bit more/better?

Actually, this function carefully handles some cases, since
'v->arch.hvm_vmx.pi_blocking.lock' can be set to NULL in other places,
such as in pi_wakeup_interrupt(). So even when we are in the destroy path,
we still need to save it to a local variable and check its value, then
delete the vCPU from the list if needed. So if we really want to eliminate
something in this function for the destroy path, maybe we don't need to
restore the NV field. If that is the case, it seems this is not a big deal :)

> 
> Also, as a side note (which I think would be more appropriate as a
> comment to patch 1, but bear with me, I'm just back from vacations, I
> have a lot of catch up to do, and I'm in hurry! :-P), now that the
> function is called vmx_pi_remove_vcpu_from_blocking_list(), this
> comment being part of its body sounds a bit weird:
> 
>     ...
>     /* The vCPU is not on any blocking list. */
>     pi_blocking_list_lock = v->arch.hvm_vmx.pi_blocking.lock;
>     ...
> 
> I'd take the chance for rephrasing it.

Oh, yes, this needs some adjustment. Thanks for the comments!

Thanks,
Feng

> 
> Regards,
> Dario
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
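The double-checked pattern Feng describes can be sketched as follows. This is a single-threaded toy model with hypothetical names (a real implementation would take the spinlock where the comments indicate, and the concurrent writer would be pi_wakeup_interrupt()):

```c
#include <assert.h>
#include <stddef.h>

struct blocking {
    int *lock;      /* stands in for v->arch.hvm_vmx.pi_blocking.lock */
    int on_list;    /* stands in for being linked on the per-cpu list */
};

/* Remove the vCPU from the blocking list if it is still on one.
 * Returns 1 if this caller performed the removal, 0 otherwise. */
int pi_list_remove(struct blocking *b)
{
    int removed = 0;
    int *lock = b->lock;     /* snapshot: a concurrent waker may NULL
                              * the field behind our back */

    if ( lock == NULL )
        return 0;            /* already off every blocking list */

    /* spin_lock_irqsave(lock, flags); would be taken here */
    if ( b->lock != NULL )   /* re-check under the lock */
    {
        assert(b->lock == lock);
        b->on_list = 0;      /* list_del(...) */
        b->lock = NULL;
        removed = 1;
    }
    /* spin_unlock_irqrestore(lock, flags); */
    return removed;
}
```

The snapshot plus re-check is why the simplified "just lock and list_del" variant is not quite enough: between reading the lock pointer and acquiring the lock, another path may already have taken the vCPU off the list and cleared the field.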


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-05  3:11                         ` Wu, Feng
  2016-09-05  9:27                           ` Jan Beulich
@ 2016-09-14  2:23                           ` Wu, Feng
  2016-09-14  8:46                             ` Jan Beulich
  2016-09-14 14:51                             ` Dario Faggioli
  1 sibling, 2 replies; 50+ messages in thread
From: Wu, Feng @ 2016-09-14  2:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Wu, Feng
> Sent: Monday, September 5, 2016 11:12 AM
> To: Jan Beulich <JBeulich@suse.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org; Wu, Feng <feng.wu@intel.com>
> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> 
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: Friday, September 2, 2016 9:55 PM
> > To: Wu, Feng <feng.wu@intel.com>
> > Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> > george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> > devel@lists.xen.org
> > Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> > PI hooks
> >
> > >>> On 02.09.16 at 15:15, <feng.wu@intel.com> wrote:
> >
> > >
> > >> -----Original Message-----
> > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> Sent: Friday, September 2, 2016 6:46 PM
> > >> To: Wu, Feng <feng.wu@intel.com>
> > >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> > >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> > >> devel@lists.xen.org
> > >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
> > assigning
> > >> PI hooks
> > >>
> > >> >>> On 02.09.16 at 12:30, <feng.wu@intel.com> wrote:
> > >>
> > >> >
> > >> >> -----Original Message-----
> > >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> >> Sent: Friday, September 2, 2016 5:26 PM
> > >> >> To: Wu, Feng <feng.wu@intel.com>
> > >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> > >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> > >> >> devel@lists.xen.org
> > >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
> > >> assigning
> > >> >> PI hooks
> > >> >>
> > >> >> >>> On 02.09.16 at 10:40, <feng.wu@intel.com> wrote:
> > >> >>
> > >> >> >
> > >> >> >> -----Original Message-----
> > >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> >> >> Sent: Friday, September 2, 2016 4:16 PM
> > >> >> >> To: Wu, Feng <feng.wu@intel.com>
> > >> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> > >> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>;
> > xen-
> > >> >> >> devel@lists.xen.org
> > >> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
> > >> >> assigning
> > >> >> >> PI hooks
> > >> >> >>
> > >> >> >> >>> On 02.09.16 at 09:31, <feng.wu@intel.com> wrote:
> > >> >> >>
> > >> >> >> >
> > >> >> >> >> -----Original Message-----
> > >> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> >> >> >> Sent: Friday, September 2, 2016 3:04 PM
> > >> >> >> >> To: Wu, Feng <feng.wu@intel.com>
> > >> >> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> > >> >> >> >> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>;
> > >> xen-
> > >> >> >> >> devel@lists.xen.org
> > >> >> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain
> > before/after
> > >> >> >> assigning
> > >> >> >> >> PI hooks
> > >> >> >> >>
> > >> >> >> >> >>> On 02.09.16 at 03:46, <feng.wu@intel.com> wrote:
> > >> >> >> >>
> > >> >> >> >> >
> > >> >> >> >> >> -----Original Message-----
> > >> >> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> >> >> >> >> Sent: Thursday, September 1, 2016 4:30 PM
> > >> >> >> >> >> To: Wu, Feng <feng.wu@intel.com>
> > >> >> >> >> >> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> > >> >> >> >> >> george.dunlap@eu.citrix.com; Tian, Kevin
> > <kevin.tian@intel.com>;
> > >> >> xen-
> > >> >> >> >> >> devel@lists.xen.org
> > >> >> >> >> >> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain
> > >> before/after
> > >> >> >> >> assigning
> > >> >> >> >> >> PI hooks
> > >> >> >> >> >>
> > >> >> >> >> >> >>> On 31.08.16 at 05:56, <feng.wu@intel.com> wrote:
> > >> >> >> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > >> >> >> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > >> >> >> >> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct
> > domain
> > >> *d)
> > >> >> >> >> >> >
> > >> >> >> >> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> > >> >> >> >> >> >
> > >> >> >> >> >> > +    /*
> > >> >> >> >> >> > +     * Pausing the domain can make sure the vCPU is not
> > >> >> >> >> >> > +     * running and hence calling the hooks simultaneously
> > >> >> >> >> >> > +     * when deassigning the PI hooks. This makes sure that
> > >> >> >> >> >> > +     * all the appropriate state of PI descriptor is actually
> > >> >> >> >> >> > +     * set up for all vCPus before leaving this function.
> > >> >> >> >> >> > +     */
> > >> >> >> >> >> > +    domain_pause(d);
> > >> >> >> >> >> > +
> > >> >> >> >> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> > >> >> >> >> >> >      d->arch.hvm_domain.vmx.pi_do_resume =
> > vmx_pi_do_resume;
> > >> >> >> >> >> > +
> > >> >> >> >> >> > +    domain_unpause(d);
> > >> >> >> >> >> >  }
> > >> >> >> >> >>
> > >> >> >> >> >> First of all I'm missing a word on whether the race mentioned in
> > >> >> >> >> >> the description and comment can actually happen. Device
> > >> >> >> >> >> (de)assignment should already be pretty much serialized (via
> > >> >> >> >> >> the domctl lock, and maybe also via the pcidevs one).
> > >> >> >> >> >
> > >> >> >> >> > The purpose of this patch is to address the race condition that
> > >> >> >> >> > the _vCPU_ is running while we are installing these hooks. Do
> you
> > >> >> >> >> > think this cannot happen?  This patch is trying to fix the issue
> > >> >> >> >> > described at:
> > >> >> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229
> > >> >> >> >> > Consider that the other two hooks were installed when the VM
> > >> >> >> >> > is created, seems no such race condition. However, according
> > >> >> >> >> > to the discussion about patch 1 and patch 2 of series, we need
> > >> >> >> >> > to install the other two hooks here as well,
> > >> >> >> >>
> > >> >> >> >> I don't think we've agreed that the creation time installation of
> > >> >> >> >> those hooks is actually necessary. In fact your most recent
> > >> >> >> >> response to patch 1 makes me think you now agree we don't
> > >> >> >> >> need to do so. And hence with that precondition not holding
> > >> >> >> >> anymore, I don't think the conclusion does.
> > >> >> >> >
> > >> >> >> > I think there might be some confusion. Let me explain what I
> > >> >> >> > am think of to make sure we are on the same page:
> > >> >> >> > 1. We need install all the four hooks when the first device is
> > >> >> >> > assigned.
> > >> >> >> > 2. If _1_ is true, the issue described in
> > >> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229
> > >> >> >> > exists.
> > >> >> >>
> > >> >> >> If you mean this
> > >> >> >>
> > >> >> >> * vcpu 0 starts running on a pcpu
> > >> >> >> * a device is assigned, causing the hooks to be set
> > >> >> >> * an interrupt from the device is routed to vcpu 0, but it is not
> > >> >> >> actually delivered properly, since ndst is not pointing to the right
> > >> >> >> processor.
> > >> >> >>
> > >> >> >> raised by George, then I'm not convinced it can happen (after all, the
> > >> >> >> hooks get set _before_ the device gets assigned, and hence before
> > >> >> >> the device can raise an interrupt destined at the guest). And if it can
> > >> >> >> happen, then rather than pausing the guest I don't see why, along
> > >> >> >> with setting the hooks, any possibly affected NDST field can't be
> > >> >> >> programmed correctly. ISTR having recommended something like
> > >> >> >> this already during review of the series originally introducing PI.
> > >> >> >
> > >> >> > Actually here is the scenario I am concerned about:
> > >> >> > 1. ' vmx_vcpu_block ' is installed while vCPU is running vcpu_block()
> > >> >> > and then vmx_vcpu_block().
> > >> >> > 2. There is a ASSERT() about 'NDST' field in vmx_vcpu_block(), I think
> > >> >> > we may hit the ASSERT() since 'NDST' may not have been set to the
> > >> >> > current processor yet.
> > >> >> >
> > >> >> > My previous solution in v2 is to delete that ASSERT(), but seems you
> > >> >> > guys don't like it. So here I use this new method in v3 to make sure
> > >> >> > the vCPU is running while we are installing the hooks.
> > >> >>
> > >> >> Indeed, deleting the assertion doesn't seem right. But then why
> > >> >> can't vmx_vcpu_block() bail early when the domain has no devices
> > >> >> assigned? That would allow for
> > >> >>
> > >> >> 1) set blocking hook
> > >> >> 2) set up PI state
> > >> >> 3) actually assign device
> > >> >
> > >> > Do you mean we check has_arch_pdev() in vmx_vcpu_block(), if it is
> > >> > false, we return early? But has_arch_pdev() needs hold
> > >> > _pcidevs_lock, right?
> > >>
> > >> I don't think you strictly need to: list_empty() will reliably return
> > >> true in the case of interest here. And possible races when the last
> > >> device gets removed are - afaics - benign (i.e. it doesn't matter
> > >> what it returns at that point in time).
> > >
> > > I remind of another case:
> > > 1. The four hooks are installed.
> > > 2. vmx_vcpu_block() gets called before other three hooks gets called,
> > > even if a device has already been assigned to the domain. We may still
> > > hit the ASSERT() in vmx_vcpu_block() since 'NDST' field is changed in
> > > the other hooks.
> >
> > I don't understand: Step 2) in what I've outline above would make
> > sure NDST is set correctly. Perhaps one should even reverse 2) and
> > 1).
> 
> I still have trouble fully understanding this; please see the following
> scenario:
> 
> 1) Set 'NDST' to pCPU0 (which the vCPU is currently running on, but
> that may change, since vCPU scheduling happens behind our backs, so
> how do we decide which pCPU to use for 'NDST'?)
> 2) Install all four hooks; the vCPU is re-scheduled before
> 'vmx_pi_switch_to()' gets installed, so the pCPU of the vCPU might
> be changed to pCPU1 without 'NDST' being updated.
> 3) vmx_vcpu_block() gets called and we hit the ASSERT()
> 
> Maybe I missed something; it would be appreciated if you could
> correct me if my understanding is wrong.

My email system had some problems, hence some emails didn't arrive in
my inbox, but I found your reply on the Internet, quoted below:

" Well, the problem is that you keep trying to find issues with the
(simplified) sequence of things I'm proposing to investigate as
an alternative. The above scenario _again_ can be dealt with
without having to pause the guest: Install the context switch
hook(s) first, then set NDST, then set the block hook. So what
I'd really like to ask you to do is first try to find a model without
pausing, and only if you figure that's impossible, then explain
why and we'll go with the pausing approach.

Jan"

Then I tried to implement the function like the following:

/* This function is called when pcidevs_lock is held */
void vmx_pi_hooks_assign(struct domain *d)
{
    struct vcpu *v;

    if ( !iommu_intpost || !has_hvm_container_domain(d) )
        return;

    ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);

    /*
     * We carefully handle the timing here:
     * - Install the context switch hooks first
     * - Then set the NDST field
     * - Install the block and resume hooks at the end
     *
     * This ensures the PI descriptor (especially the NDST field)
     * is in a proper state when vmx_vcpu_block() is called.
     */
    d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
    d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;

    for_each_vcpu ( d, v )
    {
        unsigned int dest = cpu_physical_id(v->processor);
        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;

        /* spot 1 */

        write_atomic(&pi_desc->ndst,
                     x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
    }

    d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
    d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
}

However, I still think it is problematic. Consider _spot 1_ above: we read
the pCPU the vCPU is on and derive NDST from it, but the vCPU can happily
be rescheduled to another pCPU before the update lands. The problem is
that it is not safe to update NDST here, since the vCPU can be scheduled
behind our backs at any time. If that is the case, it is hard to do this
safely without a guarantee that the vCPU is in a known state. Any further
advice on this? Thanks a lot!

Thanks,
Feng

> 
> Thanks,
> Feng
> 
> >
> > > And that is another reason I use pause/unpause here, it can address
> > > all the races. And this is an one-time action (Only occurs at the first
> > > device gets assigned), do you think it is that unacceptable?
> >
> > No, I've never said it's unacceptable. I just want such to not be
> > added without good reason.
> >
> > Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-14  2:23                           ` Wu, Feng
@ 2016-09-14  8:46                             ` Jan Beulich
  2016-09-14 14:51                             ` Dario Faggioli
  1 sibling, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2016-09-14  8:46 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 14.09.16 at 04:23, <feng.wu@intel.com> wrote:
>> From: Wu, Feng
>> Sent: Monday, September 5, 2016 11:12 AM
>> I still have trouble fully understanding this; please see the following
>> scenario:
>> 
>> 1) Set 'NDST' to pCPU0 (which the vCPU is currently running on, but
>> that may change, since vCPU scheduling happens behind our backs, so
>> how do we decide which pCPU to use for 'NDST'?)
>> 2) Install all four hooks; the vCPU is re-scheduled before
>> 'vmx_pi_switch_to()' gets installed, so the pCPU of the vCPU might
>> be changed to pCPU1 without 'NDST' being updated.
>> 3) vmx_vcpu_block() gets called and we hit the ASSERT()
>> 
>> Maybe I missed something; it would be appreciated if you could
>> correct me if my understanding is wrong.
> 
> My email system had some problems, hence some emails didn't go
> to my index, but I found your replay from the Internet as below:
> 
> " Well, the problem is that you keep trying to find issues with the
> (simplified) sequence of things I'm proposing to investigate as
> an alternative. The above scenario _again_ can be dealt with
> without having to pause the guest: Install the context switch
> hook(s) first, then set NDST, then set the block hook. So what
> I'd really like to ask you to do is first try to find a model without
> pausing, and only if you figure that's impossible, then explain
> why and we'll go with the pausing approach.
> 
> Jan"
> 
> Then I tried to implement the function like the following:
> 
> /* This function is called when pcidevs_lock is held */
> void vmx_pi_hooks_assign(struct domain *d)
> {
>     if ( !iommu_intpost || !has_hvm_container_domain(d) )
>         return;
> 
>     ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> 
>     /*
>      * We carefully handle the timing here:
>      * - Install the context switch first
>      * - Then set the NDST field
>      * - Install the block and resume hooks in the end
>      *
>      * This can make sure the PI (especially the NDST feild) is
>      * in proper state when we call vmx_vcpu_block().
>      */
>     d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
>     d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
> 
>     for_each_vcpu ( d, v )
>     {
>         unsigned int dest = cpu_physical_id(v->processor);
>         struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> 
>         /* spot 1 */
> 
>         write_atomic(&pi_desc->ndst,
>                      x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
>     }
> 
>     d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>     d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> }
> 
> However, I still think it is problematic. Consider the _spot 1_ above, we get
> the pCPU of the vCPU and update the NDST, but the vCPU can be happily
> rescheduled to another pCPU before updating the NDST. The problem is
> that it is not safe to update NDST here, since vCPU can be scheduled
> behind us at any time. And if this is the case, it is hard to safely do this
> without guarantee the vCPU is in a known state. Any further advice
> on this? Thanks a lot!

Well, yes, there still is a problem, but (sadly) also yes, you continue
to follow the pattern I mentioned in my earlier reply: Instead of trying
to find a suitable ordering of things, you simply complain about the
(necessarily simplified) suggestion I gave. For example, is
v->arch.hvm_vmx.pi_desc perhaps in a known never-touched state
upon entering vmx_pi_hooks_assign() the very first time? In that case
instead of the write_atomic() you use in the code above you could
cmpxchg() against that known initial value, avoiding the write here if
one has already happened in vmx_pi_switch_to(). (And before you
ask, please also consider writing a known-invalid NDST before assigning
the context switch hooks, and cmpxchg() against that one, in case
the initial state isn't sufficiently distinguishable. Using an invalid value
here shouldn't matter as no notifications should arrive prior to any
device getting assigned to the guest.)

Jan


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-14  2:23                           ` Wu, Feng
  2016-09-14  8:46                             ` Jan Beulich
@ 2016-09-14 14:51                             ` Dario Faggioli
  2016-09-18  8:37                               ` Wu, Feng
  1 sibling, 1 reply; 50+ messages in thread
From: Dario Faggioli @ 2016-09-14 14:51 UTC (permalink / raw)
  To: Wu, Feng, Jan Beulich
  Cc: george.dunlap, andrew.cooper3, Tian, Kevin, xen-devel



On Wed, 2016-09-14 at 02:23 +0000, Wu, Feng wrote:
> Then I tried to implement the function like the following:
> 
> /* This function is called when pcidevs_lock is held */
> void vmx_pi_hooks_assign(struct domain *d)
> {
>     if ( !iommu_intpost || !has_hvm_container_domain(d) )
>         return;
> 
>     ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> 
>     /*
>      * We carefully handle the timing here:
>      * - Install the context switch first
>      * - Then set the NDST field
>      * - Install the block and resume hooks in the end
>      *
>      * This can make sure the PI (especially the NDST feild) is
>      * in proper state when we call vmx_vcpu_block().
>      */
>     d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
>     d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
> 
>     for_each_vcpu ( d, v )
>     {
>         unsigned int dest = cpu_physical_id(v->processor);
>         struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> 
>         /* spot 1 */
> 
>         write_atomic(&pi_desc->ndst,
>                      x2apic_enabled ? dest : MASK_INSR(dest,
> PI_xAPIC_NDST_MASK));
>     }
> 
>     d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>     d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> }
> 
> However, I still think it is problematic. Consider the _spot 1_
> above, we get
> the pCPU of the vCPU and update the NDST, but the vCPU can be happily
> rescheduled to another pCPU before updating the NDST. The problem is
> that it is not safe to update NDST here, since vCPU can be scheduled
> behind us at any time. And if this is the case, it is hard to safely
> do this
> without guarantee the vCPU is in a known state. Any further advice
> on this? Thanks a lot!
> 
So, I'm sorry if this is me missing or not remembering something... but
I do remember that, at some point, in the early days of this series,
there were concerns about the use of v->processor behind the back of
the scheduler, i.e., without holding the proper scheduler related
locks.

Pausing the domain was not even being remotely considered at the time,
it (again, at least AFAICR) came up later for trying to address other
issues.

No, the whole point of why that was not a problem in the first place is
that what counts here is the wait list of which pCPU the vCPU is put on,
not really where the vCPU is being or has last been scheduled. Of
course it'd be better --and it would also be true most of the time--
if there were a match, but that was not a correctness concern.

So why is this all of a sudden becoming one? Am I completely off with
my recollection (or in general :-P)? Or what am I missing about the
issue we are trying to address with these new bits of the work?

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-14 14:51                             ` Dario Faggioli
@ 2016-09-18  8:37                               ` Wu, Feng
  2016-09-19 23:12                                 ` Dario Faggioli
  0 siblings, 1 reply; 50+ messages in thread
From: Wu, Feng @ 2016-09-18  8:37 UTC (permalink / raw)
  To: Dario Faggioli, Jan Beulich
  Cc: george.dunlap, andrew.cooper3, Tian, Kevin, Wu, Feng, xen-devel

Hi Dario,

> -----Original Message-----
> From: Dario Faggioli [mailto:dario.faggioli@citrix.com]
> Sent: Wednesday, September 14, 2016 10:52 PM
> To: Wu, Feng <feng.wu@intel.com>; Jan Beulich <JBeulich@suse.com>
> Cc: andrew.cooper3@citrix.com; george.dunlap@eu.citrix.com; Tian, Kevin
> <kevin.tian@intel.com>; xen-devel@lists.xen.org
> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> On Wed, 2016-09-14 at 02:23 +0000, Wu, Feng wrote:
> > Then I tried to implement the function like the following:
> >
> > /* This function is called when pcidevs_lock is held */
> > void vmx_pi_hooks_assign(struct domain *d)
> > {
> >     if ( !iommu_intpost || !has_hvm_container_domain(d) )
> >         return;
> >
> >     ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> >
> >     /*
> >      * We carefully handle the timing here:
> >      * - Install the context switch first
> >      * - Then set the NDST field
> >      * - Install the block and resume hooks in the end
> >      *
> >      * This can make sure the PI (especially the NDST feild) is
> >      * in proper state when we call vmx_vcpu_block().
> >      */
> >     d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
> >     d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
> >
> >     for_each_vcpu ( d, v )
> >     {
> >         unsigned int dest = cpu_physical_id(v->processor);
> >         struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> >
> >         /* spot 1 */
> >
> >         write_atomic(&pi_desc->ndst,
> >                      x2apic_enabled ? dest : MASK_INSR(dest,
> > PI_xAPIC_NDST_MASK));
> >     }
> >
> >     d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> >     d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> > }
> >
> > However, I still think it is problematic. Consider the _spot 1_
> > above, we get
> > the pCPU of the vCPU and update the NDST, but the vCPU can be happily
> > rescheduled to another pCPU before updating the NDST. The problem is
> > that it is not safe to update NDST here, since vCPU can be scheduled
> > behind us at any time. And if this is the case, it is hard to safely
> > do this
> > without guarantee the vCPU is in a known state. Any further advice
> > on this? Thanks a lot!
> >
> So, I'm sorry if this is me missing or not remembering something... but
> I do remember that, at some point, in the early days of this series,
> there were concerns about the use of v->processor behind the back of
> the scheduler, i.e., without holding the proper scheduler related
> locks.
> 
> Pausing the domain was not even being remotely considered at the time,
> it (again, at least AFAICR) came up later for trying to address other
> issues.
> 
> No, the whole point of why that was not a problem in the first place is
> that what counts here is on the wait list of what pcpu the vcpu is put,
> not really where the vcpu is being or has been scheduled last. Of
> course it'd be better --and it would also be true most of the times--
> if there where a match, but that was not a correctness concern.
> 
> So why this is all of the sudden becoming one? Am I completely off with
> my recollection (or in general :-P)? Or what am I missing about the
> issue we are trying to address with this new bits of the work?

The issue discussed between Jan and me is that now we have four
PI hooks: vmx_pi_switch_from, vmx_pi_switch_to, vmx_vcpu_block,
and vmx_pi_do_resume. The previous assumption in vmx_vcpu_block()
is that when we are running this function, the NDST field should have
the same meaning with the current pCPU the vCPU is running on.
However, I found this is not true in some scenarios, such as,
vmx_pi_switch_to() hasn't been installed or executed before
vmx_vcpu_block() gets called, in which case, we may hit the ASSERT
in it. In previous version, I suggested we remove the ASSERT in
vmx_vcpu_block() and set NDST explicitly in it. But seems maintainer
doesn't like this idea. So currently we need to make sure the PI is
in proper state before the hooks get called, that is why I used pause/
unpause mechanism.

Thanks,
Feng

> 
> Thanks and Regards,
> Dario
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-18  8:37                               ` Wu, Feng
@ 2016-09-19 23:12                                 ` Dario Faggioli
  2016-09-20  0:48                                   ` Wu, Feng
  2016-09-20  7:31                                   ` Jan Beulich
  0 siblings, 2 replies; 50+ messages in thread
From: Dario Faggioli @ 2016-09-19 23:12 UTC (permalink / raw)
  To: Wu, Feng, Jan Beulich
  Cc: george.dunlap, andrew.cooper3, Tian, Kevin, xen-devel



On Sun, 2016-09-18 at 08:37 +0000, Wu, Feng wrote:
> > From: Dario Faggioli [mailto:dario.faggioli@citrix.com]
> > So why this is all of the sudden becoming one? Am I completely off
> > with
> > my recollection (or in general :-P)? Or what am I missing about the
> > issue we are trying to address with this new bits of the work?
> 
> The issue discussed between Jan and me is that now we have four
> PI hooks: vmx_pi_switch_from, vmx_pi_switch_to, vmx_vcpu_block,
> and vmx_pi_do_resume. The previous assumption in vmx_vcpu_block()
> is that when we are running this function, the NDST field should have
> the same meaning with the current pCPU the vCPU is running on.
>
I'm sorry, but I'm still not getting it. Feel free to drop me, if I'm
doing more harm than good, but really, I can't parse this sentence into
something that makes me better understand the problem at hand.

"The previous assumption": "previous" with respect to what?

"the field should have the same meaning with the current pCPU the vCPU
is running on", what's this meaning you're mentioning, and how would it
be the same?

> However, I found this is not true in some scenarios, such as,
> vmx_pi_switch_to() hasn't been installed or executed before
> vmx_vcpu_block() gets called, in which case, we may hit the ASSERT
> in it. In previous version, I suggested we remove the ASSERT in
> vmx_vcpu_block() and set NDST explicitly in it. But seems maintainer
> doesn't like this idea. 
>
And this is not helping either. An ASSERT() being hit means something
wrong happened. Whether or not to remove (or change) an ASSERT() is not
a matter of "like".

If we hit the ASSERT() but nothing is wrong with the code, then it is
the ASSERT() itself that is wrong (buggy), and we must remove it, like
it or not.

OTOH, if we hit a non buggy ASSERT(), then it means that the ASSERT()
has done its job in finding something wrong in the code, and we should
leave it alone and fix the problem.

How is it possible for the solution here to be "either remove the ASSERT()
_OR_ change the code"? That really makes little sense to me... :-O

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-19 23:12                                 ` Dario Faggioli
@ 2016-09-20  0:48                                   ` Wu, Feng
  2016-09-20  7:31                                   ` Jan Beulich
  1 sibling, 0 replies; 50+ messages in thread
From: Wu, Feng @ 2016-09-20  0:48 UTC (permalink / raw)
  To: Dario Faggioli, Jan Beulich
  Cc: george.dunlap, andrew.cooper3, Tian, Kevin, Wu, Feng, xen-devel



> -----Original Message-----
> From: Dario Faggioli [mailto:dario.faggioli@citrix.com]
> Sent: Tuesday, September 20, 2016 7:12 AM
> To: Wu, Feng <feng.wu@intel.com>; Jan Beulich <JBeulich@suse.com>
> Cc: andrew.cooper3@citrix.com; george.dunlap@eu.citrix.com; Tian, Kevin
> <kevin.tian@intel.com>; xen-devel@lists.xen.org
> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> On Sun, 2016-09-18 at 08:37 +0000, Wu, Feng wrote:
> > > From: Dario Faggioli [mailto:dario.faggioli@citrix.com]
> > > So why this is all of the sudden becoming one? Am I completely off
> > > with
> > > my recollection (or in general :-P)? Or what am I missing about the
> > > issue we are trying to address with this new bits of the work?
> >
> > The issue discussed between Jan and me is that now we have four
> > PI hooks: vmx_pi_switch_from, vmx_pi_switch_to, vmx_vcpu_block,
> > and vmx_pi_do_resume. The previous assumption in vmx_vcpu_block()
> > is that when we are running this function, the NDST field should have
> > the same meaning with the current pCPU the vCPU is running on.
> >
> I'm sorry, but I'm still not getting it. Feel free to drop me, if I'm
> doing more harm than good, but really, I can't parse this sentence into
> something that makes me better understand the problem at hand.
> 
> "The previous assumption": "previous" with respect to what?
> 
> "the field should have the same meaning with the current pCPU the vCPU
> is running on", what's this meaning you're mentioning, and how would it
> be the same?

Sorry for my bad description. I mean that in the current code there is an
ASSERT in vmx_vcpu_block():

ASSERT(pi_desc->ndst ==
	(x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK)))

So we assume that the value of the NDST field is derived from the pCPU
the vCPU is currently running on.

> 
> > However, I found this is not true in some scenarios, such as,
> > vmx_pi_switch_to() hasn't been installed or executed before
> > vmx_vcpu_block() gets called, in which case, we may hit the ASSERT
> > in it. In previous version, I suggested we remove the ASSERT in
> > vmx_vcpu_block() and set NDST explicitly in it. But seems maintainer
> > doesn't like this idea.
> >
> And this is not helping either. An ASSERT() being hit means something
> wrong happened. Whether or not to remove (or change) an ASSERT() is not
> a matter of "like".
> 
> If we hit the ASSERT() but nothing is wrong with the code, then it is
> the ASSERT() itself that is wrong (buggy), and we must remove it, like
> it or not.
> 
> OTOH, if we hit a non buggy ASSERT(), then it means that the ASSERT()
> has done its job in finding something wrong in the code, and we should
> leave it alone and fix the problem.

Yes, this is what we are doing now. We think the ASSERT() should still
be there, and we will try to fix the issue elsewhere to make sure we do
not hit the ASSERT() here.

> 
> How's possible for the solution here to be "either remove the ASSERT()
> _OR_ change the code"? That really makes few sense to me... :-O

As mentioned above, I will keep the ASSERT() and fix the issue.
Thanks for the comments!

Thanks,
Feng

> 
> Thanks and Regards,
> Dario
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-19 23:12                                 ` Dario Faggioli
  2016-09-20  0:48                                   ` Wu, Feng
@ 2016-09-20  7:31                                   ` Jan Beulich
  2016-09-20  7:53                                     ` Wu, Feng
  2016-09-20  8:13                                     ` Dario Faggioli
  1 sibling, 2 replies; 50+ messages in thread
From: Jan Beulich @ 2016-09-20  7:31 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: george.dunlap, andrew.cooper3, Kevin Tian, Feng Wu, xen-devel

>>> On 20.09.16 at 01:12, <dario.faggioli@citrix.com> wrote:
> On Sun, 2016-09-18 at 08:37 +0000, Wu, Feng wrote:
>> > From: Dario Faggioli [mailto:dario.faggioli@citrix.com]
>> However, I found this is not true in some scenarios, such as,
>> vmx_pi_switch_to() hasn't been installed or executed before
>> vmx_vcpu_block() gets called, in which case, we may hit the ASSERT
>> in it. In previous version, I suggested we remove the ASSERT in
>> vmx_vcpu_block() and set NDST explicitly in it. But seems maintainer
>> doesn't like this idea. 
>>
> And this is not helping either. An ASSERT() being hit means something
> wrong happened. Whether or not to remove (or change) an ASSERT() is not
> a matter of "like".
> 
> If we hit the ASSERT() but nothing is wrong with the code, then it is
> the ASSERT() itself that is wrong (buggy), and we must remove it, like
> it or not.
> 
> OTOH, if we hit a non buggy ASSERT(), then it means that the ASSERT()
> has done its job in finding something wrong in the code, and we should
> leave it alone and fix the problem.
> 
> How's possible for the solution here to be "either remove the ASSERT()
> _OR_ change the code"? That really makes few sense to me... :-O

I disagree: Whether fixing/removing an ASSERT() that triggered or
adjusting other code to make the ASSERT() not trigger can indeed
both be an option, and may to some degree be a matter of taste.
And that's where I think Feng and I disagree - I'd prefer the ASSERT()
to stay, and code elsewhere to make sure it won't trigger, while he
would prefer to get rid of the ASSERT() (aiui on the basis that the
code following the ASSERT() establishes the intended state - Feng,
please correct me if I'm wrong).

Jan



* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-20  7:31                                   ` Jan Beulich
@ 2016-09-20  7:53                                     ` Wu, Feng
  2016-09-20  8:13                                     ` Dario Faggioli
  1 sibling, 0 replies; 50+ messages in thread
From: Wu, Feng @ 2016-09-20  7:53 UTC (permalink / raw)
  To: Jan Beulich, Dario Faggioli
  Cc: george.dunlap, andrew.cooper3, Tian, Kevin, Wu, Feng, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, September 20, 2016 3:32 PM
> To: Dario Faggioli <dario.faggioli@citrix.com>
> Cc: andrew.cooper3@citrix.com; george.dunlap@eu.citrix.com; Wu, Feng
> <feng.wu@intel.com>; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> >>> On 20.09.16 at 01:12, <dario.faggioli@citrix.com> wrote:
> > On Sun, 2016-09-18 at 08:37 +0000, Wu, Feng wrote:
> >> > From: Dario Faggioli [mailto:dario.faggioli@citrix.com]
> >> However, I found this is not true in some scenarios: for example,
> >> vmx_pi_switch_to() hasn't been installed or executed before
> >> vmx_vcpu_block() gets called, in which case we may hit the ASSERT
> >> in it. In the previous version, I suggested we remove the ASSERT in
> >> vmx_vcpu_block() and set NDST explicitly in it, but it seems the
> >> maintainer doesn't like this idea.
> >>
> > And this is not helping either. An ASSERT() being hit means something
> > wrong happened. Whether or not to remove (or change) an ASSERT() is not
> > a matter of "like".
> >
> > If we hit the ASSERT() but nothing is wrong with the code, then it is
> > the ASSERT() itself that is wrong (buggy), and we must remove it, like
> > it or not.
> >
> > OTOH, if we hit a non buggy ASSERT(), then it means that the ASSERT()
> > has done its job in finding something wrong in the code, and we should
> > leave it alone and fix the problem.
> >
> > How is it possible for the solution here to be "either remove the ASSERT()
> > _OR_ change the code"? That really makes little sense to me... :-O
> 
> I disagree: Whether fixing/removing an ASSERT() that triggered or
> adjusting other code to make the ASSERT() not trigger can indeed
> both be an option, and may to some degree be a matter of taste.
> And that's where I think Feng and I disagree - I'd prefer the ASSERT()
> to stay, and code elsewhere to make sure it won't trigger, while he
> would prefer to get rid of the ASSERT() (aiui on the basis that the
> code following the ASSERT() establishes the intended state - Feng,
> please correct me if I'm wrong).

Yes, you are correct. Actually, both solutions can resolve the issue.
And after some discussion, we agreed on fixing things in the other
place to make sure we won't trigger the ASSERT().

Thanks,
Feng

> 
> Jan



* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-20  7:31                                   ` Jan Beulich
  2016-09-20  7:53                                     ` Wu, Feng
@ 2016-09-20  8:13                                     ` Dario Faggioli
  2016-09-20  8:18                                       ` Wu, Feng
  1 sibling, 1 reply; 50+ messages in thread
From: Dario Faggioli @ 2016-09-20  8:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: george.dunlap, andrew.cooper3, Kevin Tian, Feng Wu, xen-devel


On Tue, 2016-09-20 at 01:31 -0600, Jan Beulich wrote:
> > > > On 20.09.16 at 01:12, <dario.faggioli@citrix.com> wrote:
> > How is it possible for the solution here to be "either remove the
> > ASSERT()
> > _OR_ change the code"? That really makes little sense to me... :-O
> 
> I disagree: Whether fixing/removing an ASSERT() that triggered or
> adjusting other code to make the ASSERT() not trigger can indeed
> both be an option, and may to some degree be a matter of taste.
>
Yes, of course this is a possibility... I thought it was clear that I
was simplifying things, but maybe I was simplifying too much, or "just"
expressed myself badly.

What I wasn't clear about was whether it is _correctness_ that is at
risk or not. And that's precisely because I thought we had already
established that this wasn't a correctness issue, while it looked to
me, from reading the discussion, that it actually may be.

My bad again, for sure, sorry. Now I'll go back to the code to fix my
misconceptions and (hopefully) be able to make myself more useful. :-)

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-20  8:13                                     ` Dario Faggioli
@ 2016-09-20  8:18                                       ` Wu, Feng
  0 siblings, 0 replies; 50+ messages in thread
From: Wu, Feng @ 2016-09-20  8:18 UTC (permalink / raw)
  To: Dario Faggioli, Jan Beulich
  Cc: george.dunlap, andrew.cooper3, Tian, Kevin, Wu, Feng, xen-devel



> -----Original Message-----
> From: Dario Faggioli [mailto:dario.faggioli@citrix.com]
> Sent: Tuesday, September 20, 2016 4:14 PM
> To: Jan Beulich <JBeulich@suse.com>
> Cc: andrew.cooper3@citrix.com; george.dunlap@eu.citrix.com; Wu, Feng
> <feng.wu@intel.com>; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> On Tue, 2016-09-20 at 01:31 -0600, Jan Beulich wrote:
> > > > > On 20.09.16 at 01:12, <dario.faggioli@citrix.com> wrote:
> > > How is it possible for the solution here to be "either remove the
> > > ASSERT()
> > > _OR_ change the code"? That really makes little sense to me... :-O
> >
> > I disagree: Whether fixing/removing an ASSERT() that triggered or
> > adjusting other code to make the ASSERT() not trigger can indeed
> > both be an option, and may to some degree be a matter of taste.
> >
> Yes, of course this is a possibility... I thought it was clear that I
> was simplifying things, but maybe I was simplifying too much, or "just"
> expressed myself badly.
> 
> What I wasn't clear about was whether it is _correctness_ that is at
> risk or not. And that's precisely because I thought we had already
> established that this wasn't a correctness issue, while it looked to
> me, from reading the discussion, that it actually may be.
> 
> My bad again, for sure, sorry. Now I'll go back to the code to fix my
> misconceptions and (hopefully) be able to make myself more useful. :-)

You are always useful, and your advice is always valuable! :)

Thanks,
Feng

> 
> Thanks and Regards,
> Dario
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-02 13:15                     ` Wu, Feng
  2016-09-02 13:54                       ` Jan Beulich
@ 2016-09-23 14:19                       ` Jan Beulich
  2016-09-26  2:53                         ` Wu, Feng
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2016-09-23 14:19 UTC (permalink / raw)
  To: Feng Wu
  Cc: george.dunlap, andrew.cooper3, dario.faggioli, Kevin Tian, xen-devel

>>> On 02.09.16 at 15:15, <feng.wu@intel.com> wrote:
> And that is another reason I use pause/unpause here: it can address
> all the races. And this is a one-time action (it only occurs when the
> first device gets assigned), so do you think it is that unacceptable?

Btw. please see George's very nice description - much better than
I would ever have been able to give - for why we will always want
alternatives to pausing considered first:
https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg02587.html

Jan



* Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks
  2016-09-23 14:19                       ` Jan Beulich
@ 2016-09-26  2:53                         ` Wu, Feng
  0 siblings, 0 replies; 50+ messages in thread
From: Wu, Feng @ 2016-09-26  2:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3,
	dario.faggioli, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, September 23, 2016 10:19 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: andrew.cooper3@citrix.com; dario.faggioli@citrix.com;
> george.dunlap@eu.citrix.com; Tian, Kevin <kevin.tian@intel.com>; xen-
> devel@lists.xen.org
> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> >>> On 02.09.16 at 15:15, <feng.wu@intel.com> wrote:
> > And that is another reason I use pause/unpause here: it can address
> > all the races. And this is a one-time action (it only occurs when the
> > first device gets assigned), so do you think it is that unacceptable?
> 
> Btw. please see George's very nice description - much better than
> I would ever have been able to give - for why we will always want
> alternatives to pausing considered first:
> https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg02587.html

Yeah, thanks for sharing this and thanks to the nice description from George!

Thanks,
Feng

> 
> Jan



end of thread, other threads:[~2016-09-26  2:53 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-31  3:56 [PATCH v3 0/6] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
2016-08-31  3:56 ` [PATCH v3 1/6] VMX: Statically assign two PI hooks Feng Wu
2016-09-01  8:16   ` Jan Beulich
2016-09-01  9:13     ` Wu, Feng
2016-09-01  9:23       ` Jan Beulich
2016-09-01  9:38         ` Wu, Feng
2016-09-06  8:42   ` Dario Faggioli
2016-09-06  9:53     ` Wu, Feng
2016-08-31  3:56 ` [PATCH v3 2/6] VMX: Properly handle pi when all the assigned devices are removed Feng Wu
2016-09-01  8:21   ` Jan Beulich
2016-09-01  9:22     ` Wu, Feng
2016-09-01 10:23       ` Jan Beulich
2016-09-01 13:12         ` Wu, Feng
2016-09-06  8:58   ` Dario Faggioli
2016-08-31  3:56 ` [PATCH v3 3/6] VMX: Cleanup PI per-cpu blocking list when vcpu is destroyed Feng Wu
2016-09-06  9:21   ` Dario Faggioli
2016-09-06 23:27     ` Wu, Feng
2016-08-31  3:56 ` [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks Feng Wu
2016-09-01  8:29   ` Jan Beulich
2016-09-02  1:46     ` Wu, Feng
2016-09-02  7:04       ` Jan Beulich
2016-09-02  7:31         ` Wu, Feng
2016-09-02  8:16           ` Jan Beulich
2016-09-02  8:40             ` Wu, Feng
2016-09-02  9:25               ` Jan Beulich
2016-09-02 10:30                 ` Wu, Feng
2016-09-02 10:45                   ` Jan Beulich
2016-09-02 13:15                     ` Wu, Feng
2016-09-02 13:54                       ` Jan Beulich
2016-09-05  3:11                         ` Wu, Feng
2016-09-05  9:27                           ` Jan Beulich
2016-09-14  2:23                           ` Wu, Feng
2016-09-14  8:46                             ` Jan Beulich
2016-09-14 14:51                             ` Dario Faggioli
2016-09-18  8:37                               ` Wu, Feng
2016-09-19 23:12                                 ` Dario Faggioli
2016-09-20  0:48                                   ` Wu, Feng
2016-09-20  7:31                                   ` Jan Beulich
2016-09-20  7:53                                     ` Wu, Feng
2016-09-20  8:13                                     ` Dario Faggioli
2016-09-20  8:18                                       ` Wu, Feng
2016-09-23 14:19                       ` Jan Beulich
2016-09-26  2:53                         ` Wu, Feng
2016-08-31  3:56 ` [PATCH v3 5/6] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
2016-09-01  8:38   ` Jan Beulich
2016-09-02  1:58     ` Wu, Feng
2016-08-31  3:56 ` [PATCH v3 6/6] VMX: Fixup PI descritpor when cpu is offline Feng Wu
2016-09-01  8:48   ` Jan Beulich
2016-09-02  3:25     ` Wu, Feng
2016-09-02  7:08       ` Jan Beulich
