All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v11 0/2] Add VT-d Posted-Interrupts support
@ 2016-01-28  5:12 Feng Wu
  2016-01-28  5:12 ` [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling Feng Wu
  2016-01-28  5:12 ` [PATCH v11 2/2] Add a command line parameter for VT-d posted-interrupts Feng Wu
  0 siblings, 2 replies; 14+ messages in thread
From: Feng Wu @ 2016-01-28  5:12 UTC (permalink / raw)
  To: xen-devel; +Cc: Feng Wu

VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when guest is running in non-root mode.

You can find the VT-d Posted-Interrtups Spec. in the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

Feng Wu (2):
  vmx: VT-d posted-interrupt core logic handling
  Add a command line parameter for VT-d posted-interrupts

 docs/misc/xen-command-line.markdown |   9 +-
 xen/arch/x86/hvm/vmx/vmcs.c         |   2 +
 xen/arch/x86/hvm/vmx/vmx.c          | 179 ++++++++++++++++++++++++++++++++++++
 xen/common/schedule.c               |   4 +
 xen/drivers/passthrough/iommu.c     |   3 +
 xen/drivers/passthrough/vtd/iommu.c |   2 +
 xen/include/asm-arm/domain.h        |   2 +
 xen/include/asm-x86/hvm/hvm.h       |   5 +
 xen/include/asm-x86/hvm/vmx/vmcs.h  |  14 +++
 xen/include/asm-x86/hvm/vmx/vmx.h   |   4 +
 10 files changed, 223 insertions(+), 1 deletion(-)

-- 
2.1.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
  2016-01-28  5:12 [PATCH v11 0/2] Add VT-d Posted-Interrupts support Feng Wu
@ 2016-01-28  5:12 ` Feng Wu
  2016-01-28 16:38   ` Jan Beulich
  2016-02-10 12:35   ` George Dunlap
  2016-01-28  5:12 ` [PATCH v11 2/2] Add a command line parameter for VT-d posted-interrupts Feng Wu
  1 sibling, 2 replies; 14+ messages in thread
From: Feng Wu @ 2016-01-28  5:12 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Dario Faggioli, Jan Beulich, Feng Wu

This is the core logic handling for VT-d posted-interrupts. Basically it
deals with how and when to update posted-interrupts during the following
scenarios:
- vCPU is preempted
- vCPU is slept
- vCPU is blocked

When vCPU is preempted/slept, we update the posted-interrupts during
scheduling by introducing two new architecutral scheduler hooks:
vmx_pi_switch_from() and vmx_pi_switch_to(). When vCPU is blocked, we
introduce a new architectural hook: arch_vcpu_block() to update
posted-interrupts descriptor.

Besides that, before VM-entry, we will make sure the 'NV' filed is set
to 'posted_intr_vector' and the vCPU is not in any blocking lists, which
is needed when vCPU is running in non-root mode. The reason we do this check
is because we change the posted-interrupts descriptor in vcpu_block(),
however, we don't change it back in vcpu_unblock() or when vcpu_block()
directly returns due to event delivery (in fact, we don't need to do it
in the two places, that is why we do it before VM-Entry).

When we handle the lazy context switch for the following two scenarios:
- Preempted by a tasklet, which uses in an idle context.
- the prev vcpu is in offline and no new available vcpus in run queue.
We don't change the 'SN' bit in posted-interrupt descriptor, this
may incur spurious PI notification events, but since PI notification
event is only sent when 'ON' is clear, and once the PI notificatoin
is sent, ON is set by hardware, hence no more notification events
before 'ON' is clear. Besides that, spurious PI notification events are
going to happen from time to time in Xen hypervisor, such as, when
guests trap to Xen and PI notification event happens, there is
nothing Xen actually needs to do about it, the interrupts will be
delivered to guest atht the next time we do a VMENTRY.

CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: George Dunlap <george.dunlap@eu.citrix.com>
CC: Dario Faggioli <dario.faggioli@citrix.com>
Suggested-by: Yang Zhang <yang.z.zhang@intel.com>
Suggested-by: Dario Faggioli <dario.faggioli@citrix.com>
Suggested-by: George Dunlap <george.dunlap@eu.citrix.com>
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v11:
- Add ASSERT() in vmx_vcpu_block()
- Add some comments in vmx_pi_switch_from()
- Remove some comments which should have been removed when the
  related code was removed during v9 -> v10
- Rename 'vmx_pi_state_to_normal' to 'vmx_pi_do_resume'
- Coding style
- Make arch_vcpu_block() a macro
- Make 'pi_wakeup_vector' static
- Move hook 'vcpu_block' to 'struct hvm_vcpu'
- Initial hook 'vcpu_block' when assigning the first pci device
  and zap it on removal of the last device
- Save pointer to the block list lock instead of the processor
  id in 'struct arch_vmx_struct'
- Implement the following functions as hooks, so we
  can elimilate lots of checkings and spinlocks in scheduling
  related code path, which is good for performance.
        vmx_pi_switch_from
        vmx_pi_switch_to
        vmx_pi_do_resume

v10:
- Check iommu_intpost first
- Remove pointless checking of has_hvm_container_vcpu(v)
- Rename 'vmx_pi_state_change' to 'vmx_pi_state_to_normal'
- Since vcpu_unblock() doesn't acquire 'pi_blocked_vcpu_lock', we
  don't need use another list to save the vCPUs with 'ON' set, just
  directly call vcpu_unblock(v).

v9:
- Remove arch_vcpu_block_cancel() and arch_vcpu_wake_prepare()
- Add vmx_pi_state_change() and call it before VM Entry

v8:
- Remove the lazy context switch handling for PI state transition
- Change PI state in vcpu_block() and do_poll() when the vCPU
  is going to be blocked

v7:
- Merge [PATCH v6 16/18] vmx: Add some scheduler hooks for VT-d posted interrupts
  and "[PATCH v6 14/18] vmx: posted-interrupt handling when vCPU is blocked"
  into this patch, so it is self-contained and more convenient
  for code review.
- Make 'pi_blocked_vcpu' and 'pi_blocked_vcpu_lock' static
- Coding style
- Use per_cpu() instead of this_cpu() in pi_wakeup_interrupt()
- Move ack_APIC_irq() to the beginning of pi_wakeup_interrupt()
- Rename 'pi_ctxt_switch_from' to 'ctxt_switch_prepare'
- Rename 'pi_ctxt_switch_to' to 'ctxt_switch_cancel'
- Use 'has_hvm_container_vcpu' instead of 'is_hvm_vcpu'
- Use 'spin_lock' and 'spin_unlock' when the interrupt has been
  already disabled.
- Rename arch_vcpu_wake_prepare to vmx_vcpu_wake_prepare
- Define vmx_vcpu_wake_prepare in xen/arch/x86/hvm/hvm.c
- Call .pi_ctxt_switch_to() __context_switch() instead of directly
  calling vmx_post_ctx_switch_pi() in vmx_ctxt_switch_to()
- Make .pi_block_cpu unsigned int
- Use list_del() instead of list_del_init()
- Coding style

One remaining item in v7:
Jan has concern about calling vcpu_unblock() in vmx_pre_ctx_switch_pi(),
need Dario or George's input about this.

v6:
- Add two static inline functions for pi context switch
- Fix typos

v5:
- Rename arch_vcpu_wake to arch_vcpu_wake_prepare
- Make arch_vcpu_wake_prepare() inline for ARM
- Merge the ARM dummy hook with together
- Changes to some code comments
- Leave 'pi_ctxt_switch_from' and 'pi_ctxt_switch_to' NULL if
  PI is disabled or the vCPU is not in HVM
- Coding style

v4:
- Newly added

Changlog for "vmx: posted-interrupt handling when vCPU is blocked"
v6:
- Fix some typos
- Ack the interrupt right after the spin_unlock in pi_wakeup_interrupt()

v4:
- Use local variables in pi_wakeup_interrupt()
- Remove vcpu from the blocked list when pi_desc.on==1, this
- avoid kick vcpu multiple times.
- Remove tasklet

v3:
- This patch is generated by merging the following three patches in v2:
   [RFC v2 09/15] Add a new per-vCPU tasklet to wakeup the blocked vCPU
   [RFC v2 10/15] vmx: Define two per-cpu variables
   [RFC v2 11/15] vmx: Add a global wake-up vector for VT-d Posted-Interrupts
- rename 'vcpu_wakeup_tasklet' to 'pi_vcpu_wakeup_tasklet'
- Move the definition of 'pi_vcpu_wakeup_tasklet' to 'struct arch_vmx_struct'
- rename 'vcpu_wakeup_tasklet_handler' to 'pi_vcpu_wakeup_tasklet_handler'
- Make pi_wakeup_interrupt() static
- Rename 'blocked_vcpu_list' to 'pi_blocked_vcpu_list'
- move 'pi_blocked_vcpu_list' to 'struct arch_vmx_struct'
- Rename 'blocked_vcpu' to 'pi_blocked_vcpu'
- Rename 'blocked_vcpu_lock' to 'pi_blocked_vcpu_lock'

 xen/arch/x86/hvm/vmx/vmcs.c         |   2 +
 xen/arch/x86/hvm/vmx/vmx.c          | 179 ++++++++++++++++++++++++++++++++++++
 xen/common/schedule.c               |   4 +
 xen/drivers/passthrough/vtd/iommu.c |   2 +
 xen/include/asm-arm/domain.h        |   2 +
 xen/include/asm-x86/hvm/hvm.h       |   5 +
 xen/include/asm-x86/hvm/vmx/vmcs.h  |  14 +++
 xen/include/asm-x86/hvm/vmx/vmx.h   |   4 +
 8 files changed, 212 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index edd4c8d..2e535de 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -676,6 +676,8 @@ int vmx_cpu_up(void)
     if ( cpu_has_vmx_vpid )
         vpid_sync_all();
 
+    vmx_pi_per_cpu_init(cpu);
+
     return 0;
 }
 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 7917fb7..d60c0d6 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -83,7 +83,140 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
 static void vmx_invlpg_intercept(unsigned long vaddr);
 static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
 
+/*
+ * We maintain a per-CPU linked-list of vCPUs, so in PI wakeup
+ * handler we can find which vCPU should be woken up.
+ */
+static DEFINE_PER_CPU(struct list_head, pi_blocked_vcpu);
+static DEFINE_PER_CPU(spinlock_t, pi_blocked_vcpu_lock);
+
 uint8_t __read_mostly posted_intr_vector;
+static uint8_t __read_mostly pi_wakeup_vector;
+
+void vmx_pi_per_cpu_init(unsigned int cpu)
+{
+    INIT_LIST_HEAD(&per_cpu(pi_blocked_vcpu, cpu));
+    spin_lock_init(&per_cpu(pi_blocked_vcpu_lock, cpu));
+}
+
+static void vmx_vcpu_block(struct vcpu *v)
+{
+    unsigned long flags;
+    unsigned int dest;
+
+    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+
+    ASSERT(v->arch.hvm_vmx.pi_block_list_lock == NULL);
+
+    /* The vCPU is blocking, we need to add it to one of the per-cpu lists. */
+    v->arch.hvm_vmx.pi_block_list_lock =
+                    &per_cpu(pi_blocked_vcpu_lock, v->processor);
+
+    spin_lock_irqsave(v->arch.hvm_vmx.pi_block_list_lock, flags);
+    list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list,
+                  &per_cpu(pi_blocked_vcpu, v->processor));
+    spin_unlock_irqrestore(v->arch.hvm_vmx.pi_block_list_lock, flags);
+
+    ASSERT(!pi_test_sn(pi_desc));
+
+    dest = cpu_physical_id(v->processor);
+
+    ASSERT(pi_desc->ndst ==
+            (x2apic_enabled ? dest: MASK_INSR(dest, PI_xAPIC_NDST_MASK)));
+
+    write_atomic(&pi_desc->nv, pi_wakeup_vector);
+}
+
+static void vmx_pi_switch_from(struct vcpu *v)
+{
+    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+
+    if ( test_bit(_VPF_blocked, &v->pause_flags) )
+        return;
+
+    /*
+     * The vCPU has been preempted or went to sleep. We don't
+     * need to send notification event to a runnable or sleeping
+     * vcpu, the interrupt information will be delivered to it
+     * before VM-ENTRY when the vcpu is scheduled to run next time.
+     */
+    pi_set_sn(pi_desc);
+}
+
+static void vmx_pi_switch_to(struct vcpu *v)
+{
+    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+
+    write_atomic(&pi_desc->ndst, x2apic_enabled ?
+                 cpu_physical_id(v->processor) :
+                 MASK_INSR(cpu_physical_id(v->processor), PI_xAPIC_NDST_MASK));
+
+    /*
+     * The vCPU is going to run, we need to clear 'SN' to
+     * make it accept notification event when interrupts
+     * will being posted for it.
+     */
+    pi_clear_sn(pi_desc);
+}
+
+static void vmx_pi_do_resume(struct vcpu *v)
+{
+    unsigned long flags;
+    spinlock_t *pi_block_list_lock;
+    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+
+    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
+
+    /*
+     * Set 'NV' field back to posted_intr_vector, so the
+     * Posted-Interrupts can be delivered to the vCPU when
+     * it is running in non-root mode.
+     */
+    if ( pi_desc->nv != posted_intr_vector )
+        write_atomic(&pi_desc->nv, posted_intr_vector);
+
+    /* The vCPU is not on any blocking list. */
+    pi_block_list_lock = v->arch.hvm_vmx.pi_block_list_lock;
+    if ( pi_block_list_lock == NULL )
+        return;
+
+    spin_lock_irqsave(pi_block_list_lock, flags);
+
+    /*
+     * v->arch.hvm_vmx.pi_block_list_lock == NULL here means the vCPU was
+     * removed from the blocking list while we are acquiring the lock.
+     */
+    if ( v->arch.hvm_vmx.pi_block_list_lock != NULL )
+    {
+        list_del(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
+        v->arch.hvm_vmx.pi_block_list_lock = NULL;
+    }
+
+    spin_unlock_irqrestore(pi_block_list_lock, flags);
+}
+
+/* This function is called when pcidevs_lock is held */
+void vmx_pi_hooks_reassign(struct domain *source, struct domain *target)
+{
+    if (!iommu_intpost)
+        return;
+
+    if ( has_hvm_container_domain(source) &&
+         source->arch.hvm_domain.vmx.vcpu_block && !has_arch_pdevs(source) ) {
+        source->arch.hvm_domain.vmx.vcpu_block = NULL;
+        source->arch.hvm_domain.vmx.pi_switch_from = NULL;
+        source->arch.hvm_domain.vmx.pi_switch_to = NULL;
+        source->arch.hvm_domain.vmx.pi_do_resume = NULL;
+    }
+
+    if ( has_hvm_container_domain(target) &&
+         !target->arch.hvm_domain.vmx.vcpu_block && has_arch_pdevs(target) ) {
+        target->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
+        target->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
+        target->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
+        target->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
+    }
+}
 
 static int vmx_domain_initialise(struct domain *d)
 {
@@ -112,6 +245,10 @@ static int vmx_vcpu_initialise(struct vcpu *v)
 
     spin_lock_init(&v->arch.hvm_vmx.vmcs_lock);
 
+    INIT_LIST_HEAD(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
+
+    v->arch.hvm_vmx.pi_block_list_lock = NULL;
+
     v->arch.schedule_tail    = vmx_do_resume;
     v->arch.ctxt_switch_from = vmx_ctxt_switch_from;
     v->arch.ctxt_switch_to   = vmx_ctxt_switch_to;
@@ -740,6 +877,8 @@ static void vmx_ctxt_switch_from(struct vcpu *v)
     vmx_save_guest_msrs(v);
     vmx_restore_host_msrs();
     vmx_save_dr(v);
+    if (v->domain->arch.hvm_domain.vmx.pi_switch_from)
+        v->domain->arch.hvm_domain.vmx.pi_switch_from(v);
 }
 
 static void vmx_ctxt_switch_to(struct vcpu *v)
@@ -752,6 +891,8 @@ static void vmx_ctxt_switch_to(struct vcpu *v)
 
     vmx_restore_guest_msrs(v);
     vmx_restore_dr(v);
+    if (v->domain->arch.hvm_domain.vmx.pi_switch_to)
+        v->domain->arch.hvm_domain.vmx.pi_switch_to(v);
 }
 
 
@@ -2010,6 +2151,38 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .altp2m_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
 };
 
+/* Handle VT-d posted-interrupt when VCPU is blocked. */
+static void pi_wakeup_interrupt(struct cpu_user_regs *regs)
+{
+    struct arch_vmx_struct *vmx, *tmp;
+    spinlock_t *lock = &per_cpu(pi_blocked_vcpu_lock, smp_processor_id());
+    struct list_head *blocked_vcpus =
+                       &per_cpu(pi_blocked_vcpu, smp_processor_id());
+
+    ack_APIC_irq();
+    this_cpu(irq_count)++;
+
+    spin_lock(lock);
+
+    /*
+     * XXX: The length of the list depends on how many vCPU is current
+     * blocked on this specific pCPU. This may hurt the interrupt latency
+     * if the list grows to too many entries.
+     */
+    list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocked_vcpu_list)
+    {
+        if ( pi_test_on(&vmx->pi_desc) )
+        {
+            list_del(&vmx->pi_blocked_vcpu_list);
+            ASSERT(vmx->pi_block_list_lock == lock);
+            vmx->pi_block_list_lock = NULL;
+            vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
+        }
+    }
+
+    spin_unlock(lock);
+}
+
 /* Handle VT-d posted-interrupt when VCPU is running. */
 static void pi_notification_interrupt(struct cpu_user_regs *regs)
 {
@@ -2096,7 +2269,10 @@ const struct hvm_function_table * __init start_vmx(void)
     if ( cpu_has_vmx_posted_intr_processing )
     {
         if ( iommu_intpost )
+        {
             alloc_direct_apic_vector(&posted_intr_vector, pi_notification_interrupt);
+            alloc_direct_apic_vector(&pi_wakeup_vector, pi_wakeup_interrupt);
+        }
         else
             alloc_direct_apic_vector(&posted_intr_vector, event_check_interrupt);
     }
@@ -3574,6 +3750,9 @@ void vmx_vmenter_helper(const struct cpu_user_regs *regs)
     struct hvm_vcpu_asid *p_asid;
     bool_t need_flush;
 
+    if (curr->domain->arch.hvm_domain.vmx.pi_do_resume)
+        curr->domain->arch.hvm_domain.vmx.pi_do_resume(curr);
+
     if ( !cpu_has_vmx_vpid )
         goto out;
     if ( nestedhvm_vcpu_in_guestmode(curr) )
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index d121896..2d87021 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -802,6 +802,8 @@ void vcpu_block(void)
 
     set_bit(_VPF_blocked, &v->pause_flags);
 
+    arch_vcpu_block(v);
+
     /* Check for events /after/ blocking: avoids wakeup waiting race. */
     if ( local_events_need_delivery() )
     {
@@ -839,6 +841,8 @@ static long do_poll(struct sched_poll *sched_poll)
     v->poll_evtchn = -1;
     set_bit(v->vcpu_id, d->poll_mask);
 
+    arch_vcpu_block(v);
+
 #ifndef CONFIG_X86 /* set_bit() implies mb() on x86 */
     /* Check for events /after/ setting flags: avoids wakeup waiting race. */
     smp_mb();
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index ec31c6b..fb47d29 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2293,6 +2293,8 @@ static int reassign_device_ownership(
         pdev->domain = target;
     }
 
+    vmx_pi_hooks_reassign(source, target);
+
     return ret;
 }
 
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index aa7f283..37afa80 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -310,6 +310,8 @@ static inline void free_vcpu_guest_context(struct vcpu_guest_context *vgc)
     xfree(vgc);
 }
 
+static inline void arch_vcpu_block(struct vcpu *v) {}
+
 #endif /* __ASM_DOMAIN_H__ */
 
 /*
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index b9d893d..4590456 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -565,6 +565,11 @@ const char *hvm_efer_valid(const struct vcpu *v, uint64_t value,
                            signed int cr0_pg);
 unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, bool_t restore);
 
+#define arch_vcpu_block(v) ({                                               \
+    if ( (v)->domain->arch.hvm_domain.vmx.vcpu_block )                      \
+        (v)->domain->arch.hvm_domain.vmx.vcpu_block((v));                   \
+})
+
 #endif /* __ASM_X86_HVM_HVM_H__ */
 
 /*
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index d1496b8..b7cc900 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -77,6 +77,11 @@ struct vmx_domain {
     unsigned long apic_access_mfn;
     /* VMX_DOMAIN_* */
     unsigned int status;
+
+    void (*vcpu_block) (struct vcpu *);
+    void (*pi_switch_from) (struct vcpu *v);
+    void (*pi_switch_to) (struct vcpu *v);
+    void (*pi_do_resume) (struct vcpu *v);
 };
 
 struct pi_desc {
@@ -160,6 +165,15 @@ struct arch_vmx_struct {
     struct page_info     *vmwrite_bitmap;
 
     struct page_info     *pml_pg;
+
+    struct list_head     pi_blocked_vcpu_list;
+
+    /*
+     * Before it is blocked, vCPU is added to the per-cpu list.
+     * VT-d engine can send wakeup notification event to the
+     * pCPU and wakeup the related vCPU.
+     */
+    spinlock_t           *pi_block_list_lock;
 };
 
 int vmx_create_vmcs(struct vcpu *v);
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 1719965..baaaa53 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -563,6 +563,10 @@ int alloc_p2m_hap_data(struct p2m_domain *p2m);
 void free_p2m_hap_data(struct p2m_domain *p2m);
 void p2m_init_hap_data(struct p2m_domain *p2m);
 
+void vmx_pi_per_cpu_init(unsigned int cpu);
+
+void vmx_pi_hooks_reassign(struct domain *source, struct domain *target);
+
 /* EPT violation qualifications definitions */
 #define _EPT_READ_VIOLATION         0
 #define EPT_READ_VIOLATION          (1UL<<_EPT_READ_VIOLATION)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v11 2/2] Add a command line parameter for VT-d posted-interrupts
  2016-01-28  5:12 [PATCH v11 0/2] Add VT-d Posted-Interrupts support Feng Wu
  2016-01-28  5:12 ` [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling Feng Wu
@ 2016-01-28  5:12 ` Feng Wu
  1 sibling, 0 replies; 14+ messages in thread
From: Feng Wu @ 2016-01-28  5:12 UTC (permalink / raw)
  To: xen-devel; +Cc: Feng Wu, Jan Beulich

Enable VT-d Posted-Interrupts and add a command line
parameter for it.

CC: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 docs/misc/xen-command-line.markdown | 9 ++++++++-
 xen/drivers/passthrough/iommu.c     | 3 +++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 467dc8f..ea1d60d 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -868,7 +868,7 @@ debug hypervisor only).
 > Default: `new` unless directed-EOI is supported
 
 ### iommu
-> `= List of [ <boolean> | force | required | intremap | qinval | snoop | sharept | dom0-passthrough | dom0-strict | amd-iommu-perdev-intremap | workaround_bios_bug | igfx | verbose | debug ]`
+> `= List of [ <boolean> | force | required | intremap | intpost | qinval | snoop | sharept | dom0-passthrough | dom0-strict | amd-iommu-perdev-intremap | workaround_bios_bug | igfx | verbose | debug ]`
 
 > Sub-options:
 
@@ -895,6 +895,13 @@ debug hypervisor only).
 >> Control the use of interrupt remapping (DMA remapping will always be enabled
 >> if IOMMU functionality is enabled).
 
+> `intpost`
+
+> Default: `false`
+
+>> Control the use of interrupt posting, which depends on the availability of
+>> interrupt remapping.
+
 > `qinval` (VT-d)
 
 > Default: `true`
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 0b2abf4..50d74a5 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -32,6 +32,7 @@ static void iommu_dump_p2m_table(unsigned char key);
  *   off|no|false|disable       Disable IOMMU (default)
  *   force|required             Don't boot unless IOMMU is enabled
  *   no-intremap                Disable interrupt remapping
+ *   no-intpost                 Disable VT-d Interrupt posting
  *   verbose                    Be more verbose
  *   debug                      Enable debugging messages and checks
  *   workaround_bios_bug        Workaround some bios issue to still enable
@@ -105,6 +106,8 @@ static void __init parse_iommu_param(char *s)
             iommu_qinval = val;
         else if ( !strcmp(s, "intremap") )
             iommu_intremap = val;
+        else if ( !strcmp(s, "intpost") )
+            iommu_intpost = val;
         else if ( !strcmp(s, "debug") )
         {
             iommu_debug = val;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
  2016-01-28  5:12 ` [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling Feng Wu
@ 2016-01-28 16:38   ` Jan Beulich
  2016-01-29  1:53     ` Wu, Feng
  2016-02-10 12:35   ` George Dunlap
  1 sibling, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2016-01-28 16:38 UTC (permalink / raw)
  To: Feng Wu
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Dario Faggioli, xen-devel

>>> On 28.01.16 at 06:12, <feng.wu@intel.com> wrote:
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -83,7 +83,140 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
>  static void vmx_invlpg_intercept(unsigned long vaddr);
>  static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
>  
> +/*
> + * We maintain a per-CPU linked-list of vCPUs, so in PI wakeup
> + * handler we can find which vCPU should be woken up.
> + */
> +static DEFINE_PER_CPU(struct list_head, pi_blocked_vcpu);
> +static DEFINE_PER_CPU(spinlock_t, pi_blocked_vcpu_lock);
> +
>  uint8_t __read_mostly posted_intr_vector;
> +static uint8_t __read_mostly pi_wakeup_vector;
> +
> +void vmx_pi_per_cpu_init(unsigned int cpu)
> +{
> +    INIT_LIST_HEAD(&per_cpu(pi_blocked_vcpu, cpu));
> +    spin_lock_init(&per_cpu(pi_blocked_vcpu_lock, cpu));
> +}
> +
> +static void vmx_vcpu_block(struct vcpu *v)
> +{
> +    unsigned long flags;
> +    unsigned int dest;
> +
> +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;

Stray blank line between declarations.

> +    ASSERT(v->arch.hvm_vmx.pi_block_list_lock == NULL);

I think this needs to be done after acquiring the lock.

> +    /* The vCPU is blocking, we need to add it to one of the per-cpu lists. */
> +    v->arch.hvm_vmx.pi_block_list_lock =
> +                    &per_cpu(pi_blocked_vcpu_lock, v->processor);
> +
> +    spin_lock_irqsave(v->arch.hvm_vmx.pi_block_list_lock, flags);
> +    list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list,
> +                  &per_cpu(pi_blocked_vcpu, v->processor));
> +    spin_unlock_irqrestore(v->arch.hvm_vmx.pi_block_list_lock, flags);
> +
> +    ASSERT(!pi_test_sn(pi_desc));
> +
> +    dest = cpu_physical_id(v->processor);
> +
> +    ASSERT(pi_desc->ndst ==
> +            (x2apic_enabled ? dest: MASK_INSR(dest, PI_xAPIC_NDST_MASK)));

Missing space before colon.

> +static void vmx_pi_switch_from(struct vcpu *v)
> +{
> +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> +
> +    if ( test_bit(_VPF_blocked, &v->pause_flags) )
> +        return;
> +
> +    /*
> +     * The vCPU has been preempted or went to sleep. We don't
> +     * need to send notification event to a runnable or sleeping
> +     * vcpu, the interrupt information will be delivered to it
> +     * before VM-ENTRY when the vcpu is scheduled to run next time.
> +     */
> +    pi_set_sn(pi_desc);
> +}

The comment confuses me: A runnable vCPU certainly doesn't
need waking, but a sleeping one? Why wouldn't an interrupt
coming in not need to wake that vCPU, if this interrupt is what
it's waiting for?

> +static void vmx_pi_do_resume(struct vcpu *v)
> +{
> +    unsigned long flags;
> +    spinlock_t *pi_block_list_lock;
> +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> +
> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
> +
> +    /*
> +     * Set 'NV' field back to posted_intr_vector, so the
> +     * Posted-Interrupts can be delivered to the vCPU when
> +     * it is running in non-root mode.
> +     */
> +    if ( pi_desc->nv != posted_intr_vector )
> +        write_atomic(&pi_desc->nv, posted_intr_vector);

Perhaps this was discussed before, but I don't recall and now
wonder - why inside an if()? This is a simple memory write on
x86.

> +    /* The vCPU is not on any blocking list. */
> +    pi_block_list_lock = v->arch.hvm_vmx.pi_block_list_lock;
> +    if ( pi_block_list_lock == NULL )
> +        return;

As said before, I think you're missing a barrier here, to prevent
the compiler from eliminating the local variable.

> +/* This function is called when pcidevs_lock is held */
> +void vmx_pi_hooks_reassign(struct domain *source, struct domain *target)
> +{
> +    if (!iommu_intpost)

Coding style.

> +        return;
> +
> +    if ( has_hvm_container_domain(source) &&
> +         source->arch.hvm_domain.vmx.vcpu_block && !has_arch_pdevs(source) ) {

Again.

> +        source->arch.hvm_domain.vmx.vcpu_block = NULL;
> +        source->arch.hvm_domain.vmx.pi_switch_from = NULL;
> +        source->arch.hvm_domain.vmx.pi_switch_to = NULL;
> +        source->arch.hvm_domain.vmx.pi_do_resume = NULL;
> +    }
> +
> +    if ( has_hvm_container_domain(target) &&
> +         !target->arch.hvm_domain.vmx.vcpu_block && has_arch_pdevs(target) ) {

And again (not going to mention any further ones).

> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2293,6 +2293,8 @@ static int reassign_device_ownership(
>          pdev->domain = target;
>      }
>  
> +    vmx_pi_hooks_reassign(source, target);
> +
>      return ret;
>  }

I think you need to clear source's hooks here, but target's need to
get set ahead of the actual assignment.

> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -565,6 +565,11 @@ const char *hvm_efer_valid(const struct vcpu *v, uint64_t value,
>                             signed int cr0_pg);
>  unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, bool_t restore);
>  
> +#define arch_vcpu_block(v) ({                                               \
> +    if ( (v)->domain->arch.hvm_domain.vmx.vcpu_block )                      \
> +        (v)->domain->arch.hvm_domain.vmx.vcpu_block((v));                   \

Stray double parentheses. But - why is this a macro all of the
sudden anyway?

> @@ -160,6 +165,15 @@ struct arch_vmx_struct {
>      struct page_info     *vmwrite_bitmap;
>  
>      struct page_info     *pml_pg;
> +
> +    struct list_head     pi_blocked_vcpu_list;
> +
> +    /*
> +     * Before it is blocked, vCPU is added to the per-cpu list.
> +     * VT-d engine can send wakeup notification event to the
> +     * pCPU and wakeup the related vCPU.
> +     */
> +    spinlock_t           *pi_block_list_lock;
>  };

Wasn't the comment meant to go ahead of both new fields? For the
lock pointer alone it doesn't make much sense in the way it's written
right now.

Jan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
  2016-01-28 16:38   ` Jan Beulich
@ 2016-01-29  1:53     ` Wu, Feng
  2016-01-29  9:31       ` Jan Beulich
  0 siblings, 1 reply; 14+ messages in thread
From: Wu, Feng @ 2016-01-29  1:53 UTC (permalink / raw)
  To: Jan Beulich, Dario Faggioli
  Cc: Tian, Kevin, Keir Fraser, George Dunlap, Andrew Cooper,
	xen-devel, Wu, Feng



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, January 29, 2016 12:38 AM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; Dario Faggioli
> <dario.faggioli@citrix.com>; George Dunlap <george.dunlap@eu.citrix.com>;
> Tian, Kevin <kevin.tian@intel.com>; xen-devel@lists.xen.org; Keir Fraser
> <keir@xen.org>
> Subject: Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
> 
> >>> On 28.01.16 at 06:12, <feng.wu@intel.com> wrote:
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -83,7 +83,140 @@ static int vmx_msr_write_intercept(unsigned int msr,
> uint64_t msr_content);
> >  static void vmx_invlpg_intercept(unsigned long vaddr);
> >  static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
> >
> > +static void vmx_vcpu_block(struct vcpu *v)
> > +{
> > +    unsigned long flags;
> > +    unsigned int dest;
> > +
> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> 
> Stray blank line between declarations.
> 
> > +    ASSERT(v->arch.hvm_vmx.pi_block_list_lock == NULL);
> 
> I think this needs to be done after acquiring the lock.

I am not sure what you mean here. The ASSERT here means that
the vCPU is not on any pCPU list when we get here.

> 
> > +static void vmx_pi_switch_from(struct vcpu *v)
> > +{
> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> > +
> > +    if ( test_bit(_VPF_blocked, &v->pause_flags) )
> > +        return;
> > +
> > +    /*
> > +     * The vCPU has been preempted or went to sleep. We don't
> > +     * need to send notification event to a runnable or sleeping
> > +     * vcpu, the interrupt information will be delivered to it
> > +     * before VM-ENTRY when the vcpu is scheduled to run next time.
> > +     */
> > +    pi_set_sn(pi_desc);
> > +}
> 
> The comment confuses me: A runnable vCPU certainly doesn't
> need waking, but a sleeping one? Why wouldn't an interrupt
> coming in not need to wake that vCPU, if this interrupt is what
> it's waiting for?

First, let's put VT-d PI away, just thinking the case without it:
When a vCPU is sleeping, such as, vcpu_sleep_nosync() gets
called, the vCPU is in RUNSTATE_offline state. Then an interrupt
comes in and the slept vCPU is the destination, the vCPU is not
woken up by the interrupts. The slept vCPU would only be woken
up by calling vcpu_wakeup() in the code path responding to
the interrupt, however, this is not in the code. So this is
the current implementation and policy. I don't think VT-d PI needs
to break it. On the other hand, even we clear SN for slept vCPU,
it doesn't help, since it is still not in the run-queue and don't
have chance to run.

Above is my understanding about the scheduler related things,
maybe Dario can double confirm this. Thanks!

> 
> > +static void vmx_pi_do_resume(struct vcpu *v)
> > +{
> > +    unsigned long flags;
> > +    spinlock_t *pi_block_list_lock;
> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> > +
> > +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
> > +
> > +    /*
> > +     * Set 'NV' field back to posted_intr_vector, so the
> > +     * Posted-Interrupts can be delivered to the vCPU when
> > +     * it is running in non-root mode.
> > +     */
> > +    if ( pi_desc->nv != posted_intr_vector )
> > +        write_atomic(&pi_desc->nv, posted_intr_vector);
> 
> Perhaps this was discussed before, but I don't recall and now
> wonder - why inside an if()? This is a simple memory write on
> x86.

The initial purpose is that if NV is already equal to 'posted_intr_vector',
we can save the following atomically write operation. There are some
volatile stuff and barriers in write_atomic().

> 
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -2293,6 +2293,8 @@ static int reassign_device_ownership(
> >          pdev->domain = target;
> >      }
> >
> > +    vmx_pi_hooks_reassign(source, target);
> > +
> >      return ret;
> >  }
> 
> I think you need to clear source's hooks here, but target's need to
> get set ahead of the actual assignment.

I think this is the place where the device is moved from
&source->arch.pdev_list to &target->arch.pdev_list, if that is the
case, we should clear source and set target here, right?

Please refer to the following code before 'vmx_pi_hooks_reassign'

    if ( devfn == pdev->devfn )
    {
        list_move(&pdev->domain_list, &target->arch.pdev_list);
        pdev->domain = target;
    }

> 
> > --- a/xen/include/asm-x86/hvm/hvm.h
> > +++ b/xen/include/asm-x86/hvm/hvm.h
> > @@ -565,6 +565,11 @@ const char *hvm_efer_valid(const struct vcpu *v,
> uint64_t value,
> >                             signed int cr0_pg);
> >  unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, bool_t
> restore);
> >
> > +#define arch_vcpu_block(v) ({                                               \
> > +    if ( (v)->domain->arch.hvm_domain.vmx.vcpu_block )                      \
> > +        (v)->domain->arch.hvm_domain.vmx.vcpu_block((v));                   \
> 
> Stray double parentheses. But - why is this a macro all of the
> sudden anyway?
> 

In the previous version, you suggest to make arch_vcpu_block()
inline, I tried that, but I encountered building failures, which is
related to using '(v)->domain->arch.hvm_domain.vmx.vcpu_block '
here since it refers to some data structure, it is not so straightforward
to address it, so I change it to a macro, just like other micros in this
file, which refers to ' (v)->arch.hvm_vcpu.....'.

Thanks,
Feng

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
  2016-01-29  1:53     ` Wu, Feng
@ 2016-01-29  9:31       ` Jan Beulich
  2016-02-02  4:48         ` Wu, Feng
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2016-01-29  9:31 UTC (permalink / raw)
  To: Feng Wu
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Dario Faggioli, xen-devel

>>> On 29.01.16 at 02:53, <feng.wu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Friday, January 29, 2016 12:38 AM
>> >>> On 28.01.16 at 06:12, <feng.wu@intel.com> wrote:
>> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> > @@ -83,7 +83,140 @@ static int vmx_msr_write_intercept(unsigned int msr,
>> uint64_t msr_content);
>> >  static void vmx_invlpg_intercept(unsigned long vaddr);
>> >  static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
>> >
>> > +static void vmx_vcpu_block(struct vcpu *v)
>> > +{
>> > +    unsigned long flags;
>> > +    unsigned int dest;
>> > +
>> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
>> 
>> Stray blank line between declarations.
>> 
>> > +    ASSERT(v->arch.hvm_vmx.pi_block_list_lock == NULL);
>> 
>> I think this needs to be done after acquiring the lock.
> 
> I am not sure what you mean here. The ASSERT here means that
> the vCPU is not on any pCPU list when we get here.

My point is that until you've taken the lock, whether the vCPU is on
any such list may change. All state updates are done under lock, so
checking for valid incoming state should be done under lock too, to
be most useful.

>> > +static void vmx_pi_switch_from(struct vcpu *v)
>> > +{
>> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
>> > +
>> > +    if ( test_bit(_VPF_blocked, &v->pause_flags) )
>> > +        return;
>> > +
>> > +    /*
>> > +     * The vCPU has been preempted or went to sleep. We don't
>> > +     * need to send notification event to a runnable or sleeping
>> > +     * vcpu, the interrupt information will be delivered to it
>> > +     * before VM-ENTRY when the vcpu is scheduled to run next time.
>> > +     */
>> > +    pi_set_sn(pi_desc);
>> > +}
>> 
>> The comment confuses me: A runnable vCPU certainly doesn't
>> need waking, but a sleeping one? Why wouldn't an interrupt
>> coming in not need to wake that vCPU, if this interrupt is what
>> it's waiting for?
> 
> First, let's put VT-d PI away, just thinking the case without it:
> When a vCPU is sleeping, such as, vcpu_sleep_nosync() gets
> called, the vCPU is in RUNSTATE_offline state. Then an interrupt
> comes in and the slept vCPU is the destination, the vCPU is not
> woken up by the interrupts. The slept vCPU would only be woken
> up by calling vcpu_wakeup() in the code path responding to
> the interrupt, however, this is not in the code. So this is
> the current implementation and policy. I don't think VT-d PI needs
> to break it. On the other hand, even we clear SN for slept vCPU,
> it doesn't help, since it is still not in the run-queue and don't
> have chance to run.

Okay, maybe this is just a matter of mixed up terminology (and
not even you mixing it up): "sleeping" to me means waiting for
some kind of event (e.g. resulting from SCHEDOP_block),
whereas "offline" means waiting to be onlined again. But indeed
I agree that vcpu_pause*() calling vcpu_sleep_*sync() makes
the wording above in line with internal uses. Just that the use
of vcpu_sleep_*sync() along with setting various bits in
->pause_{flags,count} is, well, not ideal: Either the fields should
have "sleep" in their names or the operation should have "pause"
in its name. Just that there already is a vcpu_pause_nosync().

So maybe replacing "went to sleep" by "has been paused" would
already clarify things enough. Or alternatively refer to the two
functions, e.g. "has been put to sleep by vcpu_sleep_{,no}sync()".

>> > +static void vmx_pi_do_resume(struct vcpu *v)
>> > +{
>> > +    unsigned long flags;
>> > +    spinlock_t *pi_block_list_lock;
>> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
>> > +
>> > +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
>> > +
>> > +    /*
>> > +     * Set 'NV' field back to posted_intr_vector, so the
>> > +     * Posted-Interrupts can be delivered to the vCPU when
>> > +     * it is running in non-root mode.
>> > +     */
>> > +    if ( pi_desc->nv != posted_intr_vector )
>> > +        write_atomic(&pi_desc->nv, posted_intr_vector);
>> 
>> Perhaps this was discussed before, but I don't recall and now
>> wonder - why inside an if()? This is a simple memory write on
>> x86.
> 
> The initial purpose is that if NV is already equal to 'posted_intr_vector',
> we can save the following atomically write operation. There are some
> volatile stuff and barriers in write_atomic().

But what does the final generated code look like? As I said, I'm
sure it's just a single MOV. And putting a conditional around it will
likely make things slower rather than faster.

>> > --- a/xen/drivers/passthrough/vtd/iommu.c
>> > +++ b/xen/drivers/passthrough/vtd/iommu.c
>> > @@ -2293,6 +2293,8 @@ static int reassign_device_ownership(
>> >          pdev->domain = target;
>> >      }
>> >
>> > +    vmx_pi_hooks_reassign(source, target);
>> > +
>> >      return ret;
>> >  }
>> 
>> I think you need to clear source's hooks here, but target's need to
>> get set ahead of the actual assignment.
> 
> I think this is the place where the device is moved from
> &source->arch.pdev_list to &target->arch.pdev_list, if that is the
> case, we should clear source and set target here, right?

No - target needs to be ready to deal with events from the device
_before_ the device actually gets assigned to it.

> Please refer to the following code before 'vmx_pi_hooks_reassign'
> 
>     if ( devfn == pdev->devfn )
>     {
>         list_move(&pdev->domain_list, &target->arch.pdev_list);
>         pdev->domain = target;
>     }

Sure, but that's unrelated to the issue I'm trying to point out.

>> > --- a/xen/include/asm-x86/hvm/hvm.h
>> > +++ b/xen/include/asm-x86/hvm/hvm.h
>> > @@ -565,6 +565,11 @@ const char *hvm_efer_valid(const struct vcpu *v,
>> uint64_t value,
>> >                             signed int cr0_pg);
>> >  unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, bool_t
>> restore);
>> >
>> > +#define arch_vcpu_block(v) ({                                               \
>> > +    if ( (v)->domain->arch.hvm_domain.vmx.vcpu_block )                      \
>> > +        (v)->domain->arch.hvm_domain.vmx.vcpu_block((v));                   \
>> 
>> Stray double parentheses. But - why is this a macro all of the
>> sudden anyway?
> 
> In the previous version, you suggest to make arch_vcpu_block()
> inline, I tried that, but I encountered building failures, which is
> related to using '(v)->domain->arch.hvm_domain.vmx.vcpu_block '
> here since it refers to some data structure, it is not so straightforward
> to address it, so I change it to a macro, just like other micros in this
> file, which refers to ' (v)->arch.hvm_vcpu.....'.

Well, okay then for the macro part. We should address this, but
this certainly is beyond the scope of you will want to do.

Jan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
  2016-01-29  9:31       ` Jan Beulich
@ 2016-02-02  4:48         ` Wu, Feng
  2016-02-02  9:53           ` Jan Beulich
  2016-02-10 12:25           ` George Dunlap
  0 siblings, 2 replies; 14+ messages in thread
From: Wu, Feng @ 2016-02-02  4:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Keir Fraser, George Dunlap, Andrew Cooper,
	Dario Faggioli, xen-devel, Wu, Feng



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, January 29, 2016 5:32 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; Dario Faggioli
> <dario.faggioli@citrix.com>; George Dunlap <george.dunlap@eu.citrix.com>;
> Tian, Kevin <kevin.tian@intel.com>; xen-devel@lists.xen.org; Keir Fraser
> <keir@xen.org>
> Subject: RE: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
> 
> >>> On 29.01.16 at 02:53, <feng.wu@intel.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Friday, January 29, 2016 12:38 AM
> >> >>> On 28.01.16 at 06:12, <feng.wu@intel.com> wrote:
> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> > @@ -83,7 +83,140 @@ static int vmx_msr_write_intercept(unsigned int
> msr,
> >> uint64_t msr_content);
> >> >  static void vmx_invlpg_intercept(unsigned long vaddr);
> >> >  static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
> >> >
> >> > +static void vmx_vcpu_block(struct vcpu *v)
> >> > +{
> >> > +    unsigned long flags;
> >> > +    unsigned int dest;
> >> > +
> >> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> >>
> >> Stray blank line between declarations.
> >>
> >> > +    ASSERT(v->arch.hvm_vmx.pi_block_list_lock == NULL);
> >>
> >> I think this needs to be done after acquiring the lock.
> >
> > I am not sure what you mean here. The ASSERT here means that
> > the vCPU is not on any pCPU list when we get here.
> 
> My point is that until you've taken the lock, whether the vCPU is on
> any such list may change. All state updates are done under lock, so
> checking for valid incoming state should be done under lock too, to
> be most useful.

I don't think so. We get this function via vcpu_block()/do_poll() ->
arch_vcpu_block() -> ... -> vmx_vcpu_block(), and the parameter
of this function should point to the current vCPU, so when we
get here, the v->arch.hvm_vmx.pi_block_list_lock should be NULL,
which means the vCPU is not in any blocking. and no one can change
it behind us this this moment.

> >> > +static void vmx_pi_do_resume(struct vcpu *v)
> >> > +{
> >> > +    unsigned long flags;
> >> > +    spinlock_t *pi_block_list_lock;
> >> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> >> > +
> >> > +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
> >> > +
> >> > +    /*
> >> > +     * Set 'NV' field back to posted_intr_vector, so the
> >> > +     * Posted-Interrupts can be delivered to the vCPU when
> >> > +     * it is running in non-root mode.
> >> > +     */
> >> > +    if ( pi_desc->nv != posted_intr_vector )
> >> > +        write_atomic(&pi_desc->nv, posted_intr_vector);
> >>
> >> Perhaps this was discussed before, but I don't recall and now
> >> wonder - why inside an if()? This is a simple memory write on
> >> x86.
> >
> > The initial purpose is that if NV is already equal to 'posted_intr_vector',
> > we can save the following atomically write operation. There are some
> > volatile stuff and barriers in write_atomic().
> 
> But what does the final generated code look like? As I said, I'm
> sure it's just a single MOV. And putting a conditional around it will
> likely make things slower rather than faster.

Looking at the implementation of wirte_atomic(), it has volatile key
word barrier inside, if you think this is not a big deal, I am fine to
remove the check.

> 
> >> > --- a/xen/drivers/passthrough/vtd/iommu.c
> >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> >> > @@ -2293,6 +2293,8 @@ static int reassign_device_ownership(
> >> >          pdev->domain = target;
> >> >      }
> >> >
> >> > +    vmx_pi_hooks_reassign(source, target);
> >> > +
> >> >      return ret;
> >> >  }
> >>
> >> I think you need to clear source's hooks here, but target's need to
> >> get set ahead of the actual assignment.
> >
> > I think this is the place where the device is moved from
> > &source->arch.pdev_list to &target->arch.pdev_list, if that is the
> > case, we should clear source and set target here, right?
> 
> No - target needs to be ready to deal with events from the device
> _before_ the device actually gets assigned to it.

I still don't get your point. I still think this is the place where the device
is being got assigned. Or maybe you can share more info about the
place "_before_ the device actually gets assigned to it " ?

Thanks,
Feng

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
  2016-02-02  4:48         ` Wu, Feng
@ 2016-02-02  9:53           ` Jan Beulich
  2016-02-16  6:33             ` Wu, Feng
  2016-02-10 12:25           ` George Dunlap
  1 sibling, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2016-02-02  9:53 UTC (permalink / raw)
  To: Feng Wu
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Dario Faggioli, xen-devel

>>> On 02.02.16 at 05:48, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Friday, January 29, 2016 5:32 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; Dario Faggioli
>> <dario.faggioli@citrix.com>; George Dunlap <george.dunlap@eu.citrix.com>;
>> Tian, Kevin <kevin.tian@intel.com>; xen-devel@lists.xen.org; Keir Fraser
>> <keir@xen.org>
>> Subject: RE: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
>> 
>> >>> On 29.01.16 at 02:53, <feng.wu@intel.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Friday, January 29, 2016 12:38 AM
>> >> >>> On 28.01.16 at 06:12, <feng.wu@intel.com> wrote:
>> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> > @@ -83,7 +83,140 @@ static int vmx_msr_write_intercept(unsigned int
>> msr,
>> >> uint64_t msr_content);
>> >> >  static void vmx_invlpg_intercept(unsigned long vaddr);
>> >> >  static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
>> >> >
>> >> > +static void vmx_vcpu_block(struct vcpu *v)
>> >> > +{
>> >> > +    unsigned long flags;
>> >> > +    unsigned int dest;
>> >> > +
>> >> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
>> >>
>> >> Stray blank line between declarations.
>> >>
>> >> > +    ASSERT(v->arch.hvm_vmx.pi_block_list_lock == NULL);
>> >>
>> >> I think this needs to be done after acquiring the lock.
>> >
>> > I am not sure what you mean here. The ASSERT here means that
>> > the vCPU is not on any pCPU list when we get here.
>> 
>> My point is that until you've taken the lock, whether the vCPU is on
>> any such list may change. All state updates are done under lock, so
>> checking for valid incoming state should be done under lock too, to
>> be most useful.
> 
> I don't think so. We get this function via vcpu_block()/do_poll() ->
> arch_vcpu_block() -> ... -> vmx_vcpu_block(), and the parameter
> of this function should point to the current vCPU, so when we
> get here, the v->arch.hvm_vmx.pi_block_list_lock should be NULL,
> which means the vCPU is not in any blocking. and no one can change
> it behind us this this moment.

Well, if we were to only think about how the code currently looks
like, then ASSERT()s like this one could be left out. Yet they're
there to also catch unexpected state when later another caller
or another place setting pi_block_list_lock to non-NULL get
introduced. With what you write, an alternative to what I've
suggested above might be to _also_ ASSERT(v == current), but
this would still be a weaker check than if the assertion was moved
into the locked region.

>> >> > +static void vmx_pi_do_resume(struct vcpu *v)
>> >> > +{
>> >> > +    unsigned long flags;
>> >> > +    spinlock_t *pi_block_list_lock;
>> >> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
>> >> > +
>> >> > +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
>> >> > +
>> >> > +    /*
>> >> > +     * Set 'NV' field back to posted_intr_vector, so the
>> >> > +     * Posted-Interrupts can be delivered to the vCPU when
>> >> > +     * it is running in non-root mode.
>> >> > +     */
>> >> > +    if ( pi_desc->nv != posted_intr_vector )
>> >> > +        write_atomic(&pi_desc->nv, posted_intr_vector);
>> >>
>> >> Perhaps this was discussed before, but I don't recall and now
>> >> wonder - why inside an if()? This is a simple memory write on
>> >> x86.
>> >
>> > The initial purpose is that if NV is already equal to 'posted_intr_vector',
>> > we can save the following atomically write operation. There are some
>> > volatile stuff and barriers in write_atomic().
>> 
>> But what does the final generated code look like? As I said, I'm
>> sure it's just a single MOV. And putting a conditional around it will
>> likely make things slower rather than faster.
> 
> Looking at the implementation of wirte_atomic(), it has volatile key
> word barrier inside, if you think this is not a big deal, I am fine to
> remove the check.

Once again - what I think matters only if it matches reality. I.e.
I may be wrong, and you are then free to correct me. That is,
rather than blindly changing the code as I suggest, why don't
you convince yourself which variant is the better one. It is for
that reason that I asked you to inspect the actual generated
code.

>> >> > --- a/xen/drivers/passthrough/vtd/iommu.c
>> >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
>> >> > @@ -2293,6 +2293,8 @@ static int reassign_device_ownership(
>> >> >          pdev->domain = target;
>> >> >      }
>> >> >
>> >> > +    vmx_pi_hooks_reassign(source, target);
>> >> > +
>> >> >      return ret;
>> >> >  }
>> >>
>> >> I think you need to clear source's hooks here, but target's need to
>> >> get set ahead of the actual assignment.
>> >
>> > I think this is the place where the device is moved from
>> > &source->arch.pdev_list to &target->arch.pdev_list, if that is the
>> > case, we should clear source and set target here, right?
>> 
>> No - target needs to be ready to deal with events from the device
>> _before_ the device actually gets assigned to it.
> 
> I still don't get your point. I still think this is the place where the 
> device
> is being got assigned. Or maybe you can share more info about the
> place "_before_ the device actually gets assigned to it " ?

The issue is with the sequence of events. Right now you assign
the device and only then prepare target to deal with it having a
device assigned. Whereas the needed sequence is the other
way around - prepare target, and only then assign the device.
I.e. you need to do things in this order (all inside this function)
- prepare target
- re-assign device
- clean up source

Jan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
  2016-02-02  4:48         ` Wu, Feng
  2016-02-02  9:53           ` Jan Beulich
@ 2016-02-10 12:25           ` George Dunlap
  2016-02-16  5:54             ` Wu, Feng
  1 sibling, 1 reply; 14+ messages in thread
From: George Dunlap @ 2016-02-10 12:25 UTC (permalink / raw)
  To: Wu, Feng, Jan Beulich
  Cc: Tian, Kevin, Keir Fraser, George Dunlap, Andrew Cooper,
	Dario Faggioli, xen-devel

On 02/02/16 04:48, Wu, Feng wrote:
>>>>> +static void vmx_pi_do_resume(struct vcpu *v)
>>>>> +{
>>>>> +    unsigned long flags;
>>>>> +    spinlock_t *pi_block_list_lock;
>>>>> +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
>>>>> +
>>>>> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
>>>>> +
>>>>> +    /*
>>>>> +     * Set 'NV' field back to posted_intr_vector, so the
>>>>> +     * Posted-Interrupts can be delivered to the vCPU when
>>>>> +     * it is running in non-root mode.
>>>>> +     */
>>>>> +    if ( pi_desc->nv != posted_intr_vector )
>>>>> +        write_atomic(&pi_desc->nv, posted_intr_vector);
>>>>
>>>> Perhaps this was discussed before, but I don't recall and now
>>>> wonder - why inside an if()? This is a simple memory write on
>>>> x86.
>>>
>>> The initial purpose is that if NV is already equal to 'posted_intr_vector',
>>> we can save the following atomically write operation. There are some
>>> volatile stuff and barriers in write_atomic().
>>
>> But what does the final generated code look like? As I said, I'm
>> sure it's just a single MOV. And putting a conditional around it will
>> likely make things slower rather than faster.
> 
> Looking at the implementation of wirte_atomic(), it has volatile key
> word barrier inside, if you think this is not a big deal, I am fine to
> remove the check.

Oh, right -- so set_sn and clear_sn use the set/clear bits, which are
atomic read-modify-write, which we'd like to avoid on the vmexit/vmentry
paths (which is why we have the scheduler hooks); but this is just a
straight up write, so it's OK.

>>>>> --- a/xen/drivers/passthrough/vtd/iommu.c
>>>>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>>>>> @@ -2293,6 +2293,8 @@ static int reassign_device_ownership(
>>>>>          pdev->domain = target;
>>>>>      }
>>>>>
>>>>> +    vmx_pi_hooks_reassign(source, target);
>>>>> +
>>>>>      return ret;
>>>>>  }
>>>>
>>>> I think you need to clear source's hooks here, but target's need to
>>>> get set ahead of the actual assignment.
>>>
>>> I think this is the place where the device is moved from
>>> &source->arch.pdev_list to &target->arch.pdev_list, if that is the
>>> case, we should clear source and set target here, right?
>>
>> No - target needs to be ready to deal with events from the device
>> _before_ the device actually gets assigned to it.
> 
> I still don't get your point. I still think this is the place where the device
> is being got assigned. Or maybe you can share more info about the
> place "_before_ the device actually gets assigned to it " ?

What happens if a device generates a PI interrupt *immediately* after
domain_context_mapping(), but before calling vmx_pi_hooks_reassign()?

 -George

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
  2016-01-28  5:12 ` [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling Feng Wu
  2016-01-28 16:38   ` Jan Beulich
@ 2016-02-10 12:35   ` George Dunlap
  2016-02-16  6:00     ` Wu, Feng
  1 sibling, 1 reply; 14+ messages in thread
From: George Dunlap @ 2016-02-10 12:35 UTC (permalink / raw)
  To: Feng Wu, xen-devel
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Dario Faggioli, Jan Beulich

On 28/01/16 05:12, Feng Wu wrote:
> This is the core logic handling for VT-d posted-interrupts. Basically it
> deals with how and when to update posted-interrupts during the following
> scenarios:
> - vCPU is preempted
> - vCPU is slept
> - vCPU is blocked
> 
> When vCPU is preempted/slept, we update the posted-interrupts during
> scheduling by introducing two new architecutral scheduler hooks:
> vmx_pi_switch_from() and vmx_pi_switch_to(). When vCPU is blocked, we
> introduce a new architectural hook: arch_vcpu_block() to update
> posted-interrupts descriptor.
> 
> Besides that, before VM-entry, we will make sure the 'NV' filed is set
> to 'posted_intr_vector' and the vCPU is not in any blocking lists, which
> is needed when vCPU is running in non-root mode. The reason we do this check
> is because we change the posted-interrupts descriptor in vcpu_block(),
> however, we don't change it back in vcpu_unblock() or when vcpu_block()
> directly returns due to event delivery (in fact, we don't need to do it
> in the two places, that is why we do it before VM-Entry).
> 
> When we handle the lazy context switch for the following two scenarios:
> - Preempted by a tasklet, which uses in an idle context.
> - the prev vcpu is in offline and no new available vcpus in run queue.
> We don't change the 'SN' bit in posted-interrupt descriptor, this
> may incur spurious PI notification events, but since PI notification
> event is only sent when 'ON' is clear, and once the PI notificatoin
> is sent, ON is set by hardware, hence no more notification events
> before 'ON' is clear. Besides that, spurious PI notification events are
> going to happen from time to time in Xen hypervisor, such as, when
> guests trap to Xen and PI notification event happens, there is
> nothing Xen actually needs to do about it, the interrupts will be
> delivered to guest atht the next time we do a VMENTRY.
> 
> CC: Keir Fraser <keir@xen.org>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: George Dunlap <george.dunlap@eu.citrix.com>
> CC: Dario Faggioli <dario.faggioli@citrix.com>
> Suggested-by: Yang Zhang <yang.z.zhang@intel.com>
> Suggested-by: Dario Faggioli <dario.faggioli@citrix.com>
> Suggested-by: George Dunlap <george.dunlap@eu.citrix.com>
> Suggested-by: Jan Beulich <jbeulich@suse.com>
> Signed-off-by: Feng Wu <feng.wu@intel.com>

Feng,

Thanks for your work on this.

Coming back to this thread after 5 months, what strikes me first of all
is that it would be good to have a comment somewhere laying out exactly
all the things that need to change for the different runstates with
posted interrupts, so that someone later trying to change things has an
overview of what invariants need to be kept.

What do you think about adding the following comment somewhere near the
PI callbacks? (Corrected for accuracy of course.)

---
To handle posted interrupts correctly, we need to set the following state:

* The PI notification vector (NV)
* The PI notification destination processor (NDST)
* The PI "suppress notification" bit (SN)
* The vcpu pi "blocked" list

If a VM is currently running, we want the PI delivered to the guest vcpu
on the proper pcpu (NDST = v->processor, SN clear).

If the vm is blocked, we want the PI delivered to Xen so that it can
wake it up  (SN clear, NV = pi_wakeup_vector, vcpu on block list).

If the VM is currently either preempted or offline (i.e., not running
because of some reason other than blocking waiting for an interrupt),
there's nothing Xen can do -- we want the interrupt pending bit set in
the guest, but we don't want to bother Xen with an interrupt (SN clear).

There's a brief window of time between vmx_intr_assist() and checking
softirqs where if an interrupt comes in it may be lost; so we need Xen
to get an interrupt and raise a softirq so that it will go through the
vmx_intr_assist() path again (SN clear, NV = posted_interrupt).

The way we implement this now is by looking at what needs to happen on
the following runstate transitions:

A: runnable -> running
 - SN = 0
 - NDST = v->processor
B: running -> runnable
 - SN = 1
C: running -> blocked
 - NV = pi_wakeup_vector
 - Add vcpu to blocked list
D: blocked -> runnable
- NV = posted_intr_vector
- Take vcpu off blocked list

For transitions A and B, we add hooks into vmx_ctxt_switch_{from,to} paths.

For transition C, we add a new arch hook, arch_vcpu_block(), which is
called from vcpu_block() and vcpu_do_poll().

For transition D, rather than add an extra arch hook on vcpu_wake, we
add a hook on the vmentry path which checks to see if either of the two
actions need to be taken.

These hooks only need to be called when the domain in question actually
has a physical device assigned to it, so we set and clear the callbacks
as appropriate when device assignment changes.
---

Is that about right?

If we had this, I don't think we'd need the comments in
vmx_pi_switch_{from,to}.

Laying things out this way, it also makes me wonder whether it might not
be more sensible / robust to set NDST on the vmentry path in the same
way we set NV.  But at this point it's just bikeshedding, so feel free
to leave it where it is.

Other than that -- and the details about placement of the ASSERT and the
hook reassignment -- it all looks good to me.

 -George

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
  2016-02-10 12:25           ` George Dunlap
@ 2016-02-16  5:54             ` Wu, Feng
  0 siblings, 0 replies; 14+ messages in thread
From: Wu, Feng @ 2016-02-16  5:54 UTC (permalink / raw)
  To: George Dunlap, Jan Beulich
  Cc: Tian, Kevin, Keir Fraser, George Dunlap, Andrew Cooper,
	Dario Faggioli, xen-devel, Wu, Feng



> -----Original Message-----
> From: George Dunlap [mailto:george.dunlap@citrix.com]
> Sent: Wednesday, February 10, 2016 8:25 PM
> To: Wu, Feng <feng.wu@intel.com>; Jan Beulich <JBeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; Dario Faggioli
> <dario.faggioli@citrix.com>; George Dunlap <george.dunlap@eu.citrix.com>;
> Tian, Kevin <kevin.tian@intel.com>; xen-devel@lists.xen.org; Keir Fraser
> <keir@xen.org>
> Subject: Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
> 
> On 02/02/16 04:48, Wu, Feng wrote:
> >>>>> --- a/xen/drivers/passthrough/vtd/iommu.c
> >>>>> +++ b/xen/drivers/passthrough/vtd/iommu.c
> >>>>> @@ -2293,6 +2293,8 @@ static int reassign_device_ownership(
> >>>>>          pdev->domain = target;
> >>>>>      }
> >>>>>
> >>>>> +    vmx_pi_hooks_reassign(source, target);
> >>>>> +
> >>>>>      return ret;
> >>>>>  }
> >>>>
> >>>> I think you need to clear source's hooks here, but target's need to
> >>>> get set ahead of the actual assignment.
> >>>
> >>> I think this is the place where the device is moved from
> >>> &source->arch.pdev_list to &target->arch.pdev_list, if that is the
> >>> case, we should clear source and set target here, right?
> >>
> >> No - target needs to be ready to deal with events from the device
> >> _before_ the device actually gets assigned to it.
> >
> > I still don't get your point. I still think this is the place where the device
> > is being got assigned. Or maybe you can share more info about the
> > place "_before_ the device actually gets assigned to it " ?
> 
> What happens if a device generates a PI interrupt *immediately* after
> domain_context_mapping(), but before calling vmx_pi_hooks_reassign()?

Sorry for the late response due to holiday.

Good question. But I don't think that can happen, since we only support
MSI/MSIx for PI, and MSI/MSIx is configured by guests after assign_device()
gets called.

Thanks,
Feng

> 
>  -George

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
  2016-02-10 12:35   ` George Dunlap
@ 2016-02-16  6:00     ` Wu, Feng
  0 siblings, 0 replies; 14+ messages in thread
From: Wu, Feng @ 2016-02-16  6:00 UTC (permalink / raw)
  To: George Dunlap, xen-devel
  Cc: Tian, Kevin, Keir Fraser, George Dunlap, Andrew Cooper,
	Dario Faggioli, Jan Beulich, Wu, Feng



> -----Original Message-----
> From: George Dunlap [mailto:george.dunlap@citrix.com]
> Sent: Wednesday, February 10, 2016 8:36 PM
> To: Wu, Feng <feng.wu@intel.com>; xen-devel@lists.xen.org
> Cc: Keir Fraser <keir@xen.org>; Jan Beulich <jbeulich@suse.com>; Andrew
> Cooper <andrew.cooper3@citrix.com>; Tian, Kevin <kevin.tian@intel.com>;
> George Dunlap <george.dunlap@eu.citrix.com>; Dario Faggioli
> <dario.faggioli@citrix.com>
> Subject: Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
> 
> On 28/01/16 05:12, Feng Wu wrote:
> > This is the core logic handling for VT-d posted-interrupts. Basically it
> > deals with how and when to update posted-interrupts during the following
> > scenarios:
> > - vCPU is preempted
> > - vCPU is slept
> > - vCPU is blocked
> >
> > When vCPU is preempted/slept, we update the posted-interrupts during
> > scheduling by introducing two new architecutral scheduler hooks:
> > vmx_pi_switch_from() and vmx_pi_switch_to(). When vCPU is blocked, we
> > introduce a new architectural hook: arch_vcpu_block() to update
> > posted-interrupts descriptor.
> >
> > Besides that, before VM-entry, we will make sure the 'NV' filed is set
> > to 'posted_intr_vector' and the vCPU is not in any blocking lists, which
> > is needed when vCPU is running in non-root mode. The reason we do this
> check
> > is because we change the posted-interrupts descriptor in vcpu_block(),
> > however, we don't change it back in vcpu_unblock() or when vcpu_block()
> > directly returns due to event delivery (in fact, we don't need to do it
> > in the two places, that is why we do it before VM-Entry).
> >
> > When we handle the lazy context switch for the following two scenarios:
> > - Preempted by a tasklet, which uses in an idle context.
> > - the prev vcpu is in offline and no new available vcpus in run queue.
> > We don't change the 'SN' bit in posted-interrupt descriptor, this
> > may incur spurious PI notification events, but since PI notification
> > event is only sent when 'ON' is clear, and once the PI notificatoin
> > is sent, ON is set by hardware, hence no more notification events
> > before 'ON' is clear. Besides that, spurious PI notification events are
> > going to happen from time to time in Xen hypervisor, such as, when
> > guests trap to Xen and PI notification event happens, there is
> > nothing Xen actually needs to do about it, the interrupts will be
> > delivered to guest atht the next time we do a VMENTRY.
> >
> > CC: Keir Fraser <keir@xen.org>
> > CC: Jan Beulich <jbeulich@suse.com>
> > CC: Andrew Cooper <andrew.cooper3@citrix.com>
> > CC: Kevin Tian <kevin.tian@intel.com>
> > CC: George Dunlap <george.dunlap@eu.citrix.com>
> > CC: Dario Faggioli <dario.faggioli@citrix.com>
> > Suggested-by: Yang Zhang <yang.z.zhang@intel.com>
> > Suggested-by: Dario Faggioli <dario.faggioli@citrix.com>
> > Suggested-by: George Dunlap <george.dunlap@eu.citrix.com>
> > Suggested-by: Jan Beulich <jbeulich@suse.com>
> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> 
> Feng,
> 
> Thanks for your work on this.
> 
> Coming back to this thread after 5 months, what strikes me first of all
> is that it would be good to have a comment somewhere laying out exactly
> all the things that need to change for the different runstates with
> posted interrupts, so that someone later trying to change things has an
> overview of what invariants need to be kept.
> 
> What do you think about adding the following comment somewhere near the
> PI callbacks? (Corrected for accuracy of course.)
> 
> ---
> To handle posted interrupts correctly, we need to set the following state:
> 
> * The PI notification vector (NV)
> * The PI notification destination processor (NDST)
> * The PI "suppress notification" bit (SN)
> * The vcpu pi "blocked" list
> 
> If a VM is currently running, we want the PI delivered to the guest vcpu
> on the proper pcpu (NDST = v->processor, SN clear).
> 
> If the vm is blocked, we want the PI delivered to Xen so that it can
> wake it up  (SN clear, NV = pi_wakeup_vector, vcpu on block list).
> 
> If the VM is currently either preempted or offline (i.e., not running
> because of some reason other than blocking waiting for an interrupt),
> there's nothing Xen can do -- we want the interrupt pending bit set in
> the guest, but we don't want to bother Xen with an interrupt (SN clear).
> 
> There's a brief window of time between vmx_intr_assist() and checking
> softirqs where if an interrupt comes in it may be lost; so we need Xen
> to get an interrupt and raise a softirq so that it will go through the
> vmx_intr_assist() path again (SN clear, NV = posted_interrupt).
> 
> The way we implement this now is by looking at what needs to happen on
> the following runstate transitions:
> 
> A: runnable -> running
>  - SN = 0
>  - NDST = v->processor
> B: running -> runnable
>  - SN = 1
> C: running -> blocked
>  - NV = pi_wakeup_vector
>  - Add vcpu to blocked list
> D: blocked -> runnable
> - NV = posted_intr_vector
> - Take vcpu off blocked list
> 
> For transitions A and B, we add hooks into vmx_ctxt_switch_{from,to} paths.
> 
> For transition C, we add a new arch hook, arch_vcpu_block(), which is
> called from vcpu_block() and vcpu_do_poll().
> 
> For transition D, rather than add an extra arch hook on vcpu_wake, we
> add a hook on the vmentry path which checks to see if either of the two
> actions need to be taken.
> 
> These hooks only need to be called when the domain in question actually
> has a physical device assigned to it, so we set and clear the callbacks
> as appropriate when device assignment changes.
> ---
> 
> Is that about right?

Perfect summary, I will add them. Thanks a lot, George!

Thanks,
Feng

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
  2016-02-02  9:53           ` Jan Beulich
@ 2016-02-16  6:33             ` Wu, Feng
  2016-02-16 10:23               ` Jan Beulich
  0 siblings, 1 reply; 14+ messages in thread
From: Wu, Feng @ 2016-02-16  6:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Keir Fraser, George Dunlap, Andrew Cooper,
	Dario Faggioli, xen-devel, Wu, Feng



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, February 2, 2016 5:53 PM
> To: Wu, Feng <feng.wu@intel.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; Dario Faggioli
> <dario.faggioli@citrix.com>; George Dunlap <george.dunlap@eu.citrix.com>;
> Tian, Kevin <kevin.tian@intel.com>; xen-devel@lists.xen.org; Keir Fraser
> <keir@xen.org>
> Subject: RE: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
> 
> >>> On 02.02.16 at 05:48, <feng.wu@intel.com> wrote:
> 
> >
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Friday, January 29, 2016 5:32 PM
> >> To: Wu, Feng <feng.wu@intel.com>
> >> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; Dario Faggioli
> >> <dario.faggioli@citrix.com>; George Dunlap <george.dunlap@eu.citrix.com>;
> >> Tian, Kevin <kevin.tian@intel.com>; xen-devel@lists.xen.org; Keir Fraser
> >> <keir@xen.org>
> >> Subject: RE: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
> >>
> >> >>> On 29.01.16 at 02:53, <feng.wu@intel.com> wrote:
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: Friday, January 29, 2016 12:38 AM
> >> >> >>> On 28.01.16 at 06:12, <feng.wu@intel.com> wrote:
> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> >> > @@ -83,7 +83,140 @@ static int vmx_msr_write_intercept(unsigned int
> >> msr,
> >> >> uint64_t msr_content);
> >> >> >  static void vmx_invlpg_intercept(unsigned long vaddr);
> >> >> >  static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
> >> >> >
> >> >> > +static void vmx_vcpu_block(struct vcpu *v)
> >> >> > +{
> >> >> > +    unsigned long flags;
> >> >> > +    unsigned int dest;
> >> >> > +
> >> >> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> >> >>
> >> >> Stray blank line between declarations.
> >> >>
> >> >> > +    ASSERT(v->arch.hvm_vmx.pi_block_list_lock == NULL);
> >> >>
> >> >> I think this needs to be done after acquiring the lock.
> >> >
> >> > I am not sure what you mean here. The ASSERT here means that
> >> > the vCPU is not on any pCPU list when we get here.
> >>
> >> My point is that until you've taken the lock, whether the vCPU is on
> >> any such list may change. All state updates are done under lock, so
> >> checking for valid incoming state should be done under lock too, to
> >> be most useful.
> >
> > I don't think so. We get this function via vcpu_block()/do_poll() ->
> > arch_vcpu_block() -> ... -> vmx_vcpu_block(), and the parameter
> > of this function should point to the current vCPU, so when we
> > get here, the v->arch.hvm_vmx.pi_block_list_lock should be NULL,
> > which means the vCPU is not in any blocking. and no one can change
> > it behind us this this moment.
> 
> Well, if we were to only think about how the code currently looks
> like, then ASSERT()s like this one could be left out. Yet they're
> there to also catch unexpected state when later another caller
> or another place setting pi_block_list_lock to non-NULL get
> introduced. With what you write, an alternative to what I've
> suggested above might be to _also_ ASSERT(v == current), but
> this would still be a weaker check than if the assertion was moved
> into the locked region.

Sorry for the late response, just back from holiday.

v->arch.hvm_vmx.pi_block_list_lock is assigned a value after this
ASSERT, that means in the locked region ' v->arch.hvm_vmx.pi_block_list_lock '
has a value hence not NULL. So no need to ASSERT and adding
ASSERT here is away from my original purpose, If you insist on moving
ASSERT to the locked region, I suggest to remove it.

> >> >> > --- a/xen/drivers/passthrough/vtd/iommu.c
> >> >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> >> >> > @@ -2293,6 +2293,8 @@ static int reassign_device_ownership(
> >> >> >          pdev->domain = target;
> >> >> >      }
> >> >> >
> >> >> > +    vmx_pi_hooks_reassign(source, target);
> >> >> > +
> >> >> >      return ret;
> >> >> >  }
> >> >>
> >> >> I think you need to clear source's hooks here, but target's need to
> >> >> get set ahead of the actual assignment.
> >> >
> >> > I think this is the place where the device is moved from
> >> > &source->arch.pdev_list to &target->arch.pdev_list, if that is the
> >> > case, we should clear source and set target here, right?
> >>
> >> No - target needs to be ready to deal with events from the device
> >> _before_ the device actually gets assigned to it.
> >
> > I still don't get your point. I still think this is the place where the
> > device
> > is being got assigned. Or maybe you can share more info about the
> > place "_before_ the device actually gets assigned to it " ?
> 
> The issue is with the sequence of events. Right now you assign
> the device and only then prepare target to deal with it having a
> device assigned. Whereas the needed sequence is the other
> way around - prepare target, and only then assign the device.
> I.e. you need to do things in this order (all inside this function)
> - prepare target
> - re-assign device
> - clean up source

Do you mean something like the follows?

@@ -2279,13 +2280,19 @@ static int reassign_device_ownership(
             }
     }

+    vmx_pi_hooks_assign(target);
+
     ret = domain_context_unmap(source, devfn, pdev);
-    if ( ret )
+    if ( ret ) {
+        vmx_pi_hooks_deassign(target);
         return ret;
+    }

     ret = domain_context_mapping(target, devfn, pdev);
-    if ( ret )
+    if ( ret ) {
+        vmx_pi_hooks_deassign(target);
         return ret;
+    }

     if ( devfn == pdev->devfn )
     {
@@ -2293,6 +2300,8 @@ static int reassign_device_ownership(
         pdev->domain = target;
     }

+    vmx_pi_hooks_deassign(source);
+
     return ret;
 }
 
Thanks,
Feng

> 
> Jan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
  2016-02-16  6:33             ` Wu, Feng
@ 2016-02-16 10:23               ` Jan Beulich
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2016-02-16 10:23 UTC (permalink / raw)
  To: Feng Wu
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Dario Faggioli, xen-devel

>>> On 16.02.16 at 07:33, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Tuesday, February 2, 2016 5:53 PM
>> To: Wu, Feng <feng.wu@intel.com>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; Dario Faggioli
>> <dario.faggioli@citrix.com>; George Dunlap <george.dunlap@eu.citrix.com>;
>> Tian, Kevin <kevin.tian@intel.com>; xen-devel@lists.xen.org; Keir Fraser
>> <keir@xen.org>
>> Subject: RE: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
>> 
>> >>> On 02.02.16 at 05:48, <feng.wu@intel.com> wrote:
>> 
>> >
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Friday, January 29, 2016 5:32 PM
>> >> To: Wu, Feng <feng.wu@intel.com>
>> >> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; Dario Faggioli
>> >> <dario.faggioli@citrix.com>; George Dunlap <george.dunlap@eu.citrix.com>;
>> >> Tian, Kevin <kevin.tian@intel.com>; xen-devel@lists.xen.org; Keir Fraser
>> >> <keir@xen.org>
>> >> Subject: RE: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
>> >>
>> >> >>> On 29.01.16 at 02:53, <feng.wu@intel.com> wrote:
>> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> Sent: Friday, January 29, 2016 12:38 AM
>> >> >> >>> On 28.01.16 at 06:12, <feng.wu@intel.com> wrote:
>> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> > @@ -83,7 +83,140 @@ static int vmx_msr_write_intercept(unsigned int
>> >> msr,
>> >> >> uint64_t msr_content);
>> >> >> >  static void vmx_invlpg_intercept(unsigned long vaddr);
>> >> >> >  static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
>> >> >> >
>> >> >> > +static void vmx_vcpu_block(struct vcpu *v)
>> >> >> > +{
>> >> >> > +    unsigned long flags;
>> >> >> > +    unsigned int dest;
>> >> >> > +
>> >> >> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
>> >> >>
>> >> >> Stray blank line between declarations.
>> >> >>
>> >> >> > +    ASSERT(v->arch.hvm_vmx.pi_block_list_lock == NULL);
>> >> >>
>> >> >> I think this needs to be done after acquiring the lock.
>> >> >
>> >> > I am not sure what you mean here. The ASSERT here means that
>> >> > the vCPU is not on any pCPU list when we get here.
>> >>
>> >> My point is that until you've taken the lock, whether the vCPU is on
>> >> any such list may change. All state updates are done under lock, so
>> >> checking for valid incoming state should be done under lock too, to
>> >> be most useful.
>> >
>> > I don't think so. We get this function via vcpu_block()/do_poll() ->
>> > arch_vcpu_block() -> ... -> vmx_vcpu_block(), and the parameter
>> > of this function should point to the current vCPU, so when we
>> > get here, the v->arch.hvm_vmx.pi_block_list_lock should be NULL,
>> > which means the vCPU is not in any blocking. and no one can change
>> > it behind us this this moment.
>> 
>> Well, if we were to only think about how the code currently looks
>> like, then ASSERT()s like this one could be left out. Yet they're
>> there to also catch unexpected state when later another caller
>> or another place setting pi_block_list_lock to non-NULL get
>> introduced. With what you write, an alternative to what I've
>> suggested above might be to _also_ ASSERT(v == current), but
>> this would still be a weaker check than if the assertion was moved
>> into the locked region.
> 
> Sorry for the late response, just back from holiday.
> 
> v->arch.hvm_vmx.pi_block_list_lock is assigned a value after this
> ASSERT, that means in the locked region ' v->arch.hvm_vmx.pi_block_list_lock '
> has a value hence not NULL. So no need to ASSERT and adding
> ASSERT here is away from my original purpose, If you insist on moving
> ASSERT to the locked region, I suggest to remove it.

The implication of course is that you would do the assignment
only after the assertion, inside the locked region. Perhaps the
assignment would even better be done atomically (e.g. via
cmpxchg() or xchg()), with an assertion checking the old value
to be NULL.

>> >> >> > --- a/xen/drivers/passthrough/vtd/iommu.c
>> >> >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
>> >> >> > @@ -2293,6 +2293,8 @@ static int reassign_device_ownership(
>> >> >> >          pdev->domain = target;
>> >> >> >      }
>> >> >> >
>> >> >> > +    vmx_pi_hooks_reassign(source, target);
>> >> >> > +
>> >> >> >      return ret;
>> >> >> >  }
>> >> >>
>> >> >> I think you need to clear source's hooks here, but target's need to
>> >> >> get set ahead of the actual assignment.
>> >> >
>> >> > I think this is the place where the device is moved from
>> >> > &source->arch.pdev_list to &target->arch.pdev_list, if that is the
>> >> > case, we should clear source and set target here, right?
>> >>
>> >> No - target needs to be ready to deal with events from the device
>> >> _before_ the device actually gets assigned to it.
>> >
>> > I still don't get your point. I still think this is the place where the
>> > device
>> > is being got assigned. Or maybe you can share more info about the
>> > place "_before_ the device actually gets assigned to it " ?
>> 
>> The issue is with the sequence of events. Right now you assign
>> the device and only then prepare target to deal with it having a
>> device assigned. Whereas the needed sequence is the other
>> way around - prepare target, and only then assign the device.
>> I.e. you need to do things in this order (all inside this function)
>> - prepare target
>> - re-assign device
>> - clean up source
> 
> Do you mean something like the follows?
> 
> @@ -2279,13 +2280,19 @@ static int reassign_device_ownership(
>              }
>      }
> 
> +    vmx_pi_hooks_assign(target);
> +
>      ret = domain_context_unmap(source, devfn, pdev);
> -    if ( ret )
> +    if ( ret ) {
> +        vmx_pi_hooks_deassign(target);
>          return ret;
> +    }
> 
>      ret = domain_context_mapping(target, devfn, pdev);
> -    if ( ret )
> +    if ( ret ) {
> +        vmx_pi_hooks_deassign(target);
>          return ret;
> +    }
> 
>      if ( devfn == pdev->devfn )
>      {
> @@ -2293,6 +2300,8 @@ static int reassign_device_ownership(
>          pdev->domain = target;
>      }
> 
> +    vmx_pi_hooks_deassign(source);
> +
>      return ret;
>  }

Something along these lines, yes, except that of course target
doesn't need to be set up before the source unmap (resulting in
fewer changes and slight less involved error handling).

Jan

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-02-16 10:23 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-28  5:12 [PATCH v11 0/2] Add VT-d Posted-Interrupts support Feng Wu
2016-01-28  5:12 ` [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling Feng Wu
2016-01-28 16:38   ` Jan Beulich
2016-01-29  1:53     ` Wu, Feng
2016-01-29  9:31       ` Jan Beulich
2016-02-02  4:48         ` Wu, Feng
2016-02-02  9:53           ` Jan Beulich
2016-02-16  6:33             ` Wu, Feng
2016-02-16 10:23               ` Jan Beulich
2016-02-10 12:25           ` George Dunlap
2016-02-16  5:54             ` Wu, Feng
2016-02-10 12:35   ` George Dunlap
2016-02-16  6:00     ` Wu, Feng
2016-01-28  5:12 ` [PATCH v11 2/2] Add a command line parameter for VT-d posted-interrupts Feng Wu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.