All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/10] Introduce AMD SVM AVIC
@ 2016-12-31  5:45 Suravee Suthikulpanit
  2016-12-31  5:45 ` [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops Suravee Suthikulpanit
                   ` (9 more replies)
  0 siblings, 10 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2016-12-31  5:45 UTC (permalink / raw)
  To: xen-devel
  Cc: jbeulich, andrew.cooper3, Suravee Suthikulpanit, sherry.hurwitz,
	boris.ostrovsky

CHANGES FROM RFC PATCH (v1):
=========================
https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01815.html
  * Rebased to Xen-4.8
  * Miscellaneous styling changes based on review comments from Konrad, Jan.
  * Add key handle for dumping AVIC-related statistics.
  * Add a new section in xen-command-line.markdown regarding the svm-avic option.
  * [From Konrad comments]
    - Fix avic_handle_dfr_update() in patch 6/10. Previously, the avic_last_ldr
      was incorrectly declared as per-domain, while it should have been
      per-vcpu.
    - Fix the logic in avic_handle_dfr_update(). The APIC logical id table
      (d->avic_log_ait_mfn) is per domain. So, clear_domain_page() should
      have been executed by the first vcpu that changes the domain DFR, and
      not every vcpu.
  * [From Konrad comments]
    - Replace arch_svm_struct.avic_bk_pg with vlapic->regs_page.
    - Rename avic_vcpu_put() to avic_vcpu_unload().
    - Fix logic to not intercept CR8.

GITHUB
======
Latest git tree can be found at:
    http://github.com/ssuthiku/xen.git    xen_avic_part1_v2

OVERVIEW
========
This patch set is the first of the two-part patch series to introduce 
the new AMD Advance Virtual Interrupt Controller (AVIC) support.

Basically, SVM AVIC hardware virtualizes local APIC registers of each
vCPU via the virtual APIC (vAPIC) backing page. This allows guest access
to certain APIC registers without the need to emulate the hardware behavior
in the hypervisor. More information about AVIC can be found in the
AMD64 Architecture Programmer’s Manual Volume 2 - System Programming.

  http://support.amd.com/TechDocs/24593.pdf

For SVM AVIC, we extend the existing SVM driver to:
  * Check CPUID to detect AVIC support in the processor
  * Program new fields in VMCB to enable AVIC
  * Introduce new AVIC data structures and add code to manage them
  * Handle two new AVIC #VMEXITs
  * Add new interrupt injection code using vAPIC backing page
    instead of the existing V_IRQ, V_INTR_PRIO, V_INTR_VECTOR,
    and V_IGN_TPR fields

Currently, this patch series does not enable AVIC by default since
it does not support nested-VM yet. Until then, users can enable
SVM AVIC by specifying Xen parameter svm-avic=1.

Later, in part 2, we will introduce the IOMMU AVIC support, which
provides speed up for PCI device pass-through use case by allowing
the IOMMU hardware to inject interrupt directly into the guest via
the vAPIC backing page.

OVERALL PERFORMANCE
===================
Currently, AVIC is available on the AMD family 15h models 6Xh
(Carrizo) processors and newer. Here, the Carizzo is used to collect
performance data shown below.

Generally, SVM AVIC alone (w/o IOMMU AVIC) should provide overall speed up
for HVM guest since it does not require #vmexit into the hypervisor to
emulate certain guest accesses to local APIC registers.

It should also improve performance when hypervisor wants to inject
interrupts into a running vcpu by setting the corresponded IRR
bit in the vAPIC backing page and trigger AVIC_DOORBELL MSR.

For sending IPI interrupts between running vcpus in Linux guest,
Xen is default to using event channel.  However, in case of
non-paravirtualize guest, AVIC can also provide performance
improvements for sending IPI.

BENCHMARK 1: HACKBENCH
======================

For measuring IPI performance used for scheduling workload, I have collected
some performance number on 2 and 3 CPU running hackbech with the following
detail:

  hackbench -p -l 100000
  Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
  Each sender will pass 100000 messages of 100 bytes

                         |  2 vcpus (sec) |  3 vcpus (sec)   
  --------------------------------------------------------
    No AVIC w/  xen_nopv |     299.57     |    337.779
    No AVIC w/o xen_nopv |     270.37     |    419.6064 
       AVIC w/o xen_nopv |     181.46     |    171.7957
       AVIC w/  xen_nopv |     171.81     |    169.0858

CURRENT UNTESTED USE-CASES
===========================
    - Nested VM

Any feedback and comments are very much appreciated.

Thank you,
Suravee

Suravee Suthikulpanit (10):
  x86/HVM: Introduce struct hvm_pi_ops
  x86/vLAPIC: Declare vlapic_read_aligned() and vlapic_reg_write() as
    non-static
  x86/HVM: Call vlapic_destroy after vcpu_destroy
  x86/SVM: Modify VMCB fields to add AVIC support
  x86/HVM/SVM: Add AVIC initialization code
  x86/SVM: Add AVIC vmexit handlers
  x86/SVM: Add vcpu scheduling support for AVIC
  x86/SVM: Add interrupt management code via AVIC
  x86/SVM: Hook up miscellaneous AVIC functions
  x86/SVM: Add AMD AVIC key handler

 docs/misc/xen-command-line.markdown |   9 +
 xen/arch/x86/hvm/hvm.c              |   4 +-
 xen/arch/x86/hvm/svm/Makefile       |   1 +
 xen/arch/x86/hvm/svm/avic.c         | 723 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/intr.c         |   4 +
 xen/arch/x86/hvm/svm/svm.c          |  64 +++-
 xen/arch/x86/hvm/svm/vmcb.c         |   3 +
 xen/arch/x86/hvm/vlapic.c           |   5 +-
 xen/arch/x86/hvm/vmx/vmx.c          |  32 +-
 xen/include/asm-x86/apic.h          |   2 +
 xen/include/asm-x86/hvm/domain.h    |  63 ++++
 xen/include/asm-x86/hvm/hvm.h       |   4 +-
 xen/include/asm-x86/hvm/svm/avic.h  |  47 +++
 xen/include/asm-x86/hvm/svm/svm.h   |   2 +
 xen/include/asm-x86/hvm/svm/vmcb.h  |  52 ++-
 xen/include/asm-x86/hvm/vlapic.h    |   4 +
 xen/include/asm-x86/hvm/vmx/vmcs.h  |  59 ---
 17 files changed, 981 insertions(+), 97 deletions(-)
 create mode 100644 xen/arch/x86/hvm/svm/avic.c
 create mode 100644 xen/include/asm-x86/hvm/svm/avic.h

-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops
  2016-12-31  5:45 [PATCH v2 00/10] Introduce AMD SVM AVIC Suravee Suthikulpanit
@ 2016-12-31  5:45 ` Suravee Suthikulpanit
  2017-01-05  2:54   ` Tian, Kevin
  2017-01-05 15:51   ` Jan Beulich
  2016-12-31  5:45 ` [PATCH v2 02/10] x86/vLAPIC: Declare vlapic_read_aligned() and vlapic_reg_write() as non-static Suravee Suthikulpanit
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2016-12-31  5:45 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, jbeulich, Jun Nakajima, andrew.cooper3,
	Suravee Suthikulpanit, sherry.hurwitz, boris.ostrovsky

The current function pointers for managing hvm posted interrupt
can be used also by SVM AVIC. Therefore, this patch introduces the
struct hvm_pi_ops in the struct hvm_domain to hold them.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c         | 32 +++++++++----------
 xen/include/asm-x86/hvm/domain.h   | 63 ++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/hvm.h      |  4 +--
 xen/include/asm-x86/hvm/vmx/vmcs.h | 59 -----------------------------------
 4 files changed, 81 insertions(+), 77 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 7b2c50c..3f6d888 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -204,12 +204,12 @@ void vmx_pi_hooks_assign(struct domain *d)
     if ( !iommu_intpost || !has_hvm_container_domain(d) )
         return;
 
-    ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
+    ASSERT(!d->arch.hvm_domain.pi_ops.vcpu_block);
 
-    d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
-    d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
-    d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
-    d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
+    d->arch.hvm_domain.pi_ops.vcpu_block = vmx_vcpu_block;
+    d->arch.hvm_domain.pi_ops.pi_switch_from = vmx_pi_switch_from;
+    d->arch.hvm_domain.pi_ops.pi_switch_to = vmx_pi_switch_to;
+    d->arch.hvm_domain.pi_ops.pi_do_resume = vmx_pi_do_resume;
 }
 
 /* This function is called when pcidevs_lock is held */
@@ -218,12 +218,12 @@ void vmx_pi_hooks_deassign(struct domain *d)
     if ( !iommu_intpost || !has_hvm_container_domain(d) )
         return;
 
-    ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
+    ASSERT(d->arch.hvm_domain.pi_ops.vcpu_block);
 
-    d->arch.hvm_domain.vmx.vcpu_block = NULL;
-    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
-    d->arch.hvm_domain.vmx.pi_switch_to = NULL;
-    d->arch.hvm_domain.vmx.pi_do_resume = NULL;
+    d->arch.hvm_domain.pi_ops.vcpu_block = NULL;
+    d->arch.hvm_domain.pi_ops.pi_switch_from = NULL;
+    d->arch.hvm_domain.pi_ops.pi_switch_to = NULL;
+    d->arch.hvm_domain.pi_ops.pi_do_resume = NULL;
 }
 
 static int vmx_domain_initialise(struct domain *d)
@@ -901,8 +901,8 @@ static void vmx_ctxt_switch_from(struct vcpu *v)
     vmx_restore_host_msrs();
     vmx_save_dr(v);
 
-    if ( v->domain->arch.hvm_domain.vmx.pi_switch_from )
-        v->domain->arch.hvm_domain.vmx.pi_switch_from(v);
+    if ( v->domain->arch.hvm_domain.pi_ops.pi_switch_from )
+        v->domain->arch.hvm_domain.pi_ops.pi_switch_from(v);
 }
 
 static void vmx_ctxt_switch_to(struct vcpu *v)
@@ -916,8 +916,8 @@ static void vmx_ctxt_switch_to(struct vcpu *v)
     vmx_restore_guest_msrs(v);
     vmx_restore_dr(v);
 
-    if ( v->domain->arch.hvm_domain.vmx.pi_switch_to )
-        v->domain->arch.hvm_domain.vmx.pi_switch_to(v);
+    if ( v->domain->arch.hvm_domain.pi_ops.pi_switch_to )
+        v->domain->arch.hvm_domain.pi_ops.pi_switch_to(v);
 }
 
 
@@ -3963,8 +3963,8 @@ void vmx_vmenter_helper(const struct cpu_user_regs *regs)
     struct hvm_vcpu_asid *p_asid;
     bool_t need_flush;
 
-    if ( curr->domain->arch.hvm_domain.vmx.pi_do_resume )
-        curr->domain->arch.hvm_domain.vmx.pi_do_resume(curr);
+    if ( curr->domain->arch.hvm_domain.pi_ops.pi_do_resume )
+        curr->domain->arch.hvm_domain.pi_ops.pi_do_resume(curr);
 
     if ( !cpu_has_vmx_vpid )
         goto out;
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index f34d784..779927b 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -72,6 +72,67 @@ struct hvm_ioreq_server {
     bool_t                 bufioreq_atomic;
 };
 
+struct hvm_pi_ops {
+    /*
+     * To handle posted interrupts correctly, we need to set the following
+     * state:
+     *
+     * * The PI notification vector (NV)
+     * * The PI notification destination processor (NDST)
+     * * The PI "suppress notification" bit (SN)
+     * * The vcpu pi "blocked" list
+     *
+     * If a VM is currently running, we want the PI delivered to the guest vcpu
+     * on the proper pcpu (NDST = v->processor, SN clear).
+     *
+     * If the vm is blocked, we want the PI delivered to Xen so that it can
+     * wake it up  (SN clear, NV = pi_wakeup_vector, vcpu on block list).
+     *
+     * If the VM is currently either preempted or offline (i.e., not running
+     * because of some reason other than blocking waiting for an interrupt),
+     * there's nothing Xen can do -- we want the interrupt pending bit set in
+     * the guest, but we don't want to bother Xen with an interrupt (SN clear).
+     *
+     * There's a brief window of time between vmx_intr_assist() and checking
+     * softirqs where if an interrupt comes in it may be lost; so we need Xen
+     * to get an interrupt and raise a softirq so that it will go through the
+     * vmx_intr_assist() path again (SN clear, NV = posted_interrupt).
+     *
+     * The way we implement this now is by looking at what needs to happen on
+     * the following runstate transitions:
+     *
+     * A: runnable -> running
+     *  - SN = 0
+     *  - NDST = v->processor
+     * B: running -> runnable
+     *  - SN = 1
+     * C: running -> blocked
+     *  - NV = pi_wakeup_vector
+     *  - Add vcpu to blocked list
+     * D: blocked -> runnable
+     *  - NV = posted_intr_vector
+     *  - Take vcpu off blocked list
+     *
+     * For transitions A and B, we add hooks into vmx_ctxt_switch_{from,to}
+     * paths.
+     *
+     * For transition C, we add a new arch hook, arch_vcpu_block(), which is
+     * called from vcpu_block() and vcpu_do_poll().
+     *
+     * For transition D, rather than add an extra arch hook on vcpu_wake, we
+     * add a hook on the vmentry path which checks to see if either of the two
+     * actions need to be taken.
+     *
+     * These hooks only need to be called when the domain in question actually
+     * has a physical device assigned to it, so we set and clear the callbacks
+     * as appropriate when device assignment changes.
+     */
+    void (*vcpu_block) (struct vcpu *);
+    void (*pi_switch_from) (struct vcpu *v);
+    void (*pi_switch_to) (struct vcpu *v);
+    void (*pi_do_resume) (struct vcpu *v);
+};
+
 struct hvm_domain {
     /* Guest page range used for non-default ioreq servers */
     struct {
@@ -148,6 +209,8 @@ struct hvm_domain {
         struct list_head list;
     } write_map;
 
+    struct hvm_pi_ops pi_ops;
+
     union {
         struct vmx_domain vmx;
         struct svm_domain svm;
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 7e7462e..b1e4c75 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -638,8 +638,8 @@ unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, bool_t restore);
     struct vcpu *v_ = (v);                                      \
     struct domain *d_ = v_->domain;                             \
     if ( has_hvm_container_domain(d_) &&                        \
-         (cpu_has_vmx && d_->arch.hvm_domain.vmx.vcpu_block) )  \
-        d_->arch.hvm_domain.vmx.vcpu_block(v_);                 \
+         (d_->arch.hvm_domain.pi_ops.vcpu_block) )          \
+        d_->arch.hvm_domain.pi_ops.vcpu_block(v_);          \
 })
 
 #endif /* __ASM_X86_HVM_HVM_H__ */
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 997f4f5..4ec8b08 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -77,65 +77,6 @@ struct vmx_domain {
     unsigned long apic_access_mfn;
     /* VMX_DOMAIN_* */
     unsigned int status;
-
-    /*
-     * To handle posted interrupts correctly, we need to set the following
-     * state:
-     *
-     * * The PI notification vector (NV)
-     * * The PI notification destination processor (NDST)
-     * * The PI "suppress notification" bit (SN)
-     * * The vcpu pi "blocked" list
-     *
-     * If a VM is currently running, we want the PI delivered to the guest vcpu
-     * on the proper pcpu (NDST = v->processor, SN clear).
-     *
-     * If the vm is blocked, we want the PI delivered to Xen so that it can
-     * wake it up  (SN clear, NV = pi_wakeup_vector, vcpu on block list).
-     *
-     * If the VM is currently either preempted or offline (i.e., not running
-     * because of some reason other than blocking waiting for an interrupt),
-     * there's nothing Xen can do -- we want the interrupt pending bit set in
-     * the guest, but we don't want to bother Xen with an interrupt (SN clear).
-     *
-     * There's a brief window of time between vmx_intr_assist() and checking
-     * softirqs where if an interrupt comes in it may be lost; so we need Xen
-     * to get an interrupt and raise a softirq so that it will go through the
-     * vmx_intr_assist() path again (SN clear, NV = posted_interrupt).
-     *
-     * The way we implement this now is by looking at what needs to happen on
-     * the following runstate transitions:
-     *
-     * A: runnable -> running
-     *  - SN = 0
-     *  - NDST = v->processor
-     * B: running -> runnable
-     *  - SN = 1
-     * C: running -> blocked
-     *  - NV = pi_wakeup_vector
-     *  - Add vcpu to blocked list
-     * D: blocked -> runnable
-     *  - NV = posted_intr_vector
-     *  - Take vcpu off blocked list
-     *
-     * For transitions A and B, we add hooks into vmx_ctxt_switch_{from,to}
-     * paths.
-     *
-     * For transition C, we add a new arch hook, arch_vcpu_block(), which is
-     * called from vcpu_block() and vcpu_do_poll().
-     *
-     * For transition D, rather than add an extra arch hook on vcpu_wake, we
-     * add a hook on the vmentry path which checks to see if either of the two
-     * actions need to be taken.
-     *
-     * These hooks only need to be called when the domain in question actually
-     * has a physical device assigned to it, so we set and clear the callbacks
-     * as appropriate when device assignment changes.
-     */
-    void (*vcpu_block) (struct vcpu *);
-    void (*pi_switch_from) (struct vcpu *v);
-    void (*pi_switch_to) (struct vcpu *v);
-    void (*pi_do_resume) (struct vcpu *v);
 };
 
 struct pi_desc {
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 02/10] x86/vLAPIC: Declare vlapic_read_aligned() and vlapic_reg_write() as non-static
  2016-12-31  5:45 [PATCH v2 00/10] Introduce AMD SVM AVIC Suravee Suthikulpanit
  2016-12-31  5:45 ` [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops Suravee Suthikulpanit
@ 2016-12-31  5:45 ` Suravee Suthikulpanit
  2017-01-05 15:53   ` Jan Beulich
  2016-12-31  5:45 ` [PATCH v2 03/10] x86/HVM: Call vlapic_destroy after vcpu_destroy Suravee Suthikulpanit
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 48+ messages in thread
From: Suravee Suthikulpanit @ 2016-12-31  5:45 UTC (permalink / raw)
  To: xen-devel
  Cc: jbeulich, andrew.cooper3, Suravee Suthikulpanit, sherry.hurwitz,
	boris.ostrovsky

Expose vlapic_read_aligned and vlapic_reg_write() to be used by AVIC.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/vlapic.c        | 5 ++---
 xen/include/asm-x86/hvm/vlapic.h | 4 ++++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 1d5d287..0f52067 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -562,7 +562,7 @@ static void vlapic_set_tdcr(struct vlapic *vlapic, unsigned int val)
                 "timer_divisor: %d", vlapic->hw.timer_divisor);
 }
 
-static uint32_t vlapic_read_aligned(struct vlapic *vlapic, unsigned int offset)
+uint32_t vlapic_read_aligned(struct vlapic *vlapic, unsigned int offset)
 {
     switch ( offset )
     {
@@ -680,8 +680,7 @@ static void vlapic_tdt_pt_cb(struct vcpu *v, void *data)
     vcpu_vlapic(v)->hw.tdt_msr = 0;
 }
 
-static void vlapic_reg_write(struct vcpu *v,
-                             unsigned int offset, uint32_t val)
+void vlapic_reg_write(struct vcpu *v, unsigned int offset, uint32_t val)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
 
diff --git a/xen/include/asm-x86/hvm/vlapic.h b/xen/include/asm-x86/hvm/vlapic.h
index 4656293..48ab3a6 100644
--- a/xen/include/asm-x86/hvm/vlapic.h
+++ b/xen/include/asm-x86/hvm/vlapic.h
@@ -132,6 +132,10 @@ void vlapic_ipi(struct vlapic *vlapic, uint32_t icr_low, uint32_t icr_high);
 
 int vlapic_apicv_write(struct vcpu *v, unsigned int offset);
 
+void vlapic_reg_write(struct vcpu *v, unsigned int offset, uint32_t val);
+
+uint32_t vlapic_read_aligned(struct vlapic *vlapic, unsigned int offset);
+
 struct vlapic *vlapic_lowest_prio(
     struct domain *d, const struct vlapic *source,
     int short_hand, uint32_t dest, bool_t dest_mode);
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 03/10] x86/HVM: Call vlapic_destroy after vcpu_destroy
  2016-12-31  5:45 [PATCH v2 00/10] Introduce AMD SVM AVIC Suravee Suthikulpanit
  2016-12-31  5:45 ` [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops Suravee Suthikulpanit
  2016-12-31  5:45 ` [PATCH v2 02/10] x86/vLAPIC: Declare vlapic_read_aligned() and vlapic_reg_write() as non-static Suravee Suthikulpanit
@ 2016-12-31  5:45 ` Suravee Suthikulpanit
  2017-01-05  2:56   ` Tian, Kevin
  2017-01-05 15:56   ` Jan Beulich
  2016-12-31  5:45 ` [PATCH v2 04/10] x86/SVM: Modify VMCB fields to add AVIC support Suravee Suthikulpanit
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2016-12-31  5:45 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, jbeulich, Jun Nakajima, andrew.cooper3,
	Suravee Suthikulpanit, sherry.hurwitz, boris.ostrovsky

Since vlapic_init() is called before vcpu_initialise().
We should call the destroy functions in the the reverse order here.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/hvm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 25dc759..d573f0f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1613,10 +1613,10 @@ void hvm_vcpu_destroy(struct vcpu *v)
     tasklet_kill(&v->arch.hvm_vcpu.assert_evtchn_irq_tasklet);
     hvm_vcpu_cacheattr_destroy(v);
 
+    hvm_funcs.vcpu_destroy(v);
+
     if ( is_hvm_vcpu(v) )
         vlapic_destroy(v);
-
-    hvm_funcs.vcpu_destroy(v);
 }
 
 void hvm_vcpu_down(struct vcpu *v)
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 04/10] x86/SVM: Modify VMCB fields to add AVIC support
  2016-12-31  5:45 [PATCH v2 00/10] Introduce AMD SVM AVIC Suravee Suthikulpanit
                   ` (2 preceding siblings ...)
  2016-12-31  5:45 ` [PATCH v2 03/10] x86/HVM: Call vlapic_destroy after vcpu_destroy Suravee Suthikulpanit
@ 2016-12-31  5:45 ` Suravee Suthikulpanit
  2016-12-31  5:45 ` [PATCH v2 05/10] x86/HVM/SVM: Add AVIC initialization code Suravee Suthikulpanit
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2016-12-31  5:45 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, andrew.cooper3, Suravee Suthikulpanit,
	sherry.hurwitz, boris.ostrovsky

Introduce AVIC-related VMCB fields.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/include/asm-x86/hvm/svm/vmcb.h | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
index bad2382..43cb98e 100644
--- a/xen/include/asm-x86/hvm/svm/vmcb.h
+++ b/xen/include/asm-x86/hvm/svm/vmcb.h
@@ -328,14 +328,15 @@ typedef union __packed
     struct 
     {
         u64 tpr:          8;
-        u64 irq:          1;
+        u64 irq:          1; /* ignored in avic mode */
         u64 rsvd0:        7;
-        u64 prio:         4;
-        u64 ign_tpr:      1;
+        u64 prio:         4; /* ignored in avic mode */
+        u64 ign_tpr:      1; /* ignored in avic mode */
         u64 rsvd1:        3;
         u64 intr_masking: 1;
-        u64 rsvd2:        7;
-        u64 vector:       8;
+        u64 rsvd2:        6;
+        u64 avic_enable:  1;
+        u64 vector:       8; /* ignored in avic mode */
         u64 rsvd3:       24;
     } fields;
 } vintr_t;
@@ -394,7 +395,8 @@ typedef union __packed
         uint32_t cr2: 1;
         /* debugctlmsr, last{branch,int}{to,from}ip */
         uint32_t lbr: 1;
-        uint32_t resv: 21;
+        uint32_t avic: 1;
+        uint32_t resv: 20;
     } fields;
 } vmcbcleanbits_t;
 
@@ -428,7 +430,8 @@ struct __packed vmcb_struct {
     u64 exitinfo2;              /* offset 0x80 */
     eventinj_t  exitintinfo;    /* offset 0x88 */
     u64 _np_enable;             /* offset 0x90 - cleanbit 4 */
-    u64 res08[2];
+    u64 avic_vapic_bar;         /* offset 0x98 */
+    u64 res08;                  /* offset 0xA0 */
     eventinj_t  eventinj;       /* offset 0xA8 */
     u64 _h_cr3;                 /* offset 0xB0 - cleanbit 4 */
     lbrctrl_t lbr_control;      /* offset 0xB8 */
@@ -437,7 +440,11 @@ struct __packed vmcb_struct {
     u64 nextrip;                /* offset 0xC8 */
     u8  guest_ins_len;          /* offset 0xD0 */
     u8  guest_ins[15];          /* offset 0xD1 */
-    u64 res10a[100];            /* offset 0xE0 pad to save area */
+    u64 avic_bk_pg_pa;          /* offset 0xE0 */
+    u64 res09a;                 /* offset 0xE8 */
+    u64 avic_log_apic_id;       /* offset 0xF0 */
+    u64 avic_phy_apic_id;       /* offset 0xF8 */
+    u64 res09b[96];             /* offset 0x100 pad to save area */
 
     svm_segment_register_t es;  /* offset 1024 - cleanbit 8 */
     svm_segment_register_t cs;  /* cleanbit 8 */
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 05/10] x86/HVM/SVM: Add AVIC initialization code
  2016-12-31  5:45 [PATCH v2 00/10] Introduce AMD SVM AVIC Suravee Suthikulpanit
                   ` (3 preceding siblings ...)
  2016-12-31  5:45 ` [PATCH v2 04/10] x86/SVM: Modify VMCB fields to add AVIC support Suravee Suthikulpanit
@ 2016-12-31  5:45 ` Suravee Suthikulpanit
  2017-01-02 16:37   ` Andrew Cooper
  2017-01-03 14:54   ` Boris Ostrovsky
  2016-12-31  5:45 ` [PATCH v2 06/10] x86/SVM: Add AVIC vmexit handlers Suravee Suthikulpanit
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2016-12-31  5:45 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, andrew.cooper3, Suravee Suthikulpanit,
	sherry.hurwitz, boris.ostrovsky

Introduce AVIC base initialization code. This includes:
    * Setting up per-VM data structures.
    * Setting up per-vCPU data structure.
    * Initializing AVIC-related VMCB bit fields.

This patch also introduces a new Xen parameter (svm-avic),
which can be used to enable/disable AVIC support.
Currently, this svm-avic is disabled by default since
it does not supported nested virtualization yet.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 docs/misc/xen-command-line.markdown |   9 ++
 xen/arch/x86/hvm/svm/Makefile       |   1 +
 xen/arch/x86/hvm/svm/avic.c         | 232 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/svm.c          |   9 +-
 xen/arch/x86/hvm/svm/vmcb.c         |   3 +
 xen/include/asm-x86/hvm/svm/avic.h  |  40 +++++++
 xen/include/asm-x86/hvm/svm/svm.h   |   2 +
 xen/include/asm-x86/hvm/svm/vmcb.h  |  21 ++++
 8 files changed, 316 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/x86/hvm/svm/avic.c
 create mode 100644 xen/include/asm-x86/hvm/svm/avic.h

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 0138978..1139099 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1457,6 +1457,15 @@ enforces the maximum theoretically necessary timeout of 670ms. Any number
 is being interpreted as a custom timeout in milliseconds. Zero or boolean
 false disable the quirk workaround, which is also the default.
 
+### svm\_avic
+> `= <boolean>`
+
+> Default: `false`
+
+This option enables Advance Virtual Interrupt Controller (AVIC),
+which is an extension of AMD Secure Virtual Machine (SVM) to vitualize
+local APIC for guest VM.
+
 ### sync\_console
 > `= <boolean>`
 
diff --git a/xen/arch/x86/hvm/svm/Makefile b/xen/arch/x86/hvm/svm/Makefile
index 760d295..e0e4a59 100644
--- a/xen/arch/x86/hvm/svm/Makefile
+++ b/xen/arch/x86/hvm/svm/Makefile
@@ -1,4 +1,5 @@
 obj-y += asid.o
+obj-y += avic.o
 obj-y += emulate.o
 obj-bin-y += entry.o
 obj-y += intr.o
diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
new file mode 100644
index 0000000..b62f982
--- /dev/null
+++ b/xen/arch/x86/hvm/svm/avic.c
@@ -0,0 +1,232 @@
+/*
+ * avic.c: implements AMD Advance Virtual Interrupt Controller (AVIC) support
+ * Copyright (c) 2016, Advanced Micro Devices, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/domain_page.h>
+#include <xen/sched.h>
+#include <xen/stdbool.h>
+#include <asm/acpi.h>
+#include <asm/apicdef.h>
+#include <asm/event.h>
+#include <asm/hvm/emulate.h>
+#include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/support.h>
+#include <asm/hvm/svm/avic.h>
+#include <asm/hvm/vlapic.h>
+#include <asm/p2m.h>
+#include <asm/page.h>
+
+/*
+ * Note: Current max index allowed for physical APIC ID table is 255.
+ */
+#define AVIC_PHY_APIC_ID_MAX    0xFF
+
+#define AVIC_DOORBELL           0xc001011b
+
+#define AVIC_HPA_SHIFT  12
+#define AVIC_HPA_MASK           (((1ULL << 40) - 1) << AVIC_HPA_SHIFT)
+#define AVIC_VAPIC_BAR_MASK     AVIC_HPA_MASK
+
+/*
+ * Note:
+ * Currently, svm-avic mode is not supported with nested virtualization.
+ * Therefore, it is not yet currently enabled by default. Once the support
+ * is in-place, this should be enabled by default.
+ */
+bool svm_avic = 0;
+boolean_param("svm-avic", svm_avic);
+
+static struct avic_phy_apic_id_ent *
+avic_get_phy_apic_id_ent(const struct vcpu *v, unsigned int index)
+{
+    struct avic_phy_apic_id_ent *avic_phy_apic_id_table;
+    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
+
+    if ( !d->avic_phy_apic_id_table_mfn )
+        return NULL;
+
+    /*
+    * Note: APIC ID = 0xff is used for broadcast.
+    *       APIC ID > 0xff is reserved.
+    */
+    if ( index >= 0xff )
+        return NULL;
+
+    avic_phy_apic_id_table = mfn_to_virt(d->avic_phy_apic_id_table_mfn);
+
+    return &avic_phy_apic_id_table[index];
+}
+
+int svm_avic_dom_init(struct domain *d)
+{
+    int ret = 0;
+    struct page_info *pg;
+    unsigned long mfn;
+
+    if ( !svm_avic )
+        return 0;
+
+    /*
+     * Note:
+     * AVIC hardware walks the nested page table to check permissions,
+     * but does not use the SPA address specified in the leaf page
+     * table entry since it uses  address in the AVIC_BACKING_PAGE pointer
+     * field of the VMCB. Therefore, we set up a dummy page for APIC _mfn(0).
+     */
+    if ( !d->arch.hvm_domain.svm.avic_access_page_done )
+    {
+        set_mmio_p2m_entry(d, paddr_to_pfn(APIC_DEFAULT_PHYS_BASE),
+                           _mfn(0), PAGE_ORDER_4K,
+                           p2m_get_hostp2m(d)->default_access);
+        d->arch.hvm_domain.svm.avic_access_page_done = true;
+    }
+
+    /* Init AVIC logical APIC ID table */
+    pg = alloc_domheap_page(d, MEMF_no_owner);
+    if ( !pg )
+    {
+        gdprintk(XENLOG_ERR,
+                "%d: AVIC logical APIC ID table could not be allocated.\n",
+                d->domain_id);
+        ret = -ENOMEM;
+        goto err_out;
+    }
+    mfn = page_to_mfn(pg);
+    clear_domain_page(_mfn(mfn));
+    d->arch.hvm_domain.svm.avic_log_apic_id_table_mfn = mfn;
+
+    /* Init AVIC physical APIC ID table */
+    pg = alloc_domheap_page(d, MEMF_no_owner);
+    if ( !pg )
+    {
+        gdprintk(XENLOG_ERR,
+                "%d: AVIC physical APIC ID table could not be allocated.\n",
+                d->domain_id);
+        ret = -ENOMEM;
+        goto err_out;
+    }
+    mfn = page_to_mfn(pg);
+    clear_domain_page(_mfn(mfn));
+    d->arch.hvm_domain.svm.avic_phy_apic_id_table_mfn = mfn;
+
+    return ret;
+ err_out:
+    svm_avic_dom_destroy(d);
+    return ret;
+}
+
+void svm_avic_dom_destroy(struct domain *d)
+{
+    if ( !svm_avic )
+        return;
+
+    if ( d->arch.hvm_domain.svm.avic_phy_apic_id_table_mfn )
+    {
+        free_domheap_page(mfn_to_page(d->arch.hvm_domain.svm.avic_phy_apic_id_table_mfn));
+        d->arch.hvm_domain.svm.avic_phy_apic_id_table_mfn = 0;
+    }
+
+    if ( d->arch.hvm_domain.svm.avic_log_apic_id_table_mfn )
+    {
+        free_domheap_page(mfn_to_page(d->arch.hvm_domain.svm.avic_log_apic_id_table_mfn));
+        d->arch.hvm_domain.svm.avic_log_apic_id_table_mfn = 0;
+    }
+}
+
+bool svm_avic_vcpu_enabled(const struct vcpu *v)
+{
+    const struct arch_svm_struct *s = &v->arch.hvm_svm;
+    const struct vmcb_struct *vmcb = s->vmcb;
+
+    return svm_avic && vmcb->_vintr.fields.avic_enable;
+}
+
+static inline u32 *
+avic_get_bk_page_entry(const struct vcpu *v, u32 offset)
+{
+    const struct vlapic *vlapic = vcpu_vlapic(v);
+    char *tmp;
+
+    if ( !vlapic || !vlapic->regs_page )
+        return NULL;
+
+    tmp = (char *)page_to_virt(vlapic->regs_page);
+    return (u32 *)(tmp + offset);
+}
+
+void svm_avic_update_vapic_bar(const struct vcpu *v, uint64_t data)
+{
+    const struct arch_svm_struct *s = &v->arch.hvm_svm;
+
+    s->vmcb->avic_vapic_bar = data & AVIC_VAPIC_BAR_MASK;
+    s->vmcb->cleanbits.fields.avic = 0;
+}
+
+int svm_avic_init_vmcb(struct vcpu *v)
+{
+    paddr_t ma;
+    u32 *apic_id_reg;
+    struct arch_svm_struct *s = &v->arch.hvm_svm;
+    struct vmcb_struct *vmcb = s->vmcb;
+    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
+    const struct vlapic *vlapic = vcpu_vlapic(v);
+    struct avic_phy_apic_id_ent entry;
+
+    if ( !svm_avic )
+        return 0;
+
+    if ( !vlapic || !vlapic->regs_page )
+        return -EINVAL;
+
+    apic_id_reg = avic_get_bk_page_entry(v, APIC_ID);
+    if ( !apic_id_reg )
+        return -EINVAL;
+
+    s->avic_last_phy_id = avic_get_phy_apic_id_ent(v, *apic_id_reg >> 24);
+    if ( !s->avic_last_phy_id )
+        return -EINVAL;
+
+    vmcb->avic_bk_pg_pa = page_to_maddr(vlapic->regs_page) & AVIC_HPA_MASK;
+    ma = d->avic_log_apic_id_table_mfn;
+    vmcb->avic_log_apic_id = (ma << PAGE_SHIFT) & AVIC_HPA_MASK;
+    ma = d->avic_phy_apic_id_table_mfn;
+    vmcb->avic_phy_apic_id = (ma << PAGE_SHIFT) & AVIC_HPA_MASK;
+    vmcb->avic_phy_apic_id |= AVIC_PHY_APIC_ID_MAX;
+
+    entry = *(s->avic_last_phy_id);
+    smp_rmb();
+    entry.bk_pg_ptr = (vmcb->avic_bk_pg_pa & AVIC_HPA_MASK) >> AVIC_HPA_SHIFT;
+    entry.is_running = 0;
+    entry.valid = 1;
+    *(s->avic_last_phy_id) = entry;
+    smp_wmb();
+
+    svm_avic_update_vapic_bar(v, APIC_DEFAULT_PHYS_BASE);
+
+    vmcb->_vintr.fields.avic_enable = 1;
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 37bd6c4..cef29b2 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -48,6 +48,7 @@
 #include <asm/hvm/svm/asid.h>
 #include <asm/hvm/svm/svm.h>
 #include <asm/hvm/svm/vmcb.h>
+#include <asm/hvm/svm/avic.h>
 #include <asm/hvm/svm/emulate.h>
 #include <asm/hvm/svm/intr.h>
 #include <asm/hvm/svm/svmdebug.h>
@@ -1162,11 +1163,12 @@ void svm_host_osvw_init()
 
 static int svm_domain_initialise(struct domain *d)
 {
-    return 0;
+    return svm_avic_dom_init(d);
 }
 
 static void svm_domain_destroy(struct domain *d)
 {
+    svm_avic_dom_destroy(d);
 }
 
 static int svm_vcpu_initialise(struct vcpu *v)
@@ -1459,6 +1461,7 @@ const struct hvm_function_table * __init start_svm(void)
     P(cpu_has_svm_decode, "DecodeAssists");
     P(cpu_has_pause_filter, "Pause-Intercept Filter");
     P(cpu_has_tsc_ratio, "TSC Rate MSR");
+    P(cpu_has_svm_avic, "AVIC");
 #undef P
 
     if ( !printed )
@@ -1799,6 +1802,10 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
 
     switch ( msr )
     {
+    case MSR_IA32_APICBASE:
+        if ( svm_avic_vcpu_enabled(v) )
+            svm_avic_update_vapic_bar(v, msr_content);
+        break;
     case MSR_IA32_SYSENTER_CS:
     case MSR_IA32_SYSENTER_ESP:
     case MSR_IA32_SYSENTER_EIP:
diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
index 9ea014f..9ee7fc7 100644
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -28,6 +28,7 @@
 #include <asm/msr-index.h>
 #include <asm/p2m.h>
 #include <asm/hvm/support.h>
+#include <asm/hvm/svm/avic.h>
 #include <asm/hvm/svm/svm.h>
 #include <asm/hvm/svm/svmdebug.h>
 
@@ -225,6 +226,8 @@ static int construct_vmcb(struct vcpu *v)
         vmcb->_general1_intercepts |= GENERAL1_INTERCEPT_PAUSE;
     }
 
+    svm_avic_init_vmcb(v);
+
     vmcb->cleanbits.bytes = 0;
 
     return 0;
diff --git a/xen/include/asm-x86/hvm/svm/avic.h b/xen/include/asm-x86/hvm/svm/avic.h
new file mode 100644
index 0000000..16b433c
--- /dev/null
+++ b/xen/include/asm-x86/hvm/svm/avic.h
@@ -0,0 +1,40 @@
+#ifndef _SVM_AVIC_H_
+#define _SVM_AVIC_H_
+
+enum avic_incmp_ipi_err_code {
+    AVIC_INCMP_IPI_ERR_INVALID_INT_TYPE,
+    AVIC_INCMP_IPI_ERR_TARGET_NOT_RUN,
+    AVIC_INCMP_IPI_ERR_INV_TARGET,
+    AVIC_INCMP_IPI_ERR_INV_BK_PAGE,
+};
+
+struct __attribute__ ((__packed__))
+avic_log_apic_id_ent {
+    u32 guest_phy_apic_id : 8;
+    u32 res               : 23;
+    u32 valid             : 1;
+};
+
+struct __attribute__ ((__packed__))
+avic_phy_apic_id_ent {
+    u64 host_phy_apic_id  : 8;
+    u64 res1              : 4;
+    u64 bk_pg_ptr         : 40;
+    u64 res2              : 10;
+    u64 is_running        : 1;
+    u64 valid             : 1;
+};
+
+extern bool_t svm_avic;
+
+int svm_avic_dom_init(struct domain *d);
+void svm_avic_dom_destroy(struct domain *d);
+
+int svm_avic_init_vcpu(struct vcpu *v);
+void svm_avic_destroy_vcpu(struct vcpu *v);
+bool_t svm_avic_vcpu_enabled(const struct vcpu *v);
+
+void svm_avic_update_vapic_bar(const struct vcpu *v,uint64_t data);
+int svm_avic_init_vmcb(struct vcpu *v);
+
+#endif /* _SVM_AVIC_H_ */
diff --git a/xen/include/asm-x86/hvm/svm/svm.h b/xen/include/asm-x86/hvm/svm/svm.h
index c954b7e..fea61bb 100644
--- a/xen/include/asm-x86/hvm/svm/svm.h
+++ b/xen/include/asm-x86/hvm/svm/svm.h
@@ -81,6 +81,7 @@ extern u32 svm_feature_flags;
 #define SVM_FEATURE_FLUSHBYASID    6 /* TLB flush by ASID support */
 #define SVM_FEATURE_DECODEASSISTS  7 /* Decode assists support */
 #define SVM_FEATURE_PAUSEFILTER   10 /* Pause intercept filter support */
+#define SVM_FEATURE_AVIC          13 /* AVIC support */
 
 #define cpu_has_svm_feature(f) test_bit(f, &svm_feature_flags)
 #define cpu_has_svm_npt       cpu_has_svm_feature(SVM_FEATURE_NPT)
@@ -91,6 +92,7 @@ extern u32 svm_feature_flags;
 #define cpu_has_svm_decode    cpu_has_svm_feature(SVM_FEATURE_DECODEASSISTS)
 #define cpu_has_pause_filter  cpu_has_svm_feature(SVM_FEATURE_PAUSEFILTER)
 #define cpu_has_tsc_ratio     cpu_has_svm_feature(SVM_FEATURE_TSCRATEMSR)
+#define cpu_has_svm_avic      cpu_has_svm_feature(SVM_FEATURE_AVIC)
 
 #define SVM_PAUSEFILTER_INIT    3000
 
diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
index 43cb98e..d3d045f 100644
--- a/xen/include/asm-x86/hvm/svm/vmcb.h
+++ b/xen/include/asm-x86/hvm/svm/vmcb.h
@@ -496,6 +496,24 @@ struct __packed vmcb_struct {
 };
 
 struct svm_domain {
+    /*
+     * This per-domain table is used by the hardware to locate
+     * the vAPIC backing page to be used to deliver interrupts
+     * based on the guest physical APIC ID.
+     */
+    paddr_t avic_phy_apic_id_table_mfn;
+
+    /*
+     * This per-domain table is used by the hardware to map
+     * logically addressed interrupt requests (w/ guest logical APIC id)
+     * to the guest physical APIC ID.
+     */
+    paddr_t avic_log_apic_id_table_mfn;
+
+    u32 avic_max_vcpu_id;
+    bool avic_access_page_done;
+    spinlock_t avic_ldr_mode_lock;
+    u32 avic_ldr_mode;
 };
 
 struct arch_svm_struct {
@@ -529,6 +547,9 @@ struct arch_svm_struct {
         u64 length;
         u64 status;
     } osvw;
+
+    struct avic_phy_apic_id_ent *avic_last_phy_id;
+    u32 avic_last_ldr;
 };
 
 struct vmcb_struct *alloc_vmcb(void);
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 06/10] x86/SVM: Add AVIC vmexit handlers
  2016-12-31  5:45 [PATCH v2 00/10] Introduce AMD SVM AVIC Suravee Suthikulpanit
                   ` (4 preceding siblings ...)
  2016-12-31  5:45 ` [PATCH v2 05/10] x86/HVM/SVM: Add AVIC initialization code Suravee Suthikulpanit
@ 2016-12-31  5:45 ` Suravee Suthikulpanit
  2017-01-02 17:28   ` Andrew Cooper
  2017-01-03 15:34   ` Boris Ostrovsky
  2016-12-31  5:45 ` [PATCH v2 07/10] x86/SVM: Add vcpu scheduling support for AVIC Suravee Suthikulpanit
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2016-12-31  5:45 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, andrew.cooper3, Suravee Suthikulpanit,
	sherry.hurwitz, boris.ostrovsky

AVIC introduces two new #vmexit handlers:

VMEXIT_INCOMP_IPI:
This occurs when an IPI could not be delivered to all targeted guest
virtual processors because at least one guest virtual processor
was not allocated to a physical core at the time. In this case,
Xen would try to emulate IPI.

VMEXIT_DO_NOACCEL:
This oocurs when a guest access to an APIC register that cannot be
accelerated by AVIC. In this case, Xen tries to emulate register accesses.

This fault is also generated if an EOI isattempted when the highest priority
in-service interrupt is set for level-triggered mode.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/svm/avic.c        | 337 +++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/svm.c         |   8 +
 xen/include/asm-x86/apic.h         |   2 +
 xen/include/asm-x86/hvm/svm/avic.h |   3 +
 xen/include/asm-x86/hvm/svm/vmcb.h |   2 +
 5 files changed, 352 insertions(+)

diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
index b62f982..c0b7151 100644
--- a/xen/arch/x86/hvm/svm/avic.c
+++ b/xen/arch/x86/hvm/svm/avic.c
@@ -19,6 +19,7 @@
 #include <xen/sched.h>
 #include <xen/stdbool.h>
 #include <asm/acpi.h>
+#include <asm/apic.h>
 #include <asm/apicdef.h>
 #include <asm/event.h>
 #include <asm/hvm/emulate.h>
@@ -40,6 +41,8 @@
 #define AVIC_HPA_MASK           (((1ULL << 40) - 1) << AVIC_HPA_SHIFT)
 #define AVIC_VAPIC_BAR_MASK     AVIC_HPA_MASK
 
+#define AVIC_UNACCEL_ACCESS_OFFSET_MASK    0xFF0
+
 /*
  * Note:
  * Currently, svm-avic mode is not supported with nested virtualization.
@@ -122,6 +125,8 @@ int svm_avic_dom_init(struct domain *d)
     clear_domain_page(_mfn(mfn));
     d->arch.hvm_domain.svm.avic_phy_apic_id_table_mfn = mfn;
 
+    spin_lock_init(&d->arch.hvm_domain.svm.avic_ldr_mode_lock);
+
     return ret;
  err_out:
     svm_avic_dom_destroy(d);
@@ -222,6 +227,338 @@ int svm_avic_init_vmcb(struct vcpu *v)
 }
 
 /*
+ * Note:
+ * This function handles the AVIC_INCOMP_IPI #vmexit when AVIC is enabled.
+ * The hardware generates this fault when an IPI could not be delivered
+ * to all targeted guest virtual processors because at least one guest
+ * virtual processor was not allocated to a physical core at the time.
+ */
+void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs)
+{
+    struct vcpu *curr = current;
+    struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
+    u32 icrh = vmcb->exitinfo1 >> 32;
+    u32 icrl = vmcb->exitinfo1;
+    u32 id = vmcb->exitinfo2 >> 32;
+    u32 index = vmcb->exitinfo2 && 0xFF;
+
+    switch ( id )
+    {
+    case AVIC_INCMP_IPI_ERR_INVALID_INT_TYPE:
+        /*
+         * AVIC hardware handles the delivery of
+         * IPIs when the specified Message Type is Fixed
+         * (also known as fixed delivery mode) and
+         * the Trigger Mode is edge-triggered. The hardware
+         * also supports self and broadcast delivery modes
+         * specified via the Destination Shorthand(DSH)
+         * field of the ICRL. Logical and physical APIC ID
+         * formats are supported. All other IPI types cause
+         * a #VMEXIT, which needs to emulated.
+         */
+        vlapic_reg_write(curr, APIC_ICR2, icrh);
+        vlapic_reg_write(curr, APIC_ICR, icrl);
+        break;
+    case AVIC_INCMP_IPI_ERR_TARGET_NOT_RUN:
+    {
+        /*
+         * At this point, we expect that the AVIC HW has already
+         * set the appropriate IRR bits on the valid target
+         * vcpus. So, we just need to kick the appropriate vcpu.
+         */
+        struct vcpu *curc;
+        struct domain *curd = curr->domain;
+        uint32_t dest = GET_xAPIC_DEST_FIELD(icrh);
+        uint32_t short_hand = icrl & APIC_SHORT_MASK;
+        bool dest_mode = !!(icrl & APIC_DEST_MASK);
+
+        for_each_vcpu ( curd, curc )
+        {
+            if ( curc != curr &&
+                 vlapic_match_dest(vcpu_vlapic(curc), vcpu_vlapic(curr),
+                                   short_hand, dest, dest_mode) )
+            {
+                vcpu_kick(curc);
+                break;
+            }
+        }
+        break;
+    }
+    case AVIC_INCMP_IPI_ERR_INV_TARGET:
+        dprintk(XENLOG_ERR,
+                "SVM: %s: Invalid IPI target (icr=%#08x:%08x, idx=%u)\n",
+                __func__, icrh, icrl, index);
+        break;
+    case AVIC_INCMP_IPI_ERR_INV_BK_PAGE:
+        dprintk(XENLOG_ERR,
+                "SVM: %s: Invalid bk page (icr=%#08x:%08x, idx=%u)\n",
+                __func__, icrh, icrl, index);
+        break;
+    default:
+        dprintk(XENLOG_ERR, "SVM: %s: Unknown IPI interception (%#x)\n",
+                __func__, id);
+    }
+}
+
+static struct avic_log_apic_id_ent *
+avic_get_logical_id_entry(const struct vcpu *v, u32 ldr, bool flat)
+{
+    unsigned int index;
+    struct avic_log_apic_id_ent *avic_log_apid_id_table;
+    const struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
+    unsigned int dest_id = GET_APIC_LOGICAL_ID(ldr);
+
+    if ( !dest_id )
+        return NULL;
+
+    if ( flat )
+    {
+        index = ffs(dest_id) - 1;
+        if ( index > 7 )
+            return NULL;
+    }
+    else
+    {
+        unsigned int cluster = (dest_id & 0xf0) >> 4;
+        int apic = ffs(dest_id & 0x0f) - 1;
+
+        if ( (apic < 0) || (apic > 7) || (cluster >= 0xf) )
+            return NULL;
+        index = (cluster << 2) + apic;
+    }
+
+    ASSERT(index <= 255);
+
+    avic_log_apid_id_table = mfn_to_virt(d->avic_log_apic_id_table_mfn);
+
+    return &avic_log_apid_id_table[index];
+}
+
+static int avic_ldr_write(struct vcpu *v, u8 g_phy_id, u32 ldr, bool valid)
+{
+    struct avic_log_apic_id_ent *entry, new_entry;
+    u32 *bp = avic_get_bk_page_entry(v, APIC_DFR);
+
+    if ( !bp )
+        return -EINVAL;
+
+    entry = avic_get_logical_id_entry(v, ldr, (*bp == APIC_DFR_FLAT));
+    if (!entry)
+        return -EINVAL;
+
+    new_entry = *entry;
+    smp_rmb();
+    new_entry.guest_phy_apic_id = g_phy_id;
+    new_entry.valid = valid;
+    *entry = new_entry;
+    smp_wmb();
+
+    return 0;
+}
+
+static int avic_handle_ldr_update(struct vcpu *v)
+{
+    int ret = 0;
+    u32 *ldr = avic_get_bk_page_entry(v, APIC_LDR);
+    u32 *apic_id_reg = avic_get_bk_page_entry(v, APIC_ID);
+
+    if ( !ldr || !*ldr || !apic_id_reg )
+        return -EINVAL;
+
+    ret = avic_ldr_write(v, GET_APIC_PHYSICAL_ID(*apic_id_reg), *ldr, true);
+    if ( ret && v->arch.hvm_svm.avic_last_ldr )
+    {
+        /*
+         * Note:
+         * In case of failure to update LDR register,
+         * we set the guest physical APIC ID to 0,
+         * and set the entry logical APID ID entry
+         * to invalid (false).
+         */
+        avic_ldr_write(v, 0, v->arch.hvm_svm.avic_last_ldr, false);
+        v->arch.hvm_svm.avic_last_ldr = 0;
+    }
+    else
+    {
+        /*
+         * Note:
+         * This saves the last valid LDR so that we
+         * know which entry in the local APIC ID
+         * to clean up when the LDR is updated.
+         */
+        v->arch.hvm_svm.avic_last_ldr = *ldr;
+    }
+
+    return ret;
+}
+
+static int avic_handle_apic_id_update(struct vcpu *v, bool init)
+{
+    struct avic_phy_apic_id_ent *old, *new;
+    struct arch_svm_struct *s = &v->arch.hvm_svm;
+    u32 *apic_id_reg = avic_get_bk_page_entry(v, APIC_ID);
+
+    if ( !apic_id_reg )
+        return -EINVAL;
+
+    old = s->avic_last_phy_id;
+    ASSERT(old);
+
+    new = avic_get_phy_apic_id_ent(v, GET_APIC_PHYSICAL_ID(*apic_id_reg));
+    if ( !new )
+        return 0;
+
+    /* We need to move physical_id_entry to new offset */
+    *new = *old;
+    *((u64 *)old) = 0ULL;
+    s->avic_last_phy_id = new;
+
+    /*
+     * Update the guest physical APIC ID in the logical
+     * APIC ID table entry if LDR is already setup.
+     */
+    if ( v->arch.hvm_svm.avic_last_ldr )
+        avic_handle_ldr_update(v);
+
+    return 0;
+}
+
+static int avic_handle_dfr_update(struct vcpu *v)
+{
+    u32 mod;
+    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
+    u32 *dfr = avic_get_bk_page_entry(v, APIC_DFR);
+
+    if ( !dfr )
+        return -EINVAL;
+
+    mod = (*dfr >> 28) & 0xFu;
+
+    spin_lock(&d->avic_ldr_mode_lock);
+    if ( d->avic_ldr_mode != mod )
+    {
+        /*
+         * We assume that all local APICs are using the same type.
+         * If LDR mode changes, we need to flush the domain AVIC logical
+         * APIC id table.
+         */
+        clear_domain_page(_mfn(d->avic_log_apic_id_table_mfn));
+        smp_wmb();
+        d->avic_ldr_mode = mod;
+    }
+    spin_unlock(&d->avic_ldr_mode_lock);
+
+    if ( v->arch.hvm_svm.avic_last_ldr )
+        avic_handle_ldr_update(v);
+
+    return 0;
+}
+
+static int avic_unaccel_trap_write(struct vcpu *v)
+{
+    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
+    u32 offset = vmcb->exitinfo1 & AVIC_UNACCEL_ACCESS_OFFSET_MASK;
+    u32 *reg = avic_get_bk_page_entry(v, offset);
+
+    if ( !reg )
+        return X86EMUL_UNHANDLEABLE;
+
+    switch ( offset )
+    {
+    case APIC_ID:
+        if ( avic_handle_apic_id_update(v, false) )
+            return X86EMUL_UNHANDLEABLE;
+        break;
+    case APIC_LDR:
+        if ( avic_handle_ldr_update(v) )
+            return X86EMUL_UNHANDLEABLE;
+        break;
+    case APIC_DFR:
+        if ( avic_handle_dfr_update(v) )
+            return X86EMUL_UNHANDLEABLE;
+        break;
+    default:
+        break;
+    }
+
+    vlapic_reg_write(v, offset, *reg);
+
+    return X86EMUL_OKAY;
+}
+
+/*
+ * Note:
+ * This function handles the AVIC_NOACCEL #vmexit when AVIC is enabled.
+ * The hardware generates this fault when :
+ * -  A guest access to an APIC register that is not accelerated
+ *    by AVIC hardware.
+ * - EOI is attempted when the highest priority in-service interrupt
+ *   is level-triggered.
+ */
+void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs)
+{
+    struct vcpu *curr = current;
+    struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
+    u32 offset = vmcb->exitinfo1 & 0xFF0;
+    u32 rw = (vmcb->exitinfo1 >> 32) & 0x1;
+
+    switch ( offset )
+    {
+    case APIC_ID:
+    case APIC_EOI:
+    case APIC_RRR:
+    case APIC_LDR:
+    case APIC_DFR:
+    case APIC_SPIV:
+    case APIC_ESR:
+    case APIC_ICR:
+    case APIC_LVTT:
+    case APIC_LVTTHMR:
+    case APIC_LVTPC:
+    case APIC_LVT0:
+    case APIC_LVT1:
+    case APIC_LVTERR:
+    case APIC_TMICT:
+    case APIC_TDCR:
+        /*
+         * Handling AVIC Trap (intercept right after the access).
+         */
+        if ( !rw )
+        {
+            /*
+             * If a read trap happens, the CPU microcode does not
+             * implement the spec.
+             */
+            BUG();
+        }
+        if ( avic_unaccel_trap_write(curr) != X86EMUL_OKAY )
+        {
+            gprintk(XENLOG_ERR, "%s: Failed to handle trap write (%#x)\n",
+                    __func__, offset);
+            return;
+        }
+        break;
+    default:
+        /*
+         * Handling AVIC Fault (intercept before the access).
+         */
+        if ( !rw )
+        {
+            u32 *entry = avic_get_bk_page_entry(curr, offset);
+
+            if ( !entry )
+                return;
+
+            *entry = vlapic_read_aligned(vcpu_vlapic(curr), offset);
+        }
+        hvm_emulate_one_vm_event(EMUL_KIND_NORMAL, TRAP_invalid_op,
+                                 HVM_DELIVER_NO_ERROR_CODE);
+    }
+
+    return;
+}
+
+/*
  * Local variables:
  * mode: C
  * c-file-style: "BSD"
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index cef29b2..cb5281c 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2682,6 +2682,14 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
         svm_vmexit_do_pause(regs);
         break;
 
+    case VMEXIT_AVIC_INCOMP_IPI:
+        svm_avic_vmexit_do_incomp_ipi(regs);
+        break;
+
+    case VMEXIT_AVIC_NOACCEL:
+        svm_avic_vmexit_do_noaccel(regs);
+        break;
+
     default:
     unexpected_exit_type:
         gdprintk(XENLOG_ERR, "unexpected VMEXIT: exit reason = %#"PRIx64", "
diff --git a/xen/include/asm-x86/apic.h b/xen/include/asm-x86/apic.h
index be9a535..aa5e7ec 100644
--- a/xen/include/asm-x86/apic.h
+++ b/xen/include/asm-x86/apic.h
@@ -16,6 +16,8 @@
 #define APIC_DEBUG   2
 
 #define	SET_APIC_LOGICAL_ID(x)	(((x)<<24))
+#define	GET_APIC_LOGICAL_ID(x)	(((x)>>24) & 0xFFu)
+#define	GET_APIC_PHYSICAL_ID(x)	(((x)>>24) & 0xFFu)
 
 #define IO_APIC_REDIR_VECTOR_MASK	0x000FF
 #define IO_APIC_REDIR_DEST_LOGICAL	0x00800
diff --git a/xen/include/asm-x86/hvm/svm/avic.h b/xen/include/asm-x86/hvm/svm/avic.h
index 16b433c..7e0abcb 100644
--- a/xen/include/asm-x86/hvm/svm/avic.h
+++ b/xen/include/asm-x86/hvm/svm/avic.h
@@ -37,4 +37,7 @@ bool_t svm_avic_vcpu_enabled(const struct vcpu *v);
 void svm_avic_update_vapic_bar(const struct vcpu *v,uint64_t data);
 int svm_avic_init_vmcb(struct vcpu *v);
 
+void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs);
+void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs);
+
 #endif /* _SVM_AVIC_H_ */
diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
index d3d045f..9abf077 100644
--- a/xen/include/asm-x86/hvm/svm/vmcb.h
+++ b/xen/include/asm-x86/hvm/svm/vmcb.h
@@ -302,6 +302,8 @@ enum VMEXIT_EXITCODE
     VMEXIT_MWAIT_CONDITIONAL= 140, /* 0x8c */
     VMEXIT_XSETBV           = 141, /* 0x8d */
     VMEXIT_NPF              = 1024, /* 0x400, nested paging fault */
+    VMEXIT_AVIC_INCOMP_IPI  = 1025, /* 0x401 */
+    VMEXIT_AVIC_NOACCEL     = 1026, /* 0x402 */
     VMEXIT_INVALID          =  -1
 };
 
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 07/10] x86/SVM: Add vcpu scheduling support for AVIC
  2016-12-31  5:45 [PATCH v2 00/10] Introduce AMD SVM AVIC Suravee Suthikulpanit
                   ` (5 preceding siblings ...)
  2016-12-31  5:45 ` [PATCH v2 06/10] x86/SVM: Add AVIC vmexit handlers Suravee Suthikulpanit
@ 2016-12-31  5:45 ` Suravee Suthikulpanit
  2017-01-02 17:35   ` Andrew Cooper
  2017-01-03 15:43   ` Boris Ostrovsky
  2016-12-31  5:45 ` [PATCH v2 08/10] x86/SVM: Add interrupt management code via AVIC Suravee Suthikulpanit
                   ` (2 subsequent siblings)
  9 siblings, 2 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2016-12-31  5:45 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, andrew.cooper3, Suravee Suthikulpanit,
	sherry.hurwitz, boris.ostrovsky

Add hooks to manage AVIC data structure during vcpu scheduling.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/svm/avic.c | 78 +++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/svm.c  | 10 ++++++
 2 files changed, 88 insertions(+)

diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
index c0b7151..6351c8e 100644
--- a/xen/arch/x86/hvm/svm/avic.c
+++ b/xen/arch/x86/hvm/svm/avic.c
@@ -73,6 +73,79 @@ avic_get_phy_apic_id_ent(const struct vcpu *v, unsigned int index)
     return &avic_phy_apic_id_table[index];
 }
 
+static void avic_vcpu_load(struct vcpu *v)
+{
+    struct arch_svm_struct *s = &v->arch.hvm_svm;
+    int h_phy_apic_id;
+    struct avic_phy_apic_id_ent entry;
+
+    if ( !s->avic_last_phy_id )
+        return;
+
+    if ( test_bit(_VPF_blocked, &v->pause_flags) )
+        return;
+
+    /*
+     * Note: APIC ID = 0xff is used for broadcast.
+     *       APIC ID > 0xff is reserved.
+     */
+    h_phy_apic_id = cpu_data[v->processor].apicid;
+    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
+
+    entry = *(s->avic_last_phy_id);
+    smp_rmb();
+    entry.host_phy_apic_id = h_phy_apic_id;
+    entry.is_running = 1;
+    *(s->avic_last_phy_id) = entry;
+    smp_wmb();
+}
+
+static void avic_vcpu_unload(struct vcpu *v)
+{
+    struct arch_svm_struct *s = &v->arch.hvm_svm;
+    struct avic_phy_apic_id_ent entry;
+
+    if ( !svm_avic || !s->avic_last_phy_id )
+        return;
+
+    entry = *(s->avic_last_phy_id);
+    smp_rmb();
+    entry.is_running = 0;
+    *(s->avic_last_phy_id) = entry;
+    smp_wmb();
+}
+
+static void avic_vcpu_resume(struct vcpu *v)
+{
+    struct avic_phy_apic_id_ent entry;
+    struct arch_svm_struct *s = &v->arch.hvm_svm;
+
+    ASSERT(svm_avic_vcpu_enabled(v));
+    ASSERT(s->avic_last_phy_id);
+    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
+
+    entry = *(s->avic_last_phy_id);
+    smp_rmb();
+    entry.is_running = 1;
+    *(s->avic_last_phy_id) = entry;
+    smp_wmb();
+}
+
+static void avic_vcpu_block(struct vcpu *v)
+{
+    struct avic_phy_apic_id_ent entry;
+    struct arch_svm_struct *s = &v->arch.hvm_svm;
+
+    ASSERT(svm_avic_vcpu_enabled(v));
+    ASSERT(s->avic_last_phy_id);
+
+    entry = *(s->avic_last_phy_id);
+    smp_rmb();
+    entry.is_running = 0;
+    *(s->avic_last_phy_id) = entry;
+    smp_wmb();
+}
+
 int svm_avic_dom_init(struct domain *d)
 {
     int ret = 0;
@@ -127,6 +200,11 @@ int svm_avic_dom_init(struct domain *d)
 
     spin_lock_init(&d->arch.hvm_domain.svm.avic_ldr_mode_lock);
 
+    d->arch.hvm_domain.pi_ops.pi_switch_to = avic_vcpu_unload;
+    d->arch.hvm_domain.pi_ops.pi_switch_from = avic_vcpu_load;
+    d->arch.hvm_domain.pi_ops.vcpu_block = avic_vcpu_block;
+    d->arch.hvm_domain.pi_ops.pi_do_resume = avic_vcpu_resume;
+
     return ret;
  err_out:
     svm_avic_dom_destroy(d);
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index cb5281c..df59b8d 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1012,6 +1012,10 @@ static void svm_ctxt_switch_from(struct vcpu *v)
     svm_tsc_ratio_save(v);
 
     svm_sync_vmcb(v);
+
+    if ( v->domain->arch.hvm_domain.pi_ops.pi_switch_from )
+        v->domain->arch.hvm_domain.pi_ops.pi_switch_from(v);
+
     svm_vmload(per_cpu(root_vmcb, cpu));
 
     /* Resume use of ISTs now that the host TR is reinstated. */
@@ -1048,6 +1052,9 @@ static void svm_ctxt_switch_to(struct vcpu *v)
     svm_lwp_load(v);
     svm_tsc_ratio_load(v);
 
+    if ( v->domain->arch.hvm_domain.pi_ops.pi_switch_to )
+        v->domain->arch.hvm_domain.pi_ops.pi_switch_to(v);
+
     if ( cpu_has_rdtscp )
         wrmsrl(MSR_TSC_AUX, hvm_msr_tsc_aux(v));
 }
@@ -1093,6 +1100,9 @@ static void noreturn svm_do_resume(struct vcpu *v)
         vmcb_set_vintr(vmcb, intr);
     }
 
+    if ( v->domain->arch.hvm_domain.pi_ops.pi_do_resume )
+        v->domain->arch.hvm_domain.pi_ops.pi_do_resume(v);
+
     hvm_do_resume(v);
 
     reset_stack_and_jump(svm_asm_do_resume);
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 08/10] x86/SVM: Add interrupt management code via AVIC
  2016-12-31  5:45 [PATCH v2 00/10] Introduce AMD SVM AVIC Suravee Suthikulpanit
                   ` (6 preceding siblings ...)
  2016-12-31  5:45 ` [PATCH v2 07/10] x86/SVM: Add vcpu scheduling support for AVIC Suravee Suthikulpanit
@ 2016-12-31  5:45 ` Suravee Suthikulpanit
  2017-01-02 17:45   ` Andrew Cooper
  2017-01-05 16:01   ` Jan Beulich
  2016-12-31  5:46 ` [PATCH v2 09/10] x86/SVM: Hook up miscellaneous AVIC functions Suravee Suthikulpanit
  2016-12-31  5:46 ` [PATCH v2 10/10] x86/SVM: Add AMD AVIC key handler Suravee Suthikulpanit
  9 siblings, 2 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2016-12-31  5:45 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, andrew.cooper3, Suravee Suthikulpanit,
	sherry.hurwitz, boris.ostrovsky

Enabling AVIC implicitly disables the V_IRQ, V_INTR_PRIO, V_IGN_TPR,
and V_INTR_VECTOR fields in the VMCB Control Word. Therefore, this patch
introduces new interrupt injection code via AVIC backing page.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/svm/avic.c        | 28 ++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/intr.c        |  4 ++++
 xen/arch/x86/hvm/svm/svm.c         | 12 ++++++++++--
 xen/include/asm-x86/hvm/svm/avic.h |  1 +
 4 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
index 6351c8e..faa5e45 100644
--- a/xen/arch/x86/hvm/svm/avic.c
+++ b/xen/arch/x86/hvm/svm/avic.c
@@ -636,6 +636,34 @@ void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs)
     return;
 }
 
+void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vec)
+{
+    struct vlapic *vlapic = vcpu_vlapic(v);
+
+    /* Fallback to use non-AVIC if vcpu is not enabled with AVIC. */
+    if ( !svm_avic_vcpu_enabled(v) )
+    {
+        if ( !vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
+            vcpu_kick(v);
+        return;
+    }
+
+    if ( !(guest_cpu_user_regs()->eflags & X86_EFLAGS_IF) )
+        return;
+
+    if ( vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
+        return;
+
+    /*
+     * If vcpu is running on another cpu, hit the doorbell to signal
+     * it to process interrupt. Otherwise, kick it.
+     */
+    if ( v->is_running && (v != current) )
+        wrmsrl(AVIC_DOORBELL, cpu_data[v->processor].apicid);
+    else
+        vcpu_kick(v);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/hvm/svm/intr.c b/xen/arch/x86/hvm/svm/intr.c
index bd94731..876d2ad 100644
--- a/xen/arch/x86/hvm/svm/intr.c
+++ b/xen/arch/x86/hvm/svm/intr.c
@@ -30,6 +30,7 @@
 #include <asm/hvm/io.h>
 #include <asm/hvm/support.h>
 #include <asm/hvm/vlapic.h>
+#include <asm/hvm/svm/avic.h>
 #include <asm/hvm/svm/svm.h>
 #include <asm/hvm/svm/intr.h>
 #include <asm/hvm/nestedhvm.h> /* for nestedhvm_vcpu_in_guestmode */
@@ -101,6 +102,9 @@ static void svm_enable_intr_window(struct vcpu *v, struct hvm_intack intack)
     HVMTRACE_3D(INTR_WINDOW, intack.vector, intack.source,
                 vmcb->eventinj.fields.v?vmcb->eventinj.fields.vector:-1);
 
+    if ( svm_avic_vcpu_enabled(v) )
+        return;
+
     /*
      * Create a dummy virtual interrupt to intercept as soon as the
      * guest can accept the real interrupt.
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index df59b8d..922f48f 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1089,7 +1089,8 @@ static void noreturn svm_do_resume(struct vcpu *v)
         hvm_asid_flush_vcpu(v);
     }
 
-    if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) )
+    if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) &&
+         !svm_avic_vcpu_enabled(v) )
     {
         vintr_t intr;
 
@@ -2309,7 +2310,8 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
      * NB. We need to preserve the low bits of the TPR to make checked builds
      * of Windows work, even though they don't actually do anything.
      */
-    if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) )
+    if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) &&
+         !svm_avic_vcpu_enabled(v) )
     {
         intr = vmcb_get_vintr(vmcb);
         vlapic_set_reg(vlapic, APIC_TASKPRI,
@@ -2502,6 +2504,12 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
         u32 general1_intercepts = vmcb_get_general1_intercepts(vmcb);
         intr = vmcb_get_vintr(vmcb);
 
+        if ( svm_avic_vcpu_enabled(v) )
+        {
+            gdprintk(XENLOG_ERR, "AVIC VINTR:\n");
+            domain_crash(v->domain);
+        }
+
         intr.fields.irq = 0;
         general1_intercepts &= ~GENERAL1_INTERCEPT_VINTR;
 
diff --git a/xen/include/asm-x86/hvm/svm/avic.h b/xen/include/asm-x86/hvm/svm/avic.h
index 7e0abcb..1676e01 100644
--- a/xen/include/asm-x86/hvm/svm/avic.h
+++ b/xen/include/asm-x86/hvm/svm/avic.h
@@ -40,4 +40,5 @@ int svm_avic_init_vmcb(struct vcpu *v);
 void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs);
 void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs);
 
+void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vector);
 #endif /* _SVM_AVIC_H_ */
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 09/10] x86/SVM: Hook up miscellaneous AVIC functions
  2016-12-31  5:45 [PATCH v2 00/10] Introduce AMD SVM AVIC Suravee Suthikulpanit
                   ` (7 preceding siblings ...)
  2016-12-31  5:45 ` [PATCH v2 08/10] x86/SVM: Add interrupt management code via AVIC Suravee Suthikulpanit
@ 2016-12-31  5:46 ` Suravee Suthikulpanit
  2017-01-02 17:49   ` Andrew Cooper
  2017-01-05 16:05   ` Jan Beulich
  2016-12-31  5:46 ` [PATCH v2 10/10] x86/SVM: Add AMD AVIC key handler Suravee Suthikulpanit
  9 siblings, 2 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2016-12-31  5:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, andrew.cooper3, Suravee Suthikulpanit,
	sherry.hurwitz, boris.ostrovsky

Hook up virtual_intr_delivery_enabled and deliver_posted_intr functions
when AVIC is enabled.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/svm/svm.c         | 26 +++++++++++++++++++++-----
 xen/include/asm-x86/hvm/svm/avic.h |  3 +++
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 922f48f..7c0cda0 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1438,6 +1438,11 @@ static int svm_cpu_up(void)
     return 0;
 }
 
+static inline int svm_avic_enabled(void)
+{
+    return svm_avic;
+}
+
 const struct hvm_function_table * __init start_svm(void)
 {
     bool_t printed = 0;
@@ -1472,16 +1477,27 @@ const struct hvm_function_table * __init start_svm(void)
     P(cpu_has_svm_decode, "DecodeAssists");
     P(cpu_has_pause_filter, "Pause-Intercept Filter");
     P(cpu_has_tsc_ratio, "TSC Rate MSR");
-    P(cpu_has_svm_avic, "AVIC");
-#undef P
-
-    if ( !printed )
-        printk(" - none\n");
 
     svm_function_table.hap_supported = !!cpu_has_svm_npt;
     svm_function_table.hap_capabilities = HVM_HAP_SUPERPAGE_2MB |
         ((cpuid_edx(0x80000001) & 0x04000000) ? HVM_HAP_SUPERPAGE_1GB : 0);
 
+    if ( !cpu_has_svm_avic )
+        svm_avic = 0;
+
+    if ( svm_avic )
+    {
+        svm_function_table.deliver_posted_intr  = svm_avic_deliver_posted_intr;
+        svm_function_table.virtual_intr_delivery_enabled = svm_avic_enabled;
+        P(cpu_has_svm_avic, "AVIC (enabled)");
+    }
+    else
+        P(cpu_has_svm_avic, "AVIC (disabled)");
+#undef P
+
+    if ( !printed )
+        printk(" - none\n");
+
     return &svm_function_table;
 }
 
diff --git a/xen/include/asm-x86/hvm/svm/avic.h b/xen/include/asm-x86/hvm/svm/avic.h
index 1676e01..5be3e76 100644
--- a/xen/include/asm-x86/hvm/svm/avic.h
+++ b/xen/include/asm-x86/hvm/svm/avic.h
@@ -41,4 +41,7 @@ void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs);
 void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs);
 
 void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vector);
+
+void setup_avic_dump(void);
+
 #endif /* _SVM_AVIC_H_ */
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 10/10] x86/SVM: Add AMD AVIC key handler
  2016-12-31  5:45 [PATCH v2 00/10] Introduce AMD SVM AVIC Suravee Suthikulpanit
                   ` (8 preceding siblings ...)
  2016-12-31  5:46 ` [PATCH v2 09/10] x86/SVM: Hook up miscellaneous AVIC functions Suravee Suthikulpanit
@ 2016-12-31  5:46 ` Suravee Suthikulpanit
  2017-01-03 16:01   ` Boris Ostrovsky
  2017-01-05 16:07   ` Jan Beulich
  9 siblings, 2 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2016-12-31  5:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, andrew.cooper3, Suravee Suthikulpanit,
	sherry.hurwitz, boris.ostrovsky

Adding new key-handler "j" for dumping AVIC-related information.
Here is an example of per-domain statistics being dumped.

  *********** SVM AVIC Statistics **************
  >>> Domain 1 <<<
      VCPU 0
      * incomp_ipi = 3110
      * noaccel    = 236475
      * post_intr  = 116176
      * doorbell   = 715
      VCPU 1
      * incomp_ipi = 2565
      * noaccel    = 233061
      * post_intr  = 115765
      * doorbell   = 771
  **************************************

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/arch/x86/hvm/svm/avic.c        | 48 ++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/svm.c         |  1 +
 xen/include/asm-x86/hvm/svm/vmcb.h |  6 +++++
 3 files changed, 55 insertions(+)

diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
index faa5e45..1aea724 100644
--- a/xen/arch/x86/hvm/svm/avic.c
+++ b/xen/arch/x86/hvm/svm/avic.c
@@ -27,6 +27,7 @@
 #include <asm/hvm/support.h>
 #include <asm/hvm/svm/avic.h>
 #include <asm/hvm/vlapic.h>
+#include <xen/keyhandler.h>
 #include <asm/p2m.h>
 #include <asm/page.h>
 
@@ -320,6 +321,8 @@ void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs)
     u32 id = vmcb->exitinfo2 >> 32;
     u32 index = vmcb->exitinfo2 && 0xFF;
 
+    curr->arch.hvm_svm.cnt_avic_incomp_ipi++;
+
     switch ( id )
     {
     case AVIC_INCMP_IPI_ERR_INVALID_INT_TYPE:
@@ -580,6 +583,8 @@ void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs)
     u32 offset = vmcb->exitinfo1 & 0xFF0;
     u32 rw = (vmcb->exitinfo1 >> 32) & 0x1;
 
+    curr->arch.hvm_svm.cnt_avic_noaccel++;
+
     switch ( offset )
     {
     case APIC_ID:
@@ -654,16 +659,59 @@ void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vec)
     if ( vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
         return;
 
+    v->arch.hvm_svm.cnt_avic_post_intr++;
     /*
      * If vcpu is running on another cpu, hit the doorbell to signal
      * it to process interrupt. Otherwise, kick it.
      */
     if ( v->is_running && (v != current) )
+    {
         wrmsrl(AVIC_DOORBELL, cpu_data[v->processor].apicid);
+        v->arch.hvm_svm.cnt_avic_doorbell++;
+    }
     else
         vcpu_kick(v);
 }
 
+static void avic_dump(unsigned char ch)
+{
+    struct domain *d;
+    struct vcpu *v;
+
+    printk("*********** SVM AVIC Statistics **************\n");
+
+    rcu_read_lock(&domlist_read_lock);
+
+    for_each_domain ( d )
+    {
+        if ( !is_hvm_domain(d) )
+            continue;
+        printk(">>> Domain %d <<<\n", d->domain_id);
+        for_each_vcpu ( d, v )
+        {
+            printk("\tVCPU %d\n", v->vcpu_id);
+            printk("\t* incomp_ipi = %u\n",
+                   v->arch.hvm_svm.cnt_avic_incomp_ipi);
+            printk("\t* noaccel    = %u\n",
+                   v->arch.hvm_svm.cnt_avic_noaccel);
+            printk("\t* post_intr  = %u\n",
+                   v->arch.hvm_svm.cnt_avic_post_intr);
+            printk("\t* doorbell   = %u\n",
+                   v->arch.hvm_svm.cnt_avic_doorbell);
+        }
+    }
+
+    rcu_read_unlock(&domlist_read_lock);
+
+    printk("**************************************\n");
+
+}
+
+void __init setup_avic_dump(void)
+{
+    register_keyhandler('j', avic_dump, "dump SVM AVIC", 1);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 7c0cda0..b8861d8 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1456,6 +1456,7 @@ const struct hvm_function_table * __init start_svm(void)
     }
 
     setup_vmcb_dump();
+    setup_avic_dump();
 
     svm_feature_flags = (current_cpu_data.extended_cpuid_level >= 0x8000000A ?
                          cpuid_edx(0x8000000A) : 0);
diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
index 9abf077..e2b810e 100644
--- a/xen/include/asm-x86/hvm/svm/vmcb.h
+++ b/xen/include/asm-x86/hvm/svm/vmcb.h
@@ -552,6 +552,12 @@ struct arch_svm_struct {
 
     struct avic_phy_apic_id_ent *avic_last_phy_id;
     u32 avic_last_ldr;
+
+    /* AVIC Statistics */
+    u32 cnt_avic_incomp_ipi;
+    u32 cnt_avic_noaccel;
+    u32 cnt_avic_post_intr;
+    u32 cnt_avic_doorbell;
 };
 
 struct vmcb_struct *alloc_vmcb(void);
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 05/10] x86/HVM/SVM: Add AVIC initialization code
  2016-12-31  5:45 ` [PATCH v2 05/10] x86/HVM/SVM: Add AVIC initialization code Suravee Suthikulpanit
@ 2017-01-02 16:37   ` Andrew Cooper
  2017-01-04 17:24     ` Suravee Suthikulpanit
  2017-01-10  3:06     ` Suravee Suthikulpanit
  2017-01-03 14:54   ` Boris Ostrovsky
  1 sibling, 2 replies; 48+ messages in thread
From: Andrew Cooper @ 2017-01-02 16:37 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel
  Cc: boris.ostrovsky, sherry.hurwitz, jbeulich

On 31/12/2016 05:45, Suravee Suthikulpanit wrote:
> diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
> new file mode 100644
> index 0000000..b62f982
> --- /dev/null
> +++ b/xen/arch/x86/hvm/svm/avic.c
> @@ -0,0 +1,232 @@
> +/*
> + * avic.c: implements AMD Advance Virtual Interrupt Controller (AVIC) support
> + * Copyright (c) 2016, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/domain_page.h>
> +#include <xen/sched.h>
> +#include <xen/stdbool.h>
> +#include <asm/acpi.h>
> +#include <asm/apicdef.h>
> +#include <asm/event.h>
> +#include <asm/hvm/emulate.h>
> +#include <asm/hvm/nestedhvm.h>
> +#include <asm/hvm/support.h>
> +#include <asm/hvm/svm/avic.h>
> +#include <asm/hvm/vlapic.h>
> +#include <asm/p2m.h>
> +#include <asm/page.h>
> +
> +/*
> + * Note: Current max index allowed for physical APIC ID table is 255.
> + */
> +#define AVIC_PHY_APIC_ID_MAX    0xFF
> +
> +#define AVIC_DOORBELL           0xc001011b
> +
> +#define AVIC_HPA_SHIFT  12
> +#define AVIC_HPA_MASK           (((1ULL << 40) - 1) << AVIC_HPA_SHIFT)

What does this mask represent?  I have spotted a truncation bug below,
but how to fix the issue depends on the meaning of the mask.

The only limits I can see in the spec is that the host physical
addresses must be present in system RAM, which presumably means they
should follow maxphysaddr like everything else?

> +#define AVIC_VAPIC_BAR_MASK     AVIC_HPA_MASK
> +
> +/*
> + * Note:
> + * Currently, svm-avic mode is not supported with nested virtualization.
> + * Therefore, it is not yet currently enabled by default. Once the support
> + * is in-place, this should be enabled by default.
> + */
> +bool svm_avic = 0;
> +boolean_param("svm-avic", svm_avic);

We are trying to avoid the proliferation of top-level options.  Please
could you introduce a "svm=" option which takes avic as a sub-boolean,
similar to how iommu= currently works?

> +
> +static struct avic_phy_apic_id_ent *
> +avic_get_phy_apic_id_ent(const struct vcpu *v, unsigned int index)

The Physical APIC table is per-domain, not per-cpu according to the
spec.  This function should take a struct domain, although I don't think
you need it at all (see below).

> +{
> +    struct avic_phy_apic_id_ent *avic_phy_apic_id_table;
> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
> +
> +    if ( !d->avic_phy_apic_id_table_mfn )
> +        return NULL;
> +
> +    /*
> +    * Note: APIC ID = 0xff is used for broadcast.
> +    *       APIC ID > 0xff is reserved.
> +    */
> +    if ( index >= 0xff )
> +        return NULL;
> +
> +    avic_phy_apic_id_table = mfn_to_virt(d->avic_phy_apic_id_table_mfn);

This is unsafe and will break on hosts with more than 5TB of RAM.

As you allocate avic_phy_apic_id_table_mfn with alloc_domheap_page(),
you must use {map,unmap}_domain_page() to get temporary mappings (which
will use mfn_to_virt() in the common case), or use
{map,unmap}_domain_page_global() to get permanent mappings.

As the values in here need modifying across a schedule, I would suggest
making a global mapping at create time, and storing the result in a
properly-typed pointer.

> +
> +    return &avic_phy_apic_id_table[index];
> +}
> +
> +int svm_avic_dom_init(struct domain *d)
> +{
> +    int ret = 0;
> +    struct page_info *pg;
> +    unsigned long mfn;

mfn_t please, which would also highlight the type error in svm_domain.

> +
> +    if ( !svm_avic )
> +        return 0;

|| !has_vlapic(d)

HVMLite domains may legitimately be configured without an LAPIC at all.

> +
> +    /*
> +     * Note:
> +     * AVIC hardware walks the nested page table to check permissions,
> +     * but does not use the SPA address specified in the leaf page
> +     * table entry since it uses  address in the AVIC_BACKING_PAGE pointer
> +     * field of the VMCB. Therefore, we set up a dummy page for APIC _mfn(0).
> +     */
> +    if ( !d->arch.hvm_domain.svm.avic_access_page_done )
> +    {
> +        set_mmio_p2m_entry(d, paddr_to_pfn(APIC_DEFAULT_PHYS_BASE),
> +                           _mfn(0), PAGE_ORDER_4K,
> +                           p2m_get_hostp2m(d)->default_access);
> +        d->arch.hvm_domain.svm.avic_access_page_done = true;

I don't see the purpose of this avic_access_page_done boolean, even in
later patches.

> +    }
> +
> +    /* Init AVIC logical APIC ID table */
> +    pg = alloc_domheap_page(d, MEMF_no_owner);
> +    if ( !pg )
> +    {
> +        gdprintk(XENLOG_ERR,
> +                "%d: AVIC logical APIC ID table could not be allocated.\n",
> +                d->domain_id);

Do we really need to have a printk() in the -ENOMEM case?  I'd
personally not bother.

> +        ret = -ENOMEM;
> +        goto err_out;
> +    }
> +    mfn = page_to_mfn(pg);
> +    clear_domain_page(_mfn(mfn));
> +    d->arch.hvm_domain.svm.avic_log_apic_id_table_mfn = mfn;
> +
> +    /* Init AVIC physical APIC ID table */
> +    pg = alloc_domheap_page(d, MEMF_no_owner);
> +    if ( !pg )
> +    {
> +        gdprintk(XENLOG_ERR,
> +                "%d: AVIC physical APIC ID table could not be allocated.\n",
> +                d->domain_id);
> +        ret = -ENOMEM;
> +        goto err_out;
> +    }
> +    mfn = page_to_mfn(pg);
> +    clear_domain_page(_mfn(mfn));
> +    d->arch.hvm_domain.svm.avic_phy_apic_id_table_mfn = mfn;
> +
> +    return ret;
> + err_out:
> +    svm_avic_dom_destroy(d);
> +    return ret;
> +}
> +
> +void svm_avic_dom_destroy(struct domain *d)
> +{
> +    if ( !svm_avic )
> +        return;
> +
> +    if ( d->arch.hvm_domain.svm.avic_phy_apic_id_table_mfn )
> +    {
> +        free_domheap_page(mfn_to_page(d->arch.hvm_domain.svm.avic_phy_apic_id_table_mfn));
> +        d->arch.hvm_domain.svm.avic_phy_apic_id_table_mfn = 0;
> +    }
> +
> +    if ( d->arch.hvm_domain.svm.avic_log_apic_id_table_mfn )
> +    {
> +        free_domheap_page(mfn_to_page(d->arch.hvm_domain.svm.avic_log_apic_id_table_mfn));
> +        d->arch.hvm_domain.svm.avic_log_apic_id_table_mfn = 0;
> +    }
> +}
> +
> +bool svm_avic_vcpu_enabled(const struct vcpu *v)
> +{
> +    const struct arch_svm_struct *s = &v->arch.hvm_svm;
> +    const struct vmcb_struct *vmcb = s->vmcb;
> +
> +    return svm_avic && vmcb->_vintr.fields.avic_enable;

This predicate should just be on avic_enable.  The svm_avic case should
prevent the avic_enable from being set in the first place.

> +}
> +
> +static inline u32 *
> +avic_get_bk_page_entry(const struct vcpu *v, u32 offset)
> +{
> +    const struct vlapic *vlapic = vcpu_vlapic(v);
> +    char *tmp;
> +
> +    if ( !vlapic || !vlapic->regs_page )
> +        return NULL;
> +
> +    tmp = (char *)page_to_virt(vlapic->regs_page);
> +    return (u32 *)(tmp + offset);

This is also unsafe over 5TB.  vlapic->regs is already a global mapping
which you should use.

Also, you should leave a comment by the allocation of regs_page that
AVIC depends on it being a full page allocation.

> +}
> +
> +void svm_avic_update_vapic_bar(const struct vcpu *v, uint64_t data)
> +{
> +    const struct arch_svm_struct *s = &v->arch.hvm_svm;
> +
> +    s->vmcb->avic_vapic_bar = data & AVIC_VAPIC_BAR_MASK;
> +    s->vmcb->cleanbits.fields.avic = 0;

While I can appreciate what you are trying to do here, altering the
physical address in IA32_APICBASE has never functioned properly under
Xen, as nothing updates the P2M.  Worse, with multi-vcpu guests, the use
of a single shared p2m makes this impossible to do properly.

I know it is non-architectural behaviour, but IMO vlapic_msr_set()
should be updated to raise #GP[0] when attempting to change the frame,
to make it fail obviously rather than having interrupts go missing.

Also I see that the Intel side (via vmx_vlapic_msr_changed()) makes the
same mistake.

> +}
> +
> +int svm_avic_init_vmcb(struct vcpu *v)
> +{
> +    paddr_t ma;
> +    u32 *apic_id_reg;
> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
> +    struct vmcb_struct *vmcb = s->vmcb;
> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
> +    const struct vlapic *vlapic = vcpu_vlapic(v);
> +    struct avic_phy_apic_id_ent entry;
> +
> +    if ( !svm_avic )
> +        return 0;
> +
> +    if ( !vlapic || !vlapic->regs_page )
> +        return -EINVAL;

Somewhere you need a bounds check against the APIC ID being within AVIC
limits.

> +
> +    apic_id_reg = avic_get_bk_page_entry(v, APIC_ID);
> +    if ( !apic_id_reg )
> +        return -EINVAL;

This should be vlapic_read()...

> +
> +    s->avic_last_phy_id = avic_get_phy_apic_id_ent(v, *apic_id_reg >> 24);

And this, GET_xAPIC_ID(),

> +    if ( !s->avic_last_phy_id )
> +        return -EINVAL;

Why does a pointer into the physical ID page need storing in
arch_svm_struct?

> +
> +    vmcb->avic_bk_pg_pa = page_to_maddr(vlapic->regs_page) & AVIC_HPA_MASK;

This use of AVIC_HPA_MASK may truncate the the address, which is
definitely a problem.

If AVIC_HPA_MASK isn't just related to maxphysaddr, then you need to
pass appropriate memflags into the alloc_domheap_page() calls to get a
suitable page.

> +    ma = d->avic_log_apic_id_table_mfn;
> +    vmcb->avic_log_apic_id = (ma << PAGE_SHIFT) & AVIC_HPA_MASK;

This ma << PAGE_SHIFT also highlights the type error.  ma != mfn.

> +    ma = d->avic_phy_apic_id_table_mfn;
> +    vmcb->avic_phy_apic_id = (ma << PAGE_SHIFT) & AVIC_HPA_MASK;
> +    vmcb->avic_phy_apic_id |= AVIC_PHY_APIC_ID_MAX;

Surely this should be some calculation derived from d->max_vcpus?

> +
> +    entry = *(s->avic_last_phy_id);
> +    smp_rmb();
> +    entry.bk_pg_ptr = (vmcb->avic_bk_pg_pa & AVIC_HPA_MASK) >> AVIC_HPA_SHIFT;
> +    entry.is_running = 0;
> +    entry.valid = 1;
> +    *(s->avic_last_phy_id) = entry;
> +    smp_wmb();

During domain creation, no guests are running, so you can edit this cleanly.

What values are actually being excluded here?  This, and other patches,
look like you are actually just trying to do a read_atomic(), modify,
write_atomic() update, rather than actually requiring ordering.

> +
> +    svm_avic_update_vapic_bar(v, APIC_DEFAULT_PHYS_BASE);
> +
> +    vmcb->_vintr.fields.avic_enable = 1;
> +
> +    return 0;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> @@ -1799,6 +1802,10 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
>  
>      switch ( msr )
>      {
> +    case MSR_IA32_APICBASE:
> +        if ( svm_avic_vcpu_enabled(v) )
> +            svm_avic_update_vapic_bar(v, msr_content);
> +        break;

I think this is dead code.  MSR_IA32_APICBASE is handled in
hvm_msr_write_intercept(), and won't fall into the vendor hook.  If you
were to continue down this route, I would add another pi_ops hook, and
replace the VMX layering violation in vlapic_msr_set().

>      case MSR_IA32_SYSENTER_CS:
>      case MSR_IA32_SYSENTER_ESP:
>      case MSR_IA32_SYSENTER_EIP:
> diff --git a/xen/include/asm-x86/hvm/svm/avic.h b/xen/include/asm-x86/hvm/svm/avic.h
> new file mode 100644
> index 0000000..16b433c
> --- /dev/null
> +++ b/xen/include/asm-x86/hvm/svm/avic.h
> @@ -0,0 +1,40 @@
> +#ifndef _SVM_AVIC_H_
> +#define _SVM_AVIC_H_
> +
> +enum avic_incmp_ipi_err_code {
> +    AVIC_INCMP_IPI_ERR_INVALID_INT_TYPE,
> +    AVIC_INCMP_IPI_ERR_TARGET_NOT_RUN,
> +    AVIC_INCMP_IPI_ERR_INV_TARGET,
> +    AVIC_INCMP_IPI_ERR_INV_BK_PAGE,
> +};
> +
> +struct __attribute__ ((__packed__))

Please include xen/compiler.h and use __packed

> +avic_log_apic_id_ent {
> +    u32 guest_phy_apic_id : 8;
> +    u32 res               : 23;
> +    u32 valid             : 1;
> +};
> +
> +struct __attribute__ ((__packed__))
> +avic_phy_apic_id_ent {
> +    u64 host_phy_apic_id  : 8;
> +    u64 res1              : 4;
> +    u64 bk_pg_ptr         : 40;

This field should have mfn in its name, as it has an implicit shift in
its definition.

> +    u64 res2              : 10;
> +    u64 is_running        : 1;
> +    u64 valid             : 1;
> +};
> <snip>
> diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
> index 43cb98e..d3d045f 100644
> --- a/xen/include/asm-x86/hvm/svm/vmcb.h
> +++ b/xen/include/asm-x86/hvm/svm/vmcb.h
> @@ -496,6 +496,24 @@ struct __packed vmcb_struct {
>  };
>  
>  struct svm_domain {
> +    /*
> +     * This per-domain table is used by the hardware to locate
> +     * the vAPIC backing page to be used to deliver interrupts
> +     * based on the guest physical APIC ID.
> +     */
> +    paddr_t avic_phy_apic_id_table_mfn;

paddrs and mfns differ by PAGE_SHIFT.  Please use mfn_t here.

> +
> +    /*
> +     * This per-domain table is used by the hardware to map
> +     * logically addressed interrupt requests (w/ guest logical APIC id)
> +     * to the guest physical APIC ID.
> +     */
> +    paddr_t avic_log_apic_id_table_mfn;
> +
> +    u32 avic_max_vcpu_id;
> +    bool avic_access_page_done;
> +    spinlock_t avic_ldr_mode_lock;

You introduce the spinlock here, but initialise it in the subsequent
patch.  I suspect that hunk wants moving into this patch?

~Andrew

> +    u32 avic_ldr_mode;
>  };
>  
>  struct arch_svm_struct {


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 06/10] x86/SVM: Add AVIC vmexit handlers
  2016-12-31  5:45 ` [PATCH v2 06/10] x86/SVM: Add AVIC vmexit handlers Suravee Suthikulpanit
@ 2017-01-02 17:28   ` Andrew Cooper
  2017-01-05  4:07     ` Suravee Suthikulpanit
  2017-01-03 15:34   ` Boris Ostrovsky
  1 sibling, 1 reply; 48+ messages in thread
From: Andrew Cooper @ 2017-01-02 17:28 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel
  Cc: boris.ostrovsky, sherry.hurwitz, jbeulich

On 31/12/2016 05:45, Suravee Suthikulpanit wrote:
> VMEXIT_DO_NOACCEL:
> This oocurs when a guest access to an APIC register that cannot be

"occurs"

> accelerated by AVIC. In this case, Xen tries to emulate register accesses.
>
> This fault is also generated if an EOI isattempted when the highest priority

"is attempted"

> @@ -122,6 +125,8 @@ int svm_avic_dom_init(struct domain *d)
>      clear_domain_page(_mfn(mfn));
>      d->arch.hvm_domain.svm.avic_phy_apic_id_table_mfn = mfn;
>  
> +    spin_lock_init(&d->arch.hvm_domain.svm.avic_ldr_mode_lock);
> +

I think this hunk belongs in the previous patch.

>      return ret;
>   err_out:
>      svm_avic_dom_destroy(d);
> @@ -222,6 +227,338 @@ int svm_avic_init_vmcb(struct vcpu *v)
>  }
>  
>  /*
> + * Note:
> + * This function handles the AVIC_INCOMP_IPI #vmexit when AVIC is enabled.
> + * The hardware generates this fault when an IPI could not be delivered
> + * to all targeted guest virtual processors because at least one guest
> + * virtual processor was not allocated to a physical core at the time.
> + */
> +void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs)
> +{
> +    struct vcpu *curr = current;
> +    struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
> +    u32 icrh = vmcb->exitinfo1 >> 32;
> +    u32 icrl = vmcb->exitinfo1;
> +    u32 id = vmcb->exitinfo2 >> 32;
> +    u32 index = vmcb->exitinfo2 && 0xFF;
> +
> +    switch ( id )
> +    {
> +    case AVIC_INCMP_IPI_ERR_INVALID_INT_TYPE:
> +        /*
> +         * AVIC hardware handles the delivery of
> +         * IPIs when the specified Message Type is Fixed
> +         * (also known as fixed delivery mode) and
> +         * the Trigger Mode is edge-triggered. The hardware
> +         * also supports self and broadcast delivery modes
> +         * specified via the Destination Shorthand(DSH)
> +         * field of the ICRL. Logical and physical APIC ID
> +         * formats are supported. All other IPI types cause
> +         * a #VMEXIT, which needs to emulated.
> +         */
> +        vlapic_reg_write(curr, APIC_ICR2, icrh);
> +        vlapic_reg_write(curr, APIC_ICR, icrl);
> +        break;

Please have a newline between break and case.

> +    case AVIC_INCMP_IPI_ERR_TARGET_NOT_RUN:
> +    {
> +        /*
> +         * At this point, we expect that the AVIC HW has already
> +         * set the appropriate IRR bits on the valid target
> +         * vcpus. So, we just need to kick the appropriate vcpu.
> +         */
> +        struct vcpu *curc;
> +        struct domain *curd = curr->domain;

currd is the more common name for this, and I would use a plain v
instead of curc as it isn't curr.

> +        uint32_t dest = GET_xAPIC_DEST_FIELD(icrh);
> +        uint32_t short_hand = icrl & APIC_SHORT_MASK;
> +        bool dest_mode = !!(icrl & APIC_DEST_MASK);
> +
> +        for_each_vcpu ( curd, curc )
> +        {
> +            if ( curc != curr &&
> +                 vlapic_match_dest(vcpu_vlapic(curc), vcpu_vlapic(curr),
> +                                   short_hand, dest, dest_mode) )
> +            {
> +                vcpu_kick(curc);
> +                break;
> +            }
> +        }
> +        break;
> +    }
> +    case AVIC_INCMP_IPI_ERR_INV_TARGET:
> +        dprintk(XENLOG_ERR,
> +                "SVM: %s: Invalid IPI target (icr=%#08x:%08x, idx=%u)\n",
> +                __func__, icrh, icrl, index);
> +        break;

Shouldn't this case be emulated to raise appropriate APIC errors in the
guests view?

> +    case AVIC_INCMP_IPI_ERR_INV_BK_PAGE:
> +        dprintk(XENLOG_ERR,
> +                "SVM: %s: Invalid bk page (icr=%#08x:%08x, idx=%u)\n",
> +                __func__, icrh, icrl, index);
> +        break;
> +    default:
> +        dprintk(XENLOG_ERR, "SVM: %s: Unknown IPI interception (%#x)\n",
> +                __func__, id);

These two cases are an error in Xen's setup, or the hardware.  They
should be gprintk()s (so as not to disappear in release builds), and
cause a domain_crash().

> +    }
> +}
> +
> +static struct avic_log_apic_id_ent *
> +avic_get_logical_id_entry(const struct vcpu *v, u32 ldr, bool flat)
> +{
> +    unsigned int index;
> +    struct avic_log_apic_id_ent *avic_log_apid_id_table;
> +    const struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
> +    unsigned int dest_id = GET_APIC_LOGICAL_ID(ldr);
> +
> +    if ( !dest_id )
> +        return NULL;
> +
> +    if ( flat )
> +    {
> +        index = ffs(dest_id) - 1;
> +        if ( index > 7 )
> +            return NULL;
> +    }
> +    else
> +    {
> +        unsigned int cluster = (dest_id & 0xf0) >> 4;
> +        int apic = ffs(dest_id & 0x0f) - 1;
> +
> +        if ( (apic < 0) || (apic > 7) || (cluster >= 0xf) )
> +            return NULL;
> +        index = (cluster << 2) + apic;
> +    }
> +
> +    ASSERT(index <= 255);
> +
> +    avic_log_apid_id_table = mfn_to_virt(d->avic_log_apic_id_table_mfn);

As with the phsyical table, this is buggy above 5TB, and should probably
be a global mapping to start with.

> +
> +    return &avic_log_apid_id_table[index];
> +}
> +
> +static int avic_ldr_write(struct vcpu *v, u8 g_phy_id, u32 ldr, bool valid)
> +{
> +    struct avic_log_apic_id_ent *entry, new_entry;
> +    u32 *bp = avic_get_bk_page_entry(v, APIC_DFR);
> +
> +    if ( !bp )
> +        return -EINVAL;
> +
> +    entry = avic_get_logical_id_entry(v, ldr, (*bp == APIC_DFR_FLAT));
> +    if (!entry)
> +        return -EINVAL;
> +
> +    new_entry = *entry;
> +    smp_rmb();
> +    new_entry.guest_phy_apic_id = g_phy_id;
> +    new_entry.valid = valid;
> +    *entry = new_entry;
> +    smp_wmb();
> +
> +    return 0;
> +}
> +
> +static int avic_handle_ldr_update(struct vcpu *v)
> +{
> +    int ret = 0;
> +    u32 *ldr = avic_get_bk_page_entry(v, APIC_LDR);
> +    u32 *apic_id_reg = avic_get_bk_page_entry(v, APIC_ID);
> +
> +    if ( !ldr || !*ldr || !apic_id_reg )
> +        return -EINVAL;
> +
> +    ret = avic_ldr_write(v, GET_APIC_PHYSICAL_ID(*apic_id_reg), *ldr, true);
> +    if ( ret && v->arch.hvm_svm.avic_last_ldr )
> +    {
> +        /*
> +         * Note:
> +         * In case of failure to update LDR register,
> +         * we set the guest physical APIC ID to 0,
> +         * and set the entry logical APID ID entry
> +         * to invalid (false).
> +         */
> +        avic_ldr_write(v, 0, v->arch.hvm_svm.avic_last_ldr, false);
> +        v->arch.hvm_svm.avic_last_ldr = 0;
> +    }
> +    else
> +    {
> +        /*
> +         * Note:
> +         * This saves the last valid LDR so that we
> +         * know which entry in the local APIC ID
> +         * to clean up when the LDR is updated.
> +         */
> +        v->arch.hvm_svm.avic_last_ldr = *ldr;
> +    }
> +
> +    return ret;
> +}
> +
> +static int avic_handle_apic_id_update(struct vcpu *v, bool init)
> +{
> +    struct avic_phy_apic_id_ent *old, *new;
> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
> +    u32 *apic_id_reg = avic_get_bk_page_entry(v, APIC_ID);
> +
> +    if ( !apic_id_reg )
> +        return -EINVAL;
> +
> +    old = s->avic_last_phy_id;
> +    ASSERT(old);
> +
> +    new = avic_get_phy_apic_id_ent(v, GET_APIC_PHYSICAL_ID(*apic_id_reg));
> +    if ( !new )
> +        return 0;
> +
> +    /* We need to move physical_id_entry to new offset */
> +    *new = *old;
> +    *((u64 *)old) = 0ULL;
> +    s->avic_last_phy_id = new;
> +
> +    /*
> +     * Update the guest physical APIC ID in the logical
> +     * APIC ID table entry if LDR is already setup.
> +     */
> +    if ( v->arch.hvm_svm.avic_last_ldr )
> +        avic_handle_ldr_update(v);
> +
> +    return 0;
> +}
> +
> +static int avic_handle_dfr_update(struct vcpu *v)
> +{
> +    u32 mod;
> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
> +    u32 *dfr = avic_get_bk_page_entry(v, APIC_DFR);
> +
> +    if ( !dfr )
> +        return -EINVAL;
> +
> +    mod = (*dfr >> 28) & 0xFu;
> +
> +    spin_lock(&d->avic_ldr_mode_lock);
> +    if ( d->avic_ldr_mode != mod )
> +    {
> +        /*
> +         * We assume that all local APICs are using the same type.
> +         * If LDR mode changes, we need to flush the domain AVIC logical
> +         * APIC id table.
> +         */

The logical APIC ID table is per-domain, yet LDR mode is per-vcpu.  How
can these coexist if we flush the table like this?  How would a
multi-vcpu domain actually change its mode without this logic in Xen
corrupting the table?

> +        clear_domain_page(_mfn(d->avic_log_apic_id_table_mfn));
> +        smp_wmb();
> +        d->avic_ldr_mode = mod;
> +    }
> +    spin_unlock(&d->avic_ldr_mode_lock);
> +
> +    if ( v->arch.hvm_svm.avic_last_ldr )
> +        avic_handle_ldr_update(v);
> +
> +    return 0;
> +}
> +
> +static int avic_unaccel_trap_write(struct vcpu *v)
> +{
> +    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
> +    u32 offset = vmcb->exitinfo1 & AVIC_UNACCEL_ACCESS_OFFSET_MASK;
> +    u32 *reg = avic_get_bk_page_entry(v, offset);
> +
> +    if ( !reg )
> +        return X86EMUL_UNHANDLEABLE;
> +
> +    switch ( offset )
> +    {
> +    case APIC_ID:
> +        if ( avic_handle_apic_id_update(v, false) )
> +            return X86EMUL_UNHANDLEABLE;
> +        break;
> +    case APIC_LDR:
> +        if ( avic_handle_ldr_update(v) )
> +            return X86EMUL_UNHANDLEABLE;
> +        break;
> +    case APIC_DFR:
> +        if ( avic_handle_dfr_update(v) )
> +            return X86EMUL_UNHANDLEABLE;
> +        break;
> +    default:
> +        break;
> +    }
> +
> +    vlapic_reg_write(v, offset, *reg);
> +
> +    return X86EMUL_OKAY;
> +}
> +
> +/*
> + * Note:
> + * This function handles the AVIC_NOACCEL #vmexit when AVIC is enabled.
> + * The hardware generates this fault when :
> + * -  A guest access to an APIC register that is not accelerated
> + *    by AVIC hardware.
> + * - EOI is attempted when the highest priority in-service interrupt
> + *   is level-triggered.
> + */
> +void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs)
> +{
> +    struct vcpu *curr = current;
> +    struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
> +    u32 offset = vmcb->exitinfo1 & 0xFF0;
> +    u32 rw = (vmcb->exitinfo1 >> 32) & 0x1;
> +
> +    switch ( offset )
> +    {
> +    case APIC_ID:
> +    case APIC_EOI:
> +    case APIC_RRR:
> +    case APIC_LDR:
> +    case APIC_DFR:
> +    case APIC_SPIV:
> +    case APIC_ESR:
> +    case APIC_ICR:
> +    case APIC_LVTT:
> +    case APIC_LVTTHMR:
> +    case APIC_LVTPC:
> +    case APIC_LVT0:
> +    case APIC_LVT1:
> +    case APIC_LVTERR:
> +    case APIC_TMICT:
> +    case APIC_TDCR:

Please use a bitmap in a similar way to hvm_x2apic_msr_read().  It
generates far more efficient code.

> +        /*
> +         * Handling AVIC Trap (intercept right after the access).
> +         */
> +        if ( !rw )
> +        {
> +            /*
> +             * If a read trap happens, the CPU microcode does not
> +             * implement the spec.

So it is all microcode then :)

> +             */
> +            BUG();

domain_crash() please, and a suitable gprintk().  There is no need to
take down the entire system, as non-avic guests should be able to
continue functioning.

> +        }
> +        if ( avic_unaccel_trap_write(curr) != X86EMUL_OKAY )
> +        {
> +            gprintk(XENLOG_ERR, "%s: Failed to handle trap write (%#x)\n",
> +                    __func__, offset);
> +            return;
> +        }
> +        break;
> +    default:
> +        /*
> +         * Handling AVIC Fault (intercept before the access).
> +         */
> +        if ( !rw )
> +        {
> +            u32 *entry = avic_get_bk_page_entry(curr, offset);
> +
> +            if ( !entry )
> +                return;
> +
> +            *entry = vlapic_read_aligned(vcpu_vlapic(curr), offset);
> +        }
> +        hvm_emulate_one_vm_event(EMUL_KIND_NORMAL, TRAP_invalid_op,
> +                                 HVM_DELIVER_NO_ERROR_CODE);

What is this doing here?  I don't see any connection.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 07/10] x86/SVM: Add vcpu scheduling support for AVIC
  2016-12-31  5:45 ` [PATCH v2 07/10] x86/SVM: Add vcpu scheduling support for AVIC Suravee Suthikulpanit
@ 2017-01-02 17:35   ` Andrew Cooper
  2017-01-03 15:43   ` Boris Ostrovsky
  1 sibling, 0 replies; 48+ messages in thread
From: Andrew Cooper @ 2017-01-02 17:35 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel
  Cc: boris.ostrovsky, sherry.hurwitz, jbeulich

On 31/12/2016 05:45, Suravee Suthikulpanit wrote:
> diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
> index c0b7151..6351c8e 100644
> --- a/xen/arch/x86/hvm/svm/avic.c
> +++ b/xen/arch/x86/hvm/svm/avic.c
> @@ -73,6 +73,79 @@ avic_get_phy_apic_id_ent(const struct vcpu *v, unsigned int index)
>      return &avic_phy_apic_id_table[index];
>  }
>  
> +static void avic_vcpu_load(struct vcpu *v)
> +{
> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
> +    int h_phy_apic_id;
> +    struct avic_phy_apic_id_ent entry;
> +
> +    if ( !s->avic_last_phy_id )
> +        return;

What is the purpose of this check?  The VM should uniformly be using
AVIC or not.

> +
> +    if ( test_bit(_VPF_blocked, &v->pause_flags) )
> +        return;

This has better never be true if the scheduler has decides to context
switch into v's state.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 08/10] x86/SVM: Add interrupt management code via AVIC
  2016-12-31  5:45 ` [PATCH v2 08/10] x86/SVM: Add interrupt management code via AVIC Suravee Suthikulpanit
@ 2017-01-02 17:45   ` Andrew Cooper
  2017-02-28 12:01     ` George Dunlap
  2017-01-05 16:01   ` Jan Beulich
  1 sibling, 1 reply; 48+ messages in thread
From: Andrew Cooper @ 2017-01-02 17:45 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel
  Cc: George Dunlap, Dario Faggioli, jbeulich, sherry.hurwitz, boris.ostrovsky

On 31/12/2016 05:45, Suravee Suthikulpanit wrote:
> Enabling AVIC implicitly disables the V_IRQ, V_INTR_PRIO, V_IGN_TPR,
> and V_INTR_VECTOR fields in the VMCB Control Word. Therefore, this patch
> introduces new interrupt injection code via AVIC backing page.
>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Jan Beulich <JBeulich@suse.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
>  xen/arch/x86/hvm/svm/avic.c        | 28 ++++++++++++++++++++++++++++
>  xen/arch/x86/hvm/svm/intr.c        |  4 ++++
>  xen/arch/x86/hvm/svm/svm.c         | 12 ++++++++++--
>  xen/include/asm-x86/hvm/svm/avic.h |  1 +
>  4 files changed, 43 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
> index 6351c8e..faa5e45 100644
> --- a/xen/arch/x86/hvm/svm/avic.c
> +++ b/xen/arch/x86/hvm/svm/avic.c
> @@ -636,6 +636,34 @@ void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs)
>      return;
>  }
>  
> +void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vec)
> +{
> +    struct vlapic *vlapic = vcpu_vlapic(v);
> +
> +    /* Fallback to use non-AVIC if vcpu is not enabled with AVIC. */
> +    if ( !svm_avic_vcpu_enabled(v) )
> +    {
> +        if ( !vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
> +            vcpu_kick(v);
> +        return;
> +    }
> +
> +    if ( !(guest_cpu_user_regs()->eflags & X86_EFLAGS_IF) )
> +        return;

Won't this discard the interrupt?

> +
> +    if ( vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
> +        return;
> +
> +    /*
> +     * If vcpu is running on another cpu, hit the doorbell to signal
> +     * it to process interrupt. Otherwise, kick it.
> +     */
> +    if ( v->is_running && (v != current) )
> +        wrmsrl(AVIC_DOORBELL, cpu_data[v->processor].apicid);

Hmm - my gut feeling is that this is racy without holding the scheduler
lock for the target pcpu.  Nothing (I am aware of) excludes ->is_running
and ->processor changing behind our back.

CC'ing George and Dario for their input.

~Andrew

> +    else
> +        vcpu_kick(v);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 09/10] x86/SVM: Hook up miscellaneous AVIC functions
  2016-12-31  5:46 ` [PATCH v2 09/10] x86/SVM: Hook up miscellaneous AVIC functions Suravee Suthikulpanit
@ 2017-01-02 17:49   ` Andrew Cooper
  2017-01-05 16:05   ` Jan Beulich
  1 sibling, 0 replies; 48+ messages in thread
From: Andrew Cooper @ 2017-01-02 17:49 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel
  Cc: boris.ostrovsky, sherry.hurwitz, jbeulich

On 31/12/2016 05:46, Suravee Suthikulpanit wrote:
> Hook up virtual_intr_delivery_enabled and deliver_posted_intr functions
> when AVIC is enabled.
>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Jan Beulich <JBeulich@suse.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
>  xen/arch/x86/hvm/svm/svm.c         | 26 +++++++++++++++++++++-----
>  xen/include/asm-x86/hvm/svm/avic.h |  3 +++
>  2 files changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index 922f48f..7c0cda0 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -1438,6 +1438,11 @@ static int svm_cpu_up(void)
>      return 0;
>  }
>  
> +static inline int svm_avic_enabled(void)
> +{
> +    return svm_avic;
> +}
> +
>  const struct hvm_function_table * __init start_svm(void)
>  {
>      bool_t printed = 0;
> @@ -1472,16 +1477,27 @@ const struct hvm_function_table * __init start_svm(void)
>      P(cpu_has_svm_decode, "DecodeAssists");
>      P(cpu_has_pause_filter, "Pause-Intercept Filter");
>      P(cpu_has_tsc_ratio, "TSC Rate MSR");
> -    P(cpu_has_svm_avic, "AVIC");
> -#undef P
> -
> -    if ( !printed )
> -        printk(" - none\n");
>  
>      svm_function_table.hap_supported = !!cpu_has_svm_npt;
>      svm_function_table.hap_capabilities = HVM_HAP_SUPERPAGE_2MB |
>          ((cpuid_edx(0x80000001) & 0x04000000) ? HVM_HAP_SUPERPAGE_1GB : 0);
>  
> +    if ( !cpu_has_svm_avic )
> +        svm_avic = 0;
> +
> +    if ( svm_avic )
> +    {
> +        svm_function_table.deliver_posted_intr  = svm_avic_deliver_posted_intr;
> +        svm_function_table.virtual_intr_delivery_enabled = svm_avic_enabled;
> +        P(cpu_has_svm_avic, "AVIC (enabled)");
> +    }
> +    else
> +        P(cpu_has_svm_avic, "AVIC (disabled)");
> +#undef P
> +
> +    if ( !printed )
> +        printk(" - none\n");
> +
>      return &svm_function_table;
>  }
>  
> diff --git a/xen/include/asm-x86/hvm/svm/avic.h b/xen/include/asm-x86/hvm/svm/avic.h
> index 1676e01..5be3e76 100644
> --- a/xen/include/asm-x86/hvm/svm/avic.h
> +++ b/xen/include/asm-x86/hvm/svm/avic.h
> @@ -41,4 +41,7 @@ void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs);
>  void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs);
>  
>  void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vector);
> +
> +void setup_avic_dump(void);
> +
>  #endif /* _SVM_AVIC_H_ */

This hunk should be in the subsquent patch.  Otherwise, Reviewed-by:
Andrew Cooper <andrew.cooper3@citrix.com>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 05/10] x86/HVM/SVM: Add AVIC initialization code
  2016-12-31  5:45 ` [PATCH v2 05/10] x86/HVM/SVM: Add AVIC initialization code Suravee Suthikulpanit
  2017-01-02 16:37   ` Andrew Cooper
@ 2017-01-03 14:54   ` Boris Ostrovsky
  1 sibling, 0 replies; 48+ messages in thread
From: Boris Ostrovsky @ 2017-01-03 14:54 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel; +Cc: andrew.cooper3, sherry.hurwitz, jbeulich


> +
> +#define AVIC_DOORBELL           0xc001011b

MSR_AVIC_DOORBELL (and it should probably go to include/asm-x86/msr-index.h)

> +
> +#define AVIC_HPA_SHIFT  12

Is there a reason not to use regular PAGE_SHIFT?

> +#define AVIC_HPA_MASK           (((1ULL << 40) - 1) << AVIC_HPA_SHIFT)
> +#define AVIC_VAPIC_BAR_MASK     AVIC_HPA_MASK
> +
> +/*
> + * Note:
> + * Currently, svm-avic mode is not supported with nested virtualization.
> + * Therefore, it is not yet currently enabled by default. Once the support
> + * is in-place, this should be enabled by default.
> + */
> +bool svm_avic = 0;
> +boolean_param("svm-avic", svm_avic);
> +
> +static struct avic_phy_apic_id_ent *
> +avic_get_phy_apic_id_ent(const struct vcpu *v, unsigned int index)

Something like phys_apicid might be more descriptive.

> +{
> +    struct avic_phy_apic_id_ent *avic_phy_apic_id_table;
> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
> +
> +    if ( !d->avic_phy_apic_id_table_mfn )
> +        return NULL;
> +
> +    /*
> +    * Note: APIC ID = 0xff is used for broadcast.
> +    *       APIC ID > 0xff is reserved.
> +    */
> +    if ( index >= 0xff )
> +        return NULL;
> +
> +    avic_phy_apic_id_table = mfn_to_virt(d->avic_phy_apic_id_table_mfn);
> +
> +    return &avic_phy_apic_id_table[index];
> +}


> +}
> +
> +static inline u32 *
> +avic_get_bk_page_entry(const struct vcpu *v, u32 offset)
> +{
> +    const struct vlapic *vlapic = vcpu_vlapic(v);
> +    char *tmp;
> +
> +    if ( !vlapic || !vlapic->regs_page )

Should this be changed to an ASSERT? I think this is only called from
places that already tested this or from places that you wouldn't get to
unless these both were true (like VMEXIT handler).

-boris


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 06/10] x86/SVM: Add AVIC vmexit handlers
  2016-12-31  5:45 ` [PATCH v2 06/10] x86/SVM: Add AVIC vmexit handlers Suravee Suthikulpanit
  2017-01-02 17:28   ` Andrew Cooper
@ 2017-01-03 15:34   ` Boris Ostrovsky
  2017-01-05  6:41     ` Suravee Suthikulpanit
  1 sibling, 1 reply; 48+ messages in thread
From: Boris Ostrovsky @ 2017-01-03 15:34 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel; +Cc: andrew.cooper3, sherry.hurwitz, jbeulich


> +
> +static int avic_ldr_write(struct vcpu *v, u8 g_phy_id, u32 ldr, bool valid)
> +{
> +    struct avic_log_apic_id_ent *entry, new_entry;
> +    u32 *bp = avic_get_bk_page_entry(v, APIC_DFR);

dfr would be a better name (and you use it in avic_handle_dfr_update()).

Also, 'logical' instead of 'log' in avic_log_apic_id_ent would be far
less confusing imo.

> +
> +    if ( !bp )
> +        return -EINVAL;
> +
> +    entry = avic_get_logical_id_entry(v, ldr, (*bp == APIC_DFR_FLAT));
> +    if (!entry)
> +        return -EINVAL;
> +
> +    new_entry = *entry;
> +    smp_rmb();
> +    new_entry.guest_phy_apic_id = g_phy_id;
> +    new_entry.valid = valid;
> +    *entry = new_entry;
> +    smp_wmb();
> +
> +    return 0;
> +}



> +
> +static int avic_handle_apic_id_update(struct vcpu *v, bool init)
> +{
> +    struct avic_phy_apic_id_ent *old, *new;
> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
> +    u32 *apic_id_reg = avic_get_bk_page_entry(v, APIC_ID);
> +
> +    if ( !apic_id_reg )
> +        return -EINVAL;
> +
> +    old = s->avic_last_phy_id;
> +    ASSERT(old);
> +
> +    new = avic_get_phy_apic_id_ent(v, GET_APIC_PHYSICAL_ID(*apic_id_reg));
> +    if ( !new )
> +        return 0;
> +
> +    /* We need to move physical_id_entry to new offset */
> +    *new = *old;
> +    *((u64 *)old) = 0ULL;

This is pretty ugly. Can you define an invalid entry and assign it here
instead?

> +    s->avic_last_phy_id = new;
> +
> +    /*
> +     * Update the guest physical APIC ID in the logical
> +     * APIC ID table entry if LDR is already setup.
> +     */
> +    if ( v->arch.hvm_svm.avic_last_ldr )
> +        avic_handle_ldr_update(v);
> +
> +    return 0;
> +}
> +
> +static int avic_handle_dfr_update(struct vcpu *v)
> +{
> +    u32 mod;
> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
> +    u32 *dfr = avic_get_bk_page_entry(v, APIC_DFR);
> +
> +    if ( !dfr )
> +        return -EINVAL;
> +
> +    mod = (*dfr >> 28) & 0xFu;
> +
> +    spin_lock(&d->avic_ldr_mode_lock);
> +    if ( d->avic_ldr_mode != mod )
> +    {
> +        /*
> +         * We assume that all local APICs are using the same type.
> +         * If LDR mode changes, we need to flush the domain AVIC logical
> +         * APIC id table.
> +         */
> +        clear_domain_page(_mfn(d->avic_log_apic_id_table_mfn));
> +        smp_wmb();

Is this needed? I think clear_page() guarantees visibility/ordering (we
have SFENCE in clear_page_sse2() for this reason).


-boris


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 07/10] x86/SVM: Add vcpu scheduling support for AVIC
  2016-12-31  5:45 ` [PATCH v2 07/10] x86/SVM: Add vcpu scheduling support for AVIC Suravee Suthikulpanit
  2017-01-02 17:35   ` Andrew Cooper
@ 2017-01-03 15:43   ` Boris Ostrovsky
  1 sibling, 0 replies; 48+ messages in thread
From: Boris Ostrovsky @ 2017-01-03 15:43 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel; +Cc: andrew.cooper3, sherry.hurwitz, jbeulich

On 12/31/2016 12:45 AM, Suravee Suthikulpanit wrote:
> Add hooks to manage AVIC data structure during vcpu scheduling.
>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Jan Beulich <JBeulich@suse.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
>  xen/arch/x86/hvm/svm/avic.c | 78 +++++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/hvm/svm/svm.c  | 10 ++++++
>  2 files changed, 88 insertions(+)
>
> diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
> index c0b7151..6351c8e 100644
> --- a/xen/arch/x86/hvm/svm/avic.c
> +++ b/xen/arch/x86/hvm/svm/avic.c
> @@ -73,6 +73,79 @@ avic_get_phy_apic_id_ent(const struct vcpu *v, unsigned int index)
>      return &avic_phy_apic_id_table[index];
>  }
>  
> +static void avic_vcpu_load(struct vcpu *v)
> +{
> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
> +    int h_phy_apic_id;
> +    struct avic_phy_apic_id_ent entry;
> +
> +    if ( !s->avic_last_phy_id )
> +        return;
> +
> +    if ( test_bit(_VPF_blocked, &v->pause_flags) )
> +        return;
> +
> +    /*
> +     * Note: APIC ID = 0xff is used for broadcast.
> +     *       APIC ID > 0xff is reserved.
> +     */
> +    h_phy_apic_id = cpu_data[v->processor].apicid;
> +    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
> +
> +    entry = *(s->avic_last_phy_id);
> +    smp_rmb();
> +    entry.host_phy_apic_id = h_phy_apic_id;
> +    entry.is_running = 1;
> +    *(s->avic_last_phy_id) = entry;
> +    smp_wmb();
> +}
> +
> +static void avic_vcpu_unload(struct vcpu *v)
> +{
> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
> +    struct avic_phy_apic_id_ent entry;
> +
> +    if ( !svm_avic || !s->avic_last_phy_id )
> +        return;
> +
> +    entry = *(s->avic_last_phy_id);
> +    smp_rmb();
> +    entry.is_running = 0;
> +    *(s->avic_last_phy_id) = entry;
> +    smp_wmb();
> +}
> +
> +static void avic_vcpu_resume(struct vcpu *v)
> +{
> +    struct avic_phy_apic_id_ent entry;
> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
> +
> +    ASSERT(svm_avic_vcpu_enabled(v));
> +    ASSERT(s->avic_last_phy_id);
> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
> +
> +    entry = *(s->avic_last_phy_id);
> +    smp_rmb();
> +    entry.is_running = 1;
> +    *(s->avic_last_phy_id) = entry;
> +    smp_wmb();
> +}
> +
> +static void avic_vcpu_block(struct vcpu *v)
> +{
> +    struct avic_phy_apic_id_ent entry;
> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
> +
> +    ASSERT(svm_avic_vcpu_enabled(v));
> +    ASSERT(s->avic_last_phy_id);
> +
> +    entry = *(s->avic_last_phy_id);
> +    smp_rmb();
> +    entry.is_running = 0;
> +    *(s->avic_last_phy_id) = entry;
> +    smp_wmb();
> +}

I don't understand use of r/w barriers here (and I think Andrew had
similar comment) but in any case the last 5 lines can be factored out.

>  int svm_avic_dom_init(struct domain *d)
>  {
>      int ret = 0;
> @@ -127,6 +200,11 @@ int svm_avic_dom_init(struct domain *d)
>  
>      spin_lock_init(&d->arch.hvm_domain.svm.avic_ldr_mode_lock);
>  
> +    d->arch.hvm_domain.pi_ops.pi_switch_to = avic_vcpu_unload;
> +    d->arch.hvm_domain.pi_ops.pi_switch_from = avic_vcpu_load;
> +    d->arch.hvm_domain.pi_ops.vcpu_block = avic_vcpu_block;
> +    d->arch.hvm_domain.pi_ops.pi_do_resume = avic_vcpu_resume;
> +

I wonder whether it might be better to declare a (static) svm_pi_ops
structure and assign is here to d->arch.hvm_domain.pi_ops. And make a
similar change in patch 1 for VMX.

-boris


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 10/10] x86/SVM: Add AMD AVIC key handler
  2016-12-31  5:46 ` [PATCH v2 10/10] x86/SVM: Add AMD AVIC key handler Suravee Suthikulpanit
@ 2017-01-03 16:01   ` Boris Ostrovsky
  2017-01-03 16:04     ` Andrew Cooper
  2017-01-05 16:07   ` Jan Beulich
  1 sibling, 1 reply; 48+ messages in thread
From: Boris Ostrovsky @ 2017-01-03 16:01 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel; +Cc: andrew.cooper3, sherry.hurwitz, jbeulich


>  
> +static void avic_dump(unsigned char ch)
> +{
> +    struct domain *d;
> +    struct vcpu *v;
> +
> +    printk("*********** SVM AVIC Statistics **************\n");
> +
> +    rcu_read_lock(&domlist_read_lock);
> +
> +    for_each_domain ( d )
> +    {
> +        if ( !is_hvm_domain(d) )

|| !svm_avic_vcpu_enabled(d->v[0]) (or something along these lines)?


> +            continue;
> +        printk(">>> Domain %d <<<\n", d->domain_id);
> +        for_each_vcpu ( d, v )
> +        {
> +            printk("\tVCPU %d\n", v->vcpu_id);
> +            printk("\t* incomp_ipi = %u\n",
> +                   v->arch.hvm_svm.cnt_avic_incomp_ipi);
> +            printk("\t* noaccel    = %u\n",
> +                   v->arch.hvm_svm.cnt_avic_noaccel);
> +            printk("\t* post_intr  = %u\n",
> +                   v->arch.hvm_svm.cnt_avic_post_intr);
> +            printk("\t* doorbell   = %u\n",
> +                   v->arch.hvm_svm.cnt_avic_doorbell);
> +        }
> +    }
> +
> +    rcu_read_unlock(&domlist_read_lock);
> +
> +    printk("**************************************\n");
> +
> +}
> +
> +void __init setup_avic_dump(void)
> +{

if ( !svm_avic ) return; ?


-boris


> +    register_keyhandler('j', avic_dump, "dump SVM AVIC", 1);
> +}
> +


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 10/10] x86/SVM: Add AMD AVIC key handler
  2017-01-03 16:01   ` Boris Ostrovsky
@ 2017-01-03 16:04     ` Andrew Cooper
  2017-01-05  8:00       ` Suravee Suthikulpanit
  0 siblings, 1 reply; 48+ messages in thread
From: Andrew Cooper @ 2017-01-03 16:04 UTC (permalink / raw)
  To: Boris Ostrovsky, Suravee Suthikulpanit, xen-devel
  Cc: sherry.hurwitz, jbeulich

On 03/01/17 16:01, Boris Ostrovsky wrote:
>>  
>> +static void avic_dump(unsigned char ch)
>> +{
>> +    struct domain *d;
>> +    struct vcpu *v;
>> +
>> +    printk("*********** SVM AVIC Statistics **************\n");
>> +
>> +    rcu_read_lock(&domlist_read_lock);
>> +
>> +    for_each_domain ( d )
>> +    {
>> +        if ( !is_hvm_domain(d) )
> || !svm_avic_vcpu_enabled(d->v[0]) (or something along these lines)?

It isn't safe to deference the vcpu array like this, which in turn
highlights that the avic predicate should be per-domain, not per-cpu. 
Under no circumstances should we have AVIC on some vcpus but not others
of the same domain.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 05/10] x86/HVM/SVM: Add AVIC initialization code
  2017-01-02 16:37   ` Andrew Cooper
@ 2017-01-04 17:24     ` Suravee Suthikulpanit
  2017-01-04 17:59       ` Andrew Cooper
  2017-01-10  3:06     ` Suravee Suthikulpanit
  1 sibling, 1 reply; 48+ messages in thread
From: Suravee Suthikulpanit @ 2017-01-04 17:24 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: boris.ostrovsky, sherry.hurwitz, jbeulich

Hi Andrew

On 1/2/17 23:37, Andrew Cooper wrote:
> On 31/12/2016 05:45, Suravee Suthikulpanit wrote:
>> [...]
>> +/*
>> + * Note: Current max index allowed for physical APIC ID table is 255.
>> + */
>> +#define AVIC_PHY_APIC_ID_MAX    0xFF
>> +
>> +#define AVIC_DOORBELL           0xc001011b
>> +
>> +#define AVIC_HPA_SHIFT  12
>> +#define AVIC_HPA_MASK           (((1ULL << 40) - 1) << AVIC_HPA_SHIFT)
>
> What does this mask represent?  I have spotted a truncation bug below,
> but how to fix the issue depends on the meaning of the mask.
>
> The only limits I can see in the spec is that the host physical
> addresses must be present in system RAM, which presumably means they
> should follow maxphysaddr like everything else?

Please see below.

>> [...]
>> +/*
>> + * Note:
>> + * Currently, svm-avic mode is not supported with nested virtualization.
>> + * Therefore, it is not yet currently enabled by default. Once the support
>> + * is in-place, this should be enabled by default.
>> + */
>> +bool svm_avic = 0;
>> +boolean_param("svm-avic", svm_avic);
>
> We are trying to avoid the proliferation of top-level options.  Please
> could you introduce a "svm=" option which takes avic as a sub-boolean,
> similar to how iommu= currently works?

Sure, I will introduce the custom_param("svm", parse_svm_param).

>> [...]
 >> +{
>> +    struct avic_phy_apic_id_ent *avic_phy_apic_id_table;
>> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
>> +
>> +    if ( !d->avic_phy_apic_id_table_mfn )
>> +        return NULL;
>> +
>> +    /*
>> +    * Note: APIC ID = 0xff is used for broadcast.
>> +    *       APIC ID > 0xff is reserved.
>> +    */
>> +    if ( index >= 0xff )
>> +        return NULL;
>> +
>> +    avic_phy_apic_id_table = mfn_to_virt(d->avic_phy_apic_id_table_mfn);
>
> This is unsafe and will break on hosts with more than 5TB of RAM.
>
> As you allocate avic_phy_apic_id_table_mfn with alloc_domheap_page(),
> you must use {map,unmap}_domain_page() to get temporary mappings (which
> will use mfn_to_virt() in the common case), or use
> {map,unmap}_domain_page_global() to get permanent mappings.
>
> As the values in here need modifying across a schedule, I would suggest
> making a global mapping at create time, and storing the result in a
> properly-typed pointer.

Thanks for pointing this out. I'll update both logical and physical APIC 
ID table to use the {map,unmap}_domain_page_global() instead.


>> [...]
>> +
>> +    if ( !svm_avic )
>> +        return 0;
>
> || !has_vlapic(d)
>
> HVMLite domains may legitimately be configured without an LAPIC at all.
>
Hm... That means I would also need to check this in 
svm_avic_dom_destroy() and svm_avic_init_vmcb().

>> [...]
>> +
>> +static inline u32 *
>> +avic_get_bk_page_entry(const struct vcpu *v, u32 offset)
>> +{
>> +    const struct vlapic *vlapic = vcpu_vlapic(v);
>> +    char *tmp;
>> +
>> +    if ( !vlapic || !vlapic->regs_page )
>> +        return NULL;
>> +
>> +    tmp = (char *)page_to_virt(vlapic->regs_page);
>> +    return (u32 *)(tmp + offset);
>
> This is also unsafe over 5TB.  vlapic->regs is already a global mapping
> which you should use.

Good point.
>
> Also, you should leave a comment by the allocation of regs_page that
> AVIC depends on it being a full page allocation.

Ok, I will add a comment in arch/x86/hvm/vlapic.c: vlapic_init().

>> +}
>> +
>> +void svm_avic_update_vapic_bar(const struct vcpu *v, uint64_t data)
>> +{
>> +    const struct arch_svm_struct *s = &v->arch.hvm_svm;
>> +
>> +    s->vmcb->avic_vapic_bar = data & AVIC_VAPIC_BAR_MASK;
>> +    s->vmcb->cleanbits.fields.avic = 0;
>
> While I can appreciate what you are trying to do here, altering the
> physical address in IA32_APICBASE has never functioned properly under
> Xen, as nothing updates the P2M.  Worse, with multi-vcpu guests, the use
> of a single shared p2m makes this impossible to do properly.

Ahh.. Good to know

> I know it is non-architectural behaviour, but IMO vlapic_msr_set()
> should be updated to raise #GP[0] when attempting to change the frame,
> to make it fail obviously rather than having interrupts go missing.
>
> Also I see that the Intel side (via vmx_vlapic_msr_changed()) makes the
> same mistake.

Okay, it seems that the hvm_msr_write_intercept() has already handled 
this isn't it? So, I should be able to pretty much removing the handling 
for the MSR here.

>> +}
>> +
>> +int svm_avic_init_vmcb(struct vcpu *v)
>> +{
>> +    paddr_t ma;
>> +    u32 *apic_id_reg;
>> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
>> +    struct vmcb_struct *vmcb = s->vmcb;
>> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
>> +    const struct vlapic *vlapic = vcpu_vlapic(v);
>> +    struct avic_phy_apic_id_ent entry;
>> +
>> +    if ( !svm_avic )
>> +        return 0;
>> +
>> +    if ( !vlapic || !vlapic->regs_page )
>> +        return -EINVAL;
>
> Somewhere you need a bounds check against the APIC ID being within AVIC
> limits.

We are checking the bound in the avic_get_phy_apic_id_ent().

>
>> +    if ( !s->avic_last_phy_id )
>> +        return -EINVAL;
>
> Why does a pointer into the physical ID page need storing in
> arch_svm_struct?
>

This was so that we can quickly access the physical APIC ID entry to 
update them w/o having to call avic_get_phy_apic_id_entry().

>> +
>> +    vmcb->avic_bk_pg_pa = page_to_maddr(vlapic->regs_page) & AVIC_HPA_MASK;
>
> This use of AVIC_HPA_MASK may truncate the the address, which is
> definitely a problem.

I'm not quite sure that I got the truncation that you pointed out. Could 
you please elaborate?

> If AVIC_HPA_MASK isn't just related to maxphysaddr, then you need to
> pass appropriate memflags into the alloc_domheap_page() calls to get a
> suitable page.
>

If by "maxphysaddr" is the 52-bit physical address limit for the PAE 
mode, then I think that's related.

>> +
>> +    entry = *(s->avic_last_phy_id);
>> +    smp_rmb();
>> +    entry.bk_pg_ptr = (vmcb->avic_bk_pg_pa & AVIC_HPA_MASK) >> AVIC_HPA_SHIFT;
>> +    entry.is_running = 0;
>> +    entry.valid = 1;
>> +    *(s->avic_last_phy_id) = entry;
>> +    smp_wmb();
>
> During domain creation, no guests are running, so you can edit this cleanly.
 >
> What values are actually being excluded here?  This, and other patches,
> look like you are actually just trying to do a read_atomic(), modify,
> write_atomic() update, rather than actually requiring ordering.
>

Besides the read_atomic(), modify write_atomic() to update the entry. I 
also want to make sure that the compiler won't shuffle the order around, 
which I thought can be achieved via smp_rmb() and smp_wmb().

>> [...]
>> @@ -1799,6 +1802,10 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
>>
>>      switch ( msr )
>>      {
>> +    case MSR_IA32_APICBASE:
>> +        if ( svm_avic_vcpu_enabled(v) )
>> +            svm_avic_update_vapic_bar(v, msr_content);
>> +        break;
>
> I think this is dead code.  MSR_IA32_APICBASE is handled in
> hvm_msr_write_intercept(), and won't fall into the vendor hook.  If you
> were to continue down this route, I would add another pi_ops hook, and
> replace the VMX layering violation in vlapic_msr_set().
>

I see now. I will remove the code added here.


Thanks,
Suravee

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 05/10] x86/HVM/SVM: Add AVIC initialization code
  2017-01-04 17:24     ` Suravee Suthikulpanit
@ 2017-01-04 17:59       ` Andrew Cooper
  0 siblings, 0 replies; 48+ messages in thread
From: Andrew Cooper @ 2017-01-04 17:59 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel
  Cc: boris.ostrovsky, jbeulich, sherry.hurwitz

On 04/01/17 17:24, Suravee Suthikulpanit wrote:
> On 1/2/17 23:37, Andrew Cooper wrote:
>>> +
>>> +    vmcb->avic_bk_pg_pa = page_to_maddr(vlapic->regs_page) &
>>> AVIC_HPA_MASK;
>>
>> This use of AVIC_HPA_MASK may truncate the the address, which is
>> definitely a problem.
>
> I'm not quite sure that I got the truncation that you pointed out.
> Could you please elaborate?

My apologies.  I hadn't considered the PAGE_SHIFT in the definition, and
had come to the conclusion you were chopping the physical address off at
2^40.

>
>> If AVIC_HPA_MASK isn't just related to maxphysaddr, then you need to
>> pass appropriate memflags into the alloc_domheap_page() calls to get a
>> suitable page.
>>
>
> If by "maxphysaddr" is the 52-bit physical address limit for the PAE
> mode, then I think that's related.

You can safely rely on page_to_maddr() giving you a sensible value
without further masking.  By having a struct_page in the first place, it
is a known good frame.

>
>>> +
>>> +    entry = *(s->avic_last_phy_id);
>>> +    smp_rmb();
>>> +    entry.bk_pg_ptr = (vmcb->avic_bk_pg_pa & AVIC_HPA_MASK) >>
>>> AVIC_HPA_SHIFT;
>>> +    entry.is_running = 0;
>>> +    entry.valid = 1;
>>> +    *(s->avic_last_phy_id) = entry;
>>> +    smp_wmb();
>>
>> During domain creation, no guests are running, so you can edit this
>> cleanly.
> >
>> What values are actually being excluded here?  This, and other patches,
>> look like you are actually just trying to do a read_atomic(), modify,
>> write_atomic() update, rather than actually requiring ordering.
>>
>
> Besides the read_atomic(), modify write_atomic() to update the entry.
> I also want to make sure that the compiler won't shuffle the order
> around, which I thought can be achieved via smp_rmb() and smp_wmb().

Shuffle which order? Your smp_rmb() is between two writes, and the
smp_wmb() isn't paired with anything.

I think using read_atomic() and write_atomic() will DTRT for you; All
you appear to need is a guarantee that the code won't read/update the
live table using multiple accesses.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops
  2016-12-31  5:45 ` [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops Suravee Suthikulpanit
@ 2017-01-05  2:54   ` Tian, Kevin
  2017-01-05  7:57     ` Jan Beulich
  2017-01-05 15:51   ` Jan Beulich
  1 sibling, 1 reply; 48+ messages in thread
From: Tian, Kevin @ 2017-01-05  2:54 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel
  Cc: jbeulich, andrew.cooper3, Nakajima, Jun, sherry.hurwitz, boris.ostrovsky

> From: Suravee Suthikulpanit [mailto:suravee.suthikulpanit@amd.com]
> Sent: Saturday, December 31, 2016 1:46 PM
> 
> The current function pointers for managing hvm posted interrupt
> can be used also by SVM AVIC. Therefore, this patch introduces the
> struct hvm_pi_ops in the struct hvm_domain to hold them.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> ---
>  xen/arch/x86/hvm/vmx/vmx.c         | 32 +++++++++----------
>  xen/include/asm-x86/hvm/domain.h   | 63
> ++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/hvm/hvm.h      |  4 +--
>  xen/include/asm-x86/hvm/vmx/vmcs.h | 59 -----------------------------------
>  4 files changed, 81 insertions(+), 77 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 7b2c50c..3f6d888 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -204,12 +204,12 @@ void vmx_pi_hooks_assign(struct domain *d)
>      if ( !iommu_intpost || !has_hvm_container_domain(d) )
>          return;
> 
> -    ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> +    ASSERT(!d->arch.hvm_domain.pi_ops.vcpu_block);
> 
> -    d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> -    d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
> -    d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
> -    d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> +    d->arch.hvm_domain.pi_ops.vcpu_block = vmx_vcpu_block;
> +    d->arch.hvm_domain.pi_ops.pi_switch_from = vmx_pi_switch_from;
> +    d->arch.hvm_domain.pi_ops.pi_switch_to = vmx_pi_switch_to;
> +    d->arch.hvm_domain.pi_ops.pi_do_resume = vmx_pi_do_resume;
>  }
> 

what about removing pi_ prefix from callbacks given that they are
all under pi_ops?

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 03/10] x86/HVM: Call vlapic_destroy after vcpu_destroy
  2016-12-31  5:45 ` [PATCH v2 03/10] x86/HVM: Call vlapic_destroy after vcpu_destroy Suravee Suthikulpanit
@ 2017-01-05  2:56   ` Tian, Kevin
  2017-01-05 15:56   ` Jan Beulich
  1 sibling, 0 replies; 48+ messages in thread
From: Tian, Kevin @ 2017-01-05  2:56 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel
  Cc: jbeulich, andrew.cooper3, Nakajima, Jun, sherry.hurwitz, boris.ostrovsky

> From: Suravee Suthikulpanit [mailto:suravee.suthikulpanit@amd.com]
> Sent: Saturday, December 31, 2016 1:46 PM
> 
> Since vlapic_init() is called before vcpu_initialise().
> We should call the destroy functions in the the reverse order here.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> ---
>  xen/arch/x86/hvm/hvm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

Reviewed-by: Kevin Tian <kevin.tian@intel.com>. btw since it's a general
fix, maybe you can send it separately.

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 06/10] x86/SVM: Add AVIC vmexit handlers
  2017-01-02 17:28   ` Andrew Cooper
@ 2017-01-05  4:07     ` Suravee Suthikulpanit
  0 siblings, 0 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2017-01-05  4:07 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: boris.ostrovsky, sherry.hurwitz, jbeulich

Hi Andrew,

On 1/3/17 00:28, Andrew Cooper wrote:
> On 31/12/2016 05:45, Suravee Suthikulpanit wrote:
>> [...]
>> +    case AVIC_INCMP_IPI_ERR_INV_TARGET:
>> +        dprintk(XENLOG_ERR,
>> +                "SVM: %s: Invalid IPI target (icr=%#08x:%08x, idx=%u)\n",
>> +                __func__, icrh, icrl, index);
>> +        break;
>
> Shouldn't this case be emulated to raise appropriate APIC errors in the
> guests view?
>

Actually, I think we would need to domain_crash().

>> [...]
>> +static int avic_handle_dfr_update(struct vcpu *v)
>> +{
>> +    u32 mod;
>> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
>> +    u32 *dfr = avic_get_bk_page_entry(v, APIC_DFR);
>> +
>> +    if ( !dfr )
>> +        return -EINVAL;
>> +
>> +    mod = (*dfr >> 28) & 0xFu;
>> +
>> +    spin_lock(&d->avic_ldr_mode_lock);
>> +    if ( d->avic_ldr_mode != mod )
>> +    {
>> +        /*
>> +         * We assume that all local APICs are using the same type.
>> +         * If LDR mode changes, we need to flush the domain AVIC logical

My apology.... s/LDR mode/DFR mode/

>> +         * APIC id table.
>> +         */
>
> The logical APIC ID table is per-domain, yet LDR mode is per-vcpu.  How
> can these coexist if we flush the table like this?  How would a
> multi-vcpu domain actually change its mode without this logic in Xen
> corrupting the table?

My apology.... s/avic_ldr_mode/avic_dfr_mode/

The per-domain avic_dfr_mode is used to store the value of DFR. The idea 
here is that if the DFR is changed, we need to flush the logical APIC ID 
table and updated the avic_dfr_mode. In multi-vcpu domain, this will be 
done when the first vcpu updates its DFR register. Flushing the table 
will automatically invalidate the entries in the table, which should 
cause a #VMEXIT.

>> [...]
>> +        /*
>> +         * Handling AVIC Fault (intercept before the access).
>> +         */
>> +        if ( !rw )
>> +        {
>> +            u32 *entry = avic_get_bk_page_entry(curr, offset);
>> +
>> +            if ( !entry )
>> +                return;
>> +
>> +            *entry = vlapic_read_aligned(vcpu_vlapic(curr), offset);
>> +        }

This part is not needed. I will remove it.

>> +        hvm_emulate_one_vm_event(EMUL_KIND_NORMAL, TRAP_invalid_op,
>> +                                 HVM_DELIVER_NO_ERROR_CODE);
>
> What is this doing here?  I don't see any connection.
>
> ~Andrew
>
We need to emulate the instruction for the #vmexit do noaccel fault case.

Thanks,
Suravee

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 06/10] x86/SVM: Add AVIC vmexit handlers
  2017-01-03 15:34   ` Boris Ostrovsky
@ 2017-01-05  6:41     ` Suravee Suthikulpanit
  0 siblings, 0 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2017-01-05  6:41 UTC (permalink / raw)
  To: Boris Ostrovsky, xen-devel; +Cc: andrew.cooper3, sherry.hurwitz, jbeulich

Hi Boris,

On 1/3/17 22:34, Boris Ostrovsky wrote:
>> +static int avic_handle_dfr_update(struct vcpu *v)
>> +{
>> +    u32 mod;
>> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
>> +    u32 *dfr = avic_get_bk_page_entry(v, APIC_DFR);
>> +
>> +    if ( !dfr )
>> +        return -EINVAL;
>> +
>> +    mod = (*dfr >> 28) & 0xFu;
>> +
>> +    spin_lock(&d->avic_ldr_mode_lock);
>> +    if ( d->avic_ldr_mode != mod )
>> +    {
>> +        /*
>> +         * We assume that all local APICs are using the same type.
>> +         * If LDR mode changes, we need to flush the domain AVIC logical
>> +         * APIC id table.
>> +         */
>> +        clear_domain_page(_mfn(d->avic_log_apic_id_table_mfn));
>> +        smp_wmb();
> Is this needed? I think clear_page() guarantees visibility/ordering (we
> have SFENCE in clear_page_sse2() for this reason).
>
> -boris
>

Ah... Okay. Thanks for pointing out.

Suravee

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops
  2017-01-05  2:54   ` Tian, Kevin
@ 2017-01-05  7:57     ` Jan Beulich
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Beulich @ 2017-01-05  7:57 UTC (permalink / raw)
  To: Suravee Suthikulpanit, Kevin Tian
  Cc: andrew.cooper3, xen-devel, Jun Nakajima, sherry.hurwitz, boris.ostrovsky

>>> On 05.01.17 at 03:54, <kevin.tian@intel.com> wrote:
>>  From: Suravee Suthikulpanit [mailto:suravee.suthikulpanit@amd.com]
>> Sent: Saturday, December 31, 2016 1:46 PM
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -204,12 +204,12 @@ void vmx_pi_hooks_assign(struct domain *d)
>>      if ( !iommu_intpost || !has_hvm_container_domain(d) )
>>          return;
>> 
>> -    ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>> +    ASSERT(!d->arch.hvm_domain.pi_ops.vcpu_block);
>> 
>> -    d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>> -    d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
>> -    d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
>> -    d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
>> +    d->arch.hvm_domain.pi_ops.vcpu_block = vmx_vcpu_block;
>> +    d->arch.hvm_domain.pi_ops.pi_switch_from = vmx_pi_switch_from;
>> +    d->arch.hvm_domain.pi_ops.pi_switch_to = vmx_pi_switch_to;
>> +    d->arch.hvm_domain.pi_ops.pi_do_resume = vmx_pi_do_resume;
>>  }
>> 
> 
> what about removing pi_ prefix from callbacks given that they are
> all under pi_ops?

+1

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 10/10] x86/SVM: Add AMD AVIC key handler
  2017-01-03 16:04     ` Andrew Cooper
@ 2017-01-05  8:00       ` Suravee Suthikulpanit
  0 siblings, 0 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2017-01-05  8:00 UTC (permalink / raw)
  To: Andrew Cooper, Boris Ostrovsky, xen-devel; +Cc: sherry.hurwitz, jbeulich

Hi Boris/Andrew,

On 1/3/17 23:04, Andrew Cooper wrote:
> On 03/01/17 16:01, Boris Ostrovsky wrote:
>>>
>>> +static void avic_dump(unsigned char ch)
>>> +{
>>> +    struct domain *d;
>>> +    struct vcpu *v;
>>> +
>>> +    printk("*********** SVM AVIC Statistics **************\n");
>>> +
>>> +    rcu_read_lock(&domlist_read_lock);
>>> +
>>> +    for_each_domain ( d )
>>> +    {
>>> +        if ( !is_hvm_domain(d) )
>> || !svm_avic_vcpu_enabled(d->v[0]) (or something along these lines)?
>
> It isn't safe to deference the vcpu array like this, which in turn
> highlights that the avic predicate should be per-domain, not per-cpu.
> Under no circumstances should we have AVIC on some vcpus but not others
> of the same domain.
>
> ~Andrew
>

Let me add something like:

     static inline bool svm_is_avic_domain(struct domain *d)
     {
         return ( d->arch.hvm_domain.svm.avic_physical_id_table != 0 );
     }

This should allow us to check whether the svm_avic_dom_init() is enabled 
successfully.

Thanks,
Suravee

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops
  2016-12-31  5:45 ` [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops Suravee Suthikulpanit
  2017-01-05  2:54   ` Tian, Kevin
@ 2017-01-05 15:51   ` Jan Beulich
  2017-01-10  6:51     ` Suravee Suthikulpanit
  1 sibling, 1 reply; 48+ messages in thread
From: Jan Beulich @ 2017-01-05 15:51 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: Kevin Tian, andrew.cooper3, xen-devel, Jun Nakajima,
	sherry.hurwitz, boris.ostrovsky

>>> On 31.12.16 at 06:45, <suravee.suthikulpanit@amd.com> wrote:
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -72,6 +72,67 @@ struct hvm_ioreq_server {
>      bool_t                 bufioreq_atomic;
>  };
>  
> +struct hvm_pi_ops {
> +    /*
> +     * To handle posted interrupts correctly, we need to set the following
> +     * state:
> +     *
> +     * * The PI notification vector (NV)
> +     * * The PI notification destination processor (NDST)
> +     * * The PI "suppress notification" bit (SN)
> +     * * The vcpu pi "blocked" list
> +     *
> +     * If a VM is currently running, we want the PI delivered to the guest vcpu
> +     * on the proper pcpu (NDST = v->processor, SN clear).
> +     *
> +     * If the vm is blocked, we want the PI delivered to Xen so that it can
> +     * wake it up  (SN clear, NV = pi_wakeup_vector, vcpu on block list).
> +     *
> +     * If the VM is currently either preempted or offline (i.e., not running
> +     * because of some reason other than blocking waiting for an interrupt),
> +     * there's nothing Xen can do -- we want the interrupt pending bit set in
> +     * the guest, but we don't want to bother Xen with an interrupt (SN clear).
> +     *
> +     * There's a brief window of time between vmx_intr_assist() and checking
> +     * softirqs where if an interrupt comes in it may be lost; so we need Xen
> +     * to get an interrupt and raise a softirq so that it will go through the
> +     * vmx_intr_assist() path again (SN clear, NV = posted_interrupt).
> +     *
> +     * The way we implement this now is by looking at what needs to happen on
> +     * the following runstate transitions:
> +     *
> +     * A: runnable -> running
> +     *  - SN = 0
> +     *  - NDST = v->processor
> +     * B: running -> runnable
> +     *  - SN = 1
> +     * C: running -> blocked
> +     *  - NV = pi_wakeup_vector
> +     *  - Add vcpu to blocked list
> +     * D: blocked -> runnable
> +     *  - NV = posted_intr_vector
> +     *  - Take vcpu off blocked list
> +     *
> +     * For transitions A and B, we add hooks into vmx_ctxt_switch_{from,to}
> +     * paths.
> +     *
> +     * For transition C, we add a new arch hook, arch_vcpu_block(), which is
> +     * called from vcpu_block() and vcpu_do_poll().
> +     *
> +     * For transition D, rather than add an extra arch hook on vcpu_wake, we
> +     * add a hook on the vmentry path which checks to see if either of the two
> +     * actions need to be taken.
> +     *
> +     * These hooks only need to be called when the domain in question actually
> +     * has a physical device assigned to it, so we set and clear the callbacks
> +     * as appropriate when device assignment changes.
> +     */
> +    void (*vcpu_block) (struct vcpu *);
> +    void (*pi_switch_from) (struct vcpu *v);
> +    void (*pi_switch_to) (struct vcpu *v);
> +    void (*pi_do_resume) (struct vcpu *v);
> +};

While the hooks (as said, with the pi_ prefixes dropped) are
certainly fine to move here, the comment is extremely VMX
centric, and hence doesn't fit in this file. It either needs to be
generalized, or it should remain in VMX specific code, perhaps
with a referral to it added here.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 02/10] x86/vLAPIC: Declare vlapic_read_aligned() and vlapic_reg_write() as non-static
  2016-12-31  5:45 ` [PATCH v2 02/10] x86/vLAPIC: Declare vlapic_read_aligned() and vlapic_reg_write() as non-static Suravee Suthikulpanit
@ 2017-01-05 15:53   ` Jan Beulich
  2017-01-10  6:57     ` Suravee Suthikulpanit
  0 siblings, 1 reply; 48+ messages in thread
From: Jan Beulich @ 2017-01-05 15:53 UTC (permalink / raw)
  To: Suravee Suthikulpanit, xen-devel
  Cc: andrew.cooper3, boris.ostrovsky, sherry.hurwitz

>>> On 31.12.16 at 06:45, <suravee.suthikulpanit@amd.com> wrote:
> Expose vlapic_read_aligned and vlapic_reg_write() to be used by AVIC.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Generally I dislike functions being non-static when all their callers
live in the same file. Therefore it would be better (and hardly
harder to review) if they got made non-static at the point of their
first external use. That'll also help understanding whether that's
an appropriate move.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 03/10] x86/HVM: Call vlapic_destroy after vcpu_destroy
  2016-12-31  5:45 ` [PATCH v2 03/10] x86/HVM: Call vlapic_destroy after vcpu_destroy Suravee Suthikulpanit
  2017-01-05  2:56   ` Tian, Kevin
@ 2017-01-05 15:56   ` Jan Beulich
  2017-01-10  8:18     ` Suravee Suthikulpanit
  1 sibling, 1 reply; 48+ messages in thread
From: Jan Beulich @ 2017-01-05 15:56 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: KevinTian, andrew.cooper3, xen-devel, Jun Nakajima,
	sherry.hurwitz, boris.ostrovsky

>>> On 31.12.16 at 06:45, <suravee.suthikulpanit@amd.com> wrote:
> Since vlapic_init() is called before vcpu_initialise().
> We should call the destroy functions in the the reverse order here.

Double "the". And to quote from my RFC reply:

"Also the ordering issue extends to other calls, and I think if at all
 possible we should then do all the teardown in reverse order of
 init."

Is there anything preventing this?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 08/10] x86/SVM: Add interrupt management code via AVIC
  2016-12-31  5:45 ` [PATCH v2 08/10] x86/SVM: Add interrupt management code via AVIC Suravee Suthikulpanit
  2017-01-02 17:45   ` Andrew Cooper
@ 2017-01-05 16:01   ` Jan Beulich
  1 sibling, 0 replies; 48+ messages in thread
From: Jan Beulich @ 2017-01-05 16:01 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: andrew.cooper3, boris.ostrovsky, xen-devel, sherry.hurwitz

>>> On 31.12.16 at 06:45, <suravee.suthikulpanit@amd.com> wrote:
> --- a/xen/arch/x86/hvm/svm/avic.c
> +++ b/xen/arch/x86/hvm/svm/avic.c
> @@ -636,6 +636,34 @@ void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs)
>      return;
>  }
>  
> +void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vec)
> +{
> +    struct vlapic *vlapic = vcpu_vlapic(v);
> +
> +    /* Fallback to use non-AVIC if vcpu is not enabled with AVIC. */
> +    if ( !svm_avic_vcpu_enabled(v) )
> +    {
> +        if ( !vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
> +            vcpu_kick(v);
> +        return;
> +    }
> +
> +    if ( !(guest_cpu_user_regs()->eflags & X86_EFLAGS_IF) )
> +        return;

On top of Andrew's comment, this assumes v == current, which you
neither assert, nor allow the reviewers to verify (as the function has
no caller).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 09/10] x86/SVM: Hook up miscellaneous AVIC functions
  2016-12-31  5:46 ` [PATCH v2 09/10] x86/SVM: Hook up miscellaneous AVIC functions Suravee Suthikulpanit
  2017-01-02 17:49   ` Andrew Cooper
@ 2017-01-05 16:05   ` Jan Beulich
  2017-01-10  8:35     ` Suravee Suthikulpanit
  1 sibling, 1 reply; 48+ messages in thread
From: Jan Beulich @ 2017-01-05 16:05 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: andrew.cooper3, boris.ostrovsky, xen-devel, sherry.hurwitz

>>> On 31.12.16 at 06:46, <suravee.suthikulpanit@amd.com> wrote:
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -1438,6 +1438,11 @@ static int svm_cpu_up(void)
>      return 0;
>  }
>  
> +static inline int svm_avic_enabled(void)

bool?

> @@ -1472,16 +1477,27 @@ const struct hvm_function_table * __init start_svm(void)
>      P(cpu_has_svm_decode, "DecodeAssists");
>      P(cpu_has_pause_filter, "Pause-Intercept Filter");
>      P(cpu_has_tsc_ratio, "TSC Rate MSR");
> -    P(cpu_has_svm_avic, "AVIC");
> -#undef P
> -
> -    if ( !printed )
> -        printk(" - none\n");
>  
>      svm_function_table.hap_supported = !!cpu_has_svm_npt;
>      svm_function_table.hap_capabilities = HVM_HAP_SUPERPAGE_2MB |
>          ((cpuid_edx(0x80000001) & 0x04000000) ? HVM_HAP_SUPERPAGE_1GB : 0);
>  
> +    if ( !cpu_has_svm_avic )
> +        svm_avic = 0;
> +
> +    if ( svm_avic )
> +    {
> +        svm_function_table.deliver_posted_intr  = svm_avic_deliver_posted_intr;
> +        svm_function_table.virtual_intr_delivery_enabled = svm_avic_enabled;
> +        P(cpu_has_svm_avic, "AVIC (enabled)");
> +    }
> +    else
> +        P(cpu_has_svm_avic, "AVIC (disabled)");
> +#undef P
> +
> +    if ( !printed )
> +        printk(" - none\n");

Could I talk you into moving this up a few lines, so that effectively
the last four lines here won't need to move at all?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 10/10] x86/SVM: Add AMD AVIC key handler
  2016-12-31  5:46 ` [PATCH v2 10/10] x86/SVM: Add AMD AVIC key handler Suravee Suthikulpanit
  2017-01-03 16:01   ` Boris Ostrovsky
@ 2017-01-05 16:07   ` Jan Beulich
  2017-01-10 11:14     ` Suravee Suthikulpanit
  1 sibling, 1 reply; 48+ messages in thread
From: Jan Beulich @ 2017-01-05 16:07 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: andrew.cooper3, boris.ostrovsky, xen-devel, sherry.hurwitz

>>> On 31.12.16 at 06:46, <suravee.suthikulpanit@amd.com> wrote:
> +void __init setup_avic_dump(void)
> +{
> +    register_keyhandler('j', avic_dump, "dump SVM AVIC", 1);
> +}

For one the description should include the word "stats". And then
I'm rather uncertain this is work burning one of the few remaining
available keys. Could this be made a domctl instead?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 05/10] x86/HVM/SVM: Add AVIC initialization code
  2017-01-02 16:37   ` Andrew Cooper
  2017-01-04 17:24     ` Suravee Suthikulpanit
@ 2017-01-10  3:06     ` Suravee Suthikulpanit
  1 sibling, 0 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2017-01-10  3:06 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: boris.ostrovsky, sherry.hurwitz, jbeulich

Hi Andrew,

On 1/2/17 23:37, Andrew Cooper wrote:
>> +    ma = d->avic_phy_apic_id_table_mfn;
>> +    vmcb->avic_phy_apic_id = (ma << PAGE_SHIFT) & AVIC_HPA_MASK;
>> +    vmcb->avic_phy_apic_id |= AVIC_PHY_APIC_ID_MAX;
> Surely this should be some calculation derived from d->max_vcpus?

This is actually should be the value of highest physical APIC ID, since 
the documentations mentioned that this is the highest index into the 
physical APIC ID table.

I am checking with the HW folks to try to understand if there are any 
negative effects if we were to always set this to 0xFF.

However, I have also looking into setting/updating this in the following 
scenarios:

1. During svm_avic_init_vmcb(): We should be able to just have a 
for_each_vcpu() loop to iterate through all VMCBs and updating this 
field with the max physical APIC ID.

2. During avic_unaccel_trap_write()->avic_handle_apic_id_update():
This is more tricky since we would need to trap all VCPUs to update this 
field in all VMCBs to have the same value. What mechanism would you 
suggest to achieve this?

Thanks,
Suravee


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops
  2017-01-05 15:51   ` Jan Beulich
@ 2017-01-10  6:51     ` Suravee Suthikulpanit
  2017-01-10  8:24       ` Jan Beulich
  0 siblings, 1 reply; 48+ messages in thread
From: Suravee Suthikulpanit @ 2017-01-10  6:51 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, andrew.cooper3, xen-devel, Jun Nakajima,
	sherry.hurwitz, boris.ostrovsky

Jan,

On 01/05/2017 10:51 PM, Jan Beulich wrote:
>>>> On 31.12.16 at 06:45, <suravee.suthikulpanit@amd.com> wrote:
>> --- a/xen/include/asm-x86/hvm/domain.h
>> +++ b/xen/include/asm-x86/hvm/domain.h
>> @@ -72,6 +72,67 @@ struct hvm_ioreq_server {
>>      bool_t                 bufioreq_atomic;
>>  };
>>
>> +struct hvm_pi_ops {
>> +    /*
>> +     * To handle posted interrupts correctly, we need to set the following
>> +     * state:
>> +     *
>> +     * * The PI notification vector (NV)
>> +     * * The PI notification destination processor (NDST)
>> +     * * The PI "suppress notification" bit (SN)
>> +     * * The vcpu pi "blocked" list
>> +     *
>> +     * If a VM is currently running, we want the PI delivered to the guest vcpu
>> +     * on the proper pcpu (NDST = v->processor, SN clear).
>> +     *
>> +     * If the vm is blocked, we want the PI delivered to Xen so that it can
>> +     * wake it up  (SN clear, NV = pi_wakeup_vector, vcpu on block list).
>> +     *
>> +     * If the VM is currently either preempted or offline (i.e., not running
>> +     * because of some reason other than blocking waiting for an interrupt),
>> +     * there's nothing Xen can do -- we want the interrupt pending bit set in
>> +     * the guest, but we don't want to bother Xen with an interrupt (SN clear).
>> +     *
>> +     * There's a brief window of time between vmx_intr_assist() and checking
>> +     * softirqs where if an interrupt comes in it may be lost; so we need Xen
>> +     * to get an interrupt and raise a softirq so that it will go through the
>> +     * vmx_intr_assist() path again (SN clear, NV = posted_interrupt).
>> +     *
>> +     * The way we implement this now is by looking at what needs to happen on
>> +     * the following runstate transitions:
>> +     *
>> +     * A: runnable -> running
>> +     *  - SN = 0
>> +     *  - NDST = v->processor
>> +     * B: running -> runnable
>> +     *  - SN = 1
>> +     * C: running -> blocked
>> +     *  - NV = pi_wakeup_vector
>> +     *  - Add vcpu to blocked list
>> +     * D: blocked -> runnable
>> +     *  - NV = posted_intr_vector
>> +     *  - Take vcpu off blocked list
>> +     *
>> +     * For transitions A and B, we add hooks into vmx_ctxt_switch_{from,to}
>> +     * paths.
>> +     *
>> +     * For transition C, we add a new arch hook, arch_vcpu_block(), which is
>> +     * called from vcpu_block() and vcpu_do_poll().
>> +     *
>> +     * For transition D, rather than add an extra arch hook on vcpu_wake, we
>> +     * add a hook on the vmentry path which checks to see if either of the two
>> +     * actions need to be taken.
>> +     *
>> +     * These hooks only need to be called when the domain in question actually
>> +     * has a physical device assigned to it, so we set and clear the callbacks
>> +     * as appropriate when device assignment changes.
>> +     */
>> +    void (*vcpu_block) (struct vcpu *);
>> +    void (*pi_switch_from) (struct vcpu *v);
>> +    void (*pi_switch_to) (struct vcpu *v);
>> +    void (*pi_do_resume) (struct vcpu *v);
>> +};
>
> While the hooks (as said, with the pi_ prefixes dropped) are
> certainly fine to move here, the comment is extremely VMX
> centric, and hence doesn't fit in this file. It either needs to be
> generalized, or it should remain in VMX specific code, perhaps
> with a referral to it added here.
>
> Jan
>

I see. I will move the comment into arch/x86/hvm/vmx/vmx.c close to 
where these hooks are implemented.

Thanks,
Suravee

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 02/10] x86/vLAPIC: Declare vlapic_read_aligned() and vlapic_reg_write() as non-static
  2017-01-05 15:53   ` Jan Beulich
@ 2017-01-10  6:57     ` Suravee Suthikulpanit
  2017-01-10  8:25       ` Jan Beulich
  0 siblings, 1 reply; 48+ messages in thread
From: Suravee Suthikulpanit @ 2017-01-10  6:57 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: andrew.cooper3, boris.ostrovsky, sherry.hurwitz



On 01/05/2017 10:53 PM, Jan Beulich wrote:
>>>> On 31.12.16 at 06:45, <suravee.suthikulpanit@amd.com> wrote:
>> Expose vlapic_read_aligned and vlapic_reg_write() to be used by AVIC.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>
> Generally I dislike functions being non-static when all their callers
> live in the same file. Therefore it would be better (and hardly
> harder to review) if they got made non-static at the point of their
> first external use. That'll also help understanding whether that's
> an appropriate move.
>
> Jan
>

IIUC, you want these changes to be in the same patch of the one making 
use of them externally. I can certainly combine this patch with patch 6/10.

Thanks,
S

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 03/10] x86/HVM: Call vlapic_destroy after vcpu_destroy
  2017-01-05 15:56   ` Jan Beulich
@ 2017-01-10  8:18     ` Suravee Suthikulpanit
  0 siblings, 0 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2017-01-10  8:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: KevinTian, andrew.cooper3, xen-devel, Jun Nakajima,
	sherry.hurwitz, boris.ostrovsky



On 01/05/2017 10:56 PM, Jan Beulich wrote:
>>>> On 31.12.16 at 06:45, <suravee.suthikulpanit@amd.com> wrote:
>> Since vlapic_init() is called before vcpu_initialise().
>> We should call the destroy functions in the the reverse order here.
>
> Double "the". And to quote from my RFC reply:
>
> "Also the ordering issue extends to other calls, and I think if at all
>  possible we should then do all the teardown in reverse order of
>  init."
>
> Is there anything preventing this?
>
> Jan
>

Ah, sorry. I didn't get what you tried to say earlier. I see now that 
you want to change the teardown order in hvm_vcpu_destroy() to follow 
the teardown order (i.e. the "failX:") in hvm_vcpu_initialize().

Thanks,
Suravee

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops
  2017-01-10  6:51     ` Suravee Suthikulpanit
@ 2017-01-10  8:24       ` Jan Beulich
  2017-01-10  9:45         ` Suravee Suthikulpanit
  0 siblings, 1 reply; 48+ messages in thread
From: Jan Beulich @ 2017-01-10  8:24 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: Kevin Tian, andrew.cooper3, xen-devel, Jun Nakajima,
	sherry.hurwitz, boris.ostrovsky

>>> On 10.01.17 at 07:51, <Suravee.Suthikulpanit@amd.com> wrote:
> On 01/05/2017 10:51 PM, Jan Beulich wrote:
>>>>> On 31.12.16 at 06:45, <suravee.suthikulpanit@amd.com> wrote:
>>> --- a/xen/include/asm-x86/hvm/domain.h
>>> +++ b/xen/include/asm-x86/hvm/domain.h
>>> @@ -72,6 +72,67 @@ struct hvm_ioreq_server {
>>>      bool_t                 bufioreq_atomic;
>>>  };
>>>
>>> +struct hvm_pi_ops {
>>> +    /*
>>> +     * To handle posted interrupts correctly, we need to set the following
>>> +     * state:
>>> +     *
>>> +     * * The PI notification vector (NV)
>>> +     * * The PI notification destination processor (NDST)
>>> +     * * The PI "suppress notification" bit (SN)
>>> +     * * The vcpu pi "blocked" list
>>> +     *
>>> +     * If a VM is currently running, we want the PI delivered to the guest vcpu
>>> +     * on the proper pcpu (NDST = v->processor, SN clear).
>>> +     *
>>> +     * If the vm is blocked, we want the PI delivered to Xen so that it can
>>> +     * wake it up  (SN clear, NV = pi_wakeup_vector, vcpu on block list).
>>> +     *
>>> +     * If the VM is currently either preempted or offline (i.e., not running
>>> +     * because of some reason other than blocking waiting for an interrupt),
>>> +     * there's nothing Xen can do -- we want the interrupt pending bit set in
>>> +     * the guest, but we don't want to bother Xen with an interrupt (SN clear).
>>> +     *
>>> +     * There's a brief window of time between vmx_intr_assist() and checking
>>> +     * softirqs where if an interrupt comes in it may be lost; so we need Xen
>>> +     * to get an interrupt and raise a softirq so that it will go through the
>>> +     * vmx_intr_assist() path again (SN clear, NV = posted_interrupt).
>>> +     *
>>> +     * The way we implement this now is by looking at what needs to happen on
>>> +     * the following runstate transitions:
>>> +     *
>>> +     * A: runnable -> running
>>> +     *  - SN = 0
>>> +     *  - NDST = v->processor
>>> +     * B: running -> runnable
>>> +     *  - SN = 1
>>> +     * C: running -> blocked
>>> +     *  - NV = pi_wakeup_vector
>>> +     *  - Add vcpu to blocked list
>>> +     * D: blocked -> runnable
>>> +     *  - NV = posted_intr_vector
>>> +     *  - Take vcpu off blocked list
>>> +     *
>>> +     * For transitions A and B, we add hooks into vmx_ctxt_switch_{from,to}
>>> +     * paths.
>>> +     *
>>> +     * For transition C, we add a new arch hook, arch_vcpu_block(), which is
>>> +     * called from vcpu_block() and vcpu_do_poll().
>>> +     *
>>> +     * For transition D, rather than add an extra arch hook on vcpu_wake, we
>>> +     * add a hook on the vmentry path which checks to see if either of the two
>>> +     * actions need to be taken.
>>> +     *
>>> +     * These hooks only need to be called when the domain in question actually
>>> +     * has a physical device assigned to it, so we set and clear the callbacks
>>> +     * as appropriate when device assignment changes.
>>> +     */
>>> +    void (*vcpu_block) (struct vcpu *);
>>> +    void (*pi_switch_from) (struct vcpu *v);
>>> +    void (*pi_switch_to) (struct vcpu *v);
>>> +    void (*pi_do_resume) (struct vcpu *v);
>>> +};
>>
>> While the hooks (as said, with the pi_ prefixes dropped) are
>> certainly fine to move here, the comment is extremely VMX
>> centric, and hence doesn't fit in this file. It either needs to be
>> generalized, or it should remain in VMX specific code, perhaps
>> with a referral to it added here.
> 
> I see. I will move the comment into arch/x86/hvm/vmx/vmx.c close to 
> where these hooks are implemented.

So you see no way of generalizing it? (I confess I didn't look closely
enough yet at the [dis]similarities between VMX/VT-d PI and AVIC
to be able to easily tell myself.)

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 02/10] x86/vLAPIC: Declare vlapic_read_aligned() and vlapic_reg_write() as non-static
  2017-01-10  6:57     ` Suravee Suthikulpanit
@ 2017-01-10  8:25       ` Jan Beulich
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Beulich @ 2017-01-10  8:25 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: andrew.cooper3, boris.ostrovsky, xen-devel, sherry.hurwitz

>>> On 10.01.17 at 07:57, <Suravee.Suthikulpanit@amd.com> wrote:
> On 01/05/2017 10:53 PM, Jan Beulich wrote:
>>>>> On 31.12.16 at 06:45, <suravee.suthikulpanit@amd.com> wrote:
>>> Expose vlapic_read_aligned and vlapic_reg_write() to be used by AVIC.
>>>
>>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>
>> Generally I dislike functions being non-static when all their callers
>> live in the same file. Therefore it would be better (and hardly
>> harder to review) if they got made non-static at the point of their
>> first external use. That'll also help understanding whether that's
>> an appropriate move.
> 
> IIUC, you want these changes to be in the same patch of the one making 
> use of them externally.

Yes.

> I can certainly combine this patch with patch 6/10.

Thanks.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 09/10] x86/SVM: Hook up miscellaneous AVIC functions
  2017-01-05 16:05   ` Jan Beulich
@ 2017-01-10  8:35     ` Suravee Suthikulpanit
  2017-01-10  9:00       ` Jan Beulich
  0 siblings, 1 reply; 48+ messages in thread
From: Suravee Suthikulpanit @ 2017-01-10  8:35 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3, boris.ostrovsky, xen-devel, sherry.hurwitz

Jan,

On 01/05/2017 11:05 PM, Jan Beulich wrote:
>>>> On 31.12.16 at 06:46, <suravee.suthikulpanit@amd.com> wrote:
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -1438,6 +1438,11 @@ static int svm_cpu_up(void)
>>      return 0;
>>  }
>>
>> +static inline int svm_avic_enabled(void)
>
> bool?

Actually, I declared this as int because the 
hvm_function_table.virtual_intr_delivery_enabled() is returning int.

>
>> @@ -1472,16 +1477,27 @@ const struct hvm_function_table * __init start_svm(void)
>>      P(cpu_has_svm_decode, "DecodeAssists");
>>      P(cpu_has_pause_filter, "Pause-Intercept Filter");
>>      P(cpu_has_tsc_ratio, "TSC Rate MSR");
>> -    P(cpu_has_svm_avic, "AVIC");
>> -#undef P
>> -
>> -    if ( !printed )
>> -        printk(" - none\n");
>>
>>      svm_function_table.hap_supported = !!cpu_has_svm_npt;
>>      svm_function_table.hap_capabilities = HVM_HAP_SUPERPAGE_2MB |
>>          ((cpuid_edx(0x80000001) & 0x04000000) ? HVM_HAP_SUPERPAGE_1GB : 0);
>>
>> +    if ( !cpu_has_svm_avic )
>> +        svm_avic = 0;
>> +
>> +    if ( svm_avic )
>> +    {
>> +        svm_function_table.deliver_posted_intr  = svm_avic_deliver_posted_intr;
>> +        svm_function_table.virtual_intr_delivery_enabled = svm_avic_enabled;
>> +        P(cpu_has_svm_avic, "AVIC (enabled)");
>> +    }
>> +    else
>> +        P(cpu_has_svm_avic, "AVIC (disabled)");
>> +#undef P
>> +
>> +    if ( !printed )
>> +        printk(" - none\n");
>
> Could I talk you into moving this up a few lines, so that effectively
> the last four lines here won't need to move at all?
>
> Jan
>

Sure, good point.

Thanks,
Suravee

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 09/10] x86/SVM: Hook up miscellaneous AVIC functions
  2017-01-10  8:35     ` Suravee Suthikulpanit
@ 2017-01-10  9:00       ` Jan Beulich
  2017-01-10 10:28         ` Suravee Suthikulpanit
  0 siblings, 1 reply; 48+ messages in thread
From: Jan Beulich @ 2017-01-10  9:00 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: andrew.cooper3, boris.ostrovsky, xen-devel, sherry.hurwitz

>>> On 10.01.17 at 09:35, <Suravee.Suthikulpanit@amd.com> wrote:
> On 01/05/2017 11:05 PM, Jan Beulich wrote:
>>>>> On 31.12.16 at 06:46, <suravee.suthikulpanit@amd.com> wrote:
>>> --- a/xen/arch/x86/hvm/svm/svm.c
>>> +++ b/xen/arch/x86/hvm/svm/svm.c
>>> @@ -1438,6 +1438,11 @@ static int svm_cpu_up(void)
>>>      return 0;
>>>  }
>>>
>>> +static inline int svm_avic_enabled(void)
>>
>> bool?
> 
> Actually, I declared this as int because the 
> hvm_function_table.virtual_intr_delivery_enabled() is returning int.

Oh, that's used as a hook function. That's pretty un-obvious for a
function declared inline. Of course in that case you need to match
the hook function type, even if that ought to have return type bool.
Even farther - I question the need for a function here in the first
place, as both VMX and now SVM AVIC return a static value. This
could therefore be a bool flag alongside the various others we
already have near the beginning of the structure, if you're up to
such a change. If you'd rather stick with what's there now, we can
always put together a cleanup patch later on.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops
  2017-01-10  8:24       ` Jan Beulich
@ 2017-01-10  9:45         ` Suravee Suthikulpanit
  0 siblings, 0 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2017-01-10  9:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, andrew.cooper3, xen-devel, Jun Nakajima,
	sherry.hurwitz, boris.ostrovsky



On 01/10/2017 03:24 PM, Jan Beulich wrote:
>>>> On 10.01.17 at 07:51, <Suravee.Suthikulpanit@amd.com> wrote:
>> On 01/05/2017 10:51 PM, Jan Beulich wrote:
>>>>>> On 31.12.16 at 06:45, <suravee.suthikulpanit@amd.com> wrote:
>>>> --- a/xen/include/asm-x86/hvm/domain.h
>>>> +++ b/xen/include/asm-x86/hvm/domain.h
>>>> @@ -72,6 +72,67 @@ struct hvm_ioreq_server {
>>>>      bool_t                 bufioreq_atomic;
>>>>  };
>>>>
>>>> +struct hvm_pi_ops {
>>>> +    /*
>>>> +     * To handle posted interrupts correctly, we need to set the following
>>>> +     * state:
>>>> +     *
>>>> +     * * The PI notification vector (NV)
>>>> +     * * The PI notification destination processor (NDST)
>>>> +     * * The PI "suppress notification" bit (SN)
>>>> +     * * The vcpu pi "blocked" list
>>>> +     *
>>>> +     * If a VM is currently running, we want the PI delivered to the guest vcpu
>>>> +     * on the proper pcpu (NDST = v->processor, SN clear).
>>>> +     *
>>>> +     * If the vm is blocked, we want the PI delivered to Xen so that it can
>>>> +     * wake it up  (SN clear, NV = pi_wakeup_vector, vcpu on block list).
>>>> +     *
>>>> +     * If the VM is currently either preempted or offline (i.e., not running
>>>> +     * because of some reason other than blocking waiting for an interrupt),
>>>> +     * there's nothing Xen can do -- we want the interrupt pending bit set in
>>>> [....]
>>>> +     *
>>>> +     * For transition C, we add a new arch hook, arch_vcpu_block(), which is
>>>> +     * called from vcpu_block() and vcpu_do_poll().
>>>> +     *
>>>> +     * For transition D, rather than add an extra arch hook on vcpu_wake, we
>>>> +     * add a hook on the vmentry path which checks to see if either of the two
>>>> +     * actions need to be taken.
>>>> +     *
>>>> +     * These hooks only need to be called when the domain in question actually
>>>> +     * has a physical device assigned to it, so we set and clear the callbacks
>>>> +     * as appropriate when device assignment changes.
>>>> +     */
>>>> +    void (*vcpu_block) (struct vcpu *);
>>>> +    void (*pi_switch_from) (struct vcpu *v);
>>>> +    void (*pi_switch_to) (struct vcpu *v);
>>>> +    void (*pi_do_resume) (struct vcpu *v);
>>>> +};
>>>
>>> While the hooks (as said, with the pi_ prefixes dropped) are
>>> certainly fine to move here, the comment is extremely VMX
>>> centric, and hence doesn't fit in this file. It either needs to be
>>> generalized, or it should remain in VMX specific code, perhaps
>>> with a referral to it added here.
>>
>> I see. I will move the comment into arch/x86/hvm/vmx/vmx.c close to
>> where these hooks are implemented.
>
> So you see no way of generalizing it? (I confess I didn't look closely
> enough yet at the [dis]similarities between VMX/VT-d PI and AVIC
> to be able to easily tell myself.)

I would need to look at VMX/VT-d PI also. But my impression is it's 
quite different.

S

> Jan
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 09/10] x86/SVM: Hook up miscellaneous AVIC functions
  2017-01-10  9:00       ` Jan Beulich
@ 2017-01-10 10:28         ` Suravee Suthikulpanit
  0 siblings, 0 replies; 48+ messages in thread
From: Suravee Suthikulpanit @ 2017-01-10 10:28 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3, boris.ostrovsky, xen-devel, sherry.hurwitz



On 01/10/2017 04:00 PM, Jan Beulich wrote:
>>>> On 10.01.17 at 09:35, <Suravee.Suthikulpanit@amd.com> wrote:
>> On 01/05/2017 11:05 PM, Jan Beulich wrote:
>>>>>> On 31.12.16 at 06:46, <suravee.suthikulpanit@amd.com> wrote:
>>>> --- a/xen/arch/x86/hvm/svm/svm.c
>>>> +++ b/xen/arch/x86/hvm/svm/svm.c
>>>> @@ -1438,6 +1438,11 @@ static int svm_cpu_up(void)
>>>>      return 0;
>>>>  }
>>>>
>>>> +static inline int svm_avic_enabled(void)
>>>
>>> bool?
>>
>> Actually, I declared this as int because the
>> hvm_function_table.virtual_intr_delivery_enabled() is returning int.
>
> Oh, that's used as a hook function. That's pretty un-obvious for a
> function declared inline. Of course in that case you need to match
> the hook function type, even if that ought to have return type bool.
> Even farther - I question the need for a function here in the first
> place, as both VMX and now SVM AVIC return a static value. This
> could therefore be a bool flag alongside the various others we
> already have near the beginning of the structure, if you're up to
> such a change. If you'd rather stick with what's there now, we can
> always put together a cleanup patch later on.
>
> Jan
>

Sure, I can incorporate the changes to make this a bool flag in this patch.

S

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 10/10] x86/SVM: Add AMD AVIC key handler
  2017-01-05 16:07   ` Jan Beulich
@ 2017-01-10 11:14     ` Suravee Suthikulpanit
  2017-01-10 12:55       ` Jan Beulich
  0 siblings, 1 reply; 48+ messages in thread
From: Suravee Suthikulpanit @ 2017-01-10 11:14 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3, boris.ostrovsky, xen-devel, sherry.hurwitz

Jan,

On 01/05/2017 11:07 PM, Jan Beulich wrote:
>>>> On 31.12.16 at 06:46, <suravee.suthikulpanit@amd.com> wrote:
>> +void __init setup_avic_dump(void)
>> +{
>> +    register_keyhandler('j', avic_dump, "dump SVM AVIC", 1);
>> +}
>
> For one the description should include the word "stats". And then
> I'm rather uncertain this is work burning one of the few remaining
> available keys. Could this be made a domctl instead?
>
> Jan
>

Not sure how you are thinking about using domctl to output statistics. 
Are there examples? Could you please describe?

Thanks,
Suravee

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 10/10] x86/SVM: Add AMD AVIC key handler
  2017-01-10 11:14     ` Suravee Suthikulpanit
@ 2017-01-10 12:55       ` Jan Beulich
  0 siblings, 0 replies; 48+ messages in thread
From: Jan Beulich @ 2017-01-10 12:55 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: andrew.cooper3, boris.ostrovsky, xen-devel, sherry.hurwitz

>>> On 10.01.17 at 12:14, <Suravee.Suthikulpanit@amd.com> wrote:
> On 01/05/2017 11:07 PM, Jan Beulich wrote:
>>>>> On 31.12.16 at 06:46, <suravee.suthikulpanit@amd.com> wrote:
>>> +void __init setup_avic_dump(void)
>>> +{
>>> +    register_keyhandler('j', avic_dump, "dump SVM AVIC", 1);
>>> +}
>>
>> For one the description should include the word "stats". And then
>> I'm rather uncertain this is work burning one of the few remaining
>> available keys. Could this be made a domctl instead?
> 
> Not sure how you are thinking about using domctl to output statistics. 
> Are there examples? Could you please describe?

Provide a domctl to obtain the values, and a new xl command to
wrap that domctl.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 08/10] x86/SVM: Add interrupt management code via AVIC
  2017-01-02 17:45   ` Andrew Cooper
@ 2017-02-28 12:01     ` George Dunlap
  0 siblings, 0 replies; 48+ messages in thread
From: George Dunlap @ 2017-02-28 12:01 UTC (permalink / raw)
  To: Andrew Cooper, Suravee Suthikulpanit, xen-devel
  Cc: George Dunlap, Dario Faggioli, jbeulich, sherry.hurwitz, boris.ostrovsky

On 02/01/17 17:45, Andrew Cooper wrote:
> On 31/12/2016 05:45, Suravee Suthikulpanit wrote:
>> Enabling AVIC implicitly disables the V_IRQ, V_INTR_PRIO, V_IGN_TPR,
>> and V_INTR_VECTOR fields in the VMCB Control Word. Therefore, this patch
>> introduces new interrupt injection code via AVIC backing page.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Cc: Jan Beulich <JBeulich@suse.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> ---
>>  xen/arch/x86/hvm/svm/avic.c        | 28 ++++++++++++++++++++++++++++
>>  xen/arch/x86/hvm/svm/intr.c        |  4 ++++
>>  xen/arch/x86/hvm/svm/svm.c         | 12 ++++++++++--
>>  xen/include/asm-x86/hvm/svm/avic.h |  1 +
>>  4 files changed, 43 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
>> index 6351c8e..faa5e45 100644
>> --- a/xen/arch/x86/hvm/svm/avic.c
>> +++ b/xen/arch/x86/hvm/svm/avic.c
>> @@ -636,6 +636,34 @@ void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs)
>>      return;
>>  }
>>  
>> +void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vec)
>> +{
>> +    struct vlapic *vlapic = vcpu_vlapic(v);
>> +
>> +    /* Fallback to use non-AVIC if vcpu is not enabled with AVIC. */
>> +    if ( !svm_avic_vcpu_enabled(v) )
>> +    {
>> +        if ( !vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
>> +            vcpu_kick(v);
>> +        return;
>> +    }
>> +
>> +    if ( !(guest_cpu_user_regs()->eflags & X86_EFLAGS_IF) )
>> +        return;
> 
> Won't this discard the interrupt?
> 
>> +
>> +    if ( vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
>> +        return;
>> +
>> +    /*
>> +     * If vcpu is running on another cpu, hit the doorbell to signal
>> +     * it to process interrupt. Otherwise, kick it.
>> +     */
>> +    if ( v->is_running && (v != current) )
>> +        wrmsrl(AVIC_DOORBELL, cpu_data[v->processor].apicid);
> 
> Hmm - my gut feeling is that this is racy without holding the scheduler
> lock for the target pcpu.  Nothing (I am aware of) excludes ->is_running
> and ->processor changing behind our back.
> 
> CC'ing George and Dario for their input.

I'm not sure how AVIC_DOORBELL works (haven't looked at the whole
series) -- the vcpu_kick() path accesses both is_running and
v->processor without locks; but that's because any schedule event which
may change those values will also check to see whether there is a
pending event to be delivered.  In theory the same could apply to this
mechanism, but it would take some careful thinking (in particular,
understanding the "NB's" in vcpu_kick() to see if and how they apply).

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2017-02-28 12:01 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-31  5:45 [PATCH v2 00/10] Introduce AMD SVM AVIC Suravee Suthikulpanit
2016-12-31  5:45 ` [PATCH v2 01/10] x86/HVM: Introduce struct hvm_pi_ops Suravee Suthikulpanit
2017-01-05  2:54   ` Tian, Kevin
2017-01-05  7:57     ` Jan Beulich
2017-01-05 15:51   ` Jan Beulich
2017-01-10  6:51     ` Suravee Suthikulpanit
2017-01-10  8:24       ` Jan Beulich
2017-01-10  9:45         ` Suravee Suthikulpanit
2016-12-31  5:45 ` [PATCH v2 02/10] x86/vLAPIC: Declare vlapic_read_aligned() and vlapic_reg_write() as non-static Suravee Suthikulpanit
2017-01-05 15:53   ` Jan Beulich
2017-01-10  6:57     ` Suravee Suthikulpanit
2017-01-10  8:25       ` Jan Beulich
2016-12-31  5:45 ` [PATCH v2 03/10] x86/HVM: Call vlapic_destroy after vcpu_destroy Suravee Suthikulpanit
2017-01-05  2:56   ` Tian, Kevin
2017-01-05 15:56   ` Jan Beulich
2017-01-10  8:18     ` Suravee Suthikulpanit
2016-12-31  5:45 ` [PATCH v2 04/10] x86/SVM: Modify VMCB fields to add AVIC support Suravee Suthikulpanit
2016-12-31  5:45 ` [PATCH v2 05/10] x86/HVM/SVM: Add AVIC initialization code Suravee Suthikulpanit
2017-01-02 16:37   ` Andrew Cooper
2017-01-04 17:24     ` Suravee Suthikulpanit
2017-01-04 17:59       ` Andrew Cooper
2017-01-10  3:06     ` Suravee Suthikulpanit
2017-01-03 14:54   ` Boris Ostrovsky
2016-12-31  5:45 ` [PATCH v2 06/10] x86/SVM: Add AVIC vmexit handlers Suravee Suthikulpanit
2017-01-02 17:28   ` Andrew Cooper
2017-01-05  4:07     ` Suravee Suthikulpanit
2017-01-03 15:34   ` Boris Ostrovsky
2017-01-05  6:41     ` Suravee Suthikulpanit
2016-12-31  5:45 ` [PATCH v2 07/10] x86/SVM: Add vcpu scheduling support for AVIC Suravee Suthikulpanit
2017-01-02 17:35   ` Andrew Cooper
2017-01-03 15:43   ` Boris Ostrovsky
2016-12-31  5:45 ` [PATCH v2 08/10] x86/SVM: Add interrupt management code via AVIC Suravee Suthikulpanit
2017-01-02 17:45   ` Andrew Cooper
2017-02-28 12:01     ` George Dunlap
2017-01-05 16:01   ` Jan Beulich
2016-12-31  5:46 ` [PATCH v2 09/10] x86/SVM: Hook up miscellaneous AVIC functions Suravee Suthikulpanit
2017-01-02 17:49   ` Andrew Cooper
2017-01-05 16:05   ` Jan Beulich
2017-01-10  8:35     ` Suravee Suthikulpanit
2017-01-10  9:00       ` Jan Beulich
2017-01-10 10:28         ` Suravee Suthikulpanit
2016-12-31  5:46 ` [PATCH v2 10/10] x86/SVM: Add AMD AVIC key handler Suravee Suthikulpanit
2017-01-03 16:01   ` Boris Ostrovsky
2017-01-03 16:04     ` Andrew Cooper
2017-01-05  8:00       ` Suravee Suthikulpanit
2017-01-05 16:07   ` Jan Beulich
2017-01-10 11:14     ` Suravee Suthikulpanit
2017-01-10 12:55       ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.