* [PATCH 0/8] Introduce AMD SVM AVIC
@ 2018-04-03 23:01 Janakarajan Natarajan
  2018-04-03 23:01 ` [PATCH 1/8] x86/SVM: Modify VMCB fields to add AVIC support Janakarajan Natarajan
                   ` (8 more replies)
  0 siblings, 9 replies; 35+ messages in thread
From: Janakarajan Natarajan @ 2018-04-03 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, Janakarajan Natarajan, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Jan Beulich, Boris Ostrovsky

OVERVIEW
========
This patchset is the first of a two-part series introducing support
for the AMD Advanced Virtual Interrupt Controller (AVIC).

The AVIC hardware virtualizes local APIC registers of each vCPU via
the virtual APIC (vAPIC) backing page. This allows the guest to access
certain APIC registers without the need for emulation of hardware
behaviour in the hypervisor. More information about AVIC can be found in

* AMD64 Architecture Programmer's Manual Volume 2 - System Programming
  https://support.amd.com/TechDocs/24593.pdf

For SVM AVIC, we extend the existing SVM driver to:
* Check CPUID to detect AVIC support in the processor.
* Program new fields in VMCB to enable AVIC.
* Introduce new AVIC data structures and add code to manage them.
* Handle two new AVIC #VMEXITs.
* Add new interrupt injection code using vAPIC backing page
  instead of the existing V_IRQ, V_INTR_PRIO, V_INTR_VECTOR and
  V_IGN_TPR fields.

This patchset does not enable AVIC by default, since it does not
yet support nested VMs. Until then, users can enable SVM AVIC by
specifying the Xen parameter svm=avic.

Later, in Part 2, we will introduce the IOMMU AVIC support, which
provides speed-up for the PCI device passthrough use case by allowing
the IOMMU hardware to inject interrupts directly into the guest via
the vAPIC backing page.

OVERALL PERFORMANCE
===================
AVIC is available on AMD Family 15h model 6Xh (Carrizo) processors
and newer. An AMD Family 17h EPYC processor was used to collect the
performance data shown below.

Generally, SVM AVIC alone (without IOMMU AVIC) should provide an
overall speed-up for HVM guests, since it avoids #VMEXITs into the
hypervisor to emulate certain guest accesses to local APIC registers.

It should also improve performance when the hypervisor wants to
inject interrupts into a running vCPU. It can do this by setting the
corresponding IRR bit in the vAPIC backing page and triggering the
AVIC doorbell.

For sending IPIs between running vCPUs in a Linux guest, Xen
defaults to using event channels. However, for non-paravirtualized
guests, AVIC can also improve the performance of sending IPIs.

BENCHMARK: HACKBENCH
====================
To measure the performance of IPIs used for scheduling workloads,
some performance numbers were collected using hackbench.

    # hackbench -p -l 100000
    Running in process mode with 10 groups using 40 file descriptors
    each (== 400 tasks)
    Each sender will pass 100000 messages of 100 bytes
                          |       3 vcpus (sec)       |
    --------------------------------------------------
    No AVIC w/  xen_nopv  |           517             |
    AVIC w/ xen_nopv      |           173             |
    No AVIC w/o xen_nopv  |           141             |
    AVIC w/o xen_nopv     |           135             |

Each benchmark test was averaged over 10 runs.

CURRENTLY UNTESTED USE CASES
============================
* Nested VM

Any feedback and comments are very much appreciated.

Suravee Suthikulpanit (8):
  x86/SVM: Modify VMCB fields to add AVIC support
  x86/HVM/SVM: Add AVIC initialization code
  x86/SVM: Add AVIC vmexit handlers
  x86/SVM: Add vcpu scheduling support for AVIC
  x86/SVM: Add interrupt management code via AVIC
  x86/HVM: Hook up miscellaneous AVIC functions
  x86/SVM: Introduce svm command line option
  x86/SVM: Add AMD AVIC key handler

 docs/misc/xen-command-line.markdown |  16 +
 xen/arch/x86/hvm/svm/Makefile       |   1 +
 xen/arch/x86/hvm/svm/avic.c         | 626 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/intr.c         |   4 +
 xen/arch/x86/hvm/svm/svm.c          |  77 ++++-
 xen/arch/x86/hvm/svm/vmcb.c         |   3 +
 xen/arch/x86/hvm/vlapic.c           |  20 +-
 xen/arch/x86/hvm/vmx/vmx.c          |   8 +-
 xen/include/asm-x86/hvm/hvm.h       |   4 +-
 xen/include/asm-x86/hvm/svm/avic.h  |  43 +++
 xen/include/asm-x86/hvm/svm/svm.h   |   2 +
 xen/include/asm-x86/hvm/svm/vmcb.h  |  52 ++-
 xen/include/asm-x86/hvm/vlapic.h    |   4 +
 xen/include/asm-x86/msr-index.h     |   1 +
 14 files changed, 831 insertions(+), 30 deletions(-)
 create mode 100644 xen/arch/x86/hvm/svm/avic.c
 create mode 100644 xen/include/asm-x86/hvm/svm/avic.h

-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH 1/8] x86/SVM: Modify VMCB fields to add AVIC support
  2018-04-03 23:01 [PATCH 0/8] Introduce AMD SVM AVIC Janakarajan Natarajan
@ 2018-04-03 23:01 ` Janakarajan Natarajan
  2018-04-03 23:01 ` [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code Janakarajan Natarajan
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 35+ messages in thread
From: Janakarajan Natarajan @ 2018-04-03 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, Janakarajan Natarajan, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Jan Beulich, Boris Ostrovsky

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Introduce AVIC-related VMCB fields.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
---
 xen/include/asm-x86/hvm/svm/vmcb.h | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
index de07429dff..591d98fc8c 100644
--- a/xen/include/asm-x86/hvm/svm/vmcb.h
+++ b/xen/include/asm-x86/hvm/svm/vmcb.h
@@ -324,16 +324,17 @@ typedef union
     struct
     {
         u64 tpr:          8;
-        u64 irq:          1;
+        u64 irq:          1; /* ignored in avic mode */
         u64 vgif:         1;
         u64 rsvd0:        6;
-        u64 prio:         4;
-        u64 ign_tpr:      1;
+        u64 prio:         4; /* ignored in avic mode */
+        u64 ign_tpr:      1; /* ignored in avic mode */
         u64 rsvd1:        3;
         u64 intr_masking: 1;
         u64 vgif_enable:  1;
-        u64 rsvd2:        6;
-        u64 vector:       8;
+        u64 rsvd2:        5;
+        u64 avic_enable:  1;
+        u64 vector:       8; /* ignored in avic mode */
         u64 rsvd3:       24;
     } fields;
 } vintr_t;
@@ -393,7 +394,8 @@ typedef union
         uint32_t cr2: 1;
         /* debugctlmsr, last{branch,int}{to,from}ip */
         uint32_t lbr: 1;
-        uint32_t resv: 21;
+        uint32_t avic: 1;
+        uint32_t resv: 20;
     } fields;
 } vmcbcleanbits_t;
 
@@ -427,7 +429,8 @@ struct vmcb_struct {
     u64 exitinfo2;              /* offset 0x80 */
     eventinj_t  exitintinfo;    /* offset 0x88 */
     u64 _np_enable;             /* offset 0x90 - cleanbit 4 */
-    u64 res08[2];
+    u64 avic_vapic_bar;         /* offset 0x98 */
+    u64 res08;                  /* offset 0xA0 */
     eventinj_t  eventinj;       /* offset 0xA8 */
     u64 _h_cr3;                 /* offset 0xB0 - cleanbit 4 */
     virt_ext_t virt_ext;        /* offset 0xB8 */
@@ -436,7 +439,11 @@ struct vmcb_struct {
     u64 nextrip;                /* offset 0xC8 */
     u8  guest_ins_len;          /* offset 0xD0 */
     u8  guest_ins[15];          /* offset 0xD1 */
-    u64 res10a[100];            /* offset 0xE0 pad to save area */
+    u64 avic_bk_pg_pa;          /* offset 0xE0 */
+    u64 res09a;                 /* offset 0xE8 */
+    u64 avic_logical_id_table_pa;  /* offset 0xF0 */
+    u64 avic_physical_id_table_pa; /* offset 0xF8 */
+    u64 res10a[96];             /* offset 0x100 pad to save area */
 
     union {
         struct segment_register sreg[6];
-- 
2.11.0



* [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code
  2018-04-03 23:01 [PATCH 0/8] Introduce AMD SVM AVIC Janakarajan Natarajan
  2018-04-03 23:01 ` [PATCH 1/8] x86/SVM: Modify VMCB fields to add AVIC support Janakarajan Natarajan
@ 2018-04-03 23:01 ` Janakarajan Natarajan
  2018-04-13 17:35   ` Andrew Cooper
                     ` (2 more replies)
  2018-04-03 23:01 ` [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers Janakarajan Natarajan
                   ` (6 subsequent siblings)
  8 siblings, 3 replies; 35+ messages in thread
From: Janakarajan Natarajan @ 2018-04-03 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, Janakarajan Natarajan, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Jan Beulich, Boris Ostrovsky

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Introduce AVIC base initialization code. This includes:
    * Setting up per-VM data structures.
    * Setting up per-vCPU data structures.
    * Initializing AVIC-related VMCB bit fields.

This patch also introduces a new Xen parameter (svm-avic),
which can be used to enable/disable AVIC support.
Currently, svm-avic is disabled by default.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
---
 xen/arch/x86/hvm/svm/Makefile      |   1 +
 xen/arch/x86/hvm/svm/avic.c        | 191 +++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/svm.c         |   8 +-
 xen/arch/x86/hvm/svm/vmcb.c        |   3 +
 xen/arch/x86/hvm/vlapic.c          |   4 +
 xen/include/asm-x86/hvm/svm/avic.h |  36 +++++++
 xen/include/asm-x86/hvm/svm/svm.h  |   2 +
 xen/include/asm-x86/hvm/svm/vmcb.h |  17 ++++
 8 files changed, 261 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/x86/hvm/svm/avic.c
 create mode 100644 xen/include/asm-x86/hvm/svm/avic.h

diff --git a/xen/arch/x86/hvm/svm/Makefile b/xen/arch/x86/hvm/svm/Makefile
index 760d2954da..e0e4a59f7d 100644
--- a/xen/arch/x86/hvm/svm/Makefile
+++ b/xen/arch/x86/hvm/svm/Makefile
@@ -1,4 +1,5 @@
 obj-y += asid.o
+obj-y += avic.o
 obj-y += emulate.o
 obj-bin-y += entry.o
 obj-y += intr.o
diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
new file mode 100644
index 0000000000..8108698911
--- /dev/null
+++ b/xen/arch/x86/hvm/svm/avic.c
@@ -0,0 +1,191 @@
+/*
+ * avic.c: implements AMD Advanced Virtual Interrupt Controller (AVIC) support
+ * Copyright (c) 2016, Advanced Micro Devices, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/domain_page.h>
+#include <xen/sched.h>
+#include <xen/stdbool.h>
+#include <asm/acpi.h>
+#include <asm/apicdef.h>
+#include <asm/atomic.h>
+#include <asm/event.h>
+#include <asm/hvm/emulate.h>
+#include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/support.h>
+#include <asm/hvm/svm/avic.h>
+#include <asm/hvm/vlapic.h>
+#include <asm/p2m.h>
+#include <asm/page.h>
+
+/*
+ * Note: Current max index allowed for physical APIC ID table is 255.
+ */
+#define AVIC_PHY_APIC_ID_MAX    0xFF
+
+#define AVIC_VAPIC_BAR_MASK     (((1ULL << 40) - 1) << PAGE_SHIFT)
+
+/*
+ * Note:
+ * Currently, svm-avic mode is not supported with nested virtualization.
+ * Therefore, it is not yet currently enabled by default. Once the support
+ * is in-place, this should be enabled by default.
+ */
+bool svm_avic = false;
+
+static struct avic_physical_id_entry *
+avic_get_physical_id_entry(struct svm_domain *d, unsigned int index)
+{
+    if ( !d->avic_physical_id_table )
+        return NULL;
+
+    /*
+     * Note: APIC ID = 0xFF is used for broadcast.
+     *       APIC ID > 0xFF is reserved.
+     */
+    ASSERT(index < AVIC_PHY_APIC_ID_MAX);
+
+    return &d->avic_physical_id_table[index];
+}
+
+int svm_avic_dom_init(struct domain *d)
+{
+    int ret = 0;
+    struct page_info *pg;
+
+    if ( !svm_avic || !has_vlapic(d) )
+        return 0;
+
+    /*
+     * Note:
+     * AVIC hardware walks the nested page table to check permissions,
+     * but does not use the SPA address specified in the leaf page
+     * table entry, since it uses the address in the AVIC_BACKING_PAGE
+     * pointer field of the VMCB. Therefore, we set up a dummy page
+     * for the APIC at _mfn(0).
+     */
+    set_mmio_p2m_entry(d, paddr_to_pfn(APIC_DEFAULT_PHYS_BASE),
+                       _mfn(0), PAGE_ORDER_4K,
+                       p2m_get_hostp2m(d)->default_access);
+
+    /* Init AVIC logical APIC ID table */
+    pg = alloc_domheap_page(d, MEMF_no_owner);
+    if ( !pg )
+    {
+        ret = -ENOMEM;
+        goto err_out;
+    }
+    clear_domain_page(_mfn(page_to_mfn(pg)));
+    d->arch.hvm_domain.svm.avic_logical_id_table_pg = pg;
+    d->arch.hvm_domain.svm.avic_logical_id_table = __map_domain_page_global(pg);
+
+    /* Init AVIC physical APIC ID table */
+    pg = alloc_domheap_page(d, MEMF_no_owner);
+    if ( !pg )
+    {
+        ret = -ENOMEM;
+        goto err_out;
+    }
+    clear_domain_page(_mfn(page_to_mfn(pg)));
+    d->arch.hvm_domain.svm.avic_physical_id_table_pg = pg;
+    d->arch.hvm_domain.svm.avic_physical_id_table = __map_domain_page_global(pg);
+
+    return ret;
+ err_out:
+    svm_avic_dom_destroy(d);
+    return ret;
+}
+
+void svm_avic_dom_destroy(struct domain *d)
+{
+    if ( !svm_avic || !has_vlapic(d) )
+        return;
+
+    if ( d->arch.hvm_domain.svm.avic_physical_id_table )
+    {
+        unmap_domain_page_global(d->arch.hvm_domain.svm.avic_physical_id_table);
+        free_domheap_page(d->arch.hvm_domain.svm.avic_physical_id_table_pg);
+        d->arch.hvm_domain.svm.avic_physical_id_table_pg = NULL;
+        d->arch.hvm_domain.svm.avic_physical_id_table = NULL;
+    }
+
+    if ( d->arch.hvm_domain.svm.avic_logical_id_table )
+    {
+        unmap_domain_page_global(d->arch.hvm_domain.svm.avic_logical_id_table);
+        free_domheap_page(d->arch.hvm_domain.svm.avic_logical_id_table_pg);
+        d->arch.hvm_domain.svm.avic_logical_id_table_pg = NULL;
+        d->arch.hvm_domain.svm.avic_logical_id_table = NULL;
+    }
+}
+
+bool svm_avic_vcpu_enabled(const struct vcpu *v)
+{
+    const struct arch_svm_struct *s = &v->arch.hvm_svm;
+    const struct vmcb_struct *vmcb = s->vmcb;
+
+    return vmcb->_vintr.fields.avic_enable;
+}
+
+int svm_avic_init_vmcb(struct vcpu *v)
+{
+    u32 apic_id;
+    unsigned long tmp;
+    struct arch_svm_struct *s = &v->arch.hvm_svm;
+    struct vmcb_struct *vmcb = s->vmcb;
+    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
+    const struct vlapic *vlapic = vcpu_vlapic(v);
+    struct avic_physical_id_entry *entry = (struct avic_physical_id_entry *)&tmp;
+
+    if ( !svm_avic || !has_vlapic(v->domain) )
+        return 0;
+
+    if ( !vlapic || !vlapic->regs_page )
+        return -EINVAL;
+
+    apic_id = vlapic_read_aligned(vcpu_vlapic(v), APIC_ID);
+    s->avic_last_phy_id = avic_get_physical_id_entry(d, GET_xAPIC_ID(apic_id));
+    if ( !s->avic_last_phy_id )
+        return -EINVAL;
+
+    vmcb->avic_bk_pg_pa = page_to_maddr(vlapic->regs_page);
+    vmcb->avic_logical_id_table_pa = domain_page_map_to_mfn(d->avic_logical_id_table) << PAGE_SHIFT;
+    vmcb->avic_physical_id_table_pa = domain_page_map_to_mfn(d->avic_physical_id_table) << PAGE_SHIFT;
+
+    /* Set Physical ID Table Pointer [7:0] to the max APIC ID of the domain */
+    vmcb->avic_physical_id_table_pa &= ~AVIC_PHY_APIC_ID_MAX;
+    vmcb->avic_physical_id_table_pa |= (v->domain->max_vcpus * 2) & 0xFF;
+
+    tmp = read_atomic((u64*)(s->avic_last_phy_id));
+    entry->bk_pg_ptr_mfn = (vmcb->avic_bk_pg_pa) >> PAGE_SHIFT;
+    entry->is_running = 0;
+    entry->valid = 1;
+    write_atomic((u64*)(s->avic_last_phy_id), tmp);
+
+    vmcb->avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & AVIC_VAPIC_BAR_MASK;
+    vmcb->cleanbits.fields.avic = 0;
+
+    vmcb->_vintr.fields.avic_enable = 1;
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 8538232f68..b445f59ada 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -47,6 +47,7 @@
 #include <asm/hvm/svm/asid.h>
 #include <asm/hvm/svm/svm.h>
 #include <asm/hvm/svm/vmcb.h>
+#include <asm/hvm/svm/avic.h>
 #include <asm/hvm/svm/emulate.h>
 #include <asm/hvm/svm/intr.h>
 #include <asm/hvm/svm/svmdebug.h>
@@ -1225,11 +1226,12 @@ static int svm_domain_initialise(struct domain *d)
 
     d->arch.ctxt_switch = &csw;
 
-    return 0;
+    return svm_avic_dom_init(d);
 }
 
 static void svm_domain_destroy(struct domain *d)
 {
+    svm_avic_dom_destroy(d);
 }
 
 static int svm_vcpu_initialise(struct vcpu *v)
@@ -1694,6 +1696,9 @@ const struct hvm_function_table * __init start_svm(void)
     if ( cpu_has_tsc_ratio )
         svm_function_table.tsc_scaling.ratio_frac_bits = 32;
 
+    if ( !cpu_has_svm_avic )
+        svm_avic = 0;
+
 #define P(p,s) if ( p ) { printk(" - %s\n", s); printed = 1; }
     P(cpu_has_svm_npt, "Nested Page Tables (NPT)");
     P(cpu_has_svm_lbrv, "Last Branch Record (LBR) Virtualisation");
@@ -1705,6 +1710,7 @@ const struct hvm_function_table * __init start_svm(void)
     P(cpu_has_pause_filter, "Pause-Intercept Filter");
     P(cpu_has_pause_thresh, "Pause-Intercept Filter Threshold");
     P(cpu_has_tsc_ratio, "TSC Rate MSR");
+    P(cpu_has_svm_avic, svm_avic ? "AVIC (enabled)" : "AVIC (disabled)");
 #undef P
 
     if ( !printed )
diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
index ae60d8dc1c..7ade023cfe 100644
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -27,6 +27,7 @@
 #include <asm/msr-index.h>
 #include <asm/p2m.h>
 #include <asm/hvm/support.h>
+#include <asm/hvm/svm/avic.h>
 #include <asm/hvm/svm/svm.h>
 #include <asm/hvm/svm/svmdebug.h>
 
@@ -215,6 +216,8 @@ static int construct_vmcb(struct vcpu *v)
             vmcb->_pause_filter_thresh = SVM_PAUSETHRESH_INIT;
     }
 
+    svm_avic_init_vmcb(v);
+
     vmcb->cleanbits.bytes = 0;
 
     return 0;
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 1b9f00a0e4..a4472200a6 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -1597,6 +1597,10 @@ int vlapic_init(struct vcpu *v)
 
     if (vlapic->regs_page == NULL)
     {
+        /*
+         * SVM AVIC depends on vlapic->regs_page being a full page
+         * allocation, as it is also used as the vAPIC backing page.
+         */
         vlapic->regs_page = alloc_domheap_page(v->domain, MEMF_no_owner);
         if ( vlapic->regs_page == NULL )
         {
diff --git a/xen/include/asm-x86/hvm/svm/avic.h b/xen/include/asm-x86/hvm/svm/avic.h
new file mode 100644
index 0000000000..32bb9a91e8
--- /dev/null
+++ b/xen/include/asm-x86/hvm/svm/avic.h
@@ -0,0 +1,36 @@
+#ifndef _SVM_AVIC_H_
+#define _SVM_AVIC_H_
+
+#include <xen/compiler.h>
+
+enum avic_incmp_ipi_err_code {
+    AVIC_INCMP_IPI_ERR_INVALID_INT_TYPE,
+    AVIC_INCMP_IPI_ERR_TARGET_NOT_RUN,
+    AVIC_INCMP_IPI_ERR_INV_TARGET,
+    AVIC_INCMP_IPI_ERR_INV_BK_PAGE,
+};
+
+struct __packed avic_logical_id_entry {
+    u32 guest_phy_apic_id : 8;
+    u32 res               : 23;
+    u32 valid             : 1;
+};
+
+struct __packed avic_physical_id_entry {
+    u64 host_phy_apic_id  : 8;
+    u64 res1              : 4;
+    u64 bk_pg_ptr_mfn     : 40;
+    u64 res2              : 10;
+    u64 is_running        : 1;
+    u64 valid             : 1;
+};
+
+extern bool svm_avic;
+
+int svm_avic_dom_init(struct domain *d);
+void svm_avic_dom_destroy(struct domain *d);
+
+bool svm_avic_vcpu_enabled(const struct vcpu *v);
+int svm_avic_init_vmcb(struct vcpu *v);
+
+#endif /* _SVM_AVIC_H_ */
diff --git a/xen/include/asm-x86/hvm/svm/svm.h b/xen/include/asm-x86/hvm/svm/svm.h
index 4e5e142910..983750fee9 100644
--- a/xen/include/asm-x86/hvm/svm/svm.h
+++ b/xen/include/asm-x86/hvm/svm/svm.h
@@ -65,6 +65,7 @@ extern u32 svm_feature_flags;
 #define SVM_FEATURE_DECODEASSISTS  7 /* Decode assists support */
 #define SVM_FEATURE_PAUSEFILTER   10 /* Pause intercept filter support */
 #define SVM_FEATURE_PAUSETHRESH   12 /* Pause intercept filter support */
+#define SVM_FEATURE_AVIC          13 /* AVIC Support */
 #define SVM_FEATURE_VLOADSAVE     15 /* virtual vmload/vmsave */
 #define SVM_FEATURE_VGIF          16 /* Virtual GIF */
 
@@ -79,6 +80,7 @@ extern u32 svm_feature_flags;
 #define cpu_has_pause_filter  cpu_has_svm_feature(SVM_FEATURE_PAUSEFILTER)
 #define cpu_has_pause_thresh  cpu_has_svm_feature(SVM_FEATURE_PAUSETHRESH)
 #define cpu_has_tsc_ratio     cpu_has_svm_feature(SVM_FEATURE_TSCRATEMSR)
+#define cpu_has_svm_avic      cpu_has_svm_feature(SVM_FEATURE_AVIC)
 #define cpu_has_svm_vloadsave cpu_has_svm_feature(SVM_FEATURE_VLOADSAVE)
 
 #define SVM_PAUSEFILTER_INIT    4000
diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
index 591d98fc8c..386aad2260 100644
--- a/xen/include/asm-x86/hvm/svm/vmcb.h
+++ b/xen/include/asm-x86/hvm/svm/vmcb.h
@@ -500,6 +500,21 @@ struct vmcb_struct {
 };
 
 struct svm_domain {
+    /*
+     * This per-domain table is used by the hardware to locate
+     * the vAPIC backing page to be used to deliver interrupts
+     * based on the guest physical APIC ID.
+     */
+    struct avic_physical_id_entry *avic_physical_id_table;
+    struct page_info *avic_physical_id_table_pg;
+
+    /*
+     * This per-domain table is used by the hardware to map
+     * logically addressed interrupt requests (w/ guest logical APIC id)
+     * to the guest physical APIC ID.
+     */
+    struct avic_logical_id_entry *avic_logical_id_table;
+    struct page_info *avic_logical_id_table_pg;
 };
 
 struct arch_svm_struct {
@@ -533,6 +548,8 @@ struct arch_svm_struct {
         u64 length;
         u64 status;
     } osvw;
+
+    struct avic_physical_id_entry *avic_last_phy_id;
 };
 
 struct vmcb_struct *alloc_vmcb(void);
-- 
2.11.0



* [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers
  2018-04-03 23:01 [PATCH 0/8] Introduce AMD SVM AVIC Janakarajan Natarajan
  2018-04-03 23:01 ` [PATCH 1/8] x86/SVM: Modify VMCB fields to add AVIC support Janakarajan Natarajan
  2018-04-03 23:01 ` [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code Janakarajan Natarajan
@ 2018-04-03 23:01 ` Janakarajan Natarajan
  2018-04-13 17:48   ` Andrew Cooper
  2018-04-17 12:58   ` Jan Beulich
  2018-04-03 23:01 ` [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC Janakarajan Natarajan
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 35+ messages in thread
From: Janakarajan Natarajan @ 2018-04-03 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, Janakarajan Natarajan, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Jan Beulich, Boris Ostrovsky

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

AVIC introduces two new #vmexit handlers:

VMEXIT_INCOMP_IPI:
This occurs when an IPI could not be delivered to all targeted guest
virtual processors because at least one guest virtual processor
was not allocated to a physical core at the time. In this case,
Xen tries to emulate the IPI.

VMEXIT_DO_NOACCEL:
This occurs when a guest accesses an APIC register that cannot be
accelerated by AVIC. In this case, Xen tries to emulate the register access.

This fault is also generated if an EOI is attempted while the highest-priority
in-service interrupt is set for level-triggered mode.

This patch also declares vlapic_read_aligned() and vlapic_reg_write()
as non-static so that they can be used by the AVIC code.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
---
 xen/arch/x86/hvm/svm/avic.c        | 296 +++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/svm.c         |   8 +
 xen/arch/x86/hvm/vlapic.c          |   4 +-
 xen/include/asm-x86/hvm/svm/avic.h |   3 +
 xen/include/asm-x86/hvm/svm/vmcb.h |   6 +
 xen/include/asm-x86/hvm/vlapic.h   |   4 +
 6 files changed, 319 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
index 8108698911..e112469774 100644
--- a/xen/arch/x86/hvm/svm/avic.c
+++ b/xen/arch/x86/hvm/svm/avic.c
@@ -19,6 +19,7 @@
 #include <xen/sched.h>
 #include <xen/stdbool.h>
 #include <asm/acpi.h>
+#include <asm/apic.h>
 #include <asm/apicdef.h>
 #include <asm/atomic.h>
 #include <asm/event.h>
@@ -37,6 +38,8 @@
 
 #define AVIC_VAPIC_BAR_MASK     (((1ULL << 40) - 1) << PAGE_SHIFT)
 
+#define AVIC_UNACCEL_ACCESS_OFFSET_MASK    0xFF0
+
 /*
  * Note:
  * Currently, svm-avic mode is not supported with nested virtualization.
@@ -101,6 +104,8 @@ int svm_avic_dom_init(struct domain *d)
     d->arch.hvm_domain.svm.avic_physical_id_table_pg = pg;
     d->arch.hvm_domain.svm.avic_physical_id_table = __map_domain_page_global(pg);
 
+    spin_lock_init(&d->arch.hvm_domain.svm.avic_dfr_mode_lock);
+
     return ret;
  err_out:
     svm_avic_dom_destroy(d);
@@ -181,6 +186,297 @@ int svm_avic_init_vmcb(struct vcpu *v)
 }
 
 /*
+ * Note:
+ * This function handles the AVIC_INCOMP_IPI #vmexit when AVIC is enabled.
+ * The hardware generates this fault when an IPI could not be delivered
+ * to all targeted guest virtual processors because at least one guest
+ * virtual processor was not allocated to a physical core at the time.
+ */
+void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs)
+{
+    struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
+    struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
+    u32 icrh = vmcb->exitinfo1 >> 32;
+    u32 icrl = vmcb->exitinfo1;
+    u32 id = vmcb->exitinfo2 >> 32;
+    u32 index = vmcb->exitinfo2 & 0xFF;
+
+    switch ( id )
+    {
+    case AVIC_INCMP_IPI_ERR_INVALID_INT_TYPE:
+        /*
+         * AVIC hardware handles the delivery of
+         * IPIs when the specified Message Type is Fixed
+         * (also known as fixed delivery mode) and
+         * the Trigger Mode is edge-triggered. The hardware
+         * also supports self and broadcast delivery modes
+         * specified via the Destination Shorthand(DSH)
+         * field of the ICRL. Logical and physical APIC ID
+         * formats are supported. All other IPI types cause
+         * a #VMEXIT, which needs to be emulated.
+         */
+        vlapic_reg_write(curr, APIC_ICR2, icrh);
+        vlapic_reg_write(curr, APIC_ICR, icrl);
+        break;
+
+    case AVIC_INCMP_IPI_ERR_TARGET_NOT_RUN:
+    {
+        /*
+         * At this point, we expect that the AVIC HW has already
+         * set the appropriate IRR bits on the valid target
+         * vcpus. So, we just need to kick the appropriate vcpu.
+         */
+        struct vcpu *v;
+        uint32_t dest = GET_xAPIC_DEST_FIELD(icrh);
+        uint32_t short_hand = icrl & APIC_SHORT_MASK;
+        bool dest_mode = !!(icrl & APIC_DEST_MASK);
+
+        for_each_vcpu ( currd,  v )
+        {
+            if ( v != curr &&
+                 vlapic_match_dest(vcpu_vlapic(v), vcpu_vlapic(curr),
+                                   short_hand, dest, dest_mode) )
+            {
+                vcpu_kick(v);
+                break;
+            }
+        }
+        break;
+    }
+
+    case AVIC_INCMP_IPI_ERR_INV_TARGET:
+        gprintk(XENLOG_ERR,
+                "SVM: %s: Invalid IPI target (icr=%#08x:%08x, idx=%u)\n",
+                __func__, icrh, icrl, index);
+        domain_crash(currd);
+        break;
+
+    case AVIC_INCMP_IPI_ERR_INV_BK_PAGE:
+        gprintk(XENLOG_ERR,
+                "SVM: %s: Invalid bk page (icr=%#08x:%08x, idx=%u)\n",
+                __func__, icrh, icrl, index);
+        domain_crash(currd);
+        break;
+
+    default:
+        gprintk(XENLOG_ERR, "SVM: %s: Unknown IPI interception (%#x)\n",
+                __func__, id);
+        domain_crash(currd);
+    }
+}
+
+static struct avic_logical_id_entry *
+avic_get_logical_id_entry(struct svm_domain *d, u32 ldr, bool flat)
+{
+    unsigned int index;
+    unsigned int dest_id = GET_xAPIC_LOGICAL_ID(ldr);
+
+    if ( !dest_id )
+        return NULL;
+
+    if ( flat )
+    {
+        index = ffs(dest_id) - 1;
+        if ( index > 7 )
+            return NULL;
+    }
+    else
+    {
+        unsigned int cluster = (dest_id & 0xf0) >> 4;
+        int apic = ffs(dest_id & 0x0f) - 1;
+
+        if ( (apic < 0) || (apic > 7) || (cluster >= 0xf) )
+            return NULL;
+        index = (cluster << 2) + apic;
+    }
+
+    ASSERT(index <= 255);
+
+    return &d->avic_logical_id_table[index];
+}
+
+static int avic_ldr_write(struct vcpu *v, u8 g_phy_id, u32 ldr, bool valid)
+{
+    struct avic_logical_id_entry *entry, new_entry;
+    u32 dfr = vlapic_read_aligned(vcpu_vlapic(v), APIC_DFR);
+
+    entry = avic_get_logical_id_entry(&v->domain->arch.hvm_domain.svm,
+                                      ldr, (dfr == APIC_DFR_FLAT));
+    if ( !entry )
+        return -EINVAL;
+
+    new_entry = *entry;
+    smp_rmb();
+    new_entry.guest_phy_apic_id = g_phy_id;
+    new_entry.valid = valid;
+    *entry = new_entry;
+    smp_wmb();
+
+    return 0;
+}
+
+static int avic_handle_ldr_update(struct vcpu *v)
+{
+    int ret = 0;
+    u32 ldr = vlapic_read_aligned(vcpu_vlapic(v), APIC_LDR);
+    u32 apic_id = vlapic_read_aligned(vcpu_vlapic(v), APIC_ID);
+
+    if ( !ldr )
+        return -EINVAL;
+
+    ret = avic_ldr_write(v, GET_xAPIC_ID(apic_id), ldr, true);
+    if ( ret && v->arch.hvm_svm.avic_last_ldr )
+    {
+        /*
+         * Note:
+         * In case of failure to update the LDR register, we set
+         * the guest physical APIC ID to 0 and mark the logical
+         * APIC ID entry as invalid (false).
+         */
+        avic_ldr_write(v, 0, v->arch.hvm_svm.avic_last_ldr, false);
+        v->arch.hvm_svm.avic_last_ldr = 0;
+    }
+    else
+    {
+        /*
+         * Note:
+         * This saves the last valid LDR so that we know which
+         * entry in the logical APIC ID table to clean up when
+         * the LDR is updated.
+         */
+        v->arch.hvm_svm.avic_last_ldr = ldr;
+    }
+
+    return ret;
+}
+
+static int avic_handle_dfr_update(struct vcpu *v)
+{
+    u32 mod;
+    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
+    u32 dfr = vlapic_read_aligned(vcpu_vlapic(v), APIC_DFR);
+
+    mod = (dfr >> 28) & 0xFu;
+
+    spin_lock(&d->avic_dfr_mode_lock);
+    if ( d->avic_dfr_mode != mod )
+    {
+        /*
+         * We assume all local APICs use the same DFR mode. If the
+         * mode changes, the domain's AVIC logical APIC ID table
+         * must be flushed.
+         */
+        clear_domain_page(_mfn(page_to_mfn(d->avic_logical_id_table_pg)));
+        d->avic_dfr_mode = mod;
+    }
+    spin_unlock(&d->avic_dfr_mode_lock);
+
+    if ( v->arch.hvm_svm.avic_last_ldr )
+        avic_handle_ldr_update(v);
+
+    return 0;
+}
+
+static int avic_unaccel_trap_write(struct vcpu *v)
+{
+    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
+    u32 offset = vmcb->exitinfo1 & AVIC_UNACCEL_ACCESS_OFFSET_MASK;
+    u32 reg = vlapic_read_aligned(vcpu_vlapic(v), offset);
+
+    switch ( offset )
+    {
+    case APIC_ID:
+        /*
+         * Currently, we do not support APIC_ID updates while the
+         * vcpus are running, since that might require updating the
+         * AVIC max APIC ID in all VMCBs. This would need a
+         * synchronised update across all running vcpus.
+         */
+        return X86EMUL_UNHANDLEABLE;
+    case APIC_LDR:
+        if ( avic_handle_ldr_update(v) )
+            return X86EMUL_UNHANDLEABLE;
+        break;
+
+    case APIC_DFR:
+        if ( avic_handle_dfr_update(v) )
+            return X86EMUL_UNHANDLEABLE;
+        break;
+
+    default:
+        break;
+    }
+
+    vlapic_reg_write(v, offset, reg);
+
+    return X86EMUL_OKAY;
+}
+
+static inline bool avic_is_trap(u32 offset)
+{
+    u32 pos = offset >> 4;
+    static const unsigned long avic_trap[] =
+        {
+#define REG(x) (1UL << (APIC_ ## x >> 4))
+            REG(ID)   | REG(EOI)     | REG(RRR)   | REG(LDR)  |
+            REG(DFR)  | REG(SPIV)    | REG(ESR)   | REG(ICR)  |
+            REG(LVTT) | REG(LVTTHMR) | REG(LVTPC) | REG(LVT0) |
+            REG(LVT1) | REG(LVTERR)  | REG(TMICT) | REG(TDCR)
+#undef REG
+        };
+
+    return test_bit(pos, avic_trap);
+}
+
+/*
+ * Note:
+ * This function handles the AVIC_NOACCEL #VMEXIT when AVIC is enabled.
+ * The hardware generates this exit when:
+ * - A guest accesses an APIC register that is not accelerated by the
+ *   AVIC hardware.
+ * - An EOI is attempted while the highest-priority in-service
+ *   interrupt is level-triggered.
+ */
+void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs)
+{
+    struct vcpu *curr = current;
+    struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
+    u32 offset = vmcb->exitinfo1 & 0xFF0;
+    u32 rw = (vmcb->exitinfo1 >> 32) & 0x1;
+
+    if ( avic_is_trap(offset) )
+    {
+        /* Handling AVIC Trap (intercept right after the access). */
+        if ( !rw )
+        {
+            /*
+             * A read access should never be intercepted as a trap;
+             * if it is, the CPU microcode does not follow the spec.
+             */
+            gprintk(XENLOG_ERR, "%s: Invalid #VMEXIT due to trap read (%#x)\n",
+                    __func__, offset);
+            domain_crash(curr->domain);
+            return;
+        }
+
+        if ( avic_unaccel_trap_write(curr) != X86EMUL_OKAY )
+        {
+            gprintk(XENLOG_ERR, "%s: Failed to handle trap write (%#x)\n",
+                    __func__, offset);
+            domain_crash(curr->domain);
+        }
+    }
+    else
+        /* Handling AVIC Fault (intercept before the access). */
+        hvm_emulate_one_vm_event(EMUL_KIND_NORMAL, TRAP_invalid_op,
+                                 X86_EVENT_NO_EC);
+}
+
+/*
  * Local variables:
  * mode: C
  * c-file-style: "BSD"
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index b445f59ada..c47042bf6b 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2976,6 +2976,14 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
             VM_EVENT_DESC_TR, exit_reason == VMEXIT_TR_WRITE);
         break;
 
+    case VMEXIT_AVIC_INCOMP_IPI:
+        svm_avic_vmexit_do_incomp_ipi(regs);
+        break;
+
+    case VMEXIT_AVIC_NOACCEL:
+        svm_avic_vmexit_do_noaccel(regs);
+        break;
+
     default:
     unexpected_exit_type:
         gprintk(XENLOG_ERR, "Unexpected vmexit: reason %#"PRIx64", "
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index a4472200a6..64c4e8dd93 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -592,7 +592,7 @@ static void vlapic_set_tdcr(struct vlapic *vlapic, unsigned int val)
                 "timer_divisor: %d", vlapic->hw.timer_divisor);
 }
 
-static uint32_t vlapic_read_aligned(const struct vlapic *vlapic,
+uint32_t vlapic_read_aligned(const struct vlapic *vlapic,
                                     unsigned int offset)
 {
     switch ( offset )
@@ -788,7 +788,7 @@ static void vlapic_update_timer(struct vlapic *vlapic, uint32_t lvtt,
     }
 }
 
-static void vlapic_reg_write(struct vcpu *v,
+void vlapic_reg_write(struct vcpu *v,
                              unsigned int offset, uint32_t val)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
diff --git a/xen/include/asm-x86/hvm/svm/avic.h b/xen/include/asm-x86/hvm/svm/avic.h
index 32bb9a91e8..7af850ecbd 100644
--- a/xen/include/asm-x86/hvm/svm/avic.h
+++ b/xen/include/asm-x86/hvm/svm/avic.h
@@ -33,4 +33,7 @@ void svm_avic_dom_destroy(struct domain *d);
 bool svm_avic_vcpu_enabled(const struct vcpu *v);
 int svm_avic_init_vmcb(struct vcpu *v);
 
+void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs);
+void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs);
+
 #endif /* _SVM_AVIC_H_ */
diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
index 386aad2260..c6384ae197 100644
--- a/xen/include/asm-x86/hvm/svm/vmcb.h
+++ b/xen/include/asm-x86/hvm/svm/vmcb.h
@@ -301,6 +301,8 @@ enum VMEXIT_EXITCODE
     VMEXIT_MWAIT_CONDITIONAL= 140, /* 0x8c */
     VMEXIT_XSETBV           = 141, /* 0x8d */
     VMEXIT_NPF              = 1024, /* 0x400, nested paging fault */
+    VMEXIT_AVIC_INCOMP_IPI  = 1025, /* 0x401 */
+    VMEXIT_AVIC_NOACCEL     = 1026, /* 0x402 */
     VMEXIT_INVALID          =  -1
 };
 
@@ -515,6 +517,9 @@ struct svm_domain {
      */
     struct avic_logical_id_entry *avic_logical_id_table;
     struct page_info *avic_logical_id_table_pg;
+
+    u32 avic_dfr_mode;
+    spinlock_t avic_dfr_mode_lock;
 };
 
 struct arch_svm_struct {
@@ -550,6 +555,7 @@ struct arch_svm_struct {
     } osvw;
 
     struct avic_physical_id_entry *avic_last_phy_id;
+    u32 avic_last_ldr;
 };
 
 struct vmcb_struct *alloc_vmcb(void);
diff --git a/xen/include/asm-x86/hvm/vlapic.h b/xen/include/asm-x86/hvm/vlapic.h
index 212c36b5c2..3a56b6db0e 100644
--- a/xen/include/asm-x86/hvm/vlapic.h
+++ b/xen/include/asm-x86/hvm/vlapic.h
@@ -137,6 +137,10 @@ void vlapic_ipi(struct vlapic *vlapic, uint32_t icr_low, uint32_t icr_high);
 
 int vlapic_apicv_write(struct vcpu *v, unsigned int offset);
 
+void vlapic_reg_write(struct vcpu *v, unsigned int offset, uint32_t val);
+
+uint32_t vlapic_read_aligned(const struct vlapic *vlapic, unsigned int offset);
+
 struct vlapic *vlapic_lowest_prio(
     struct domain *d, const struct vlapic *source,
     int short_hand, uint32_t dest, bool_t dest_mode);
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC
  2018-04-03 23:01 [PATCH 0/8] Introduce AMD SVM AVIC Janakarajan Natarajan
                   ` (2 preceding siblings ...)
  2018-04-03 23:01 ` [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers Janakarajan Natarajan
@ 2018-04-03 23:01 ` Janakarajan Natarajan
  2018-04-13 17:57   ` Andrew Cooper
  2018-04-03 23:01 ` [PATCH 5/8] x86/SVM: Add interrupt management code via AVIC Janakarajan Natarajan
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 35+ messages in thread
From: Janakarajan Natarajan @ 2018-04-03 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, Janakarajan Natarajan, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Jan Beulich, Boris Ostrovsky

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Add hooks to manage the AVIC data structures during vcpu scheduling.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
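(Not part of the patch — an illustrative standalone sketch for reviewers.)
The scheduling hooks below toggle bit 62 (IsRunning) of a vcpu's physical
APIC ID table entry, which the doorbell hardware consults. A minimal model
of that bit manipulation, with names that are illustrative rather than the
actual Xen API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: bit 62 of a 64-bit physical APIC ID table entry
 * is the IsRunning flag (matching IS_RUNNING_BIT in the patch). */
#define IS_RUNNING_BIT 62

static inline uint64_t entry_set_running(uint64_t entry, bool running)
{
    if ( running )
        return entry | (1ULL << IS_RUNNING_BIT);
    return entry & ~(1ULL << IS_RUNNING_BIT);
}

static inline bool entry_is_running(uint64_t entry)
{
    return (entry >> IS_RUNNING_BIT) & 1;
}
```

The load/resume hooks correspond to setting the flag, and the unload/block
hooks to clearing it, while the rest of the entry (e.g. the host physical
APIC ID) is left untouched.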
---
 xen/arch/x86/hvm/svm/avic.c | 54 +++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/svm.c  | 10 +++++++++
 2 files changed, 64 insertions(+)

diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
index e112469774..7a25d83954 100644
--- a/xen/arch/x86/hvm/svm/avic.c
+++ b/xen/arch/x86/hvm/svm/avic.c
@@ -40,6 +40,7 @@
 
 #define AVIC_UNACCEL_ACCESS_OFFSET_MASK    0xFF0
 
+#define IS_RUNNING_BIT        62
 /*
  * Note:
  * Currently, svm-avic mode is not supported with nested virtualization.
@@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d, unsigned int index)
     return &d->avic_physical_id_table[index];
 }
 
+static void avic_vcpu_load(struct vcpu *v)
+{
+    unsigned long tmp;
+    struct arch_svm_struct *s = &v->arch.hvm_svm;
+    int h_phy_apic_id;
+    struct avic_physical_id_entry *entry = (struct avic_physical_id_entry *)&tmp;
+
+    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
+
+    /*
+     * Note: APIC ID = 0xff is used for broadcast.
+     *       APIC ID > 0xff is reserved.
+     */
+    h_phy_apic_id = cpu_data[v->processor].apicid;
+    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
+
+    tmp = read_atomic((u64*)(s->avic_last_phy_id));
+    entry->host_phy_apic_id = h_phy_apic_id;
+    entry->is_running = 1;
+    write_atomic((u64*)(s->avic_last_phy_id), tmp);
+}
+
+static void avic_vcpu_unload(struct vcpu *v)
+{
+    struct arch_svm_struct *s = &v->arch.hvm_svm;
+
+    clear_bit(IS_RUNNING_BIT, (u64*)(s->avic_last_phy_id));
+}
+
+static void avic_vcpu_resume(struct vcpu *v)
+{
+    struct arch_svm_struct *s = &v->arch.hvm_svm;
+
+    ASSERT(svm_avic_vcpu_enabled(v));
+    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
+
+    set_bit(IS_RUNNING_BIT, (u64*)(s->avic_last_phy_id));
+}
+
+static void avic_vcpu_block(struct vcpu *v)
+{
+    struct arch_svm_struct *s = &v->arch.hvm_svm;
+
+    ASSERT(svm_avic_vcpu_enabled(v));
+
+    clear_bit(IS_RUNNING_BIT, (u64*)(s->avic_last_phy_id));
+}
+
 int svm_avic_dom_init(struct domain *d)
 {
     int ret = 0;
@@ -106,6 +155,11 @@ int svm_avic_dom_init(struct domain *d)
 
     spin_lock_init(&d->arch.hvm_domain.svm.avic_dfr_mode_lock);
 
+    d->arch.hvm_domain.pi_ops.switch_from = avic_vcpu_unload;
+    d->arch.hvm_domain.pi_ops.switch_to = avic_vcpu_load;
+    d->arch.hvm_domain.pi_ops.vcpu_block = avic_vcpu_block;
+    d->arch.hvm_domain.pi_ops.do_resume = avic_vcpu_resume;
+
     return ret;
  err_out:
     svm_avic_dom_destroy(d);
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index c47042bf6b..641ad0dbc1 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1070,6 +1070,10 @@ static void svm_ctxt_switch_from(struct vcpu *v)
     svm_tsc_ratio_save(v);
 
     svm_sync_vmcb(v);
+
+    if ( v->domain->arch.hvm_domain.pi_ops.switch_from )
+        v->domain->arch.hvm_domain.pi_ops.switch_from(v);
+
     svm_vmload_pa(per_cpu(host_vmcb, cpu));
 
     /* Resume use of ISTs now that the host TR is reinstated. */
@@ -1102,6 +1106,9 @@ static void svm_ctxt_switch_to(struct vcpu *v)
     svm_lwp_load(v);
     svm_tsc_ratio_load(v);
 
+    if ( v->domain->arch.hvm_domain.pi_ops.switch_to )
+        v->domain->arch.hvm_domain.pi_ops.switch_to(v);
+
     if ( cpu_has_rdtscp )
         wrmsr_tsc_aux(hvm_msr_tsc_aux(v));
 }
@@ -1148,6 +1155,9 @@ static void noreturn svm_do_resume(struct vcpu *v)
         vmcb_set_vintr(vmcb, intr);
     }
 
+    if ( v->domain->arch.hvm_domain.pi_ops.do_resume )
+        v->domain->arch.hvm_domain.pi_ops.do_resume(v);
+
     hvm_do_resume(v);
 
     reset_stack_and_jump(svm_asm_do_resume);
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 5/8] x86/SVM: Add interrupt management code via AVIC
  2018-04-03 23:01 [PATCH 0/8] Introduce AMD SVM AVIC Janakarajan Natarajan
                   ` (3 preceding siblings ...)
  2018-04-03 23:01 ` [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC Janakarajan Natarajan
@ 2018-04-03 23:01 ` Janakarajan Natarajan
  2018-04-17 13:06   ` Jan Beulich
  2018-04-03 23:01 ` [PATCH 6/8] x86/HVM: Hook up miscellaneous AVIC functions Janakarajan Natarajan
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 35+ messages in thread
From: Janakarajan Natarajan @ 2018-04-03 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, Boris Ostrovsky

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Enabling AVIC implicitly disables the V_IRQ, V_INTR_PRIO, V_IGN_TPR,
and V_INTR_VECTOR fields in the VMCB Control Word. Therefore, this patch
introduces new interrupt injection code using the vAPIC backing page.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
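(Not part of the patch — an illustrative standalone sketch for reviewers.)
The core of svm_avic_deliver_posted_intr() below is a three-way decision
made after setting the vector in the backing page's IRR: do nothing if the
vector was already pending, ring the AVIC doorbell MSR if the target vcpu
is running on another pcpu, otherwise kick it. A minimal model of that
decision (names are illustrative):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the delivery decision: DELIVER_DOORBELL stands
 * for a write to MSR_AMD_AVIC_DOORBELL, DELIVER_KICK for vcpu_kick(). */
enum delivery_action { DELIVER_NONE, DELIVER_DOORBELL, DELIVER_KICK };

static enum delivery_action choose_delivery(bool vec_already_pending,
                                            bool target_running,
                                            bool target_is_self)
{
    if ( vec_already_pending )
        return DELIVER_NONE;        /* IRR bit was already set */
    if ( target_running && !target_is_self )
        return DELIVER_DOORBELL;    /* signal the remote pcpu directly */
    return DELIVER_KICK;            /* force a VM entry to notice IRR */
}
```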
---
 xen/arch/x86/hvm/svm/avic.c        | 29 +++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/intr.c        |  4 ++++
 xen/arch/x86/hvm/svm/svm.c         | 15 +++++++++++++--
 xen/include/asm-x86/hvm/svm/avic.h |  1 +
 xen/include/asm-x86/msr-index.h    |  1 +
 5 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
index 7a25d83954..4786ad472a 100644
--- a/xen/arch/x86/hvm/svm/avic.c
+++ b/xen/arch/x86/hvm/svm/avic.c
@@ -530,6 +530,35 @@ void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs)
     return;
 }
 
+void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vec)
+{
+    struct vlapic *vlapic = vcpu_vlapic(v);
+
+    /* Fallback to use non-AVIC if vcpu is not enabled with AVIC. */
+    if ( !svm_avic_vcpu_enabled(v) )
+    {
+        if ( !vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
+            vcpu_kick(v);
+        return;
+    }
+
+    /* If guest interrupts are disabled (EFLAGS.IF clear), drop the interrupt. */
+    if ( !(guest_cpu_user_regs()->eflags & X86_EFLAGS_IF) )
+        return;
+
+    if ( vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
+        return;
+
+    /*
+     * If vcpu is running on another cpu, hit the doorbell to signal
+     * it to process interrupt. Otherwise, kick it.
+     */
+    if ( v->is_running && (v != current) )
+        wrmsrl(MSR_AMD_AVIC_DOORBELL, cpu_data[v->processor].apicid);
+    else
+        vcpu_kick(v);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/hvm/svm/intr.c b/xen/arch/x86/hvm/svm/intr.c
index 8511ff0b70..8c5a2eb2cd 100644
--- a/xen/arch/x86/hvm/svm/intr.c
+++ b/xen/arch/x86/hvm/svm/intr.c
@@ -29,6 +29,7 @@
 #include <asm/hvm/io.h>
 #include <asm/hvm/support.h>
 #include <asm/hvm/vlapic.h>
+#include <asm/hvm/svm/avic.h>
 #include <asm/hvm/svm/svm.h>
 #include <asm/hvm/svm/intr.h>
 #include <asm/hvm/nestedhvm.h> /* for nestedhvm_vcpu_in_guestmode */
@@ -100,6 +101,9 @@ static void svm_enable_intr_window(struct vcpu *v, struct hvm_intack intack)
     HVMTRACE_3D(INTR_WINDOW, intack.vector, intack.source,
                 vmcb->eventinj.fields.v?vmcb->eventinj.fields.vector:-1);
 
+    if ( svm_avic_vcpu_enabled(v) )
+        return;
+
     /*
      * Create a dummy virtual interrupt to intercept as soon as the
      * guest can accept the real interrupt.
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 641ad0dbc1..09b5676b32 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1144,7 +1144,8 @@ static void noreturn svm_do_resume(struct vcpu *v)
         hvm_asid_flush_vcpu(v);
     }
 
-    if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) )
+    if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) &&
+         !svm_avic_vcpu_enabled(v) )
     {
         vintr_t intr;
 
@@ -1709,6 +1710,9 @@ const struct hvm_function_table * __init start_svm(void)
     if ( !cpu_has_svm_avic )
         svm_avic = 0;
 
+    if ( svm_avic )
+        svm_function_table.deliver_posted_intr  = svm_avic_deliver_posted_intr;
+
 #define P(p,s) if ( p ) { printk(" - %s\n", s); printed = 1; }
     P(cpu_has_svm_npt, "Nested Page Tables (NPT)");
     P(cpu_has_svm_lbrv, "Last Branch Record (LBR) Virtualisation");
@@ -2552,7 +2556,8 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
      * NB. We need to preserve the low bits of the TPR to make checked builds
      * of Windows work, even though they don't actually do anything.
      */
-    if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) )
+    if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) &&
+         !svm_avic_vcpu_enabled(v) )
     {
         intr = vmcb_get_vintr(vmcb);
         vlapic_set_reg(vlapic, APIC_TASKPRI,
@@ -2759,6 +2764,12 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
         u32 general1_intercepts = vmcb_get_general1_intercepts(vmcb);
         intr = vmcb_get_vintr(vmcb);
 
+        if ( svm_avic_vcpu_enabled(v) )
+        {
+            gdprintk(XENLOG_ERR, "Unexpected VINTR intercept with AVIC enabled\n");
+            domain_crash(v->domain);
+        }
+
         intr.fields.irq = 0;
         general1_intercepts &= ~GENERAL1_INTERCEPT_VINTR;
 
diff --git a/xen/include/asm-x86/hvm/svm/avic.h b/xen/include/asm-x86/hvm/svm/avic.h
index 7af850ecbd..a1138d3969 100644
--- a/xen/include/asm-x86/hvm/svm/avic.h
+++ b/xen/include/asm-x86/hvm/svm/avic.h
@@ -36,4 +36,5 @@ int svm_avic_init_vmcb(struct vcpu *v);
 void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs);
 void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs);
 
+void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vector);
 #endif /* _SVM_AVIC_H_ */
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 23ad74399c..44b05b4383 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -187,6 +187,7 @@
 #define MSR_K8_ENABLE_C1E		0xc0010055
 #define MSR_K8_VM_CR			0xc0010114
 #define MSR_K8_VM_HSAVE_PA		0xc0010117
+#define MSR_AMD_AVIC_DOORBELL		0xc001011b
 
 #define MSR_AMD_FAM15H_EVNTSEL0		0xc0010200
 #define MSR_AMD_FAM15H_PERFCTR0		0xc0010201
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 6/8] x86/HVM: Hook up miscellaneous AVIC functions
  2018-04-03 23:01 [PATCH 0/8] Introduce AMD SVM AVIC Janakarajan Natarajan
                   ` (4 preceding siblings ...)
  2018-04-03 23:01 ` [PATCH 5/8] x86/SVM: Add interrupt management code via AVIC Janakarajan Natarajan
@ 2018-04-03 23:01 ` Janakarajan Natarajan
  2018-04-17 13:10   ` Jan Beulich
  2018-04-03 23:01 ` [PATCH 7/8] x86/SVM: Introduce svm command line option Janakarajan Natarajan
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 35+ messages in thread
From: Janakarajan Natarajan @ 2018-04-03 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, Boris Ostrovsky

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

This patch changes hvm_function_table.virtual_intr_delivery_enabled()
from a hook into a bool field, since both VMX and SVM simply return a
static value.

Also, this patch sets virtual_intr_delivery_enabled for SVM when AVIC
is enabled.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
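(Not part of the patch — an illustrative standalone sketch for reviewers.)
The refactoring below replaces a hook that only ever returned a constant
with a plain bool that each backend sets once at start-of-day, avoiding an
indirect call on every query. A minimal model of that pattern (the struct
and names here are illustrative, not the real Xen definitions):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch of the function-table field: set once during
 * backend start-up, read directly thereafter. */
struct hvm_fn_sketch {
    bool virtual_intr_delivery_enabled;
};

static bool vintr_delivery_enabled(const struct hvm_fn_sketch *t)
{
    /* Plain field read; the old code did t->hook() for the same value. */
    return t->virtual_intr_delivery_enabled;
}
```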
---
 xen/arch/x86/hvm/svm/svm.c    |  3 +++
 xen/arch/x86/hvm/vlapic.c     | 12 ++----------
 xen/arch/x86/hvm/vmx/vmx.c    |  8 ++------
 xen/include/asm-x86/hvm/hvm.h |  4 +++-
 4 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 09b5676b32..fdbe8e3008 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1711,7 +1711,10 @@ const struct hvm_function_table * __init start_svm(void)
         svm_avic = 0;
 
     if ( svm_avic )
+    {
         svm_function_table.deliver_posted_intr  = svm_avic_deliver_posted_intr;
+        svm_function_table.virtual_intr_delivery_enabled = svm_avic;
+    }
 
 #define P(p,s) if ( p ) { printk(" - %s\n", s); printed = 1; }
     P(cpu_has_svm_npt, "Nested Page Tables (NPT)");
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 64c4e8dd93..664082e034 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -1258,14 +1258,6 @@ void vlapic_adjust_i8259_target(struct domain *d)
     pt_adjust_global_vcpu_target(v);
 }
 
-int vlapic_virtual_intr_delivery_enabled(void)
-{
-    if ( hvm_funcs.virtual_intr_delivery_enabled )
-        return hvm_funcs.virtual_intr_delivery_enabled();
-    else
-        return 0;
-}
-
 int vlapic_has_pending_irq(struct vcpu *v)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
@@ -1278,7 +1270,7 @@ int vlapic_has_pending_irq(struct vcpu *v)
     if ( irr == -1 )
         return -1;
 
-    if ( vlapic_virtual_intr_delivery_enabled() &&
+    if ( hvm_funcs.virtual_intr_delivery_enabled &&
          !nestedhvm_vcpu_in_guestmode(v) )
         return irr;
 
@@ -1316,7 +1308,7 @@ int vlapic_ack_pending_irq(struct vcpu *v, int vector, bool_t force_ack)
     int isr;
 
     if ( !force_ack &&
-         vlapic_virtual_intr_delivery_enabled() )
+         hvm_funcs.virtual_intr_delivery_enabled )
         return 1;
 
     /* If there's no chance of using APIC assist then bail now. */
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b2fdbf0ef0..1834dc8bd1 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1898,11 +1898,6 @@ static void vmx_update_eoi_exit_bitmap(struct vcpu *v, u8 vector, u8 trig)
         vmx_clear_eoi_exit_bitmap(v, vector);
 }
 
-static int vmx_virtual_intr_delivery_enabled(void)
-{
-    return cpu_has_vmx_virtual_intr_delivery;
-}
-
 static void vmx_process_isr(int isr, struct vcpu *v)
 {
     unsigned long status;
@@ -2281,7 +2276,6 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .nhvm_intr_blocked    = nvmx_intr_blocked,
     .nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources,
     .update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap,
-    .virtual_intr_delivery_enabled = vmx_virtual_intr_delivery_enabled,
     .process_isr          = vmx_process_isr,
     .deliver_posted_intr  = vmx_deliver_posted_intr,
     .sync_pir_to_irr      = vmx_sync_pir_to_irr,
@@ -2421,6 +2415,8 @@ const struct hvm_function_table * __init start_vmx(void)
         vmx_function_table.process_isr = NULL;
         vmx_function_table.handle_eoi = NULL;
     }
+    else
+        vmx_function_table.virtual_intr_delivery_enabled = true;
 
     if ( cpu_has_vmx_posted_intr_processing )
     {
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 2376ed6912..522362775e 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -97,6 +97,9 @@ struct hvm_function_table {
     /* Necessary hardware support for alternate p2m's? */
     bool altp2m_supported;
 
+    /* Hardware virtual interrupt delivery enable? */
+    bool virtual_intr_delivery_enabled;
+
     /* Indicate HAP capabilities. */
     unsigned int hap_capabilities;
 
@@ -195,7 +198,6 @@ struct hvm_function_table {
 
     /* Virtual interrupt delivery */
     void (*update_eoi_exit_bitmap)(struct vcpu *v, u8 vector, u8 trig);
-    int (*virtual_intr_delivery_enabled)(void);
     void (*process_isr)(int isr, struct vcpu *v);
     void (*deliver_posted_intr)(struct vcpu *v, u8 vector);
     void (*sync_pir_to_irr)(struct vcpu *v);
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 7/8] x86/SVM: Introduce svm command line option
  2018-04-03 23:01 [PATCH 0/8] Introduce AMD SVM AVIC Janakarajan Natarajan
                   ` (5 preceding siblings ...)
  2018-04-03 23:01 ` [PATCH 6/8] x86/HVM: Hook up miscellaneous AVIC functions Janakarajan Natarajan
@ 2018-04-03 23:01 ` Janakarajan Natarajan
  2018-04-13 18:04   ` Andrew Cooper
  2018-04-03 23:01 ` [PATCH 8/8] x86/SVM: Add AMD AVIC key handler Janakarajan Natarajan
  2018-04-13 17:04 ` [PATCH 0/8] Introduce AMD SVM AVIC Brian Woods
  8 siblings, 1 reply; 35+ messages in thread
From: Janakarajan Natarajan @ 2018-04-03 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, Janakarajan Natarajan, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Jan Beulich, Boris Ostrovsky

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

This patch introduces a new Xen command line option to enable/disable
SVM sub-options. Currently, it supports the sub-option "avic", which
can be used to enable/disable the SVM AVIC feature.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
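(Not part of the patch — an illustrative standalone sketch for reviewers.)
parse_svm_param() below walks a comma-separated list where each item may
carry a "no-" prefix to negate it, with the last occurrence winning. A
self-contained model of that parsing (function and variable names are
illustrative, not the Xen code):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of "svm=" parsing: scan comma-separated tokens,
 * strip an optional "no-" prefix, and record the value for "avic". */
static void parse_svm_opts(const char *s, bool *avic)
{
    while ( *s )
    {
        bool val = true;
        size_t len;

        if ( strncmp(s, "no-", 3) == 0 )
        {
            val = false;
            s += 3;
        }

        len = strcspn(s, ",");          /* length of this token */
        if ( len == 4 && strncmp(s, "avic", 4) == 0 )
            *avic = val;

        s += len;
        if ( *s == ',' )
            s++;                        /* skip the separator */
    }
}
```

So `svm=avic` enables AVIC, `svm=no-avic` disables it, and in
`svm=avic,no-avic` the later token wins.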
---
 docs/misc/xen-command-line.markdown | 16 ++++++++++++++++
 xen/arch/x86/hvm/svm/svm.c          | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index b353352adf..60a1005c42 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1730,6 +1730,22 @@ enforces the maximum theoretically necessary timeout of 670ms. Any number
 is being interpreted as a custom timeout in milliseconds. Zero or boolean
 false disable the quirk workaround, which is also the default.
 
+### svm
+> `= List of [ avic ]`
+
+> Sub-options:
+
+> All sub-options are of boolean kind and can be prefixed with `no-` to
+> effect the inverse meaning.
+
+> `avic`
+
+> Default: `false`
+
+>> This option enables the Advanced Virtual Interrupt Controller (AVIC),
+>> an extension of AMD Secure Virtual Machine (SVM) which virtualizes
+>> the local APIC for guest VMs.
+
 ### sync\_console
 > `= <boolean>`
 
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index fdbe8e3008..0c5c26cce8 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -64,6 +64,16 @@
 #include <asm/monitor.h>
 #include <asm/xstate.h>
 
+static int parse_svm_param(const char *s);
+
+/*
+ * The 'svm' parameter en/dis-ables various SVM features.
+ * Optional comma separated value may contain:
+ *
+ *   avic - Enable SVM Advanced Virtual Interrupt Controller (AVIC)
+ */
+custom_param("svm", parse_svm_param);
+
 void svm_asm_do_resume(void);
 
 u32 svm_feature_flags;
@@ -89,6 +99,28 @@ static bool_t amd_erratum383_found __read_mostly;
 static uint64_t osvw_length, osvw_status;
 static DEFINE_SPINLOCK(osvw_lock);
 
+static int __init parse_svm_param(const char *s)
+{
+    char *ss;
+    int val;
+
+    do {
+        val = !!strncmp(s, "no-", 3);
+        if ( !val )
+            s += 3;
+
+        ss = strchr(s, ',');
+        if ( ss )
+            *ss = '\0';
+
+        if ( !strcmp(s, "avic") )
+            svm_avic = val;
+
+        s = ss + 1;
+    } while ( ss );
+
+    return 0;
+}
+
 /* Only crash the guest if the problem originates in kernel mode. */
 static void svm_crash_or_fault(struct vcpu *v)
 {
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 8/8] x86/SVM: Add AMD AVIC key handler
  2018-04-03 23:01 [PATCH 0/8] Introduce AMD SVM AVIC Janakarajan Natarajan
                   ` (6 preceding siblings ...)
  2018-04-03 23:01 ` [PATCH 7/8] x86/SVM: Introduce svm command line option Janakarajan Natarajan
@ 2018-04-03 23:01 ` Janakarajan Natarajan
  2018-04-17 13:36   ` Jan Beulich
  2018-04-13 17:04 ` [PATCH 0/8] Introduce AMD SVM AVIC Brian Woods
  8 siblings, 1 reply; 35+ messages in thread
From: Janakarajan Natarajan @ 2018-04-03 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jan Beulich, Boris Ostrovsky

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Add a new key handler, "j", for dumping AVIC-related information.
Here is an example of the per-domain statistics being dumped.

    *********** SVM AVIC Statistics **************
    >>> Domain 1 <<<
        VCPU 0
        * incomp_ipi = 3110
        * noaccel    = 236475
        * post_intr  = 116176
        * doorbell   = 715
        VCPU 1
        * incomp_ipi = 2565
        * noaccel    = 233061
        * post_intr  = 115765
        * doorbell   = 771
    **************************************

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 xen/arch/x86/hvm/svm/avic.c        | 58 +++++++++++++++++++++++++++++++++++++-
 xen/arch/x86/hvm/svm/svm.c         |  1 +
 xen/include/asm-x86/hvm/svm/avic.h |  3 ++
 xen/include/asm-x86/hvm/svm/vmcb.h |  6 ++++
 4 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
index 4786ad472a..7273a5b99f 100644
--- a/xen/arch/x86/hvm/svm/avic.c
+++ b/xen/arch/x86/hvm/svm/avic.c
@@ -28,6 +28,7 @@
 #include <asm/hvm/support.h>
 #include <asm/hvm/svm/avic.h>
 #include <asm/hvm/vlapic.h>
+#include <xen/keyhandler.h>
 #include <asm/p2m.h>
 #include <asm/page.h>
 
@@ -49,6 +50,11 @@
  */
 bool svm_avic = 0;
 
+static inline bool svm_is_avic_domain(struct domain *d)
+{
+    return d->arch.hvm_domain.svm.avic_physical_id_table != NULL;
+}
+
 static struct avic_physical_id_entry *
 avic_get_physical_id_entry(struct svm_domain *d, unsigned int index)
 {
@@ -256,6 +262,8 @@ void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs)
     u32 id = vmcb->exitinfo2 >> 32;
     u32 index = vmcb->exitinfo2 & 0xFF;
 
+    curr->arch.hvm_svm.cnt_avic_incomp_ipi++;
+
     switch ( id )
     {
     case AVIC_INCMP_IPI_ERR_INVALID_INT_TYPE:
@@ -502,6 +510,8 @@ void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs)
     u32 offset = vmcb->exitinfo1 & 0xFF0;
     u32 rw = (vmcb->exitinfo1 >> 32) & 0x1;
 
+    curr->arch.hvm_svm.cnt_avic_noaccel++;
+
     if ( avic_is_trap(offset) )
     {
         /* Handling AVIC Trap (intercept right after the access). */
@@ -549,14 +559,60 @@ void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vec)
     if ( vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
         return;
 
+    v->arch.hvm_svm.cnt_avic_post_intr++;
     /*
      * If vcpu is running on another cpu, hit the doorbell to signal
      * it to process interrupt. Otherwise, kick it.
      */
     if ( v->is_running && (v != current) )
+    {
         wrmsrl(MSR_AMD_AVIC_DOORBELL, cpu_data[v->processor].apicid);
-    else
+        v->arch.hvm_svm.cnt_avic_doorbell++;
+    }
+    else
+    {
         vcpu_kick(v);
+    }
+}
+
+static void avic_dump(unsigned char ch)
+{
+    struct domain *d;
+    struct vcpu *v;
+
+    printk("*********** SVM AVIC Statistics **************\n");
+
+    rcu_read_lock(&domlist_read_lock);
+
+    for_each_domain ( d )
+    {
+        if ( !svm_is_avic_domain(d) )
+            continue;
+
+        printk(">>> Domain %d <<<\n", d->domain_id);
+        for_each_vcpu ( d, v )
+        {
+            printk("\tVCPU %d\n", v->vcpu_id);
+            printk("\t* incomp_ipi = %u\n",
+                   v->arch.hvm_svm.cnt_avic_incomp_ipi);
+            printk("\t* noaccel    = %u\n",
+                   v->arch.hvm_svm.cnt_avic_noaccel);
+            printk("\t* post_intr  = %u\n",
+                   v->arch.hvm_svm.cnt_avic_post_intr);
+            printk("\t* doorbell   = %u\n",
+                   v->arch.hvm_svm.cnt_avic_doorbell);
+        }
+    }
+
+    rcu_read_unlock(&domlist_read_lock);
+
+    printk("**************************************\n");
+}
+
+void __init setup_avic_dump(void)
+{
+    if ( svm_avic )
+        register_keyhandler('j', avic_dump, "dump SVM AVIC stats", 1);
 }
 
 /*
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 0c5c26cce8..e874cbc62d 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1726,6 +1726,7 @@ const struct hvm_function_table * __init start_svm(void)
     }
 
     setup_vmcb_dump();
+    setup_avic_dump();
 
     svm_feature_flags = (current_cpu_data.extended_cpuid_level >= 0x8000000A ?
                          cpuid_edx(0x8000000A) : 0);
diff --git a/xen/include/asm-x86/hvm/svm/avic.h b/xen/include/asm-x86/hvm/svm/avic.h
index a1138d3969..8447ecdcd7 100644
--- a/xen/include/asm-x86/hvm/svm/avic.h
+++ b/xen/include/asm-x86/hvm/svm/avic.h
@@ -37,4 +37,7 @@ void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs);
 void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs);
 
 void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vector);
+
+void setup_avic_dump(void);
+
 #endif /* _SVM_AVIC_H_ */
diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
index c6384ae197..da0e615808 100644
--- a/xen/include/asm-x86/hvm/svm/vmcb.h
+++ b/xen/include/asm-x86/hvm/svm/vmcb.h
@@ -556,6 +556,12 @@ struct arch_svm_struct {
 
     struct avic_physical_id_entry *avic_last_phy_id;
     u32 avic_last_ldr;
+
+    /* AVIC Statistics */
+    u32 cnt_avic_incomp_ipi;
+    u32 cnt_avic_noaccel;
+    u32 cnt_avic_post_intr;
+    u32 cnt_avic_doorbell;
 };
 
 struct vmcb_struct *alloc_vmcb(void);
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [PATCH 0/8] Introduce AMD SVM AVIC
  2018-04-03 23:01 [PATCH 0/8] Introduce AMD SVM AVIC Janakarajan Natarajan
                   ` (7 preceding siblings ...)
  2018-04-03 23:01 ` [PATCH 8/8] x86/SVM: Add AMD AVIC key handler Janakarajan Natarajan
@ 2018-04-13 17:04 ` Brian Woods
  8 siblings, 0 replies; 35+ messages in thread
From: Brian Woods @ 2018-04-13 17:04 UTC (permalink / raw)
  To: Janakarajan Natarajan
  Cc: Jun Nakajima, Tim Deegan, Kevin Tian, Stefano Stabellini,
	Wei Liu, Suravee Suthikulpanit, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Julien Grall, Jan Beulich,
	Boris Ostrovsky

On Tue, Apr 03, 2018 at 06:01:16PM -0500, Janakarajan Natarajan wrote:
> OVERVIEW
> ========
> This patchset is the first of a two-part patch series to introduce
> the AMD Advanced Virtual Interrupt Controller (AVIC) support.
> 
> The AVIC hardware virtualizes local APIC registers of each vCPU via
> the virtual APIC (vAPIC) backing page. This allows the guest to access
> certain APIC registers without the need for emulation of hardware
> behaviour in the hypervisor. More information about AVIC can be found in
> 
> * AMD64 Architecture Programmers Manual Volume 2 - System Programming
>   https://support.amd.com/TechDocs/24593.pdf
> 
> For SVM AVIC, we extend the existing SVM driver to:
> * Check CPUID to detect AVIC support in the processor.
> * Program new fields in VMCB to enable AVIC.
> * Introduce new AVIC data structures and add code to manage them.
> * Handle two new AVIC #VMEXITs.
> * Add new interrupt injection code using vAPIC backing page
>   instead of the existing V_IRQ, V_INTR_PRIO, V_INTR_VECTOR and
>   V_IGN_TPR fields.
> 
> This patchset does not enable AVIC by default since it does not
> yet support nested VMs. Until then, users can enable SVM AVIC by
> specifying Xen parameter svm=avic.
> 
> Later, in Part 2, we will introduce the IOMMU AVIC support, which
> provides speed-up for the PCI device passthrough use case by allowing
> the IOMMU hardware to inject interrupts directly into the guest via
> the vAPIC backing page.
> 
> OVERALL PERFORMANCE
> ===================
> AVIC is available on AMD Family 15h models 6Xh (Carrizo) processors
> and newer. An AMD Family 17h Epyc processor is used to collect the
> performance data shown below.
> 
> Generally, SVM AVIC alone (w/o IOMMU AVIC) should provide an overall
> speed-up for HVM guests, since it avoids #VMEXITs into the
> hypervisor to emulate certain guest accesses to local APIC registers.
> 
> It should also improve performance when the hypervisor wants to
> inject interrupts into a running vcpu. It can do this by setting the 
> corresponding IRR bit in the vAPIC backing page and triggering the
> AVIC_DOORBELL.
> 
> For sending IPI interrupts between running vcpus in a Linux guest,
> Xen defaults to using event channels. However, in case of
> non-paravirtualized guests, AVIC can also provide performance
> improvements for sending IPIs.
> 
> BENCHMARK: HACKBENCH
> ====================
> For measuring IPI performance used for scheduling workload, some
> performance numbers are collected using hackbench.
> 
>     # hackbench -p -l 100000
>     Running in process mode with 10 groups using 40 file descriptors
>     each (== 400 tasks)
>     Each sender will pass 100000 messages of 100 bytes
> 			  |	 3 vcpus (sec)        |
>     --------------------------------------------------
>     No AVIC w/  xen_nopv  |           517             |
>     AVIC w/ xen_nopv      |           173             |
>     No AVIC w/o xen_nopv  |           141             |
>     AVIC w/o xen_nopv     |           135             |
> 
> Each benchmark test was averaged over 10 runs.
> 
> CURRENT UNTESTED USE_CASES
> ==========================
> * Nested VM
> 
> Any feedback and comments are very much appreciated.
> 
> Suravee Suthikulpanit (8):
>   x86/SVM: Modify VMCB fields to add AVIC support
>   x86/HVM/SVM: Add AVIC initialization code
>   x86/SVM: Add AVIC vmexit handlers
>   x86/SVM: Add vcpu scheduling support for AVIC
>   x86/SVM: Add interrupt management code via AVIC
>   x86/HVM: Hook up miscellaneous AVIC functions
>   x86/SVM: Introduce svm command line option
>   x86/SVM: Add AMD AVIC key handler
> 
>  docs/misc/xen-command-line.markdown |  16 +
>  xen/arch/x86/hvm/svm/Makefile       |   1 +
>  xen/arch/x86/hvm/svm/avic.c         | 626 ++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/hvm/svm/intr.c         |   4 +
>  xen/arch/x86/hvm/svm/svm.c          |  77 ++++-
>  xen/arch/x86/hvm/svm/vmcb.c         |   3 +
>  xen/arch/x86/hvm/vlapic.c           |  20 +-
>  xen/arch/x86/hvm/vmx/vmx.c          |   8 +-
>  xen/include/asm-x86/hvm/hvm.h       |   4 +-
>  xen/include/asm-x86/hvm/svm/avic.h  |  43 +++
>  xen/include/asm-x86/hvm/svm/svm.h   |   2 +
>  xen/include/asm-x86/hvm/svm/vmcb.h  |  52 ++-
>  xen/include/asm-x86/hvm/vlapic.h    |   4 +
>  xen/include/asm-x86/msr-index.h     |   1 +
>  14 files changed, 831 insertions(+), 30 deletions(-)
>  create mode 100644 xen/arch/x86/hvm/svm/avic.c
>  create mode 100644 xen/include/asm-x86/hvm/svm/avic.h
> 
> -- 
> 2.11.0
> 

Not a Xen expert by any means but I've looked over the patch set and
nothing jumps out as wrong.

Reviewed-by: Brian Woods <brian.woods@amd.com>

-- 
Brian Woods



* Re: [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code
  2018-04-03 23:01 ` [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code Janakarajan Natarajan
@ 2018-04-13 17:35   ` Andrew Cooper
  2018-04-19 15:46     ` Natarajan, Janakarajan
  2018-04-16 15:55   ` Jan Beulich
  2018-04-23 19:33   ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 35+ messages in thread
From: Andrew Cooper @ 2018-04-13 17:35 UTC (permalink / raw)
  To: Janakarajan Natarajan, xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, George Dunlap, Tim Deegan, Ian Jackson,
	Julien Grall, Jan Beulich, Boris Ostrovsky

On 04/04/18 00:01, Janakarajan Natarajan wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>
> Introduce AVIC base initialization code. This includes:
>     * Setting up per-VM data structures.
>     * Setting up per-vCPU data structure.
>     * Initializing AVIC-related VMCB bit fields.
>
> This patch also introduces a new Xen parameter (svm-avic),
> which can be used to enable/disable AVIC support.
> Currently, svm-avic is disabled by default.
>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
> ---
>  xen/arch/x86/hvm/svm/Makefile      |   1 +
>  xen/arch/x86/hvm/svm/avic.c        | 191 +++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/hvm/svm/svm.c         |   8 +-
>  xen/arch/x86/hvm/svm/vmcb.c        |   3 +
>  xen/arch/x86/hvm/vlapic.c          |   4 +
>  xen/include/asm-x86/hvm/svm/avic.h |  36 +++++++
>  xen/include/asm-x86/hvm/svm/svm.h  |   2 +
>  xen/include/asm-x86/hvm/svm/vmcb.h |  17 ++++
>  8 files changed, 261 insertions(+), 1 deletion(-)
>  create mode 100644 xen/arch/x86/hvm/svm/avic.c
>  create mode 100644 xen/include/asm-x86/hvm/svm/avic.h
>
> diff --git a/xen/arch/x86/hvm/svm/Makefile b/xen/arch/x86/hvm/svm/Makefile
> index 760d2954da..e0e4a59f7d 100644
> --- a/xen/arch/x86/hvm/svm/Makefile
> +++ b/xen/arch/x86/hvm/svm/Makefile
> @@ -1,4 +1,5 @@
>  obj-y += asid.o
> +obj-y += avic.o
>  obj-y += emulate.o
>  obj-bin-y += entry.o
>  obj-y += intr.o
> diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
> new file mode 100644
> index 0000000000..8108698911
> --- /dev/null
> +++ b/xen/arch/x86/hvm/svm/avic.c
> @@ -0,0 +1,191 @@
> +/*
> + * avic.c: implements AMD Advanced Virtual Interrupt Controller (AVIC) support
> + * Copyright (c) 2016, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/domain_page.h>
> +#include <xen/sched.h>
> +#include <xen/stdbool.h>
> +#include <asm/acpi.h>
> +#include <asm/apicdef.h>
> +#include <asm/atomic.h>
> +#include <asm/event.h>
> +#include <asm/hvm/emulate.h>
> +#include <asm/hvm/nestedhvm.h>
> +#include <asm/hvm/support.h>
> +#include <asm/hvm/svm/avic.h>
> +#include <asm/hvm/vlapic.h>
> +#include <asm/p2m.h>
> +#include <asm/page.h>
> +
> +/*
> + * Note: Current max index allowed for physical APIC ID table is 255.
> + */
> +#define AVIC_PHY_APIC_ID_MAX    0xFF
> +
> +#define AVIC_VAPIC_BAR_MASK     (((1ULL << 40) - 1) << PAGE_SHIFT)

This constant should hopefully be stale (following the debugging), and
should be dropped.

> +
> +/*
> + * Note:
> + * Currently, svm-avic mode is not supported with nested virtualization.
> + * Therefore, it is not yet currently enabled by default. Once the support
> + * is in-place, this should be enabled by default.
> + */
> +bool svm_avic = 0;
> +
> +static struct avic_physical_id_entry *
> +avic_get_physical_id_entry(struct svm_domain *d, unsigned int index)
> +{
> +    if ( !d->avic_physical_id_table )
> +        return NULL;
> +
> +    /*
> +    * Note: APIC ID = 0xFF is used for broadcast.
> +    *       APIC ID > 0xFF is reserved.
> +    */
> +    ASSERT(index < AVIC_PHY_APIC_ID_MAX);

Please also have:

if ( index >= AVIC_PHY_APIC_ID_MAX )
    return NULL; /* Avoid out-of-bounds access in release builds. */

> +
> +    return &d->avic_physical_id_table[index];
> +}
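Incorporating the release-build check suggested above, the lookup would end up along these lines. This is a standalone sketch with stubbed-down types (Xen's ASSERT() is elided to a no-op so the release-build path can be exercised), not the final code:

```c
#include <assert.h>
#include <stddef.h>

#define AVIC_PHY_APIC_ID_MAX 0xFF
#define ASSERT(x) ((void)0)  /* debug-build assertion, elided in this sketch */

/* Stubbed-down types, just enough to exercise the lookup. */
struct avic_physical_id_entry { unsigned long long raw; };
struct svm_domain { struct avic_physical_id_entry *avic_physical_id_table; };

static struct avic_physical_id_entry *
avic_get_physical_id_entry(struct svm_domain *d, unsigned int index)
{
    if ( !d->avic_physical_id_table )
        return NULL;

    /* APIC ID 0xFF is broadcast; IDs above that are reserved. */
    ASSERT(index < AVIC_PHY_APIC_ID_MAX);
    if ( index >= AVIC_PHY_APIC_ID_MAX )
        return NULL; /* avoid out-of-bounds access in release builds */

    return &d->avic_physical_id_table[index];
}
```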
> +
> +int svm_avic_dom_init(struct domain *d)
> +{
> +    int ret = 0;
> +    struct page_info *pg;
> +
> +    if ( !svm_avic || !has_vlapic(d) )
> +        return 0;
> +
> +    /*
> +     * Note:
> +     * AVIC hardware walks the nested page table to check permissions,
> +     * but does not use the SPA address specified in the leaf page
> +     * table entry since it uses  address in the AVIC_BACKING_PAGE pointer
> +     * field of the VMCB. Therefore, we set up a dummy page for APIC _mfn(0).
> +     */
> +    set_mmio_p2m_entry(d, paddr_to_pfn(APIC_DEFAULT_PHYS_BASE),
> +                       _mfn(0), PAGE_ORDER_4K,
> +                       p2m_get_hostp2m(d)->default_access);

Is default access right here, rather than blanket R/W?  IIRC, the frame
is magic in the pipeline and causes redirections to the APIC handling,
so the permissions of the page don't take effect.

> +
> +    /* Init AVIC logical APIC ID table */
> +    pg = alloc_domheap_page(d, MEMF_no_owner);
> +    if ( !pg )
> +    {
> +        ret = -ENOMEM;
> +        goto err_out;
> +    }
> +    clear_domain_page(_mfn(page_to_mfn(pg)));

You'll need to rebase your changes onto the latest staging.  Julien
finished the tree-wide cleanup of page_to_mfn() and changed its type, so
you'll need to drop _mfn() wrappers, or insert mfn_x() wrappers.

> +    d->arch.hvm_domain.svm.avic_logical_id_table_pg = pg;
> +    d->arch.hvm_domain.svm.avic_logical_id_table = __map_domain_page_global(pg);
> +
> +    /* Init AVIC physical APIC ID table */
> +    pg = alloc_domheap_page(d, MEMF_no_owner);
> +    if ( !pg )
> +    {
> +        ret = -ENOMEM;
> +        goto err_out;
> +    }
> +    clear_domain_page(_mfn(page_to_mfn(pg)));
> +    d->arch.hvm_domain.svm.avic_physical_id_table_pg = pg;
> +    d->arch.hvm_domain.svm.avic_physical_id_table = __map_domain_page_global(pg);
> +
> +    return ret;
> + err_out:
> +    svm_avic_dom_destroy(d);
> +    return ret;
> +}
> +
> +void svm_avic_dom_destroy(struct domain *d)
> +{
> +    if ( !svm_avic || !has_vlapic(d) )
> +        return;
> +
> +    if ( d->arch.hvm_domain.svm.avic_physical_id_table )
> +    {
> +        unmap_domain_page_global(d->arch.hvm_domain.svm.avic_physical_id_table);
> +        free_domheap_page(d->arch.hvm_domain.svm.avic_physical_id_table_pg);
> +        d->arch.hvm_domain.svm.avic_physical_id_table_pg = 0;
> +        d->arch.hvm_domain.svm.avic_physical_id_table = 0;
> +    }
> +
> +    if ( d->arch.hvm_domain.svm.avic_logical_id_table)
> +    {
> +        unmap_domain_page_global(d->arch.hvm_domain.svm.avic_logical_id_table);
> +        free_domheap_page(d->arch.hvm_domain.svm.avic_logical_id_table_pg);
> +        d->arch.hvm_domain.svm.avic_logical_id_table_pg = 0;
> +        d->arch.hvm_domain.svm.avic_logical_id_table = 0;
> +    }
> +}
> +
> +bool svm_avic_vcpu_enabled(const struct vcpu *v)
> +{
> +    const struct arch_svm_struct *s = &v->arch.hvm_svm;
> +    const struct vmcb_struct *vmcb = s->vmcb;
> +
> +    return vmcb->_vintr.fields.avic_enable;
> +}
> +
> +int svm_avic_init_vmcb(struct vcpu *v)
> +{
> +    u32 apic_id;
> +    unsigned long tmp;
> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
> +    struct vmcb_struct *vmcb = s->vmcb;
> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
> +    const struct vlapic *vlapic = vcpu_vlapic(v);
> +    struct avic_physical_id_entry *entry = (struct avic_physical_id_entry *)&tmp;

This is still broken.  Anywhere where type punning like this is going on
needs fixing.

> +
> +    if ( !svm_avic || !has_vlapic(v->domain) )
> +        return 0;
> +
> +    if ( !vlapic || !vlapic->regs_page )
> +        return -EINVAL;
> +
> +    apic_id = vlapic_read_aligned(vcpu_vlapic(v), APIC_ID);
> +    s->avic_last_phy_id = avic_get_physical_id_entry(d, GET_xAPIC_ID(apic_id));
> +    if ( !s->avic_last_phy_id )
> +        return -EINVAL;
> +
> +    vmcb->avic_bk_pg_pa = page_to_maddr(vlapic->regs_page);
> +    vmcb->avic_logical_id_table_pa = domain_page_map_to_mfn(d->avic_logical_id_table) << PAGE_SHIFT;
> +    vmcb->avic_physical_id_table_pa = domain_page_map_to_mfn(d->avic_physical_id_table) << PAGE_SHIFT;
> +
> +    /* Set Physical ID Table Pointer [7:0] to max apic id of the domain */
> +    vmcb->avic_logical_id_table_pa &= ~AVIC_PHY_APIC_ID_MAX;
> +    vmcb->avic_physical_id_table_pa |= (v->domain->max_vcpus * 2) & 0xFF;
> +
> +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
> +    entry->bk_pg_ptr_mfn = (vmcb->avic_bk_pg_pa) >> PAGE_SHIFT;
> +    entry->is_running = 0;
> +    entry->valid = 1;
> +    write_atomic((u64*)(s->avic_last_phy_id), tmp);

This is an init function, and it will be called sequentially for each
allocated vcpu.  The guest is guaranteed not to be executing, so there
is no need for any special interlocking around s->avic_last_phy_id.

> +
> +    vmcb->avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & AVIC_VAPIC_BAR_MASK;
> +    vmcb->cleanbits.fields.avic = 0;
> +
> +    vmcb->_vintr.fields.avic_enable = 1;
> +
> +    return 0;
> +}
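One way to avoid the cast-through-unsigned-long punning flagged above is a union carrying a raw member; and since the vCPU cannot yet be running when its VMCB is initialised, a plain read-modify-write of the entry suffices. A standalone sketch (field widths taken from the patch's avic_physical_id_entry; bit-field layout is compiler/ABI-specific):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

typedef union avic_physical_id_entry {
    uint64_t raw;
    struct {
        uint64_t host_phy_apic_id : 8;
        uint64_t res1             : 4;
        uint64_t bk_pg_ptr_mfn    : 40;
        uint64_t res2             : 10;
        uint64_t is_running       : 1;
        uint64_t valid            : 1;
    };
} avic_physical_id_entry_t;

/* Point the entry at the vAPIC backing page and mark it valid but not
 * running; a single 64-bit store publishes the whole entry. */
static void avic_init_phy_id_entry(avic_physical_id_entry_t *e,
                                   uint64_t bk_pg_maddr)
{
    avic_physical_id_entry_t tmp = { .raw = e->raw };

    tmp.bk_pg_ptr_mfn = bk_pg_maddr >> PAGE_SHIFT;
    tmp.is_running = 0;
    tmp.valid = 1;
    e->raw = tmp.raw;
}
```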
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index 8538232f68..b445f59ada 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -47,6 +47,7 @@
>  #include <asm/hvm/svm/asid.h>
>  #include <asm/hvm/svm/svm.h>
>  #include <asm/hvm/svm/vmcb.h>
> +#include <asm/hvm/svm/avic.h>
>  #include <asm/hvm/svm/emulate.h>
>  #include <asm/hvm/svm/intr.h>
>  #include <asm/hvm/svm/svmdebug.h>
> @@ -1225,11 +1226,12 @@ static int svm_domain_initialise(struct domain *d)
>  
>      d->arch.ctxt_switch = &csw;
>  
> -    return 0;
> +    return svm_avic_dom_init(d);
>  }
>  
>  static void svm_domain_destroy(struct domain *d)
>  {
> +    svm_avic_dom_destroy(d);
>  }
>  
>  static int svm_vcpu_initialise(struct vcpu *v)
> @@ -1694,6 +1696,9 @@ const struct hvm_function_table * __init start_svm(void)
>      if ( cpu_has_tsc_ratio )
>          svm_function_table.tsc_scaling.ratio_frac_bits = 32;
>  
> +    if ( !cpu_has_svm_avic )
> +        svm_avic = 0;
> +
>  #define P(p,s) if ( p ) { printk(" - %s\n", s); printed = 1; }
>      P(cpu_has_svm_npt, "Nested Page Tables (NPT)");
>      P(cpu_has_svm_lbrv, "Last Branch Record (LBR) Virtualisation");
> @@ -1705,6 +1710,7 @@ const struct hvm_function_table * __init start_svm(void)
>      P(cpu_has_pause_filter, "Pause-Intercept Filter");
>      P(cpu_has_pause_thresh, "Pause-Intercept Filter Threshold");
>      P(cpu_has_tsc_ratio, "TSC Rate MSR");
> +    P(cpu_has_svm_avic, svm_avic ? "AVIC (enabled)" : "AVIC (disabled)");
>  #undef P
>  
>      if ( !printed )
> diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
> index ae60d8dc1c..7ade023cfe 100644
> --- a/xen/arch/x86/hvm/svm/vmcb.c
> +++ b/xen/arch/x86/hvm/svm/vmcb.c
> @@ -27,6 +27,7 @@
>  #include <asm/msr-index.h>
>  #include <asm/p2m.h>
>  #include <asm/hvm/support.h>
> +#include <asm/hvm/svm/avic.h>
>  #include <asm/hvm/svm/svm.h>
>  #include <asm/hvm/svm/svmdebug.h>
>  
> @@ -215,6 +216,8 @@ static int construct_vmcb(struct vcpu *v)
>              vmcb->_pause_filter_thresh = SVM_PAUSETHRESH_INIT;
>      }
>  
> +    svm_avic_init_vmcb(v);
> +
>      vmcb->cleanbits.bytes = 0;
>  
>      return 0;
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 1b9f00a0e4..a4472200a6 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -1597,6 +1597,10 @@ int vlapic_init(struct vcpu *v)
>  
>      if (vlapic->regs_page == NULL)
>      {
> +        /*
> +         * SVM AVIC depends on the vlapic->regs_page being a full
> +         * page allocation as it is also used for vAPIC backing page.
> +         */
>          vlapic->regs_page = alloc_domheap_page(v->domain, MEMF_no_owner);
>          if ( vlapic->regs_page == NULL )
>          {
> diff --git a/xen/include/asm-x86/hvm/svm/avic.h b/xen/include/asm-x86/hvm/svm/avic.h
> new file mode 100644
> index 0000000000..32bb9a91e8
> --- /dev/null
> +++ b/xen/include/asm-x86/hvm/svm/avic.h
> @@ -0,0 +1,36 @@
> +#ifndef _SVM_AVIC_H_
> +#define _SVM_AVIC_H_
> +
> +#include <xen/compiler.h>
> +
> +enum avic_incmp_ipi_err_code {
> +    AVIC_INCMP_IPI_ERR_INVALID_INT_TYPE,
> +    AVIC_INCMP_IPI_ERR_TARGET_NOT_RUN,
> +    AVIC_INCMP_IPI_ERR_INV_TARGET,
> +    AVIC_INCMP_IPI_ERR_INV_BK_PAGE,
> +};
> +
> +struct __packed avic_logical_id_entry {
> +    u32 guest_phy_apic_id : 8;
> +    u32 res               : 23;
> +    u32 valid             : 1;
> +};
> +
> +struct __packed avic_physical_id_entry {
> +    u64 host_phy_apic_id  : 8;
> +    u64 res1              : 4;
> +    u64 bk_pg_ptr_mfn     : 40;
> +    u64 res2              : 10;
> +    u64 is_running        : 1;
> +    u64 valid             : 1;
> +};

For help with your type-punning fixes in later patches, you'll want
these as a union.

typedef union avic_logical_id_entry {
    uint32_t raw;
    struct {
        ...
    };
} avic_logical_id_entry_t;

~Andrew

> +
> +extern bool svm_avic;
> +
> +int svm_avic_dom_init(struct domain *d);
> +void svm_avic_dom_destroy(struct domain *d);
> +
> +bool svm_avic_vcpu_enabled(const struct vcpu *v);
> +int svm_avic_init_vmcb(struct vcpu *v);
> +
> +#endif /* _SVM_AVIC_H_ */
> diff --git a/xen/include/asm-x86/hvm/svm/svm.h b/xen/include/asm-x86/hvm/svm/svm.h
> index 4e5e142910..983750fee9 100644
> --- a/xen/include/asm-x86/hvm/svm/svm.h
> +++ b/xen/include/asm-x86/hvm/svm/svm.h
> @@ -65,6 +65,7 @@ extern u32 svm_feature_flags;
>  #define SVM_FEATURE_DECODEASSISTS  7 /* Decode assists support */
>  #define SVM_FEATURE_PAUSEFILTER   10 /* Pause intercept filter support */
>  #define SVM_FEATURE_PAUSETHRESH   12 /* Pause intercept filter support */
> +#define SVM_FEATURE_AVIC          13 /* AVIC Support */
>  #define SVM_FEATURE_VLOADSAVE     15 /* virtual vmload/vmsave */
>  #define SVM_FEATURE_VGIF          16 /* Virtual GIF */
>  
> @@ -79,6 +80,7 @@ extern u32 svm_feature_flags;
>  #define cpu_has_pause_filter  cpu_has_svm_feature(SVM_FEATURE_PAUSEFILTER)
>  #define cpu_has_pause_thresh  cpu_has_svm_feature(SVM_FEATURE_PAUSETHRESH)
>  #define cpu_has_tsc_ratio     cpu_has_svm_feature(SVM_FEATURE_TSCRATEMSR)
> +#define cpu_has_svm_avic      cpu_has_svm_feature(SVM_FEATURE_AVIC)
>  #define cpu_has_svm_vloadsave cpu_has_svm_feature(SVM_FEATURE_VLOADSAVE)
>  
>  #define SVM_PAUSEFILTER_INIT    4000
> diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
> index 591d98fc8c..386aad2260 100644
> --- a/xen/include/asm-x86/hvm/svm/vmcb.h
> +++ b/xen/include/asm-x86/hvm/svm/vmcb.h
> @@ -500,6 +500,21 @@ struct vmcb_struct {
>  };
>  
>  struct svm_domain {
> +    /*
> +     * This per-domain table is used by the hardware to locate
> +     * the vAPIC backing page to be used to deliver interrupts
> +     * based on the guest physical APIC ID.
> +     */
> +    struct avic_physical_id_entry *avic_physical_id_table;
> +    struct page_info *avic_physical_id_table_pg;
> +
> +    /*
> +     * This per-domain table is used by the hardware to map
> +     * logically addressed interrupt requests (w/ guest logical APIC id)
> +     * to the guest physical APIC ID.
> +     */
> +    struct avic_logical_id_entry *avic_logical_id_table;
> +    struct page_info *avic_logical_id_table_pg;
>  };
>  
>  struct arch_svm_struct {
> @@ -533,6 +548,8 @@ struct arch_svm_struct {
>          u64 length;
>          u64 status;
>      } osvw;
> +
> +    struct avic_physical_id_entry *avic_last_phy_id;
>  };
>  
>  struct vmcb_struct *alloc_vmcb(void);




* Re: [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers
  2018-04-03 23:01 ` [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers Janakarajan Natarajan
@ 2018-04-13 17:48   ` Andrew Cooper
  2018-04-30 21:23     ` Natarajan, Janakarajan
  2018-04-17 12:58   ` Jan Beulich
  1 sibling, 1 reply; 35+ messages in thread
From: Andrew Cooper @ 2018-04-13 17:48 UTC (permalink / raw)
  To: Janakarajan Natarajan, xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, George Dunlap, Tim Deegan, Ian Jackson,
	Julien Grall, Jan Beulich, Boris Ostrovsky

On 04/04/18 00:01, Janakarajan Natarajan wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>
> AVIC introduces two new #vmexit handlers:
>
> VMEXIT_INCOMP_IPI:
> This occurs when an IPI could not be delivered to all targeted guest
> virtual processors because at least one guest virtual processor
> was not allocated to a physical core at the time. In this case,
> Xen would try to emulate IPI.
>
> VMEXIT_DO_NOACCEL:
> This occurs when a guest accesses an APIC register that cannot be
> accelerated by AVIC. In this case, Xen tries to emulate register accesses.
>
> This fault is also generated if an EOI is attempted when the highest priority
> in-service interrupt is set for level-triggered mode.
>
> This patch also declares vlapic_read_aligned() and vlapic_reg_write()
> as non-static, so that they can be used by the AVIC code.
>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
> ---
>  xen/arch/x86/hvm/svm/avic.c        | 296 +++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/hvm/svm/svm.c         |   8 +
>  xen/arch/x86/hvm/vlapic.c          |   4 +-
>  xen/include/asm-x86/hvm/svm/avic.h |   3 +
>  xen/include/asm-x86/hvm/svm/vmcb.h |   6 +
>  xen/include/asm-x86/hvm/vlapic.h   |   4 +
>  6 files changed, 319 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
> index 8108698911..e112469774 100644
> --- a/xen/arch/x86/hvm/svm/avic.c
> +++ b/xen/arch/x86/hvm/svm/avic.c
> @@ -19,6 +19,7 @@
>  #include <xen/sched.h>
>  #include <xen/stdbool.h>
>  #include <asm/acpi.h>
> +#include <asm/apic.h>
>  #include <asm/apicdef.h>
>  #include <asm/atomic.h>
>  #include <asm/event.h>
> @@ -37,6 +38,8 @@
>  
>  #define AVIC_VAPIC_BAR_MASK     (((1ULL << 40) - 1) << PAGE_SHIFT)
>  
> +#define AVIC_UNACCEL_ACCESS_OFFSET_MASK    0xFF0
> +
>  /*
>   * Note:
>   * Currently, svm-avic mode is not supported with nested virtualization.
> @@ -101,6 +104,8 @@ int svm_avic_dom_init(struct domain *d)
>      d->arch.hvm_domain.svm.avic_physical_id_table_pg = pg;
>      d->arch.hvm_domain.svm.avic_physical_id_table = __map_domain_page_global(pg);
>  
> +    spin_lock_init(&d->arch.hvm_domain.svm.avic_dfr_mode_lock);
> +
>      return ret;
>   err_out:
>      svm_avic_dom_destroy(d);
> @@ -181,6 +186,297 @@ int svm_avic_init_vmcb(struct vcpu *v)
>  }
>  
>  /*
> + * Note:
> + * This function handles the AVIC_INCOMP_IPI #vmexit when AVIC is enabled.
> + * The hardware generates this fault when an IPI could not be delivered
> + * to all targeted guest virtual processors because at least one guest
> + * virtual processor was not allocated to a physical core at the time.
> + */
> +void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs)
> +{
> +    struct vcpu *curr = current;
> +    struct domain *currd = curr->domain;
> +    struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
> +    u32 icrh = vmcb->exitinfo1 >> 32;
> +    u32 icrl = vmcb->exitinfo1;
> +    u32 id = vmcb->exitinfo2 >> 32;
> +    u32 index = vmcb->exitinfo2 && 0xFF;
> +
> +    switch ( id )
> +    {
> +    case AVIC_INCMP_IPI_ERR_INVALID_INT_TYPE:
> +        /*
> +         * AVIC hardware handles the delivery of
> +         * IPIs when the specified Message Type is Fixed
> +         * (also known as fixed delivery mode) and
> +         * the Trigger Mode is edge-triggered. The hardware
> +         * also supports self and broadcast delivery modes
> +         * specified via the Destination Shorthand(DSH)
> +         * field of the ICRL. Logical and physical APIC ID
> +         * formats are supported. All other IPI types cause
> +         * a #VMEXIT, which needs to emulated.
> +         */
> +        vlapic_reg_write(curr, APIC_ICR2, icrh);
> +        vlapic_reg_write(curr, APIC_ICR, icrl);
> +        break;
> +
> +    case AVIC_INCMP_IPI_ERR_TARGET_NOT_RUN:
> +    {
> +        /*
> +         * At this point, we expect that the AVIC HW has already
> +         * set the appropriate IRR bits on the valid target
> +         * vcpus. So, we just need to kick the appropriate vcpu.
> +         */
> +        struct vcpu *v;
> +        uint32_t dest = GET_xAPIC_DEST_FIELD(icrh);
> +        uint32_t short_hand = icrl & APIC_SHORT_MASK;
> +        bool dest_mode = !!(icrl & APIC_DEST_MASK);

No need for !!.  It is the explicit behaviour of the bool type.

> +
> +        for_each_vcpu ( currd,  v )
> +        {
> +            if ( v != curr &&
> +                 vlapic_match_dest(vcpu_vlapic(v), vcpu_vlapic(curr),
> +                                   short_hand, dest, dest_mode) )
> +            {
> +                vcpu_kick(v);
> +                break;
> +            }
> +        }
> +        break;
> +    }
> +
> +    case AVIC_INCMP_IPI_ERR_INV_TARGET:
> +        gprintk(XENLOG_ERR,
> +                "SVM: %s: Invalid IPI target (icr=%#08x:%08x, idx=%u)\n",
> +                __func__, icrh, icrl, index);
> +        domain_crash(currd);
> +        break;
> +
> +    case AVIC_INCMP_IPI_ERR_INV_BK_PAGE:
> +        gprintk(XENLOG_ERR,
> +                "SVM: %s: Invalid bk page (icr=%#08x:%08x, idx=%u)\n",
> +                __func__, icrh, icrl, index);
> +        domain_crash(currd);
> +        break;
> +
> +    default:
> +        gprintk(XENLOG_ERR, "SVM: %s: Unknown IPI interception (%#x)\n",
> +                __func__, id);
> +        domain_crash(currd);
> +    }
> +}
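For reference, the exit-information decoding at the top of this handler can be checked standalone. Note that `vmcb->exitinfo2 && 0xFF` in the hunk above reads as a logical AND where a bitwise mask (`& 0xFF`) appears intended; the sketch below uses the bitwise form (field layout as in the patch, function name hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Decode the AVIC incomplete-IPI exit information words:
 * ICR high/low from exitinfo1, error id and table index from exitinfo2. */
static void decode_incomp_ipi(uint64_t exitinfo1, uint64_t exitinfo2,
                              uint32_t *icrh, uint32_t *icrl,
                              uint32_t *id, uint32_t *index)
{
    *icrh  = exitinfo1 >> 32;
    *icrl  = (uint32_t)exitinfo1;
    *id    = exitinfo2 >> 32;
    *index = exitinfo2 & 0xFF;  /* bitwise mask, not logical && */
}
```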
> +
> +static struct avic_logical_id_entry *
> +avic_get_logical_id_entry(struct svm_domain *d, u32 ldr, bool flat)
> +{
> +    unsigned int index;
> +    unsigned int dest_id = GET_xAPIC_LOGICAL_ID(ldr);
> +
> +    if ( !dest_id )
> +        return NULL;
> +
> +    if ( flat )
> +    {
> +        index = ffs(dest_id) - 1;
> +        if ( index > 7 )
> +            return NULL;
> +    }
> +    else
> +    {
> +        unsigned int cluster = (dest_id & 0xf0) >> 4;
> +        int apic = ffs(dest_id & 0x0f) - 1;
> +
> +        if ( (apic < 0) || (apic > 7) || (cluster >= 0xf) )
> +            return NULL;
> +        index = (cluster << 2) + apic;
> +    }
> +
> +    ASSERT(index <= 255);
> +
> +    return &d->avic_logical_id_table[index];
> +}
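The flat/cluster index computation above can also be exercised on its own. A sketch operating on the already-extracted 8-bit logical destination (flat mode: one bit per CPU; cluster mode: high nibble selects the cluster, low nibble the CPU within it), returning -1 where the original returns NULL:

```c
#include <assert.h>

/* Hypothetical standalone version of the index computation in
 * avic_get_logical_id_entry(). */
static int avic_logical_index(unsigned int dest_id, int flat)
{
    if ( !dest_id )
        return -1;

    if ( flat )
    {
        int index = __builtin_ffs(dest_id) - 1;

        return index > 7 ? -1 : index;
    }
    else
    {
        unsigned int cluster = (dest_id & 0xf0) >> 4;
        int apic = __builtin_ffs(dest_id & 0x0f) - 1;

        if ( apic < 0 || apic > 7 || cluster >= 0xf )
            return -1;
        return (cluster << 2) + apic;
    }
}
```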
> +
> +static int avic_ldr_write(struct vcpu *v, u8 g_phy_id, u32 ldr, bool valid)
> +{
> +    struct avic_logical_id_entry *entry, new_entry;
> +    u32 dfr = vlapic_read_aligned(vcpu_vlapic(v), APIC_DFR);
> +
> +    entry = avic_get_logical_id_entry(&v->domain->arch.hvm_domain.svm,
> +                                      ldr, (dfr == APIC_DFR_FLAT));
> +    if (!entry)

if ( !entry )

> +        return -EINVAL;
> +
> +    new_entry = *entry;
> +    smp_rmb();
> +    new_entry.guest_phy_apic_id = g_phy_id;
> +    new_entry.valid = valid;
> +    *entry = new_entry;
> +    smp_wmb();

These barriers don't do what you want.  The pattern you are looking for
would require an smp_mb() between setting valid and writing things
back.  However, that is overkill - all that matters is that the compiler
doesn't generate multiple partial updates.

new_entry.raw = ACCESS_ONCE(entry->raw);

new_entry.guest_phy_apic_id = g_phy_id;
new_entry.valid = valid;

ACCESS_ONCE(entry->raw) = new_entry.raw;

> +
> +    return 0;
> +}
> +
> +static int avic_handle_ldr_update(struct vcpu *v)
> +{
> +    int ret = 0;
> +    u32 ldr = vlapic_read_aligned(vcpu_vlapic(v), APIC_LDR);
> +    u32 apic_id = vlapic_read_aligned(vcpu_vlapic(v), APIC_ID);
> +
> +    if ( !ldr )
> +        return -EINVAL;
> +
> +    ret = avic_ldr_write(v, GET_xAPIC_ID(apic_id), ldr, true);
> +    if ( ret && v->arch.hvm_svm.avic_last_ldr )
> +    {
> +        /*
> +         * Note:
> +         * In case of failure to update the LDR register,
> +         * we set the guest physical APIC ID to 0, and
> +         * mark the logical APIC ID entry as invalid
> +         * (false).
> +         */
> +        avic_ldr_write(v, 0, v->arch.hvm_svm.avic_last_ldr, false);
> +        v->arch.hvm_svm.avic_last_ldr = 0;
> +    }
> +    else
> +    {
> +        /*
> +         * Note:
> +         * This saves the last valid LDR so that we
> +         * know which entry in the local APIC ID
> +         * to clean up when the LDR is updated.
> +         */
> +        v->arch.hvm_svm.avic_last_ldr = ldr;
> +    }
> +
> +    return ret;
> +}
> +
> +static int avic_handle_dfr_update(struct vcpu *v)
> +{
> +    u32 mod;
> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
> +    u32 dfr = vlapic_read_aligned(vcpu_vlapic(v), APIC_DFR);
> +
> +    mod = (dfr >> 28) & 0xFu;
> +
> +    spin_lock(&d->avic_dfr_mode_lock);
> +    if ( d->avic_dfr_mode != mod )
> +    {
> +        /*
> +         * We assume that all local APICs are using the same type.
> +         * If DFR mode changes, we need to flush the domain AVIC logical
> +         * APIC id table.
> +         */
> +        clear_domain_page(_mfn(page_to_mfn(d->avic_logical_id_table_pg)));
> +        d->avic_dfr_mode = mod;
> +    }
> +    spin_unlock(&d->avic_dfr_mode_lock);
> +
> +    if ( v->arch.hvm_svm.avic_last_ldr )
> +        avic_handle_ldr_update(v);
> +
> +    return 0;
> +}
> +
> +static int avic_unaccel_trap_write(struct vcpu *v)
> +{
> +    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
> +    u32 offset = vmcb->exitinfo1 & AVIC_UNACCEL_ACCESS_OFFSET_MASK;
> +    u32 reg = vlapic_read_aligned(vcpu_vlapic(v), offset);
> +
> +    switch ( offset )
> +    {
> +    case APIC_ID:
> +        /*
> +         * Currently, we do not support APIC_ID update while
> +         * the vcpus are running, which might require updating
> +         * AVIC max APIC ID in all VMCBs. This would require
> +         * a synchronized update on all running vCPUs.
> +         */
> +        return X86EMUL_UNHANDLEABLE;

Newline please.

> +    case APIC_LDR:
> +        if ( avic_handle_ldr_update(v) )
> +            return X86EMUL_UNHANDLEABLE;
> +        break;
> +
> +    case APIC_DFR:
> +        if ( avic_handle_dfr_update(v) )
> +            return X86EMUL_UNHANDLEABLE;
> +        break;
> +
> +    default:
> +        break;
> +    }
> +
> +    vlapic_reg_write(v, offset, reg);
> +
> +    return X86EMUL_OKAY;
> +}
> +
> +static inline bool avic_is_trap(u32 offset)
> +{
> +    u32 pos = offset >> 4;
> +    static const unsigned long avic_trap[] =
> +        {
> +#define REG(x) (1UL << (APIC_ ## x >> 4))
> +            REG(ID)   | REG(EOI)     | REG(RRR)   | REG(LDR)  |
> +            REG(DFR)  | REG(SPIV)    | REG(ESR)   | REG(ICR)  |
> +            REG(LVTT) | REG(LVTTHMR) | REG(LVTPC) | REG(LVT0) |
> +            REG(LVT1) | REG(LVTERR)  | REG(TMICT) | REG(TDCR)
> +#undef REG
> +        };

I know I'm the author of the piece of code you've copied here, but I am
in the process of trying to fix its style.

static const unsigned long avic_trap[] = {
#define REG(x) (1UL << (APIC_ ## x >> 4))
    REG(ID)   | REG(EOI)     | REG(RRR)   | REG(LDR)  |
    ....
#undef REG
};

> +
> +    if ( !test_bit(pos, avic_trap) )
> +        return false;
> +    return true;

return pos < (sizeof(avic_trap) * 8) && test_bit(pos, avic_trap);

You need to avoid reading beyond the end of avic_trap[].
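As an illustration, the bounds-checked variant can be tested standalone. The register offsets and the open-coded bit test below are simplified stand-ins for the APIC_* constants and Xen's test_bit(), assuming 64-bit unsigned long:

```c
#include <stdbool.h>

/* Illustrative offsets standing in for a few of the APIC_* constants. */
#define REG_ID   0x020
#define REG_EOI  0x0B0
#define REG_ICR  0x300

#define BITS_PER_LONG (sizeof(unsigned long) * 8)

static const unsigned long trap_bitmap[] = {
#define REG(x) (1UL << ((x) >> 4))
    REG(REG_ID) | REG(REG_EOI) | REG(REG_ICR)
#undef REG
};

/* Check pos against the bitmap size first, so an offset past the last
 * covered register cannot index beyond the end of trap_bitmap[]. */
static bool is_trap(unsigned int offset)
{
    unsigned int pos = offset >> 4;

    return pos < sizeof(trap_bitmap) * 8 &&
           (trap_bitmap[pos / BITS_PER_LONG] >> (pos % BITS_PER_LONG)) & 1;
}
```

With a single unsigned long element, offsets 0x400 and above fail the bounds check and return false without touching memory past the array.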

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC
  2018-04-03 23:01 ` [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC Janakarajan Natarajan
@ 2018-04-13 17:57   ` Andrew Cooper
  2018-04-19 15:54     ` Natarajan, Janakarajan
  0 siblings, 1 reply; 35+ messages in thread
From: Andrew Cooper @ 2018-04-13 17:57 UTC (permalink / raw)
  To: Janakarajan Natarajan, xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, George Dunlap, Tim Deegan, Ian Jackson,
	Julien Grall, Jan Beulich, Boris Ostrovsky

On 04/04/18 00:01, Janakarajan Natarajan wrote:
> @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d, unsigned int index)
>      return &d->avic_physical_id_table[index];
>  }
>  
> +static void avic_vcpu_load(struct vcpu *v)
> +{
> +    unsigned long tmp;
> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
> +    int h_phy_apic_id;
> +    struct avic_physical_id_entry *entry = (struct avic_physical_id_entry *)&tmp;
> +
> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
> +
> +    /*
> +     * Note: APIC ID = 0xff is used for broadcast.
> +     *       APIC ID > 0xff is reserved.
> +     */
> +    h_phy_apic_id = cpu_data[v->processor].apicid;
> +    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
> +
> +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
> +    entry->host_phy_apic_id = h_phy_apic_id;
> +    entry->is_running = 1;
> +    write_atomic((u64*)(s->avic_last_phy_id), tmp);

What is the purpose of s->avic_last_phy_id ?

As far as I can tell, it is always an unchanging pointer into the
physical ID table, which is only ever updated synchronously in current
context.

If so, I don't see why it needs any of these hoops to be jumped through.

~Andrew


* Re: [PATCH 7/8] x86/SVM: Introduce svm command line option
  2018-04-03 23:01 ` [PATCH 7/8] x86/SVM: Introduce svm command line option Janakarajan Natarajan
@ 2018-04-13 18:04   ` Andrew Cooper
  0 siblings, 0 replies; 35+ messages in thread
From: Andrew Cooper @ 2018-04-13 18:04 UTC (permalink / raw)
  To: Janakarajan Natarajan, xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, George Dunlap, Tim Deegan, Ian Jackson,
	Julien Grall, Jan Beulich, Boris Ostrovsky

On 04/04/18 00:01, Janakarajan Natarajan wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>
> This patch introduces a new Xen command line option to enable/disable
> SVM sub-options. Currently, it supports the sub-option "avic", which can
> be used to enable/disable the SVM AVIC feature.
>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
> ---
>  docs/misc/xen-command-line.markdown | 16 ++++++++++++++++
>  xen/arch/x86/hvm/svm/svm.c          | 32 ++++++++++++++++++++++++++++++++
>  2 files changed, 48 insertions(+)
>
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index b353352adf..60a1005c42 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1730,6 +1730,22 @@ enforces the maximum theoretically necessary timeout of 670ms. Any number
>  is being interpreted as a custom timeout in milliseconds. Zero or boolean
>  false disable the quirk workaround, which is also the default.
>  
> +### svm
> +> `= List of [ avic ]`
> +
> +> Sub-options:
> +
> +> All sub-options are of boolean kind and can be prefixed with `no-` to
> +> effect the inverse meaning.
> +
> +> `avic`
> +
> +> Default: `false`
> +
> +>> This option enables Advanced Virtual Interrupt Controller (AVIC),
> +>> which is an extension of AMD Secure Virtual Machine (SVM) to virtualize
> +>> the local APIC for guest VMs.
> +
>  ### sync\_console
>  > `= <boolean>`
>  
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index fdbe8e3008..0c5c26cce8 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -64,6 +64,16 @@
>  #include <asm/monitor.h>
>  #include <asm/xstate.h>
>  
> +static int parse_svm_param(const char *s);
> +
> +/*
> + * The 'svm' parameter en/dis-ables various SVM features.
> + * Optional comma separated value may contain:
> + *
> + *   avic - Enable SVM Advanced Virtual Interrupt Controller (AVIC)
> + */
> +custom_param("svm", parse_svm_param);

Move this custom_param to below the definition, and drop the forward
declaration.  You can drop the comment here, because the important
information is in the real command line doc.

> +
>  void svm_asm_do_resume(void);
>  
>  u32 svm_feature_flags;
> @@ -89,6 +99,28 @@ static bool_t amd_erratum383_found __read_mostly;
>  static uint64_t osvw_length, osvw_status;
>  static DEFINE_SPINLOCK(osvw_lock);
>  
> +static int __init parse_svm_param(const char *s)
> +{
> +    char *ss;
> +    int val;
> +
> +    do {
> +        val = !!strncmp(s, "no-", 3);
> +        if ( !val )
> +            s += 3;
> +
> +        ss = strchr(s, ',');
> +        if ( ss )
> +            *ss = '\0';

This is writing to a const pointer.  strchr() is a particularly evil
part of the C standard library, because it drops const from its input
parameter.

See parse_xen_cpuid() for a similar piece of functionality, and please
use the parse_boolean() helper.
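As a sketch of the const-correct approach (not Xen's actual parse_boolean()/parse_xen_cpuid() code): track segment bounds with strcspn() instead of planting '\0' terminators, so the input string is never written to. The error convention below is illustrative:

```c
#include <stdbool.h>
#include <string.h>

static bool svm_avic;

/* Walk a comma-separated option string without ever modifying it, by
 * computing each segment's end with strcspn() rather than strchr()+'\0'. */
static int parse_svm_param(const char *s)
{
    int rc = 0;

    do {
        const char *e = s + strcspn(s, ",");   /* end of this segment */
        bool val = true;

        if ( e - s > 3 && !strncmp(s, "no-", 3) )
        {
            val = false;
            s += 3;
        }

        if ( (e - s) == 4 && !strncmp(s, "avic", 4) )
            svm_avic = val;
        else
            rc = -1;                           /* unknown sub-option */

        s = *e ? e + 1 : e;                    /* skip the comma, if any */
    } while ( *s );

    return rc;
}
```

Because the parser only ever reads through const pointers, it works on read-only command-line storage as well.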

~Andrew

> +
> +        if ( !strcmp(s, "avic") )
> +            svm_avic = val;
> +
> +        s = ss + 1;
> +    } while ( ss );
> +
> +    return 0;
> +}
>  /* Only crash the guest if the problem originates in kernel mode. */
>  static void svm_crash_or_fault(struct vcpu *v)
>  {



* Re: [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code
  2018-04-03 23:01 ` [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code Janakarajan Natarajan
  2018-04-13 17:35   ` Andrew Cooper
@ 2018-04-16 15:55   ` Jan Beulich
  2018-04-16 16:15     ` Roger Pau Monné
  2018-04-26 15:32     ` Natarajan, Janakarajan
  2018-04-23 19:33   ` Konrad Rzeszutek Wilk
  2 siblings, 2 replies; 35+ messages in thread
From: Jan Beulich @ 2018-04-16 15:55 UTC (permalink / raw)
  To: Janakarajan Natarajan
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	Jun Nakajima, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Julien Grall, Suravee Suthikulpanit, Boris Ostrovsky

>>> On 04.04.18 at 01:01, <Janakarajan.Natarajan@amd.com> wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> Introduce AVIC base initialization code. This includes:
>     * Setting up per-VM data structures.
>     * Setting up per-vCPU data structure.
>     * Initializing AVIC-related VMCB bit fields.
> 
> This patch also introduces a new Xen parameter (svm-avic),
> which can be used to enable/disable AVIC support.

This sentence looks stale/misplaced now.

> --- /dev/null
> +++ b/xen/arch/x86/hvm/svm/avic.c
> @@ -0,0 +1,191 @@
> +/*
> + * avic.c: implements AMD Advanced Virtual Interrupt Controller (AVIC) support
> + * Copyright (c) 2016, Advanced Micro Devices, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/domain_page.h>
> +#include <xen/sched.h>
> +#include <xen/stdbool.h>
> +#include <asm/acpi.h>
> +#include <asm/apicdef.h>
> +#include <asm/atomic.h>
> +#include <asm/event.h>
> +#include <asm/hvm/emulate.h>
> +#include <asm/hvm/nestedhvm.h>
> +#include <asm/hvm/support.h>
> +#include <asm/hvm/svm/avic.h>
> +#include <asm/hvm/vlapic.h>
> +#include <asm/p2m.h>
> +#include <asm/page.h>
> +
> +/*
> + * Note: Current max index allowed for physical APIC ID table is 255.
> + */

This is a single line comment.

> +#define AVIC_PHY_APIC_ID_MAX    0xFF

Is this really an AVIC-specific constant, rather than e.g.
GET_xAPIC_ID(APIC_ID_MASK)?

> +#define AVIC_VAPIC_BAR_MASK     (((1ULL << 40) - 1) << PAGE_SHIFT)
> +
> +/*
> + * Note:
> + * Currently, svm-avic mode is not supported with nested virtualization.
> + * Therefore, it is not currently enabled by default. Once the support
> + * is in-place, this should be enabled by default.
> + */
> +bool svm_avic = 0;

false (or simply omit the intializer)

> +int svm_avic_dom_init(struct domain *d)
> +{
> +    int ret = 0;
> +    struct page_info *pg;
> +
> +    if ( !svm_avic || !has_vlapic(d) )
> +        return 0;
> +
> +    /*
> +     * Note:
> +     * AVIC hardware walks the nested page table to check permissions,
> +     * but does not use the SPA address specified in the leaf page
> + * table entry since it uses the address in the AVIC_BACKING_PAGE pointer
> +     * field of the VMCB. Therefore, we set up a dummy page for APIC _mfn(0).
> +     */
> +    set_mmio_p2m_entry(d, paddr_to_pfn(APIC_DEFAULT_PHYS_BASE),
> +                       _mfn(0), PAGE_ORDER_4K,
> +                       p2m_get_hostp2m(d)->default_access);

The use of MFN 0 here looks risky to me: How do you guarantee nothing else
might ever use that P2M entry?

> +    /* Init AVIC logical APIC ID table */
> +    pg = alloc_domheap_page(d, MEMF_no_owner);
> +    if ( !pg )
> +    {
> +        ret = -ENOMEM;
> +        goto err_out;
> +    }
> +    clear_domain_page(_mfn(page_to_mfn(pg)));
> +    d->arch.hvm_domain.svm.avic_logical_id_table_pg = pg;
> +    d->arch.hvm_domain.svm.avic_logical_id_table = __map_domain_page_global(pg);

Both here and ...

> +    /* Init AVIC physical APIC ID table */
> +    pg = alloc_domheap_page(d, MEMF_no_owner);
> +    if ( !pg )
> +    {
> +        ret = -ENOMEM;
> +        goto err_out;
> +    }
> +    clear_domain_page(_mfn(page_to_mfn(pg)));
> +    d->arch.hvm_domain.svm.avic_physical_id_table_pg = pg;
> +    d->arch.hvm_domain.svm.avic_physical_id_table = __map_domain_page_global(pg);

... here: Do you really need to store both page pointer and map address?

> +void svm_avic_dom_destroy(struct domain *d)
> +{
> +    if ( !svm_avic || !has_vlapic(d) )
> +        return;
> +
> +    if ( d->arch.hvm_domain.svm.avic_physical_id_table )
> +    {
> +        unmap_domain_page_global(d->arch.hvm_domain.svm.avic_physical_id_table);
> +        free_domheap_page(d->arch.hvm_domain.svm.avic_physical_id_table_pg);
> +        d->arch.hvm_domain.svm.avic_physical_id_table_pg = 0;
> +        d->arch.hvm_domain.svm.avic_physical_id_table = 0;

DYM NULL?

> +int svm_avic_init_vmcb(struct vcpu *v)
> +{
> +    u32 apic_id;
> +    unsigned long tmp;
> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
> +    struct vmcb_struct *vmcb = s->vmcb;
> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
> +    const struct vlapic *vlapic = vcpu_vlapic(v);
> +    struct avic_physical_id_entry *entry = (struct avic_physical_id_entry *)&tmp;
> +
> +    if ( !svm_avic || !has_vlapic(v->domain) )
> +        return 0;
> +
> +    if ( !vlapic || !vlapic->regs_page )
> +        return -EINVAL;
> +
> +    apic_id = vlapic_read_aligned(vcpu_vlapic(v), APIC_ID);
> +    s->avic_last_phy_id = avic_get_physical_id_entry(d, GET_xAPIC_ID(apic_id));
> +    if ( !s->avic_last_phy_id )
> +        return -EINVAL;
> +
> +    vmcb->avic_bk_pg_pa = page_to_maddr(vlapic->regs_page);
> +    vmcb->avic_logical_id_table_pa = domain_page_map_to_mfn(d->avic_logical_id_table) << PAGE_SHIFT;
> +    vmcb->avic_physical_id_table_pa = domain_page_map_to_mfn(d->avic_physical_id_table) << PAGE_SHIFT;

DYM something like mfn_to_maddr()? Please don't open-code things.

Jan



* Re: [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code
  2018-04-16 15:55   ` Jan Beulich
@ 2018-04-16 16:15     ` Roger Pau Monné
  2018-04-26 15:32     ` Natarajan, Janakarajan
  1 sibling, 0 replies; 35+ messages in thread
From: Roger Pau Monné @ 2018-04-16 16:15 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	Janakarajan Natarajan, George Dunlap, Andrew Cooper, Tim Deegan,
	xen-devel, Julien Grall, Jun Nakajima, Boris Ostrovsky,
	Ian Jackson

On Mon, Apr 16, 2018 at 09:55:20AM -0600, Jan Beulich wrote:
> >>> On 04.04.18 at 01:01, <Janakarajan.Natarajan@amd.com> wrote:
> > From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> > +/*
> > + * Note: Current max index allowed for physical APIC ID table is 255.
> > + */
> 
> This is a single line comment.
> 
> > +#define AVIC_PHY_APIC_ID_MAX    0xFF
> 
> Is this really an AVIC-specific constant, rather than e.g.
> GET_xAPIC_ID(APIC_ID_MASK)?

Also, this is not the max APIC ID: the max APIC ID would be 0xfe, since
255 is the APIC broadcast ID. You cannot have a physical APIC with ID
255 in xAPIC mode.

Thanks, Roger.


* Re: [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers
  2018-04-03 23:01 ` [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers Janakarajan Natarajan
  2018-04-13 17:48   ` Andrew Cooper
@ 2018-04-17 12:58   ` Jan Beulich
  2018-04-19 16:31     ` Natarajan, Janakarajan
  2018-04-20 20:02     ` Natarajan, Janakarajan
  1 sibling, 2 replies; 35+ messages in thread
From: Jan Beulich @ 2018-04-17 12:58 UTC (permalink / raw)
  To: Janakarajan Natarajan
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	Jun Nakajima, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Julien Grall, Suravee Suthikulpanit, Boris Ostrovsky

>>> On 04.04.18 at 01:01, <Janakarajan.Natarajan@amd.com> wrote:
> --- a/xen/include/asm-x86/hvm/vlapic.h
> +++ b/xen/include/asm-x86/hvm/vlapic.h
> @@ -137,6 +137,10 @@ void vlapic_ipi(struct vlapic *vlapic, uint32_t icr_low, 
> uint32_t icr_high);
>  
>  int vlapic_apicv_write(struct vcpu *v, unsigned int offset);
>  
> +void vlapic_reg_write(struct vcpu *v, unsigned int offset, uint32_t val);
> +
> +uint32_t vlapic_read_aligned(const struct vlapic *vlapic, unsigned int offset);

If making these non-static is really necessary, they should (name-wise) become
proper pairs of one another, e.g. renamed the former to vlapic_reg_read().

Also while here you properly use uint32_t, almost everywhere you use u32.
Please switch this throughout the series, and of course for all other fixed
width integer types.

Jan




* Re: [PATCH 5/8] x86/SVM: Add interrupt management code via AVIC
  2018-04-03 23:01 ` [PATCH 5/8] x86/SVM: Add interrupt management code via AVIC Janakarajan Natarajan
@ 2018-04-17 13:06   ` Jan Beulich
  0 siblings, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2018-04-17 13:06 UTC (permalink / raw)
  To: Janakarajan Natarajan
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	Jun Nakajima, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Julien Grall, Suravee Suthikulpanit, Boris Ostrovsky

>>> On 04.04.18 at 01:01, <Janakarajan.Natarajan@amd.com> wrote:
> +void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vec)
> +{
> +    struct vlapic *vlapic = vcpu_vlapic(v);
> +
> +    /* Fallback to use non-AVIC if vcpu is not enabled with AVIC. */
> +    if ( !svm_avic_vcpu_enabled(v) )
> +    {
> +        if ( !vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
> +            vcpu_kick(v);
> +        return;
> +    }
> +
> +    /* If interrupt is disabled, do not ignore the interrupt */
> +    if ( !(guest_cpu_user_regs()->eflags & X86_EFLAGS_IF) )
> +        return;

I'm confused by the comment: How is returning here without any indication
to the caller different from ignoring the interrupt? How come EFLAGS.IF
matters here in the first place?

Jan




* Re: [PATCH 6/8] x86/HVM: Hook up miscellaneous AVIC functions
  2018-04-03 23:01 ` [PATCH 6/8] x86/HVM: Hook up miscellaneous AVIC functions Janakarajan Natarajan
@ 2018-04-17 13:10   ` Jan Beulich
  0 siblings, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2018-04-17 13:10 UTC (permalink / raw)
  To: Janakarajan Natarajan
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	Jun Nakajima, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Julien Grall, Suravee Suthikulpanit, Boris Ostrovsky

>>> On 04.04.18 at 01:01, <Janakarajan.Natarajan@amd.com> wrote:
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -1711,7 +1711,10 @@ const struct hvm_function_table * __init start_svm(void)
>          svm_avic = 0;
>  
>      if ( svm_avic )
> +    {
>          svm_function_table.deliver_posted_intr  = svm_avic_deliver_posted_intr;
> +        svm_function_table.virtual_intr_delivery_enabled = svm_avic;

Considering the controlling expression of the if() we're in, why not simply "true"?

Jan




* Re: [PATCH 8/8] x86/SVM: Add AMD AVIC key handler
  2018-04-03 23:01 ` [PATCH 8/8] x86/SVM: Add AMD AVIC key handler Janakarajan Natarajan
@ 2018-04-17 13:36   ` Jan Beulich
  0 siblings, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2018-04-17 13:36 UTC (permalink / raw)
  To: Janakarajan Natarajan
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	Jun Nakajima, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Julien Grall, Suravee Suthikulpanit, Boris Ostrovsky

>>> On 04.04.18 at 01:01, <Janakarajan.Natarajan@amd.com> wrote:
> Adding new key-handler "j" for dumping AVIC-related information.

Considering how few keys we have left, I have significant reservations against
adding such a narrow-purpose key. If you really want to expose such information,
add it to a suitable existing key or introduce a sysctl.

> --- a/xen/arch/x86/hvm/svm/avic.c
> +++ b/xen/arch/x86/hvm/svm/avic.c
> @@ -28,6 +28,7 @@
>  #include <asm/hvm/support.h>
>  #include <asm/hvm/svm/avic.h>
>  #include <asm/hvm/vlapic.h>
> +#include <xen/keyhandler.h>
>  #include <asm/p2m.h>
>  #include <asm/page.h>

Please insert new includes in a more logical manner (where possible we ask for
them to be sorted alphabetically; that may not make sense here, but at
least group it with the other xen/ ones).

> @@ -49,6 +50,11 @@
>   */
>  bool svm_avic = 0;
>  
> +static inline bool svm_is_avic_domain(struct domain *d)

const

> +{
> +    return ( d->arch.hvm_domain.svm.avic_physical_id_table != 0 );

Stray blanks (and unnecessary parentheses anyway).

Jan




* Re: [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code
  2018-04-13 17:35   ` Andrew Cooper
@ 2018-04-19 15:46     ` Natarajan, Janakarajan
  0 siblings, 0 replies; 35+ messages in thread
From: Natarajan, Janakarajan @ 2018-04-19 15:46 UTC (permalink / raw)
  To: Andrew Cooper, Janakarajan Natarajan, xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, George Dunlap, Tim Deegan, Ian Jackson,
	Julien Grall, Jan Beulich, Boris Ostrovsky

On 4/13/2018 12:35 PM, Andrew Cooper wrote:
> On 04/04/18 00:01, Janakarajan Natarajan wrote:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> Introduce AVIC base initialization code. This includes:
>>      * Setting up per-VM data structures.
>>      * Setting up per-vCPU data structure.
>>      * Initializing AVIC-related VMCB bit fields.
>>
>> This patch also introduces a new Xen parameter (svm-avic),
>> which can be used to enable/disable AVIC support.
>> Currently, this svm-avic is disabled by default.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
>> ---
>>   xen/arch/x86/hvm/svm/Makefile      |   1 +
>>   xen/arch/x86/hvm/svm/avic.c        | 191 +++++++++++++++++++++++++++++++++++++
>>   xen/arch/x86/hvm/svm/svm.c         |   8 +-
>>   xen/arch/x86/hvm/svm/vmcb.c        |   3 +
>>   xen/arch/x86/hvm/vlapic.c          |   4 +
>>   xen/include/asm-x86/hvm/svm/avic.h |  36 +++++++
>>   xen/include/asm-x86/hvm/svm/svm.h  |   2 +
>>   xen/include/asm-x86/hvm/svm/vmcb.h |  17 ++++
>>   8 files changed, 261 insertions(+), 1 deletion(-)
>>   create mode 100644 xen/arch/x86/hvm/svm/avic.c
>>   create mode 100644 xen/include/asm-x86/hvm/svm/avic.h
>>
>> diff --git a/xen/arch/x86/hvm/svm/Makefile b/xen/arch/x86/hvm/svm/Makefile
>> index 760d2954da..e0e4a59f7d 100644
>> --- a/xen/arch/x86/hvm/svm/Makefile
>> +++ b/xen/arch/x86/hvm/svm/Makefile
>> @@ -1,4 +1,5 @@
>>   obj-y += asid.o
>> +obj-y += avic.o
>>   obj-y += emulate.o
>>   obj-bin-y += entry.o
>>   obj-y += intr.o
>> diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
>> new file mode 100644
>> index 0000000000..8108698911
>> --- /dev/null
>> +++ b/xen/arch/x86/hvm/svm/avic.c
>> @@ -0,0 +1,191 @@
>> +/*
>> + * avic.c: implements AMD Advanced Virtual Interrupt Controller (AVIC) support
>> + * Copyright (c) 2016, Advanced Micro Devices, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/domain_page.h>
>> +#include <xen/sched.h>
>> +#include <xen/stdbool.h>
>> +#include <asm/acpi.h>
>> +#include <asm/apicdef.h>
>> +#include <asm/atomic.h>
>> +#include <asm/event.h>
>> +#include <asm/hvm/emulate.h>
>> +#include <asm/hvm/nestedhvm.h>
>> +#include <asm/hvm/support.h>
>> +#include <asm/hvm/svm/avic.h>
>> +#include <asm/hvm/vlapic.h>
>> +#include <asm/p2m.h>
>> +#include <asm/page.h>
>> +
>> +/*
>> + * Note: Current max index allowed for physical APIC ID table is 255.
>> + */
>> +#define AVIC_PHY_APIC_ID_MAX    0xFF
>> +
>> +#define AVIC_VAPIC_BAR_MASK     (((1ULL << 40) - 1) << PAGE_SHIFT)
> This constant should hopefully be stale (following the debugging), and
> should be dropped.
>
>> +
>> +/*
>> + * Note:
>> + * Currently, svm-avic mode is not supported with nested virtualization.
>> + * Therefore, it is not currently enabled by default. Once the support
>> + * is in-place, this should be enabled by default.
>> + */
>> +bool svm_avic = 0;
>> +
>> +static struct avic_physical_id_entry *
>> +avic_get_physical_id_entry(struct svm_domain *d, unsigned int index)
>> +{
>> +    if ( !d->avic_physical_id_table )
>> +        return NULL;
>> +
>> +    /*
>> +    * Note: APIC ID = 0xFF is used for broadcast.
>> +    *       APIC ID > 0xFF is reserved.
>> +    */
>> +    ASSERT(index < AVIC_PHY_APIC_ID_MAX);
> Please also have:
>
> if ( index >= AVIC_PHY_APIC_ID_MAX )
>      return NULL; /* Avoid out-of-bounds access in release builds. */
>
>> +
>> +    return &d->avic_physical_id_table[index];
>> +}
>> +
>> +int svm_avic_dom_init(struct domain *d)
>> +{
>> +    int ret = 0;
>> +    struct page_info *pg;
>> +
>> +    if ( !svm_avic || !has_vlapic(d) )
>> +        return 0;
>> +
>> +    /*
>> +     * Note:
>> +     * AVIC hardware walks the nested page table to check permissions,
>> +     * but does not use the SPA address specified in the leaf page
>> + * table entry since it uses the address in the AVIC_BACKING_PAGE pointer
>> +     * field of the VMCB. Therefore, we set up a dummy page for APIC _mfn(0).
>> +     */
>> +    set_mmio_p2m_entry(d, paddr_to_pfn(APIC_DEFAULT_PHYS_BASE),
>> +                       _mfn(0), PAGE_ORDER_4K,
>> +                       p2m_get_hostp2m(d)->default_access);
> Is default access right here, rather than blanket R/W?  IIRC, the frame
> is magic in the pipeline and causes redirections to the APIC handling,
> so the permissions of the page don't take effect.
>
>> +
>> +    /* Init AVIC logical APIC ID table */
>> +    pg = alloc_domheap_page(d, MEMF_no_owner);
>> +    if ( !pg )
>> +    {
>> +        ret = -ENOMEM;
>> +        goto err_out;
>> +    }
>> +    clear_domain_page(_mfn(page_to_mfn(pg)));
> You'll need to rebase your changes onto the latest staging.  Julien
> finished the tree-wide cleanup of page_to_mfn() and changed its type, so
> you'll need to drop _mfn() wrappers, or insert mfn_x() wrappers.
>
>> +    d->arch.hvm_domain.svm.avic_logical_id_table_pg = pg;
>> +    d->arch.hvm_domain.svm.avic_logical_id_table = __map_domain_page_global(pg);
>> +
>> +    /* Init AVIC physical APIC ID table */
>> +    pg = alloc_domheap_page(d, MEMF_no_owner);
>> +    if ( !pg )
>> +    {
>> +        ret = -ENOMEM;
>> +        goto err_out;
>> +    }
>> +    clear_domain_page(_mfn(page_to_mfn(pg)));
>> +    d->arch.hvm_domain.svm.avic_physical_id_table_pg = pg;
>> +    d->arch.hvm_domain.svm.avic_physical_id_table = __map_domain_page_global(pg);
>> +
>> +    return ret;
>> + err_out:
>> +    svm_avic_dom_destroy(d);
>> +    return ret;
>> +}
>> +
>> +void svm_avic_dom_destroy(struct domain *d)
>> +{
>> +    if ( !svm_avic || !has_vlapic(d) )
>> +        return;
>> +
>> +    if ( d->arch.hvm_domain.svm.avic_physical_id_table )
>> +    {
>> +        unmap_domain_page_global(d->arch.hvm_domain.svm.avic_physical_id_table);
>> +        free_domheap_page(d->arch.hvm_domain.svm.avic_physical_id_table_pg);
>> +        d->arch.hvm_domain.svm.avic_physical_id_table_pg = 0;
>> +        d->arch.hvm_domain.svm.avic_physical_id_table = 0;
>> +    }
>> +
>> +    if ( d->arch.hvm_domain.svm.avic_logical_id_table)
>> +    {
>> +        unmap_domain_page_global(d->arch.hvm_domain.svm.avic_logical_id_table);
>> +        free_domheap_page(d->arch.hvm_domain.svm.avic_logical_id_table_pg);
>> +        d->arch.hvm_domain.svm.avic_logical_id_table_pg = 0;
>> +        d->arch.hvm_domain.svm.avic_logical_id_table = 0;
>> +    }
>> +}
>> +
>> +bool svm_avic_vcpu_enabled(const struct vcpu *v)
>> +{
>> +    const struct arch_svm_struct *s = &v->arch.hvm_svm;
>> +    const struct vmcb_struct *vmcb = s->vmcb;
>> +
>> +    return vmcb->_vintr.fields.avic_enable;
>> +}
>> +
>> +int svm_avic_init_vmcb(struct vcpu *v)
>> +{
>> +    u32 apic_id;
>> +    unsigned long tmp;
>> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
>> +    struct vmcb_struct *vmcb = s->vmcb;
>> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
>> +    const struct vlapic *vlapic = vcpu_vlapic(v);
>> +    struct avic_physical_id_entry *entry = (struct avic_physical_id_entry *)&tmp;
> This is still broken.  Anywhere where type punning like this is going on
> needs fixing.
>
>> +
>> +    if ( !svm_avic || !has_vlapic(v->domain) )
>> +        return 0;
>> +
>> +    if ( !vlapic || !vlapic->regs_page )
>> +        return -EINVAL;
>> +
>> +    apic_id = vlapic_read_aligned(vcpu_vlapic(v), APIC_ID);
>> +    s->avic_last_phy_id = avic_get_physical_id_entry(d, GET_xAPIC_ID(apic_id));
>> +    if ( !s->avic_last_phy_id )
>> +        return -EINVAL;
>> +
>> +    vmcb->avic_bk_pg_pa = page_to_maddr(vlapic->regs_page);
>> +    vmcb->avic_logical_id_table_pa = domain_page_map_to_mfn(d->avic_logical_id_table) << PAGE_SHIFT;
>> +    vmcb->avic_physical_id_table_pa = domain_page_map_to_mfn(d->avic_physical_id_table) << PAGE_SHIFT;
>> +
>> +    /* Set Physical ID Table Pointer [7:0] to max apic id of the domain */
>> +    vmcb->avic_logical_id_table_pa &= ~AVIC_PHY_APIC_ID_MAX;
>> +    vmcb->avic_physical_id_table_pa |= (v->domain->max_vcpus * 2) & 0xFF;
>> +
>> +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
>> +    entry->bk_pg_ptr_mfn = (vmcb->avic_bk_pg_pa) >> PAGE_SHIFT;
>> +    entry->is_running = 0;
>> +    entry->valid = 1;
>> +    write_atomic((u64*)(s->avic_last_phy_id), tmp);
> This is an init function, and it will be called sequentially for each
> allocated vcpu.  The guest is guaranteed not to be executing, so you don't
> need any special interlocking for s->avic_last_phy_id.
>
>> +
>> +    vmcb->avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & AVIC_VAPIC_BAR_MASK;
>> +    vmcb->cleanbits.fields.avic = 0;
>> +
>> +    vmcb->_vintr.fields.avic_enable = 1;
>> +
>> +    return 0;
>> +}
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * tab-width: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
>> index 8538232f68..b445f59ada 100644
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -47,6 +47,7 @@
>>   #include <asm/hvm/svm/asid.h>
>>   #include <asm/hvm/svm/svm.h>
>>   #include <asm/hvm/svm/vmcb.h>
>> +#include <asm/hvm/svm/avic.h>
>>   #include <asm/hvm/svm/emulate.h>
>>   #include <asm/hvm/svm/intr.h>
>>   #include <asm/hvm/svm/svmdebug.h>
>> @@ -1225,11 +1226,12 @@ static int svm_domain_initialise(struct domain *d)
>>   
>>       d->arch.ctxt_switch = &csw;
>>   
>> -    return 0;
>> +    return svm_avic_dom_init(d);
>>   }
>>   
>>   static void svm_domain_destroy(struct domain *d)
>>   {
>> +    svm_avic_dom_destroy(d);
>>   }
>>   
>>   static int svm_vcpu_initialise(struct vcpu *v)
>> @@ -1694,6 +1696,9 @@ const struct hvm_function_table * __init start_svm(void)
>>       if ( cpu_has_tsc_ratio )
>>           svm_function_table.tsc_scaling.ratio_frac_bits = 32;
>>   
>> +    if ( !cpu_has_svm_avic )
>> +        svm_avic = 0;
>> +
>>   #define P(p,s) if ( p ) { printk(" - %s\n", s); printed = 1; }
>>       P(cpu_has_svm_npt, "Nested Page Tables (NPT)");
>>       P(cpu_has_svm_lbrv, "Last Branch Record (LBR) Virtualisation");
>> @@ -1705,6 +1710,7 @@ const struct hvm_function_table * __init start_svm(void)
>>       P(cpu_has_pause_filter, "Pause-Intercept Filter");
>>       P(cpu_has_pause_thresh, "Pause-Intercept Filter Threshold");
>>       P(cpu_has_tsc_ratio, "TSC Rate MSR");
>> +    P(cpu_has_svm_avic, svm_avic ? "AVIC (enabled)" : "AVIC (disabled)");
>>   #undef P
>>   
>>       if ( !printed )
>> diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
>> index ae60d8dc1c..7ade023cfe 100644
>> --- a/xen/arch/x86/hvm/svm/vmcb.c
>> +++ b/xen/arch/x86/hvm/svm/vmcb.c
>> @@ -27,6 +27,7 @@
>>   #include <asm/msr-index.h>
>>   #include <asm/p2m.h>
>>   #include <asm/hvm/support.h>
>> +#include <asm/hvm/svm/avic.h>
>>   #include <asm/hvm/svm/svm.h>
>>   #include <asm/hvm/svm/svmdebug.h>
>>   
>> @@ -215,6 +216,8 @@ static int construct_vmcb(struct vcpu *v)
>>               vmcb->_pause_filter_thresh = SVM_PAUSETHRESH_INIT;
>>       }
>>   
>> +    svm_avic_init_vmcb(v);
>> +
>>       vmcb->cleanbits.bytes = 0;
>>   
>>       return 0;
>> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
>> index 1b9f00a0e4..a4472200a6 100644
>> --- a/xen/arch/x86/hvm/vlapic.c
>> +++ b/xen/arch/x86/hvm/vlapic.c
>> @@ -1597,6 +1597,10 @@ int vlapic_init(struct vcpu *v)
>>   
>>       if (vlapic->regs_page == NULL)
>>       {
>> +        /*
>> +         * SVM AVIC depends on the vlapic->regs_page being a full
>> +         * page allocation as it is also used for vAPIC backing page.
>> +         */
>>           vlapic->regs_page = alloc_domheap_page(v->domain, MEMF_no_owner);
>>           if ( vlapic->regs_page == NULL )
>>           {
>> diff --git a/xen/include/asm-x86/hvm/svm/avic.h b/xen/include/asm-x86/hvm/svm/avic.h
>> new file mode 100644
>> index 0000000000..32bb9a91e8
>> --- /dev/null
>> +++ b/xen/include/asm-x86/hvm/svm/avic.h
>> @@ -0,0 +1,36 @@
>> +#ifndef _SVM_AVIC_H_
>> +#define _SVM_AVIC_H_
>> +
>> +#include <xen/compiler.h>
>> +
>> +enum avic_incmp_ipi_err_code {
>> +    AVIC_INCMP_IPI_ERR_INVALID_INT_TYPE,
>> +    AVIC_INCMP_IPI_ERR_TARGET_NOT_RUN,
>> +    AVIC_INCMP_IPI_ERR_INV_TARGET,
>> +    AVIC_INCMP_IPI_ERR_INV_BK_PAGE,
>> +};
>> +
>> +struct __packed avic_logical_id_entry {
>> +    u32 guest_phy_apic_id : 8;
>> +    u32 res               : 23;
>> +    u32 valid             : 1;
>> +};
>> +
>> +struct __packed avic_physical_id_entry {
>> +    u64 host_phy_apic_id  : 8;
>> +    u64 res1              : 4;
>> +    u64 bk_pg_ptr_mfn     : 40;
>> +    u64 res2              : 10;
>> +    u64 is_running        : 1;
>> +    u64 valid             : 1;
>> +};
> For help with your type-punning fixes in later patches, you'll want
> these as a union.
>
> typedef union avic_logical_id_entry {
>      uint32_t raw;
>      struct {
>          ...
>      };
> } avic_logical_id_entry_t;

I'll put out a V2 with all the changes.

Janak

>
> ~Andrew
>
>> +
>> +extern bool svm_avic;
>> +
>> +int svm_avic_dom_init(struct domain *d);
>> +void svm_avic_dom_destroy(struct domain *d);
>> +
>> +bool svm_avic_vcpu_enabled(const struct vcpu *v);
>> +int svm_avic_init_vmcb(struct vcpu *v);
>> +
>> +#endif /* _SVM_AVIC_H_ */
>> diff --git a/xen/include/asm-x86/hvm/svm/svm.h b/xen/include/asm-x86/hvm/svm/svm.h
>> index 4e5e142910..983750fee9 100644
>> --- a/xen/include/asm-x86/hvm/svm/svm.h
>> +++ b/xen/include/asm-x86/hvm/svm/svm.h
>> @@ -65,6 +65,7 @@ extern u32 svm_feature_flags;
>>   #define SVM_FEATURE_DECODEASSISTS  7 /* Decode assists support */
>>   #define SVM_FEATURE_PAUSEFILTER   10 /* Pause intercept filter support */
>>   #define SVM_FEATURE_PAUSETHRESH   12 /* Pause intercept filter support */
>> +#define SVM_FEATURE_AVIC          13 /* AVIC Support */
>>   #define SVM_FEATURE_VLOADSAVE     15 /* virtual vmload/vmsave */
>>   #define SVM_FEATURE_VGIF          16 /* Virtual GIF */
>>   
>> @@ -79,6 +80,7 @@ extern u32 svm_feature_flags;
>>   #define cpu_has_pause_filter  cpu_has_svm_feature(SVM_FEATURE_PAUSEFILTER)
>>   #define cpu_has_pause_thresh  cpu_has_svm_feature(SVM_FEATURE_PAUSETHRESH)
>>   #define cpu_has_tsc_ratio     cpu_has_svm_feature(SVM_FEATURE_TSCRATEMSR)
>> +#define cpu_has_svm_avic      cpu_has_svm_feature(SVM_FEATURE_AVIC)
>>   #define cpu_has_svm_vloadsave cpu_has_svm_feature(SVM_FEATURE_VLOADSAVE)
>>   
>>   #define SVM_PAUSEFILTER_INIT    4000
>> diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
>> index 591d98fc8c..386aad2260 100644
>> --- a/xen/include/asm-x86/hvm/svm/vmcb.h
>> +++ b/xen/include/asm-x86/hvm/svm/vmcb.h
>> @@ -500,6 +500,21 @@ struct vmcb_struct {
>>   };
>>   
>>   struct svm_domain {
>> +    /*
>> +     * This per-domain table is used by the hardware to locate
>> +     * the vAPIC backing page to be used to deliver interrupts
>> +     * based on the guest physical APIC ID.
>> +     */
>> +    struct avic_physical_id_entry *avic_physical_id_table;
>> +    struct page_info *avic_physical_id_table_pg;
>> +
>> +    /*
>> +     * This per-domain table is used by the hardware to map
>> +     * logically addressed interrupt requests (w/ guest logical APIC id)
>> +     * to the guest physical APIC ID.
>> +     */
>> +    struct avic_logical_id_entry *avic_logical_id_table;
>> +    struct page_info *avic_logical_id_table_pg;
>>   };
>>   
>>   struct arch_svm_struct {
>> @@ -533,6 +548,8 @@ struct arch_svm_struct {
>>           u64 length;
>>           u64 status;
>>       } osvw;
>> +
>> +    struct avic_physical_id_entry *avic_last_phy_id;
>>   };
>>   
>>   struct vmcb_struct *alloc_vmcb(void);


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC
  2018-04-13 17:57   ` Andrew Cooper
@ 2018-04-19 15:54     ` Natarajan, Janakarajan
  2018-04-19 18:18       ` Andrew Cooper
  0 siblings, 1 reply; 35+ messages in thread
From: Natarajan, Janakarajan @ 2018-04-19 15:54 UTC (permalink / raw)
  To: Andrew Cooper, Janakarajan Natarajan, xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, George Dunlap, Tim Deegan, Ian Jackson,
	Julien Grall, Jan Beulich, Boris Ostrovsky

On 4/13/2018 12:57 PM, Andrew Cooper wrote:
> On 04/04/18 00:01, Janakarajan Natarajan wrote:
>> @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d, unsigned int index)
>>       return &d->avic_physical_id_table[index];
>>   }
>>   
>> +static void avic_vcpu_load(struct vcpu *v)
>> +{
>> +    unsigned long tmp;
>> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
>> +    int h_phy_apic_id;
>> +    struct avic_physical_id_entry *entry = (struct avic_physical_id_entry *)&tmp;
>> +
>> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
>> +
>> +    /*
>> +     * Note: APIC ID = 0xff is used for broadcast.
>> +     *       APIC ID > 0xff is reserved.
>> +     */
>> +    h_phy_apic_id = cpu_data[v->processor].apicid;
>> +    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
>> +
>> +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
>> +    entry->host_phy_apic_id = h_phy_apic_id;
>> +    entry->is_running = 1;
>> +    write_atomic((u64*)(s->avic_last_phy_id), tmp);
> What is the purpose of s->avic_last_phy_id ?
>
> As far as I can tell, it is always an unchanging pointer into the
> physical ID table, which is only ever updated synchronously in current
> context.
>
> If so, I don't see why it needs any of these hoops to be jumped through.

s->avic_last_phy_id is used to quickly access the entry in the table.

When the code was pushed for Linux, memory barriers were used, and it was
suggested that atomic operations be used instead to ensure compiler
ordering. The same is done here.

>
> ~Andrew



* Re: [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers
  2018-04-17 12:58   ` Jan Beulich
@ 2018-04-19 16:31     ` Natarajan, Janakarajan
  2018-04-20 20:02     ` Natarajan, Janakarajan
  1 sibling, 0 replies; 35+ messages in thread
From: Natarajan, Janakarajan @ 2018-04-19 16:31 UTC (permalink / raw)
  To: Jan Beulich, Janakarajan Natarajan
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	Jun Nakajima, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Julien Grall, Suravee Suthikulpanit, Boris Ostrovsky

On 4/17/2018 7:58 AM, Jan Beulich wrote:
>>>> On 04.04.18 at 01:01, <Janakarajan.Natarajan@amd.com> wrote:
>> --- a/xen/include/asm-x86/hvm/vlapic.h
>> +++ b/xen/include/asm-x86/hvm/vlapic.h
>> @@ -137,6 +137,10 @@ void vlapic_ipi(struct vlapic *vlapic, uint32_t icr_low,
>> uint32_t icr_high);
>>   
>>   int vlapic_apicv_write(struct vcpu *v, unsigned int offset);
>>   
>> +void vlapic_reg_write(struct vcpu *v, unsigned int offset, uint32_t val);
>> +
>> +uint32_t vlapic_read_aligned(const struct vlapic *vlapic, unsigned int offset);
> If making these non-static is really necessary, they should (name-wise) become
> proper pairs of one another, e.g. renamed the former to vlapic_reg_read().
>
> Also while here you properly use uint32_t, almost everywhere you use u32.
> Please switch this throughout the series, and of course for all other fixed
> width integer types.

Okay. I can change it.

Janak

>
> Jan
>
>



* Re: [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC
  2018-04-19 15:54     ` Natarajan, Janakarajan
@ 2018-04-19 18:18       ` Andrew Cooper
  2018-04-19 23:04         ` Boris Ostrovsky
  2018-04-20 10:12         ` Jan Beulich
  0 siblings, 2 replies; 35+ messages in thread
From: Andrew Cooper @ 2018-04-19 18:18 UTC (permalink / raw)
  To: Natarajan, Janakarajan, Janakarajan Natarajan, xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, George Dunlap, Tim Deegan, Ian Jackson,
	Julien Grall, Jan Beulich, Boris Ostrovsky

On 19/04/18 16:54, Natarajan, Janakarajan wrote:
> On 4/13/2018 12:57 PM, Andrew Cooper wrote:
>> On 04/04/18 00:01, Janakarajan Natarajan wrote:
>>> @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d,
>>> unsigned int index)
>>>       return &d->avic_physical_id_table[index];
>>>   }
>>>   +static void avic_vcpu_load(struct vcpu *v)
>>> +{
>>> +    unsigned long tmp;
>>> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
>>> +    int h_phy_apic_id;
>>> +    struct avic_physical_id_entry *entry = (struct
>>> avic_physical_id_entry *)&tmp;
>>> +
>>> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
>>> +
>>> +    /*
>>> +     * Note: APIC ID = 0xff is used for broadcast.
>>> +     *       APIC ID > 0xff is reserved.
>>> +     */
>>> +    h_phy_apic_id = cpu_data[v->processor].apicid;
>>> +    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
>>> +
>>> +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
>>> +    entry->host_phy_apic_id = h_phy_apic_id;
>>> +    entry->is_running = 1;
>>> +    write_atomic((u64*)(s->avic_last_phy_id), tmp);
>> What is the purpose of s->avic_last_phy_id ?
>>
>> As far as I can tell, it is always an unchanging pointer into the
>> physical ID table, which is only ever updated synchronously in current
>> context.
>>
>> If so, I don't see why it needs any of these hoops to be jumped through.
>
> s->avic_last_phy_id is used to quickly access the entry in the table.
>
> When the code was pushed for Linux, memory barriers were used and it
> was suggested that atomic operations
> be used instead to ensure compiler ordering. The same is done here.

Ok - summing up a conversation on IRC, and some digging around the manual.

Per VM, there is a single Physical APIC Table, which lives in a 4k
page.  This table is referenced by the VMCB, and read by hardware when
processing guest actions.

> The contents of this table are a list of 64-bit entries,

struct __packed avic_physical_id_entry {
    u64 host_phy_apic_id  : 8;
    u64 res1              : 4;
    u64 bk_pg_ptr_mfn     : 40;
    u64 res2              : 10;
    u64 is_running        : 1;
    u64 valid             : 1;
};

which are indexed by guest APIC_ID.

AMD hardware allows writes to the APIC_ID register, but OSes don't do
this in practice (the register is read/discard on some hardware, and
strictly read-only in x2apic).  The implementation in Xen is to crash
the domain if we see a write here, and that is reasonable behaviour
which I don't expect to change going forwards.

As a result, the layout of the Physical APIC Table is fixed based on the
APIC assignment during domain creation.  Also, the bk_pg_ptr_mfn and its
valid bit (valid) are set up during construction, and remain unchanged
for the lifetime of the domain.

The only fields which change during runtime are the host_phys_apic_id,
and its valid bit (is_running), and these change on vcpu context switch.

Therefore, on ctxt_switch_from(), we want a straight __clear_bit() on
e->is_running to signify that the vcpu isn't allocated to a pcpu.

On ctxt_switch_to(), we want a simple

e->host_phy_apic_id = this_pcpu_apic_id;
smp_wmb();
__set_bit(e->is_running);

which guarantees that the host physical apic id field is valid and up to
date, before hardware sees it being reported as valid.  As these changes
are only made in current context, there are no other ordering or
atomicity concerns.

This table is expected to live in regular WB RAM, and the manual has no
comment/reference to requiring special accesses.  Therefore, I'm
moderately confident that the above ordering is sufficient for correct
behaviour, and no explicitly atomic actions are required.

Thoughts/comments/suggestions?

~Andrew


* Re: [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC
  2018-04-19 18:18       ` Andrew Cooper
@ 2018-04-19 23:04         ` Boris Ostrovsky
  2018-04-20 10:09           ` Andrew Cooper
  2018-04-20 10:12         ` Jan Beulich
  1 sibling, 1 reply; 35+ messages in thread
From: Boris Ostrovsky @ 2018-04-19 23:04 UTC (permalink / raw)
  To: Andrew Cooper, Natarajan, Janakarajan, Janakarajan Natarajan, xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, George Dunlap, Tim Deegan, Ian Jackson,
	Julien Grall, Jan Beulich

On 04/19/2018 02:18 PM, Andrew Cooper wrote:
> On 19/04/18 16:54, Natarajan, Janakarajan wrote:
>> On 4/13/2018 12:57 PM, Andrew Cooper wrote:
>>> On 04/04/18 00:01, Janakarajan Natarajan wrote:
>>>> @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d,
>>>> unsigned int index)
>>>>       return &d->avic_physical_id_table[index];
>>>>   }
>>>>   +static void avic_vcpu_load(struct vcpu *v)
>>>> +{
>>>> +    unsigned long tmp;
>>>> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
>>>> +    int h_phy_apic_id;
>>>> +    struct avic_physical_id_entry *entry = (struct
>>>> avic_physical_id_entry *)&tmp;
>>>> +
>>>> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
>>>> +
>>>> +    /*
>>>> +     * Note: APIC ID = 0xff is used for broadcast.
>>>> +     *       APIC ID > 0xff is reserved.
>>>> +     */
>>>> +    h_phy_apic_id = cpu_data[v->processor].apicid;
>>>> +    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
>>>> +
>>>> +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
>>>> +    entry->host_phy_apic_id = h_phy_apic_id;
>>>> +    entry->is_running = 1;
>>>> +    write_atomic((u64*)(s->avic_last_phy_id), tmp);
>>> What is the purpose of s->avic_last_phy_id ?
>>>
>>> As far as I can tell, it is always an unchanging pointer into the
>>> physical ID table, which is only ever updated synchronously in current
>>> context.
>>>
>>> If so, I don't see why it needs any of these hoops to be jumped through.
>> s->avic_last_phy_id is used to quickly access the entry in the table.
>>
>> When the code was pushed for Linux, memory barriers were used and it
>> was suggested that atomic operations
>> be used instead to ensure compiler ordering. The same is done here.
> Ok - summing up a conversation on IRC, and some digging around the manual.
>
> Per VM, there is a single Physical APIC Table, which lives in a 4k
> page.  This table is referenced by the VMCB, and read by hardware when
> processing guest actions.
>
> The contents of this table are a list of 64-bit entries,
>
> struct __packed avic_physical_id_entry {
>     u64 host_phy_apic_id  : 8;
>     u64 res1              : 4;
>     u64 bk_pg_ptr_mfn     : 40;
>     u64 res2              : 10;
>     u64 is_running        : 1;
>     u64 valid             : 1;
> };
>
> which are indexed by guest APIC_ID.
>
> AMD hardware allows writes to the APIC_ID register, but OSes don't do
> this in practice (the register is read/discard on some hardware, and
> strictly read-only in x2apic).  The implementation in Xen is to crash
> the domain if we see a write here, and that is reasonable behaviour
> which I don't expect to change going forwards.
>
> As a result, the layout of the Physical APIC Table is fixed based on the
> APIC assignment during domain creation.  Also, the bk_pg_ptr_mfn and its
> valid bit (valid) are set up during construction, and remain unchanged
> for the lifetime of the domain.
>
> The only fields which change during runtime are the host_phys_apic_id,
> and its valid bit (is_running), and these change on vcpu context switch.
>
> Therefore, on ctxt_switch_from(), we want a straight __clear_bit() on
> e->is_running to signify that the vcpu isn't allocated to a pcpu.
>
> On ctxt_switch_to(), we want a simple
>
> e->host_phy_apic_id = this_pcpu_apic_id;
> smp_wmb();
> __set_bit(e->is_running);
>
> which guarantees that the host physical apic id field is valid and up to
> date, before hardware sees it being reported as valid.  As these changes
> are only made in current context, there are no other ordering or
> atomicity concerns.
>
> This table is expected to live in regular WB RAM, and the manual has no
> comment/reference to requiring special accesses.  Therefore, I'm
> moderately confident that the above ordering is sufficient for correct
> behaviour, and no explicitly atomic actions are required.
>
> Thoughts/comments/suggestions?


The entry can also be written as a single raw 64-bit value (I think you
suggested in one of the reviews to make it a union with a uint64_t).

-boris



* Re: [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC
  2018-04-19 23:04         ` Boris Ostrovsky
@ 2018-04-20 10:09           ` Andrew Cooper
  0 siblings, 0 replies; 35+ messages in thread
From: Andrew Cooper @ 2018-04-20 10:09 UTC (permalink / raw)
  To: Boris Ostrovsky, Natarajan, Janakarajan, Janakarajan Natarajan,
	xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suravee Suthikulpanit, George Dunlap, Tim Deegan, Ian Jackson,
	Julien Grall, Jan Beulich

On 20/04/18 00:04, Boris Ostrovsky wrote:
> On 04/19/2018 02:18 PM, Andrew Cooper wrote:
>> On 19/04/18 16:54, Natarajan, Janakarajan wrote:
>>> On 4/13/2018 12:57 PM, Andrew Cooper wrote:
>>>> On 04/04/18 00:01, Janakarajan Natarajan wrote:
>>>>> @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d,
>>>>> unsigned int index)
>>>>>       return &d->avic_physical_id_table[index];
>>>>>   }
>>>>>   +static void avic_vcpu_load(struct vcpu *v)
>>>>> +{
>>>>> +    unsigned long tmp;
>>>>> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
>>>>> +    int h_phy_apic_id;
>>>>> +    struct avic_physical_id_entry *entry = (struct
>>>>> avic_physical_id_entry *)&tmp;
>>>>> +
>>>>> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
>>>>> +
>>>>> +    /*
>>>>> +     * Note: APIC ID = 0xff is used for broadcast.
>>>>> +     *       APIC ID > 0xff is reserved.
>>>>> +     */
>>>>> +    h_phy_apic_id = cpu_data[v->processor].apicid;
>>>>> +    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
>>>>> +
>>>>> +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
>>>>> +    entry->host_phy_apic_id = h_phy_apic_id;
>>>>> +    entry->is_running = 1;
>>>>> +    write_atomic((u64*)(s->avic_last_phy_id), tmp);
>>>> What is the purpose of s->avic_last_phy_id ?
>>>>
>>>> As far as I can tell, it is always an unchanging pointer into the
>>>> physical ID table, which is only ever updated synchronously in current
>>>> context.
>>>>
>>>> If so, I don't see why it needs any of these hoops to be jumped through.
>>> s->avic_last_phy_id is used to quickly access the entry in the table.
>>>
>>> When the code was pushed for Linux, memory barriers were used and it
>>> was suggested that atomic operations
>>> be used instead to ensure compiler ordering. The same is done here.
>> Ok - summing up a conversation on IRC, and some digging around the manual.
>>
>> Per VM, there is a single Physical APIC Table, which lives in a 4k
>> page.  This table is referenced by the VMCB, and read by hardware when
>> processing guest actions.
>>
>> The contents of this table are a list of 64-bit entries,
>>
>> struct __packed avic_physical_id_entry {
>>     u64 host_phy_apic_id  : 8;
>>     u64 res1              : 4;
>>     u64 bk_pg_ptr_mfn     : 40;
>>     u64 res2              : 10;
>>     u64 is_running        : 1;
>>     u64 valid             : 1;
>> };
>>
>> which are indexed by guest APIC_ID.
>>
>> AMD hardware allows writes to the APIC_ID register, but OSes don't do
>> this in practice (the register is read/discard on some hardware, and
>> strictly read-only in x2apic).  The implementation in Xen is to crash
>> the domain if we see a write here, and that is reasonable behaviour
>> which I don't expect to change going forwards.
>>
>> As a result, the layout of the Physical APIC Table is fixed based on the
>> APIC assignment during domain creation.  Also, the bk_pg_ptr_mfn and its
>> valid bit (valid) are set up during construction, and remain unchanged
>> for the lifetime of the domain.
>>
>> The only fields which change during runtime are the host_phys_apic_id,
>> and its valid bit (is_running), and these change on vcpu context switch.
>>
>> Therefore, on ctxt_switch_from(), we want a straight __clear_bit() on
>> e->is_running to signify that the vcpu isn't allocated to a pcpu.
>>
>> On ctxt_switch_to(), we want a simple
>>
>> e->host_phy_apic_id = this_pcpu_apic_id;
>> smp_wmb();
>> __set_bit(e->is_running);
>>
>> which guarantees that the host physical apic id field is valid and up to
>> date, before hardware sees it being reported as valid.  As these changes
>> are only made in current context, there are no other ordering or
>> atomicity concerns.
>>
>> This table is expected to live in regular WB RAM, and the manual has no
>> comment/reference to requiring special accesses.  Therefore, I'm
>> moderately confident that the above ordering is sufficient for correct
>> behaviour, and no explicitly atomic actions are required.
>>
>> Thoughts/comments/suggestions?
>
> The entry can also be written as a single raw 64-bit value (I think you
> suggested in one of the reviews to make it a union with a uint64_t).

Yes it can, but the handling for that is more complicated and (AFAICT)
unnecessary.

You'd need something like:

e = ACCESS_ONCE(*ptr);
e->host_phy_apic_id = ...;
e->is_running = 1;
ACCESS_ONCE(*ptr) = e;

which would be the most flexible way of expressing the result.

~Andrew


* Re: [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC
  2018-04-19 18:18       ` Andrew Cooper
  2018-04-19 23:04         ` Boris Ostrovsky
@ 2018-04-20 10:12         ` Jan Beulich
  1 sibling, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2018-04-20 10:12 UTC (permalink / raw)
  To: Janakarajan Natarajan, JanakarajanNatarajan, Andrew Cooper
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	George Dunlap, Tim Deegan, Ian Jackson, xen-devel, Julien Grall,
	Jun Nakajima, Boris Ostrovsky

>>> On 19.04.18 at 20:18, <andrew.cooper3@citrix.com> wrote:
> On 19/04/18 16:54, Natarajan, Janakarajan wrote:
>> On 4/13/2018 12:57 PM, Andrew Cooper wrote:
>>> On 04/04/18 00:01, Janakarajan Natarajan wrote:
>>>> @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d,
>>>> unsigned int index)
>>>>       return &d->avic_physical_id_table[index];
>>>>   }
>>>>   +static void avic_vcpu_load(struct vcpu *v)
>>>> +{
>>>> +    unsigned long tmp;
>>>> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
>>>> +    int h_phy_apic_id;
>>>> +    struct avic_physical_id_entry *entry = (struct
>>>> avic_physical_id_entry *)&tmp;
>>>> +
>>>> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
>>>> +
>>>> +    /*
>>>> +     * Note: APIC ID = 0xff is used for broadcast.
>>>> +     *       APIC ID > 0xff is reserved.
>>>> +     */
>>>> +    h_phy_apic_id = cpu_data[v->processor].apicid;
>>>> +    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
>>>> +
>>>> +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
>>>> +    entry->host_phy_apic_id = h_phy_apic_id;
>>>> +    entry->is_running = 1;
>>>> +    write_atomic((u64*)(s->avic_last_phy_id), tmp);
>>> What is the purpose of s->avic_last_phy_id ?
>>>
>>> As far as I can tell, it is always an unchanging pointer into the
>>> physical ID table, which is only ever updated synchronously in current
>>> context.
>>>
>>> If so, I don't see why it needs any of these hoops to be jumped through.
>>
>> s->avic_last_phy_id is used to quickly access the entry in the table.
>>
>> When the code was pushed for Linux, memory barriers were used and it
>> was suggested that atomic operations
>> be used instead to ensure compiler ordering. The same is done here.
> 
> Ok - summing up a conversation on IRC, and some digging around the manual.
> 
> Per VM, there is a single Physical APIC Table, which lives in a 4k
> page.  This table is referenced by the VMCB, and read by hardware when
> processing guest actions.
> 
> The contents of this table are a list of 64-bit entries,
> 
> struct __packed avic_physical_id_entry {
>     u64 host_phy_apic_id  : 8;
>     u64 res1              : 4;
>     u64 bk_pg_ptr_mfn     : 40;
>     u64 res2              : 10;
>     u64 is_running        : 1;
>     u64 valid             : 1;
> };
> 
> which are indexed by guest APIC_ID.
> 
> AMD hardware allows writes to the APIC_ID register, but OSes don't do
> this in practice (the register is read/discard on some hardware, and
> strictly read-only in x2apic).  The implementation in Xen is to crash
> the domain if we see a write here, and that is reasonable behaviour
> which I don't expect to change going forwards.
> 
> As a result, the layout of the Physical APIC Table is fixed based on the
> APIC assignment during domain creation.  Also, the bk_pg_ptr_mfn and its
> valid bit (valid) are set up during construction, and remain unchanged
> for the lifetime of the domain.
> 
> The only fields which change during runtime are the host_phys_apic_id,
> and its valid bit (is_running), and these change on vcpu context switch.
> 
> Therefore, on ctxt_switch_from(), we want a straight __clear_bit() on
> e->is_running to signify that the vcpu isn't allocated to a pcpu.
> 
> On ctxt_switch_to(), we want a simple
> 
> e->host_phy_apic_id = this_pcpu_apic_id;
> smp_wmb();
> __set_bit(e->is_running);
> 
> which guarantees that the host physical apic id field is valid and up to
> date, before hardware sees it being reported as valid.  As these changes
> are only made in current context, there are no other ordering or
> atomicity concerns.

Besides the above not compiling (due to the bitfield nature), why use
two accesses when one suffices? After all, both fields needing an
update live within the same qword.

Jan
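
The single-access update Jan suggests can be sketched as follows. Both runtime-mutable fields live in one qword, so the whole entry can be republished with a single full-width store; the union layout and the `ACCESS_ONCE()` stand-in below are illustrative assumptions, not Xen's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical entry layout mirroring the description above (sketch only). */
typedef union avic_physical_id_entry {
    uint64_t raw;
    struct {
        uint64_t host_phy_apic_id : 8;
        uint64_t res1             : 4;
        uint64_t bk_pg_ptr_mfn    : 40;
        uint64_t res2             : 10;
        uint64_t is_running       : 1;
        uint64_t valid            : 1;
    };
} avic_physical_id_entry_t;

/* Stand-in for Xen's ACCESS_ONCE(): forces a single full-width access. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/*
 * One qword store publishes host_phy_apic_id and is_running together,
 * so hardware can never observe is_running set alongside a stale
 * physical APIC ID, and no barrier between the two updates is needed.
 */
static void ctxt_switch_to(avic_physical_id_entry_t *e,
                           unsigned int this_pcpu_apic_id)
{
    avic_physical_id_entry_t new_entry = { .raw = ACCESS_ONCE(e->raw) };

    new_entry.host_phy_apic_id = this_pcpu_apic_id;
    new_entry.is_running = 1;

    ACCESS_ONCE(e->raw) = new_entry.raw;
}
```

The backing-page pointer and valid bit set at domain construction are carried through unchanged by the read-modify-write.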



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers
  2018-04-17 12:58   ` Jan Beulich
  2018-04-19 16:31     ` Natarajan, Janakarajan
@ 2018-04-20 20:02     ` Natarajan, Janakarajan
  2018-04-23  7:54       ` Jan Beulich
  1 sibling, 1 reply; 35+ messages in thread
From: Natarajan, Janakarajan @ 2018-04-20 20:02 UTC (permalink / raw)
  To: Jan Beulich, Janakarajan Natarajan
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	Jun Nakajima, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Julien Grall, Suravee Suthikulpanit, Boris Ostrovsky

On 4/17/2018 7:58 AM, Jan Beulich wrote:
>>>> On 04.04.18 at 01:01, <Janakarajan.Natarajan@amd.com> wrote:
>> --- a/xen/include/asm-x86/hvm/vlapic.h
>> +++ b/xen/include/asm-x86/hvm/vlapic.h
>> @@ -137,6 +137,10 @@ void vlapic_ipi(struct vlapic *vlapic, uint32_t icr_low,
>> uint32_t icr_high);
>>   
>>   int vlapic_apicv_write(struct vcpu *v, unsigned int offset);
>>   
>> +void vlapic_reg_write(struct vcpu *v, unsigned int offset, uint32_t val);
>> +
>> +uint32_t vlapic_read_aligned(const struct vlapic *vlapic, unsigned int offset);
> If making these non-static is really necessary, they should (name-wise) become
> proper pairs of one another, e.g. renamed the former to vlapic_reg_read().
>
> Also while here you properly use uint32_t, almost everywhere you use u32.
> Please switch this throughout the series, and of course for all other fixed
> width integer types.

In the case of vmcb.h, where the existing variables use u32 and u64,
would you want me to use uint32_t and uint64_t for the new AVIC
variables?

>
> Jan
>
>



* Re: [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers
  2018-04-20 20:02     ` Natarajan, Janakarajan
@ 2018-04-23  7:54       ` Jan Beulich
  0 siblings, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2018-04-23  7:54 UTC (permalink / raw)
  To: Janakarajan Natarajan, Janakarajan Natarajan
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	Jun Nakajima, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Julien Grall, Suravee Suthikulpanit, Boris Ostrovsky

>>> On 20.04.18 at 22:02, <jnataraj@amd.com> wrote:
> On 4/17/2018 7:58 AM, Jan Beulich wrote:
>>>>> On 04.04.18 at 01:01, <Janakarajan.Natarajan@amd.com> wrote:
>>> --- a/xen/include/asm-x86/hvm/vlapic.h
>>> +++ b/xen/include/asm-x86/hvm/vlapic.h
>>> @@ -137,6 +137,10 @@ void vlapic_ipi(struct vlapic *vlapic, uint32_t 
> icr_low,
>>> uint32_t icr_high);
>>>   
>>>   int vlapic_apicv_write(struct vcpu *v, unsigned int offset);
>>>   
>>> +void vlapic_reg_write(struct vcpu *v, unsigned int offset, uint32_t val);
>>> +
>>> +uint32_t vlapic_read_aligned(const struct vlapic *vlapic, unsigned int offset);
>> If making these non-static is really necessary, they should (name-wise) become
>> proper pairs of one another, e.g. renamed the former to vlapic_reg_read().
>>
>> Also while here you properly use uint32_t, almost everywhere you use u32.
>> Please switch this throughout the series, and of course for all other fixed
>> width integer types.
> 
> In the case of vmcb.h, where the existing variables use u32 and u64,
> would you want me to use uint32_t and uint64_t for the new AVIC
> variables?

Ideally, yes (and even more ideally you'd find the time to switch the others
around), but I wouldn't make this a strict requirement (even less so as I'm
not a maintainer of those files anyway) - I can see the consistency
argument.

Jan




* Re: [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code
  2018-04-03 23:01 ` [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code Janakarajan Natarajan
  2018-04-13 17:35   ` Andrew Cooper
  2018-04-16 15:55   ` Jan Beulich
@ 2018-04-23 19:33   ` Konrad Rzeszutek Wilk
  2018-04-23 20:26     ` Natarajan, Janakarajan
  2 siblings, 1 reply; 35+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-23 19:33 UTC (permalink / raw)
  To: Janakarajan Natarajan
  Cc: Jun Nakajima, Tim Deegan, Kevin Tian, Stefano Stabellini,
	Wei Liu, Suravee Suthikulpanit, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Julien Grall, Jan Beulich,
	Boris Ostrovsky

> +++ b/xen/arch/x86/hvm/svm/avic.c
> @@ -0,0 +1,191 @@
> +/*
> + * avic.c: implements AMD Advanced Virtual Interrupt Controller (AVIC) support
> + * Copyright (c) 2016, Advanced Micro Devices, Inc.

Not 2018?


* Re: [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code
  2018-04-23 19:33   ` Konrad Rzeszutek Wilk
@ 2018-04-23 20:26     ` Natarajan, Janakarajan
  0 siblings, 0 replies; 35+ messages in thread
From: Natarajan, Janakarajan @ 2018-04-23 20:26 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Janakarajan Natarajan
  Cc: Jun Nakajima, Tim Deegan, Kevin Tian, Stefano Stabellini,
	Wei Liu, Suravee Suthikulpanit, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Julien Grall, Jan Beulich,
	Boris Ostrovsky

On 4/23/2018 2:33 PM, Konrad Rzeszutek Wilk wrote:
>> +++ b/xen/arch/x86/hvm/svm/avic.c
>> @@ -0,0 +1,191 @@
>> +/*
>> + * avic.c: implements AMD Advanced Virtual Interrupt Controller (AVIC) support
>> + * Copyright (c) 2016, Advanced Micro Devices, Inc.
> Not 2018?

Ah good catch. Yeah it should be 2018.





* Re: [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code
  2018-04-16 15:55   ` Jan Beulich
  2018-04-16 16:15     ` Roger Pau Monné
@ 2018-04-26 15:32     ` Natarajan, Janakarajan
  2018-04-26 15:39       ` Andrew Cooper
  1 sibling, 1 reply; 35+ messages in thread
From: Natarajan, Janakarajan @ 2018-04-26 15:32 UTC (permalink / raw)
  To: Jan Beulich, Janakarajan Natarajan
  Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
	Jun Nakajima, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, Julien Grall, Suravee Suthikulpanit, Boris Ostrovsky

On 4/16/2018 10:55 AM, Jan Beulich wrote:
>>>> On 04.04.18 at 01:01, <Janakarajan.Natarajan@amd.com> wrote:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> Introduce AVIC base initialization code. This includes:
>>      * Setting up per-VM data structures.
>>      * Setting up per-vCPU data structure.
>>      * Initializing AVIC-related VMCB bit fields.
>>
>> This patch also introduces a new Xen parameter (svm-avic),
>> which can be used to enable/disable AVIC support.
> This sentence looks stale/misplaced now.
>
>> --- /dev/null
>> +++ b/xen/arch/x86/hvm/svm/avic.c
>> @@ -0,0 +1,191 @@
>> +/*
>> + * avic.c: implements AMD Advanced Virtual Interrupt Controller (AVIC) support
>> + * Copyright (c) 2016, Advanced Micro Devices, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/domain_page.h>
>> +#include <xen/sched.h>
>> +#include <xen/stdbool.h>
>> +#include <asm/acpi.h>
>> +#include <asm/apicdef.h>
>> +#include <asm/atomic.h>
>> +#include <asm/event.h>
>> +#include <asm/hvm/emulate.h>
>> +#include <asm/hvm/nestedhvm.h>
>> +#include <asm/hvm/support.h>
>> +#include <asm/hvm/svm/avic.h>
>> +#include <asm/hvm/vlapic.h>
>> +#include <asm/p2m.h>
>> +#include <asm/page.h>
>> +
>> +/*
>> + * Note: Current max index allowed for physical APIC ID table is 255.
>> + */
> This is a single line comment.
>
>> +#define AVIC_PHY_APIC_ID_MAX    0xFF
> Is this really an AVIC-specific constant, rather than e.g.
> GET_xAPIC_ID(APIC_ID_MASK)?
>
>> +#define AVIC_VAPIC_BAR_MASK     (((1ULL << 40) - 1) << PAGE_SHIFT)
>> +
>> +/*
>> + * Note:
>> + * Currently, svm-avic mode is not supported with nested virtualization.
>> + * Therefore, it is not yet currently enabled by default. Once the support
>> + * is in-place, this should be enabled by default.
>> + */
>> +bool svm_avic = 0;
> false (or simply omit the intializer)
>
>> +int svm_avic_dom_init(struct domain *d)
>> +{
>> +    int ret = 0;
>> +    struct page_info *pg;
>> +
>> +    if ( !svm_avic || !has_vlapic(d) )
>> +        return 0;
>> +
>> +    /*
>> +     * Note:
>> +     * AVIC hardware walks the nested page table to check permissions,
>> +     * but does not use the SPA address specified in the leaf page
>> +     * table entry since it uses the address in the AVIC_BACKING_PAGE pointer
>> +     * field of the VMCB. Therefore, we set up a dummy page for APIC _mfn(0).
>> +     */
>> +    set_mmio_p2m_entry(d, paddr_to_pfn(APIC_DEFAULT_PHYS_BASE),
>> +                       _mfn(0), PAGE_ORDER_4K,
>> +                       p2m_get_hostp2m(d)->default_access);
> The use of MFN 0 here looks risky to me: How do you guarantee nothing else
> might ever use that P2M entry?

Do you have any suggestions for an alternative to MFN 0 that is 
guaranteed to never be used?

Thanks,
Janak

>
>> +    /* Init AVIC logical APIC ID table */
>> +    pg = alloc_domheap_page(d, MEMF_no_owner);
>> +    if ( !pg )
>> +    {
>> +        ret = -ENOMEM;
>> +        goto err_out;
>> +    }
>> +    clear_domain_page(_mfn(page_to_mfn(pg)));
>> +    d->arch.hvm_domain.svm.avic_logical_id_table_pg = pg;
>> +    d->arch.hvm_domain.svm.avic_logical_id_table = __map_domain_page_global(pg);
> Both here and ...
>
>> +    /* Init AVIC physical APIC ID table */
>> +    pg = alloc_domheap_page(d, MEMF_no_owner);
>> +    if ( !pg )
>> +    {
>> +        ret = -ENOMEM;
>> +        goto err_out;
>> +    }
>> +    clear_domain_page(_mfn(page_to_mfn(pg)));
>> +    d->arch.hvm_domain.svm.avic_physical_id_table_pg = pg;
>> +    d->arch.hvm_domain.svm.avic_physical_id_table = __map_domain_page_global(pg);
> ... here: Do you really need to store both page pointer and map address?
>
>> +void svm_avic_dom_destroy(struct domain *d)
>> +{
>> +    if ( !svm_avic || !has_vlapic(d) )
>> +        return;
>> +
>> +    if ( d->arch.hvm_domain.svm.avic_physical_id_table )
>> +    {
>> +        unmap_domain_page_global(d->arch.hvm_domain.svm.avic_physical_id_table);
>> +        free_domheap_page(d->arch.hvm_domain.svm.avic_physical_id_table_pg);
>> +        d->arch.hvm_domain.svm.avic_physical_id_table_pg = 0;
>> +        d->arch.hvm_domain.svm.avic_physical_id_table = 0;
> DYM NULL?
>
>> +int svm_avic_init_vmcb(struct vcpu *v)
>> +{
>> +    u32 apic_id;
>> +    unsigned long tmp;
>> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
>> +    struct vmcb_struct *vmcb = s->vmcb;
>> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
>> +    const struct vlapic *vlapic = vcpu_vlapic(v);
>> +    struct avic_physical_id_entry *entry = (struct avic_physical_id_entry *)&tmp;
>> +
>> +    if ( !svm_avic || !has_vlapic(v->domain) )
>> +        return 0;
>> +
>> +    if ( !vlapic || !vlapic->regs_page )
>> +        return -EINVAL;
>> +
>> +    apic_id = vlapic_read_aligned(vcpu_vlapic(v), APIC_ID);
>> +    s->avic_last_phy_id = avic_get_physical_id_entry(d, GET_xAPIC_ID(apic_id));
>> +    if ( !s->avic_last_phy_id )
>> +        return -EINVAL;
>> +
>> +    vmcb->avic_bk_pg_pa = page_to_maddr(vlapic->regs_page);
>> +    vmcb->avic_logical_id_table_pa = domain_page_map_to_mfn(d->avic_logical_id_table) << PAGE_SHIFT;
>> +    vmcb->avic_physical_id_table_pa = domain_page_map_to_mfn(d->avic_physical_id_table) << PAGE_SHIFT;
> DYM something like mfn_to_maddr()? Please don't open-code things.
>
> Jan
>



* Re: [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code
  2018-04-26 15:32     ` Natarajan, Janakarajan
@ 2018-04-26 15:39       ` Andrew Cooper
  0 siblings, 0 replies; 35+ messages in thread
From: Andrew Cooper @ 2018-04-26 15:39 UTC (permalink / raw)
  To: Natarajan, Janakarajan, Jan Beulich, Janakarajan Natarajan
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	George Dunlap, Tim Deegan, Ian Jackson, xen-devel, Julien Grall,
	Jun Nakajima, Boris Ostrovsky

On 26/04/18 16:32, Natarajan, Janakarajan wrote:
> On 4/16/2018 10:55 AM, Jan Beulich wrote:
>>>>> On 04.04.18 at 01:01, <Janakarajan.Natarajan@amd.com> wrote:
>>>
>>> +int svm_avic_dom_init(struct domain *d)
>>> +{
>>> +    int ret = 0;
>>> +    struct page_info *pg;
>>> +
>>> +    if ( !svm_avic || !has_vlapic(d) )
>>> +        return 0;
>>> +
>>> +    /*
>>> +     * Note:
>>> +     * AVIC hardware walks the nested page table to check permissions,
>>> +     * but does not use the SPA address specified in the leaf page
>>> +     * table entry since it uses the address in the AVIC_BACKING_PAGE pointer
>>> +     * field of the VMCB. Therefore, we set up a dummy page for APIC _mfn(0).
>>> +     */
>>> +    set_mmio_p2m_entry(d, paddr_to_pfn(APIC_DEFAULT_PHYS_BASE),
>>> +                       _mfn(0), PAGE_ORDER_4K,
>>> +                       p2m_get_hostp2m(d)->default_access);
>> The use of MFN 0 here looks risky to me: How do you guarantee nothing
>> else
>> might ever use that P2M entry?
>
> Do you have any suggestions for an alternative to MFN 0 that is
> guaranteed to never be used?

static const char __section(".bss.page_aligned.const") __aligned(PAGE_SIZE)
avic_backing_page[PAGE_SIZE];

However, don't writes outside of APIC registers get dumped into this
page?  If so, you need to allocate a new page per guest.  If not, where
do the writes outside of the APIC registers get dumped?

~Andrew


* Re: [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers
  2018-04-13 17:48   ` Andrew Cooper
@ 2018-04-30 21:23     ` Natarajan, Janakarajan
  2018-04-30 23:06       ` Andrew Cooper
  0 siblings, 1 reply; 35+ messages in thread
From: Natarajan, Janakarajan @ 2018-04-30 21:23 UTC (permalink / raw)
  To: Andrew Cooper, Natarajan, Janakarajan, xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suthikulpanit, Suravee, George Dunlap, Tim Deegan, Ian Jackson,
	Julien Grall, Jan Beulich, Boris Ostrovsky



On 4/13/2018 12:48 PM, Andrew Cooper wrote:
> On 04/04/18 00:01, Janakarajan Natarajan wrote:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> AVIC introduces two new #vmexit handlers:
>>
>> VMEXIT_INCOMP_IPI:
>> This occurs when an IPI could not be delivered to all targeted guest
>> virtual processors because at least one guest virtual processor
>> was not allocated to a physical core at the time. In this case,
>> Xen would try to emulate the IPI.
>>
>> VMEXIT_DO_NOACCEL:
>> This occurs when a guest accesses an APIC register that cannot be
>> accelerated by AVIC. In this case, Xen tries to emulate register accesses.
>>
>> This fault is also generated if an EOI is attempted when the highest priority
>> in-service interrupt is set for level-triggered mode.
>>
>> This patch also declares vlapic_read_aligned() and vlapic_reg_write()
>> as non-static so they can be used by AVIC.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
>> ---
>>   xen/arch/x86/hvm/svm/avic.c        | 296 +++++++++++++++++++++++++++++++++++++
>>   xen/arch/x86/hvm/svm/svm.c         |   8 +
>>   xen/arch/x86/hvm/vlapic.c          |   4 +-
>>   xen/include/asm-x86/hvm/svm/avic.h |   3 +
>>   xen/include/asm-x86/hvm/svm/vmcb.h |   6 +
>>   xen/include/asm-x86/hvm/vlapic.h   |   4 +
>>   6 files changed, 319 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
>> index 8108698911..e112469774 100644
>> --- a/xen/arch/x86/hvm/svm/avic.c
>> +++ b/xen/arch/x86/hvm/svm/avic.c
>> @@ -19,6 +19,7 @@
>>   #include <xen/sched.h>
>>   #include <xen/stdbool.h>
>>   #include <asm/acpi.h>
>> +#include <asm/apic.h>
>>   #include <asm/apicdef.h>
>>   #include <asm/atomic.h>
>>   #include <asm/event.h>
>> @@ -37,6 +38,8 @@
>>   
>>   #define AVIC_VAPIC_BAR_MASK     (((1ULL << 40) - 1) << PAGE_SHIFT)
>>   
>> +#define AVIC_UNACCEL_ACCESS_OFFSET_MASK    0xFF0
>> +
>>   /*
>>    * Note:
>>    * Currently, svm-avic mode is not supported with nested virtualization.
>> @@ -101,6 +104,8 @@ int svm_avic_dom_init(struct domain *d)
>>       d->arch.hvm_domain.svm.avic_physical_id_table_pg = pg;
>>       d->arch.hvm_domain.svm.avic_physical_id_table = __map_domain_page_global(pg);
>>   
>> +    spin_lock_init(&d->arch.hvm_domain.svm.avic_dfr_mode_lock);
>> +
>>       return ret;
>>    err_out:
>>       svm_avic_dom_destroy(d);
>> @@ -181,6 +186,297 @@ int svm_avic_init_vmcb(struct vcpu *v)
>>   }
>>   
>>   /*
>> + * Note:
>> + * This function handles the AVIC_INCOMP_IPI #vmexit when AVIC is enabled.
>> + * The hardware generates this fault when an IPI could not be delivered
>> + * to all targeted guest virtual processors because at least one guest
>> + * virtual processor was not allocated to a physical core at the time.
>> + */
>> +void svm_avic_vmexit_do_incomp_ipi(struct cpu_user_regs *regs)
>> +{
>> +    struct vcpu *curr = current;
>> +    struct domain *currd = curr->domain;
>> +    struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
>> +    u32 icrh = vmcb->exitinfo1 >> 32;
>> +    u32 icrl = vmcb->exitinfo1;
>> +    u32 id = vmcb->exitinfo2 >> 32;
>> +    u32 index = vmcb->exitinfo2 & 0xFF;
>> +
>> +    switch ( id )
>> +    {
>> +    case AVIC_INCMP_IPI_ERR_INVALID_INT_TYPE:
>> +        /*
>> +         * AVIC hardware handles the delivery of
>> +         * IPIs when the specified Message Type is Fixed
>> +         * (also known as fixed delivery mode) and
>> +         * the Trigger Mode is edge-triggered. The hardware
>> +         * also supports self and broadcast delivery modes
>> +         * specified via the Destination Shorthand(DSH)
>> +         * field of the ICRL. Logical and physical APIC ID
>> +         * formats are supported. All other IPI types cause
>> +         * a #VMEXIT, which needs to be emulated.
>> +         */
>> +        vlapic_reg_write(curr, APIC_ICR2, icrh);
>> +        vlapic_reg_write(curr, APIC_ICR, icrl);
>> +        break;
>> +
>> +    case AVIC_INCMP_IPI_ERR_TARGET_NOT_RUN:
>> +    {
>> +        /*
>> +         * At this point, we expect that the AVIC HW has already
>> +         * set the appropriate IRR bits on the valid target
>> +         * vcpus. So, we just need to kick the appropriate vcpu.
>> +         */
>> +        struct vcpu *v;
>> +        uint32_t dest = GET_xAPIC_DEST_FIELD(icrh);
>> +        uint32_t short_hand = icrl & APIC_SHORT_MASK;
>> +        bool dest_mode = !!(icrl & APIC_DEST_MASK);
> No need for !!.  It is the explicit behaviour of the bool type.
>
>> +
>> +        for_each_vcpu ( currd,  v )
>> +        {
>> +            if ( v != curr &&
>> +                 vlapic_match_dest(vcpu_vlapic(v), vcpu_vlapic(curr),
>> +                                   short_hand, dest, dest_mode) )
>> +            {
>> +                vcpu_kick(v);
>> +                break;
>> +            }
>> +        }
>> +        break;
>> +    }
>> +
>> +    case AVIC_INCMP_IPI_ERR_INV_TARGET:
>> +        gprintk(XENLOG_ERR,
>> +                "SVM: %s: Invalid IPI target (icr=%#08x:%08x, idx=%u)\n",
>> +                __func__, icrh, icrl, index);
>> +        domain_crash(currd);
>> +        break;
>> +
>> +    case AVIC_INCMP_IPI_ERR_INV_BK_PAGE:
>> +        gprintk(XENLOG_ERR,
>> +                "SVM: %s: Invalid bk page (icr=%#08x:%08x, idx=%u)\n",
>> +                __func__, icrh, icrl, index);
>> +        domain_crash(currd);
>> +        break;
>> +
>> +    default:
>> +        gprintk(XENLOG_ERR, "SVM: %s: Unknown IPI interception (%#x)\n",
>> +                __func__, id);
>> +        domain_crash(currd);
>> +    }
>> +}
>> +
>> +static struct avic_logical_id_entry *
>> +avic_get_logical_id_entry(struct svm_domain *d, u32 ldr, bool flat)
>> +{
>> +    unsigned int index;
>> +    unsigned int dest_id = GET_xAPIC_LOGICAL_ID(ldr);
>> +
>> +    if ( !dest_id )
>> +        return NULL;
>> +
>> +    if ( flat )
>> +    {
>> +        index = ffs(dest_id) - 1;
>> +        if ( index > 7 )
>> +            return NULL;
>> +    }
>> +    else
>> +    {
>> +        unsigned int cluster = (dest_id & 0xf0) >> 4;
>> +        int apic = ffs(dest_id & 0x0f) - 1;
>> +
>> +        if ( (apic < 0) || (apic > 7) || (cluster >= 0xf) )
>> +            return NULL;
>> +        index = (cluster << 2) + apic;
>> +    }
>> +
>> +    ASSERT(index <= 255);
>> +
>> +    return &d->avic_logical_id_table[index];
>> +}
>> +
>> +static int avic_ldr_write(struct vcpu *v, u8 g_phy_id, u32 ldr, bool valid)
>> +{
>> +    struct avic_logical_id_entry *entry, new_entry;
>> +    u32 dfr = vlapic_read_aligned(vcpu_vlapic(v), APIC_DFR);
>> +
>> +    entry = avic_get_logical_id_entry(&v->domain->arch.hvm_domain.svm,
>> +                                      ldr, (dfr == APIC_DFR_FLAT));
>> +    if (!entry)
> if ( !entry )
>
>> +        return -EINVAL;
>> +
>> +    new_entry = *entry;
>> +    smp_rmb();
>> +    new_entry.guest_phy_apic_id = g_phy_id;
>> +    new_entry.valid = valid;
>> +    *entry = new_entry;
>> +    smp_wmb();
> These barriers don't do what you want.  The pattern you are looking for
> would require an smp_mb() between setting valid and writing things
> back.  However, that is overkill - all that matters is that the compiler
> doesn't generate multiple partial updates.
>
> new_entry.raw = ACCESS_ONCE(entry->raw);
>
> new_entry.guest_phy_apic_id = g_phy_id;
> new_entry.valid = valid;
>
> ACCESS_ONCE(entry->raw) = new_entry.raw;

Since it was decided to not use

union ... {
         uint64_t raw;
         struct avic_logical_table_entry {
         ....
         ....
         };
};

would smp_mb() make sense here?

Thanks,
Janak

>
>> +
>> +    return 0;
>> +}
>> +
>> +static int avic_handle_ldr_update(struct vcpu *v)
>> +{
>> +    int ret = 0;
>> +    u32 ldr = vlapic_read_aligned(vcpu_vlapic(v), APIC_LDR);
>> +    u32 apic_id = vlapic_read_aligned(vcpu_vlapic(v), APIC_ID);
>> +
>> +    if ( !ldr )
>> +        return -EINVAL;
>> +
>> +    ret = avic_ldr_write(v, GET_xAPIC_ID(apic_id), ldr, true);
>> +    if ( ret && v->arch.hvm_svm.avic_last_ldr )
>> +    {
>> +        /*
>> +         * Note:
>> +         * In case of failure to update LDR register,
>> +         * we set the guest physical APIC ID to 0,
>> +         * and set the entry logical APID ID entry
>> +         * to invalid (false).
>> +         */
>> +        avic_ldr_write(v, 0, v->arch.hvm_svm.avic_last_ldr, false);
>> +        v->arch.hvm_svm.avic_last_ldr = 0;
>> +    }
>> +    else
>> +    {
>> +        /*
>> +         * Note:
>> +         * This saves the last valid LDR so that we
>> +         * know which entry in the logical APIC ID
>> +         * table to clean up when the LDR is updated.
>> +         */
>> +        v->arch.hvm_svm.avic_last_ldr = ldr;
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +static int avic_handle_dfr_update(struct vcpu *v)
>> +{
>> +    u32 mod;
>> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
>> +    u32 dfr = vlapic_read_aligned(vcpu_vlapic(v), APIC_DFR);
>> +
>> +    mod = (dfr >> 28) & 0xFu;
>> +
>> +    spin_lock(&d->avic_dfr_mode_lock);
>> +    if ( d->avic_dfr_mode != mod )
>> +    {
>> +        /*
>> +         * We assume that all local APICs are using the same type.
>> +         * If DFR mode changes, we need to flush the domain AVIC logical
>> +         * APIC id table.
>> +         */
>> +        clear_domain_page(_mfn(page_to_mfn(d->avic_logical_id_table_pg)));
>> +        d->avic_dfr_mode = mod;
>> +    }
>> +    spin_unlock(&d->avic_dfr_mode_lock);
>> +
>> +    if ( v->arch.hvm_svm.avic_last_ldr )
>> +        avic_handle_ldr_update(v);
>> +
>> +    return 0;
>> +}
>> +
>> +static int avic_unaccel_trap_write(struct vcpu *v)
>> +{
>> +    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
>> +    u32 offset = vmcb->exitinfo1 & AVIC_UNACCEL_ACCESS_OFFSET_MASK;
>> +    u32 reg = vlapic_read_aligned(vcpu_vlapic(v), offset);
>> +
>> +    switch ( offset )
>> +    {
>> +    case APIC_ID:
>> +        /*
>> +         * Currently, we do not support APIC_ID update while
>> +         * the vcpus are running, which might require updating
>> +         * AVIC max APIC ID in all VMCBs. This would require
>> +         * synchronize update on all running VCPUs.
>> +         */
>> +        return X86EMUL_UNHANDLEABLE;
> Newline please.
>
>> +    case APIC_LDR:
>> +        if ( avic_handle_ldr_update(v) )
>> +            return X86EMUL_UNHANDLEABLE;
>> +        break;
>> +
>> +    case APIC_DFR:
>> +        if ( avic_handle_dfr_update(v) )
>> +            return X86EMUL_UNHANDLEABLE;
>> +        break;
>> +
>> +    default:
>> +        break;
>> +    }
>> +
>> +    vlapic_reg_write(v, offset, reg);
>> +
>> +    return X86EMUL_OKAY;
>> +}
>> +
>> +static inline bool avic_is_trap(u32 offset)
>> +{
>> +    u32 pos = offset >> 4;
>> +    static const unsigned long avic_trap[] =
>> +        {
>> +#define REG(x) (1UL << (APIC_ ## x >> 4))
>> +            REG(ID)   | REG(EOI)     | REG(RRR)   | REG(LDR)  |
>> +            REG(DFR)  | REG(SPIV)    | REG(ESR)   | REG(ICR)  |
>> +            REG(LVTT) | REG(LVTTHMR) | REG(LVTPC) | REG(LVT0) |
>> +            REG(LVT1) | REG(LVTERR)  | REG(TMICT) | REG(TDCR)
>> +#undef REG
>> +        };
> I know I'm the author of the piece of code you've copied here, but I am
> in the process of trying to fix its style.
>
> static const unsigned long avic_trap[] = {
> #define REG(x) (1UL << (APIC_ ## x >> 4))
>      REG(ID)   | REG(EOI)     | REG(RRR)   | REG(LDR)  |
>      ....
> #undef REG
> };
>
>> +
>> +    if ( !test_bit(pos, avic_trap) )
>> +        return false;
>> +    return true;
> return pos < (sizeof(avic_trap) * 8) && test_bit(pos, avic_trap);
>
> You need to avoid reading beyond the end of avic_trap[].
>
> ~Andrew

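The bounds check Andrew asks for can be sketched as follows. The register offsets are architectural xAPIC values, but the table is abbreviated and `test_bit()` is a simplified stand-in for the kernel helper; offsets whose bit position lies beyond the bitmap now simply report "not a trap" instead of reading past the array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A few xAPIC register offsets (architectural values). */
#define APIC_ID   0x20
#define APIC_EOI  0xB0
#define APIC_LDR  0xD0
#define APIC_DFR  0xE0
#define APIC_ICR  0x300

/* Minimal stand-in for the kernel's test_bit() helper. */
static bool test_bit(unsigned int nr, const uint64_t *addr)
{
    return (addr[nr / 64] >> (nr % 64)) & 1;
}

/* Trap check with the bounds test applied first (abbreviated set). */
static bool avic_is_trap(uint32_t offset)
{
    uint32_t pos = offset >> 4;
    static const uint64_t avic_trap[] = {
#define REG(x) (1ULL << (APIC_ ## x >> 4))
        REG(ID) | REG(EOI) | REG(LDR) | REG(DFR) | REG(ICR)
#undef REG
    };

    return pos < sizeof(avic_trap) * 8 && test_bit(pos, avic_trap);
}
```

Without the `pos < sizeof(avic_trap) * 8` clause, an offset such as 0x800 would index past the single-element bitmap.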


* Re: [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers
  2018-04-30 21:23     ` Natarajan, Janakarajan
@ 2018-04-30 23:06       ` Andrew Cooper
  0 siblings, 0 replies; 35+ messages in thread
From: Andrew Cooper @ 2018-04-30 23:06 UTC (permalink / raw)
  To: Natarajan, Janakarajan, Natarajan, Janakarajan, xen-devel
  Cc: Jun Nakajima, Kevin Tian, Stefano Stabellini, Wei Liu,
	Suthikulpanit, Suravee, George Dunlap, Tim Deegan, Ian Jackson,
	Julien Grall, Jan Beulich, Boris Ostrovsky

On 30/04/2018 22:23, Natarajan, Janakarajan wrote:
>>> +static int avic_ldr_write(struct vcpu *v, u8 g_phy_id, u32 ldr,
>>> bool valid)
>>> +{
>>> +    struct avic_logical_id_entry *entry, new_entry;
>>> +    u32 dfr = vlapic_read_aligned(vcpu_vlapic(v), APIC_DFR);
>>> +
>>> +    entry = avic_get_logical_id_entry(&v->domain->arch.hvm_domain.svm,
>>> +                                      ldr, (dfr == APIC_DFR_FLAT));
>>> +    if (!entry)
>> if ( !entry )
>>
>>> +        return -EINVAL;
>>> +
>>> +    new_entry = *entry;
>>> +    smp_rmb();
>>> +    new_entry.guest_phy_apic_id = g_phy_id;
>>> +    new_entry.valid = valid;
>>> +    *entry = new_entry;
>>> +    smp_wmb();
>> These barriers don't do what you want.  The pattern you are looking for
>> would require an smp_mb() between setting valid and writing things
>> back.  However, that is overkill - all that matters is that the compiler
>> doesn't generate multiple partial updates.
>>
>> new_entry.raw = ACCESS_ONCE(entry->raw);
>>
>> new_entry.guest_phy_apic_id = g_phy_id;
>> new_entry.valid = valid;
>>
>> ACCESS_ONCE(entry->raw) = new_entry.raw;
>
> Since it was decided to not use
>
> union ... {
>         uint64_t raw;
>         struct avic_logical_table_entry {
>         ....
>         ....
>         };
> };
>
> would smp_mb() make sense here?

Nothing has been decided yet.  I've made an argument for why the
minimalistic approach (as suggested in in the thread hanging off patch
4, which supersedes this) would be correct (and best, IMO) but the
decision as to what code to use is ultimately up to you as the submitter
(subject to it being functionally correct, which is the purpose of review).

In the ACCESS_ONCE() case, all the assignments are strictly dependent on
previous reads, which forces the compiler and pipeline to issue them in
order, so I don't see any reason for an mfence instruction.

~Andrew
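
A self-contained sketch of that pattern applied to the logical ID table follows. The entry layout, the `ACCESS_ONCE()` definition, and the helper names are illustrative assumptions rather than Xen's actual code; the index computation mirrors the flat/cluster logic quoted earlier in the thread, and the update itself is a single full-width store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <strings.h> /* ffs() */

#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical logical APIC ID table entry (sketch only). */
typedef union avic_logical_id_entry {
    uint32_t raw;
    struct {
        uint32_t guest_phy_apic_id : 8;
        uint32_t res               : 23;
        uint32_t valid             : 1;
    };
} avic_logical_id_entry_t;

/*
 * Map a logical destination ID (LDR[31:24]) to a table index: flat
 * mode indexes by bit position; cluster mode packs the cluster number
 * and the APIC bit within it. Returns -1 if the ID is unmappable.
 */
static int logical_id_index(uint8_t dest_id, bool flat)
{
    if ( !dest_id )
        return -1;

    if ( flat )
    {
        int index = ffs(dest_id) - 1;
        return index > 7 ? -1 : index;
    }
    else
    {
        unsigned int cluster = (dest_id & 0xf0) >> 4;
        int apic = ffs(dest_id & 0x0f) - 1;

        if ( apic < 0 || cluster >= 0xf )
            return -1;
        return (cluster << 2) + apic;
    }
}

/*
 * Update both fields with one full-width store: every assignment
 * depends on the initial read, so no fence is required, and
 * ACCESS_ONCE() stops the compiler emitting partial updates.
 */
static void ldr_entry_update(avic_logical_id_entry_t *e,
                             uint8_t g_phy_id, bool valid)
{
    avic_logical_id_entry_t new_entry = { .raw = ACCESS_ONCE(e->raw) };

    new_entry.guest_phy_apic_id = g_phy_id;
    new_entry.valid = valid;

    ACCESS_ONCE(e->raw) = new_entry.raw;
}
```

Clearing an entry is then the same helper with `g_phy_id = 0` and `valid = false`, matching the failure path described in avic_handle_ldr_update().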


end of thread, other threads:[~2018-04-30 23:06 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-03 23:01 [PATCH 0/8] Introduce AMD SVM AVIC Janakarajan Natarajan
2018-04-03 23:01 ` [PATCH 1/8] x86/SVM: Modify VMCB fields to add AVIC support Janakarajan Natarajan
2018-04-03 23:01 ` [PATCH 2/8] x86/HVM/SVM: Add AVIC initialization code Janakarajan Natarajan
2018-04-13 17:35   ` Andrew Cooper
2018-04-19 15:46     ` Natarajan, Janakarajan
2018-04-16 15:55   ` Jan Beulich
2018-04-16 16:15     ` Roger Pau Monné
2018-04-26 15:32     ` Natarajan, Janakarajan
2018-04-26 15:39       ` Andrew Cooper
2018-04-23 19:33   ` Konrad Rzeszutek Wilk
2018-04-23 20:26     ` Natarajan, Janakarajan
2018-04-03 23:01 ` [PATCH 3/8] x86/SVM: Add AVIC vmexit handlers Janakarajan Natarajan
2018-04-13 17:48   ` Andrew Cooper
2018-04-30 21:23     ` Natarajan, Janakarajan
2018-04-30 23:06       ` Andrew Cooper
2018-04-17 12:58   ` Jan Beulich
2018-04-19 16:31     ` Natarajan, Janakarajan
2018-04-20 20:02     ` Natarajan, Janakarajan
2018-04-23  7:54       ` Jan Beulich
2018-04-03 23:01 ` [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC Janakarajan Natarajan
2018-04-13 17:57   ` Andrew Cooper
2018-04-19 15:54     ` Natarajan, Janakarajan
2018-04-19 18:18       ` Andrew Cooper
2018-04-19 23:04         ` Boris Ostrovsky
2018-04-20 10:09           ` Andrew Cooper
2018-04-20 10:12         ` Jan Beulich
2018-04-03 23:01 ` [PATCH 5/8] x86/SVM: Add interrupt management code via AVIC Janakarajan Natarajan
2018-04-17 13:06   ` Jan Beulich
2018-04-03 23:01 ` [PATCH 6/8] x86/HVM: Hook up miscellaneous AVIC functions Janakarajan Natarajan
2018-04-17 13:10   ` Jan Beulich
2018-04-03 23:01 ` [PATCH 7/8] x86/SVM: Introduce svm command line option Janakarajan Natarajan
2018-04-13 18:04   ` Andrew Cooper
2018-04-03 23:01 ` [PATCH 8/8] x86/SVM: Add AMD AVIC key handler Janakarajan Natarajan
2018-04-17 13:36   ` Jan Beulich
2018-04-13 17:04 ` [PATCH 0/8] Introduce AMD SVM AVIC Brian Woods
