All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v8 00/21] Introduce HVM without dm and new boot ABI
@ 2015-11-06 16:05 Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices Roger Pau Monne
                   ` (20 more replies)
  0 siblings, 21 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel

This series is split in the following order:

 - Patches from 1 to 12 allow disabling the devices emulated inside of Xen, 
   with the exception of patch 2 which is a bugfix for the vlapic.
 - Patches from 13 to 21 introduce the creation of HVM guests without a
   device model and without the devices emulated inside of Xen.

This series has been successfully tested on the following hardware:

 - Intel Core i3-5010U.
 - AMD Opteron 4184.

With both hap=0 and hap=1 in the configuration file. I've been able to boot
a SMP guest in this mode with a virtual hard drive and a virtual network
card, all working fine AFAICT. Migration/save/restore has also been tested 
with a SMP guest using the FreeBSD kernel provided below.

The series has been compile tested on arm32.

The series can also be found in the following git repo:

git://xenbits.xen.org/people/royger/xen.git branch hvm_without_dm_v8

And for the FreeBSD part:

git://xenbits.xen.org/people/royger/freebsd.git branch new_entry_point_v5

In case someone wants to give it a try, I've uploaded a FreeBSD kernel that
should work when booted into this mode:

https://people.freebsd.org/~royger/kernel_no_dm

This FreeBSD kernel starts the APs in long mode. There are examples for 
starting the APs in other modes in the sys/x86/xen/pv.c file.

The config file that I've used is:

<config>
kernel="/path/to/kernel_no_dm"

builder="hvm"
device_model_version="none"

memory=128
vcpus=2
name = "freebsd"
</config>

Of course if you have a FreeBSD disk already setup it can also be added to
the configuration file, and the following line can be used to point FreeBSD
to the disk:

extra="vfs.root.mountfrom=ufs:/dev/ufsid/<disk_id>"

As usual, each patch has it's own changelog.

AW   M 01/21 xen/x86: add bitmap of enabled emulated devices
   B M 02/21 xen/vlapic: fixes for HVM code when running without
A J  M 03/21 xen/x86: allow disabling the emulated local apic
A      04/21 xen/x86: allow disabling the emulated HPET
A    M 05/21 xen/x86: allow disabling power management
A      06/21 xen/x86: allow disabling the emulated RTC
A      07/21 xen/x86: allow disabling the emulated IO APIC
A      08/21 xen/x86: allow disabling the emulated PIC
     M 09/21 xen/x86: set the vPMU interface based on the
A      10/21 xen/x86: allow disabling the emulated VGA
A      11/21 xen/x86: allow disabling the emulated IOMMU
    N  12/21 xen/x86: allow disabling the emulated PIT
     M 13/21 xen/x86: make sure the HVM callback vector is
A    M 14/21 xen/x86: allow disabling all emulated devices inside
AW     15/21 elfnotes: intorduce a new PHYS_ENTRY elfnote
AW     16/21 libxc: allow creating domains without emulated
     M 17/21 xen/x86: allow HVM guests to use hypercalls to bring
AW   M 18/21 libxc/xen: introduce a start info structure for
AW     19/21 libxc: switch xc_dom_elfloader to be used with
 W     20/21 libxl: allow the creation of HVM domains without a
     M 21/21 libxl: add support for migrating HVM guests without

A = Acked/Reviewed by Andrew Cooper.
W = Acked/Reviewed by Wei Liu.
J = Acked/Reviewed by Jan Beulich.
N = New in this version.
M = Modified in this version.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-16 12:18   ` Jan Beulich
  2015-11-06 16:05 ` [PATCH v8 02/21] xen/vlapic: fixes for HVM code when running without a vlapic Roger Pau Monne
                   ` (19 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Andrew Cooper, Ian Jackson, Jan Beulich, Roger Pau Monne

Introduce a bitmap in x86 xen_arch_domainconfig that allows enabling or
disabling specific devices emulated inside of Xen for HVM guests.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
---
Changes since v7:
 - Make the has_* macros actual booleans.
 - Allow the idle domain to be created without an arch specific config.
 - Check that PVH guests don't try to enable any emulated devices.
 - Add a flag for disabling the PIT.
 - Rename XEN_X86_EMU_PMTIMER into XEN_X86_EMU_PM and also rename the
   related macro.

Changes since v6:
 - Define XEN_X86_EMU_ALL to contain all the possible emulated devices.
 - Remove full stops form the printks added to arch_domain_create.
 - Add Wei Liu Acked-by.
 - Added a check to x86 arch_domain_create in order to make sure a non-null
   config is always provided.
 - Check that emulation_flags is always 0 for PV guests.
 - Fix x86 callers of domain_create in order to make sure a non-null arch
   config is always provided.
 - Removed XEN_X86_EMU_PMU.
 - Removed Andrew Cooper's Reviewed-by, since the hypervisor side code has
   changed substantially.

Changes since v4:
 - Add a check to make sure the emulation bitmap is sane (undefined bits are
   all 0s).
 - Add Andrew Cooper Reviewed-by.

Changes since v3:
 - Return EOPNOTSUPP instead of ENOPERM if an invalid emulation mask is
   used.
 - Fix error messages (prefix them with d%d and use %#x instead of 0x%x).
 - Clearly state in the public header that emulation_flags should only be
   used with HVM guests.
 - Add a XEN_X86 prefix to the emulation flags defines.
 - Properly parenthese the has_* marcos.
---
 tools/libxl/libxl_x86.c           |  5 ++++-
 xen/arch/x86/domain.c             | 27 +++++++++++++++++++++++++++
 xen/arch/x86/setup.c              |  3 ++-
 xen/common/schedule.c             |  1 -
 xen/include/asm-x86/domain.h      | 13 +++++++++++++
 xen/include/public/arch-x86/xen.h | 26 +++++++++++++++++++++++++-
 6 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index c992261..183b6c2 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -7,7 +7,10 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
                                       libxl_domain_config *d_config,
                                       xc_domain_configuration_t *xc_config)
 {
-    /* No specific configuration right now */
+    if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM)
+        xc_config->emulation_flags = XEN_X86_EMU_ALL;
+    else
+        xc_config->emulation_flags = 0;
 
     return 0;
 }
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index fe3be30..3f9e5d2 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -496,6 +496,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
     int i, paging_initialised = 0;
     int rc = -ENOMEM;
 
+    if ( config == NULL && !is_idle_domain(d) )
+        return -EINVAL;
+
     d->arch.s3_integrity = !!(domcr_flags & DOMCRF_s3_integrity);
 
     INIT_LIST_HEAD(&d->arch.pdev_list);
@@ -517,6 +520,30 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
                d->domain_id);
     }
 
+    if ( is_idle_domain(d) )
+    {
+        d->arch.emulation_flags = 0;
+    }
+    else
+    {
+        if ( (config->emulation_flags & ~XEN_X86_EMU_ALL) != 0 )
+        {
+            printk(XENLOG_G_ERR "d%d: Invalid emulation bitmap: %#x\n",
+                   d->domain_id, config->emulation_flags);
+            return -EINVAL;
+        }
+        if ( is_hvm_domain(d) ? (config->emulation_flags != XEN_X86_EMU_ALL)
+                              : (config->emulation_flags != 0) )
+        {
+            printk(XENLOG_G_ERR "d%d: Xen does not allow %s domain creation "
+                   "with the current selection of emulators: %#x\n",
+                   d->domain_id, is_hvm_domain(d) ? "HVM" : "PV",
+                   config->emulation_flags);
+            return -EOPNOTSUPP;
+        }
+        d->arch.emulation_flags = config->emulation_flags;
+    }
+
     if ( has_hvm_container_domain(d) )
     {
         d->arch.hvm_domain.hap_enabled =
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 4ed0110..6714473 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -582,6 +582,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         .parity    = 'n',
         .stop_bits = 1
     };
+    struct xen_arch_domainconfig config = { .emulation_flags = 0 };
 
     /* Critical region without IDT or TSS.  Any fault is deadly! */
 
@@ -1390,7 +1391,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
      * x86 doesn't support arch-configuration. So it's fine to pass
      * NULL.
      */
-    dom0 = domain_create(0, domcr_flags, 0, NULL);
+    dom0 = domain_create(0, domcr_flags, 0, &config);
     if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
         panic("Error creating domain 0");
 
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 292e9a0..20f5f56 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -1474,7 +1474,6 @@ void __init scheduler_init(void)
         sched_ratelimit_us = SCHED_DEFAULT_RATELIMIT_US;
     }
 
-    /* There is no need of arch-specific configuration for an idle domain */
     idle_domain = domain_create(DOMID_IDLE, 0, 0, NULL);
     BUG_ON(IS_ERR(idle_domain));
     idle_domain->vcpu = idle_vcpu;
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index f1d7ed6..c825975 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -387,8 +387,21 @@ struct arch_domain
     /* Mem_access emulation control */
     bool_t mem_access_emulate_enabled;
     bool_t mem_access_emulate_each_rep;
+
+    /* Emulated devices enabled bitmap. */
+    uint32_t emulation_flags;
 } __cacheline_aligned;
 
+#define has_vlapic(d)      (!!((d)->arch.emulation_flags & XEN_X86_EMU_LAPIC))
+#define has_vhpet(d)       (!!((d)->arch.emulation_flags & XEN_X86_EMU_HPET))
+#define has_vpm(d)         (!!((d)->arch.emulation_flags & XEN_X86_EMU_PM))
+#define has_vrtc(d)        (!!((d)->arch.emulation_flags & XEN_X86_EMU_RTC))
+#define has_vioapic(d)     (!!((d)->arch.emulation_flags & XEN_X86_EMU_IOAPIC))
+#define has_vpic(d)        (!!((d)->arch.emulation_flags & XEN_X86_EMU_PIC))
+#define has_vvga(d)        (!!((d)->arch.emulation_flags & XEN_X86_EMU_VGA))
+#define has_viommu(d)      (!!((d)->arch.emulation_flags & XEN_X86_EMU_IOMMU))
+#define has_vpit(d)        (!!((d)->arch.emulation_flags & XEN_X86_EMU_PIT))
+
 #define has_arch_pdevs(d)    (!list_empty(&(d)->arch.pdev_list))
 
 #define gdt_ldt_pt_idx(v) \
diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h
index 5187560..cdd93c1 100644
--- a/xen/include/public/arch-x86/xen.h
+++ b/xen/include/public/arch-x86/xen.h
@@ -265,7 +265,31 @@ typedef struct arch_shared_info arch_shared_info_t;
  * XEN_DOMCTL_INTERFACE_VERSION.
  */
 struct xen_arch_domainconfig {
-    char dummy;
+#define _XEN_X86_EMU_LAPIC          0
+#define XEN_X86_EMU_LAPIC           (1U<<_XEN_X86_EMU_LAPIC)
+#define _XEN_X86_EMU_HPET           1
+#define XEN_X86_EMU_HPET            (1U<<_XEN_X86_EMU_HPET)
+#define _XEN_X86_EMU_PM             2
+#define XEN_X86_EMU_PM              (1U<<_XEN_X86_EMU_PM)
+#define _XEN_X86_EMU_RTC            3
+#define XEN_X86_EMU_RTC             (1U<<_XEN_X86_EMU_RTC)
+#define _XEN_X86_EMU_IOAPIC         4
+#define XEN_X86_EMU_IOAPIC          (1U<<_XEN_X86_EMU_IOAPIC)
+#define _XEN_X86_EMU_PIC            5
+#define XEN_X86_EMU_PIC             (1U<<_XEN_X86_EMU_PIC)
+#define _XEN_X86_EMU_VGA            6
+#define XEN_X86_EMU_VGA             (1U<<_XEN_X86_EMU_VGA)
+#define _XEN_X86_EMU_IOMMU          7
+#define XEN_X86_EMU_IOMMU           (1U<<_XEN_X86_EMU_IOMMU)
+#define _XEN_X86_EMU_PIT            8
+#define XEN_X86_EMU_PIT             (1U<<_XEN_X86_EMU_PIT)
+
+#define XEN_X86_EMU_ALL             (XEN_X86_EMU_LAPIC | XEN_X86_EMU_HPET |  \
+                                     XEN_X86_EMU_PM | XEN_X86_EMU_RTC |      \
+                                     XEN_X86_EMU_IOAPIC | XEN_X86_EMU_PIC |  \
+                                     XEN_X86_EMU_VGA | XEN_X86_EMU_IOMMU |   \
+                                     XEN_X86_EMU_PIT)
+    uint32_t emulation_flags;
 };
 #endif
 
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 02/21] xen/vlapic: fixes for HVM code when running without a vlapic
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:10   ` Andrew Cooper
  2015-11-26  5:41   ` Tian, Kevin
  2015-11-06 16:05 ` [PATCH v8 03/21] xen/x86: allow disabling the emulated local apic Roger Pau Monne
                   ` (18 subsequent siblings)
  20 siblings, 2 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Suravee Suthikulpanit, Jun Nakajima, Andrew Cooper,
	Eddie Dong, Aravind Gopalakrishnan, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

The HVM related code (SVM, VMX) generally assumed that a local apic is
always present. With the introduction of a HVM mode were the local apic can
be removed, some of this broken code paths arised.

The SVM exit/resume paths unconditionally checked the state of the lapic,
which is wrong if it's been disabled by hardware, fix this by adding the
necessary checks. On the VMX side, make sure we don't add mappings for a
local apic if it's disabled.

In the generic vlapic code, add checks to prevent setting the TSC deadline
timer if the lapic is disabled, and also prevent trying to inject interrupts
from the PIC is the lapic is also disabled.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Eddie Dong <eddie.dong@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
---
Changes since v7:
 - Only check apic_access_mfn in vmx_install_vlapic_mapping, and add an
   assert.
 - Return 0 (instead of -ENODEV) in vlapic_accept_pic_intr if the vlapic is
   disabled.
 - Add Boris Ostrovsky Reviewed-by tag.
---
 xen/arch/x86/hvm/svm/svm.c | 18 ++++++++++--------
 xen/arch/x86/hvm/vlapic.c  |  7 ++++++-
 xen/arch/x86/hvm/vmx/vmx.c |  4 +++-
 3 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 8de41fa..404634b 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1035,6 +1035,7 @@ static void noreturn svm_do_resume(struct vcpu *v)
     struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
     bool_t debug_state = v->domain->debugger_attached;
     bool_t vcpu_guestmode = 0;
+    struct vlapic *vlapic = vcpu_vlapic(v);
 
     if ( nestedhvm_enabled(v->domain) && nestedhvm_vcpu_in_guestmode(v) )
         vcpu_guestmode = 1;
@@ -1058,14 +1059,14 @@ static void noreturn svm_do_resume(struct vcpu *v)
         hvm_asid_flush_vcpu(v);
     }
 
-    if ( !vcpu_guestmode )
+    if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) )
     {
         vintr_t intr;
 
         /* Reflect the vlapic's TPR in the hardware vtpr */
         intr = vmcb_get_vintr(vmcb);
         intr.fields.tpr =
-            (vlapic_get_reg(vcpu_vlapic(v), APIC_TASKPRI) & 0xFF) >> 4;
+            (vlapic_get_reg(vlapic, APIC_TASKPRI) & 0xFF) >> 4;
         vmcb_set_vintr(vmcb, intr);
     }
 
@@ -2294,6 +2295,7 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
     int inst_len, rc;
     vintr_t intr;
     bool_t vcpu_guestmode = 0;
+    struct vlapic *vlapic = vcpu_vlapic(v);
 
     hvm_invalidate_regs_fields(regs);
 
@@ -2311,11 +2313,12 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
      * NB. We need to preserve the low bits of the TPR to make checked builds
      * of Windows work, even though they don't actually do anything.
      */
-    if ( !vcpu_guestmode ) {
+    if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) )
+    {
         intr = vmcb_get_vintr(vmcb);
-        vlapic_set_reg(vcpu_vlapic(v), APIC_TASKPRI,
+        vlapic_set_reg(vlapic, APIC_TASKPRI,
                    ((intr.fields.tpr & 0x0F) << 4) |
-                   (vlapic_get_reg(vcpu_vlapic(v), APIC_TASKPRI) & 0x0F));
+                   (vlapic_get_reg(vlapic, APIC_TASKPRI) & 0x0F));
     }
 
     exit_reason = vmcb->exitcode;
@@ -2697,14 +2700,13 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
     }
 
   out:
-    if ( vcpu_guestmode )
-        /* Don't clobber TPR of the nested guest. */
+    if ( vcpu_guestmode || vlapic_hw_disabled(vlapic) )
         return;
 
     /* The exit may have updated the TPR: reflect this in the hardware vtpr */
     intr = vmcb_get_vintr(vmcb);
     intr.fields.tpr =
-        (vlapic_get_reg(vcpu_vlapic(v), APIC_TASKPRI) & 0xFF) >> 4;
+        (vlapic_get_reg(vlapic, APIC_TASKPRI) & 0xFF) >> 4;
     vmcb_set_vintr(vmcb, intr);
 }
 
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index b893b40..0debe5e 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -1042,7 +1042,9 @@ void vlapic_tdt_msr_set(struct vlapic *vlapic, uint64_t value)
     uint64_t guest_tsc;
     struct vcpu *v = vlapic_vcpu(vlapic);
 
-    /* may need to exclude some other conditions like vlapic->hw.disabled */
+    if ( vlapic_hw_disabled(vlapic) )
+        return;
+
     if ( !vlapic_lvtt_tdt(vlapic) )
     {
         HVM_DBG_LOG(DBG_LEVEL_VLAPIC_TIMER, "ignore tsc deadline msr write");
@@ -1118,6 +1120,9 @@ static int __vlapic_accept_pic_intr(struct vcpu *v)
 
 int vlapic_accept_pic_intr(struct vcpu *v)
 {
+    if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
+        return 0;
+
     TRACE_2D(TRC_HVM_EMUL_LAPIC_PIC_INTR,
              (v == v->domain->arch.hvm_domain.i8259_target),
              v ? __vlapic_accept_pic_intr(v) : -1);
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index bbec0e8..fcfd70b 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2412,9 +2412,11 @@ static void vmx_install_vlapic_mapping(struct vcpu *v)
 {
     paddr_t virt_page_ma, apic_page_ma;
 
-    if ( !cpu_has_vmx_virtualize_apic_accesses )
+    if ( v->domain->arch.hvm_domain.vmx.apic_access_mfn == 0 )
         return;
 
+    ASSERT(cpu_has_vmx_virtualize_apic_accesses);
+
     virt_page_ma = page_to_maddr(vcpu_vlapic(v)->regs_page);
     apic_page_ma = v->domain->arch.hvm_domain.vmx.apic_access_mfn;
     apic_page_ma <<= PAGE_SHIFT;
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 03/21] xen/x86: allow disabling the emulated local apic
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 02/21] xen/vlapic: fixes for HVM code when running without a vlapic Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-26  5:47   ` Tian, Kevin
  2015-11-06 16:05 ` [PATCH v8 04/21] xen/x86: allow disabling the emulated HPET Roger Pau Monne
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Eddie Dong, Jun Nakajima,
	Roger Pau Monne

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Eddie Dong <eddie.dong@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
---
Changes since v7:
 - Return 0 on vlapic_msr_set to note an error if the vlapic is disabled.
 - Add Andrew Cooper Reviewed-by.

Changes since v6:
 - Remove stall comments.
 - Adds checks for has_vlapic in the msixtbl_pt_{un}register functions.
 - Simplify the is_pvh_domain(d) || !has_vlapic(d) to !has_vlapic(d) when
   appropriate.
 - Split parts of this patch that can be considered bug fixes to a
   pre-patch.
 - Removed Acks since the patch changed substantially.

Changes since v5:
 - Add Boris Ostrovsky Reviewed-by.

Changes since v4:
 - Split the is_pvh_domain check into two, so part of the code can be shared
   with the !has_lapic case.
 - Add Andrew Cooper Acked-by.
---
 xen/arch/x86/hvm/vlapic.c   | 27 ++++++++++++++++++++++++---
 xen/arch/x86/hvm/vmsi.c     | 12 ++++++++++++
 xen/arch/x86/hvm/vmx/vmcs.c |  5 ++++-
 xen/arch/x86/hvm/vmx/vmx.c  |  6 ++++++
 4 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 0debe5e..417b1e7 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -993,6 +993,9 @@ static void set_x2apic_id(struct vlapic *vlapic)
 
 bool_t vlapic_msr_set(struct vlapic *vlapic, uint64_t value)
 {
+    if ( !has_vlapic(vlapic_domain(vlapic)) )
+        return 0;
+
     if ( (vlapic->hw.apic_base_msr ^ value) & MSR_IA32_APICBASE_ENABLE )
     {
         if ( unlikely(value & MSR_IA32_APICBASE_EXTD) )
@@ -1205,6 +1208,9 @@ void vlapic_reset(struct vlapic *vlapic)
     struct vcpu *v = vlapic_vcpu(vlapic);
     int i;
 
+    if ( !has_vlapic(v->domain) )
+        return;
+
     vlapic_set_reg(vlapic, APIC_ID,  (v->vcpu_id * 2) << 24);
     vlapic_set_reg(vlapic, APIC_LVR, VLAPIC_VERSION);
 
@@ -1270,6 +1276,9 @@ static int lapic_save_hidden(struct domain *d, hvm_domain_context_t *h)
     struct vlapic *s;
     int rc = 0;
 
+    if ( !has_vlapic(d) )
+        return 0;
+
     for_each_vcpu ( d, v )
     {
         s = vcpu_vlapic(v);
@@ -1286,6 +1295,9 @@ static int lapic_save_regs(struct domain *d, hvm_domain_context_t *h)
     struct vlapic *s;
     int rc = 0;
 
+    if ( !has_vlapic(d) )
+        return 0;
+
     for_each_vcpu ( d, v )
     {
         if ( hvm_funcs.sync_pir_to_irr )
@@ -1333,7 +1345,10 @@ static int lapic_load_hidden(struct domain *d, hvm_domain_context_t *h)
     uint16_t vcpuid;
     struct vcpu *v;
     struct vlapic *s;
-    
+
+    if ( !has_vlapic(d) )
+        return -ENODEV;
+
     /* Which vlapic to load? */
     vcpuid = hvm_load_instance(h); 
     if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL )
@@ -1365,7 +1380,10 @@ static int lapic_load_regs(struct domain *d, hvm_domain_context_t *h)
     uint16_t vcpuid;
     struct vcpu *v;
     struct vlapic *s;
-    
+
+    if ( !has_vlapic(d) )
+        return -ENODEV;
+
     /* Which vlapic to load? */
     vcpuid = hvm_load_instance(h); 
     if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL )
@@ -1404,7 +1422,7 @@ int vlapic_init(struct vcpu *v)
 
     HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "%d", v->vcpu_id);
 
-    if ( is_pvh_vcpu(v) )
+    if ( !has_vlapic(v->domain) )
     {
         vlapic->hw.disabled = VLAPIC_HW_DISABLED;
         return 0;
@@ -1457,6 +1475,9 @@ void vlapic_destroy(struct vcpu *v)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
 
+    if ( !has_vlapic(v->domain) )
+        return;
+
     tasklet_kill(&vlapic->init_sipi.tasklet);
     TRACE_0D(TRC_HVM_EMUL_LAPIC_STOP_TIMER);
     destroy_periodic_time(&vlapic->pt);
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index ac838a9..499151e 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -391,6 +391,9 @@ int msixtbl_pt_register(struct domain *d, struct pirq *pirq, uint64_t gtable)
     ASSERT(spin_is_locked(&pcidevs_lock));
     ASSERT(spin_is_locked(&d->event_lock));
 
+    if ( !has_vlapic(d) )
+        return -ENODEV;
+
     /*
      * xmalloc() with irq_disabled causes the failure of check_lock() 
      * for xenpool->lock. So we allocate an entry beforehand.
@@ -446,6 +449,9 @@ void msixtbl_pt_unregister(struct domain *d, struct pirq *pirq)
     ASSERT(spin_is_locked(&pcidevs_lock));
     ASSERT(spin_is_locked(&d->event_lock));
 
+    if ( !has_vlapic(d) )
+        return;
+
     irq_desc = pirq_spin_lock_irq_desc(pirq, NULL);
     if ( !irq_desc )
         return;
@@ -482,6 +488,9 @@ found:
 
 void msixtbl_init(struct domain *d)
 {
+    if ( !has_vlapic(d) )
+        return;
+
     INIT_LIST_HEAD(&d->arch.hvm_domain.msixtbl_list);
     spin_lock_init(&d->arch.hvm_domain.msixtbl_list_lock);
 
@@ -493,6 +502,9 @@ void msixtbl_pt_cleanup(struct domain *d)
     struct msixtbl_entry *entry, *temp;
     unsigned long flags;
 
+    if ( !has_vlapic(d) )
+        return;
+
     /* msixtbl_list_lock must be acquired with irq_disabled for check_lock() */
     local_irq_save(flags); 
     spin_lock(&d->arch.hvm_domain.msixtbl_list_lock);
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 53207e6..d461f64 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1013,7 +1013,7 @@ static int construct_vmcs(struct vcpu *v)
         ~(SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
           SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
 
-    if ( is_pvh_domain(d) )
+    if ( !has_vlapic(d) )
     {
         /* Disable virtual apics, TPR */
         v->arch.hvm_vmx.secondary_exec_control &=
@@ -1025,7 +1025,10 @@ static int construct_vmcs(struct vcpu *v)
         /* In turn, disable posted interrupts. */
         __vmwrite(PIN_BASED_VM_EXEC_CONTROL,
                   vmx_pin_based_exec_control & ~PIN_BASED_POSTED_INTERRUPT);
+    }
 
+    if ( is_pvh_domain(d) )
+    {
         /* Unrestricted guest (real mode for EPT) */
         v->arch.hvm_vmx.secondary_exec_control &=
             ~SECONDARY_EXEC_UNRESTRICTED_GUEST;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index fcfd70b..073e5be 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -89,6 +89,9 @@ static int vmx_domain_initialise(struct domain *d)
 {
     int rc;
 
+    if ( !has_vlapic(d) )
+        return 0;
+
     if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
         return rc;
 
@@ -97,6 +100,9 @@ static int vmx_domain_initialise(struct domain *d)
 
 static void vmx_domain_destroy(struct domain *d)
 {
+    if ( !has_vlapic(d) )
+        return;
+
     vmx_free_vlapic_mapping(d);
 }
 
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 04/21] xen/x86: allow disabling the emulated HPET
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (2 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 03/21] xen/x86: allow disabling the emulated local apic Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 05/21] xen/x86: allow disabling power management Roger Pau Monne
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monne

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v6:
 - Return ENODEV in hpet_load if the vhpet is disabled.

Changes since v4:
 - Add Andrew Cooper Acked-by.
---
 xen/arch/x86/hvm/hpet.c | 13 +++++++++++++
 xen/arch/x86/hvm/hvm.c  |  1 -
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index facab83..5e020ae 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -517,6 +517,9 @@ static int hpet_save(struct domain *d, hvm_domain_context_t *h)
     int rc;
     uint64_t guest_time;
 
+    if ( !has_vhpet(d) )
+        return 0;
+
     write_lock(&hp->lock);
     guest_time = (v->arch.hvm_vcpu.guest_time ?: hvm_get_guest_time(v)) /
                  STIME_PER_HPET_TICK;
@@ -577,6 +580,9 @@ static int hpet_load(struct domain *d, hvm_domain_context_t *h)
     uint64_t guest_time;
     int i;
 
+    if ( !has_vhpet(d) )
+        return -ENODEV;
+
     write_lock(&hp->lock);
 
     /* Reload the HPET registers */
@@ -635,6 +641,9 @@ void hpet_init(struct domain *d)
     HPETState *h = domain_vhpet(d);
     int i;
 
+    if ( !has_vhpet(d) )
+        return;
+
     memset(h, 0, sizeof(HPETState));
 
     rwlock_init(&h->lock);
@@ -662,6 +671,7 @@ void hpet_init(struct domain *d)
     }
 
     register_mmio_handler(d, &hpet_mmio_ops);
+    d->arch.hvm_domain.params[HVM_PARAM_HPET_ENABLED] = 1;
 }
 
 void hpet_deinit(struct domain *d)
@@ -669,6 +679,9 @@ void hpet_deinit(struct domain *d)
     int i;
     HPETState *h = domain_vhpet(d);
 
+    if ( !has_vhpet(d) )
+        return;
+
     write_lock(&h->lock);
 
     if ( hpet_enabled(h) )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 21f42a7..4490e9d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1605,7 +1605,6 @@ int hvm_domain_initialise(struct domain *d)
 
     hvm_init_guest_time(d);
 
-    d->arch.hvm_domain.params[HVM_PARAM_HPET_ENABLED] = 1;
     d->arch.hvm_domain.params[HVM_PARAM_TRIPLE_FAULT_REASON] = SHUTDOWN_reboot;
 
     vpic_init(d);
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 05/21] xen/x86: allow disabling power management
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (3 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 04/21] xen/x86: allow disabling the emulated HPET Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 06/21] xen/x86: allow disabling the emulated RTC Roger Pau Monne
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monne

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v7:
 - Add Andrew Cooper Reviewed-by.
 - Apply renaming from earlier patch (s/PMTIMER/PM/).

Changes since v6:
 - Return ENODEV in pmtimer_load if the timer is disabled.
 - hvm_acpi_power_button and hvm_acpi_sleep_button become noops if the
   pmtimer is disabled.
 - Return ENODEV if pmtimer_change_ioport is called with the pmtimer
   disabled.
 - Add a check for disabled pmtimer in pmtimer_reset. Although it's safe to
   execute this function now, it might change in the future.
 - Drop Andrew's Ack due to the changes.

Changes since v4:
 - Add Andrew Cooper Acked-by.
---
 xen/arch/x86/hvm/pmtimer.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/xen/arch/x86/hvm/pmtimer.c b/xen/arch/x86/hvm/pmtimer.c
index c8229e0..9c2e4bd 100644
--- a/xen/arch/x86/hvm/pmtimer.c
+++ b/xen/arch/x86/hvm/pmtimer.c
@@ -67,6 +67,10 @@ static void pmt_update_sci(PMTState *s)
 void hvm_acpi_power_button(struct domain *d)
 {
     PMTState *s = &d->arch.hvm_domain.pl_time.vpmt;
+
+    if ( !has_vpm(d) )
+        return;
+
     spin_lock(&s->lock);
     s->pm.pm1a_sts |= PWRBTN_STS;
     pmt_update_sci(s);
@@ -76,6 +80,10 @@ void hvm_acpi_power_button(struct domain *d)
 void hvm_acpi_sleep_button(struct domain *d)
 {
     PMTState *s = &d->arch.hvm_domain.pl_time.vpmt;
+
+    if ( !has_vpm(d) )
+        return;
+
     spin_lock(&s->lock);
     s->pm.pm1a_sts |= SLPBTN_STS;
     pmt_update_sci(s);
@@ -247,6 +255,9 @@ static int pmtimer_save(struct domain *d, hvm_domain_context_t *h)
     uint32_t x, msb = s->pm.tmr_val & TMR_VAL_MSB;
     int rc;
 
+    if ( !has_vpm(d) )
+        return 0;
+
     spin_lock(&s->lock);
 
     /*
@@ -273,6 +284,9 @@ static int pmtimer_load(struct domain *d, hvm_domain_context_t *h)
 {
     PMTState *s = &d->arch.hvm_domain.pl_time.vpmt;
 
+    if ( !has_vpm(d) )
+        return -ENODEV;
+
     spin_lock(&s->lock);
 
     /* Reload the registers */
@@ -301,6 +315,9 @@ int pmtimer_change_ioport(struct domain *d, unsigned int version)
 {
     unsigned int old_version;
 
+    if ( !has_vpm(d) )
+        return -ENODEV;
+
     /* Check that version is changing. */
     old_version = d->arch.hvm_domain.params[HVM_PARAM_ACPI_IOPORTS_LOCATION];
     if ( version == old_version )
@@ -330,6 +347,9 @@ void pmtimer_init(struct vcpu *v)
 {
     PMTState *s = &v->domain->arch.hvm_domain.pl_time.vpmt;
 
+    if ( !has_vpm(v->domain) )
+        return;
+
     spin_lock_init(&s->lock);
 
     s->scale = ((uint64_t)FREQUENCE_PMTIMER << 32) / SYSTEM_TIME_HZ;
@@ -350,11 +370,18 @@ void pmtimer_init(struct vcpu *v)
 void pmtimer_deinit(struct domain *d)
 {
     PMTState *s = &d->arch.hvm_domain.pl_time.vpmt;
+
+    if ( !has_vpm(d) )
+        return;
+
     kill_timer(&s->timer);
 }
 
 void pmtimer_reset(struct domain *d)
 {
+    if ( !has_vpm(d) )
+        return;
+
     /* Reset the counter. */
     d->arch.hvm_domain.pl_time.vpmt.pm.tmr_val = 0;
 }
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 06/21] xen/x86: allow disabling the emulated RTC
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (4 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 05/21] xen/x86: allow disabling power management Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 07/21] xen/x86: allow disabling the emulated IO APIC Roger Pau Monne
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monne

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v6:
 - Return ENODEV in rtc_load if rtc is disabled.
 - Add checks to rtc_reset and rtc_update_clock to prevent calling them if
   rtc is disabled.

Changes since v4:
 - Add Andrew Cooper Acked-by.
---
 xen/arch/x86/hvm/rtc.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/xen/arch/x86/hvm/rtc.c b/xen/arch/x86/hvm/rtc.c
index a9efeaf..d391e38 100644
--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -726,6 +726,9 @@ void rtc_migrate_timers(struct vcpu *v)
 {
     RTCState *s = vcpu_vrtc(v);
 
+    if ( !has_vrtc(v->domain) )
+        return;
+
     if ( v->vcpu_id == 0 )
     {
         migrate_timer(&s->update_timer, v->processor);;
@@ -739,6 +742,10 @@ static int rtc_save(struct domain *d, hvm_domain_context_t *h)
 {
     RTCState *s = domain_vrtc(d);
     int rc;
+
+    if ( !has_vrtc(d) )
+        return 0;
+
     spin_lock(&s->lock);
     rc = hvm_save_entry(RTC, 0, h, &s->hw);
     spin_unlock(&s->lock);
@@ -750,6 +757,9 @@ static int rtc_load(struct domain *d, hvm_domain_context_t *h)
 {
     RTCState *s = domain_vrtc(d);
 
+    if ( !has_vrtc(d) )
+        return -ENODEV;
+
     spin_lock(&s->lock);
 
     /* Restore the registers */
@@ -780,6 +790,9 @@ void rtc_reset(struct domain *d)
 {
     RTCState *s = domain_vrtc(d);
 
+    if ( !has_vrtc(d) )
+        return;
+
     TRACE_0D(TRC_HVM_EMUL_RTC_STOP_TIMER);
     destroy_periodic_time(&s->pt);
     s->period = 0;
@@ -790,6 +803,9 @@ void rtc_init(struct domain *d)
 {
     RTCState *s = domain_vrtc(d);
 
+    if ( !has_vrtc(d) )
+        return;
+
     spin_lock_init(&s->lock);
 
     init_timer(&s->update_timer, rtc_update_timer, s, smp_processor_id());
@@ -820,6 +836,9 @@ void rtc_deinit(struct domain *d)
 {
     RTCState *s = domain_vrtc(d);
 
+    if ( !has_vrtc(d) )
+        return;
+
     spin_barrier(&s->lock);
 
     TRACE_0D(TRC_HVM_EMUL_RTC_STOP_TIMER);
@@ -833,6 +852,9 @@ void rtc_update_clock(struct domain *d)
 {
     RTCState *s = domain_vrtc(d);
 
+    if ( !has_vrtc(d) )
+        return;
+
     spin_lock(&s->lock);
     s->current_tm = gmtime(get_localtime(d));
     spin_unlock(&s->lock);
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 07/21] xen/x86: allow disabling the emulated IO APIC
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (5 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 06/21] xen/x86: allow disabling the emulated RTC Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 08/21] xen/x86: allow disabling the emulated PIC Roger Pau Monne
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monne

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v6:
 - Return ENODEV in ioapic_load if the ioapic is disabled.
 - Add an assert to make sure vioapic_update_EOI and
   vioapic_irq_positive_edge is only called when the vioapic is enabled.

Changes since v4:
 - Add Andrew Cooper Acked-by.
---
 xen/arch/x86/hvm/vioapic.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index d348235..611be87 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -365,6 +365,8 @@ void vioapic_irq_positive_edge(struct domain *d, unsigned int irq)
     struct hvm_hw_vioapic *vioapic = domain_vioapic(d);
     union vioapic_redir_entry *ent;
 
+    ASSERT(has_vioapic(d));
+
     HVM_DBG_LOG(DBG_LEVEL_IOAPIC, "irq %x", irq);
 
     ASSERT(irq < VIOAPIC_NUM_PINS);
@@ -392,6 +394,8 @@ void vioapic_update_EOI(struct domain *d, u8 vector)
     union vioapic_redir_entry *ent;
     int gsi;
 
+    ASSERT(has_vioapic(d));
+
     spin_lock(&d->arch.hvm_domain.irq_lock);
 
     for ( gsi = 0; gsi < VIOAPIC_NUM_PINS; gsi++ )
@@ -424,12 +428,20 @@ void vioapic_update_EOI(struct domain *d, u8 vector)
 static int ioapic_save(struct domain *d, hvm_domain_context_t *h)
 {
     struct hvm_hw_vioapic *s = domain_vioapic(d);
+
+    if ( !has_vioapic(d) )
+        return 0;
+
     return hvm_save_entry(IOAPIC, 0, h, s);
 }
 
 static int ioapic_load(struct domain *d, hvm_domain_context_t *h)
 {
     struct hvm_hw_vioapic *s = domain_vioapic(d);
+
+    if ( !has_vioapic(d) )
+        return -ENODEV;
+
     return hvm_load_entry(IOAPIC, h, s);
 }
 
@@ -440,6 +452,9 @@ void vioapic_reset(struct domain *d)
     struct hvm_vioapic *vioapic = d->arch.hvm_domain.vioapic;
     int i;
 
+    if ( !has_vioapic(d) )
+        return;
+
     memset(&vioapic->hvm_hw_vioapic, 0, sizeof(vioapic->hvm_hw_vioapic));
     for ( i = 0; i < VIOAPIC_NUM_PINS; i++ )
         vioapic->hvm_hw_vioapic.redirtbl[i].fields.mask = 1;
@@ -448,6 +463,9 @@ void vioapic_reset(struct domain *d)
 
 int vioapic_init(struct domain *d)
 {
+    if ( !has_vioapic(d) )
+        return 0;
+
     if ( (d->arch.hvm_domain.vioapic == NULL) &&
          ((d->arch.hvm_domain.vioapic = xmalloc(struct hvm_vioapic)) == NULL) )
         return -ENOMEM;
@@ -462,6 +480,9 @@ int vioapic_init(struct domain *d)
 
 void vioapic_deinit(struct domain *d)
 {
+    if ( !has_vioapic(d) )
+        return;
+
     xfree(d->arch.hvm_domain.vioapic);
     d->arch.hvm_domain.vioapic = NULL;
 }
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 08/21] xen/x86: allow disabling the emulated PIC
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (6 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 07/21] xen/x86: allow disabling the emulated IO APIC Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 09/21] xen/x86: set the vPMU interface based on the presence of a lapic Roger Pau Monne
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monne

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v6:
 - Return ENODEV in vpic_load if the vpic is disabled.
 - Add asserts to vpic_irq_{negative/positive}_edge and vpic_ack_pending_irq
   to make sure they are not called when the vpic is disabled.
 - Add a check to vpic_reset in order to prevent calling it with the vpic
   disabled.

Changes since v4:
 - Add Andrew Cooper Acked-by.
---
 xen/arch/x86/hvm/vpic.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
index 7c2edc8..6a2e87b 100644
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -377,6 +377,9 @@ static int vpic_save(struct domain *d, hvm_domain_context_t *h)
     struct hvm_hw_vpic *s;
     int i;
 
+    if ( !has_vpic(d) )
+        return 0;
+
     /* Save the state of both PICs */
     for ( i = 0; i < 2 ; i++ )
     {
@@ -392,7 +395,10 @@ static int vpic_load(struct domain *d, hvm_domain_context_t *h)
 {
     struct hvm_hw_vpic *s;
     uint16_t inst;
-    
+
+    if ( !has_vpic(d) )
+        return -ENODEV;
+
     /* Which PIC is this? */
     inst = hvm_load_instance(h);
     if ( inst > 1 )
@@ -412,6 +418,9 @@ void vpic_reset(struct domain *d)
 {
     struct hvm_hw_vpic *vpic;
 
+    if ( !has_vpic(d) )
+        return;
+
     /* Master PIC. */
     vpic = &d->arch.hvm_domain.vpic[0];
     memset(vpic, 0, sizeof(*vpic));
@@ -425,6 +434,9 @@ void vpic_reset(struct domain *d)
 
 void vpic_init(struct domain *d)
 {
+    if ( !has_vpic(d) )
+        return;
+
     vpic_reset(d);
 
     register_portio_handler(d, 0x20, 2, vpic_intercept_pic_io);
@@ -439,6 +451,7 @@ void vpic_irq_positive_edge(struct domain *d, int irq)
     struct hvm_hw_vpic *vpic = &d->arch.hvm_domain.vpic[irq >> 3];
     uint8_t mask = 1 << (irq & 7);
 
+    ASSERT(has_vpic(d));
     ASSERT(irq <= 15);
     ASSERT(vpic_is_locked(vpic));
 
@@ -456,6 +469,7 @@ void vpic_irq_negative_edge(struct domain *d, int irq)
     struct hvm_hw_vpic *vpic = &d->arch.hvm_domain.vpic[irq >> 3];
     uint8_t mask = 1 << (irq & 7);
 
+    ASSERT(has_vpic(d));
     ASSERT(irq <= 15);
     ASSERT(vpic_is_locked(vpic));
 
@@ -473,6 +487,8 @@ int vpic_ack_pending_irq(struct vcpu *v)
     int irq, vector;
     struct hvm_hw_vpic *vpic = &v->domain->arch.hvm_domain.vpic[0];
 
+    ASSERT(has_vpic(v->domain));
+
     TRACE_2D(TRC_HVM_EMUL_PIC_PEND_IRQ_CALL, vlapic_accept_pic_intr(v),
              vpic->int_output);
     if ( !vlapic_accept_pic_intr(v) || !vpic->int_output )
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 09/21] xen/x86: set the vPMU interface based on the presence of a lapic
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (7 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 08/21] xen/x86: allow disabling the emulated PIC Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-09 15:04   ` Jan Beulich
  2015-11-06 16:05 ` [PATCH v8 10/21] xen/x86: allow disabling the emulated VGA Roger Pau Monne
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Boris Ostrovsky, Jan Beulich, Roger Pau Monne

Instead of choosing the interface to expose to guests based on the guest
type, do it based on whether the guest has an emulated local apic or not.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v7:
 - Merge vpmu work from Boris.

Changes since v6:
 - Major rework of the approach.
 - Drop Andrew Cooper Acked-by.

Changes since v4:
 - Add Andrew Cooper Acked-by.
---
 xen/arch/x86/cpu/vpmu.c       | 11 ++++++-----
 xen/arch/x86/cpu/vpmu_amd.c   |  6 +++---
 xen/arch/x86/cpu/vpmu_intel.c |  6 +++---
 xen/arch/x86/hvm/hvm.c        |  2 ++
 xen/arch/x86/hvm/svm/svm.c    |  2 +-
 xen/arch/x86/hvm/vmx/vmx.c    |  2 +-
 6 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index 2f5156a..4346c66 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -94,7 +94,7 @@ void vpmu_lvtpc_update(uint32_t val)
     vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR | (val & APIC_LVT_MASKED);
 
     /* Postpone APIC updates for PV(H) guests if PMU interrupt is pending */
-    if ( is_hvm_vcpu(curr) || !vpmu->xenpmu_data ||
+    if ( has_vlapic(curr->domain) || !vpmu->xenpmu_data ||
          !vpmu_is_set(vpmu, VPMU_CACHED) )
         apic_write(APIC_LVTPC, vpmu->hw_lapic_lvtpc);
 }
@@ -129,7 +129,7 @@ int vpmu_do_msr(unsigned int msr, uint64_t *msr_content,
      * and since do_wr/rdmsr may load VPMU context we should save
      * (and unload) it again.
      */
-    if ( !is_hvm_vcpu(curr) && vpmu->xenpmu_data &&
+    if ( !has_vlapic(curr->domain) && vpmu->xenpmu_data &&
         vpmu_is_set(vpmu, VPMU_CACHED) )
     {
         vpmu_set(vpmu, VPMU_CONTEXT_SAVE);
@@ -184,7 +184,7 @@ void vpmu_do_interrupt(struct cpu_user_regs *regs)
         return;
 
     /* PV(H) guest */
-    if ( !is_hvm_vcpu(sampling) || (vpmu_mode & XENPMU_MODE_ALL) )
+    if ( !has_vlapic(sampling->domain) || (vpmu_mode & XENPMU_MODE_ALL) )
     {
         const struct cpu_user_regs *cur_regs;
         uint64_t *flags = &vpmu->xenpmu_data->pmu.pmu_flags;
@@ -411,7 +411,8 @@ int vpmu_load(struct vcpu *v, bool_t from_guest)
 
     /* Only when PMU is counting, we load PMU context immediately. */
     if ( !vpmu_is_set(vpmu, VPMU_RUNNING) ||
-         (!is_hvm_vcpu(vpmu_vcpu(vpmu)) && vpmu_is_set(vpmu, VPMU_CACHED)) )
+         (!has_vlapic(vpmu_vcpu(vpmu)->domain) &&
+         vpmu_is_set(vpmu, VPMU_CACHED)) )
         return 0;
 
     if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->arch_vpmu_load )
@@ -635,7 +636,7 @@ long do_xenpmu_op(unsigned int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg)
     struct xen_pmu_data *xenpmu_data;
     struct vpmu_struct *vpmu;
 
-    if ( !opt_vpmu_enabled )
+    if ( !opt_vpmu_enabled || has_vlapic(current->domain) )
         return -EOPNOTSUPP;
 
     ret = xsm_pmu_op(XSM_OTHER, current->domain, op);
diff --git a/xen/arch/x86/cpu/vpmu_amd.c b/xen/arch/x86/cpu/vpmu_amd.c
index 04da81a..990e6f3 100644
--- a/xen/arch/x86/cpu/vpmu_amd.c
+++ b/xen/arch/x86/cpu/vpmu_amd.c
@@ -238,7 +238,7 @@ static int amd_vpmu_load(struct vcpu *v, bool_t from_guest)
         bool_t is_running = 0;
         struct xen_pmu_amd_ctxt *guest_ctxt = &vpmu->xenpmu_data->pmu.c.amd;
 
-        ASSERT(!is_hvm_vcpu(v));
+        ASSERT(!has_vlapic(v->domain));
 
         ctxt = vpmu->context;
         ctrl_regs = vpmu_reg_pointer(ctxt, ctrls);
@@ -314,7 +314,7 @@ static int amd_vpmu_save(struct vcpu *v,  bool_t to_guest)
     {
         struct xen_pmu_amd_ctxt *guest_ctxt, *ctxt;
 
-        ASSERT(!is_hvm_vcpu(v));
+        ASSERT(!has_vlapic(v->domain));
         ctxt = vpmu->context;
         guest_ctxt = &vpmu->xenpmu_data->pmu.c.amd;
         memcpy(&guest_ctxt->regs[0], &ctxt->regs[0], regs_sz);
@@ -525,7 +525,7 @@ int svm_vpmu_initialise(struct vcpu *v)
     vpmu->context = ctxt;
     vpmu->priv_context = NULL;
 
-    if ( !is_hvm_vcpu(v) )
+    if ( !has_vlapic(v->domain) )
     {
         /* Copy register offsets to shared area */
         ASSERT(vpmu->xenpmu_data);
diff --git a/xen/arch/x86/cpu/vpmu_intel.c b/xen/arch/x86/cpu/vpmu_intel.c
index 12f80ae..a9e0e68 100644
--- a/xen/arch/x86/cpu/vpmu_intel.c
+++ b/xen/arch/x86/cpu/vpmu_intel.c
@@ -337,7 +337,7 @@ static int core2_vpmu_save(struct vcpu *v, bool_t to_guest)
 
     if ( to_guest )
     {
-        ASSERT(!is_hvm_vcpu(v));
+        ASSERT(!has_vlapic(v->domain));
         memcpy((void *)(&vpmu->xenpmu_data->pmu.c.intel) + regs_off,
                vpmu->context + regs_off, regs_sz);
     }
@@ -442,7 +442,7 @@ static int core2_vpmu_load(struct vcpu *v, bool_t from_guest)
     {
         int ret;
 
-        ASSERT(!is_hvm_vcpu(v));
+        ASSERT(!has_vlapic(v->domain));
 
         memcpy(vpmu->context + regs_off,
                (void *)&v->arch.vpmu.xenpmu_data->pmu.c.intel + regs_off,
@@ -502,7 +502,7 @@ static int core2_vpmu_alloc_resource(struct vcpu *v)
     vpmu->context = core2_vpmu_cxt;
     vpmu->priv_context = p;
 
-    if ( !is_hvm_vcpu(v) )
+    if ( !has_vlapic(v->domain) )
     {
         /* Copy fixed/arch register offsets to shared area */
         ASSERT(vpmu->xenpmu_data);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4490e9d..409b61d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5143,6 +5143,7 @@ static hvm_hypercall_t *const hvm_hypercall64_table[NR_hypercalls] = {
     HYPERCALL(sysctl),
     HYPERCALL(domctl),
     HYPERCALL(tmem_op),
+    HYPERCALL(xenpmu_op),
     [ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation
 };
 
@@ -5164,6 +5165,7 @@ static hvm_hypercall_t *const hvm_hypercall32_table[NR_hypercalls] = {
     HYPERCALL(sysctl),
     HYPERCALL(domctl),
     HYPERCALL(tmem_op),
+    HYPERCALL(xenpmu_op),
     [ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation
 };
 
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 404634b..4adf86b 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1166,7 +1166,7 @@ static int svm_vcpu_initialise(struct vcpu *v)
     }
 
     /* PVH's VPMU is initialized via hypercall */
-    if ( is_hvm_vcpu(v) )
+    if ( is_hvm_vcpu(v) && has_vlapic(v->domain) )
         vpmu_initialise(v);
 
     svm_guest_osvw_init(v);
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 073e5be..5df866e 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -148,7 +148,7 @@ static int vmx_vcpu_initialise(struct vcpu *v)
     }
 
     /* PVH's VPMU is initialized via hypercall */
-    if ( is_hvm_vcpu(v) )
+    if ( is_hvm_vcpu(v) && has_vlapic(v->domain) )
         vpmu_initialise(v);
 
     vmx_install_vlapic_mapping(v);
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 10/21] xen/x86: allow disabling the emulated VGA
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (8 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 09/21] xen/x86: set the vPMU interface based on the presence of a lapic Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 11/21] xen/x86: allow disabling the emulated IOMMU Roger Pau Monne
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monne

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v4:
 - Add Andrew Cooper Acked-by.
---
 xen/arch/x86/hvm/stdvga.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 02a97f9..ae1c2d0 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -554,6 +554,9 @@ void stdvga_init(struct domain *d)
     struct page_info *pg;
     unsigned int i;
 
+    if ( !has_vvga(d) )
+        return;
+
     memset(s, 0, sizeof(*s));
     spin_lock_init(&s->lock);
     
@@ -591,6 +594,9 @@ void stdvga_deinit(struct domain *d)
     struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
     int i;
 
+    if ( !has_vvga(d) )
+        return;
+
     for ( i = 0; i != ARRAY_SIZE(s->vram_page); i++ )
     {
         if ( s->vram_page[i] == NULL )
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 11/21] xen/x86: allow disabling the emulated IOMMU
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (9 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 10/21] xen/x86: allow disabling the emulated VGA Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 12/21] xen/x86: allow disabling the emulated PIT Roger Pau Monne
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Aravind Gopalakrishnan, Suravee Suthikulpanit, Roger Pau Monne

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
---
Changes since v4:
 - Add Andrew Cooper Acked-by.
---
 xen/drivers/passthrough/amd/iommu_guest.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/amd/iommu_guest.c b/xen/drivers/passthrough/amd/iommu_guest.c
index e74f469..b4e75ac 100644
--- a/xen/drivers/passthrough/amd/iommu_guest.c
+++ b/xen/drivers/passthrough/amd/iommu_guest.c
@@ -887,7 +887,8 @@ int guest_iommu_init(struct domain* d)
     struct guest_iommu *iommu;
     struct hvm_iommu *hd  = domain_hvm_iommu(d);
 
-    if ( !is_hvm_domain(d) || !iommu_enabled || !iommuv2_enabled )
+    if ( !is_hvm_domain(d) || !iommu_enabled || !iommuv2_enabled ||
+         !has_viommu(d) )
         return 0;
 
     iommu = xzalloc(struct guest_iommu);
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 12/21] xen/x86: allow disabling the emulated PIT
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (10 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 11/21] xen/x86: allow disabling the emulated IOMMU Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-09 15:31   ` Jan Beulich
  2015-11-06 16:05 ` [PATCH v8 13/21] xen/x86: make sure the HVM callback vector is correctly set Roger Pau Monne
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monne

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v7:
 - Patch added.
---
 xen/arch/x86/hvm/i8254.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/xen/arch/x86/hvm/i8254.c b/xen/arch/x86/hvm/i8254.c
index 8a93c88..b517cd6 100644
--- a/xen/arch/x86/hvm/i8254.c
+++ b/xen/arch/x86/hvm/i8254.c
@@ -382,6 +382,9 @@ static uint32_t pit_ioport_read(struct PITState *pit, uint32_t addr)
 
 void pit_stop_channel0_irq(PITState *pit)
 {
+    if ( !has_vpit(current->domain) )
+        return;
+
     TRACE_0D(TRC_HVM_EMUL_PIT_STOP_TIMER);
     spin_lock(&pit->lock);
     destroy_periodic_time(&pit->pt0);
@@ -393,6 +396,9 @@ static int pit_save(struct domain *d, hvm_domain_context_t *h)
     PITState *pit = domain_vpit(d);
     int rc;
 
+    if ( !has_vpit(d) )
+        return 0;
+
     spin_lock(&pit->lock);
     
     rc = hvm_save_entry(PIT, 0, h, &pit->hw);
@@ -407,6 +413,9 @@ static int pit_load(struct domain *d, hvm_domain_context_t *h)
     PITState *pit = domain_vpit(d);
     int i;
 
+    if ( !has_vpit(d) )
+        return -ENODEV;
+
     spin_lock(&pit->lock);
 
     if ( hvm_load_entry(PIT, h, &pit->hw) )
@@ -437,6 +446,9 @@ void pit_reset(struct domain *d)
     struct hvm_hw_pit_channel *s;
     int i;
 
+    if ( !has_vpit(d) )
+        return;
+
     TRACE_0D(TRC_HVM_EMUL_PIT_STOP_TIMER);
     destroy_periodic_time(&pit->pt0);
     pit->pt0.source = PTSRC_isa;
@@ -458,6 +470,9 @@ void pit_init(struct domain *d, unsigned long cpu_khz)
 {
     PITState *pit = domain_vpit(d);
 
+    if ( !has_vpit(d) )
+        return;
+
     spin_lock_init(&pit->lock);
 
     if ( is_hvm_domain(d) )
@@ -473,6 +488,9 @@ void pit_deinit(struct domain *d)
 {
     PITState *pit = domain_vpit(d);
 
+    if ( !has_vpit(d) )
+        return;
+
     TRACE_0D(TRC_HVM_EMUL_PIT_STOP_TIMER);
     destroy_periodic_time(&pit->pt0);
 }
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 13/21] xen/x86: make sure the HVM callback vector is correctly set
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (11 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 12/21] xen/x86: allow disabling the emulated PIT Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-10 16:41   ` Jan Beulich
  2015-11-06 16:05 ` [PATCH v8 14/21] xen/x86: allow disabling all emulated devices inside of Xen Roger Pau Monne
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monne

If certain devices (like the local or the io apic) are disabled some modes
of operation of the HVM event channel callback cannot be used. Make sure Xen
doesn't try to setup them.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/irq.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
index 990a2ca..b5f5b05 100644
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -330,6 +330,10 @@ void hvm_set_callback_via(struct domain *d, uint64_t via)
          (via_type > HVMIRQ_callback_vector) )
         via_type = HVMIRQ_callback_none;
 
+    if ( via_type != HVMIRQ_callback_vector &&
+         (!has_vlapic(d) || !has_vioapic(d)) )
+        return;
+
     spin_lock(&d->arch.hvm_domain.irq_lock);
 
     /* Tear down old callback via. */
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 14/21] xen/x86: allow disabling all emulated devices inside of Xen
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (12 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 13/21] xen/x86: make sure the HVM callback vector is correctly set Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 15/21] elfnotes: intorduce a new PHYS_ENTRY elfnote Roger Pau Monne
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monne

Only allow enabling or disabling all the emulated devices inside of Xen,
right now Xen doesn't support enabling specific emulated devices only.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v7:
 - Rework if condition.

Changes since v5:
 - Add Andrew Cooper Reviewed-by.
---
 xen/arch/x86/domain.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 3f9e5d2..971c88d 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -532,8 +532,8 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
                    d->domain_id, config->emulation_flags);
             return -EINVAL;
         }
-        if ( is_hvm_domain(d) ? (config->emulation_flags != XEN_X86_EMU_ALL)
-                              : (config->emulation_flags != 0) )
+        if ( config->emulation_flags != 0 &&
+             (!is_hvm_domain(d) || config->emulation_flags != XEN_X86_EMU_ALL) )
         {
             printk(XENLOG_G_ERR "d%d: Xen does not allow %s domain creation "
                    "with the current selection of emulators: %#x\n",
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 15/21] elfnotes: intorduce a new PHYS_ENTRY elfnote
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (13 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 14/21] xen/x86: allow disabling all emulated devices inside of Xen Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 16/21] libxc: allow creating domains without emulated devices Roger Pau Monne
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Stefano Stabellini, Ian Jackson, Ian Campbell, Roger Pau Monne

This new elfnote contains the 32bit entry point into the kernel. Xen will
use this entry point in order to launch the guest kernel in 32bit protected
mode with paging disabled.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
Changes since v6:
 - Reword a comment.

Changes since v4:
 - Add Andrew Cooper Reviewed-by and Wei Liu Acked-by.
---
 tools/xcutils/readnotes.c          |  3 +++
 xen/common/libelf/libelf-dominfo.c |  4 ++++
 xen/include/public/elfnote.h       | 12 +++++++++++-
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/xcutils/readnotes.c b/tools/xcutils/readnotes.c
index 5fa445e..e682dd1 100644
--- a/tools/xcutils/readnotes.c
+++ b/tools/xcutils/readnotes.c
@@ -159,6 +159,9 @@ static unsigned print_notes(struct elf_binary *elf, ELF_HANDLE_DECL(elf_note) st
 		case XEN_ELFNOTE_L1_MFN_VALID:
 			print_l1_mfn_valid_note("L1_MFN_VALID", elf , note);
 			break;
+		case XEN_ELFNOTE_PHYS32_ENTRY:
+			print_numeric_note("PHYS32_ENTRY", elf , note);
+			break;
 		default:
 			printf("unknown note type %#x\n",
 			       (unsigned)elf_uval(elf, note, type));
diff --git a/xen/common/libelf/libelf-dominfo.c b/xen/common/libelf/libelf-dominfo.c
index 3de1c23..dacd4ba 100644
--- a/xen/common/libelf/libelf-dominfo.c
+++ b/xen/common/libelf/libelf-dominfo.c
@@ -119,6 +119,7 @@ elf_errorstatus elf_xen_parse_note(struct elf_binary *elf,
         [XEN_ELFNOTE_BSD_SYMTAB] = { "BSD_SYMTAB", 1},
         [XEN_ELFNOTE_SUSPEND_CANCEL] = { "SUSPEND_CANCEL", 0 },
         [XEN_ELFNOTE_MOD_START_PFN] = { "MOD_START_PFN", 0 },
+        [XEN_ELFNOTE_PHYS32_ENTRY] = { "PHYS32_ENTRY", 0 },
     };
 /* *INDENT-ON* */
 
@@ -212,6 +213,9 @@ elf_errorstatus elf_xen_parse_note(struct elf_binary *elf,
                 elf, note, sizeof(*parms->f_supported), i);
         break;
 
+    case XEN_ELFNOTE_PHYS32_ENTRY:
+        parms->phys_entry = val;
+        break;
     }
     return 0;
 }
diff --git a/xen/include/public/elfnote.h b/xen/include/public/elfnote.h
index 3824a94..353985f 100644
--- a/xen/include/public/elfnote.h
+++ b/xen/include/public/elfnote.h
@@ -200,9 +200,19 @@
 #define XEN_ELFNOTE_SUPPORTED_FEATURES 17
 
 /*
+ * Physical entry point into the kernel.
+ *
+ * 32bit entry point into the kernel. When requested to launch the
+ * guest kernel in a HVM container, Xen will use this entry point to
+ * launch the guest in 32bit protected mode with paging disabled.
+ * Ignored otherwise.
+ */
+#define XEN_ELFNOTE_PHYS32_ENTRY 18
+
+/*
  * The number of the highest elfnote defined.
  */
-#define XEN_ELFNOTE_MAX XEN_ELFNOTE_SUPPORTED_FEATURES
+#define XEN_ELFNOTE_MAX XEN_ELFNOTE_PHYS32_ENTRY
 
 /*
  * System information exported through crash notes.
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 16/21] libxc: allow creating domains without emulated devices.
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (14 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 15/21] elfnotes: intorduce a new PHYS_ENTRY elfnote Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 17/21] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs Roger Pau Monne
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Stefano Stabellini, Ian Jackson, Ian Campbell, Roger Pau Monne

Introduce a new flag in xc_dom_image that turns on and off the emulated
devices. This prevents creating the VGA hole, the hvm_info page and the
ioreq server pages. libxl unconditionally sets it to true for all HVM
domains at the moment.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
Changes since v5:
 - Add Andrew Cooper Reviewed-by.

Changes since v4:
 - Store the size of the VGA hole inside of the xc_dom_image struct and set
   it from libxl.
 - Rename dom->emulation to dom->device_model (no functional change).
 - Add Wei Liu Acked-by.

Changes since v3:
 - Explain the meaning of the "emulation" xc_dom_image field.
---
 tools/libxc/include/xc_dom.h |  4 +++
 tools/libxc/xc_dom_x86.c     | 73 ++++++++++++++++++++++++--------------------
 tools/libxl/libxl_dom.c      |  2 ++
 tools/libxl/libxl_internal.h |  1 +
 4 files changed, 47 insertions(+), 33 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index ccc5926..0f84600 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -190,6 +190,10 @@ struct xc_dom_image {
     xen_paddr_t mmio_size;
     xen_paddr_t lowmem_end;
     xen_paddr_t highmem_end;
+    xen_pfn_t vga_hole_size;
+
+    /* If unset disables the setup of the IOREQ pages. */
+    bool device_model;
 
     /* Extra ACPI tables passed to HVMLOADER */
     struct xc_hvm_firmware_module acpi_module;
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index ed43c28..63b4832 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -49,8 +49,6 @@
 #define X86_CR0_PE 0x01
 #define X86_CR0_ET 0x10
 
-#define VGA_HOLE_SIZE (0x20)
-
 #define SPECIALPAGE_PAGING   0
 #define SPECIALPAGE_ACCESS   1
 #define SPECIALPAGE_SHARING  2
@@ -523,12 +521,15 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
     xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
     xc_interface *xch = dom->xch;
 
-    if ( (hvm_info_page = xc_map_foreign_range(
-              xch, domid, PAGE_SIZE, PROT_READ | PROT_WRITE,
-              HVM_INFO_PFN)) == NULL )
-        goto error_out;
-    build_hvm_info(hvm_info_page, dom);
-    munmap(hvm_info_page, PAGE_SIZE);
+    if ( dom->device_model )
+    {
+        if ( (hvm_info_page = xc_map_foreign_range(
+                  xch, domid, PAGE_SIZE, PROT_READ | PROT_WRITE,
+                  HVM_INFO_PFN)) == NULL )
+            goto error_out;
+        build_hvm_info(hvm_info_page, dom);
+        munmap(hvm_info_page, PAGE_SIZE);
+    }
 
     /* Allocate and clear special pages. */
     for ( i = 0; i < NR_SPECIAL_PAGES; i++ )
@@ -560,30 +561,33 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
     xc_hvm_param_set(xch, domid, HVM_PARAM_SHARING_RING_PFN,
                      special_pfn(SPECIALPAGE_SHARING));
 
-    /*
-     * Allocate and clear additional ioreq server pages. The default
-     * server will use the IOREQ and BUFIOREQ special pages above.
-     */
-    for ( i = 0; i < NR_IOREQ_SERVER_PAGES; i++ )
-        ioreq_server_array[i] = ioreq_server_pfn(i);
-
-    rc = xc_domain_populate_physmap_exact(xch, domid, NR_IOREQ_SERVER_PAGES, 0,
-                                          0, ioreq_server_array);
-    if ( rc != 0 )
+    if ( dom->device_model )
     {
-        DOMPRINTF("Could not allocate ioreq server pages.");
-        goto error_out;
-    }
+        /*
+         * Allocate and clear additional ioreq server pages. The default
+         * server will use the IOREQ and BUFIOREQ special pages above.
+         */
+        for ( i = 0; i < NR_IOREQ_SERVER_PAGES; i++ )
+            ioreq_server_array[i] = ioreq_server_pfn(i);
 
-    if ( xc_clear_domain_pages(xch, domid, ioreq_server_pfn(0),
-                               NR_IOREQ_SERVER_PAGES) )
+        rc = xc_domain_populate_physmap_exact(xch, domid, NR_IOREQ_SERVER_PAGES, 0,
+                                              0, ioreq_server_array);
+        if ( rc != 0 )
+        {
+            DOMPRINTF("Could not allocate ioreq server pages.");
             goto error_out;
+        }
 
-    /* Tell the domain where the pages are and how many there are */
-    xc_hvm_param_set(xch, domid, HVM_PARAM_IOREQ_SERVER_PFN,
-                     ioreq_server_pfn(0));
-    xc_hvm_param_set(xch, domid, HVM_PARAM_NR_IOREQ_SERVER_PAGES,
-                     NR_IOREQ_SERVER_PAGES);
+        if ( xc_clear_domain_pages(xch, domid, ioreq_server_pfn(0),
+                                   NR_IOREQ_SERVER_PAGES) )
+                goto error_out;
+
+        /* Tell the domain where the pages are and how many there are */
+        xc_hvm_param_set(xch, domid, HVM_PARAM_IOREQ_SERVER_PFN,
+                         ioreq_server_pfn(0));
+        xc_hvm_param_set(xch, domid, HVM_PARAM_NR_IOREQ_SERVER_PAGES,
+                         NR_IOREQ_SERVER_PAGES);
+    }
 
     /*
      * Identity-map page table is required for running with CR0.PG=0 when
@@ -1314,7 +1318,8 @@ static int meminit_hvm(struct xc_dom_image *dom)
      * allocated is pointless.
      */
     if ( claim_enabled ) {
-        rc = xc_domain_claim_pages(xch, domid, target_pages - VGA_HOLE_SIZE);
+        rc = xc_domain_claim_pages(xch, domid,
+                                   target_pages - dom->vga_hole_size);
         if ( rc != 0 )
         {
             DOMPRINTF("Could not allocate memory for HVM guest as we cannot claim memory!");
@@ -1330,7 +1335,8 @@ static int meminit_hvm(struct xc_dom_image *dom)
          * tot_pages will be target_pages - VGA_HOLE_SIZE after
          * this call.
          */
-        rc = xc_domain_set_pod_target(xch, domid, target_pages - VGA_HOLE_SIZE,
+        rc = xc_domain_set_pod_target(xch, domid,
+                                      target_pages - dom->vga_hole_size,
                                       NULL, NULL, NULL);
         if ( rc != 0 )
         {
@@ -1349,8 +1355,9 @@ static int meminit_hvm(struct xc_dom_image *dom)
      * Under 2MB mode, we allocate pages in batches of no more than 8MB to 
      * ensure that we can be preempted and hence dom0 remains responsive.
      */
-    rc = xc_domain_populate_physmap_exact(
-        xch, domid, 0xa0, 0, memflags, &dom->p2m_host[0x00]);
+    if ( dom->device_model )
+        rc = xc_domain_populate_physmap_exact(
+            xch, domid, 0xa0, 0, memflags, &dom->p2m_host[0x00]);
 
     stat_normal_pages = 0;
     for ( vmemid = 0; vmemid < nr_vmemranges; vmemid++ )
@@ -1369,7 +1376,7 @@ static int meminit_hvm(struct xc_dom_image *dom)
          * 0xA0000-0xC0000. Note that 0x00000-0xA0000 is populated just
          * before this loop.
          */
-        if ( vmemranges[vmemid].start == 0 )
+        if ( vmemranges[vmemid].start == 0 && dom->device_model )
         {
             cur_pages = 0xc0;
             stat_normal_pages += 0xc0;
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 44d481b..f5866b6 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -994,6 +994,8 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
     dom->lowmem_end = lowmem_end;
     dom->highmem_end = highmem_end;
     dom->mmio_start = mmio_start;
+    dom->vga_hole_size = LIBXL_VGA_HOLE_SIZE;
+    dom->device_model = true;
 
     rc = libxl__domain_device_construct_rdm(gc, d_config,
                                             info->u.hvm.rdm_mem_boundary_memkb*1024,
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b00add0..0e0e9a3 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -101,6 +101,7 @@
 #define LIBXL_HVM_EXTRA_MEMORY 2048
 #define LIBXL_MIN_DOM0_MEM (128*1024)
 #define LIBXL_INVALID_GFN (~(uint64_t)0)
+#define LIBXL_VGA_HOLE_SIZE 0x20
 /* use 0 as the domid of the toolstack domain for now */
 #define LIBXL_TOOLSTACK_DOMID 0
 #define QEMU_SIGNATURE "DeviceModelRecord0002"
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 17/21] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (15 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 16/21] libxc: allow creating domains without emulated devices Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-12 16:57   ` Jan Beulich
  2015-11-06 16:05 ` [PATCH v8 18/21] libxc/xen: introduce a start info structure for HVMlite guests Roger Pau Monne
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Stefano Stabellini, Ian Campbell, Jan Beulich,
	Roger Pau Monne

Allow the usage of the VCPUOP_initialise, VCPUOP_up, VCPUOP_down and
VCPUOP_is_up hypercalls from HVM guests.

This patch introduces a new structure (vcpu_hvm_context) that should be used
in conjuction with the VCPUOP_initialise hypercall in order to initialize
vCPUs for HVM guests.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@citrix.com>
---
Changes since v7:
 - Improved error messages.
 - Set EFER.LMA if EFER.LME is set by the user.
 - Fix calculation of CS limit.
 - Add more checks to segment registers.
 - Add checks to make sure padding fields are 0.
 - Remove ugly arch ifdefs from common domain.c.
 - Add the implicit padding of vcpu_hvm_x86_32 explicitly in the structure.
 - Simplify the compat vcpu code since it's only used on x86.

Changes since v6:
 - Add comments to clarify some initializations.
 - Introduce a generic default_initialize_vcpu that's used to initialize a
   ARM vCPU or a x86 PV vCPU.
 - Move the undef of the SEG macro.
 - Fix the size of the eflags register, it should be 32bits.
 - Add a comment regarding the value of the 12-15 bits of the _ar fields.
 - Remove the 16bit strucutre, the 32bit one can be used to start the cpu in
   real mode.
 - Add some sanity checks to the values passed in.
 - Add paddings to vcpu_hvm_context so the layout on 32/64bits is the same.
 - Add support for the compat version of VCPUOP_initialise.

Changes since v5:
 - Fix a coding style issue.
 - Merge the code from wip-dmlite-v5-refactor by Andrew in order to reduce
   bloat.
 - Print the offending %cr3 in case of error when using shadow.
 - Reduce the scope of local variables in arch_initialize_vcpu.
 - s/current->domain/v->domain/g in arch_initialize_vcpu.
 - Expand the comment in public/vcpu.h to document the usage of
   vcpu_hvm_context for HVM guests.
 - Add myself as the copyright holder for the public hvm_vcpu.h header.

Changes since v4:
 - Don't assume mode is 64B, add an explicit check.
 - Don't set TF_kernel_mode, it is only needed for PV guests.
 - Don't set CR0_ET unconditionally.
---
 xen/arch/arm/domain.c             |  24 ++++
 xen/arch/x86/domain.c             | 296 ++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/hvm.c            |   8 ++
 xen/common/compat/domain.c        |  50 +++++--
 xen/common/domain.c               |  17 +--
 xen/include/Makefile              |   1 +
 xen/include/asm-x86/domain.h      |   3 +
 xen/include/public/hvm/hvm_vcpu.h | 144 +++++++++++++++++++
 xen/include/public/vcpu.h         |   6 +-
 xen/include/xen/domain.h          |   2 +
 xen/include/xlat.lst              |   3 +
 11 files changed, 521 insertions(+), 33 deletions(-)
 create mode 100644 xen/include/public/hvm/hvm_vcpu.h

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 880d0a6..7fe4dd8 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -754,6 +754,30 @@ int arch_set_info_guest(
     return 0;
 }
 
+int arch_initialize_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+    struct vcpu_guest_context *ctxt;
+    struct domain *d = v->domain;
+    int rc;
+
+    if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
+        return -ENOMEM;
+
+    if ( copy_from_guest(ctxt, arg, 1) )
+    {
+        free_vcpu_guest_context(ctxt);
+        return -EFAULT;
+    }
+
+    domain_lock(d);
+    rc = v->is_initialised ? -EEXIST : arch_set_info_guest(v, ctxt);
+    domain_unlock(d);
+
+    free_vcpu_guest_context(ctxt);
+
+    return rc;
+}
+
 int arch_vcpu_reset(struct vcpu *v)
 {
     vcpu_end_shutdown_deferral(v);
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 971c88d..a6775c9 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -37,6 +37,7 @@
 #include <xen/wait.h>
 #include <xen/guest_access.h>
 #include <public/sysctl.h>
+#include <public/hvm/hvm_vcpu.h>
 #include <asm/regs.h>
 #include <asm/mc146818rtc.h>
 #include <asm/system.h>
@@ -1183,6 +1184,301 @@ int arch_set_info_guest(
 #undef c
 }
 
+static inline int check_segment(struct segment_register reg,
+                                enum x86_segment seg)
+{
+
+    if ( reg.attr.fields.pad != 0 )
+    {
+        gprintk(XENLOG_ERR,
+                "Attribute bits 12-15 of the segment are not zero\n");
+        return -EINVAL;
+    }
+
+    if ( reg.sel == 0 && reg.base == 0 && reg.limit == 0 &&
+         reg.attr.bytes == 0)
+    {
+        if ( seg == x86_seg_cs || seg == x86_seg_ss )
+        {
+            gprintk(XENLOG_ERR, "Null selector provided for CS or SS\n");
+            return -EINVAL;
+        }
+        return 0;
+    }
+
+    if ( reg.attr.fields.p != 1 )
+    {
+        gprintk(XENLOG_ERR, "Non-present segment provided\n");
+        return -EINVAL;
+    }
+
+    if ( seg == x86_seg_cs && !(reg.attr.fields.type & 0x8) )
+    {
+        gprintk(XENLOG_ERR, "CS is missing code type\n");
+        return -EINVAL;
+    }
+
+    if ( seg != x86_seg_cs && !!(reg.attr.fields.type & 0x8) )
+    {
+        gprintk(XENLOG_ERR, "Non code segment with code type set\n");
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+/* Called by VCPUOP_initialise for HVM guests. */
+int arch_set_info_hvm_guest(struct vcpu *v, const vcpu_hvm_context_t *ctx)
+{
+    struct cpu_user_regs *uregs = &v->arch.user_regs;
+    struct segment_register cs, ds, ss, es, tr;
+    int rc;
+
+    if ( ctx->pad != 0 )
+    {
+        gprintk(XENLOG_ERR, "Padding field != 0\n");
+        return -EINVAL;
+    }
+
+    switch ( ctx->mode )
+    {
+    default:
+        return -EINVAL;
+
+    case VCPU_HVM_MODE_32B:
+    {
+        const struct vcpu_hvm_x86_32 *regs = &ctx->cpu_regs.x86_32;
+        uint32_t limit;
+
+        if ( ctx->cpu_regs.x86_32.pad1 != 0 ||
+             ctx->cpu_regs.x86_32.pad2[0] != 0 ||
+             ctx->cpu_regs.x86_32.pad2[1] != 0  ||
+             ctx->cpu_regs.x86_32.pad2[2] != 0 )
+        {
+            gprintk(XENLOG_ERR, "Padding field != 0\n");
+            return -EINVAL;
+        }
+
+#define SEG(s, r)                                                       \
+    (struct segment_register){ .sel = 0, .base = (r)->s ## _base,       \
+            .limit = (r)->s ## _limit, .attr.bytes = (r)->s ## _ar }
+        cs = SEG(cs, regs);
+        ds = SEG(ds, regs);
+        ss = SEG(ss, regs);
+        es = SEG(es, regs);
+        tr = SEG(tr, regs);
+#undef SEG
+
+        rc = check_segment(cs, x86_seg_cs);
+        rc |= check_segment(ds, x86_seg_ds);
+        rc |= check_segment(ss, x86_seg_ss);
+        rc |= check_segment(es, x86_seg_es);
+        rc |= check_segment(tr, x86_seg_tr);
+        if ( rc != 0 )
+            return rc;
+
+        /* Basic sanity checks. */
+        limit = cs.limit;
+        if ( cs.attr.fields.g )
+            limit = (limit << 12) | 0xfff;
+        if ( regs->eip > limit )
+        {
+            gprintk(XENLOG_ERR, "EIP (%#08x) outside CS limit (%#08x)\n",
+                                regs->eip, limit);
+            return -EINVAL;
+        }
+
+        if ( ds.attr.fields.p && ds.attr.fields.dpl > cs.attr.fields.dpl )
+        {
+            gprintk(XENLOG_ERR, "DS.DPL (%u) is greater than CS.DPL (%u)\n",
+                                ds.attr.fields.dpl, cs.attr.fields.dpl);
+            return -EINVAL;
+        }
+
+        if ( ss.attr.fields.p && ss.attr.fields.dpl != cs.attr.fields.dpl )
+        {
+            gprintk(XENLOG_ERR, "SS.DPL (%u) is different than CS.DPL (%u)\n",
+                                ss.attr.fields.dpl, cs.attr.fields.dpl);
+            return -EINVAL;
+        }
+
+        if ( es.attr.fields.p && es.attr.fields.dpl > cs.attr.fields.dpl )
+        {
+            gprintk(XENLOG_ERR, "ES.DPL (%u) is greater than CS.DPL (%u)\n",
+                                es.attr.fields.dpl, cs.attr.fields.dpl);
+            return -EINVAL;
+        }
+
+        if ( (regs->efer & EFER_LMA) && !(regs->efer & EFER_LME) )
+        {
+            gprintk(XENLOG_ERR, "EFER.LMA set without EFER.LME (%#016lx)\n",
+                                regs->efer);
+            return -EINVAL;
+        }
+
+        uregs->rax    = regs->eax;
+        uregs->rcx    = regs->ecx;
+        uregs->rdx    = regs->edx;
+        uregs->rbx    = regs->ebx;
+        uregs->rsp    = regs->esp;
+        uregs->rbp    = regs->ebp;
+        uregs->rsi    = regs->esi;
+        uregs->rdi    = regs->edi;
+        uregs->rip    = regs->eip;
+        uregs->rflags = regs->eflags;
+
+        v->arch.hvm_vcpu.guest_cr[0] = regs->cr0;
+        v->arch.hvm_vcpu.guest_cr[3] = regs->cr3;
+        v->arch.hvm_vcpu.guest_cr[4] = regs->cr4;
+        v->arch.hvm_vcpu.guest_efer  = regs->efer;
+    }
+    break;
+
+    case VCPU_HVM_MODE_64B:
+    {
+        const struct vcpu_hvm_x86_64 *regs = &ctx->cpu_regs.x86_64;
+
+        /* Basic sanity checks. */
+        if ( !is_canonical_address(regs->rip) )
+        {
+            gprintk(XENLOG_ERR, "RIP contains a non-canonical address (%#lx)\n",
+                                regs->rip);
+            return -EINVAL;
+        }
+
+        if ( !(regs->cr0 & X86_CR0_PG) )
+        {
+            gprintk(XENLOG_ERR, "CR0 doesn't have paging enabled (%#016lx)\n",
+                                regs->cr0);
+            return -EINVAL;
+        }
+
+        if ( !(regs->cr4 & X86_CR4_PAE) )
+        {
+            gprintk(XENLOG_ERR, "CR4 doesn't have PAE enabled (%#016lx)\n",
+                                regs->cr4);
+            return -EINVAL;
+        }
+
+        if ( !(regs->efer & EFER_LME) )
+        {
+            gprintk(XENLOG_ERR, "EFER doesn't have LME enabled (%#016lx)\n",
+                                regs->efer);
+            return -EINVAL;
+        }
+
+        uregs->rax    = regs->rax;
+        uregs->rcx    = regs->rcx;
+        uregs->rdx    = regs->rdx;
+        uregs->rbx    = regs->rbx;
+        uregs->rsp    = regs->rsp;
+        uregs->rbp    = regs->rbp;
+        uregs->rsi    = regs->rsi;
+        uregs->rdi    = regs->rdi;
+        uregs->rip    = regs->rip;
+        uregs->rflags = regs->rflags;
+
+        v->arch.hvm_vcpu.guest_cr[0] = regs->cr0;
+        v->arch.hvm_vcpu.guest_cr[3] = regs->cr3;
+        v->arch.hvm_vcpu.guest_cr[4] = regs->cr4;
+        v->arch.hvm_vcpu.guest_efer  = regs->efer;
+
+#define SEG(b, l, a)                                                    \
+    (struct segment_register){ .sel = 0, .base = (b), .limit = (l),     \
+                               .attr.bytes = (a) }
+        cs = SEG(0, ~0u, 0xa9b); /* 64bit code segment. */
+        ds = ss = es = SEG(0, ~0u, 0xc93);
+        tr = SEG(0, 0x67, 0x8b); /* 64bit TSS (busy). */
+#undef SEG
+    }
+    break;
+
+    }
+
+    if ( v->arch.hvm_vcpu.guest_efer & EFER_LME )
+        v->arch.hvm_vcpu.guest_efer |= EFER_LMA;
+
+    hvm_update_guest_cr(v, 0);
+    hvm_update_guest_cr(v, 3);
+    hvm_update_guest_cr(v, 4);
+    hvm_update_guest_efer(v);
+
+    if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) )
+    {
+        /* Shadow-mode CR3 change. Check PDBR and update refcounts. */
+        struct page_info *page = get_page_from_gfn(v->domain,
+                                 v->arch.hvm_vcpu.guest_cr[3] >> PAGE_SHIFT,
+                                 NULL, P2M_ALLOC);
+        if ( !page )
+        {
+            gprintk(XENLOG_ERR, "Invalid CR3: %#lx\n",
+                     v->arch.hvm_vcpu.guest_cr[3]);
+            domain_crash(v->domain);
+            return -EINVAL;
+        }
+
+        v->arch.guest_table = pagetable_from_page(page);
+    }
+
+    hvm_set_segment_register(v, x86_seg_cs, &cs);
+    hvm_set_segment_register(v, x86_seg_ds, &ds);
+    hvm_set_segment_register(v, x86_seg_ss, &ss);
+    hvm_set_segment_register(v, x86_seg_es, &es);
+    hvm_set_segment_register(v, x86_seg_tr, &tr);
+
+    /* Sync AP's TSC with BSP's. */
+    v->arch.hvm_vcpu.cache_tsc_offset =
+        v->domain->vcpu[0]->arch.hvm_vcpu.cache_tsc_offset;
+    hvm_funcs.set_tsc_offset(v, v->arch.hvm_vcpu.cache_tsc_offset,
+                             v->domain->arch.hvm_domain.sync_tsc);
+
+    paging_update_paging_modes(v);
+
+    v->is_initialised = 1;
+    set_bit(_VPF_down, &v->pause_flags);
+
+    return 0;
+}
+
+int arch_initialize_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+    struct domain *d = v->domain;
+    int rc;
+
+    if ( is_hvm_vcpu(v) )
+    {
+        struct vcpu_hvm_context ctxt;
+
+        if ( copy_from_guest(&ctxt, arg, 1) )
+            return -EFAULT;
+
+        domain_lock(d);
+        rc = v->is_initialised ? -EEXIST : arch_set_info_hvm_guest(v, &ctxt);
+        domain_unlock(d);
+    }
+    else
+    {
+        struct vcpu_guest_context *ctxt;
+
+        if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
+            return -ENOMEM;
+
+        if ( copy_from_guest(ctxt, arg, 1) )
+        {
+            free_vcpu_guest_context(ctxt);
+            return -EFAULT;
+        }
+
+        domain_lock(d);
+        rc = v->is_initialised ? -EEXIST : arch_set_info_guest(v, ctxt);
+        domain_unlock(d);
+
+        free_vcpu_guest_context(ctxt);
+    }
+
+    return rc;
+}
+
 int arch_vcpu_reset(struct vcpu *v)
 {
     if ( is_pv_vcpu(v) )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 409b61d..438aa27 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5042,6 +5042,10 @@ static long hvm_vcpu_op(
     case VCPUOP_stop_singleshot_timer:
     case VCPUOP_register_vcpu_info:
     case VCPUOP_register_vcpu_time_memory_area:
+    case VCPUOP_initialise:
+    case VCPUOP_up:
+    case VCPUOP_down:
+    case VCPUOP_is_up:
         rc = do_vcpu_op(cmd, vcpuid, arg);
         break;
     default:
@@ -5100,6 +5104,10 @@ static long hvm_vcpu_op_compat32(
     case VCPUOP_stop_singleshot_timer:
     case VCPUOP_register_vcpu_info:
     case VCPUOP_register_vcpu_time_memory_area:
+    case VCPUOP_initialise:
+    case VCPUOP_up:
+    case VCPUOP_down:
+    case VCPUOP_is_up:
         rc = compat_vcpu_op(cmd, vcpuid, arg);
         break;
     default:
diff --git a/xen/common/compat/domain.c b/xen/common/compat/domain.c
index c60d824..88bfdc8 100644
--- a/xen/common/compat/domain.c
+++ b/xen/common/compat/domain.c
@@ -11,6 +11,7 @@ asm(".file \"" __FILE__ "\"");
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
 #include <compat/vcpu.h>
+#include <compat/hvm/hvm_vcpu.h>
 
 #define xen_vcpu_set_periodic_timer vcpu_set_periodic_timer
 CHECK_vcpu_set_periodic_timer;
@@ -24,6 +25,14 @@ CHECK_SIZE_(struct, vcpu_info);
 CHECK_vcpu_register_vcpu_info;
 #undef xen_vcpu_register_vcpu_info
 
+#define xen_vcpu_hvm_context vcpu_hvm_context
+#define xen_vcpu_hvm_x86_32 vcpu_hvm_x86_32
+#define xen_vcpu_hvm_x86_64 vcpu_hvm_x86_64
+CHECK_vcpu_hvm_context;
+#undef xen_vcpu_hvm_x86_64
+#undef xen_vcpu_hvm_x86_32
+#undef xen_vcpu_hvm_context
+
 int compat_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     struct domain *d = current->domain;
@@ -37,33 +46,44 @@ int compat_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) ar
     {
     case VCPUOP_initialise:
     {
-        struct compat_vcpu_guest_context *cmp_ctxt;
-
         if ( v->vcpu_info == &dummy_vcpu_info )
             return -EINVAL;
 
-        if ( (cmp_ctxt = xmalloc(struct compat_vcpu_guest_context)) == NULL )
+        if ( is_hvm_vcpu(v) )
         {
-            rc = -ENOMEM;
-            break;
-        }
+            struct vcpu_hvm_context ctxt;
 
-        if ( copy_from_guest(cmp_ctxt, arg, 1) )
-        {
-            xfree(cmp_ctxt);
-            rc = -EFAULT;
-            break;
+            if ( copy_from_guest(&ctxt, arg, 1) )
+                return -EFAULT;
+
+            domain_lock(d);
+            rc = v->is_initialised ? -EEXIST : arch_set_info_hvm_guest(v, &ctxt);
+            domain_unlock(d);
         }
+        else
+        {
+            struct compat_vcpu_guest_context *ctxt;
+
+            if ( (ctxt = xmalloc(struct compat_vcpu_guest_context)) == NULL )
+                return -ENOMEM;
+
+            if ( copy_from_guest(ctxt, arg, 1) )
+            {
+                xfree(ctxt);
+                return -EFAULT;
+            }
 
-        domain_lock(d);
-        rc = v->is_initialised ? -EEXIST : arch_set_info_guest(v, cmp_ctxt);
-        domain_unlock(d);
+            domain_lock(d);
+            rc = v->is_initialised ? -EEXIST : arch_set_info_guest(v, ctxt);
+            domain_unlock(d);
+
+            xfree(ctxt);
+        }
 
         if ( rc == -ERESTART )
             rc = hypercall_create_continuation(__HYPERVISOR_vcpu_op, "iuh",
                                                cmd, vcpuid, arg);
 
-        xfree(cmp_ctxt);
         break;
     }
 
diff --git a/xen/common/domain.c b/xen/common/domain.c
index f56b7ff..c3c5e29 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1212,7 +1212,6 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     struct domain *d = current->domain;
     struct vcpu *v;
-    struct vcpu_guest_context *ctxt;
     long rc = 0;
 
     if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL )
@@ -1224,21 +1223,7 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( v->vcpu_info == &dummy_vcpu_info )
             return -EINVAL;
 
-        if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
-            return -ENOMEM;
-
-        if ( copy_from_guest(ctxt, arg, 1) )
-        {
-            free_vcpu_guest_context(ctxt);
-            return -EFAULT;
-        }
-
-        domain_lock(d);
-        rc = v->is_initialised ? -EEXIST : arch_set_info_guest(v, ctxt);
-        domain_unlock(d);
-
-        free_vcpu_guest_context(ctxt);
-
+        rc = arch_initialize_vcpu(v, arg);
         if ( rc == -ERESTART )
             rc = hypercall_create_continuation(__HYPERVISOR_vcpu_op, "iuh",
                                                cmd, vcpuid, arg);
diff --git a/xen/include/Makefile b/xen/include/Makefile
index c674f7f..94ba3d8 100644
--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -26,6 +26,7 @@ headers-$(CONFIG_X86)     += compat/arch-x86/pmu.h
 headers-$(CONFIG_X86)     += compat/arch-x86/xen-mca.h
 headers-$(CONFIG_X86)     += compat/arch-x86/xen.h
 headers-$(CONFIG_X86)     += compat/arch-x86/xen-$(compat-arch-y).h
+headers-$(CONFIG_X86)     += compat/hvm/hvm_vcpu.h
 headers-y                 += compat/arch-$(compat-arch-y).h compat/pmu.h compat/xlat.h
 headers-$(FLASK_ENABLE)   += compat/xsm/flask_op.h
 
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index c825975..e8f7037 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -600,6 +600,9 @@ static inline void free_vcpu_guest_context(struct vcpu_guest_context *vgc)
     vfree(vgc);
 }
 
+struct vcpu_hvm_context;
+int arch_set_info_hvm_guest(struct vcpu *v, const struct vcpu_hvm_context *ctx);
+
 #endif /* __ASM_DOMAIN_H__ */
 
 /*
diff --git a/xen/include/public/hvm/hvm_vcpu.h b/xen/include/public/hvm/hvm_vcpu.h
new file mode 100644
index 0000000..d21abf1
--- /dev/null
+++ b/xen/include/public/hvm/hvm_vcpu.h
@@ -0,0 +1,144 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2015, Roger Pau Monne <roger.pau@citrix.com>
+ */
+
+#ifndef __XEN_PUBLIC_HVM_HVM_VCPU_H__
+#define __XEN_PUBLIC_HVM_HVM_VCPU_H__
+
+#include "../xen.h"
+
+struct vcpu_hvm_x86_32 {
+    uint32_t eax;
+    uint32_t ecx;
+    uint32_t edx;
+    uint32_t ebx;
+    uint32_t esp;
+    uint32_t ebp;
+    uint32_t esi;
+    uint32_t edi;
+    uint32_t eip;
+    uint32_t eflags;
+
+    uint32_t cr0;
+    uint32_t cr3;
+    uint32_t cr4;
+
+    uint32_t pad1;
+
+    /*
+     * EFER should only be used to set the NXE bit (if required)
+     * when starting a vCPU in 32bit mode with paging enabled or
+     * to set the LME/LMA bits in order to start the vCPU in
+     * compatibility mode.
+     */
+    uint64_t efer;
+
+    uint32_t cs_base;
+    uint32_t ds_base;
+    uint32_t ss_base;
+    uint32_t es_base;
+    uint32_t tr_base;
+    uint32_t cs_limit;
+    uint32_t ds_limit;
+    uint32_t ss_limit;
+    uint32_t es_limit;
+    uint32_t tr_limit;
+    uint16_t cs_ar;
+    uint16_t ds_ar;
+    uint16_t ss_ar;
+    uint16_t es_ar;
+    uint16_t tr_ar;
+
+    uint16_t pad2[3];
+};
+
+/*
+ * The layout of the _ar fields of the segment registers is the
+ * following:
+ *
+ * Bits   [0,3]: type (bits 40-43).
+ * Bit        4: s    (descriptor type, bit 44).
+ * Bit    [5,6]: dpl  (descriptor privilege level, bits 45-46).
+ * Bit        7: p    (segment-present, bit 47).
+ * Bit        8: avl  (available for system software, bit 52).
+ * Bit        9: l    (64-bit code segment, bit 53).
+ * Bit       10: db   (meaning depends on the segment, bit 54).
+ * Bit       11: g    (granularity, bit 55)
+ * Bits [12,15]: unused, must be blank.
+ *
+ * A more complete description of the meaning of this fields can be
+ * obtained from the Intel SDM, Volume 3, section 3.4.5.
+ */
+
+struct vcpu_hvm_x86_64 {
+    uint64_t rax;
+    uint64_t rcx;
+    uint64_t rdx;
+    uint64_t rbx;
+    uint64_t rsp;
+    uint64_t rbp;
+    uint64_t rsi;
+    uint64_t rdi;
+    uint64_t rip;
+    uint64_t rflags;
+
+    uint64_t cr0;
+    uint64_t cr3;
+    uint64_t cr4;
+    uint64_t efer;
+
+    /*
+     * Using VCPU_HVM_MODE_64B implies that the vCPU is launched
+     * directly in long mode, so the cached parts of the segment
+     * registers get set to match that environment.
+     *
+     * If the user wants to launch the vCPU in compatibility mode
+     * the 32-bit structure should be used instead.
+     */
+};
+
+struct vcpu_hvm_context {
+#define VCPU_HVM_MODE_32B 0  /* 32bit fields of the structure will be used. */
+#define VCPU_HVM_MODE_64B 1  /* 64bit fields of the structure will be used. */
+    uint32_t mode;
+
+    uint32_t pad;
+
+    /* CPU registers. */
+    union {
+        struct vcpu_hvm_x86_32 x86_32;
+        struct vcpu_hvm_x86_64 x86_64;
+    } cpu_regs;
+};
+typedef struct vcpu_hvm_context vcpu_hvm_context_t;
+DEFINE_XEN_GUEST_HANDLE(vcpu_hvm_context_t);
+
+#endif /* __XEN_PUBLIC_HVM_HVM_VCPU_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/public/vcpu.h b/xen/include/public/vcpu.h
index 898b89f..692b87a 100644
--- a/xen/include/public/vcpu.h
+++ b/xen/include/public/vcpu.h
@@ -41,8 +41,10 @@
  * Initialise a VCPU. Each VCPU can be initialised only once. A 
  * newly-initialised VCPU will not run until it is brought up by VCPUOP_up.
  * 
- * @extra_arg == pointer to vcpu_guest_context structure containing initial
- *               state for the VCPU.
+ * @extra_arg == For PV or ARM guests this is a pointer to a vcpu_guest_context
+ *               structure containing the initial state for the VCPU. For x86
+ *               HVM based guests this is a pointer to a vcpu_hvm_context
+ *               structure.
  */
 #define VCPUOP_initialise            0
 
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 3dca3f6..601bdab 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -64,6 +64,8 @@ int arch_domain_soft_reset(struct domain *d);
 int arch_set_info_guest(struct vcpu *, vcpu_guest_context_u);
 void arch_get_info_guest(struct vcpu *, vcpu_guest_context_u);
 
+int arch_initialize_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg);
+
 int domain_relinquish_resources(struct domain *d);
 
 void dump_pageframe_info(struct domain *d);
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 3795059..fda1137 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -56,6 +56,9 @@
 ?       grant_entry_header              grant_table.h
 ?	grant_entry_v2			grant_table.h
 ?	gnttab_swap_grant_ref		grant_table.h
+?	vcpu_hvm_context		hvm/hvm_vcpu.h
+?	vcpu_hvm_x86_32			hvm/hvm_vcpu.h
+?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
 !	kexec_range			kexec.h
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 18/21] libxc/xen: introduce a start info structure for HVMlite guests
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (16 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 17/21] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-10 16:53   ` Jan Beulich
  2015-11-06 16:05 ` [PATCH v8 19/21] libxc: switch xc_dom_elfloader to be used with HVMlite domains Roger Pau Monne
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Andrew Cooper,
	Ian Jackson, Jan Beulich, Roger Pau Monne

This structure contains the physical address of the command line, as well as
the physical address of the list of loaded modules. The physical address of
this structure is passed to the guest at boot time in the %ebx register.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v7:
 - Add a comment to clarify that nothing will be loaded at physical address
   0.
 - Add Andrew Cooper Reviewed-by.
 - Add Wei Liu Ack.

Changes since v6:
 - Add a check to make sure the start info data is placed below 4GB.
 - Make sure byte addresses are treated as uintptr_t.
 - Fix single-line comment.

Changes since v5:
 - Change some of the calculations performed to get the total size of the
   start_info region.
 - Replace the mention of HVMlite in a comment with PVH.
 - Don't use 64bit integers in hvm_modlist_entry.
---
 tools/libxc/xc_dom_x86.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++-
 xen/include/public/xen.h | 23 ++++++++++++++++
 2 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 63b4832..7cf2a17 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -561,7 +561,70 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
     xc_hvm_param_set(xch, domid, HVM_PARAM_SHARING_RING_PFN,
                      special_pfn(SPECIALPAGE_SHARING));
 
-    if ( dom->device_model )
+    if ( !dom->device_model )
+    {
+        struct xc_dom_seg seg;
+        struct hvm_start_info *start_info;
+        char *cmdline;
+        struct hvm_modlist_entry *modlist;
+        void *start_page;
+        size_t cmdline_size = 0;
+        size_t start_info_size = sizeof(*start_info);
+
+        if ( dom->cmdline )
+        {
+            cmdline_size = ROUNDUP(strlen(dom->cmdline) + 1, 8);
+            start_info_size += cmdline_size;
+
+        }
+        if ( dom->ramdisk_blob )
+            start_info_size += sizeof(*modlist); /* Limited to one module. */
+
+        rc = xc_dom_alloc_segment(dom, &seg, "HVMlite start info", 0,
+                                  start_info_size);
+        if ( rc != 0 )
+        {
+            DOMPRINTF("Unable to reserve memory for the start info");
+            goto out;
+        }
+
+        start_page = xc_map_foreign_range(xch, domid, start_info_size,
+                                          PROT_READ | PROT_WRITE,
+                                          seg.pfn);
+        if ( start_page == NULL )
+        {
+            DOMPRINTF("Unable to map HVM start info page");
+            goto error_out;
+        }
+
+        start_info = start_page;
+        cmdline = start_page + sizeof(*start_info);
+        modlist = start_page + sizeof(*start_info) + cmdline_size;
+
+        if ( dom->cmdline )
+        {
+            strncpy(cmdline, dom->cmdline, MAX_GUEST_CMDLINE);
+            cmdline[MAX_GUEST_CMDLINE - 1] = '\0';
+            start_info->cmdline_paddr = (seg.pfn << PAGE_SHIFT) +
+                                ((uintptr_t)cmdline - (uintptr_t)start_info);
+        }
+
+        if ( dom->ramdisk_blob )
+        {
+            modlist[0].paddr = dom->ramdisk_seg.vstart - dom->parms.virt_base;
+            modlist[0].size = dom->ramdisk_seg.vend - dom->ramdisk_seg.vstart;
+            start_info->modlist_paddr = (seg.pfn << PAGE_SHIFT) +
+                                ((uintptr_t)modlist - (uintptr_t)start_info);
+            start_info->nr_modules = 1;
+        }
+
+        start_info->magic = HVM_START_MAGIC_VALUE;
+
+        munmap(start_page, start_info_size);
+
+        dom->start_info_pfn = seg.pfn;
+    }
+    else
     {
         /*
          * Allocate and clear additional ioreq server pages. The default
@@ -953,6 +1016,9 @@ static int vcpu_hvm(struct xc_dom_image *dom)
     /* Set the IP. */
     bsp_ctx.cpu.rip = dom->parms.phys_entry;
 
+    if ( dom->start_info_pfn )
+        bsp_ctx.cpu.rbx = dom->start_info_pfn << PAGE_SHIFT;
+
     /* Set the end descriptor. */
     bsp_ctx.end_d.typecode = HVM_SAVE_CODE(END);
     bsp_ctx.end_d.instance = 0;
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index ff5547e..5486da3 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -784,6 +784,29 @@ struct start_info {
 };
 typedef struct start_info start_info_t;
 
+/*
+ * Start of day structure passed to PVH guests in %ebx.
+ *
+ * NOTE: nothing will be loaded at physical address 0, so
+ * a 0 value in any of this fields should be treated as
+ * not present.
+ */
+struct hvm_start_info {
+#define HVM_START_MAGIC_VALUE 0x336ec578
+    uint32_t magic;             /* Contains the magic value 0x336ec578       */
+                                /* ("xEn3" with the 0x80 bit of the "E" set).*/
+    uint32_t flags;             /* SIF_xxx flags.                            */
+    uint32_t cmdline_paddr;     /* Physical address of the command line.     */
+    uint32_t nr_modules;        /* Number of modules passed to the kernel.   */
+    uint32_t modlist_paddr;     /* Physical address of an array of           */
+                                /* hvm_modlist_entry.                        */
+};
+
+struct hvm_modlist_entry {
+    uint32_t paddr;             /* Physical address of the module.           */
+    uint32_t size;              /* Size of the module in bytes.              */
+};
+
 /* New console union for dom0 introduced in 0x00030203. */
 #if __XEN_INTERFACE_VERSION__ < 0x00030203
 #define console_mfn    console.domU.mfn
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 19/21] libxc: switch xc_dom_elfloader to be used with HVMlite domains
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (17 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 18/21] libxc/xen: introduce a start info structure for HVMlite guests Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 20/21] libxl: allow the creation of HVM domains without a device model Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 21/21] libxl: add support for migrating HVM guests " Roger Pau Monne
  20 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Stefano Stabellini, Ian Jackson, Ian Campbell, Roger Pau Monne

Allow xc_dom_elfloader to report a guest type as hvm-3.0-x86_32 if it's
running inside of a HVM container and has the PHYS32_ENTRY elfnote set.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
Only xc_dom_elfloader has been switched to support HVMlite, other loaders
should also be switched once we have a HVMlite compatible kernel that uses
them.
---
Changes since v5:
 - Add Wei Liu Ack.

Changes since v4:
 - Add Andrew Cooper Reviewed-by.
---
 tools/libxc/xc_dom_elfloader.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/libxc/xc_dom_elfloader.c b/tools/libxc/xc_dom_elfloader.c
index 36a115e..2ae575e 100644
--- a/tools/libxc/xc_dom_elfloader.c
+++ b/tools/libxc/xc_dom_elfloader.c
@@ -56,6 +56,10 @@ static char *xc_dom_guest_type(struct xc_dom_image *dom,
 {
     uint64_t machine = elf_uval(elf, elf->ehdr, e_machine);
 
+    if ( dom->container_type == XC_DOM_HVM_CONTAINER &&
+         dom->parms.phys_entry != UNSET_ADDR )
+        return "hvm-3.0-x86_32";
+
     switch ( machine )
     {
     case EM_386:
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 20/21] libxl: allow the creation of HVM domains without a device model.
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (18 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 19/21] libxc: switch xc_dom_elfloader to be used with HVMlite domains Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-06 16:05 ` [PATCH v8 21/21] libxl: add support for migrating HVM guests " Roger Pau Monne
  20 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Stefano Stabellini, Ian Jackson, Ian Campbell, Roger Pau Monne

Replace the firmware loaded into HVM guests with an OS kernel. Since the HVM
builder now uses the PV xc_dom_* set of functions this kernel will be parsed
and loaded inside the guest like on PV, but the container is a pure HVM
guest.

Also, if device_model_version is set to none or a device model for the
specified domain is not present unconditinally set the nic type to
LIBXL_NIC_TYPE_VIF.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
Changes since v7:
 - Add LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE.

Changes since v5:
 - Add Wei Liu Acked-by.

Changes since v4:
 - Set dom->mmio_size to match the size of the special pages if there's no
   device model for the guest. This implies moving NR_SPECIAL_PAGES and
   X86_HVM_END_SPECIAL_REGION to a public header so they can be known by
   libxl when creating the memory map.
 - Reword the xl.cfg man page description of the "none" device model option.
 - Use libxl__device_model_version_running instead of creating a new
   function.

Changes since v3:
 - Add explicit /* fall through */ comments.
 - Expand libxl__device_nic_setdefault so that it sets the right nic type
   for HVMlite guests.
 - Remove stray space in hvm_build_set_params.
 - Fix the error paths of libxl__domain_firmware.
---
 docs/man/xl.cfg.pod.5        |  5 +++
 tools/libxc/include/xc_dom.h |  2 ++
 tools/libxc/xc_dom_x86.c     | 15 ++++-----
 tools/libxl/libxl.c          | 44 ++++++++++++++++++--------
 tools/libxl/libxl.h          |  8 +++++
 tools/libxl/libxl_create.c   | 16 +++++++++-
 tools/libxl/libxl_dm.c       |  3 +-
 tools/libxl/libxl_dom.c      | 74 ++++++++++++++++++++++++++++++--------------
 tools/libxl/libxl_internal.h |  9 +++++-
 tools/libxl/libxl_types.idl  |  1 +
 tools/libxl/libxl_x86.c      | 15 ++++++---
 tools/libxl/xl_cmdimpl.c     |  2 ++
 12 files changed, 144 insertions(+), 50 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index b63846a..54de01c 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1783,6 +1783,11 @@ This device-model is the default for Linux dom0.
 Use the device-model based upon the historical Xen fork of Qemu.
 This device-model is still the default for NetBSD dom0.
 
+=item B<none>
+
+Don't use any device model. This requires a kernel capable of booting
+without emulated devices.
+
 =back
 
 It is recommended to accept the default value for new guests.  If
diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 0f84600..676858f 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -17,6 +17,8 @@
 #include <xenguest.h>
 
 #define INVALID_P2M_ENTRY   ((xen_pfn_t)-1)
+#define X86_HVM_NR_SPECIAL_PAGES    8
+#define X86_HVM_END_SPECIAL_REGION  0xff000u
 
 /* --- typedefs and structs ---------------------------------------- */
 
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 7cf2a17..2f5c5a9 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -57,8 +57,8 @@
 #define SPECIALPAGE_IOREQ    5
 #define SPECIALPAGE_IDENT_PT 6
 #define SPECIALPAGE_CONSOLE  7
-#define NR_SPECIAL_PAGES     8
-#define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))
+#define special_pfn(x) \
+    (X86_HVM_END_SPECIAL_REGION - X86_HVM_NR_SPECIAL_PAGES + (x))
 
 #define NR_IOREQ_SERVER_PAGES 8
 #define ioreq_server_pfn(x) (special_pfn(0) - NR_IOREQ_SERVER_PAGES + (x))
@@ -517,7 +517,7 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
     void *hvm_info_page;
     uint32_t *ident_pt, domid = dom->guest_domid;
     int rc;
-    xen_pfn_t special_array[NR_SPECIAL_PAGES];
+    xen_pfn_t special_array[X86_HVM_NR_SPECIAL_PAGES];
     xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
     xc_interface *xch = dom->xch;
 
@@ -532,18 +532,19 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
     }
 
     /* Allocate and clear special pages. */
-    for ( i = 0; i < NR_SPECIAL_PAGES; i++ )
+    for ( i = 0; i < X86_HVM_NR_SPECIAL_PAGES; i++ )
         special_array[i] = special_pfn(i);
 
-    rc = xc_domain_populate_physmap_exact(xch, domid, NR_SPECIAL_PAGES, 0, 0,
-                                          special_array);
+    rc = xc_domain_populate_physmap_exact(xch, domid, X86_HVM_NR_SPECIAL_PAGES,
+                                          0, 0, special_array);
     if ( rc != 0 )
     {
         DOMPRINTF("Could not allocate special pages.");
         goto error_out;
     }
 
-    if ( xc_clear_domain_pages(xch, domid, special_pfn(0), NR_SPECIAL_PAGES) )
+    if ( xc_clear_domain_pages(xch, domid, special_pfn(0),
+                               X86_HVM_NR_SPECIAL_PAGES) )
             goto error_out;
 
     xc_hvm_param_set(xch, domid, HVM_PARAM_STORE_PFN,
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 854e957..0284026 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1065,11 +1065,14 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
     }
 
     if (type == LIBXL_DOMAIN_TYPE_HVM) {
-        rc = libxl__domain_resume_device_model(gc, domid);
-        if (rc < 0) {
-            LOG(ERROR, "failed to unpause device model for domain %u:%d",
-                domid, rc);
-            goto out;
+        if (libxl__device_model_version_running(gc, domid) !=
+            LIBXL_DEVICE_MODEL_VERSION_NONE) {
+            rc = libxl__domain_resume_device_model(gc, domid);
+            if (rc < 0) {
+                LOG(ERROR, "failed to unpause device model for domain %u:%d",
+                    domid, rc);
+                goto out;
+            }
         }
     }
     ret = xc_domain_unpause(ctx->xch, domid);
@@ -1616,11 +1619,11 @@ void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis)
 
     switch (libxl__domain_type(gc, domid)) {
     case LIBXL_DOMAIN_TYPE_HVM:
-        if (!libxl_get_stubdom_id(CTX, domid))
-            dm_present = 1;
-        else
+        if (libxl_get_stubdom_id(CTX, domid)) {
             dm_present = 0;
-        break;
+            break;
+        }
+        /* fall through */
     case LIBXL_DOMAIN_TYPE_PV:
         pid = libxl__xs_read(gc, XBT_NULL, libxl__sprintf(gc, "/local/domain/%d/image/device-model-pid", domid));
         dm_present = (pid != NULL);
@@ -3242,7 +3245,7 @@ out:
 /******************************************************************************/
 
 int libxl__device_nic_setdefault(libxl__gc *gc, libxl_device_nic *nic,
-                                 uint32_t domid)
+                                 uint32_t domid, libxl_domain_build_info *info)
 {
     int rc;
 
@@ -3279,8 +3282,23 @@ int libxl__device_nic_setdefault(libxl__gc *gc, libxl_device_nic *nic,
 
     switch (libxl__domain_type(gc, domid)) {
     case LIBXL_DOMAIN_TYPE_HVM:
-        if (!nic->nictype)
-            nic->nictype = LIBXL_NIC_TYPE_VIF_IOEMU;
+        if (!nic->nictype) {
+            if (info != NULL) {
+                /* Path taken at creation time. */
+                if (info->device_model_version ==
+                    LIBXL_DEVICE_MODEL_VERSION_NONE)
+                    nic->nictype = LIBXL_NIC_TYPE_VIF;
+                else
+                    nic->nictype = LIBXL_NIC_TYPE_VIF_IOEMU;
+            } else {
+                /* Path taken when hot-adding a nic. */
+                if (libxl__device_model_version_running(gc, domid) ==
+                    LIBXL_DEVICE_MODEL_VERSION_NONE)
+                    nic->nictype = LIBXL_NIC_TYPE_VIF;
+                else
+                    nic->nictype = LIBXL_NIC_TYPE_VIF_IOEMU;
+            }
+        }
         break;
     case LIBXL_DOMAIN_TYPE_PV:
         if (nic->nictype == LIBXL_NIC_TYPE_VIF_IOEMU) {
@@ -3329,7 +3347,7 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t domid,
     libxl_device_nic_init(&nic_saved);
     libxl_device_nic_copy(CTX, &nic_saved, nic);
 
-    rc = libxl__device_nic_setdefault(gc, nic, domid);
+    rc = libxl__device_nic_setdefault(gc, nic, domid, NULL);
     if (rc) goto out;
 
     front = flexarray_make(gc, 16, 1);
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 168fedd..b43cce4 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -853,6 +853,14 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
 */
 #define LIBXL_HAVE_GFX_PASSTHRU_KIND
 
+/*
+ * LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE
+ *
+ * In the case that LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE is set libxl
+ * allows the creation of HVM guests without a device model.
+ */
+#define LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE 1
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index f0fee00..7520d4a 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -119,6 +119,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
                 b_info->u.hvm.bios = LIBXL_BIOS_TYPE_ROMBIOS; break;
             case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
                 b_info->u.hvm.bios = LIBXL_BIOS_TYPE_SEABIOS; break;
+            case LIBXL_DEVICE_MODEL_VERSION_NONE:
+                break;
             default:return ERROR_INVAL;
             }
 
@@ -132,6 +134,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
             if (b_info->u.hvm.bios == LIBXL_BIOS_TYPE_ROMBIOS)
                 return ERROR_INVAL;
             break;
+        case LIBXL_DEVICE_MODEL_VERSION_NONE:
+            break;
         default:abort();
         }
 
@@ -236,6 +240,9 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
                 break;
             }
             break;
+        case LIBXL_DEVICE_MODEL_VERSION_NONE:
+            b_info->video_memkb = 0;
+            break;
         case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
         default:
             switch (b_info->u.hvm.vga.kind) {
@@ -921,7 +928,8 @@ static void initiate_domain_create(libxl__egc *egc,
          * called libxl_device_nic_add when domcreate_launch_dm gets called,
          * but qemu needs the nic information to be complete.
          */
-        ret = libxl__device_nic_setdefault(gc, &d_config->nics[i], domid);
+        ret = libxl__device_nic_setdefault(gc, &d_config->nics[i], domid,
+                                           &d_config->b_info);
         if (ret) {
             LOG(ERROR, "Unable to set nic defaults for nic %d", i);
             goto error_out;
@@ -1262,6 +1270,12 @@ static void domcreate_launch_dm(libxl__egc *egc, libxl__multidev *multidev,
         libxl__device_console_add(gc, domid, &console, state, &device);
         libxl__device_console_dispose(&console);
 
+        if (d_config->b_info.device_model_version ==
+            LIBXL_DEVICE_MODEL_VERSION_NONE) {
+            domcreate_devmodel_started(egc, &dcs->dmss.dm, 0);
+            return;
+        }
+
         libxl_device_vkb_init(&vkb);
         libxl__device_vkb_add(gc, domid, &vkb);
         libxl_device_vkb_dispose(&vkb);
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 9c9eaa3..9545e05 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -1543,7 +1543,8 @@ static void spawn_stub_launch_dm(libxl__egc *egc,
          * called libxl_device_nic_add at this point, but qemu needs
          * the nic information to be complete.
          */
-        ret = libxl__device_nic_setdefault(gc, &dm_config->nics[i], dm_domid);
+        ret = libxl__device_nic_setdefault(gc, &dm_config->nics[i], dm_domid,
+                                           &dm_config->b_info);
         if (ret)
             goto out;
     }
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index f5866b6..037d4dc 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -787,21 +787,23 @@ static int hvm_build_set_params(xc_interface *handle, uint32_t domid,
     uint64_t str_mfn, cons_mfn;
     int i;
 
-    va_map = xc_map_foreign_range(handle, domid,
-                                  XC_PAGE_SIZE, PROT_READ | PROT_WRITE,
-                                  HVM_INFO_PFN);
-    if (va_map == NULL)
-        return ERROR_FAIL;
+    if (info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_NONE) {
+        va_map = xc_map_foreign_range(handle, domid,
+                                      XC_PAGE_SIZE, PROT_READ | PROT_WRITE,
+                                      HVM_INFO_PFN);
+        if (va_map == NULL)
+            return ERROR_FAIL;
 
-    va_hvm = (struct hvm_info_table *)(va_map + HVM_INFO_OFFSET);
-    va_hvm->apic_mode = libxl_defbool_val(info->u.hvm.apic);
-    va_hvm->nr_vcpus = info->max_vcpus;
-    memset(va_hvm->vcpu_online, 0, sizeof(va_hvm->vcpu_online));
-    memcpy(va_hvm->vcpu_online, info->avail_vcpus.map, info->avail_vcpus.size);
-    for (i = 0, sum = 0; i < va_hvm->length; i++)
-        sum += ((uint8_t *) va_hvm)[i];
-    va_hvm->checksum -= sum;
-    munmap(va_map, XC_PAGE_SIZE);
+        va_hvm = (struct hvm_info_table *)(va_map + HVM_INFO_OFFSET);
+        va_hvm->apic_mode = libxl_defbool_val(info->u.hvm.apic);
+        va_hvm->nr_vcpus = info->max_vcpus;
+        memset(va_hvm->vcpu_online, 0, sizeof(va_hvm->vcpu_online));
+        memcpy(va_hvm->vcpu_online, info->avail_vcpus.map, info->avail_vcpus.size);
+        for (i = 0, sum = 0; i < va_hvm->length; i++)
+            sum += ((uint8_t *) va_hvm)[i];
+        va_hvm->checksum -= sum;
+        munmap(va_map, XC_PAGE_SIZE);
+    }
 
     xc_hvm_param_get(handle, domid, HVM_PARAM_STORE_PFN, &str_mfn);
     xc_hvm_param_get(handle, domid, HVM_PARAM_CONSOLE_PFN, &cons_mfn);
@@ -867,7 +869,7 @@ static int libxl__domain_firmware(libxl__gc *gc,
 {
     libxl_ctx *ctx = libxl__gc_owner(gc);
     const char *firmware;
-    int e, rc = ERROR_FAIL;
+    int e, rc;
     int datalen = 0;
     void *data;
 
@@ -882,18 +884,34 @@ static int libxl__domain_firmware(libxl__gc *gc,
         case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
             firmware = "hvmloader";
             break;
+        case LIBXL_DEVICE_MODEL_VERSION_NONE:
+            if (info->kernel == NULL) {
+                LOG(ERROR, "no device model requested without a kernel");
+                rc = ERROR_FAIL;
+                goto out;
+            }
+            break;
         default:
             LOG(ERROR, "invalid device model version %d",
                 info->device_model_version);
-            return ERROR_FAIL;
-            break;
+            rc = ERROR_FAIL;
+            goto out;
         }
     }
 
-    rc = xc_dom_kernel_file(dom, libxl__abs_path(gc, firmware,
+    if (info->kernel != NULL &&
+        info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE) {
+        /* Try to load a kernel instead of the firmware. */
+        rc = xc_dom_kernel_file(dom, info->kernel);
+        if (rc == 0 && info->ramdisk != NULL)
+            rc = xc_dom_ramdisk_file(dom, info->ramdisk);
+    } else {
+        rc = xc_dom_kernel_file(dom, libxl__abs_path(gc, firmware,
                                                  libxl__xenfirmwaredir_path()));
+    }
+
     if (rc != 0) {
-        LOGE(ERROR, "xc_dom_kernel_file failed");
+        LOGE(ERROR, "xc_dom_{kernel_file/ramdisk_file} failed");
         goto out;
     }
 
@@ -904,6 +922,7 @@ static int libxl__domain_firmware(libxl__gc *gc,
         if (e) {
             LOGEV(ERROR, e, "failed to read SMBIOS firmware file %s",
                 info->u.hvm.smbios_firmware);
+            rc = ERROR_FAIL;
             goto out;
         }
         libxl__ptr_add(gc, data);
@@ -921,6 +940,7 @@ static int libxl__domain_firmware(libxl__gc *gc,
         if (e) {
             LOGEV(ERROR, e, "failed to read ACPI firmware file %s",
                 info->u.hvm.acpi_firmware);
+            rc = ERROR_FAIL;
             goto out;
         }
         libxl__ptr_add(gc, data);
@@ -933,6 +953,7 @@ static int libxl__domain_firmware(libxl__gc *gc,
 
     return 0;
 out:
+    assert(rc != 0);
     return rc;
 }
 
@@ -945,10 +966,13 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
     uint64_t mmio_start, lowmem_end, highmem_end, mem_size;
     libxl_domain_build_info *const info = &d_config->b_info;
     struct xc_dom_image *dom = NULL;
+    bool device_model =
+        info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_NONE ?
+        true : false;
 
     xc_dom_loginit(ctx->xch);
 
-    dom = xc_dom_allocate(ctx->xch, NULL, NULL);
+    dom = xc_dom_allocate(ctx->xch, info->cmdline, NULL);
     if (!dom) {
         LOGE(ERROR, "xc_dom_allocate failed");
         rc = ERROR_NOMEM;
@@ -981,8 +1005,12 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
 
     if (dom->target_pages == 0)
         dom->target_pages = mem_size >> XC_PAGE_SHIFT;
-    if (dom->mmio_size == 0)
+    if (dom->mmio_size == 0 && device_model)
         dom->mmio_size = HVM_BELOW_4G_MMIO_LENGTH;
+    else if (dom->mmio_size == 0 && !device_model)
+        dom->mmio_size = GB(4) -
+                    ((X86_HVM_END_SPECIAL_REGION - X86_HVM_NR_SPECIAL_PAGES)
+                    << XC_PAGE_SHIFT);
     lowmem_end = mem_size;
     highmem_end = 0;
     mmio_start = (1ull << 32) - dom->mmio_size;
@@ -994,8 +1022,8 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
     dom->lowmem_end = lowmem_end;
     dom->highmem_end = highmem_end;
     dom->mmio_start = mmio_start;
-    dom->vga_hole_size = LIBXL_VGA_HOLE_SIZE;
-    dom->device_model = true;
+    dom->vga_hole_size = device_model ? LIBXL_VGA_HOLE_SIZE : 0;
+    dom->device_model = device_model;
 
     rc = libxl__domain_device_construct_rdm(gc, d_config,
                                             info->u.hvm.rdm_mem_boundary_memkb*1024,
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 0e0e9a3..091e030 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -114,6 +114,12 @@
 #define DOMID_XS_PATH "domid"
 #define INVALID_DOMID ~0
 
+/* Size macros. */
+#define __AC(X,Y)   (X##Y)
+#define _AC(X,Y)    __AC(X,Y)
+#define MB(_mb)     (_AC(_mb, ULL) << 20)
+#define GB(_gb)     (_AC(_gb, ULL) << 30)
+
 #define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0]))
 
 #define ROUNDUP(_val, _order)                                           \
@@ -1186,7 +1192,8 @@ _hidden int libxl__domain_build_info_setdefault(libxl__gc *gc,
 _hidden int libxl__device_disk_setdefault(libxl__gc *gc,
                                           libxl_device_disk *disk);
 _hidden int libxl__device_nic_setdefault(libxl__gc *gc, libxl_device_nic *nic,
-                                         uint32_t domid);
+                                         uint32_t domid,
+                                         libxl_domain_build_info *info);
 _hidden int libxl__device_vtpm_setdefault(libxl__gc *gc, libxl_device_vtpm *vtpm);
 _hidden int libxl__device_vfb_setdefault(libxl__gc *gc, libxl_device_vfb *vfb);
 _hidden int libxl__device_vkb_setdefault(libxl__gc *gc, libxl_device_vkb *vkb);
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 4d78f86..fd0202e 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -98,6 +98,7 @@ libxl_device_model_version = Enumeration("device_model_version", [
     (0, "UNKNOWN"),
     (1, "QEMU_XEN_TRADITIONAL"), # Historical qemu-xen device model (qemu-dm)
     (2, "QEMU_XEN"),             # Upstream based qemu-xen device model
+    (3, "NONE"),                 # No device model
     ])
 
 libxl_console_type = Enumeration("console_type", [
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 183b6c2..46cfafb 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -7,10 +7,16 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
                                       libxl_domain_config *d_config,
                                       xc_domain_configuration_t *xc_config)
 {
-    if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM)
+
+    if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM &&
+        d_config->b_info.device_model_version !=
+        LIBXL_DEVICE_MODEL_VERSION_NONE) {
+        /* HVM domains with a device model. */
         xc_config->emulation_flags = XEN_X86_EMU_ALL;
-    else
+    } else {
+        /* PV or HVM domains without a device model. */
         xc_config->emulation_flags = 0;
+    }
 
     return 0;
 }
@@ -485,6 +491,7 @@ int libxl__arch_domain_construct_memmap(libxl__gc *gc,
     struct e820entry *e820 = NULL;
     uint64_t highmem_size =
                     dom->highmem_end ? dom->highmem_end - (1ull << 32) : 0;
+    uint32_t lowmem_start = dom->device_model ? GUEST_LOW_MEM_START_DEFAULT : 0;
 
     /* Add all rdm entries. */
     for (i = 0; i < d_config->num_rdms; i++)
@@ -505,8 +512,8 @@ int libxl__arch_domain_construct_memmap(libxl__gc *gc,
     e820 = libxl__malloc(gc, sizeof(struct e820entry) * e820_entries);
 
     /* Low memory */
-    e820[nr].addr = GUEST_LOW_MEM_START_DEFAULT;
-    e820[nr].size = dom->lowmem_end - GUEST_LOW_MEM_START_DEFAULT;
+    e820[nr].addr = lowmem_start;
+    e820[nr].size = dom->lowmem_end - lowmem_start;
     e820[nr].type = E820_RAM;
     nr++;
 
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 03442e1..a5e3178 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -2186,6 +2186,8 @@ skip_vfb:
         } else if (!strcmp(buf, "qemu-xen")) {
             b_info->device_model_version
                 = LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
+        } else if (!strcmp(buf, "none")) {
+            b_info->device_model_version = LIBXL_DEVICE_MODEL_VERSION_NONE;
         } else {
             fprintf(stderr,
                     "Unknown device_model_version \"%s\" specified\n", buf);
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v8 21/21] libxl: add support for migrating HVM guests without a device model
  2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (19 preceding siblings ...)
  2015-11-06 16:05 ` [PATCH v8 20/21] libxl: allow the creation of HVM domains without a device model Roger Pau Monne
@ 2015-11-06 16:05 ` Roger Pau Monne
  2015-11-11 15:49   ` Wei Liu
  2015-11-11 17:14   ` Andrew Cooper
  20 siblings, 2 replies; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-06 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, Ian Jackson, Ian Campbell, Roger Pau Monne

Only some minor libxl changes are needed in order to be able to migrate HVM
guests without a device model, no hypervisor changes are needed.

This change prevents sending the emulator context if the device model
version is set to none.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
Changes since v7:
 - Prevent sending the emulator context and xenstore record in
   write_emulator_context_record and write_emulator_xenstore_record.
 - Error out if an emulator record is received for a no-dm guest.
---
 tools/libxl/libxl_dom.c          |  3 +++
 tools/libxl/libxl_dom_suspend.c  |  2 ++
 tools/libxl/libxl_stream_read.c  | 16 ++++++++++++++++
 tools/libxl/libxl_stream_write.c | 16 ++++++++++++++++
 4 files changed, 37 insertions(+)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 037d4dc..685eb25 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1315,6 +1315,9 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
         domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
         break;
+    case LIBXL_DEVICE_MODEL_VERSION_NONE:
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+        break;
     default:
         LOG(ERROR,"logdirty switch failed"
             ", no valid device model version found, abandoning suspend");
diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index 4cc01ad..3313ad1 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -43,6 +43,8 @@ int libxl__domain_suspend_device_model(libxl__gc *gc,
         if (ret)
             unlink(filename);
         break;
+    case LIBXL_DEVICE_MODEL_VERSION_NONE:
+        break;
     default:
         return ERROR_INVAL;
     }
diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 4ec29da..258dec4 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -539,6 +539,14 @@ static bool process_record(libxl__egc *egc,
         break;
 
     case REC_TYPE_EMULATOR_XENSTORE_DATA:
+        if (dcs->guest_config->b_info.device_model_version ==
+            LIBXL_DEVICE_MODEL_VERSION_NONE) {
+            rc = ERROR_FAIL;
+            LOG(ERROR,
+                "Received a xenstore emulator record when none was expected");
+            goto err;
+        }
+
         if (rec->hdr.length < sizeof(libxl__sr_emulator_hdr)) {
             rc = ERROR_FAIL;
             LOG(ERROR,
@@ -560,6 +568,14 @@ static bool process_record(libxl__egc *egc,
         break;
 
     case REC_TYPE_EMULATOR_CONTEXT:
+        if (dcs->guest_config->b_info.device_model_version ==
+            LIBXL_DEVICE_MODEL_VERSION_NONE) {
+            rc = ERROR_FAIL;
+            LOG(ERROR,
+                "Received an emulator context record when none was expected");
+            goto err;
+        }
+
         write_emulator_blob(egc, stream, rec);
         break;
 
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
index 52a60d7..0a6eaf9 100644
--- a/tools/libxl/libxl_stream_write.c
+++ b/tools/libxl/libxl_stream_write.c
@@ -232,6 +232,9 @@ void libxl__stream_write_start(libxl__egc *egc,
             stream->emu_sub_hdr.id = EMULATOR_QEMU_UPSTREAM;
             break;
 
+        case LIBXL_DEVICE_MODEL_VERSION_NONE:
+            break;
+
         default:
             rc = ERROR_FAIL;
             LOG(ERROR, "Unknown emulator for HVM domain");
@@ -359,6 +362,12 @@ static void write_emulator_xenstore_record(libxl__egc *egc,
     char *buf = NULL;
     uint32_t len = 0;
 
+    if (libxl__device_model_version_running(gc, dss->domid) ==
+        LIBXL_DEVICE_MODEL_VERSION_NONE) {
+        emulator_xenstore_record_done(egc, stream);
+        return;
+    }
+
     rc = libxl__save_emulator_xenstore_data(dss, &buf, &len);
     if (rc)
         goto err;
@@ -387,6 +396,7 @@ static void emulator_xenstore_record_done(libxl__egc *egc,
                                           libxl__stream_write_state *stream)
 {
     libxl__domain_suspend_state *dss = stream->dss;
+    STATE_AO_GC(stream->ao);
 
     if (dss->type == LIBXL_DOMAIN_TYPE_HVM)
         write_emulator_context_record(egc, stream);
@@ -410,6 +420,12 @@ static void write_emulator_context_record(libxl__egc *egc,
 
     assert(dss->type == LIBXL_DOMAIN_TYPE_HVM);
 
+    if (libxl__device_model_version_running(gc, dss->domid) ==
+        LIBXL_DEVICE_MODEL_VERSION_NONE) {
+        emulator_context_record_done(egc, stream);
+        return;
+    }
+
     /* Convenience aliases */
     const char *const filename = dss->dm_savefile;
 
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 02/21] xen/vlapic: fixes for HVM code when running without a vlapic
  2015-11-06 16:05 ` [PATCH v8 02/21] xen/vlapic: fixes for HVM code when running without a vlapic Roger Pau Monne
@ 2015-11-06 16:10   ` Andrew Cooper
  2015-11-26  5:41   ` Tian, Kevin
  1 sibling, 0 replies; 47+ messages in thread
From: Andrew Cooper @ 2015-11-06 16:10 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Kevin Tian, Jan Beulich, Eddie Dong, Aravind Gopalakrishnan,
	Jun Nakajima, Boris Ostrovsky, Suravee Suthikulpanit

On 06/11/15 16:05, Roger Pau Monne wrote:
> The HVM related code (SVM, VMX) generally assumed that a local apic is
> always present. With the introduction of a HVM mode were the local apic can
> be removed, some of this broken code paths arised.
>
> The SVM exit/resume paths unconditionally checked the state of the lapic,
> which is wrong if it's been disabled by hardware, fix this by adding the
> necessary checks. On the VMX side, make sure we don't add mappings for a
> local apic if it's disabled.
>
> In the generic vlapic code, add checks to prevent setting the TSC deadline
> timer if the lapic is disabled, and also prevent trying to inject interrupts
> from the PIC is the lapic is also disabled.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Eddie Dong <eddie.dong@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 09/21] xen/x86: set the vPMU interface based on the presence of a lapic
  2015-11-06 16:05 ` [PATCH v8 09/21] xen/x86: set the vPMU interface based on the presence of a lapic Roger Pau Monne
@ 2015-11-09 15:04   ` Jan Beulich
  2015-11-10 11:30     ` [PATCH v8.1 " Roger Pau Monne
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2015-11-09 15:04 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Boris Ostrovsky, xen-devel

>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -5143,6 +5143,7 @@ static hvm_hypercall_t *const hvm_hypercall64_table[NR_hypercalls] = {
>      HYPERCALL(sysctl),
>      HYPERCALL(domctl),
>      HYPERCALL(tmem_op),
> +    HYPERCALL(xenpmu_op),
>      [ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation
>  };
>  
> @@ -5164,6 +5165,7 @@ static hvm_hypercall_t *const hvm_hypercall32_table[NR_hypercalls] = {
>      HYPERCALL(sysctl),
>      HYPERCALL(domctl),
>      HYPERCALL(tmem_op),
> +    HYPERCALL(xenpmu_op),
>      [ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation
>  };

I don't think this should be done now; instead this should imo be an
effect of eventually folding the two tables. With this dropped
Acked-by: Jan Beulich <jbeulich@suse.com>.

Jan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 12/21] xen/x86: allow disabling the emulated PIT
  2015-11-06 16:05 ` [PATCH v8 12/21] xen/x86: allow disabling the emulated PIT Roger Pau Monne
@ 2015-11-09 15:31   ` Jan Beulich
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2015-11-09 15:31 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel

>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Reported by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v8.1 09/21] xen/x86: set the vPMU interface based on the presence of a lapic
  2015-11-09 15:04   ` Jan Beulich
@ 2015-11-10 11:30     ` Roger Pau Monne
  2015-11-10 14:21       ` Boris Ostrovsky
  0 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monne @ 2015-11-10 11:30 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Boris Ostrovsky, Jan Beulich, Roger Pau Monne

Instead of choosing the interface to expose to guests based on the guest
type, do it based on whether the guest has an emulated local apic or not.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v8:
 - Don't add the xenpmu hypercalls to the HVM hypercall table (yet).
 - Add Jan Beulich Acked-by.

Changes since v7:
 - Merge vpmu work from Boris.

Changes since v6:
 - Major rework of the approach.
 - Drop Andrew Cooper Acked-by.

Changes since v4:
 - Add Andrew Cooper Acked-by.
---
 xen/arch/x86/cpu/vpmu.c       | 11 ++++++-----
 xen/arch/x86/cpu/vpmu_amd.c   |  6 +++---
 xen/arch/x86/cpu/vpmu_intel.c |  6 +++---
 xen/arch/x86/hvm/svm/svm.c    |  2 +-
 xen/arch/x86/hvm/vmx/vmx.c    |  2 +-
 5 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index 2f5156a..4346c66 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -94,7 +94,7 @@ void vpmu_lvtpc_update(uint32_t val)
     vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR | (val & APIC_LVT_MASKED);
 
     /* Postpone APIC updates for PV(H) guests if PMU interrupt is pending */
-    if ( is_hvm_vcpu(curr) || !vpmu->xenpmu_data ||
+    if ( has_vlapic(curr->domain) || !vpmu->xenpmu_data ||
          !vpmu_is_set(vpmu, VPMU_CACHED) )
         apic_write(APIC_LVTPC, vpmu->hw_lapic_lvtpc);
 }
@@ -129,7 +129,7 @@ int vpmu_do_msr(unsigned int msr, uint64_t *msr_content,
      * and since do_wr/rdmsr may load VPMU context we should save
      * (and unload) it again.
      */
-    if ( !is_hvm_vcpu(curr) && vpmu->xenpmu_data &&
+    if ( !has_vlapic(curr->domain) && vpmu->xenpmu_data &&
         vpmu_is_set(vpmu, VPMU_CACHED) )
     {
         vpmu_set(vpmu, VPMU_CONTEXT_SAVE);
@@ -184,7 +184,7 @@ void vpmu_do_interrupt(struct cpu_user_regs *regs)
         return;
 
     /* PV(H) guest */
-    if ( !is_hvm_vcpu(sampling) || (vpmu_mode & XENPMU_MODE_ALL) )
+    if ( !has_vlapic(sampling->domain) || (vpmu_mode & XENPMU_MODE_ALL) )
     {
         const struct cpu_user_regs *cur_regs;
         uint64_t *flags = &vpmu->xenpmu_data->pmu.pmu_flags;
@@ -411,7 +411,8 @@ int vpmu_load(struct vcpu *v, bool_t from_guest)
 
     /* Only when PMU is counting, we load PMU context immediately. */
     if ( !vpmu_is_set(vpmu, VPMU_RUNNING) ||
-         (!is_hvm_vcpu(vpmu_vcpu(vpmu)) && vpmu_is_set(vpmu, VPMU_CACHED)) )
+         (!has_vlapic(vpmu_vcpu(vpmu)->domain) &&
+         vpmu_is_set(vpmu, VPMU_CACHED)) )
         return 0;
 
     if ( vpmu->arch_vpmu_ops && vpmu->arch_vpmu_ops->arch_vpmu_load )
@@ -635,7 +636,7 @@ long do_xenpmu_op(unsigned int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg)
     struct xen_pmu_data *xenpmu_data;
     struct vpmu_struct *vpmu;
 
-    if ( !opt_vpmu_enabled )
+    if ( !opt_vpmu_enabled || has_vlapic(current->domain) )
         return -EOPNOTSUPP;
 
     ret = xsm_pmu_op(XSM_OTHER, current->domain, op);
diff --git a/xen/arch/x86/cpu/vpmu_amd.c b/xen/arch/x86/cpu/vpmu_amd.c
index 04da81a..990e6f3 100644
--- a/xen/arch/x86/cpu/vpmu_amd.c
+++ b/xen/arch/x86/cpu/vpmu_amd.c
@@ -238,7 +238,7 @@ static int amd_vpmu_load(struct vcpu *v, bool_t from_guest)
         bool_t is_running = 0;
         struct xen_pmu_amd_ctxt *guest_ctxt = &vpmu->xenpmu_data->pmu.c.amd;
 
-        ASSERT(!is_hvm_vcpu(v));
+        ASSERT(!has_vlapic(v->domain));
 
         ctxt = vpmu->context;
         ctrl_regs = vpmu_reg_pointer(ctxt, ctrls);
@@ -314,7 +314,7 @@ static int amd_vpmu_save(struct vcpu *v,  bool_t to_guest)
     {
         struct xen_pmu_amd_ctxt *guest_ctxt, *ctxt;
 
-        ASSERT(!is_hvm_vcpu(v));
+        ASSERT(!has_vlapic(v->domain));
         ctxt = vpmu->context;
         guest_ctxt = &vpmu->xenpmu_data->pmu.c.amd;
         memcpy(&guest_ctxt->regs[0], &ctxt->regs[0], regs_sz);
@@ -525,7 +525,7 @@ int svm_vpmu_initialise(struct vcpu *v)
     vpmu->context = ctxt;
     vpmu->priv_context = NULL;
 
-    if ( !is_hvm_vcpu(v) )
+    if ( !has_vlapic(v->domain) )
     {
         /* Copy register offsets to shared area */
         ASSERT(vpmu->xenpmu_data);
diff --git a/xen/arch/x86/cpu/vpmu_intel.c b/xen/arch/x86/cpu/vpmu_intel.c
index 12f80ae..a9e0e68 100644
--- a/xen/arch/x86/cpu/vpmu_intel.c
+++ b/xen/arch/x86/cpu/vpmu_intel.c
@@ -337,7 +337,7 @@ static int core2_vpmu_save(struct vcpu *v, bool_t to_guest)
 
     if ( to_guest )
     {
-        ASSERT(!is_hvm_vcpu(v));
+        ASSERT(!has_vlapic(v->domain));
         memcpy((void *)(&vpmu->xenpmu_data->pmu.c.intel) + regs_off,
                vpmu->context + regs_off, regs_sz);
     }
@@ -442,7 +442,7 @@ static int core2_vpmu_load(struct vcpu *v, bool_t from_guest)
     {
         int ret;
 
-        ASSERT(!is_hvm_vcpu(v));
+        ASSERT(!has_vlapic(v->domain));
 
         memcpy(vpmu->context + regs_off,
                (void *)&v->arch.vpmu.xenpmu_data->pmu.c.intel + regs_off,
@@ -502,7 +502,7 @@ static int core2_vpmu_alloc_resource(struct vcpu *v)
     vpmu->context = core2_vpmu_cxt;
     vpmu->priv_context = p;
 
-    if ( !is_hvm_vcpu(v) )
+    if ( !has_vlapic(v->domain) )
     {
         /* Copy fixed/arch register offsets to shared area */
         ASSERT(vpmu->xenpmu_data);
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 404634b..4adf86b 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1166,7 +1166,7 @@ static int svm_vcpu_initialise(struct vcpu *v)
     }
 
     /* PVH's VPMU is initialized via hypercall */
-    if ( is_hvm_vcpu(v) )
+    if ( is_hvm_vcpu(v) && has_vlapic(v->domain) )
         vpmu_initialise(v);
 
     svm_guest_osvw_init(v);
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 073e5be..5df866e 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -148,7 +148,7 @@ static int vmx_vcpu_initialise(struct vcpu *v)
     }
 
     /* PVH's VPMU is initialized via hypercall */
-    if ( is_hvm_vcpu(v) )
+    if ( is_hvm_vcpu(v) && has_vlapic(v->domain) )
         vpmu_initialise(v);
 
     vmx_install_vlapic_mapping(v);
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v8.1 09/21] xen/x86: set the vPMU interface based on the presence of a lapic
  2015-11-10 11:30     ` [PATCH v8.1 " Roger Pau Monne
@ 2015-11-10 14:21       ` Boris Ostrovsky
  0 siblings, 0 replies; 47+ messages in thread
From: Boris Ostrovsky @ 2015-11-10 14:21 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel; +Cc: Andrew Cooper, Jan Beulich

On 11/10/2015 06:30 AM, Roger Pau Monne wrote:
> Instead of choosing the interface to expose to guests based on the guest
> type, do it based on whether the guest has an emulated local apic or not.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> Changes since v8:
>   - Don't add the xenpmu hypercalls to the HVM hypercall table (yet).

This, of course, breaks VPMU. I thought that what Jan suggested 
yesterday (folding the two hypercall tables) was meant to be made part 
of the series.

-boris


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 13/21] xen/x86: make sure the HVM callback vector is correctly set
  2015-11-06 16:05 ` [PATCH v8 13/21] xen/x86: make sure the HVM callback vector is correctly set Roger Pau Monne
@ 2015-11-10 16:41   ` Jan Beulich
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2015-11-10 16:41 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel

>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/irq.c
> +++ b/xen/arch/x86/hvm/irq.c
> @@ -330,6 +330,10 @@ void hvm_set_callback_via(struct domain *d, uint64_t via)
>           (via_type > HVMIRQ_callback_vector) )
>          via_type = HVMIRQ_callback_none;
>  
> +    if ( via_type != HVMIRQ_callback_vector &&
> +         (!has_vlapic(d) || !has_vioapic(d)) )
> +        return;

What about PIC? With vpic_irq_positive_edge() now asserting
has_vpic(), and hvm_set_callback_irq_level() calling the former,
I don't think the above is sufficient.

Jan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 18/21] libxc/xen: introduce a start info structure for HVMlite guests
  2015-11-06 16:05 ` [PATCH v8 18/21] libxc/xen: introduce a start info structure for HVMlite guests Roger Pau Monne
@ 2015-11-10 16:53   ` Jan Beulich
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2015-11-10 16:53 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Andrew Cooper,
	Ian Jackson, xen-devel

>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
> --- a/xen/include/public/xen.h
> +++ b/xen/include/public/xen.h
> @@ -784,6 +784,29 @@ struct start_info {
>  };
>  typedef struct start_info start_info_t;
>  
> +/*
> + * Start of day structure passed to PVH guests in %ebx.
> + *
> + * NOTE: nothing will be loaded at physical address 0, so
> + * a 0 value in any of this fields should be treated as
> + * not present.
> + */
> +struct hvm_start_info {

Mind making the comment read "... in any of the address fields ..."?

Jan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 21/21] libxl: add support for migrating HVM guests without a device model
  2015-11-06 16:05 ` [PATCH v8 21/21] libxl: add support for migrating HVM guests " Roger Pau Monne
@ 2015-11-11 15:49   ` Wei Liu
  2015-11-11 15:55     ` Ian Jackson
  2015-11-11 17:14   ` Andrew Cooper
  1 sibling, 1 reply; 47+ messages in thread
From: Wei Liu @ 2015-11-11 15:49 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: xen-devel, Andrew Cooper, Ian Jackson, Ian Campbell, Wei Liu

CC Andrew

On Fri, Nov 06, 2015 at 05:05:55PM +0100, Roger Pau Monne wrote:
> Only some minor libxl changes are needed in order to be able to migrate HVM
> guests without a device model, no hypervisor changes are needed.
> 
> This change prevents sending the emulator context if the device model
> version is set to none.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
> Changes since v7:
>  - Prevent sending the emulator context and xenstore record in
>    write_emulator_context_record and write_emulator_xenstore_record.
>  - Error out if an emulator record is received for a no-dm guest.

I think this is in line with what you and Andrew discussed in previous
version. If Andrew is happy with this, I'm happy with it, too.

>  
> +    if (libxl__device_model_version_running(gc, dss->domid) ==
> +        LIBXL_DEVICE_MODEL_VERSION_NONE) {
> +        emulator_xenstore_record_done(egc, stream);
> +        return;
> +    }
> +
>      rc = libxl__save_emulator_xenstore_data(dss, &buf, &len);
>      if (rc)
>          goto err;
> @@ -387,6 +396,7 @@ static void emulator_xenstore_record_done(libxl__egc *egc,
>                                            libxl__stream_write_state *stream)
>  {
>      libxl__domain_suspend_state *dss = stream->dss;
> +    STATE_AO_GC(stream->ao);
>  

This is redundant.

Wei.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 21/21] libxl: add support for migrating HVM guests without a device model
  2015-11-11 15:49   ` Wei Liu
@ 2015-11-11 15:55     ` Ian Jackson
  0 siblings, 0 replies; 47+ messages in thread
From: Ian Jackson @ 2015-11-11 15:55 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Andrew Cooper, Ian Campbell, Roger Pau Monne

Wei Liu writes ("Re: [PATCH v8 21/21] libxl: add support for migrating HVM guests without a device model"):
> On Fri, Nov 06, 2015 at 05:05:55PM +0100, Roger Pau Monne wrote:
> > @@ -387,6 +396,7 @@ static void emulator_xenstore_record_done(libxl__egc *egc,
> >                                            libxl__stream_write_state *stream)
> >  {
> >      libxl__domain_suspend_state *dss = stream->dss;
> > +    STATE_AO_GC(stream->ao);
> >  
> 
> This is redundant.

`Unneeded' calls to STATE_AO_GC (ie ones in functions which do not
actually use anything from ao or gc) are at worst harmless.

In fact, such uses of STATE_AO_GC provide additional safety, because
they call libxl__ao_inprogress_gc which checks that the ao is in the
right state.

So I think there is nothing wrong with adding this call here (even if
it is a left-over addition from a previous version or draft of this
patch, or from wanting to add some calls to LOG* for debugging).

There would also be nothing wrong with not adding it.

Ian.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 21/21] libxl: add support for migrating HVM guests without a device model
  2015-11-06 16:05 ` [PATCH v8 21/21] libxl: add support for migrating HVM guests " Roger Pau Monne
  2015-11-11 15:49   ` Wei Liu
@ 2015-11-11 17:14   ` Andrew Cooper
  2015-11-26 18:10     ` Roger Pau Monné
  1 sibling, 1 reply; 47+ messages in thread
From: Andrew Cooper @ 2015-11-11 17:14 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

On 06/11/15 16:05, Roger Pau Monne wrote:
> diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
> index 52a60d7..0a6eaf9 100644
> --- a/tools/libxl/libxl_stream_write.c
> +++ b/tools/libxl/libxl_stream_write.c
> @@ -232,6 +232,9 @@ void libxl__stream_write_start(libxl__egc *egc,
>              stream->emu_sub_hdr.id = EMULATOR_QEMU_UPSTREAM;
>              break;
>  
> +        case LIBXL_DEVICE_MODEL_VERSION_NONE:
> +            break;
> +

This (in principle) leaves stream->emu_sub_hdr.id uninitialised
(although its value will be zero because of libxl__stream_write_init()).

I would be tempted to (ab)use EMULATOR_UNKNOWN here and have
setup_emulator_write() assert id != UNKNOWN, to catch calls which slip
through the cracks.

>          default:
>              rc = ERROR_FAIL;
>              LOG(ERROR, "Unknown emulator for HVM domain");
> @@ -359,6 +362,12 @@ static void write_emulator_xenstore_record(libxl__egc *egc,
>      char *buf = NULL;
>      uint32_t len = 0;
>  
> +    if (libxl__device_model_version_running(gc, dss->domid) ==
> +        LIBXL_DEVICE_MODEL_VERSION_NONE) {
> +        emulator_xenstore_record_done(egc, stream);
> +        return;
> +    }
> +

libxl__device_model_version_running() performs a xenstore action, which
is a steep overhead for constant information.

It would be better to introduce a new field in libxl__stream_write_state
and cache libxl__device_model_version_running() once in
libxl__stream_write_start().  You can then also use that for an
assertion in setup_emulator_write()

~Andrew

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 17/21] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs
  2015-11-06 16:05 ` [PATCH v8 17/21] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs Roger Pau Monne
@ 2015-11-12 16:57   ` Jan Beulich
  2015-11-26 16:57     ` Roger Pau Monné
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2015-11-12 16:57 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Andrew Cooper, Stefano Stabellini, Ian Campbell, xen-devel

>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
> @@ -1183,6 +1184,301 @@ int arch_set_info_guest(
>  #undef c
>  }
>  
> +static inline int check_segment(struct segment_register reg,

Passed by value?

> +                                enum x86_segment seg)
> +{
> +
> +    if ( reg.attr.fields.pad != 0 )
> +    {
> +        gprintk(XENLOG_ERR,
> +                "Attribute bits 12-15 of the segment are not zero\n");
> +        return -EINVAL;
> +    }
> +
> +    if ( reg.sel == 0 && reg.base == 0 && reg.limit == 0 &&

What's the sel check good for when your only caller only ever calls
you with it being zero? Looking at base or limit here doesn't seem
right either.

> +         reg.attr.bytes == 0)
> +    {
> +        if ( seg == x86_seg_cs || seg == x86_seg_ss )
> +        {
> +            gprintk(XENLOG_ERR, "Null selector provided for CS or SS\n");
> +            return -EINVAL;
> +        }
> +        return 0;
> +    }
> +
> +    if ( reg.attr.fields.p != 1 )

This is effectively a boolean field - no need for "!= 1", should be
just !.

> +    {
> +        gprintk(XENLOG_ERR, "Non-present segment provided\n");
> +        return -EINVAL;
> +    }
> +
> +    if ( seg == x86_seg_cs && !(reg.attr.fields.type & 0x8) )
> +    {
> +        gprintk(XENLOG_ERR, "CS is missing code type\n");
> +        return -EINVAL;
> +    }
> +
> +    if ( seg != x86_seg_cs && !!(reg.attr.fields.type & 0x8) )

No need for !! here. Also only SS is disallowed to get code segments
loaded. For other selector registers they're okay when readable.

> +    {
> +        gprintk(XENLOG_ERR, "Non code segment with code type set\n");
> +        return -EINVAL;
> +    }

SS must be writable.

> +int arch_set_info_hvm_guest(struct vcpu *v, const vcpu_hvm_context_t *ctx)
> +{
> +    struct cpu_user_regs *uregs = &v->arch.user_regs;
> +    struct segment_register cs, ds, ss, es, tr;
> +    int rc;
> +
> +    if ( ctx->pad != 0 )
> +    {
> +        gprintk(XENLOG_ERR, "Padding field != 0\n");

Please don't.

> +        return -EINVAL;
> +    }
> +
> +    switch ( ctx->mode )
> +    {
> +    default:
> +        return -EINVAL;
> +
> +    case VCPU_HVM_MODE_32B:
> +    {
> +        const struct vcpu_hvm_x86_32 *regs = &ctx->cpu_regs.x86_32;
> +        uint32_t limit;
> +
> +        if ( ctx->cpu_regs.x86_32.pad1 != 0 ||
> +             ctx->cpu_regs.x86_32.pad2[0] != 0 ||
> +             ctx->cpu_regs.x86_32.pad2[1] != 0  ||
> +             ctx->cpu_regs.x86_32.pad2[2] != 0 )
> +        {
> +            gprintk(XENLOG_ERR, "Padding field != 0\n");
> +            return -EINVAL;
> +        }
> +
> +#define SEG(s, r)                                                       \
> +    (struct segment_register){ .sel = 0, .base = (r)->s ## _base,       \
> +            .limit = (r)->s ## _limit, .attr.bytes = (r)->s ## _ar }
> +        cs = SEG(cs, regs);
> +        ds = SEG(ds, regs);
> +        ss = SEG(ss, regs);
> +        es = SEG(es, regs);
> +        tr = SEG(tr, regs);
> +#undef SEG
> +
> +        rc = check_segment(cs, x86_seg_cs);
> +        rc |= check_segment(ds, x86_seg_ds);
> +        rc |= check_segment(ss, x86_seg_ss);
> +        rc |= check_segment(es, x86_seg_es);
> +        rc |= check_segment(tr, x86_seg_tr);
> +        if ( rc != 0 )
> +            return rc;
> +
> +        /* Basic sanity checks. */
> +        limit = cs.limit;
> +        if ( cs.attr.fields.g )
> +            limit = (limit << 12) | 0xfff;
> +        if ( regs->eip > limit )
> +        {
> +            gprintk(XENLOG_ERR, "EIP (%#08x) outside CS limit (%#08x)\n",
> +                                regs->eip, limit);
> +            return -EINVAL;
> +        }
> +
> +        if ( ds.attr.fields.p && ds.attr.fields.dpl > cs.attr.fields.dpl )
> +        {
> +            gprintk(XENLOG_ERR, "DS.DPL (%u) is greater than CS.DPL (%u)\n",
> +                                ds.attr.fields.dpl, cs.attr.fields.dpl);

Indentation.

> +            return -EINVAL;
> +        }
> +
> +        if ( ss.attr.fields.p && ss.attr.fields.dpl != cs.attr.fields.dpl )
> +        {
> +            gprintk(XENLOG_ERR, "SS.DPL (%u) is different than CS.DPL (%u)\n",
> +                                ss.attr.fields.dpl, cs.attr.fields.dpl);
> +            return -EINVAL;
> +        }

I think this should go ahead of the DS one. Also I don't think
either CS or SS would be okay with the present bit clear, even
less so in 32-bit mode.

> +int arch_initialize_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
> +{
> +    struct domain *d = v->domain;
> +    int rc;
> +
> +    if ( is_hvm_vcpu(v) )
> +    {
> +        struct vcpu_hvm_context ctxt;
> +
> +        if ( copy_from_guest(&ctxt, arg, 1) )
> +            return -EFAULT;
> +
> +        domain_lock(d);
> +        rc = v->is_initialised ? -EEXIST : arch_set_info_hvm_guest(v, &ctxt);
> +        domain_unlock(d);
> +    }
> +    else
> +    {
> +        struct vcpu_guest_context *ctxt;
> +
> +        if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
> +            return -ENOMEM;
> +
> +        if ( copy_from_guest(ctxt, arg, 1) )
> +        {
> +            free_vcpu_guest_context(ctxt);
> +            return -EFAULT;
> +        }
> +
> +        domain_lock(d);
> +        rc = v->is_initialised ? -EEXIST : arch_set_info_guest(v, ctxt);
> +        domain_unlock(d);
> +
> +        free_vcpu_guest_context(ctxt);
> +    }

This else branch looks suspiciously like the ARM variant, and iirc I
had asked already on an earlier version to have this handled in
common code (with ARM simply using the common function for its
arch_initialize_vcpu()).

Jan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices
  2015-11-06 16:05 ` [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices Roger Pau Monne
@ 2015-11-16 12:18   ` Jan Beulich
  2015-11-16 18:33     ` Andrew Cooper
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2015-11-16 12:18 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Andrew Cooper, Ian Jackson, xen-devel

>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
> --- a/xen/include/public/arch-x86/xen.h
> +++ b/xen/include/public/arch-x86/xen.h
> @@ -265,7 +265,31 @@ typedef struct arch_shared_info arch_shared_info_t;
>   * XEN_DOMCTL_INTERFACE_VERSION.
>   */
>  struct xen_arch_domainconfig {
> -    char dummy;
> +#define _XEN_X86_EMU_LAPIC          0
> +#define XEN_X86_EMU_LAPIC           (1U<<_XEN_X86_EMU_LAPIC)
> +#define _XEN_X86_EMU_HPET           1
> +#define XEN_X86_EMU_HPET            (1U<<_XEN_X86_EMU_HPET)
> +#define _XEN_X86_EMU_PM             2
> +#define XEN_X86_EMU_PM              (1U<<_XEN_X86_EMU_PM)
> +#define _XEN_X86_EMU_RTC            3
> +#define XEN_X86_EMU_RTC             (1U<<_XEN_X86_EMU_RTC)
> +#define _XEN_X86_EMU_IOAPIC         4
> +#define XEN_X86_EMU_IOAPIC          (1U<<_XEN_X86_EMU_IOAPIC)
> +#define _XEN_X86_EMU_PIC            5
> +#define XEN_X86_EMU_PIC             (1U<<_XEN_X86_EMU_PIC)
> +#define _XEN_X86_EMU_VGA            6
> +#define XEN_X86_EMU_VGA             (1U<<_XEN_X86_EMU_VGA)
> +#define _XEN_X86_EMU_IOMMU          7
> +#define XEN_X86_EMU_IOMMU           (1U<<_XEN_X86_EMU_IOMMU)
> +#define _XEN_X86_EMU_PIT            8
> +#define XEN_X86_EMU_PIT             (1U<<_XEN_X86_EMU_PIT)

While only used for a domctl (so far), I still think we should aim at
making this a complete set (i.e. preempt future additions to the
set if at all possible). I say this because - having looked again - I'm
missing things like MTRR, PAT, and 8254 here.

Jan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices
  2015-11-16 12:18   ` Jan Beulich
@ 2015-11-16 18:33     ` Andrew Cooper
  2015-11-17  9:42       ` Jan Beulich
  2015-11-26 10:25       ` Roger Pau Monné
  0 siblings, 2 replies; 47+ messages in thread
From: Andrew Cooper @ 2015-11-16 18:33 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monne
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel

On 16/11/15 12:18, Jan Beulich wrote:
>>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
>> --- a/xen/include/public/arch-x86/xen.h
>> +++ b/xen/include/public/arch-x86/xen.h
>> @@ -265,7 +265,31 @@ typedef struct arch_shared_info arch_shared_info_t;
>>   * XEN_DOMCTL_INTERFACE_VERSION.
>>   */
>>  struct xen_arch_domainconfig {
>> -    char dummy;
>> +#define _XEN_X86_EMU_LAPIC          0
>> +#define XEN_X86_EMU_LAPIC           (1U<<_XEN_X86_EMU_LAPIC)
>> +#define _XEN_X86_EMU_HPET           1
>> +#define XEN_X86_EMU_HPET            (1U<<_XEN_X86_EMU_HPET)
>> +#define _XEN_X86_EMU_PM             2
>> +#define XEN_X86_EMU_PM              (1U<<_XEN_X86_EMU_PM)
>> +#define _XEN_X86_EMU_RTC            3
>> +#define XEN_X86_EMU_RTC             (1U<<_XEN_X86_EMU_RTC)
>> +#define _XEN_X86_EMU_IOAPIC         4
>> +#define XEN_X86_EMU_IOAPIC          (1U<<_XEN_X86_EMU_IOAPIC)
>> +#define _XEN_X86_EMU_PIC            5
>> +#define XEN_X86_EMU_PIC             (1U<<_XEN_X86_EMU_PIC)
>> +#define _XEN_X86_EMU_VGA            6
>> +#define XEN_X86_EMU_VGA             (1U<<_XEN_X86_EMU_VGA)
>> +#define _XEN_X86_EMU_IOMMU          7
>> +#define XEN_X86_EMU_IOMMU           (1U<<_XEN_X86_EMU_IOMMU)
>> +#define _XEN_X86_EMU_PIT            8
>> +#define XEN_X86_EMU_PIT             (1U<<_XEN_X86_EMU_PIT)
> While only used for a domctl (so far), I still think we should aim at
> making this a complete set (i.e. preempt future additions to the
> set if at all possible). I say this because - having looked again - I'm
> missing things like MTRR, PAT, and 8254 here.

Use (or not) of MTRR and PAT should be controlled exclusively via the
guests cpuid policy.  Unlike the above bits, they are architectural
components of the CPU itself, rather than external devices on the
motherboard.

~Andrew

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices
  2015-11-16 18:33     ` Andrew Cooper
@ 2015-11-17  9:42       ` Jan Beulich
  2015-11-17  9:58         ` Juergen Gross
  2015-11-26 10:25       ` Roger Pau Monné
  1 sibling, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2015-11-17  9:42 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel, Roger Pau Monne

>>> On 16.11.15 at 19:33, <andrew.cooper3@citrix.com> wrote:
> On 16/11/15 12:18, Jan Beulich wrote:
>>>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
>>> --- a/xen/include/public/arch-x86/xen.h
>>> +++ b/xen/include/public/arch-x86/xen.h
>>> @@ -265,7 +265,31 @@ typedef struct arch_shared_info arch_shared_info_t;
>>>   * XEN_DOMCTL_INTERFACE_VERSION.
>>>   */
>>>  struct xen_arch_domainconfig {
>>> -    char dummy;
>>> +#define _XEN_X86_EMU_LAPIC          0
>>> +#define XEN_X86_EMU_LAPIC           (1U<<_XEN_X86_EMU_LAPIC)
>>> +#define _XEN_X86_EMU_HPET           1
>>> +#define XEN_X86_EMU_HPET            (1U<<_XEN_X86_EMU_HPET)
>>> +#define _XEN_X86_EMU_PM             2
>>> +#define XEN_X86_EMU_PM              (1U<<_XEN_X86_EMU_PM)
>>> +#define _XEN_X86_EMU_RTC            3
>>> +#define XEN_X86_EMU_RTC             (1U<<_XEN_X86_EMU_RTC)
>>> +#define _XEN_X86_EMU_IOAPIC         4
>>> +#define XEN_X86_EMU_IOAPIC          (1U<<_XEN_X86_EMU_IOAPIC)
>>> +#define _XEN_X86_EMU_PIC            5
>>> +#define XEN_X86_EMU_PIC             (1U<<_XEN_X86_EMU_PIC)
>>> +#define _XEN_X86_EMU_VGA            6
>>> +#define XEN_X86_EMU_VGA             (1U<<_XEN_X86_EMU_VGA)
>>> +#define _XEN_X86_EMU_IOMMU          7
>>> +#define XEN_X86_EMU_IOMMU           (1U<<_XEN_X86_EMU_IOMMU)
>>> +#define _XEN_X86_EMU_PIT            8
>>> +#define XEN_X86_EMU_PIT             (1U<<_XEN_X86_EMU_PIT)
>> While only used for a domctl (so far), I still think we should aim at
>> making this a complete set (i.e. preempt future additions to the
>> set if at all possible). I say this because - having looked again - I'm
>> missing things like MTRR, PAT, and 8254 here.
> 
> Use (or not) of MTRR and PAT should be controlled exclusively via the
> guests cpuid policy.  Unlike the above bits, they are architectural
> components of the CPU itself, rather than external devices on the
> motherboard.

For MTRR - yes, agreed. But what CPUID bit do I not recall that allows
identifying PAT availability?

Jan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices
  2015-11-17  9:42       ` Jan Beulich
@ 2015-11-17  9:58         ` Juergen Gross
  2015-11-17 10:30           ` Andrew Cooper
  0 siblings, 1 reply; 47+ messages in thread
From: Juergen Gross @ 2015-11-17  9:58 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel, Roger Pau Monne

On 17/11/15 10:42, Jan Beulich wrote:
>>>> On 16.11.15 at 19:33, <andrew.cooper3@citrix.com> wrote:
>> On 16/11/15 12:18, Jan Beulich wrote:
>>>>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
>>>> --- a/xen/include/public/arch-x86/xen.h
>>>> +++ b/xen/include/public/arch-x86/xen.h
>>>> @@ -265,7 +265,31 @@ typedef struct arch_shared_info arch_shared_info_t;
>>>>   * XEN_DOMCTL_INTERFACE_VERSION.
>>>>   */
>>>>  struct xen_arch_domainconfig {
>>>> -    char dummy;
>>>> +#define _XEN_X86_EMU_LAPIC          0
>>>> +#define XEN_X86_EMU_LAPIC           (1U<<_XEN_X86_EMU_LAPIC)
>>>> +#define _XEN_X86_EMU_HPET           1
>>>> +#define XEN_X86_EMU_HPET            (1U<<_XEN_X86_EMU_HPET)
>>>> +#define _XEN_X86_EMU_PM             2
>>>> +#define XEN_X86_EMU_PM              (1U<<_XEN_X86_EMU_PM)
>>>> +#define _XEN_X86_EMU_RTC            3
>>>> +#define XEN_X86_EMU_RTC             (1U<<_XEN_X86_EMU_RTC)
>>>> +#define _XEN_X86_EMU_IOAPIC         4
>>>> +#define XEN_X86_EMU_IOAPIC          (1U<<_XEN_X86_EMU_IOAPIC)
>>>> +#define _XEN_X86_EMU_PIC            5
>>>> +#define XEN_X86_EMU_PIC             (1U<<_XEN_X86_EMU_PIC)
>>>> +#define _XEN_X86_EMU_VGA            6
>>>> +#define XEN_X86_EMU_VGA             (1U<<_XEN_X86_EMU_VGA)
>>>> +#define _XEN_X86_EMU_IOMMU          7
>>>> +#define XEN_X86_EMU_IOMMU           (1U<<_XEN_X86_EMU_IOMMU)
>>>> +#define _XEN_X86_EMU_PIT            8
>>>> +#define XEN_X86_EMU_PIT             (1U<<_XEN_X86_EMU_PIT)
>>> While only used for a domctl (so far), I still think we should aim at
>>> making this a complete set (i.e. preempt future additions to the
>>> set if at all possible). I say this because - having looked again - I'm
>>> missing things like MTRR, PAT, and 8254 here.
>>
>> Use (or not) of MTRR and PAT should be controlled exclusively via the
>> guests cpuid policy.  Unlike the above bits, they are architectural
>> components of the CPU itself, rather than external devices on the
>> motherboard.
> 
> For MTRR - yes, agreed. But what CPUID bit do I not recall that allows
> identifying PAT availability?

INTEL SDM Vol. 3 11.12.1:

An operating system or executive can detect the availability of the PAT
by executing the CPUID instruction with a value of 1 in the EAX
register. Support for the PAT is indicated by the PAT flag (bit 16 of
the values returned to EDX register).


Juergen

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices
  2015-11-17  9:58         ` Juergen Gross
@ 2015-11-17 10:30           ` Andrew Cooper
  2015-11-17 10:35             ` Juergen Gross
  0 siblings, 1 reply; 47+ messages in thread
From: Andrew Cooper @ 2015-11-17 10:30 UTC (permalink / raw)
  To: Juergen Gross, Jan Beulich
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel, Roger Pau Monne

On 17/11/15 09:58, Juergen Gross wrote:
> On 17/11/15 10:42, Jan Beulich wrote:
>>>>> On 16.11.15 at 19:33, <andrew.cooper3@citrix.com> wrote:
>>> On 16/11/15 12:18, Jan Beulich wrote:
>>>>>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
>>>>> --- a/xen/include/public/arch-x86/xen.h
>>>>> +++ b/xen/include/public/arch-x86/xen.h
>>>>> @@ -265,7 +265,31 @@ typedef struct arch_shared_info arch_shared_info_t;
>>>>>   * XEN_DOMCTL_INTERFACE_VERSION.
>>>>>   */
>>>>>  struct xen_arch_domainconfig {
>>>>> -    char dummy;
>>>>> +#define _XEN_X86_EMU_LAPIC          0
>>>>> +#define XEN_X86_EMU_LAPIC           (1U<<_XEN_X86_EMU_LAPIC)
>>>>> +#define _XEN_X86_EMU_HPET           1
>>>>> +#define XEN_X86_EMU_HPET            (1U<<_XEN_X86_EMU_HPET)
>>>>> +#define _XEN_X86_EMU_PM             2
>>>>> +#define XEN_X86_EMU_PM              (1U<<_XEN_X86_EMU_PM)
>>>>> +#define _XEN_X86_EMU_RTC            3
>>>>> +#define XEN_X86_EMU_RTC             (1U<<_XEN_X86_EMU_RTC)
>>>>> +#define _XEN_X86_EMU_IOAPIC         4
>>>>> +#define XEN_X86_EMU_IOAPIC          (1U<<_XEN_X86_EMU_IOAPIC)
>>>>> +#define _XEN_X86_EMU_PIC            5
>>>>> +#define XEN_X86_EMU_PIC             (1U<<_XEN_X86_EMU_PIC)
>>>>> +#define _XEN_X86_EMU_VGA            6
>>>>> +#define XEN_X86_EMU_VGA             (1U<<_XEN_X86_EMU_VGA)
>>>>> +#define _XEN_X86_EMU_IOMMU          7
>>>>> +#define XEN_X86_EMU_IOMMU           (1U<<_XEN_X86_EMU_IOMMU)
>>>>> +#define _XEN_X86_EMU_PIT            8
>>>>> +#define XEN_X86_EMU_PIT             (1U<<_XEN_X86_EMU_PIT)
>>>> While only used for a domctl (so far), I still think we should aim at
>>>> making this a complete set (i.e. preempt future additions to the
>>>> set if at all possible). I say this because - having looked again - I'm
>>>> missing things like MTRR, PAT, and 8254 here.
>>> Use (or not) of MTRR and PAT should be controlled exclusively via the
>>> guests cpuid policy.  Unlike the above bits, they are architectural
>>> components of the CPU itself, rather than external devices on the
>>> motherboard.
>> For MTRR - yes, agreed. But what CPUID bit do I not recall that allows
>> identifying PAT availability?
> INTEL SDM Vol. 3 11.12.1:
>
> An operating system or executive can detect the availability of the PAT
> by executing the CPUID instruction with a value of 1 in the EAX
> register. Support for the PAT is indicated by the PAT flag (bit 16 of
> the values returned to EDX register).

Indeed.  X86_FEATURE_PAT.

~Andrew

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices
  2015-11-17 10:30           ` Andrew Cooper
@ 2015-11-17 10:35             ` Juergen Gross
  0 siblings, 0 replies; 47+ messages in thread
From: Juergen Gross @ 2015-11-17 10:35 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel, Roger Pau Monne

On 17/11/15 11:30, Andrew Cooper wrote:
> On 17/11/15 09:58, Juergen Gross wrote:
>> On 17/11/15 10:42, Jan Beulich wrote:
>>>>>> On 16.11.15 at 19:33, <andrew.cooper3@citrix.com> wrote:
>>>> On 16/11/15 12:18, Jan Beulich wrote:
>>>>>>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
>>>>>> --- a/xen/include/public/arch-x86/xen.h
>>>>>> +++ b/xen/include/public/arch-x86/xen.h
>>>>>> @@ -265,7 +265,31 @@ typedef struct arch_shared_info arch_shared_info_t;
>>>>>>   * XEN_DOMCTL_INTERFACE_VERSION.
>>>>>>   */
>>>>>>  struct xen_arch_domainconfig {
>>>>>> -    char dummy;
>>>>>> +#define _XEN_X86_EMU_LAPIC          0
>>>>>> +#define XEN_X86_EMU_LAPIC           (1U<<_XEN_X86_EMU_LAPIC)
>>>>>> +#define _XEN_X86_EMU_HPET           1
>>>>>> +#define XEN_X86_EMU_HPET            (1U<<_XEN_X86_EMU_HPET)
>>>>>> +#define _XEN_X86_EMU_PM             2
>>>>>> +#define XEN_X86_EMU_PM              (1U<<_XEN_X86_EMU_PM)
>>>>>> +#define _XEN_X86_EMU_RTC            3
>>>>>> +#define XEN_X86_EMU_RTC             (1U<<_XEN_X86_EMU_RTC)
>>>>>> +#define _XEN_X86_EMU_IOAPIC         4
>>>>>> +#define XEN_X86_EMU_IOAPIC          (1U<<_XEN_X86_EMU_IOAPIC)
>>>>>> +#define _XEN_X86_EMU_PIC            5
>>>>>> +#define XEN_X86_EMU_PIC             (1U<<_XEN_X86_EMU_PIC)
>>>>>> +#define _XEN_X86_EMU_VGA            6
>>>>>> +#define XEN_X86_EMU_VGA             (1U<<_XEN_X86_EMU_VGA)
>>>>>> +#define _XEN_X86_EMU_IOMMU          7
>>>>>> +#define XEN_X86_EMU_IOMMU           (1U<<_XEN_X86_EMU_IOMMU)
>>>>>> +#define _XEN_X86_EMU_PIT            8
>>>>>> +#define XEN_X86_EMU_PIT             (1U<<_XEN_X86_EMU_PIT)
>>>>> While only used for a domctl (so far), I still think we should aim at
>>>>> making this a complete set (i.e. preempt future additions to the
>>>>> set if at all possible). I say this because - having looked again - I'm
>>>>> missing things like MTRR, PAT, and 8254 here.
>>>> Use (or not) of MTRR and PAT should be controlled exclusively via the
>>>> guests cpuid policy.  Unlike the above bits, they are architectural
>>>> components of the CPU itself, rather than external devices on the
>>>> motherboard.
>>> For MTRR - yes, agreed. But what CPUID bit do I not recall that allows
>>> identifying PAT availability?
>> INTEL SDM Vol. 3 11.12.1:
>>
>> An operating system or executive can detect the availability of the PAT
>> by executing the CPUID instruction with a value of 1 in the EAX
>> register. Support for the PAT is indicated by the PAT flag (bit 16 of
>> the values returned to EDX register).
> 
> Indeed.  X86_FEATURE_PAT.

But: SDM Vol.3 4.9.1:
The PAT is supported on all processors that support IA-32e paging.

So I wouldn't be suprised if some OS isn't looking at X86_FEATURE_PAT.


Juergen

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 02/21] xen/vlapic: fixes for HVM code when running without a vlapic
  2015-11-06 16:05 ` [PATCH v8 02/21] xen/vlapic: fixes for HVM code when running without a vlapic Roger Pau Monne
  2015-11-06 16:10   ` Andrew Cooper
@ 2015-11-26  5:41   ` Tian, Kevin
  1 sibling, 0 replies; 47+ messages in thread
From: Tian, Kevin @ 2015-11-26  5:41 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Jan Beulich, Andrew Cooper, Dong, Eddie, Aravind Gopalakrishnan,
	Nakajima, Jun, Boris Ostrovsky, Suravee Suthikulpanit

> From: Roger Pau Monne [mailto:roger.pau@citrix.com]
> Sent: Saturday, November 07, 2015 12:06 AM
> 
> The HVM related code (SVM, VMX) generally assumed that a local apic is
> always present. With the introduction of a HVM mode were the local apic can
> be removed, some of this broken code paths arised.
> 
> The SVM exit/resume paths unconditionally checked the state of the lapic,
> which is wrong if it's been disabled by hardware, fix this by adding the
> necessary checks. On the VMX side, make sure we don't add mappings for a
> local apic if it's disabled.
> 
> In the generic vlapic code, add checks to prevent setting the TSC deadline
> timer if the lapic is disabled, and also prevent trying to inject interrupts
> from the PIC is the lapic is also disabled.
> 

Sorry for late reply. It looks good to me:

Acked-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 03/21] xen/x86: allow disabling the emulated local apic
  2015-11-06 16:05 ` [PATCH v8 03/21] xen/x86: allow disabling the emulated local apic Roger Pau Monne
@ 2015-11-26  5:47   ` Tian, Kevin
  0 siblings, 0 replies; 47+ messages in thread
From: Tian, Kevin @ 2015-11-26  5:47 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Andrew Cooper, Dong, Eddie, Nakajima, Jun, Jan Beulich

> From: Roger Pau Monne [mailto:roger.pau@citrix.com]
> Sent: Saturday, November 07, 2015 12:06 AM
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Eddie Dong <eddie.dong@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>

Acked-by: Kevin Tian <kevin.tian@intel.com>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices
  2015-11-16 18:33     ` Andrew Cooper
  2015-11-17  9:42       ` Jan Beulich
@ 2015-11-26 10:25       ` Roger Pau Monné
  1 sibling, 0 replies; 47+ messages in thread
From: Roger Pau Monné @ 2015-11-26 10:25 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
	Ian Jackson, xen-devel

El 16/11/15 a les 19.33, Andrew Cooper ha escrit:
> On 16/11/15 12:18, Jan Beulich wrote:
>>>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
>>> --- a/xen/include/public/arch-x86/xen.h
>>> +++ b/xen/include/public/arch-x86/xen.h
>>> @@ -265,7 +265,31 @@ typedef struct arch_shared_info arch_shared_info_t;
>>>   * XEN_DOMCTL_INTERFACE_VERSION.
>>>   */
>>>  struct xen_arch_domainconfig {
>>> -    char dummy;
>>> +#define _XEN_X86_EMU_LAPIC          0
>>> +#define XEN_X86_EMU_LAPIC           (1U<<_XEN_X86_EMU_LAPIC)
>>> +#define _XEN_X86_EMU_HPET           1
>>> +#define XEN_X86_EMU_HPET            (1U<<_XEN_X86_EMU_HPET)
>>> +#define _XEN_X86_EMU_PM             2
>>> +#define XEN_X86_EMU_PM              (1U<<_XEN_X86_EMU_PM)
>>> +#define _XEN_X86_EMU_RTC            3
>>> +#define XEN_X86_EMU_RTC             (1U<<_XEN_X86_EMU_RTC)
>>> +#define _XEN_X86_EMU_IOAPIC         4
>>> +#define XEN_X86_EMU_IOAPIC          (1U<<_XEN_X86_EMU_IOAPIC)
>>> +#define _XEN_X86_EMU_PIC            5
>>> +#define XEN_X86_EMU_PIC             (1U<<_XEN_X86_EMU_PIC)
>>> +#define _XEN_X86_EMU_VGA            6
>>> +#define XEN_X86_EMU_VGA             (1U<<_XEN_X86_EMU_VGA)
>>> +#define _XEN_X86_EMU_IOMMU          7
>>> +#define XEN_X86_EMU_IOMMU           (1U<<_XEN_X86_EMU_IOMMU)
>>> +#define _XEN_X86_EMU_PIT            8
>>> +#define XEN_X86_EMU_PIT             (1U<<_XEN_X86_EMU_PIT)
>> While only used for a domctl (so far), I still think we should aim at
>> making this a complete set (i.e. preempt future additions to the
>> set if at all possible). I say this because - having looked again - I'm
>> missing things like MTRR, PAT, and 8254 here.
> 
> Use (or not) of MTRR and PAT should be controlled exclusively via the
> guests cpuid policy.  Unlike the above bits, they are architectural
> components of the CPU itself, rather than external devices on the
> motherboard.

The emulated i8254 is controlled by the XEN_X86_EMU_PIT flag.

Roger.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 17/21] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs
  2015-11-12 16:57   ` Jan Beulich
@ 2015-11-26 16:57     ` Roger Pau Monné
  2015-11-27  8:00       ` Jan Beulich
  0 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monné @ 2015-11-26 16:57 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Stefano Stabellini, Ian Campbell, xen-devel

El 12/11/15 a les 17.57, Jan Beulich ha escrit:
>>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
>> @@ -1183,6 +1184,301 @@ int arch_set_info_guest(
>>  #undef c
>>  }
>>  
>> +static inline int check_segment(struct segment_register reg,
> 
> Passed by value?

Hm, right, will pass by reference in next version.

> 
>> +                                enum x86_segment seg)
>> +{
>> +
>> +    if ( reg.attr.fields.pad != 0 )
>> +    {
>> +        gprintk(XENLOG_ERR,
>> +                "Attribute bits 12-15 of the segment are not zero\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    if ( reg.sel == 0 && reg.base == 0 && reg.limit == 0 &&
> 
> What's the sel check good for when your only caller only ever calls
> you with it being zero?

I don't mind removing the sel == 0 check but I don't think it hurts either.

> Looking at base or limit here doesn't seem
> right either.

I'm sorry but I'm not following you here, why is this not right? Would
you rather conclude that the user is trying to load a null segment by
just looking at the attributes field (and checking it's 0)?

>> +         reg.attr.bytes == 0)
>> +    {
>> +        if ( seg == x86_seg_cs || seg == x86_seg_ss )
>> +        {
>> +            gprintk(XENLOG_ERR, "Null selector provided for CS or SS\n");
>> +            return -EINVAL;
>> +        }
>> +        return 0;
>> +    }
>> +
>> +    if ( reg.attr.fields.p != 1 )
> 
> This is effectively a boolean field - no need for "!= 1", should be
> just !.

Ack.

>> +    {
>> +        gprintk(XENLOG_ERR, "Non-present segment provided\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    if ( seg == x86_seg_cs && !(reg.attr.fields.type & 0x8) )
>> +    {
>> +        gprintk(XENLOG_ERR, "CS is missing code type\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    if ( seg != x86_seg_cs && !!(reg.attr.fields.type & 0x8) )
> 
> No need for !! here. Also only SS is disallowed to get code segments
> loaded. For other selector registers they're okay when readable.
> 
>> +    {
>> +        gprintk(XENLOG_ERR, "Non code segment with code type set\n");
>> +        return -EINVAL;
>> +    }
> 
> SS must be writable.

OK, I've fixed both issues and also added a check to make sure the S
attribute bit is set for all segments except TR.

>> +int arch_set_info_hvm_guest(struct vcpu *v, const vcpu_hvm_context_t *ctx)
>> +{
>> +    struct cpu_user_regs *uregs = &v->arch.user_regs;
>> +    struct segment_register cs, ds, ss, es, tr;
>> +    int rc;
>> +
>> +    if ( ctx->pad != 0 )
>> +    {
>> +        gprintk(XENLOG_ERR, "Padding field != 0\n");
> 
> Please don't.

OK, I'm also going to remove the other gprintk below regarding non 0
padding.

> 
>> +        return -EINVAL;
>> +    }
>> +
>> +    switch ( ctx->mode )
>> +    {
>> +    default:
>> +        return -EINVAL;
>> +
>> +    case VCPU_HVM_MODE_32B:
>> +    {
>> +        const struct vcpu_hvm_x86_32 *regs = &ctx->cpu_regs.x86_32;
>> +        uint32_t limit;
>> +
>> +        if ( ctx->cpu_regs.x86_32.pad1 != 0 ||
>> +             ctx->cpu_regs.x86_32.pad2[0] != 0 ||
>> +             ctx->cpu_regs.x86_32.pad2[1] != 0  ||
>> +             ctx->cpu_regs.x86_32.pad2[2] != 0 )
>> +        {
>> +            gprintk(XENLOG_ERR, "Padding field != 0\n");
>> +            return -EINVAL;
>> +        }
>> +
>> +#define SEG(s, r)                                                       \
>> +    (struct segment_register){ .sel = 0, .base = (r)->s ## _base,       \
>> +            .limit = (r)->s ## _limit, .attr.bytes = (r)->s ## _ar }
>> +        cs = SEG(cs, regs);
>> +        ds = SEG(ds, regs);
>> +        ss = SEG(ss, regs);
>> +        es = SEG(es, regs);
>> +        tr = SEG(tr, regs);
>> +#undef SEG
>> +
>> +        rc = check_segment(cs, x86_seg_cs);
>> +        rc |= check_segment(ds, x86_seg_ds);
>> +        rc |= check_segment(ss, x86_seg_ss);
>> +        rc |= check_segment(es, x86_seg_es);
>> +        rc |= check_segment(tr, x86_seg_tr);
>> +        if ( rc != 0 )
>> +            return rc;
>> +
>> +        /* Basic sanity checks. */
>> +        limit = cs.limit;
>> +        if ( cs.attr.fields.g )
>> +            limit = (limit << 12) | 0xfff;
>> +        if ( regs->eip > limit )
>> +        {
>> +            gprintk(XENLOG_ERR, "EIP (%#08x) outside CS limit (%#08x)\n",
>> +                                regs->eip, limit);
>> +            return -EINVAL;
>> +        }
>> +
>> +        if ( ds.attr.fields.p && ds.attr.fields.dpl > cs.attr.fields.dpl )
>> +        {
>> +            gprintk(XENLOG_ERR, "DS.DPL (%u) is greater than CS.DPL (%u)\n",
>> +                                ds.attr.fields.dpl, cs.attr.fields.dpl);
> 
> Indentation.

Done (and fixed others above and below).

> 
>> +            return -EINVAL;
>> +        }
>> +
>> +        if ( ss.attr.fields.p && ss.attr.fields.dpl != cs.attr.fields.dpl )
>> +        {
>> +            gprintk(XENLOG_ERR, "SS.DPL (%u) is different than CS.DPL (%u)\n",
>> +                                ss.attr.fields.dpl, cs.attr.fields.dpl);
>> +            return -EINVAL;
>> +        }
> 
> I think this should go ahead of the DS one. Also I don't think
> either CS or SS would be okay with the present bit clear, even
> less so in 32-bit mode.

Yes, CS or SS cannot have the present bit clear, we already make sure of
that in the check_segment function above. This condition can indeed be
simplified.

>> +int arch_initialize_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
>> +{
>> +    struct domain *d = v->domain;
>> +    int rc;
>> +
>> +    if ( is_hvm_vcpu(v) )
>> +    {
>> +        struct vcpu_hvm_context ctxt;
>> +
>> +        if ( copy_from_guest(&ctxt, arg, 1) )
>> +            return -EFAULT;
>> +
>> +        domain_lock(d);
>> +        rc = v->is_initialised ? -EEXIST : arch_set_info_hvm_guest(v, &ctxt);
>> +        domain_unlock(d);
>> +    }
>> +    else
>> +    {
>> +        struct vcpu_guest_context *ctxt;
>> +
>> +        if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
>> +            return -ENOMEM;
>> +
>> +        if ( copy_from_guest(ctxt, arg, 1) )
>> +        {
>> +            free_vcpu_guest_context(ctxt);
>> +            return -EFAULT;
>> +        }
>> +
>> +        domain_lock(d);
>> +        rc = v->is_initialised ? -EEXIST : arch_set_info_guest(v, ctxt);
>> +        domain_unlock(d);
>> +
>> +        free_vcpu_guest_context(ctxt);
>> +    }
> 
> This else branch looks suspiciously like the ARM variant, and iirc I
> had asked already on an earlier version to have this handled in
> common code (with ARM simply using the common function for its
> arch_initialize_vcpu()).

Done, I've created a default_initalize_vcpu that's shared between ARM
and x86 PV guests. The arch_initialize_vcpu implementation on ARM is
just a stub that calls default_initialize_vcpu.

Roger.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 21/21] libxl: add support for migrating HVM guests without a device model
  2015-11-11 17:14   ` Andrew Cooper
@ 2015-11-26 18:10     ` Roger Pau Monné
  0 siblings, 0 replies; 47+ messages in thread
From: Roger Pau Monné @ 2015-11-26 18:10 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Ian Jackson, Wei Liu, Ian Campbell

El 11/11/15 a les 18.14, Andrew Cooper ha escrit:
> On 06/11/15 16:05, Roger Pau Monne wrote:
>> diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
>> index 52a60d7..0a6eaf9 100644
>> --- a/tools/libxl/libxl_stream_write.c
>> +++ b/tools/libxl/libxl_stream_write.c
>> @@ -232,6 +232,9 @@ void libxl__stream_write_start(libxl__egc *egc,
>>              stream->emu_sub_hdr.id = EMULATOR_QEMU_UPSTREAM;
>>              break;
>>  
>> +        case LIBXL_DEVICE_MODEL_VERSION_NONE:
>> +            break;
>> +
> 
> This (in principle) leaves stream->emu_sub_hdr.id uninitialised
> (although its value will be zero because of libxl__stream_write_init()).
> 
> I would be tempted to (ab)use EMULATOR_UNKNOWN here and have
> setup_emulator_write() assert id != UNKNOWN, to catch calls which slip
> through the cracks.

This seems fine to me and I've implemented it.

> 
>>          default:
>>              rc = ERROR_FAIL;
>>              LOG(ERROR, "Unknown emulator for HVM domain");
>> @@ -359,6 +362,12 @@ static void write_emulator_xenstore_record(libxl__egc *egc,
>>      char *buf = NULL;
>>      uint32_t len = 0;
>>  
>> +    if (libxl__device_model_version_running(gc, dss->domid) ==
>> +        LIBXL_DEVICE_MODEL_VERSION_NONE) {
>> +        emulator_xenstore_record_done(egc, stream);
>> +        return;
>> +    }
>> +
> 
> libxl__device_model_version_running() performs a xenstore action, which
> is a steep overhead for constant information.
> 
> It would be better to introduce a new field in libxl__stream_write_state
> and cache libxl__device_model_version_running() once in
> libxl__stream_write_start().  You can then also use that for an
> assertion in setup_emulator_write()

Right, so I've ended up adding two asserts in setup_emulator_write:

assert(stream->emu_sub_hdr.id != EMULATOR_UNKNOWN);
assert(stream->device_model_version != LIBXL_DEVICE_MODEL_VERSION_NONE);

Roger.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 17/21] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs
  2015-11-26 16:57     ` Roger Pau Monné
@ 2015-11-27  8:00       ` Jan Beulich
  2015-11-27  9:47         ` Roger Pau Monné
  0 siblings, 1 reply; 47+ messages in thread
From: Jan Beulich @ 2015-11-27  8:00 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Andrew Cooper, Stefano Stabellini, Ian Campbell, xen-devel

>>> On 26.11.15 at 17:57, <roger.pau@citrix.com> wrote:
> El 12/11/15 a les 17.57, Jan Beulich ha escrit:
>>>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
>>> +    if ( reg.attr.fields.pad != 0 )
>>> +    {
>>> +        gprintk(XENLOG_ERR,
>>> +                "Attribute bits 12-15 of the segment are not zero\n");
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    if ( reg.sel == 0 && reg.base == 0 && reg.limit == 0 &&
>> 
>> What's the sel check good for when your only caller only ever calls
>> you with it being zero?
> 
> I don't mind removing the sel == 0 check but I don't think it hurts either.

Its presence having confused me means it may confuse other readers.

>> Looking at base or limit here doesn't seem
>> right either.
> 
> I'm sorry but I'm not following you here, why is this not right? Would
> you rather conclude that the user is trying to load a null segment by
> just looking at the attributes field (and checking it's 0)?

Yes, exactly. Attributes being all zero makes a segment a null one
regardless of base or limit (if anything refusing non-zero base/limit
when attributes are zero as being inconsistent would be an option).

>>> +int arch_initialize_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
>>> +{
>>> +    struct domain *d = v->domain;
>>> +    int rc;
>>> +
>>> +    if ( is_hvm_vcpu(v) )
>>> +    {
>>> +        struct vcpu_hvm_context ctxt;
>>> +
>>> +        if ( copy_from_guest(&ctxt, arg, 1) )
>>> +            return -EFAULT;
>>> +
>>> +        domain_lock(d);
>>> +        rc = v->is_initialised ? -EEXIST : arch_set_info_hvm_guest(v, &ctxt);
>>> +        domain_unlock(d);
>>> +    }
>>> +    else
>>> +    {
>>> +        struct vcpu_guest_context *ctxt;
>>> +
>>> +        if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
>>> +            return -ENOMEM;
>>> +
>>> +        if ( copy_from_guest(ctxt, arg, 1) )
>>> +        {
>>> +            free_vcpu_guest_context(ctxt);
>>> +            return -EFAULT;
>>> +        }
>>> +
>>> +        domain_lock(d);
>>> +        rc = v->is_initialised ? -EEXIST : arch_set_info_guest(v, ctxt);
>>> +        domain_unlock(d);
>>> +
>>> +        free_vcpu_guest_context(ctxt);
>>> +    }
>> 
>> This else branch looks suspiciously like the ARM variant, and iirc I
>> had asked already on an earlier version to have this handled in
>> common code (with ARM simply using the common function for its
>> arch_initialize_vcpu()).
> 
> Done, I've created a default_initalize_vcpu that's shared between ARM
> and x86 PV guests. The arch_initialize_vcpu implementation on ARM is
> just a stub that calls default_initialize_vcpu.

I'd actually have expected that to just be a #define, but okay.

Jan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 17/21] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs
  2015-11-27  8:00       ` Jan Beulich
@ 2015-11-27  9:47         ` Roger Pau Monné
  2015-11-27 10:03           ` Jan Beulich
  0 siblings, 1 reply; 47+ messages in thread
From: Roger Pau Monné @ 2015-11-27  9:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Stefano Stabellini, Ian Campbell, xen-devel

El 27/11/15 a les 9.00, Jan Beulich ha escrit:
>>>> On 26.11.15 at 17:57, <roger.pau@citrix.com> wrote:
>> El 12/11/15 a les 17.57, Jan Beulich ha escrit:
>>>>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
>>>> +    if ( reg.attr.fields.pad != 0 )
>>>> +    {
>>>> +        gprintk(XENLOG_ERR,
>>>> +                "Attribute bits 12-15 of the segment are not zero\n");
>>>> +        return -EINVAL;
>>>> +    }
>>>> +
>>>> +    if ( reg.sel == 0 && reg.base == 0 && reg.limit == 0 &&
>>>
>>> What's the sel check good for when your only caller only ever calls
>>> you with it being zero?
>>
>> I don't mind removing the sel == 0 check but I don't think it hurts either.
> 
> Its presence having confused me means it may confuse other readers.
> 
>>> Looking at base or limit here doesn't seem
>>> right either.
>>
>> I'm sorry but I'm not following you here, why is this not right? Would
>> you rather conclude that the user is trying to load a null segment by
>> just looking at the attributes field (and checking it's 0)?
> 
> Yes, exactly. Attributes being all zero makes a segment a null one
> regardless of base or limit (if anything refusing non-zero base/limit
> when attributes are zero as being inconsistent would be an option).

Thanks for the feedback, I'm also wondering whether I should call
hvm_cr4_guest_reserved_bits and hvm_efer_valid like it's done in
hvm_load_cpu_ctxt. Currently we don't perform any of the EFER/CR4 checks
in order to make sure that what the user enables is actually allowed.
What do you think?

>>>> +int arch_initialize_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>> +{
>>>> +    struct domain *d = v->domain;
>>>> +    int rc;
>>>> +
>>>> +    if ( is_hvm_vcpu(v) )
>>>> +    {
>>>> +        struct vcpu_hvm_context ctxt;
>>>> +
>>>> +        if ( copy_from_guest(&ctxt, arg, 1) )
>>>> +            return -EFAULT;
>>>> +
>>>> +        domain_lock(d);
>>>> +        rc = v->is_initialised ? -EEXIST : arch_set_info_hvm_guest(v, &ctxt);
>>>> +        domain_unlock(d);
>>>> +    }
>>>> +    else
>>>> +    {
>>>> +        struct vcpu_guest_context *ctxt;
>>>> +
>>>> +        if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
>>>> +            return -ENOMEM;
>>>> +
>>>> +        if ( copy_from_guest(ctxt, arg, 1) )
>>>> +        {
>>>> +            free_vcpu_guest_context(ctxt);
>>>> +            return -EFAULT;
>>>> +        }
>>>> +
>>>> +        domain_lock(d);
>>>> +        rc = v->is_initialised ? -EEXIST : arch_set_info_guest(v, ctxt);
>>>> +        domain_unlock(d);
>>>> +
>>>> +        free_vcpu_guest_context(ctxt);
>>>> +    }
>>>
>>> This else branch looks suspiciously like the ARM variant, and iirc I
>>> had asked already on an earlier version to have this handled in
>>> common code (with ARM simply using the common function for its
>>> arch_initialize_vcpu()).
>>
>> Done, I've created a default_initalize_vcpu that's shared between ARM
>> and x86 PV guests. The arch_initialize_vcpu implementation on ARM is
>> just a stub that calls default_initialize_vcpu.
> 
> I'd actually have expected that to just be a #define, but okay.

I wanted to do it as a define, like:

#define arch_initialize_vcpu default_initalize_vcpu

Inside of arm/domain.h, but that's included before the common domain.h,
so the prototype of qrch_initialize_vcpu gets replaced to
defaul_initialize_vcpu, and then the compiler complains about duplicate
prototypes. I could shuffle them a bit in order to fix it, but I think
the stub in arm/domain.c is clearer, and the compiler should optimize it
anyway.

Roger.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v8 17/21] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs
  2015-11-27  9:47         ` Roger Pau Monné
@ 2015-11-27 10:03           ` Jan Beulich
  0 siblings, 0 replies; 47+ messages in thread
From: Jan Beulich @ 2015-11-27 10:03 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Andrew Cooper, Stefano Stabellini, Ian Campbell, xen-devel

>>> On 27.11.15 at 10:47, <roger.pau@citrix.com> wrote:
> El 27/11/15 a les 9.00, Jan Beulich ha escrit:
>>>>> On 26.11.15 at 17:57, <roger.pau@citrix.com> wrote:
>>> El 12/11/15 a les 17.57, Jan Beulich ha escrit:
>>>>>>> On 06.11.15 at 17:05, <roger.pau@citrix.com> wrote:
>>>>> +    if ( reg.attr.fields.pad != 0 )
>>>>> +    {
>>>>> +        gprintk(XENLOG_ERR,
>>>>> +                "Attribute bits 12-15 of the segment are not zero\n");
>>>>> +        return -EINVAL;
>>>>> +    }
>>>>> +
>>>>> +    if ( reg.sel == 0 && reg.base == 0 && reg.limit == 0 &&
>>>>
>>>> What's the sel check good for when your only caller only ever calls
>>>> you with it being zero?
>>>
>>> I don't mind removing the sel == 0 check but I don't think it hurts either.
>> 
>> Its presence having confused me means it may confuse other readers.
>> 
>>>> Looking at base or limit here doesn't seem
>>>> right either.
>>>
>>> I'm sorry but I'm not following you here, why is this not right? Would
>>> you rather conclude that the user is trying to load a null segment by
>>> just looking at the attributes field (and checking it's 0)?
>> 
>> Yes, exactly. Attributes being all zero makes a segment a null one
>> regardless of base or limit (if anything refusing non-zero base/limit
>> when attributes are zero as being inconsistent would be an option).
> 
> Thanks for the feedback, I'm also wondering whether I should call
> hvm_cr4_guest_reserved_bits and hvm_efer_valid like it's done in
> hvm_load_cpu_ctxt. Currently we don't perform any of the EFER/CR4 checks
> in order to make sure that what the user enables is actually allowed.
> What do you think?

Sounds like a good idea.

Jan

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2015-11-27 10:04 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-06 16:05 [PATCH v8 00/21] Introduce HVM without dm and new boot ABI Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 01/21] xen/x86: add bitmap of enabled emulated devices Roger Pau Monne
2015-11-16 12:18   ` Jan Beulich
2015-11-16 18:33     ` Andrew Cooper
2015-11-17  9:42       ` Jan Beulich
2015-11-17  9:58         ` Juergen Gross
2015-11-17 10:30           ` Andrew Cooper
2015-11-17 10:35             ` Juergen Gross
2015-11-26 10:25       ` Roger Pau Monné
2015-11-06 16:05 ` [PATCH v8 02/21] xen/vlapic: fixes for HVM code when running without a vlapic Roger Pau Monne
2015-11-06 16:10   ` Andrew Cooper
2015-11-26  5:41   ` Tian, Kevin
2015-11-06 16:05 ` [PATCH v8 03/21] xen/x86: allow disabling the emulated local apic Roger Pau Monne
2015-11-26  5:47   ` Tian, Kevin
2015-11-06 16:05 ` [PATCH v8 04/21] xen/x86: allow disabling the emulated HPET Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 05/21] xen/x86: allow disabling power management Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 06/21] xen/x86: allow disabling the emulated RTC Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 07/21] xen/x86: allow disabling the emulated IO APIC Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 08/21] xen/x86: allow disabling the emulated PIC Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 09/21] xen/x86: set the vPMU interface based on the presence of a lapic Roger Pau Monne
2015-11-09 15:04   ` Jan Beulich
2015-11-10 11:30     ` [PATCH v8.1 " Roger Pau Monne
2015-11-10 14:21       ` Boris Ostrovsky
2015-11-06 16:05 ` [PATCH v8 10/21] xen/x86: allow disabling the emulated VGA Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 11/21] xen/x86: allow disabling the emulated IOMMU Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 12/21] xen/x86: allow disabling the emulated PIT Roger Pau Monne
2015-11-09 15:31   ` Jan Beulich
2015-11-06 16:05 ` [PATCH v8 13/21] xen/x86: make sure the HVM callback vector is correctly set Roger Pau Monne
2015-11-10 16:41   ` Jan Beulich
2015-11-06 16:05 ` [PATCH v8 14/21] xen/x86: allow disabling all emulated devices inside of Xen Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 15/21] elfnotes: intorduce a new PHYS_ENTRY elfnote Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 16/21] libxc: allow creating domains without emulated devices Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 17/21] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs Roger Pau Monne
2015-11-12 16:57   ` Jan Beulich
2015-11-26 16:57     ` Roger Pau Monné
2015-11-27  8:00       ` Jan Beulich
2015-11-27  9:47         ` Roger Pau Monné
2015-11-27 10:03           ` Jan Beulich
2015-11-06 16:05 ` [PATCH v8 18/21] libxc/xen: introduce a start info structure for HVMlite guests Roger Pau Monne
2015-11-10 16:53   ` Jan Beulich
2015-11-06 16:05 ` [PATCH v8 19/21] libxc: switch xc_dom_elfloader to be used with HVMlite domains Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 20/21] libxl: allow the creation of HVM domains without a device model Roger Pau Monne
2015-11-06 16:05 ` [PATCH v8 21/21] libxl: add support for migrating HVM guests " Roger Pau Monne
2015-11-11 15:49   ` Wei Liu
2015-11-11 15:55     ` Ian Jackson
2015-11-11 17:14   ` Andrew Cooper
2015-11-26 18:10     ` Roger Pau Monné

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.