xen-devel.lists.xenproject.org archive mirror
* [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI
@ 2015-06-22 16:11 Roger Pau Monne
  2015-06-22 16:11 ` [PATCH RFC v1 01/13] libxc: split x86 HVM setup_guest into smaller logical functions Roger Pau Monne
                   ` (13 more replies)
  0 siblings, 14 replies; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel

Before reading any further, keep in mind this is a VERY initial RFC 
prototype series. Many things are not finished, and those that are done 
make heavy use of duct tape to hold things in place.

Now that you are warned, the series is split as follows:

 - Patches 1 to 7 switch HVM domain construction to use the xc_dom_* 
   family of functions, in the same way they are used to build PV domains. 
 - Patches 8 to 13 introduce the creation of HVM domains without 
   firmware, which is replaced by directly loading a kernel as is done 
   for PV guests. A new boot ABI based on the discussion in the thread "RFC: 
   making the PVH 64bit ABI as stable" is also introduced, although it's not 
   finished yet.

Some things that are missing from the new boot ABI:

 - Although it supports loading a ramdisk, there's still no defined way 
   to pass this ramdisk to the guest. I'm thinking of using an 
   HVMPARAM or simply setting a GP register to contain the address of the 
   ramdisk. Ideally I would like to support loading more than one ramdisk.

Some patches contain comments after the SoB, which in general describe the 
shortcomings of the implementation. The aim of those is to start a 
discussion about whether the approach taken is TRTTD.

I've only tested this on Intel hardware, but I see no reason why it 
shouldn't work on AMD. I've managed to boot FreeBSD up to the point where it 
should jump into user-space (I just didn't have a VBD attached to the VM, so 
it just sits waiting for a valid disk). I have not tried to boot it any 
further, since I think that's fine for the purpose of this series. 

The series can also be found in the following git repo:

git://xenbits.xen.org/people/royger/xen.git branch hvm_without_dm_v1

And for the FreeBSD part:

git://xenbits.xen.org/people/royger/freebsd.git branch new_entry_point_v1

In case someone wants to give it a try, I've uploaded a FreeBSD kernel that 
should work when booted into this mode:

https://people.freebsd.org/~royger/kernel_no_dm

The config file that I've used is:

<config>
kernel="/path/to/kernel_no_dm"

builder="hvm"
device_model_version="none"

memory=128
vcpus=1
name = "freebsd"
</config>

Thanks, Roger.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH RFC v1 01/13] libxc: split x86 HVM setup_guest into smaller logical functions
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-22 16:11 ` [PATCH RFC v1 02/13] libxc: unify xc_dom_p2m_{host/guest} Roger Pau Monne
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

This is just a preparatory change to clean up the code in setup_guest.
Should not introduce any functional changes.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
 tools/libxc/xc_hvm_build_x86.c | 198 ++++++++++++++++++++++++-----------------
 1 file changed, 117 insertions(+), 81 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index 003ea06..f7616a8 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -232,28 +232,20 @@ static int check_mmio_hole(uint64_t start, uint64_t memsize,
         return 1;
 }
 
-static int setup_guest(xc_interface *xch,
-                       uint32_t dom, struct xc_hvm_build_args *args,
-                       char *image, unsigned long image_size)
+static int xc_hvm_populate_memory(xc_interface *xch, uint32_t dom,
+                                  struct xc_hvm_build_args *args,
+                                  xen_pfn_t *page_array)
 {
-    xen_pfn_t *page_array = NULL;
     unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT;
     unsigned long p2m_size;
     unsigned long target_pages = args->mem_target >> PAGE_SHIFT;
-    unsigned long entry_eip, cur_pages, cur_pfn;
-    void *hvm_info_page;
-    uint32_t *ident_pt;
-    struct elf_binary elf;
-    uint64_t v_start, v_end;
-    uint64_t m_start = 0, m_end = 0;
+    unsigned long cur_pages, cur_pfn;
     int rc;
     xen_capabilities_info_t caps;
     unsigned long stat_normal_pages = 0, stat_2mb_pages = 0, 
         stat_1gb_pages = 0;
     unsigned int memflags = 0;
     int claim_enabled = args->claim_enabled;
-    xen_pfn_t special_array[NR_SPECIAL_PAGES];
-    xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
     uint64_t total_pages;
     xen_vmemrange_t dummy_vmemrange[2];
     unsigned int dummy_vnode_to_pnode[1];
@@ -261,19 +253,6 @@ static int setup_guest(xc_interface *xch,
     unsigned int *vnode_to_pnode;
     unsigned int nr_vmemranges, nr_vnodes;
 
-    memset(&elf, 0, sizeof(elf));
-    if ( elf_init(&elf, image, image_size) != 0 )
-    {
-        PERROR("Could not initialise ELF image");
-        goto error_out;
-    }
-
-    xc_elf_set_logfile(xch, &elf, 1);
-
-    elf_parse_binary(&elf);
-    v_start = 0;
-    v_end = args->mem_size;
-
     if ( nr_pages > target_pages )
         memflags |= XENMEMF_populate_on_demand;
 
@@ -346,24 +325,6 @@ static int setup_guest(xc_interface *xch,
         goto error_out;
     }
 
-    if ( modules_init(args, v_end, &elf, &m_start, &m_end) != 0 )
-    {
-        ERROR("Insufficient space to load modules.");
-        goto error_out;
-    }
-
-    DPRINTF("VIRTUAL MEMORY ARRANGEMENT:\n");
-    DPRINTF("  Loader:   %016"PRIx64"->%016"PRIx64"\n", elf.pstart, elf.pend);
-    DPRINTF("  Modules:  %016"PRIx64"->%016"PRIx64"\n", m_start, m_end);
-    DPRINTF("  TOTAL:    %016"PRIx64"->%016"PRIx64"\n", v_start, v_end);
-    DPRINTF("  ENTRY:    %016"PRIx64"\n", elf_uval(&elf, elf.ehdr, e_entry));
-
-    if ( (page_array = malloc(p2m_size * sizeof(xen_pfn_t))) == NULL )
-    {
-        PERROR("Could not allocate memory.");
-        goto error_out;
-    }
-
     for ( i = 0; i < p2m_size; i++ )
         page_array[i] = ((xen_pfn_t)-1);
     for ( vmemid = 0; vmemid < nr_vmemranges; vmemid++ )
@@ -563,7 +524,54 @@ static int setup_guest(xc_interface *xch,
     DPRINTF("  4KB PAGES: 0x%016lx\n", stat_normal_pages);
     DPRINTF("  2MB PAGES: 0x%016lx\n", stat_2mb_pages);
     DPRINTF("  1GB PAGES: 0x%016lx\n", stat_1gb_pages);
-    
+
+    rc = 0;
+    goto out;
+ error_out:
+    rc = -1;
+ out:
+
+    /* ensure no unclaimed pages are left unused */
+    xc_domain_claim_pages(xch, dom, 0 /* cancels the claim */);
+
+    return rc;
+}
+
+static int xc_hvm_load_image(xc_interface *xch,
+                       uint32_t dom, struct xc_hvm_build_args *args,
+                       xen_pfn_t *page_array)
+{
+    unsigned long entry_eip, image_size;
+    struct elf_binary elf;
+    uint64_t v_start, v_end;
+    uint64_t m_start = 0, m_end = 0;
+    char *image;
+    int rc;
+
+    image = xc_read_image(xch, args->image_file_name, &image_size);
+    if ( image == NULL )
+        return -1;
+
+    memset(&elf, 0, sizeof(elf));
+    if ( elf_init(&elf, image, image_size) != 0 )
+        goto error_out;
+
+    xc_elf_set_logfile(xch, &elf, 1);
+
+    elf_parse_binary(&elf);
+    v_start = 0;
+    v_end = args->mem_size;
+
+    if ( modules_init(args, v_end, &elf, &m_start, &m_end) != 0 )
+    {
+        ERROR("Insufficient space to load modules.");
+        goto error_out;
+    }
+
+    DPRINTF("VIRTUAL MEMORY ARRANGEMENT:\n");
+    DPRINTF("  Loader:   %016"PRIx64"->%016"PRIx64"\n", elf.pstart, elf.pend);
+    DPRINTF("  Modules:  %016"PRIx64"->%016"PRIx64"\n", m_start, m_end);
+
     if ( loadelfimage(xch, &elf, dom, page_array) != 0 )
     {
         PERROR("Could not load ELF image");
@@ -576,6 +584,44 @@ static int setup_guest(xc_interface *xch,
         goto error_out;
     }
 
+    /* Insert JMP <rel32> instruction at address 0x0 to reach entry point. */
+    entry_eip = elf_uval(&elf, elf.ehdr, e_entry);
+    if ( entry_eip != 0 )
+    {
+        char *page0 = xc_map_foreign_range(
+            xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, 0);
+        if ( page0 == NULL )
+            goto error_out;
+        page0[0] = 0xe9;
+        *(uint32_t *)&page0[1] = entry_eip - 5;
+        munmap(page0, PAGE_SIZE);
+    }
+
+    rc = 0;
+    goto out;
+ error_out:
+    rc = -1;
+ out:
+    if ( elf_check_broken(&elf) )
+        ERROR("HVM ELF broken: %s", elf_check_broken(&elf));
+    free(image);
+
+    return rc;
+}
+
+static int xc_hvm_populate_params(xc_interface *xch, uint32_t dom,
+                                  struct xc_hvm_build_args *args)
+{
+    unsigned long i;
+    void *hvm_info_page;
+    uint32_t *ident_pt;
+    uint64_t v_end;
+    int rc;
+    xen_pfn_t special_array[NR_SPECIAL_PAGES];
+    xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
+
+    v_end = args->mem_size;
+
     if ( (hvm_info_page = xc_map_foreign_range(
               xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
               HVM_INFO_PFN)) == NULL )
@@ -664,34 +710,12 @@ static int setup_guest(xc_interface *xch,
     xc_hvm_param_set(xch, dom, HVM_PARAM_IDENT_PT,
                      special_pfn(SPECIALPAGE_IDENT_PT) << PAGE_SHIFT);
 
-    /* Insert JMP <rel32> instruction at address 0x0 to reach entry point. */
-    entry_eip = elf_uval(&elf, elf.ehdr, e_entry);
-    if ( entry_eip != 0 )
-    {
-        char *page0 = xc_map_foreign_range(
-            xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, 0);
-        if ( page0 == NULL )
-        {
-            PERROR("Could not map page0");
-            goto error_out;
-        }
-        page0[0] = 0xe9;
-        *(uint32_t *)&page0[1] = entry_eip - 5;
-        munmap(page0, PAGE_SIZE);
-    }
-
     rc = 0;
     goto out;
  error_out:
     rc = -1;
  out:
-    if ( elf_check_broken(&elf) )
-        ERROR("HVM ELF broken: %s", elf_check_broken(&elf));
-
-    /* ensure no unclaimed pages are left unused */
-    xc_domain_claim_pages(xch, dom, 0 /* cancels the claim */);
 
-    free(page_array);
     return rc;
 }
 
@@ -702,9 +726,8 @@ int xc_hvm_build(xc_interface *xch, uint32_t domid,
                  struct xc_hvm_build_args *hvm_args)
 {
     struct xc_hvm_build_args args = *hvm_args;
-    void *image;
-    unsigned long image_size;
-    int sts;
+    xen_pfn_t *parray = NULL;
+    int rc;
 
     if ( domid == 0 )
         return -1;
@@ -715,24 +738,37 @@ int xc_hvm_build(xc_interface *xch, uint32_t domid,
     if ( args.mem_size < (2ull << 20) || args.mem_target < (2ull << 20) )
         return -1;
 
-    image = xc_read_image(xch, args.image_file_name, &image_size);
-    if ( image == NULL )
+    parray = malloc((args.mem_size >> PAGE_SHIFT) * sizeof(xen_pfn_t));
+    if ( parray == NULL )
         return -1;
 
-    sts = setup_guest(xch, domid, &args, image, image_size);
-
-    if (!sts)
+    rc = xc_hvm_populate_memory(xch, domid, &args, parray);
+    if ( rc != 0 )
     {
-        /* Return module load addresses to caller */
-        hvm_args->acpi_module.guest_addr_out = 
-            args.acpi_module.guest_addr_out;
-        hvm_args->smbios_module.guest_addr_out = 
-            args.smbios_module.guest_addr_out;
+        PERROR("xc_hvm_populate_memory failed");
+        goto out;
+    }
+    rc = xc_hvm_load_image(xch, domid, &args, parray);
+    if ( rc != 0 )
+    {
+        PERROR("xc_hvm_load_image failed");
+        goto out;
+    }
+    rc = xc_hvm_populate_params(xch, domid, &args);
+    if ( rc != 0 )
+    {
+        PERROR("xc_hvm_populate_params failed");
+        goto out;
     }
 
-    free(image);
+    /* Return module load addresses to caller */
+    hvm_args->acpi_module.guest_addr_out = args.acpi_module.guest_addr_out;
+    hvm_args->smbios_module.guest_addr_out = args.smbios_module.guest_addr_out;
 
-    return sts;
+out:
+    free(parray);
+
+    return rc;
 }
 
 /* xc_hvm_build_target_mem: 
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* [PATCH RFC v1 02/13] libxc: unify xc_dom_p2m_{host/guest}
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI Roger Pau Monne
  2015-06-22 16:11 ` [PATCH RFC v1 01/13] libxc: split x86 HVM setup_guest into smaller logical functions Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-22 16:11 ` [PATCH RFC v1 03/13] libxc: introduce the notion of a container type Roger Pau Monne
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

Unify both functions into xc_dom_p2m. Should not introduce any functional
change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
 stubdom/grub/kexec.c              |  4 ++--
 tools/libxc/include/xc_dom.h      | 14 ++------------
 tools/libxc/xc_dom_boot.c         | 10 +++++-----
 tools/libxc/xc_dom_compat_linux.c |  4 ++--
 tools/libxc/xc_dom_x86.c          | 32 ++++++++++++++++----------------
 tools/libxl/libxl_dom.c           |  4 ++--
 6 files changed, 29 insertions(+), 39 deletions(-)

diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c
index 4c33b25..0b2f4f3 100644
--- a/stubdom/grub/kexec.c
+++ b/stubdom/grub/kexec.c
@@ -358,9 +358,9 @@ void kexec(void *kernel, long kernel_size, void *module, long module_size, char
 #ifdef __x86_64__
                 MMUEXT_PIN_L4_TABLE,
 #endif
-                xc_dom_p2m_host(dom, dom->pgtables_seg.pfn),
+                xc_dom_p2m(dom, dom->pgtables_seg.pfn),
                 dom->guest_domid)) != 0 ) {
-        grub_printf("pin_table(%lx) returned %d\n", xc_dom_p2m_host(dom,
+        grub_printf("pin_table(%lx) returned %d\n", xc_dom_p2m(dom,
                     dom->pgtables_seg.pfn), rc);
         errnum = ERR_BOOT_FAILURE;
         goto out_remap;
diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index a7d059a..ec9e293 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -376,19 +376,9 @@ static inline void *xc_dom_vaddr_to_ptr(struct xc_dom_image *dom,
     return ptr + offset;
 }
 
-static inline xen_pfn_t xc_dom_p2m_host(struct xc_dom_image *dom, xen_pfn_t pfn)
+static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn)
 {
-    if (dom->shadow_enabled)
-        return pfn;
-    if (pfn < dom->rambase_pfn || pfn >= dom->rambase_pfn + dom->total_pages)
-        return INVALID_MFN;
-    return dom->p2m_host[pfn - dom->rambase_pfn];
-}
-
-static inline xen_pfn_t xc_dom_p2m_guest(struct xc_dom_image *dom,
-                                         xen_pfn_t pfn)
-{
-    if (xc_dom_feature_translated(dom))
+    if ( dom->shadow_enabled || xc_dom_feature_translated(dom) )
         return pfn;
     if (pfn < dom->rambase_pfn || pfn >= dom->rambase_pfn + dom->total_pages)
         return INVALID_MFN;
diff --git a/tools/libxc/xc_dom_boot.c b/tools/libxc/xc_dom_boot.c
index f82db2d..fda9e52 100644
--- a/tools/libxc/xc_dom_boot.c
+++ b/tools/libxc/xc_dom_boot.c
@@ -54,7 +54,7 @@ static int setup_hypercall_page(struct xc_dom_image *dom)
                   dom->parms.virt_hypercall, pfn);
     domctl.cmd = XEN_DOMCTL_hypercall_init;
     domctl.domain = dom->guest_domid;
-    domctl.u.hypercall_init.gmfn = xc_dom_p2m_guest(dom, pfn);
+    domctl.u.hypercall_init.gmfn = xc_dom_p2m(dom, pfn);
     rc = do_domctl(dom->xch, &domctl);
     if ( rc != 0 )
         xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
@@ -84,7 +84,7 @@ static int clear_page(struct xc_dom_image *dom, xen_pfn_t pfn)
     if ( pfn == 0 )
         return 0;
 
-    dst = xc_dom_p2m_host(dom, pfn);
+    dst = xc_dom_p2m(dom, pfn);
     DOMPRINTF("%s: pfn 0x%" PRIpfn ", mfn 0x%" PRIpfn "",
               __FUNCTION__, pfn, dst);
     rc = xc_clear_domain_page(dom->xch, dom->guest_domid, dst);
@@ -178,7 +178,7 @@ void *xc_dom_boot_domU_map(struct xc_dom_image *dom, xen_pfn_t pfn,
     }
 
     for ( i = 0; i < count; i++ )
-        entries[i].mfn = xc_dom_p2m_host(dom, pfn + i);
+        entries[i].mfn = xc_dom_p2m(dom, pfn + i);
 
     ptr = xc_map_foreign_ranges(dom->xch, dom->guest_domid,
                 count << page_shift, PROT_READ | PROT_WRITE, 1 << page_shift,
@@ -435,8 +435,8 @@ int xc_dom_gnttab_init(struct xc_dom_image *dom)
                                       dom->console_domid, dom->xenstore_domid);
     } else {
         return xc_dom_gnttab_seed(dom->xch, dom->guest_domid,
-                                  xc_dom_p2m_host(dom, dom->console_pfn),
-                                  xc_dom_p2m_host(dom, dom->xenstore_pfn),
+                                  xc_dom_p2m(dom, dom->console_pfn),
+                                  xc_dom_p2m(dom, dom->xenstore_pfn),
                                   dom->console_domid, dom->xenstore_domid);
     }
 }
diff --git a/tools/libxc/xc_dom_compat_linux.c b/tools/libxc/xc_dom_compat_linux.c
index 2c14a0f..acccde9 100644
--- a/tools/libxc/xc_dom_compat_linux.c
+++ b/tools/libxc/xc_dom_compat_linux.c
@@ -65,8 +65,8 @@ static int xc_linux_build_internal(struct xc_dom_image *dom,
     if ( (rc = xc_dom_gnttab_init(dom)) != 0)
         goto out;
 
-    *console_mfn = xc_dom_p2m_host(dom, dom->console_pfn);
-    *store_mfn = xc_dom_p2m_host(dom, dom->xenstore_pfn);
+    *console_mfn = xc_dom_p2m(dom, dom->console_pfn);
+    *store_mfn = xc_dom_p2m(dom, dom->xenstore_pfn);
 
  out:
     return rc;
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 920fe42..0d80c18 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -248,7 +248,7 @@ static int setup_pgtables_x86_32_pae(struct xc_dom_image *dom)
     unsigned long l3off, l2off = 0, l1off;
     xen_vaddr_t addr;
     xen_pfn_t pgpfn;
-    xen_pfn_t l3mfn = xc_dom_p2m_guest(dom, l3pfn);
+    xen_pfn_t l3mfn = xc_dom_p2m(dom, l3pfn);
 
     if ( dom->parms.pae == 1 )
     {
@@ -280,7 +280,7 @@ static int setup_pgtables_x86_32_pae(struct xc_dom_image *dom)
                 goto pfn_error;
             l3off = l3_table_offset_pae(addr);
             l3tab[l3off] =
-                pfn_to_paddr(xc_dom_p2m_guest(dom, l2pfn)) | L3_PROT;
+                pfn_to_paddr(xc_dom_p2m(dom, l2pfn)) | L3_PROT;
             l2pfn++;
         }
 
@@ -292,7 +292,7 @@ static int setup_pgtables_x86_32_pae(struct xc_dom_image *dom)
                 goto pfn_error;
             l2off = l2_table_offset_pae(addr);
             l2tab[l2off] =
-                pfn_to_paddr(xc_dom_p2m_guest(dom, l1pfn)) | L2_PROT;
+                pfn_to_paddr(xc_dom_p2m(dom, l1pfn)) | L2_PROT;
             l1pfn++;
         }
 
@@ -300,7 +300,7 @@ static int setup_pgtables_x86_32_pae(struct xc_dom_image *dom)
         l1off = l1_table_offset_pae(addr);
         pgpfn = (addr - dom->parms.virt_base) >> PAGE_SHIFT_X86;
         l1tab[l1off] =
-            pfn_to_paddr(xc_dom_p2m_guest(dom, pgpfn)) | L1_PROT;
+            pfn_to_paddr(xc_dom_p2m(dom, pgpfn)) | L1_PROT;
         if ( (addr >= dom->pgtables_seg.vstart) &&
              (addr < dom->pgtables_seg.vend) )
             l1tab[l1off] &= ~_PAGE_RW; /* page tables are r/o */
@@ -316,7 +316,7 @@ static int setup_pgtables_x86_32_pae(struct xc_dom_image *dom)
     if ( dom->virt_pgtab_end <= 0xc0000000 )
     {
         DOMPRINTF("%s: PAE: extra l2 page table for l3#3", __FUNCTION__);
-        l3tab[3] = pfn_to_paddr(xc_dom_p2m_guest(dom, l2pfn)) | L3_PROT;
+        l3tab[3] = pfn_to_paddr(xc_dom_p2m(dom, l2pfn)) | L3_PROT;
     }
     return 0;
 
@@ -375,7 +375,7 @@ static int setup_pgtables_x86_64(struct xc_dom_image *dom)
                 goto pfn_error;
             l4off = l4_table_offset_x86_64(addr);
             l4tab[l4off] =
-                pfn_to_paddr(xc_dom_p2m_guest(dom, l3pfn)) | L4_PROT;
+                pfn_to_paddr(xc_dom_p2m(dom, l3pfn)) | L4_PROT;
             l3pfn++;
         }
 
@@ -387,7 +387,7 @@ static int setup_pgtables_x86_64(struct xc_dom_image *dom)
                 goto pfn_error;
             l3off = l3_table_offset_x86_64(addr);
             l3tab[l3off] =
-                pfn_to_paddr(xc_dom_p2m_guest(dom, l2pfn)) | L3_PROT;
+                pfn_to_paddr(xc_dom_p2m(dom, l2pfn)) | L3_PROT;
             l2pfn++;
         }
 
@@ -399,7 +399,7 @@ static int setup_pgtables_x86_64(struct xc_dom_image *dom)
                 goto pfn_error;
             l2off = l2_table_offset_x86_64(addr);
             l2tab[l2off] =
-                pfn_to_paddr(xc_dom_p2m_guest(dom, l1pfn)) | L2_PROT;
+                pfn_to_paddr(xc_dom_p2m(dom, l1pfn)) | L2_PROT;
             l1pfn++;
         }
 
@@ -407,7 +407,7 @@ static int setup_pgtables_x86_64(struct xc_dom_image *dom)
         l1off = l1_table_offset_x86_64(addr);
         pgpfn = (addr - dom->parms.virt_base) >> PAGE_SHIFT_X86;
         l1tab[l1off] =
-            pfn_to_paddr(xc_dom_p2m_guest(dom, pgpfn)) | L1_PROT;
+            pfn_to_paddr(xc_dom_p2m(dom, pgpfn)) | L1_PROT;
         if ( (!dom->pvh_enabled)                &&
              (addr >= dom->pgtables_seg.vstart) &&
              (addr < dom->pgtables_seg.vend) )
@@ -490,9 +490,9 @@ static int start_info_x86_32(struct xc_dom_image *dom)
     start_info->mfn_list = dom->p2m_seg.vstart;
 
     start_info->flags = dom->flags;
-    start_info->store_mfn = xc_dom_p2m_guest(dom, dom->xenstore_pfn);
+    start_info->store_mfn = xc_dom_p2m(dom, dom->xenstore_pfn);
     start_info->store_evtchn = dom->xenstore_evtchn;
-    start_info->console.domU.mfn = xc_dom_p2m_guest(dom, dom->console_pfn);
+    start_info->console.domU.mfn = xc_dom_p2m(dom, dom->console_pfn);
     start_info->console.domU.evtchn = dom->console_evtchn;
 
     if ( dom->ramdisk_blob )
@@ -536,9 +536,9 @@ static int start_info_x86_64(struct xc_dom_image *dom)
     start_info->mfn_list = dom->p2m_seg.vstart;
 
     start_info->flags = dom->flags;
-    start_info->store_mfn = xc_dom_p2m_guest(dom, dom->xenstore_pfn);
+    start_info->store_mfn = xc_dom_p2m(dom, dom->xenstore_pfn);
     start_info->store_evtchn = dom->xenstore_evtchn;
-    start_info->console.domU.mfn = xc_dom_p2m_guest(dom, dom->console_pfn);
+    start_info->console.domU.mfn = xc_dom_p2m(dom, dom->console_pfn);
     start_info->console.domU.evtchn = dom->console_evtchn;
 
     if ( dom->ramdisk_blob )
@@ -622,7 +622,7 @@ static int vcpu_x86_32(struct xc_dom_image *dom, void *ptr)
          dom->parms.pae == 3 /* bimodal */ )
         ctxt->vm_assist |= (1UL << VMASST_TYPE_pae_extended_cr3);
 
-    cr3_pfn = xc_dom_p2m_guest(dom, dom->pgtables_seg.pfn);
+    cr3_pfn = xc_dom_p2m(dom, dom->pgtables_seg.pfn);
     ctxt->ctrlreg[3] = xen_pfn_to_cr3_x86_32(cr3_pfn);
     DOMPRINTF("%s: cr3: pfn 0x%" PRIpfn " mfn 0x%" PRIpfn "",
               __FUNCTION__, dom->pgtables_seg.pfn, cr3_pfn);
@@ -648,7 +648,7 @@ static int vcpu_x86_64(struct xc_dom_image *dom, void *ptr)
     ctxt->user_regs.rflags = 1 << 9; /* Interrupt Enable */
 
     ctxt->flags = VGCF_in_kernel_X86_64 | VGCF_online_X86_64;
-    cr3_pfn = xc_dom_p2m_guest(dom, dom->pgtables_seg.pfn);
+    cr3_pfn = xc_dom_p2m(dom, dom->pgtables_seg.pfn);
     ctxt->ctrlreg[3] = xen_pfn_to_cr3_x86_64(cr3_pfn);
     DOMPRINTF("%s: cr3: pfn 0x%" PRIpfn " mfn 0x%" PRIpfn "",
               __FUNCTION__, dom->pgtables_seg.pfn, cr3_pfn);
@@ -1020,7 +1020,7 @@ int arch_setup_bootlate(struct xc_dom_image *dom)
         /* paravirtualized guest */
         xc_dom_unmap_one(dom, dom->pgtables_seg.pfn);
         rc = pin_table(dom->xch, pgd_type,
-                       xc_dom_p2m_host(dom, dom->pgtables_seg.pfn),
+                       xc_dom_p2m(dom, dom->pgtables_seg.pfn),
                        dom->guest_domid);
         if ( rc != 0 )
         {
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 600393d..a970a8b 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -745,8 +745,8 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
         state->console_mfn = dom->console_pfn;
         state->store_mfn = dom->xenstore_pfn;
     } else {
-        state->console_mfn = xc_dom_p2m_host(dom, dom->console_pfn);
-        state->store_mfn = xc_dom_p2m_host(dom, dom->xenstore_pfn);
+        state->console_mfn = xc_dom_p2m(dom, dom->console_pfn);
+        state->store_mfn = xc_dom_p2m(dom, dom->xenstore_pfn);
     }
 
     libxl__file_reference_unmap(&state->pv_kernel);
-- 
1.9.5 (Apple Git-50.3)




* [PATCH RFC v1 03/13] libxc: introduce the notion of a container type
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI Roger Pau Monne
  2015-06-22 16:11 ` [PATCH RFC v1 01/13] libxc: split x86 HVM setup_guest into smaller logical functions Roger Pau Monne
  2015-06-22 16:11 ` [PATCH RFC v1 02/13] libxc: unify xc_dom_p2m_{host/guest} Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-22 16:11 ` [PATCH RFC v1 04/13] libxc: allow arch_setup_meminit to populate HVM domain memory Roger Pau Monne
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

Introduce the notion of a container type into xc_dom_image. This will be
needed by later changes that will also use xc_dom_image in order to build
HVM guests.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
 tools/libxc/include/xc_dom.h | 6 ++++++
 tools/libxc/xc_dom_x86.c     | 4 ++++
 tools/libxl/libxl_dom.c      | 1 +
 3 files changed, 11 insertions(+)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index ec9e293..f7b5f0f 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -180,6 +180,12 @@ struct xc_dom_image {
     struct xc_dom_arch *arch_hooks;
     /* allocate up to virt_alloc_end */
     int (*allocate) (struct xc_dom_image * dom, xen_vaddr_t up_to);
+
+    /* Container type (HVM or PV). */
+    enum {
+        XC_DOM_PV_CONTAINER,
+        XC_DOM_HVM_CONTAINER,
+    } container_type;
 };
 
 /* --- pluggable kernel loader ------------------------------------- */
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 0d80c18..b89f5c2 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -1071,6 +1071,10 @@ int arch_setup_bootlate(struct xc_dom_image *dom)
 
 int xc_dom_feature_translated(struct xc_dom_image *dom)
 {
+    /* Guests running inside HVM containers are always auto-translated. */
+    if ( dom->container_type == XC_DOM_HVM_CONTAINER )
+        return 1;
+
     return elf_xen_feature_get(XENFEAT_auto_translated_physmap, dom->f_active);
 }
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index a970a8b..8907bd6 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -626,6 +626,7 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
     }
 
     dom->pvh_enabled = state->pvh_enabled;
+    dom->container_type = XC_DOM_PV_CONTAINER;
 
     LOG(DEBUG, "pv kernel mapped %d path %s", state->pv_kernel.mapped, state->pv_kernel.path);
 
-- 
1.9.5 (Apple Git-50.3)




* [PATCH RFC v1 04/13] libxc: allow arch_setup_meminit to populate HVM domain memory
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (2 preceding siblings ...)
  2015-06-22 16:11 ` [PATCH RFC v1 03/13] libxc: introduce the notion of a container type Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-25 10:29   ` Wei Liu
  2015-06-22 16:11 ` [PATCH RFC v1 05/13] libxc: introduce a domain loader for HVM guest firmware Roger Pau Monne
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

Introduce a new arch_setup_meminit_hvm that's going to be used to populate
HVM domain memory. Rename arch_setup_meminit to arch_setup_meminit_pv
and introduce a stub arch_setup_meminit that will call the right meminit
function depending on the container type.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
I think that both arch_setup_meminit_hvm and arch_setup_meminit_pv could be
unified into a single meminit function. I have however not looked into it,
and just created arch_setup_meminit_hvm based on the code in
xc_hvm_populate_memory.
---
 tools/libxc/include/xc_dom.h |   8 +
 tools/libxc/xc_dom_x86.c     | 365 +++++++++++++++++++++++++++++++++++++++++--
 tools/libxl/libxl_dom.c      |   1 +
 3 files changed, 362 insertions(+), 12 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index f7b5f0f..051a7de 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -186,6 +186,14 @@ struct xc_dom_image {
         XC_DOM_PV_CONTAINER,
         XC_DOM_HVM_CONTAINER,
     } container_type;
+
+    /* HVM specific fields. */
+    xen_pfn_t target_pages;
+    xen_pfn_t mmio_start;
+    xen_pfn_t mmio_size;
+    xen_pfn_t lowmem_end;
+    xen_pfn_t highmem_end;
+    int vga_hole;
 };
 
 /* --- pluggable kernel loader ------------------------------------- */
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index b89f5c2..8a1ef24 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -40,10 +40,15 @@
 
 /* ------------------------------------------------------------------------ */
 
-#define SUPERPAGE_PFN_SHIFT  9
-#define SUPERPAGE_NR_PFNS    (1UL << SUPERPAGE_PFN_SHIFT)
 #define SUPERPAGE_BATCH_SIZE 512
 
+#define SUPERPAGE_2MB_SHIFT   9
+#define SUPERPAGE_2MB_NR_PFNS (1UL << SUPERPAGE_2MB_SHIFT)
+#define SUPERPAGE_1GB_SHIFT   18
+#define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)
+
+#define VGA_HOLE_SIZE (0x20)
+
 #define bits_to_mask(bits)       (((xen_vaddr_t)1 << (bits))-1)
 #define round_down(addr, mask)   ((addr) & ~(mask))
 #define round_up(addr, mask)     ((addr) | (mask))
@@ -758,7 +763,7 @@ static int x86_shadow(xc_interface *xch, domid_t domid)
     return rc;
 }
 
-int arch_setup_meminit(struct xc_dom_image *dom)
+static int arch_setup_meminit_pv(struct xc_dom_image *dom)
 {
     int rc;
     xen_pfn_t pfn, allocsz, mfn, total, pfn_base;
@@ -782,7 +787,7 @@ int arch_setup_meminit(struct xc_dom_image *dom)
 
     if ( dom->superpages )
     {
-        int count = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
+        int count = dom->total_pages >> SUPERPAGE_2MB_SHIFT;
         xen_pfn_t extents[count];
 
         dom->p2m_size = dom->total_pages;
@@ -793,9 +798,9 @@ int arch_setup_meminit(struct xc_dom_image *dom)
 
         DOMPRINTF("Populating memory with %d superpages", count);
         for ( pfn = 0; pfn < count; pfn++ )
-            extents[pfn] = pfn << SUPERPAGE_PFN_SHIFT;
+            extents[pfn] = pfn << SUPERPAGE_2MB_SHIFT;
         rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
-                                               count, SUPERPAGE_PFN_SHIFT, 0,
+                                               count, SUPERPAGE_2MB_SHIFT, 0,
                                                extents);
         if ( rc )
             return rc;
@@ -805,7 +810,7 @@ int arch_setup_meminit(struct xc_dom_image *dom)
         for ( i = 0; i < count; i++ )
         {
             mfn = extents[i];
-            for ( j = 0; j < SUPERPAGE_NR_PFNS; j++, pfn++ )
+            for ( j = 0; j < SUPERPAGE_2MB_NR_PFNS; j++, pfn++ )
                 dom->p2m_host[pfn] = mfn + j;
         }
     }
@@ -881,7 +886,7 @@ int arch_setup_meminit(struct xc_dom_image *dom)
             unsigned int memflags;
             uint64_t pages;
             unsigned int pnode = vnode_to_pnode[vmemranges[i].nid];
-            int nr_spages = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
+            int nr_spages = dom->total_pages >> SUPERPAGE_2MB_SHIFT;
             xen_pfn_t extents[SUPERPAGE_BATCH_SIZE];
             xen_pfn_t pfn_base_idx;
 
@@ -902,11 +907,11 @@ int arch_setup_meminit(struct xc_dom_image *dom)
                 nr_spages -= count;
 
                 for ( pfn = pfn_base_idx, j = 0;
-                      pfn < pfn_base_idx + (count << SUPERPAGE_PFN_SHIFT);
-                      pfn += SUPERPAGE_NR_PFNS, j++ )
+                      pfn < pfn_base_idx + (count << SUPERPAGE_2MB_SHIFT);
+                      pfn += SUPERPAGE_2MB_NR_PFNS, j++ )
                     extents[j] = dom->p2m_host[pfn];
                 rc = xc_domain_populate_physmap(dom->xch, dom->guest_domid, count,
-                                                SUPERPAGE_PFN_SHIFT, memflags,
+                                                SUPERPAGE_2MB_SHIFT, memflags,
                                                 extents);
                 if ( rc < 0 )
                     return rc;
@@ -916,7 +921,7 @@ int arch_setup_meminit(struct xc_dom_image *dom)
                 for ( j = 0; j < rc; j++ )
                 {
                     mfn = extents[j];
-                    for ( k = 0; k < SUPERPAGE_NR_PFNS; k++, pfn++ )
+                    for ( k = 0; k < SUPERPAGE_2MB_NR_PFNS; k++, pfn++ )
                         dom->p2m_host[pfn] = mfn + k;
                 }
                 pfn_base_idx = pfn;
@@ -957,6 +962,342 @@ int arch_setup_meminit(struct xc_dom_image *dom)
     return rc;
 }
 
+/*
+ * Check whether the MMIO hole overlaps the specified memory range.
+ * Returns 1 if it does, 0 otherwise.
+ */
+static int check_mmio_hole(uint64_t start, uint64_t memsize,
+                           uint64_t mmio_start, uint64_t mmio_size)
+{
+    /* No overlap if the range ends at or before the hole, or starts at
+     * or after its end. */
+    return !(start + memsize <= mmio_start || start >= mmio_start + mmio_size);
+}
+
+static int arch_setup_meminit_hvm(struct xc_dom_image *dom)
+{
+    unsigned long i, vmemid, nr_pages = dom->total_pages;
+    unsigned long p2m_size;
+    unsigned long target_pages = dom->target_pages;
+    unsigned long cur_pages, cur_pfn;
+    int rc;
+    xen_capabilities_info_t caps;
+    unsigned long stat_normal_pages = 0, stat_2mb_pages = 0, 
+        stat_1gb_pages = 0;
+    unsigned int memflags = 0;
+    int claim_enabled = dom->claim_enabled;
+    uint64_t total_pages;
+    xen_vmemrange_t dummy_vmemrange[2];
+    unsigned int dummy_vnode_to_pnode[1];
+    xen_vmemrange_t *vmemranges;
+    unsigned int *vnode_to_pnode;
+    unsigned int nr_vmemranges, nr_vnodes;
+    xc_interface *xch = dom->xch;
+    uint32_t domid = dom->guest_domid;
+
+    if ( nr_pages > target_pages )
+        memflags |= XENMEMF_populate_on_demand;
+
+    if ( dom->nr_vmemranges == 0 )
+    {
+        /* Build dummy vnode information
+         *
+         * Guest physical address space layout:
+         * [0, hole_start) [hole_start, 4G) [4G, highmem_end)
+         *
+         * Of course if there is no high memory, the second vmemrange
+         * has no effect on the actual result.
+         */
+
+        dummy_vmemrange[0].start = 0;
+        dummy_vmemrange[0].end   = dom->lowmem_end;
+        dummy_vmemrange[0].flags = 0;
+        dummy_vmemrange[0].nid   = 0;
+        nr_vmemranges = 1;
+
+        if ( dom->highmem_end > (1ULL << 32) )
+        {
+            dummy_vmemrange[1].start = 1ULL << 32;
+            dummy_vmemrange[1].end   = dom->highmem_end;
+            dummy_vmemrange[1].flags = 0;
+            dummy_vmemrange[1].nid   = 0;
+
+            nr_vmemranges++;
+        }
+
+        dummy_vnode_to_pnode[0] = XC_NUMA_NO_NODE;
+        nr_vnodes = 1;
+        vmemranges = dummy_vmemrange;
+        vnode_to_pnode = dummy_vnode_to_pnode;
+    }
+    else
+    {
+        if ( nr_pages > target_pages )
+        {
+            DOMPRINTF("Cannot enable vNUMA and PoD at the same time");
+            goto error_out;
+        }
+
+        nr_vmemranges = dom->nr_vmemranges;
+        nr_vnodes = dom->nr_vnodes;
+        vmemranges = dom->vmemranges;
+        vnode_to_pnode = dom->vnode_to_pnode;
+    }
+
+    total_pages = 0;
+    p2m_size = 0;
+    for ( i = 0; i < nr_vmemranges; i++ )
+    {
+        total_pages += ((vmemranges[i].end - vmemranges[i].start)
+                        >> PAGE_SHIFT);
+        p2m_size = p2m_size > (vmemranges[i].end >> PAGE_SHIFT) ?
+            p2m_size : (vmemranges[i].end >> PAGE_SHIFT);
+    }
+
+    if ( total_pages != nr_pages )
+    {
+        DOMPRINTF("vNUMA memory pages mismatch (0x%"PRIx64" != 0x%lx)",
+                  total_pages, nr_pages);
+        goto error_out;
+    }
+
+    if ( xc_version(xch, XENVER_capabilities, &caps) != 0 )
+    {
+        DOMPRINTF("Could not get Xen capabilities");
+        goto error_out;
+    }
+
+    dom->p2m_size = p2m_size;
+    dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
+                                      dom->p2m_size);
+    if ( dom->p2m_host == NULL )
+    {
+        DOMPRINTF("Could not allocate p2m");
+        goto error_out;
+    }
+
+    for ( i = 0; i < p2m_size; i++ )
+        dom->p2m_host[i] = ((xen_pfn_t)-1);
+    for ( vmemid = 0; vmemid < nr_vmemranges; vmemid++ )
+    {
+        uint64_t pfn;
+
+        for ( pfn = vmemranges[vmemid].start >> PAGE_SHIFT;
+              pfn < vmemranges[vmemid].end >> PAGE_SHIFT;
+              pfn++ )
+            dom->p2m_host[pfn] = pfn;
+    }
+
+    /*
+     * Try to claim pages for early warning of insufficient memory available.
+     * This should go before xc_domain_set_pod_target, because that function
+     * actually allocates memory for the guest. Claiming after memory has been
+     * allocated is pointless.
+     */
+    if ( claim_enabled )
+    {
+        rc = xc_domain_claim_pages(xch, domid, target_pages -
+                                   (dom->vga_hole ? VGA_HOLE_SIZE : 0));
+        if ( rc != 0 )
+        {
+            DOMPRINTF("Could not allocate memory for HVM guest as we cannot claim memory!");
+            goto error_out;
+        }
+    }
+
+    if ( memflags & XENMEMF_populate_on_demand )
+    {
+        /*
+         * Subtract VGA_HOLE_SIZE from target_pages for the VGA
+         * "hole".  Xen will adjust the PoD cache size so that domain
+         * tot_pages will be target_pages - VGA_HOLE_SIZE after
+         * this call.
+         */
+        rc = xc_domain_set_pod_target(xch, domid,
+                                      target_pages -
+                                      (dom->vga_hole ? VGA_HOLE_SIZE : 0),
+                                      NULL, NULL, NULL);
+        if ( rc != 0 )
+        {
+            DOMPRINTF("Could not set PoD target for HVM guest.");
+            goto error_out;
+        }
+    }
+
+    /*
+     * Allocate memory for HVM guest, skipping VGA hole 0xA0000-0xC0000.
+     *
+     * We attempt to allocate 1GB pages if possible, falling back to 2MB
+     * pages if that fails, and finally to 4KB pages if both fail.
+     *
+     * Under 2MB mode, we allocate pages in batches of no more than 8MB to
+     * ensure that we can be preempted and hence dom0 remains responsive.
+     */
+    rc = 0;
+    if ( dom->vga_hole )
+        rc = xc_domain_populate_physmap_exact(
+            xch, domid, 0xa0, 0, memflags, &dom->p2m_host[0x00]);
+
+    stat_normal_pages = 0;
+    for ( vmemid = 0; vmemid < nr_vmemranges; vmemid++ )
+    {
+        unsigned int new_memflags = memflags;
+        uint64_t end_pages;
+        unsigned int vnode = vmemranges[vmemid].nid;
+        unsigned int pnode = vnode_to_pnode[vnode];
+
+        if ( pnode != XC_NUMA_NO_NODE )
+            new_memflags |= XENMEMF_exact_node(pnode);
+
+        end_pages = vmemranges[vmemid].end >> PAGE_SHIFT;
+        /*
+         * Consider the VGA hole to belong to the vmemrange that covers
+         * 0xA0000-0xC0000. Note that 0x00000-0xA0000 is populated just
+         * before this loop.
+         */
+        if ( vmemranges[vmemid].start == 0 && dom->vga_hole )
+        {
+            cur_pages = 0xc0;
+            stat_normal_pages += 0xc0;
+        }
+        else
+            cur_pages = vmemranges[vmemid].start >> PAGE_SHIFT;
+
+        while ( (rc == 0) && (end_pages > cur_pages) )
+        {
+            /* Clip count to maximum 1GB extent. */
+            unsigned long count = end_pages - cur_pages;
+            unsigned long max_pages = SUPERPAGE_1GB_NR_PFNS;
+
+            if ( count > max_pages )
+                count = max_pages;
+
+            cur_pfn = dom->p2m_host[cur_pages];
+
+            /* Take care of the corner cases of superpage tails */
+            if ( ((cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
+                 (count > (-cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1))) )
+                count = -cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1);
+            else if ( ((count & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
+                      (count > SUPERPAGE_1GB_NR_PFNS) )
+                count &= ~(SUPERPAGE_1GB_NR_PFNS - 1);
+
+            /* Attempt to allocate a 1GB superpage. Because each pass
+             * allocates at most 1GB, we don't have to clip to 1GB
+             * superpage boundaries here.
+             */
+            if ( ((count | cur_pfn) & (SUPERPAGE_1GB_NR_PFNS - 1)) == 0 &&
+                 /* Check if there exists MMIO hole in the 1GB memory
+                  * range */
+                 !check_mmio_hole(cur_pfn << PAGE_SHIFT,
+                                  SUPERPAGE_1GB_NR_PFNS << PAGE_SHIFT,
+                                  dom->mmio_start, dom->mmio_size) )
+            {
+                long done;
+                unsigned long nr_extents = count >> SUPERPAGE_1GB_SHIFT;
+                xen_pfn_t sp_extents[nr_extents];
+
+                for ( i = 0; i < nr_extents; i++ )
+                    sp_extents[i] =
+                        dom->p2m_host[cur_pages+(i<<SUPERPAGE_1GB_SHIFT)];
+
+                done = xc_domain_populate_physmap(xch, domid, nr_extents,
+                                                  SUPERPAGE_1GB_SHIFT,
+                                                  memflags, sp_extents);
+
+                if ( done > 0 )
+                {
+                    stat_1gb_pages += done;
+                    done <<= SUPERPAGE_1GB_SHIFT;
+                    cur_pages += done;
+                    count -= done;
+                }
+            }
+
+            if ( count != 0 )
+            {
+                /* Clip count to maximum 8MB extent. */
+                max_pages = SUPERPAGE_2MB_NR_PFNS * 4;
+                if ( count > max_pages )
+                    count = max_pages;
+
+                /* Clip partial superpage extents to superpage
+                 * boundaries. */
+                if ( ((cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
+                     (count > (-cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1))) )
+                    count = -cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1);
+                else if ( ((count & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
+                          (count > SUPERPAGE_2MB_NR_PFNS) )
+                    count &= ~(SUPERPAGE_2MB_NR_PFNS - 1); /* clip non-s.p. tail */
+
+                /* Attempt to allocate superpage extents. */
+                if ( ((count | cur_pfn) & (SUPERPAGE_2MB_NR_PFNS - 1)) == 0 )
+                {
+                    long done;
+                    unsigned long nr_extents = count >> SUPERPAGE_2MB_SHIFT;
+                    xen_pfn_t sp_extents[nr_extents];
+
+                    for ( i = 0; i < nr_extents; i++ )
+                        sp_extents[i] =
+                            dom->p2m_host[cur_pages+(i<<SUPERPAGE_2MB_SHIFT)];
+
+                    done = xc_domain_populate_physmap(xch, domid, nr_extents,
+                                                      SUPERPAGE_2MB_SHIFT,
+                                                      memflags, sp_extents);
+
+                    if ( done > 0 )
+                    {
+                        stat_2mb_pages += done;
+                        done <<= SUPERPAGE_2MB_SHIFT;
+                        cur_pages += done;
+                        count -= done;
+                    }
+                }
+            }
+
+            /* Fall back to 4kB extents. */
+            if ( count != 0 )
+            {
+                rc = xc_domain_populate_physmap_exact(
+                    xch, domid, count, 0, new_memflags, &dom->p2m_host[cur_pages]);
+                cur_pages += count;
+                stat_normal_pages += count;
+            }
+        }
+
+        if ( rc != 0 )
+            break;
+    }
+
+    if ( rc != 0 )
+    {
+        DOMPRINTF("Could not allocate memory for HVM guest.");
+        goto error_out;
+    }
+
+    DPRINTF("PHYSICAL MEMORY ALLOCATION:\n");
+    DPRINTF("  4KB PAGES: 0x%016lx\n", stat_normal_pages);
+    DPRINTF("  2MB PAGES: 0x%016lx\n", stat_2mb_pages);
+    DPRINTF("  1GB PAGES: 0x%016lx\n", stat_1gb_pages);
+
+    rc = 0;
+    goto out;
+ error_out:
+    rc = -1;
+ out:
+
+    /* ensure no unclaimed pages are left unused */
+    xc_domain_claim_pages(xch, domid, 0 /* cancels the claim */);
+
+    return rc;
+}
+
+int arch_setup_meminit(struct xc_dom_image *dom)
+{
+    return (dom->container_type == XC_DOM_PV_CONTAINER) ?
+           arch_setup_meminit_pv(dom) : arch_setup_meminit_hvm(dom);
+}
+
 int arch_setup_bootearly(struct xc_dom_image *dom)
 {
     DOMPRINTF("%s: doing nothing", __FUNCTION__);
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 8907bd6..6273052 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -666,6 +666,7 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
     dom->xenstore_evtchn = state->store_port;
     dom->xenstore_domid = state->store_domid;
     dom->claim_enabled = libxl_defbool_val(info->claim_mode);
+    dom->vga_hole = 0;
 
     if (info->num_vnuma_nodes != 0) {
         unsigned int i;
-- 
1.9.5 (Apple Git-50.3)
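[Editor's aside, not part of the patch: for anyone tracing the extent-clipping
arithmetic in arch_setup_meminit_hvm above, the following standalone sketch
(plain Python, with an invented function name) shows how a candidate extent is
clipped so superpage allocations stay aligned. The `-pfn & (nr_pfns - 1)`
idiom computes the distance to the next superpage boundary for power-of-two
sizes.]

```python
# Sketch (names invented) of the superpage clipping in
# arch_setup_meminit_hvm: nr_pfns must be a power of two
# (512 PFNs for a 2MB page, 2**18 PFNs for a 1GB page).
SUPERPAGE_2MB_NR_PFNS = 1 << 9

def clip_to_superpage(cur_pfn, count, nr_pfns=SUPERPAGE_2MB_NR_PFNS):
    mask = nr_pfns - 1
    if (cur_pfn & mask) != 0 and count > (-cur_pfn & mask):
        # Unaligned head: only allocate up to the next superpage boundary.
        count = -cur_pfn & mask
    elif (count & mask) != 0 and count > nr_pfns:
        # Aligned head with a ragged tail: drop the partial tail.
        count &= ~mask
    return count
```

[The patch applies the same arithmetic twice, once with the 1GB PFN count and
once with the 2MB one.]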


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH RFC v1 05/13] libxc: introduce a domain loader for HVM guest firmware
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI Roger Pau Monne
                   ` (3 preceding siblings ...)
  2015-06-22 16:11 ` [PATCH RFC v1 04/13] libxc: allow arch_setup_meminit to populate HVM domain memory Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-23  9:29   ` Jan Beulich
  2015-07-10 19:09   ` Konrad Rzeszutek Wilk
  2015-06-22 16:11 ` [PATCH RFC v1 06/13] libxc: introduce a xc_dom_arch for hvm-3.0-x86_32 guests Roger Pau Monne
                   ` (8 subsequent siblings)
  13 siblings, 2 replies; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

Introduce a very simple (and dummy) domain loader to be used to load the
firmware (hvmloader) into HVM guests. Since hvmloader is just a 32-bit ELF
executable, the loader is fairly straightforward.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
 tools/libxc/Makefile           |   1 +
 tools/libxc/include/xc_dom.h   |   7 +
 tools/libxc/xc_dom_hvmloader.c | 316 +++++++++++++++++++++++++++++++++++++++++
 xen/include/xen/libelf.h       |   1 +
 4 files changed, 325 insertions(+)
 create mode 100644 tools/libxc/xc_dom_hvmloader.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 55782c8..7f860d7 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -86,6 +86,7 @@ GUEST_SRCS-y                 += xc_dom_core.c xc_dom_boot.c
 GUEST_SRCS-y                 += xc_dom_elfloader.c
 GUEST_SRCS-$(CONFIG_X86)     += xc_dom_bzimageloader.c
 GUEST_SRCS-$(CONFIG_X86)     += xc_dom_decompress_lz4.c
+GUEST_SRCS-$(CONFIG_X86)     += xc_dom_hvmloader.c
 GUEST_SRCS-$(CONFIG_ARM)     += xc_dom_armzimageloader.c
 GUEST_SRCS-y                 += xc_dom_binloader.c
 GUEST_SRCS-y                 += xc_dom_compat_linux.c
diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 051a7de..caeb1c8 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -15,6 +15,7 @@
  */
 
 #include <xen/libelf/libelf.h>
+#include <xenguest.h>
 
 #define INVALID_P2M_ENTRY   ((xen_pfn_t)-1)
 
@@ -194,6 +195,12 @@ struct xc_dom_image {
     xen_pfn_t lowmem_end;
     xen_pfn_t highmem_end;
     int vga_hole;
+
+    /* Extra ACPI tables passed to HVMLOADER */
+    struct xc_hvm_firmware_module acpi_module;
+
+    /* Extra SMBIOS structures passed to HVMLOADER */
+    struct xc_hvm_firmware_module smbios_module;
 };
 
 /* --- pluggable kernel loader ------------------------------------- */
diff --git a/tools/libxc/xc_dom_hvmloader.c b/tools/libxc/xc_dom_hvmloader.c
new file mode 100644
index 0000000..b6270c9
--- /dev/null
+++ b/tools/libxc/xc_dom_hvmloader.c
@@ -0,0 +1,316 @@
+/*
+ * Xen domain builder -- HVM specific bits.
+ *
+ * Parse and load ELF firmware images for HVM domains.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <inttypes.h>
+#include <assert.h>
+
+#include "xg_private.h"
+#include "xc_dom.h"
+#include "xc_bitops.h"
+
+/* ------------------------------------------------------------------------ */
+/* parse elf binary                                                         */
+
+static elf_negerrnoval check_elf_kernel(struct xc_dom_image *dom, bool verbose)
+{
+    if ( dom->kernel_blob == NULL )
+    {
+        if ( verbose )
+            xc_dom_panic(dom->xch,
+                         XC_INTERNAL_ERROR, "%s: no kernel image loaded",
+                         __FUNCTION__);
+        return -EINVAL;
+    }
+
+    if ( !elf_is_elfbinary(dom->kernel_blob, dom->kernel_size) )
+    {
+        if ( verbose )
+            xc_dom_panic(dom->xch,
+                         XC_INVALID_KERNEL, "%s: kernel is not an ELF image",
+                         __FUNCTION__);
+        return -EINVAL;
+    }
+    return 0;
+}
+
+static elf_negerrnoval xc_dom_probe_hvm_kernel(struct xc_dom_image *dom)
+{
+    struct elf_binary elf;
+    int rc;
+
+    /* This loader is designed for HVM guest firmware. */
+    if ( dom->container_type != XC_DOM_HVM_CONTAINER )
+        return -EINVAL;
+
+    rc = check_elf_kernel(dom, 0);
+    if ( rc != 0 )
+        return rc;
+
+    rc = elf_init(&elf, dom->kernel_blob, dom->kernel_size);
+    if ( rc != 0 )
+        return rc;
+
+    /*
+     * We need to check that there are no Xen ELFNOTES, or
+     * else we might be trying to load a PV kernel.
+     */
+    elf_parse_binary(&elf);
+    rc = elf_xen_parse(&elf, &dom->parms);
+    if ( rc == 0 )
+        return -EINVAL;
+
+    return 0;
+}
+
+static elf_errorstatus xc_dom_parse_hvm_kernel(struct xc_dom_image *dom)
+    /*
+     * This function sometimes returns -1 for error and sometimes
+     * an errno value.  ?!?!
+     */
+{
+    struct elf_binary *elf;
+    elf_errorstatus rc;
+
+    rc = check_elf_kernel(dom, 1);
+    if ( rc != 0 )
+        return rc;
+
+    elf = xc_dom_malloc(dom, sizeof(*elf));
+    if ( elf == NULL )
+        return -1;
+    dom->private_loader = elf;
+    rc = elf_init(elf, dom->kernel_blob, dom->kernel_size);
+    xc_elf_set_logfile(dom->xch, elf, 1);
+    if ( rc != 0 )
+    {
+        xc_dom_panic(dom->xch, XC_INVALID_KERNEL, "%s: corrupted ELF image",
+                     __FUNCTION__);
+        return rc;
+    }
+
+    if ( !elf_32bit(elf) )
+    {
+        xc_dom_panic(dom->xch, XC_INVALID_KERNEL, "%s: ELF image is not 32bit",
+                     __FUNCTION__);
+        return -EINVAL;
+    }
+
+    /* parse binary and get xen meta info */
+    elf_parse_binary(elf);
+
+    /* find kernel segment */
+    dom->kernel_seg.vstart = elf->pstart;
+    dom->kernel_seg.vend   = elf->pend;
+
+    dom->guest_type = "hvm-3.0-x86_32";
+
+    if ( elf_check_broken(elf) )
+        DOMPRINTF("%s: ELF broken: %s", __FUNCTION__,
+                  elf_check_broken(elf));
+
+    return rc;
+}
+
+static int modules_init(struct xc_dom_image *dom,
+                        uint64_t vend, struct elf_binary *elf,
+                        uint64_t *mstart_out, uint64_t *mend_out)
+{
+#define MODULE_ALIGN (1UL << 7)
+#define MB_ALIGN     (1UL << 20)
+#define MKALIGN(x, a) (((uint64_t)(x) + (a) - 1) & ~(uint64_t)((a) - 1))
+    uint64_t total_len = 0, offset1 = 0;
+
+    if ( (dom->acpi_module.length == 0) && (dom->smbios_module.length == 0) )
+        return 0;
+
+    /* Find the total length for the firmware modules, using a reasonably
+     * large alignment size to align each of the modules.
+     */
+    total_len = MKALIGN(dom->acpi_module.length, MODULE_ALIGN);
+    offset1 = total_len;
+    total_len += MKALIGN(dom->smbios_module.length, MODULE_ALIGN);
+
+    /* Place the modules 1MB+change past the end of the loader image. */
+    *mstart_out = MKALIGN(elf->pend, MB_ALIGN) + (MB_ALIGN);
+    *mend_out = *mstart_out + total_len;
+
+    if ( *mend_out > vend )
+        return -1;
+
+    if ( dom->acpi_module.length != 0 )
+        dom->acpi_module.guest_addr_out = *mstart_out;
+    if ( dom->smbios_module.length != 0 )
+        dom->smbios_module.guest_addr_out = *mstart_out + offset1;
+
+    return 0;
+}
+
+static int loadmodules(struct xc_dom_image *dom,
+                       uint64_t mstart, uint64_t mend,
+                       uint32_t domid)
+{
+    privcmd_mmap_entry_t *entries = NULL;
+    unsigned long pfn_start;
+    unsigned long pfn_end;
+    size_t pages;
+    uint32_t i;
+    uint8_t *dest;
+    int rc = -1;
+    xc_interface *xch = dom->xch;
+
+    if ( (mstart == 0) || (mend == 0) )
+        return 0;
+
+    pfn_start = (unsigned long)(mstart >> PAGE_SHIFT);
+    pfn_end = (unsigned long)((mend + PAGE_SIZE - 1) >> PAGE_SHIFT);
+    pages = pfn_end - pfn_start;
+
+    /* Map address space for module list. */
+    entries = calloc(pages, sizeof(privcmd_mmap_entry_t));
+    if ( entries == NULL )
+        goto error_out;
+
+    for ( i = 0; i < pages; i++ )
+        entries[i].mfn = (mstart >> PAGE_SHIFT) + i;
+
+    dest = xc_map_foreign_ranges(
+        xch, domid, pages << PAGE_SHIFT, PROT_READ | PROT_WRITE, 1 << PAGE_SHIFT,
+        entries, pages);
+    if ( dest == NULL )
+        goto error_out;
+
+    /* Zero the range so padding is clear between modules */
+    memset(dest, 0, pages << PAGE_SHIFT);
+
+    /* Load modules into range */
+    if ( dom->acpi_module.length != 0 )
+    {
+        memcpy(dest,
+               dom->acpi_module.data,
+               dom->acpi_module.length);
+    }
+    if ( dom->smbios_module.length != 0 )
+    {
+        memcpy(dest + (dom->smbios_module.guest_addr_out - mstart),
+               dom->smbios_module.data,
+               dom->smbios_module.length);
+    }
+
+    munmap(dest, pages << PAGE_SHIFT);
+    rc = 0;
+
+ error_out:
+    free(entries);
+
+    return rc;
+}
+
+static elf_errorstatus xc_dom_load_hvm_kernel(struct xc_dom_image *dom)
+{
+    struct elf_binary *elf = dom->private_loader;
+    privcmd_mmap_entry_t *entries = NULL;
+    size_t pages = (elf->pend - elf->pstart + PAGE_SIZE - 1) >> PAGE_SHIFT;
+    elf_errorstatus rc;
+    uint64_t m_start = 0, m_end = 0;
+    int i;
+
+    /* Map address space for initial elf image. */
+    entries = calloc(pages, sizeof(privcmd_mmap_entry_t));
+    if ( entries == NULL )
+        return -ENOMEM;
+
+    for ( i = 0; i < pages; i++ )
+        entries[i].mfn = (elf->pstart >> PAGE_SHIFT) + i;
+
+    elf->dest_base = xc_map_foreign_ranges(
+        dom->xch, dom->guest_domid, pages << PAGE_SHIFT,
+        PROT_READ | PROT_WRITE, 1 << PAGE_SHIFT,
+        entries, pages);
+    if ( elf->dest_base == NULL )
+    {
+        DOMPRINTF("%s: unable to map guest memory space", __FUNCTION__);
+        rc = -EFAULT;
+        goto error;
+    }
+
+    elf->dest_size = pages * XC_DOM_PAGE_SIZE(dom);
+
+    rc = elf_load_binary(elf);
+    if ( rc < 0 )
+    {
+        DOMPRINTF("%s: failed to load elf binary", __FUNCTION__);
+        munmap(elf->dest_base, elf->dest_size);
+        goto error;
+    }
+
+    munmap(elf->dest_base, elf->dest_size);
+
+    rc = modules_init(dom, dom->total_pages << PAGE_SHIFT, elf, &m_start,
+                      &m_end);
+    if ( rc != 0 )
+    {
+        DOMPRINTF("%s: insufficient space to load modules.", __FUNCTION__);
+        goto error;
+    }
+
+    rc = loadmodules(dom, m_start, m_end, dom->guest_domid);
+    if ( rc != 0 )
+    {
+        DOMPRINTF("%s: unable to load modules.", __FUNCTION__);
+        goto error;
+    }
+
+    dom->parms.phys_entry = elf_uval(elf, elf->ehdr, e_entry);
+
+    free(entries);
+    return 0;
+
+error:
+    assert(rc != 0);
+    free(entries);
+    return rc;
+}
+
+/* ------------------------------------------------------------------------ */
+
+struct xc_dom_loader hvm_loader = {
+    .name = "HVM-generic",
+    .probe = xc_dom_probe_hvm_kernel,
+    .parser = xc_dom_parse_hvm_kernel,
+    .loader = xc_dom_load_hvm_kernel,
+};
+
+static void __init register_loader(void)
+{
+    xc_dom_register_loader(&hvm_loader);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/libelf.h b/xen/include/xen/libelf.h
index 6393040..03d449e 100644
--- a/xen/include/xen/libelf.h
+++ b/xen/include/xen/libelf.h
@@ -424,6 +424,7 @@ struct elf_dom_parms {
     uint64_t elf_paddr_offset;
     uint32_t f_supported[XENFEAT_NR_SUBMAPS];
     uint32_t f_required[XENFEAT_NR_SUBMAPS];
+    uint32_t phys_entry;
 
     /* calculated */
     uint64_t virt_offset;
-- 
1.9.5 (Apple Git-50.3)
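[Editor's aside, not part of the patch: a hedged standalone sketch (plain
Python, helper names mine) of the placement arithmetic in modules_init above —
each module length is rounded up to a 128-byte boundary, and the module block
is placed 1MB past the rounded-up end of the loader image.]

```python
# Sketch (helper names invented) of the module placement arithmetic in
# modules_init: lengths are rounded up to MODULE_ALIGN, and the block
# starts one MB_ALIGN unit past the rounded-up end of the ELF image.
MODULE_ALIGN = 1 << 7    # 128-byte alignment between modules
MB_ALIGN = 1 << 20       # 1MB

def mkalign(x, a):
    # Round x up to the next multiple of a (a must be a power of two).
    return (x + a - 1) & ~(a - 1)

def place_modules(elf_pend, acpi_len, smbios_len):
    total = mkalign(acpi_len, MODULE_ALIGN)
    offset1 = total
    total += mkalign(smbios_len, MODULE_ALIGN)
    mstart = mkalign(elf_pend, MB_ALIGN) + MB_ALIGN
    # Returns (module block start, module block end, SMBIOS guest address).
    return mstart, mstart + total, mstart + offset1
```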



* [PATCH RFC v1 06/13] libxc: introduce a xc_dom_arch for hvm-3.0-x86_32 guests
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI Roger Pau Monne
                   ` (4 preceding siblings ...)
  2015-06-22 16:11 ` [PATCH RFC v1 05/13] libxc: introduce a domain loader for HVM guest firmware Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-22 16:11 ` [PATCH RFC v1 07/13] libxl: switch HVM domain building to use xc_dom_* helpers Roger Pau Monne
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

This xc_dom_arch will be used to build HVM domains. The code is based on
the existing xc_hvm_populate_params function.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
This is abusing the alloc_magic_pages hook in order to set up everything,
which is not the best approach, but it works. In later versions I would like
to break alloc_magic_pages_hvm into smaller functions that can be used to
populate the remaining hooks (start_info and shared_info).
---
 tools/libxc/xc_dom_x86.c | 183 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 183 insertions(+)

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 8a1ef24..0d9ec42 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -49,6 +49,20 @@
 
 #define VGA_HOLE_SIZE (0x20)
 
+#define SPECIALPAGE_PAGING   0
+#define SPECIALPAGE_ACCESS   1
+#define SPECIALPAGE_SHARING  2
+#define SPECIALPAGE_BUFIOREQ 3
+#define SPECIALPAGE_XENSTORE 4
+#define SPECIALPAGE_IOREQ    5
+#define SPECIALPAGE_IDENT_PT 6
+#define SPECIALPAGE_CONSOLE  7
+#define NR_SPECIAL_PAGES     8
+#define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))
+
+#define NR_IOREQ_SERVER_PAGES 8
+#define ioreq_server_pfn(x) (special_pfn(0) - NR_IOREQ_SERVER_PAGES + (x))
+
 #define bits_to_mask(bits)       (((xen_vaddr_t)1 << (bits))-1)
 #define round_down(addr, mask)   ((addr) & ~(mask))
 #define round_up(addr, mask)     ((addr) | (mask))
@@ -467,6 +481,135 @@ static int alloc_magic_pages(struct xc_dom_image *dom)
     return 0;
 }
 
+static void build_hvm_info(void *hvm_info_page, struct xc_dom_image *dom)
+{
+    struct hvm_info_table *hvm_info = (struct hvm_info_table *)
+        (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
+    uint8_t sum;
+    int i;
+
+    memset(hvm_info_page, 0, PAGE_SIZE);
+
+    /* Fill in the header. */
+    strncpy(hvm_info->signature, "HVM INFO", 8);
+    hvm_info->length = sizeof(struct hvm_info_table);
+
+    /* Sensible defaults: these can be overridden by the caller. */
+    hvm_info->apic_mode = 1;
+    hvm_info->nr_vcpus = 1;
+    memset(hvm_info->vcpu_online, 0xff, sizeof(hvm_info->vcpu_online));
+
+    /* Memory parameters. */
+    hvm_info->low_mem_pgend = dom->lowmem_end >> PAGE_SHIFT;
+    hvm_info->high_mem_pgend = dom->highmem_end >> PAGE_SHIFT;
+    hvm_info->reserved_mem_pgstart = ioreq_server_pfn(0);
+
+    /* Finish with the checksum. */
+    for ( i = 0, sum = 0; i < hvm_info->length; i++ )
+        sum += ((uint8_t *)hvm_info)[i];
+    hvm_info->checksum = -sum;
+}
+
+static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
+{
+    unsigned long i;
+    void *hvm_info_page;
+    uint32_t *ident_pt, domid = dom->guest_domid;
+    int rc;
+    xen_pfn_t special_array[NR_SPECIAL_PAGES];
+    xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
+    xc_interface *xch = dom->xch;
+
+    if ( (hvm_info_page = xc_map_foreign_range(
+              xch, domid, PAGE_SIZE, PROT_READ | PROT_WRITE,
+              HVM_INFO_PFN)) == NULL )
+        goto error_out;
+    build_hvm_info(hvm_info_page, dom);
+    munmap(hvm_info_page, PAGE_SIZE);
+
+    /* Allocate and clear special pages. */
+    for ( i = 0; i < NR_SPECIAL_PAGES; i++ )
+        special_array[i] = special_pfn(i);
+
+    rc = xc_domain_populate_physmap_exact(xch, domid, NR_SPECIAL_PAGES, 0, 0,
+                                          special_array);
+    if ( rc != 0 )
+    {
+        DOMPRINTF("Could not allocate special pages.");
+        goto error_out;
+    }
+
+    if ( xc_clear_domain_pages(xch, domid, special_pfn(0), NR_SPECIAL_PAGES) )
+            goto error_out;
+
+    xc_hvm_param_set(xch, domid, HVM_PARAM_STORE_PFN,
+                     special_pfn(SPECIALPAGE_XENSTORE));
+    xc_hvm_param_set(xch, domid, HVM_PARAM_BUFIOREQ_PFN,
+                     special_pfn(SPECIALPAGE_BUFIOREQ));
+    xc_hvm_param_set(xch, domid, HVM_PARAM_IOREQ_PFN,
+                     special_pfn(SPECIALPAGE_IOREQ));
+    xc_hvm_param_set(xch, domid, HVM_PARAM_CONSOLE_PFN,
+                     special_pfn(SPECIALPAGE_CONSOLE));
+    xc_hvm_param_set(xch, domid, HVM_PARAM_PAGING_RING_PFN,
+                     special_pfn(SPECIALPAGE_PAGING));
+    xc_hvm_param_set(xch, domid, HVM_PARAM_MONITOR_RING_PFN,
+                     special_pfn(SPECIALPAGE_ACCESS));
+    xc_hvm_param_set(xch, domid, HVM_PARAM_SHARING_RING_PFN,
+                     special_pfn(SPECIALPAGE_SHARING));
+
+    /*
+     * Allocate and clear additional ioreq server pages. The default
+     * server will use the IOREQ and BUFIOREQ special pages above.
+     */
+    for ( i = 0; i < NR_IOREQ_SERVER_PAGES; i++ )
+        ioreq_server_array[i] = ioreq_server_pfn(i);
+
+    rc = xc_domain_populate_physmap_exact(xch, domid, NR_IOREQ_SERVER_PAGES, 0,
+                                          0, ioreq_server_array);
+    if ( rc != 0 )
+    {
+        DOMPRINTF("Could not allocate ioreq server pages.");
+        goto error_out;
+    }
+
+    if ( xc_clear_domain_pages(xch, domid, ioreq_server_pfn(0),
+                               NR_IOREQ_SERVER_PAGES) )
+            goto error_out;
+
+    /* Tell the domain where the pages are and how many there are */
+    xc_hvm_param_set(xch, domid, HVM_PARAM_IOREQ_SERVER_PFN,
+                     ioreq_server_pfn(0));
+    xc_hvm_param_set(xch, domid, HVM_PARAM_NR_IOREQ_SERVER_PAGES,
+                     NR_IOREQ_SERVER_PAGES);
+
+    /*
+     * Identity-map page table is required for running with CR0.PG=0 when
+     * using Intel EPT. Create a 32-bit non-PAE page directory of superpages.
+     */
+    if ( (ident_pt = xc_map_foreign_range(
+              xch, domid, PAGE_SIZE, PROT_READ | PROT_WRITE,
+              special_pfn(SPECIALPAGE_IDENT_PT))) == NULL )
+        goto error_out;
+    for ( i = 0; i < PAGE_SIZE / sizeof(*ident_pt); i++ )
+        ident_pt[i] = ((i << 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
+                       _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
+    munmap(ident_pt, PAGE_SIZE);
+    xc_hvm_param_set(xch, domid, HVM_PARAM_IDENT_PT,
+                     special_pfn(SPECIALPAGE_IDENT_PT) << PAGE_SHIFT);
+
+    dom->console_pfn = special_pfn(SPECIALPAGE_CONSOLE);
+    dom->xenstore_pfn = special_pfn(SPECIALPAGE_XENSTORE);
+    dom->parms.virt_hypercall = -1;
+
+    rc = 0;
+    goto out;
+ error_out:
+    rc = -1;
+ out:
+
+    return rc;
+}
+
 /* ------------------------------------------------------------------------ */
 
 static int start_info_x86_32(struct xc_dom_image *dom)
@@ -674,6 +817,28 @@ static int vcpu_x86_64(struct xc_dom_image *dom, void *ptr)
     return 0;
 }
 
+static int vcpu_hvm(struct xc_dom_image *dom, void *ptr)
+{
+    vcpu_guest_context_x86_64_t *ctxt = ptr;
+
+    DOMPRINTF_CALLED(dom->xch);
+
+    /* clear everything */
+    memset(ctxt, 0, sizeof(*ctxt));
+
+    ctxt->user_regs.ds = FLAT_KERNEL_DS_X86_32;
+    ctxt->user_regs.es = FLAT_KERNEL_DS_X86_32;
+    ctxt->user_regs.fs = FLAT_KERNEL_DS_X86_32;
+    ctxt->user_regs.gs = FLAT_KERNEL_DS_X86_32;
+    ctxt->user_regs.ss = FLAT_KERNEL_SS_X86_32;
+    ctxt->user_regs.cs = FLAT_KERNEL_CS_X86_32;
+    ctxt->user_regs.rip = dom->parms.phys_entry;
+
+    ctxt->flags = VGCF_in_kernel_X86_32 | VGCF_online_X86_32;
+
+    return 0;
+}
+
 /* ------------------------------------------------------------------------ */
 
 static struct xc_dom_arch xc_dom_32_pae = {
@@ -702,10 +867,24 @@ static struct xc_dom_arch xc_dom_64 = {
     .vcpu = vcpu_x86_64,
 };
 
+static struct xc_dom_arch xc_hvm_32 = {
+    .guest_type = "hvm-3.0-x86_32",
+    .native_protocol = XEN_IO_PROTO_ABI_X86_32,
+    .page_shift = PAGE_SHIFT_X86,
+    .sizeof_pfn = 4,
+    .alloc_magic_pages = alloc_magic_pages_hvm,
+    .count_pgtables = NULL,
+    .setup_pgtables = NULL,
+    .start_info = NULL,
+    .shared_info = NULL,
+    .vcpu = vcpu_hvm,
+};
+
 static void __init register_arch_hooks(void)
 {
     xc_dom_register_arch_hooks(&xc_dom_32_pae);
     xc_dom_register_arch_hooks(&xc_dom_64);
+    xc_dom_register_arch_hooks(&xc_hvm_32);
 }
 
 static int x86_compat(xc_interface *xch, domid_t domid, char *guest_type)
@@ -1352,6 +1531,10 @@ int arch_setup_bootlate(struct xc_dom_image *dom)
     xen_pfn_t shinfo;
     int i, rc;
 
+    if ( dom->container_type == XC_DOM_HVM_CONTAINER )
+        /* Nothing to do for HVM-type guests. */
+        return 0;
+
     for ( i = 0; i < ARRAY_SIZE(types); i++ )
         if ( !strcmp(types[i].guest, dom->guest_type) )
             pgd_type = types[i].pgd_type;
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH RFC v1 07/13] libxl: switch HVM domain building to use xc_dom_* helpers
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI Roger Pau Monne
                   ` (5 preceding siblings ...)
  2015-06-22 16:11 ` [PATCH RFC v1 06/13] libxc: introduce a xc_dom_arch for hvm-3.0-x86_32 guests Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-22 16:11 ` [PATCH RFC v1 08/13] libxc: remove dead x86 HVM code Roger Pau Monne
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

Now that all the code is in place, HVM domain building in libxl can be
switched to use the xc_dom_* family of functions, just as they are used to
build PV guests.
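
The unit conversions in the diff below are easy to get wrong, so here they are in isolation: configuration values arrive in KiB, mem_size is kept in bytes (<< 10), and target_pages is a count of 4 KiB pages (>> 2). This is a sketch with XC_PAGE_SHIFT hard-coded to libxc's 4 KiB page size:

```c
#include <assert.h>
#include <stdint.h>

#define XC_PAGE_SHIFT 12  /* 4 KiB pages, as in libxc */

/* KiB -> bytes, as used for mem_size in libxl__build_hvm. */
static uint64_t kb_to_bytes(uint64_t kb) { return kb << 10; }

/* KiB -> 4 KiB page count, as used for dom->target_pages. */
static uint64_t kb_to_pages(uint64_t kb) { return kb >> 2; }
```

Both paths must agree: shifting the byte count down by XC_PAGE_SHIFT has to yield the same page count as the direct KiB conversion.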

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
TBH HVM building still avoids calling some xc_dom_* functions that are
used when building PV guests. IMHO we should aim for the exact same flow
when building PV and HVM domains, so that libxl__build_pv and
libxl__build_hvm can be unified into a single function.
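
The xc_dom_* call sequence that replaces xc_hvm_build can be sketched as below. These trailing-underscore functions are stand-ins that only record call order (the real ones live in libxc); error handling and the vNUMA branch are elided:

```c
#include <assert.h>
#include <string.h>

static const char *calls[8];
static int ncalls;

static int record(const char *name) { calls[ncalls++] = name; return 0; }

/* Stubs mirroring the libxc entry points used in libxl__build_hvm. */
static int xc_dom_boot_xen_init_(void) { return record("boot_xen_init"); }
static int xc_dom_parse_image_(void)   { return record("parse_image"); }
static int xc_dom_mem_init_(void)      { return record("mem_init"); }
static int xc_dom_boot_mem_init_(void) { return record("boot_mem_init"); }
static int xc_dom_build_image_(void)   { return record("build_image"); }
static int xc_dom_boot_image_(void)    { return record("boot_image"); }

/* The order matters: the image must be parsed before memory is sized,
 * and guest memory must be populated before the image is copied in. */
static int build_hvm_sketch(void)
{
    if ( xc_dom_boot_xen_init_() ) return -1;
    if ( xc_dom_parse_image_() )   return -1;
    if ( xc_dom_mem_init_() )      return -1;
    if ( xc_dom_boot_mem_init_() ) return -1;
    if ( xc_dom_build_image_() )   return -1;
    if ( xc_dom_boot_image_() )    return -1;
    return 0;
}
```

This is the same ordering libxl__build_pv follows, which is what makes eventual unification of the two paths plausible.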
---
 tools/libxl/libxl_dom.c      | 146 ++++++++++++++++++++++++++++---------------
 tools/libxl/libxl_internal.h |   2 +-
 tools/libxl/libxl_vnuma.c    |  12 ++--
 3 files changed, 105 insertions(+), 55 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 6273052..12ede26 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -801,39 +801,39 @@ static int hvm_build_set_params(xc_interface *handle, uint32_t domid,
 
 static int hvm_build_set_xs_values(libxl__gc *gc,
                                    uint32_t domid,
-                                   struct xc_hvm_build_args *args)
+                                   struct xc_dom_image *dom)
 {
     char *path = NULL;
     int ret = 0;
 
-    if (args->smbios_module.guest_addr_out) {
+    if (dom->smbios_module.guest_addr_out) {
         path = GCSPRINTF("/local/domain/%d/"HVM_XS_SMBIOS_PT_ADDRESS, domid);
 
         ret = libxl__xs_write(gc, XBT_NULL, path, "0x%"PRIx64,
-                              args->smbios_module.guest_addr_out);
+                              dom->smbios_module.guest_addr_out);
         if (ret)
             goto err;
 
         path = GCSPRINTF("/local/domain/%d/"HVM_XS_SMBIOS_PT_LENGTH, domid);
 
         ret = libxl__xs_write(gc, XBT_NULL, path, "0x%x",
-                              args->smbios_module.length);
+                              dom->smbios_module.length);
         if (ret)
             goto err;
     }
 
-    if (args->acpi_module.guest_addr_out) {
+    if (dom->acpi_module.guest_addr_out) {
         path = GCSPRINTF("/local/domain/%d/"HVM_XS_ACPI_PT_ADDRESS, domid);
 
         ret = libxl__xs_write(gc, XBT_NULL, path, "0x%"PRIx64,
-                              args->acpi_module.guest_addr_out);
+                              dom->acpi_module.guest_addr_out);
         if (ret)
             goto err;
 
         path = GCSPRINTF("/local/domain/%d/"HVM_XS_ACPI_PT_LENGTH, domid);
 
         ret = libxl__xs_write(gc, XBT_NULL, path, "0x%x",
-                              args->acpi_module.length);
+                              dom->acpi_module.length);
         if (ret)
             goto err;
     }
@@ -847,7 +847,7 @@ err:
 
 static int libxl__domain_firmware(libxl__gc *gc,
                                   libxl_domain_build_info *info,
-                                  struct xc_hvm_build_args *args)
+                                  struct xc_dom_image *dom)
 {
     libxl_ctx *ctx = libxl__gc_owner(gc);
     const char *firmware;
@@ -873,8 +873,13 @@ static int libxl__domain_firmware(libxl__gc *gc,
             break;
         }
     }
-    args->image_file_name = libxl__abs_path(gc, firmware,
-                                            libxl__xenfirmwaredir_path());
+
+    rc = xc_dom_kernel_file(dom, libxl__abs_path(gc, firmware,
+                                                 libxl__xenfirmwaredir_path()));
+    if (rc != 0) {
+        LOGE(ERROR, "xc_dom_kernel_file failed");
+        goto out;
+    }
 
     if (info->u.hvm.smbios_firmware) {
         data = NULL;
@@ -888,8 +893,8 @@ static int libxl__domain_firmware(libxl__gc *gc,
         libxl__ptr_add(gc, data);
         if (datalen) {
             /* Only accept non-empty files */
-            args->smbios_module.data = data;
-            args->smbios_module.length = (uint32_t)datalen;
+            dom->smbios_module.data = data;
+            dom->smbios_module.length = (uint32_t)datalen;
         }
     }
 
@@ -905,8 +910,8 @@ static int libxl__domain_firmware(libxl__gc *gc,
         libxl__ptr_add(gc, data);
         if (datalen) {
             /* Only accept non-empty files */
-            args->acpi_module.data = data;
-            args->acpi_module.length = (uint32_t)datalen;
+            dom->acpi_module.data = data;
+            dom->acpi_module.length = (uint32_t)datalen;
         }
     }
 
@@ -920,50 +925,61 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
               libxl__domain_build_state *state)
 {
     libxl_ctx *ctx = libxl__gc_owner(gc);
-    struct xc_hvm_build_args args = {};
     int ret, rc = ERROR_FAIL;
-    uint64_t mmio_start, lowmem_end, highmem_end;
+    uint64_t mmio_start, lowmem_end, highmem_end, mem_size;
+    struct xc_dom_image *dom = NULL;
+
+    xc_dom_loginit(ctx->xch);
+
+    dom = xc_dom_allocate(ctx->xch, NULL, NULL);
+    if (!dom) {
+        LOGE(ERROR, "xc_dom_allocate failed");
+        goto out;
+    }
+
+    dom->container_type = XC_DOM_HVM_CONTAINER;
 
-    memset(&args, 0, sizeof(struct xc_hvm_build_args));
     /* The params from the configuration file are in Mb, which are then
      * multiplied by 1 Kb. This was then divided off when calling
      * the old xc_hvm_build_target_mem() which then turned them to bytes.
      * Do all this in one step here...
      */
-    args.mem_size = (uint64_t)(info->max_memkb - info->video_memkb) << 10;
-    args.mem_target = (uint64_t)(info->target_memkb - info->video_memkb) << 10;
-    args.claim_enabled = libxl_defbool_val(info->claim_mode);
+    mem_size = (uint64_t)(info->max_memkb - info->video_memkb) << 10;
+    dom->target_pages = (uint64_t)(info->target_memkb - info->video_memkb) >> 2;
+    dom->claim_enabled = libxl_defbool_val(info->claim_mode);
     if (info->u.hvm.mmio_hole_memkb) {
         uint64_t max_ram_below_4g = (1ULL << 32) -
             (info->u.hvm.mmio_hole_memkb << 10);
 
         if (max_ram_below_4g < HVM_BELOW_4G_MMIO_START)
-            args.mmio_size = info->u.hvm.mmio_hole_memkb << 10;
+            dom->mmio_size = info->u.hvm.mmio_hole_memkb << 10;
     }
-    if (libxl__domain_firmware(gc, info, &args)) {
+    if (libxl__domain_firmware(gc, info, dom)) {
         LOG(ERROR, "initializing domain firmware failed");
         goto out;
     }
-    if (args.mem_target == 0)
-        args.mem_target = args.mem_size;
-    if (args.mmio_size == 0)
-        args.mmio_size = HVM_BELOW_4G_MMIO_LENGTH;
-    lowmem_end = args.mem_size;
+
+    if (dom->target_pages == 0)
+        dom->target_pages = mem_size >> XC_PAGE_SHIFT;
+    if (dom->mmio_size == 0)
+        dom->mmio_size = HVM_BELOW_4G_MMIO_LENGTH;
+    lowmem_end = mem_size;
     highmem_end = 0;
-    mmio_start = (1ull << 32) - args.mmio_size;
+    mmio_start = (1ull << 32) - dom->mmio_size;
     if (lowmem_end > mmio_start)
     {
         highmem_end = (1ull << 32) + (lowmem_end - mmio_start);
         lowmem_end = mmio_start;
     }
-    args.lowmem_end = lowmem_end;
-    args.highmem_end = highmem_end;
-    args.mmio_start = mmio_start;
+    dom->lowmem_end = lowmem_end;
+    dom->highmem_end = highmem_end;
+    dom->mmio_start = mmio_start;
+    dom->vga_hole = 1;
 
     if (info->num_vnuma_nodes != 0) {
         int i;
 
-        ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, &args);
+        ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, dom);
         if (ret) {
             LOGEV(ERROR, ret, "hvm build vmemranges failed");
             goto out;
@@ -973,27 +989,57 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
         ret = set_vnuma_info(gc, domid, info, state);
         if (ret) goto out;
 
-        args.nr_vmemranges = state->num_vmemranges;
-        args.vmemranges = libxl__malloc(gc, sizeof(*args.vmemranges) *
-                                        args.nr_vmemranges);
+        dom->nr_vmemranges = state->num_vmemranges;
+        dom->vmemranges = libxl__malloc(gc, sizeof(*dom->vmemranges) *
+                                        dom->nr_vmemranges);
 
-        for (i = 0; i < args.nr_vmemranges; i++) {
-            args.vmemranges[i].start = state->vmemranges[i].start;
-            args.vmemranges[i].end   = state->vmemranges[i].end;
-            args.vmemranges[i].flags = state->vmemranges[i].flags;
-            args.vmemranges[i].nid   = state->vmemranges[i].nid;
+        for (i = 0; i < dom->nr_vmemranges; i++) {
+            dom->vmemranges[i].start = state->vmemranges[i].start;
+            dom->vmemranges[i].end   = state->vmemranges[i].end;
+            dom->vmemranges[i].flags = state->vmemranges[i].flags;
+            dom->vmemranges[i].nid   = state->vmemranges[i].nid;
         }
 
-        args.nr_vnodes = info->num_vnuma_nodes;
-        args.vnode_to_pnode = libxl__malloc(gc, sizeof(*args.vnode_to_pnode) *
-                                            args.nr_vnodes);
-        for (i = 0; i < args.nr_vnodes; i++)
-            args.vnode_to_pnode[i] = info->vnuma_nodes[i].pnode;
+        dom->nr_vnodes = info->num_vnuma_nodes;
+        dom->vnode_to_pnode = libxl__malloc(gc, sizeof(*dom->vnode_to_pnode) *
+                                            dom->nr_vnodes);
+        for (i = 0; i < dom->nr_vnodes; i++)
+            dom->vnode_to_pnode[i] = info->vnuma_nodes[i].pnode;
+    }
+
+    rc = xc_dom_boot_xen_init(dom, ctx->xch, domid);
+    if (rc != 0) {
+        LOGE(ERROR, "xc_dom_boot_xen_init failed");
+        goto out;
     }
 
-    ret = xc_hvm_build(ctx->xch, domid, &args);
-    if (ret) {
-        LOGEV(ERROR, ret, "hvm building failed");
+    rc = xc_dom_parse_image(dom);
+    if (rc != 0) {
+        LOGE(ERROR, "xc_dom_parse_image failed");
+        goto out;
+    }
+
+    rc = xc_dom_mem_init(dom, mem_size / (1024 * 1024));
+    if (rc != 0) {
+        LOGE(ERROR, "xc_dom_mem_init failed");
+        goto out;
+    }
+
+    rc = xc_dom_boot_mem_init(dom);
+    if (rc != 0) {
+        LOGE(ERROR, "xc_dom_boot_mem_init failed");
+        goto out;
+    }
+
+    rc = xc_dom_build_image(dom);
+    if (rc != 0) {
+        LOGE(ERROR, "xc_dom_build_image failed");
+        goto out;
+    }
+
+    rc = xc_dom_boot_image(dom);
+    if (rc != 0) {
+        LOGE(ERROR, "xc_dom_boot_image failed");
         goto out;
     }
 
@@ -1006,14 +1052,16 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
         goto out;
     }
 
-    ret = hvm_build_set_xs_values(gc, domid, &args);
+    ret = hvm_build_set_xs_values(gc, domid, dom);
     if (ret) {
         LOG(ERROR, "hvm build set xenstore values failed (ret=%d)", ret);
         goto out;
     }
 
+    xc_dom_release(dom);
     return 0;
 out:
+    if (dom != NULL) xc_dom_release(dom);
     return rc;
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e96d6b5..71b3714 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3461,7 +3461,7 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
                                      uint32_t domid,
                                      libxl_domain_build_info *b_info,
                                      libxl__domain_build_state *state,
-                                     struct xc_hvm_build_args *args);
+                                     struct xc_dom_image *dom);
 bool libxl__vnuma_configured(const libxl_domain_build_info *b_info);
 
 _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
index 56856d2..db22799 100644
--- a/tools/libxl/libxl_vnuma.c
+++ b/tools/libxl/libxl_vnuma.c
@@ -17,6 +17,8 @@
 #include "libxl_arch.h"
 #include <stdlib.h>
 
+#include <xc_dom.h>
+
 bool libxl__vnuma_configured(const libxl_domain_build_info *b_info)
 {
     return b_info->num_vnuma_nodes != 0;
@@ -252,7 +254,7 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
                                      uint32_t domid,
                                      libxl_domain_build_info *b_info,
                                      libxl__domain_build_state *state,
-                                     struct xc_hvm_build_args *args)
+                                     struct xc_dom_image *dom)
 {
     uint64_t hole_start, hole_end, next;
     int nid, nr_vmemrange;
@@ -264,10 +266,10 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
      * Guest physical address space layout:
      * [0, hole_start) [hole_start, hole_end) [hole_end, highmem_end)
      */
-    hole_start = args->lowmem_end < args->mmio_start ?
-        args->lowmem_end : args->mmio_start;
-    hole_end = (args->mmio_start + args->mmio_size) > (1ULL << 32) ?
-        (args->mmio_start + args->mmio_size) : (1ULL << 32);
+    hole_start = dom->lowmem_end < dom->mmio_start ?
+        dom->lowmem_end : dom->mmio_start;
+    hole_end = (dom->mmio_start + dom->mmio_size) > (1ULL << 32) ?
+        (dom->mmio_start + dom->mmio_size) : (1ULL << 32);
 
     assert(state->vmemranges == NULL);
 
-- 
1.9.5 (Apple Git-50.3)



* [PATCH RFC v1 08/13] libxc: remove dead x86 HVM code
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI Roger Pau Monne
                   ` (6 preceding siblings ...)
  2015-06-22 16:11 ` [PATCH RFC v1 07/13] libxl: switch HVM domain building to use xc_dom_* helpers Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-22 16:11 ` [PATCH RFC v1 09/13] elfnotes: introduce a new PHYS_ENTRY elfnote Roger Pau Monne
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

Remove xc_hvm_build_x86.c since xc_hvm_build is no longer used to create
HVM guests.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
 tools/libxc/Makefile              |   1 -
 tools/libxc/include/xenguest.h    |  44 ---
 tools/libxc/xc_hvm_build_x86.c    | 805 --------------------------------------
 tools/libxc/xg_private.c          |   9 -
 tools/python/xen/lowlevel/xc/xc.c |  81 ----
 5 files changed, 940 deletions(-)
 delete mode 100644 tools/libxc/xc_hvm_build_x86.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 7f860d7..c738c46 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -93,7 +93,6 @@ GUEST_SRCS-y                 += xc_dom_compat_linux.c
 
 GUEST_SRCS-$(CONFIG_X86)     += xc_dom_x86.c
 GUEST_SRCS-$(CONFIG_X86)     += xc_cpuid_x86.c
-GUEST_SRCS-$(CONFIG_X86)     += xc_hvm_build_x86.c
 GUEST_SRCS-$(CONFIG_ARM)     += xc_dom_arm.c
 GUEST_SRCS-$(CONFIG_ARM)     += xc_hvm_build_arm.c
 
diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 7581263..d96bf7d 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -233,50 +233,6 @@ struct xc_hvm_firmware_module {
     uint64_t  guest_addr_out;
 };
 
-struct xc_hvm_build_args {
-    uint64_t mem_size;           /* Memory size in bytes. */
-    uint64_t mem_target;         /* Memory target in bytes. */
-    uint64_t mmio_size;          /* Size of the MMIO hole in bytes. */
-    const char *image_file_name; /* File name of the image to load. */
-
-    /* Extra ACPI tables passed to HVMLOADER */
-    struct xc_hvm_firmware_module acpi_module;
-
-    /* Extra SMBIOS structures passed to HVMLOADER */
-    struct xc_hvm_firmware_module smbios_module;
-    /* Whether to use claim hypercall (1 - enable, 0 - disable). */
-    int claim_enabled;
-
-    /* vNUMA information*/
-    xen_vmemrange_t *vmemranges;
-    unsigned int nr_vmemranges;
-    unsigned int *vnode_to_pnode;
-    unsigned int nr_vnodes;
-
-    /* Out parameters  */
-    uint64_t lowmem_end;
-    uint64_t highmem_end;
-    uint64_t mmio_start;
-};
-
-/**
- * Build a HVM domain.
- * @parm xch      libxc context handle.
- * @parm domid    domain ID for the new domain.
- * @parm hvm_args parameters for the new domain.
- *
- * The memory size and image file parameters are required, the rest
- * are optional.
- */
-int xc_hvm_build(xc_interface *xch, uint32_t domid,
-                 struct xc_hvm_build_args *hvm_args);
-
-int xc_hvm_build_target_mem(xc_interface *xch,
-                            uint32_t domid,
-                            int memsize,
-                            int target,
-                            const char *image_name);
-
 /*
  * Sets *lockfd to -1.
  * Has deallocated everything even on error.
diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
deleted file mode 100644
index f7616a8..0000000
--- a/tools/libxc/xc_hvm_build_x86.c
+++ /dev/null
@@ -1,805 +0,0 @@
-/******************************************************************************
- * xc_hvm_build.c
- *
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation;
- * version 2.1 of the License.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
- */
-
-#include <stddef.h>
-#include <inttypes.h>
-#include <stdlib.h>
-#include <unistd.h>
-#include <zlib.h>
-
-#include "xg_private.h"
-#include "xc_private.h"
-
-#include <xen/foreign/x86_32.h>
-#include <xen/foreign/x86_64.h>
-#include <xen/hvm/hvm_info_table.h>
-#include <xen/hvm/params.h>
-#include <xen/hvm/e820.h>
-
-#include <xen/libelf/libelf.h>
-
-#define SUPERPAGE_2MB_SHIFT   9
-#define SUPERPAGE_2MB_NR_PFNS (1UL << SUPERPAGE_2MB_SHIFT)
-#define SUPERPAGE_1GB_SHIFT   18
-#define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)
-
-#define SPECIALPAGE_PAGING   0
-#define SPECIALPAGE_ACCESS   1
-#define SPECIALPAGE_SHARING  2
-#define SPECIALPAGE_BUFIOREQ 3
-#define SPECIALPAGE_XENSTORE 4
-#define SPECIALPAGE_IOREQ    5
-#define SPECIALPAGE_IDENT_PT 6
-#define SPECIALPAGE_CONSOLE  7
-#define NR_SPECIAL_PAGES     8
-#define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))
-
-#define NR_IOREQ_SERVER_PAGES 8
-#define ioreq_server_pfn(x) (special_pfn(0) - NR_IOREQ_SERVER_PAGES + (x))
-
-#define VGA_HOLE_SIZE (0x20)
-
-static int modules_init(struct xc_hvm_build_args *args,
-                        uint64_t vend, struct elf_binary *elf,
-                        uint64_t *mstart_out, uint64_t *mend_out)
-{
-#define MODULE_ALIGN 1UL << 7
-#define MB_ALIGN     1UL << 20
-#define MKALIGN(x, a) (((uint64_t)(x) + (a) - 1) & ~(uint64_t)((a) - 1))
-    uint64_t total_len = 0, offset1 = 0;
-
-    if ( (args->acpi_module.length == 0)&&(args->smbios_module.length == 0) )
-        return 0;
-
-    /* Find the total length for the firmware modules with a reasonable large
-     * alignment size to align each the modules.
-     */
-    total_len = MKALIGN(args->acpi_module.length, MODULE_ALIGN);
-    offset1 = total_len;
-    total_len += MKALIGN(args->smbios_module.length, MODULE_ALIGN);
-
-    /* Want to place the modules 1Mb+change behind the loader image. */
-    *mstart_out = MKALIGN(elf->pend, MB_ALIGN) + (MB_ALIGN);
-    *mend_out = *mstart_out + total_len;
-
-    if ( *mend_out > vend )    
-        return -1;
-
-    if ( args->acpi_module.length != 0 )
-        args->acpi_module.guest_addr_out = *mstart_out;
-    if ( args->smbios_module.length != 0 )
-        args->smbios_module.guest_addr_out = *mstart_out + offset1;
-
-    return 0;
-}
-
-static void build_hvm_info(void *hvm_info_page,
-                           struct xc_hvm_build_args *args)
-{
-    struct hvm_info_table *hvm_info = (struct hvm_info_table *)
-        (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
-    uint8_t sum;
-    int i;
-
-    memset(hvm_info_page, 0, PAGE_SIZE);
-
-    /* Fill in the header. */
-    strncpy(hvm_info->signature, "HVM INFO", 8);
-    hvm_info->length = sizeof(struct hvm_info_table);
-
-    /* Sensible defaults: these can be overridden by the caller. */
-    hvm_info->apic_mode = 1;
-    hvm_info->nr_vcpus = 1;
-    memset(hvm_info->vcpu_online, 0xff, sizeof(hvm_info->vcpu_online));
-
-    /* Memory parameters. */
-    hvm_info->low_mem_pgend = args->lowmem_end >> PAGE_SHIFT;
-    hvm_info->high_mem_pgend = args->highmem_end >> PAGE_SHIFT;
-    hvm_info->reserved_mem_pgstart = ioreq_server_pfn(0);
-
-    /* Finish with the checksum. */
-    for ( i = 0, sum = 0; i < hvm_info->length; i++ )
-        sum += ((uint8_t *)hvm_info)[i];
-    hvm_info->checksum = -sum;
-}
-
-static int loadelfimage(xc_interface *xch, struct elf_binary *elf,
-                        uint32_t dom, unsigned long *parray)
-{
-    privcmd_mmap_entry_t *entries = NULL;
-    unsigned long pfn_start = elf->pstart >> PAGE_SHIFT;
-    unsigned long pfn_end = (elf->pend + PAGE_SIZE - 1) >> PAGE_SHIFT;
-    size_t pages = pfn_end - pfn_start;
-    int i, rc = -1;
-
-    /* Map address space for initial elf image. */
-    entries = calloc(pages, sizeof(privcmd_mmap_entry_t));
-    if ( entries == NULL )
-        goto err;
-
-    for ( i = 0; i < pages; i++ )
-        entries[i].mfn = parray[(elf->pstart >> PAGE_SHIFT) + i];
-
-    elf->dest_base = xc_map_foreign_ranges(
-        xch, dom, pages << PAGE_SHIFT, PROT_READ | PROT_WRITE, 1 << PAGE_SHIFT,
-        entries, pages);
-    if ( elf->dest_base == NULL )
-        goto err;
-    elf->dest_size = pages * PAGE_SIZE;
-
-    ELF_ADVANCE_DEST(elf, elf->pstart & (PAGE_SIZE - 1));
-
-    /* Load the initial elf image. */
-    rc = elf_load_binary(elf);
-    if ( rc < 0 )
-        PERROR("Failed to load elf binary\n");
-
-    munmap(elf->dest_base, pages << PAGE_SHIFT);
-    elf->dest_base = NULL;
-    elf->dest_size = 0;
-
- err:
-    free(entries);
-
-    return rc;
-}
-
-static int loadmodules(xc_interface *xch,
-                       struct xc_hvm_build_args *args,
-                       uint64_t mstart, uint64_t mend,
-                       uint32_t dom, unsigned long *parray)
-{
-    privcmd_mmap_entry_t *entries = NULL;
-    unsigned long pfn_start;
-    unsigned long pfn_end;
-    size_t pages;
-    uint32_t i;
-    uint8_t *dest;
-    int rc = -1;
-
-    if ( (mstart == 0)||(mend == 0) )
-        return 0;
-
-    pfn_start = (unsigned long)(mstart >> PAGE_SHIFT);
-    pfn_end = (unsigned long)((mend + PAGE_SIZE - 1) >> PAGE_SHIFT);
-    pages = pfn_end - pfn_start;
-
-    /* Map address space for module list. */
-    entries = calloc(pages, sizeof(privcmd_mmap_entry_t));
-    if ( entries == NULL )
-        goto error_out;
-
-    for ( i = 0; i < pages; i++ )
-        entries[i].mfn = parray[(mstart >> PAGE_SHIFT) + i];
-
-    dest = xc_map_foreign_ranges(
-        xch, dom, pages << PAGE_SHIFT, PROT_READ | PROT_WRITE, 1 << PAGE_SHIFT,
-        entries, pages);
-    if ( dest == NULL )
-        goto error_out;
-
-    /* Zero the range so padding is clear between modules */
-    memset(dest, 0, pages << PAGE_SHIFT);
-
-    /* Load modules into range */    
-    if ( args->acpi_module.length != 0 )
-    {
-        memcpy(dest,
-               args->acpi_module.data,
-               args->acpi_module.length);
-    }
-    if ( args->smbios_module.length != 0 )
-    {
-        memcpy(dest + (args->smbios_module.guest_addr_out - mstart),
-               args->smbios_module.data,
-               args->smbios_module.length);
-    }
-
-    munmap(dest, pages << PAGE_SHIFT);
-    rc = 0;
-
- error_out:
-    free(entries);
-
-    return rc;
-}
-
-/*
- * Check whether there exists mmio hole in the specified memory range.
- * Returns 1 if exists, else returns 0.
- */
-static int check_mmio_hole(uint64_t start, uint64_t memsize,
-                           uint64_t mmio_start, uint64_t mmio_size)
-{
-    if ( start + memsize <= mmio_start || start >= mmio_start + mmio_size )
-        return 0;
-    else
-        return 1;
-}
-
-static int xc_hvm_populate_memory(xc_interface *xch, uint32_t dom,
-                                  struct xc_hvm_build_args *args,
-                                  xen_pfn_t *page_array)
-{
-    unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT;
-    unsigned long p2m_size;
-    unsigned long target_pages = args->mem_target >> PAGE_SHIFT;
-    unsigned long cur_pages, cur_pfn;
-    int rc;
-    xen_capabilities_info_t caps;
-    unsigned long stat_normal_pages = 0, stat_2mb_pages = 0, 
-        stat_1gb_pages = 0;
-    unsigned int memflags = 0;
-    int claim_enabled = args->claim_enabled;
-    uint64_t total_pages;
-    xen_vmemrange_t dummy_vmemrange[2];
-    unsigned int dummy_vnode_to_pnode[1];
-    xen_vmemrange_t *vmemranges;
-    unsigned int *vnode_to_pnode;
-    unsigned int nr_vmemranges, nr_vnodes;
-
-    if ( nr_pages > target_pages )
-        memflags |= XENMEMF_populate_on_demand;
-
-    if ( args->nr_vmemranges == 0 )
-    {
-        /* Build dummy vnode information
-         *
-         * Guest physical address space layout:
-         * [0, hole_start) [hole_start, 4G) [4G, highmem_end)
-         *
-         * Of course if there is no high memory, the second vmemrange
-         * has no effect on the actual result.
-         */
-
-        dummy_vmemrange[0].start = 0;
-        dummy_vmemrange[0].end   = args->lowmem_end;
-        dummy_vmemrange[0].flags = 0;
-        dummy_vmemrange[0].nid   = 0;
-        nr_vmemranges = 1;
-
-        if ( args->highmem_end > (1ULL << 32) )
-        {
-            dummy_vmemrange[1].start = 1ULL << 32;
-            dummy_vmemrange[1].end   = args->highmem_end;
-            dummy_vmemrange[1].flags = 0;
-            dummy_vmemrange[1].nid   = 0;
-
-            nr_vmemranges++;
-        }
-
-        dummy_vnode_to_pnode[0] = XC_NUMA_NO_NODE;
-        nr_vnodes = 1;
-        vmemranges = dummy_vmemrange;
-        vnode_to_pnode = dummy_vnode_to_pnode;
-    }
-    else
-    {
-        if ( nr_pages > target_pages )
-        {
-            PERROR("Cannot enable vNUMA and PoD at the same time");
-            goto error_out;
-        }
-
-        nr_vmemranges = args->nr_vmemranges;
-        nr_vnodes = args->nr_vnodes;
-        vmemranges = args->vmemranges;
-        vnode_to_pnode = args->vnode_to_pnode;
-    }
-
-    total_pages = 0;
-    p2m_size = 0;
-    for ( i = 0; i < nr_vmemranges; i++ )
-    {
-        total_pages += ((vmemranges[i].end - vmemranges[i].start)
-                        >> PAGE_SHIFT);
-        p2m_size = p2m_size > (vmemranges[i].end >> PAGE_SHIFT) ?
-            p2m_size : (vmemranges[i].end >> PAGE_SHIFT);
-    }
-
-    if ( total_pages != (args->mem_size >> PAGE_SHIFT) )
-    {
-        PERROR("vNUMA memory pages mismatch (0x%"PRIx64" != 0x%"PRIx64")",
-               total_pages, args->mem_size >> PAGE_SHIFT);
-        goto error_out;
-    }
-
-    if ( xc_version(xch, XENVER_capabilities, &caps) != 0 )
-    {
-        PERROR("Could not get Xen capabilities");
-        goto error_out;
-    }
-
-    for ( i = 0; i < p2m_size; i++ )
-        page_array[i] = ((xen_pfn_t)-1);
-    for ( vmemid = 0; vmemid < nr_vmemranges; vmemid++ )
-    {
-        uint64_t pfn;
-
-        for ( pfn = vmemranges[vmemid].start >> PAGE_SHIFT;
-              pfn < vmemranges[vmemid].end >> PAGE_SHIFT;
-              pfn++ )
-            page_array[pfn] = pfn;
-    }
-
-    /*
-     * Try to claim pages for early warning of insufficient memory available.
-     * This should go before xc_domain_set_pod_target, becuase that function
-     * actually allocates memory for the guest. Claiming after memory has been
-     * allocated is pointless.
-     */
-    if ( claim_enabled ) {
-        rc = xc_domain_claim_pages(xch, dom, target_pages - VGA_HOLE_SIZE);
-        if ( rc != 0 )
-        {
-            PERROR("Could not allocate memory for HVM guest as we cannot claim memory!");
-            goto error_out;
-        }
-    }
-
-    if ( memflags & XENMEMF_populate_on_demand )
-    {
-        /*
-         * Subtract VGA_HOLE_SIZE from target_pages for the VGA
-         * "hole".  Xen will adjust the PoD cache size so that domain
-         * tot_pages will be target_pages - VGA_HOLE_SIZE after
-         * this call.
-         */
-        rc = xc_domain_set_pod_target(xch, dom,
-                                      target_pages - VGA_HOLE_SIZE,
-                                      NULL, NULL, NULL);
-        if ( rc != 0 )
-        {
-            PERROR("Could not set PoD target for HVM guest.\n");
-            goto error_out;
-        }
-    }
-
-    /*
-     * Allocate memory for HVM guest, skipping VGA hole 0xA0000-0xC0000.
-     *
-     * We attempt to allocate 1GB pages if possible. It falls back on 2MB
-     * pages if 1GB allocation fails. 4KB pages will be used eventually if
-     * both fail.
-     * 
-     * Under 2MB mode, we allocate pages in batches of no more than 8MB to 
-     * ensure that we can be preempted and hence dom0 remains responsive.
-     */
-    rc = xc_domain_populate_physmap_exact(
-        xch, dom, 0xa0, 0, memflags, &page_array[0x00]);
-
-    stat_normal_pages = 0;
-    for ( vmemid = 0; vmemid < nr_vmemranges; vmemid++ )
-    {
-        unsigned int new_memflags = memflags;
-        uint64_t end_pages;
-        unsigned int vnode = vmemranges[vmemid].nid;
-        unsigned int pnode = vnode_to_pnode[vnode];
-
-        if ( pnode != XC_NUMA_NO_NODE )
-            new_memflags |= XENMEMF_exact_node(pnode);
-
-        end_pages = vmemranges[vmemid].end >> PAGE_SHIFT;
-        /*
-         * Consider vga hole belongs to the vmemrange that covers
-         * 0xA0000-0xC0000. Note that 0x00000-0xA0000 is populated just
-         * before this loop.
-         */
-        if ( vmemranges[vmemid].start == 0 )
-        {
-            cur_pages = 0xc0;
-            stat_normal_pages += 0xc0;
-        }
-        else
-            cur_pages = vmemranges[vmemid].start >> PAGE_SHIFT;
-
-        while ( (rc == 0) && (end_pages > cur_pages) )
-        {
-            /* Clip count to maximum 1GB extent. */
-            unsigned long count = end_pages - cur_pages;
-            unsigned long max_pages = SUPERPAGE_1GB_NR_PFNS;
-
-            if ( count > max_pages )
-                count = max_pages;
-
-            cur_pfn = page_array[cur_pages];
-
-            /* Take care the corner cases of super page tails */
-            if ( ((cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
-                 (count > (-cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1))) )
-                count = -cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1);
-            else if ( ((count & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
-                      (count > SUPERPAGE_1GB_NR_PFNS) )
-                count &= ~(SUPERPAGE_1GB_NR_PFNS - 1);
-
-            /* Attemp to allocate 1GB super page. Because in each pass
-             * we only allocate at most 1GB, we don't have to clip
-             * super page boundaries.
-             */
-            if ( ((count | cur_pfn) & (SUPERPAGE_1GB_NR_PFNS - 1)) == 0 &&
-                 /* Check if there exists MMIO hole in the 1GB memory
-                  * range */
-                 !check_mmio_hole(cur_pfn << PAGE_SHIFT,
-                                  SUPERPAGE_1GB_NR_PFNS << PAGE_SHIFT,
-                                  args->mmio_start, args->mmio_size) )
-            {
-                long done;
-                unsigned long nr_extents = count >> SUPERPAGE_1GB_SHIFT;
-                xen_pfn_t sp_extents[nr_extents];
-
-                for ( i = 0; i < nr_extents; i++ )
-                    sp_extents[i] =
-                        page_array[cur_pages+(i<<SUPERPAGE_1GB_SHIFT)];
-
-                done = xc_domain_populate_physmap(xch, dom, nr_extents,
-                                                  SUPERPAGE_1GB_SHIFT,
-                                                  memflags, sp_extents);
-
-                if ( done > 0 )
-                {
-                    stat_1gb_pages += done;
-                    done <<= SUPERPAGE_1GB_SHIFT;
-                    cur_pages += done;
-                    count -= done;
-                }
-            }
-
-            if ( count != 0 )
-            {
-                /* Clip count to maximum 8MB extent. */
-                max_pages = SUPERPAGE_2MB_NR_PFNS * 4;
-                if ( count > max_pages )
-                    count = max_pages;
-
-                /* Clip partial superpage extents to superpage
-                 * boundaries. */
-                if ( ((cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
-                     (count > (-cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1))) )
-                    count = -cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1);
-                else if ( ((count & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
-                          (count > SUPERPAGE_2MB_NR_PFNS) )
-                    count &= ~(SUPERPAGE_2MB_NR_PFNS - 1); /* clip non-s.p. tail */
-
-                /* Attempt to allocate superpage extents. */
-                if ( ((count | cur_pfn) & (SUPERPAGE_2MB_NR_PFNS - 1)) == 0 )
-                {
-                    long done;
-                    unsigned long nr_extents = count >> SUPERPAGE_2MB_SHIFT;
-                    xen_pfn_t sp_extents[nr_extents];
-
-                    for ( i = 0; i < nr_extents; i++ )
-                        sp_extents[i] =
-                            page_array[cur_pages+(i<<SUPERPAGE_2MB_SHIFT)];
-
-                    done = xc_domain_populate_physmap(xch, dom, nr_extents,
-                                                      SUPERPAGE_2MB_SHIFT,
-                                                      memflags, sp_extents);
-
-                    if ( done > 0 )
-                    {
-                        stat_2mb_pages += done;
-                        done <<= SUPERPAGE_2MB_SHIFT;
-                        cur_pages += done;
-                        count -= done;
-                    }
-                }
-            }
-
-            /* Fall back to 4kB extents. */
-            if ( count != 0 )
-            {
-                rc = xc_domain_populate_physmap_exact(
-                    xch, dom, count, 0, new_memflags, &page_array[cur_pages]);
-                cur_pages += count;
-                stat_normal_pages += count;
-            }
-        }
-
-        if ( rc != 0 )
-            break;
-    }
-
-    if ( rc != 0 )
-    {
-        PERROR("Could not allocate memory for HVM guest.");
-        goto error_out;
-    }
-
-    DPRINTF("PHYSICAL MEMORY ALLOCATION:\n");
-    DPRINTF("  4KB PAGES: 0x%016lx\n", stat_normal_pages);
-    DPRINTF("  2MB PAGES: 0x%016lx\n", stat_2mb_pages);
-    DPRINTF("  1GB PAGES: 0x%016lx\n", stat_1gb_pages);
-
-    rc = 0;
-    goto out;
- error_out:
-    rc = -1;
- out:
-
-    /* ensure no unclaimed pages are left unused */
-    xc_domain_claim_pages(xch, dom, 0 /* cancels the claim */);
-
-    return rc;
-}
-
-static int xc_hvm_load_image(xc_interface *xch,
-                       uint32_t dom, struct xc_hvm_build_args *args,
-                       xen_pfn_t *page_array)
-{
-    unsigned long entry_eip, image_size;
-    struct elf_binary elf;
-    uint64_t v_start, v_end;
-    uint64_t m_start = 0, m_end = 0;
-    char *image;
-    int rc;
-
-    image = xc_read_image(xch, args->image_file_name, &image_size);
-    if ( image == NULL )
-        return -1;
-
-    memset(&elf, 0, sizeof(elf));
-    if ( elf_init(&elf, image, image_size) != 0 )
-        goto error_out;
-
-    xc_elf_set_logfile(xch, &elf, 1);
-
-    elf_parse_binary(&elf);
-    v_start = 0;
-    v_end = args->mem_size;
-
-    if ( modules_init(args, v_end, &elf, &m_start, &m_end) != 0 )
-    {
-        ERROR("Insufficient space to load modules.");
-        goto error_out;
-    }
-
-    DPRINTF("VIRTUAL MEMORY ARRANGEMENT:\n");
-    DPRINTF("  Loader:   %016"PRIx64"->%016"PRIx64"\n", elf.pstart, elf.pend);
-    DPRINTF("  Modules:  %016"PRIx64"->%016"PRIx64"\n", m_start, m_end);
-
-    if ( loadelfimage(xch, &elf, dom, page_array) != 0 )
-    {
-        PERROR("Could not load ELF image");
-        goto error_out;
-    }
-
-    if ( loadmodules(xch, args, m_start, m_end, dom, page_array) != 0 )
-    {
-        PERROR("Could not load ACPI modules");
-        goto error_out;
-    }
-
-    /* Insert JMP <rel32> instruction at address 0x0 to reach entry point. */
-    entry_eip = elf_uval(&elf, elf.ehdr, e_entry);
-    if ( entry_eip != 0 )
-    {
-        char *page0 = xc_map_foreign_range(
-            xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, 0);
-        if ( page0 == NULL )
-            goto error_out;
-        page0[0] = 0xe9;
-        *(uint32_t *)&page0[1] = entry_eip - 5;
-        munmap(page0, PAGE_SIZE);
-    }
-
-    rc = 0;
-    goto out;
- error_out:
-    rc = -1;
- out:
-    if ( elf_check_broken(&elf) )
-        ERROR("HVM ELF broken: %s", elf_check_broken(&elf));
-    free(image);
-
-    return rc;
-}
-
-static int xc_hvm_populate_params(xc_interface *xch, uint32_t dom,
-                                  struct xc_hvm_build_args *args)
-{
-    unsigned long i;
-    void *hvm_info_page;
-    uint32_t *ident_pt;
-    uint64_t v_end;
-    int rc;
-    xen_pfn_t special_array[NR_SPECIAL_PAGES];
-    xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
-
-    v_end = args->mem_size;
-
-    if ( (hvm_info_page = xc_map_foreign_range(
-              xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
-              HVM_INFO_PFN)) == NULL )
-    {
-        PERROR("Could not map hvm info page");
-        goto error_out;
-    }
-    build_hvm_info(hvm_info_page, args);
-    munmap(hvm_info_page, PAGE_SIZE);
-
-    /* Allocate and clear special pages. */
-    for ( i = 0; i < NR_SPECIAL_PAGES; i++ )
-        special_array[i] = special_pfn(i);
-
-    rc = xc_domain_populate_physmap_exact(xch, dom, NR_SPECIAL_PAGES, 0, 0,
-                                          special_array);
-    if ( rc != 0 )
-    {
-        PERROR("Could not allocate special pages.");
-        goto error_out;
-    }
-
-    if ( xc_clear_domain_pages(xch, dom, special_pfn(0), NR_SPECIAL_PAGES) )
-    {
-        PERROR("Could not clear special pages");
-        goto error_out;
-    }
-
-    xc_hvm_param_set(xch, dom, HVM_PARAM_STORE_PFN,
-                     special_pfn(SPECIALPAGE_XENSTORE));
-    xc_hvm_param_set(xch, dom, HVM_PARAM_BUFIOREQ_PFN,
-                     special_pfn(SPECIALPAGE_BUFIOREQ));
-    xc_hvm_param_set(xch, dom, HVM_PARAM_IOREQ_PFN,
-                     special_pfn(SPECIALPAGE_IOREQ));
-    xc_hvm_param_set(xch, dom, HVM_PARAM_CONSOLE_PFN,
-                     special_pfn(SPECIALPAGE_CONSOLE));
-    xc_hvm_param_set(xch, dom, HVM_PARAM_PAGING_RING_PFN,
-                     special_pfn(SPECIALPAGE_PAGING));
-    xc_hvm_param_set(xch, dom, HVM_PARAM_MONITOR_RING_PFN,
-                     special_pfn(SPECIALPAGE_ACCESS));
-    xc_hvm_param_set(xch, dom, HVM_PARAM_SHARING_RING_PFN,
-                     special_pfn(SPECIALPAGE_SHARING));
-
-    /*
-     * Allocate and clear additional ioreq server pages. The default
-     * server will use the IOREQ and BUFIOREQ special pages above.
-     */
-    for ( i = 0; i < NR_IOREQ_SERVER_PAGES; i++ )
-        ioreq_server_array[i] = ioreq_server_pfn(i);
-
-    rc = xc_domain_populate_physmap_exact(xch, dom, NR_IOREQ_SERVER_PAGES, 0, 0,
-                                          ioreq_server_array);
-    if ( rc != 0 )
-    {
-        PERROR("Could not allocate ioreq server pages.");
-        goto error_out;
-    }
-
-    if ( xc_clear_domain_pages(xch, dom, ioreq_server_pfn(0), NR_IOREQ_SERVER_PAGES) )
-    {
-        PERROR("Could not clear ioreq page");
-        goto error_out;
-    }
-
-    /* Tell the domain where the pages are and how many there are */
-    xc_hvm_param_set(xch, dom, HVM_PARAM_IOREQ_SERVER_PFN,
-                     ioreq_server_pfn(0));
-    xc_hvm_param_set(xch, dom, HVM_PARAM_NR_IOREQ_SERVER_PAGES,
-                     NR_IOREQ_SERVER_PAGES);
-
-    /*
-     * Identity-map page table is required for running with CR0.PG=0 when
-     * using Intel EPT. Create a 32-bit non-PAE page directory of superpages.
-     */
-    if ( (ident_pt = xc_map_foreign_range(
-              xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
-              special_pfn(SPECIALPAGE_IDENT_PT))) == NULL )
-    {
-        PERROR("Could not map special page ident_pt");
-        goto error_out;
-    }
-    for ( i = 0; i < PAGE_SIZE / sizeof(*ident_pt); i++ )
-        ident_pt[i] = ((i << 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
-                       _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
-    munmap(ident_pt, PAGE_SIZE);
-    xc_hvm_param_set(xch, dom, HVM_PARAM_IDENT_PT,
-                     special_pfn(SPECIALPAGE_IDENT_PT) << PAGE_SHIFT);
-
-    rc = 0;
-    goto out;
- error_out:
-    rc = -1;
- out:
-
-    return rc;
-}
-
-/* xc_hvm_build:
- * Create a domain for a virtualized Linux, using files/filenames.
- */
-int xc_hvm_build(xc_interface *xch, uint32_t domid,
-                 struct xc_hvm_build_args *hvm_args)
-{
-    struct xc_hvm_build_args args = *hvm_args;
-    xen_pfn_t *parray = NULL;
-    int rc;
-
-    if ( domid == 0 )
-        return -1;
-    if ( args.image_file_name == NULL )
-        return -1;
-
-    /* An HVM guest must be initialised with at least 2MB memory. */
-    if ( args.mem_size < (2ull << 20) || args.mem_target < (2ull << 20) )
-        return -1;
-
-    parray = malloc((args.mem_size >> PAGE_SHIFT) * sizeof(xen_pfn_t));
-    if ( parray == NULL )
-        return -1;
-
-    rc = xc_hvm_populate_memory(xch, domid, &args, parray);
-    if ( rc != 0 )
-    {
-        PERROR("xc_hvm_populate_memory failed");
-        goto out;
-    }
-    rc = xc_hvm_load_image(xch, domid, &args, parray);
-    if ( rc != 0 )
-    {
-        PERROR("xc_hvm_load_image failed");
-        goto out;
-    }
-    rc = xc_hvm_populate_params(xch, domid, &args);
-    if ( rc != 0 )
-    {
-        PERROR("xc_hvm_populate_params failed");
-        goto out;
-    }
-
-    /* Return module load addresses to caller */
-    hvm_args->acpi_module.guest_addr_out = args.acpi_module.guest_addr_out;
-    hvm_args->smbios_module.guest_addr_out = args.smbios_module.guest_addr_out;
-
-out:
-    free(parray);
-
-    return rc;
-}
-
-/* xc_hvm_build_target_mem: 
- * Create a domain for a pre-ballooned virtualized Linux, using
- * files/filenames.  If target < memsize, domain is created with
- * memsize pages marked populate-on-demand, 
- * calculating pod cache size based on target.
- * If target == memsize, pages are populated normally.
- */
-int xc_hvm_build_target_mem(xc_interface *xch,
-                           uint32_t domid,
-                           int memsize,
-                           int target,
-                           const char *image_name)
-{
-    struct xc_hvm_build_args args = {};
-
-    memset(&args, 0, sizeof(struct xc_hvm_build_args));
-    args.mem_size = (uint64_t)memsize << 20;
-    args.mem_target = (uint64_t)target << 20;
-    args.image_file_name = image_name;
-
-    return xc_hvm_build(xch, domid, &args);
-}
-
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * tab-width: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/tools/libxc/xg_private.c b/tools/libxc/xg_private.c
index c52cb44..98f33b2 100644
--- a/tools/libxc/xg_private.c
+++ b/tools/libxc/xg_private.c
@@ -188,15 +188,6 @@ unsigned long csum_page(void *page)
     return sum ^ (sum>>32);
 }
 
-__attribute__((weak)) 
-    int xc_hvm_build(xc_interface *xch,
-                     uint32_t domid,
-                     struct xc_hvm_build_args *hvm_args)
-{
-    errno = ENOSYS;
-    return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index c77e15b..690780c 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -910,77 +910,6 @@ static PyObject *pyxc_dom_suppress_spurious_page_faults(XcObject *self,
 }
 #endif /* __i386__ || __x86_64__ */
 
-static PyObject *pyxc_hvm_build(XcObject *self,
-                                PyObject *args,
-                                PyObject *kwds)
-{
-    uint32_t dom;
-    struct hvm_info_table *va_hvm;
-    uint8_t *va_map, sum;
-    int i;
-    char *image;
-    int memsize, target=-1, vcpus = 1, acpi = 0, apic = 1;
-    PyObject *vcpu_avail_handle = NULL;
-    uint8_t vcpu_avail[(HVM_MAX_VCPUS + 7)/8];
-
-    static char *kwd_list[] = { "domid",
-                                "memsize", "image", "target", "vcpus", 
-                                "vcpu_avail", "acpi", "apic", NULL };
-    if ( !PyArg_ParseTupleAndKeywords(args, kwds, "iis|iiOii", kwd_list,
-                                      &dom, &memsize, &image, &target, &vcpus,
-                                      &vcpu_avail_handle, &acpi, &apic) )
-        return NULL;
-
-    memset(vcpu_avail, 0, sizeof(vcpu_avail));
-    vcpu_avail[0] = 1;
-    if ( vcpu_avail_handle != NULL )
-    {
-        if ( PyInt_Check(vcpu_avail_handle) )
-        {
-            unsigned long v = PyInt_AsLong(vcpu_avail_handle);
-            for ( i = 0; i < sizeof(long); i++ )
-                vcpu_avail[i] = (uint8_t)(v>>(i*8));
-        }
-        else if ( PyLong_Check(vcpu_avail_handle) )
-        {
-            if ( _PyLong_AsByteArray((PyLongObject *)vcpu_avail_handle,
-                                     (unsigned char *)vcpu_avail,
-                                     sizeof(vcpu_avail), 1, 0) )
-                return NULL;
-        }
-        else
-        {
-            errno = EINVAL;
-            PyErr_SetFromErrno(xc_error_obj);
-            return NULL;
-        }
-    }
-
-    if ( target == -1 )
-        target = memsize;
-
-    if ( xc_hvm_build_target_mem(self->xc_handle, dom, memsize,
-                                 target, image) != 0 )
-        return pyxc_error_to_exception(self->xc_handle);
-
-    /* Fix up the HVM info table. */
-    va_map = xc_map_foreign_range(self->xc_handle, dom, XC_PAGE_SIZE,
-                                  PROT_READ | PROT_WRITE,
-                                  HVM_INFO_PFN);
-    if ( va_map == NULL )
-        return PyErr_SetFromErrno(xc_error_obj);
-    va_hvm = (struct hvm_info_table *)(va_map + HVM_INFO_OFFSET);
-    va_hvm->apic_mode    = apic;
-    va_hvm->nr_vcpus     = vcpus;
-    memcpy(va_hvm->vcpu_online, vcpu_avail, sizeof(vcpu_avail));
-    for ( i = 0, sum = 0; i < va_hvm->length; i++ )
-        sum += ((uint8_t *)va_hvm)[i];
-    va_hvm->checksum -= sum;
-    munmap(va_map, XC_PAGE_SIZE);
-
-    return Py_BuildValue("{}");
-}
-
 static PyObject *pyxc_gnttab_hvm_seed(XcObject *self,
 				      PyObject *args,
 				      PyObject *kwds)
@@ -2411,16 +2340,6 @@ static PyMethodDef pyxc_methods[] = {
       " image   [str]:      Name of kernel image file. May be gzipped.\n"
       " cmdline [str, n/a]: Kernel parameters, if any.\n\n"},
 
-    { "hvm_build", 
-      (PyCFunction)pyxc_hvm_build, 
-      METH_VARARGS | METH_KEYWORDS, "\n"
-      "Build a new HVM guest OS.\n"
-      " dom     [int]:      Identifier of domain to build into.\n"
-      " image   [str]:      Name of HVM loader image file.\n"
-      " vcpus   [int, 1]:   Number of Virtual CPUS in domain.\n\n"
-      " vcpu_avail [long, 1]: Which Virtual CPUS available.\n\n"
-      "Returns: [int] 0 on success; -1 on error.\n" },
-
     { "gnttab_hvm_seed",
       (PyCFunction)pyxc_gnttab_hvm_seed,
       METH_KEYWORDS, "\n"
-- 
1.9.5 (Apple Git-50.3)



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH RFC v1 09/13] elfnotes: introduce a new PHYS_ENTRY elfnote
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (7 preceding siblings ...)
  2015-06-22 16:11 ` [PATCH RFC v1 08/13] libxc: remove dead x86 HVM code Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-23  9:35   ` Jan Beulich
  2015-06-22 16:11 ` [PATCH RFC v1 10/13] lib{xc/xl}: allow the creation of HVM domains with a kernel Roger Pau Monne
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

This new elfnote contains the 32-bit entry point into the kernel. Xen will
use this entry point to launch the guest kernel in 32-bit protected mode
with paging disabled.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
 tools/xcutils/readnotes.c          |  3 +++
 xen/common/libelf/libelf-dominfo.c |  4 ++++
 xen/include/public/elfnote.h       | 11 ++++++++++-
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/xcutils/readnotes.c b/tools/xcutils/readnotes.c
index 5fa445e..863ea5f 100644
--- a/tools/xcutils/readnotes.c
+++ b/tools/xcutils/readnotes.c
@@ -159,6 +159,9 @@ static unsigned print_notes(struct elf_binary *elf, ELF_HANDLE_DECL(elf_note) st
 		case XEN_ELFNOTE_L1_MFN_VALID:
 			print_l1_mfn_valid_note("L1_MFN_VALID", elf , note);
 			break;
+		case XEN_ELFNOTE_PHYS_ENTRY:
+			print_numeric_note("PHYS_ENTRY", elf , note);
+			break;
 		default:
 			printf("unknown note type %#x\n",
 			       (unsigned)elf_uval(elf, note, type));
diff --git a/xen/common/libelf/libelf-dominfo.c b/xen/common/libelf/libelf-dominfo.c
index 0771323..ca0e327 100644
--- a/xen/common/libelf/libelf-dominfo.c
+++ b/xen/common/libelf/libelf-dominfo.c
@@ -120,6 +120,7 @@ elf_errorstatus elf_xen_parse_note(struct elf_binary *elf,
         [XEN_ELFNOTE_BSD_SYMTAB] = { "BSD_SYMTAB", 1},
         [XEN_ELFNOTE_SUSPEND_CANCEL] = { "SUSPEND_CANCEL", 0 },
         [XEN_ELFNOTE_MOD_START_PFN] = { "MOD_START_PFN", 0 },
+        [XEN_ELFNOTE_PHYS_ENTRY] = { "PHYS_ENTRY", 0 },
     };
 /* *INDENT-ON* */
 
@@ -213,6 +214,9 @@ elf_errorstatus elf_xen_parse_note(struct elf_binary *elf,
                 elf, note, sizeof(*parms->f_supported), i);
         break;
 
+    case XEN_ELFNOTE_PHYS_ENTRY:
+        parms->phys_entry = val;
+        break;
     }
     return 0;
 }
diff --git a/xen/include/public/elfnote.h b/xen/include/public/elfnote.h
index 3824a94..b5232cd 100644
--- a/xen/include/public/elfnote.h
+++ b/xen/include/public/elfnote.h
@@ -200,9 +200,18 @@
 #define XEN_ELFNOTE_SUPPORTED_FEATURES 17
 
 /*
+ * Physical entry point into the kernel.
+ *
+ * 32bit entry point into the kernel. Xen will use this entry point
+ * in order to launch the guest kernel in 32bit protected mode
+ * with paging disabled.
+ */
+#define XEN_ELFNOTE_PHYS_ENTRY 18
+
+/*
  * The number of the highest elfnote defined.
  */
-#define XEN_ELFNOTE_MAX XEN_ELFNOTE_SUPPORTED_FEATURES
+#define XEN_ELFNOTE_MAX XEN_ELFNOTE_PHYS_ENTRY
 
 /*
  * System information exported through crash notes.
-- 
1.9.5 (Apple Git-50.3)



^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH RFC v1 10/13] lib{xc/xl}: allow the creation of HVM domains with a kernel
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI Roger Pau Monne
                   ` (8 preceding siblings ...)
  2015-06-22 16:11 ` [PATCH RFC v1 09/13] elfnotes: introduce a new PHYS_ENTRY elfnote Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-25 10:39   ` Wei Liu
  2015-06-22 16:11 ` [PATCH RFC v1 11/13] xen/libxl: allow creating HVM guests without a device model Roger Pau Monne
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

Replace the firmware loaded into HVM guests with an OS kernel. Since the HVM
builder now uses the PV xc_dom_* set of functions, this kernel is parsed and
loaded inside the guest in the same way as on PV, but the container is a pure
HVM guest.
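From the toolstack side, a guest built this way would presumably be described
with something like the following xl configuration fragment (a sketch only;
the paths are placeholders, and the exact set of accepted options for
HVM direct kernel boot is what this series is introducing):

```
# Hypothetical xl config: HVM container booting a kernel directly,
# without firmware or a device model.
builder = "hvm"
kernel  = "/path/to/guest-kernel"
ramdisk = "/path/to/guest-initrd"
memory  = 512
```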

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
Only xc_dom_elfloader has been switched in order to use hvm-3.0-x86_32,
other loaders need to be adapted also.
---
 tools/libxc/xc_dom_elfloader.c |  4 ++++
 tools/libxl/libxl_dom.c        | 17 +++++++++++++----
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/tools/libxc/xc_dom_elfloader.c b/tools/libxc/xc_dom_elfloader.c
index 6ce1062..2f05015 100644
--- a/tools/libxc/xc_dom_elfloader.c
+++ b/tools/libxc/xc_dom_elfloader.c
@@ -57,6 +57,10 @@ static char *xc_dom_guest_type(struct xc_dom_image *dom,
 {
     uint64_t machine = elf_uval(elf, elf->ehdr, e_machine);
 
+    if ( dom->container_type == XC_DOM_HVM_CONTAINER &&
+         dom->parms.phys_entry != UNSET_ADDR )
+        return "hvm-3.0-x86_32";
+
     switch ( machine )
     {
     case EM_386:
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 12ede26..8ee14b9 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -874,10 +874,20 @@ static int libxl__domain_firmware(libxl__gc *gc,
         }
     }
 
-    rc = xc_dom_kernel_file(dom, libxl__abs_path(gc, firmware,
+    if (info->kernel != NULL) {
+        /* Try to load a kernel instead of the firmware. */
+        rc = xc_dom_kernel_file(dom, info->kernel);
+        if (rc == 0 && info->ramdisk != NULL)
+            rc = xc_dom_ramdisk_file(dom, info->ramdisk);
+        dom->vga_hole = 0;
+    } else {
+        rc = xc_dom_kernel_file(dom, libxl__abs_path(gc, firmware,
                                                  libxl__xenfirmwaredir_path()));
+        dom->vga_hole = 1;
+    }
+
     if (rc != 0) {
-        LOGE(ERROR, "xc_dom_kernel_file failed");
+        LOGE(ERROR, "xc_dom_{kernel_file/ramdisk_file} failed");
         goto out;
     }
 
@@ -931,7 +941,7 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
 
     xc_dom_loginit(ctx->xch);
 
-    dom = xc_dom_allocate(ctx->xch, NULL, NULL);
+    dom = xc_dom_allocate(ctx->xch, info->cmdline, NULL);
     if (!dom) {
         LOGE(ERROR, "xc_dom_allocate failed");
         goto out;
@@ -974,7 +984,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
     dom->lowmem_end = lowmem_end;
     dom->highmem_end = highmem_end;
     dom->mmio_start = mmio_start;
-    dom->vga_hole = 1;
 
     if (info->num_vnuma_nodes != 0) {
         int i;
-- 
1.9.5 (Apple Git-50.3)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH RFC v1 11/13] xen/libxl: allow creating HVM guests without a device model
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI Roger Pau Monne
                   ` (9 preceding siblings ...)
  2015-06-22 16:11 ` [PATCH RFC v1 10/13] lib{xc/xl}: allow the creation of HVM domains with a kernel Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-23  9:41   ` Jan Beulich
  2015-06-22 16:11 ` [PATCH RFC v1 12/13] xen: allow 64bit HVM guests to use XENMEM_memory_map Roger Pau Monne
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

Introduce a new device model version (NONE) that can be used to specify that
no device model should be used. Propagate this to Xen by creating a new
XEN_DOMCTL_CDF_noemu flag that disables some of the emulation done inside
Xen.
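The propagation from the toolstack-visible domctl flag to the internal domain-creation flag can be sketched as follows. The bit positions mirror this patch, but the helper function is invented and the surrounding domctl structures are omitted.

```c
/* Minimal sketch of mapping the XEN_DOMCTL_CDF_noemu creation flag to the
 * internal DOMCRF_noemu flag, as do_domctl() does for each CDF_* bit.
 * Bit positions follow the patch (_XEN_DOMCTL_CDF_noemu = 5,
 * _DOMCRF_noemu = 6); everything else is illustrative. */
#include <stdint.h>

#define XEN_DOMCTL_CDF_noemu (1U << 5)
#define DOMCRF_noemu         (1U << 6)

static unsigned int domcr_flags_from_domctl(uint32_t cdf_flags)
{
    unsigned int domcr_flags = 0;

    if (cdf_flags & XEN_DOMCTL_CDF_noemu)
        domcr_flags |= DOMCRF_noemu;

    return domcr_flags;
}
```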

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
IMHO the XEN_DOMCTL_CDF_noemu flag should be expanded into multiple smaller
flags that can be used to disable specific emulated devices, like the
vlapic, vioapic, vhpet...

Also hvm_mmio_handlers should become domain specific in order to populate it
with only the usable handlers.
---
 tools/libxl/libxl.c              |  7 +++----
 tools/libxl/libxl_create.c       | 16 ++++++++++++++++
 tools/libxl/libxl_dom.c          |  6 ++++++
 tools/libxl/libxl_types.idl      |  1 +
 tools/libxl/xl_cmdimpl.c         |  2 ++
 xen/arch/x86/domain.c            |  2 +-
 xen/arch/x86/hvm/hvm.c           | 14 +++++++++-----
 xen/arch/x86/hvm/intercept.c     |  6 ++++++
 xen/common/domctl.c              |  5 ++++-
 xen/include/asm-x86/hvm/domain.h |  1 +
 xen/include/asm-x86/hvm/hvm.h    |  2 +-
 xen/include/public/domctl.h      |  3 +++
 xen/include/xen/sched.h          |  4 ++++
 13 files changed, 57 insertions(+), 12 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index d86ea62..7c83486 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1587,11 +1587,10 @@ void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis)
 
     switch (libxl__domain_type(gc, domid)) {
     case LIBXL_DOMAIN_TYPE_HVM:
-        if (!libxl_get_stubdom_id(CTX, domid))
-            dm_present = 1;
-        else
+        if (libxl_get_stubdom_id(CTX, domid)) {
             dm_present = 0;
-        break;
+            break;
+        }
     case LIBXL_DOMAIN_TYPE_PV:
         pid = libxl__xs_read(gc, XBT_NULL, libxl__sprintf(gc, "/local/domain/%d/image/device-model-pid", domid));
         dm_present = (pid != NULL);
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 86384d2..06cf02c 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -164,6 +164,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
                 b_info->u.hvm.bios = LIBXL_BIOS_TYPE_ROMBIOS; break;
             case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
                 b_info->u.hvm.bios = LIBXL_BIOS_TYPE_SEABIOS; break;
+            case LIBXL_DEVICE_MODEL_VERSION_NONE:
+                break;
             default:return ERROR_INVAL;
             }
 
@@ -177,6 +179,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
             if (b_info->u.hvm.bios == LIBXL_BIOS_TYPE_ROMBIOS)
                 return ERROR_INVAL;
             break;
+        case LIBXL_DEVICE_MODEL_VERSION_NONE:
+            break;
         default:abort();
         }
 
@@ -278,6 +282,9 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
                 break;
             }
             break;
+        case LIBXL_DEVICE_MODEL_VERSION_NONE:
+            b_info->video_memkb = 0;
+            break;
         case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
         default:
             switch (b_info->u.hvm.vga.kind) {
@@ -535,6 +542,7 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
 
     /* convenience aliases */
     libxl_domain_create_info *info = &d_config->c_info;
+    libxl_domain_build_info *b_info = &d_config->b_info;
 
     assert(!libxl_domid_valid_guest(*domid));
 
@@ -549,6 +557,8 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
         flags |= XEN_DOMCTL_CDF_hvm_guest;
         flags |= libxl_defbool_val(info->hap) ? XEN_DOMCTL_CDF_hap : 0;
         flags |= libxl_defbool_val(info->oos) ? 0 : XEN_DOMCTL_CDF_oos_off;
+        if (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE)
+            flags |= XEN_DOMCTL_CDF_noemu;
     } else if (libxl_defbool_val(info->pvh)) {
         flags |= XEN_DOMCTL_CDF_pvh_guest;
         if (!libxl_defbool_val(info->hap)) {
@@ -1293,6 +1303,12 @@ static void domcreate_launch_dm(libxl__egc *egc, libxl__multidev *multidev,
         libxl__device_console_add(gc, domid, &console, state, &device);
         libxl__device_console_dispose(&console);
 
+        if (d_config->b_info.device_model_version ==
+            LIBXL_DEVICE_MODEL_VERSION_NONE) {
+            domcreate_devmodel_started(egc, &dcs->dmss.dm, 0);
+            return;
+        }
+
         libxl_device_vkb_init(&vkb);
         libxl__device_vkb_add(gc, domid, &vkb);
         libxl_device_vkb_dispose(&vkb);
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 8ee14b9..d948546 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -866,6 +866,12 @@ static int libxl__domain_firmware(libxl__gc *gc,
         case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
             firmware = "hvmloader";
             break;
+        case LIBXL_DEVICE_MODEL_VERSION_NONE:
+            if (info->kernel == NULL) {
+                LOG(ERROR, "no device model requested without a kernel");
+                return ERROR_FAIL;
+            }
+            break;
         default:
             LOG(ERROR, "invalid device model version %d",
                 info->device_model_version);
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 23f27d4..0b75834 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -83,6 +83,7 @@ libxl_device_model_version = Enumeration("device_model_version", [
     (0, "UNKNOWN"),
     (1, "QEMU_XEN_TRADITIONAL"), # Historical qemu-xen device model (qemu-dm)
     (2, "QEMU_XEN"),             # Upstream based qemu-xen device model
+    (3, "NONE"),                 # No device model
     ])
 
 libxl_console_type = Enumeration("console_type", [
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index c858068..3d9b3d4 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -2067,6 +2067,8 @@ skip_vfb:
         } else if (!strcmp(buf, "qemu-xen")) {
             b_info->device_model_version
                 = LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
+        } else if (!strcmp(buf, "none")) {
+            b_info->device_model_version = LIBXL_DEVICE_MODEL_VERSION_NONE;
         } else {
             fprintf(stderr,
                     "Unknown device_model_version \"%s\" specified\n", buf);
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 0363650..bad0872 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -610,7 +610,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 
     if ( has_hvm_container_domain(d) )
     {
-        if ( (rc = hvm_domain_initialise(d)) != 0 )
+        if ( (rc = hvm_domain_initialise(d, domcr_flags)) != 0 )
         {
             iommu_domain_destroy(d);
             goto fail;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index d5e5242..7694c9e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -343,7 +343,7 @@ u64 hvm_get_guest_tsc_adjust(struct vcpu *v)
 void hvm_migrate_timers(struct vcpu *v)
 {
     /* PVH doesn't use rtc and emulated timers, it uses pvclock mechanism. */
-    if ( is_pvh_vcpu(v) )
+    if ( is_pvh_vcpu(v) || v->domain->arch.hvm_domain.no_emu )
         return;
 
     rtc_migrate_timers(v);
@@ -1423,7 +1423,7 @@ static int hvm_set_dm_domain(struct domain *d, domid_t domid)
     return rc;
 }
 
-int hvm_domain_initialise(struct domain *d)
+int hvm_domain_initialise(struct domain *d, unsigned int domcr_flags)
 {
     int rc;
 
@@ -1485,9 +1485,10 @@ int hvm_domain_initialise(struct domain *d)
     else
         d->arch.hvm_domain.io_bitmap = hvm_io_bitmap;
 
-    if ( is_pvh_domain(d) )
+    if ( is_pvh_domain(d) || domcr_flags & DOMCRF_noemu )
     {
         register_portio_handler(d, 0, 0x10003, handle_pvh_io);
+        d->arch.hvm_domain.no_emu = TRUE;
         return 0;
     }
 
@@ -1531,7 +1532,7 @@ int hvm_domain_initialise(struct domain *d)
 
 void hvm_domain_relinquish_resources(struct domain *d)
 {
-    if ( is_pvh_domain(d) )
+    if ( is_pvh_domain(d) || d->arch.hvm_domain.no_emu )
         return;
 
     if ( hvm_funcs.nhvm_domain_relinquish_resources )
@@ -1557,7 +1558,7 @@ void hvm_domain_destroy(struct domain *d)
 
     hvm_destroy_cacheattr_region_list(d);
 
-    if ( is_pvh_domain(d) )
+    if ( is_pvh_domain(d) || d->arch.hvm_domain.no_emu )
         return;
 
     hvm_funcs.domain_destroy(d);
@@ -2327,6 +2328,9 @@ int hvm_vcpu_initialise(struct vcpu *v)
         return 0;
     }
 
+    if ( d->arch.hvm_domain.no_emu )
+        return 0;
+
     rc = setup_compat_arg_xlat(v); /* teardown: free_compat_arg_xlat() */
     if ( rc != 0 )
         goto fail4;
diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c
index d52a48c..b7bc3c7 100644
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -168,6 +168,9 @@ bool_t hvm_mmio_internal(paddr_t gpa)
     struct vcpu *curr = current;
     unsigned int i;
 
+    if ( curr->domain->arch.hvm_domain.no_emu )
+        return 0;
+
     for ( i = 0; i < HVM_MMIO_HANDLER_NR; ++i )
         if ( hvm_mmio_handlers[i]->check_handler(curr, gpa) )
             return 1;
@@ -180,6 +183,9 @@ int hvm_mmio_intercept(ioreq_t *p)
     struct vcpu *v = current;
     int i;
 
+    if ( v->domain->arch.hvm_domain.no_emu )
+        return X86EMUL_UNHANDLEABLE;
+
     for ( i = 0; i < HVM_MMIO_HANDLER_NR; i++ )
     {
         hvm_mmio_check_t check_handler =
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index ce517a7..1ce7ae0 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -550,7 +550,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
                | XEN_DOMCTL_CDF_pvh_guest
                | XEN_DOMCTL_CDF_hap
                | XEN_DOMCTL_CDF_s3_integrity
-               | XEN_DOMCTL_CDF_oos_off)) )
+               | XEN_DOMCTL_CDF_oos_off
+               | XEN_DOMCTL_CDF_noemu)) )
             break;
 
         dom = op->domain;
@@ -592,6 +593,8 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
             domcr_flags |= DOMCRF_s3_integrity;
         if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_oos_off )
             domcr_flags |= DOMCRF_oos_off;
+        if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_noemu )
+            domcr_flags |= DOMCRF_noemu;
 
         d = domain_create(dom, domcr_flags, op->u.createdomain.ssidref,
                           &op->u.createdomain.config);
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index ad68fcf..948ced8 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -135,6 +135,7 @@ struct hvm_domain {
     bool_t                 mem_sharing_enabled;
     bool_t                 qemu_mapcache_invalidate;
     bool_t                 is_s3_suspended;
+    bool_t                 no_emu;
 
     /*
      * TSC value that VCPUs use to calculate their tsc_offset value.
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 77eeac5..68c987a 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -217,7 +217,7 @@ extern s8 hvm_port80_allowed;
 extern const struct hvm_function_table *start_svm(void);
 extern const struct hvm_function_table *start_vmx(void);
 
-int hvm_domain_initialise(struct domain *d);
+int hvm_domain_initialise(struct domain *d, unsigned int domcr_flags);
 void hvm_domain_relinquish_resources(struct domain *d);
 void hvm_domain_destroy(struct domain *d);
 
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index bc45ea5..4e9d7e7 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -63,6 +63,9 @@ struct xen_domctl_createdomain {
  /* Is this a PVH guest (as opposed to an HVM or PV guest)? */
 #define _XEN_DOMCTL_CDF_pvh_guest     4
 #define XEN_DOMCTL_CDF_pvh_guest      (1U<<_XEN_DOMCTL_CDF_pvh_guest)
+ /* Disable emulated devices */
+#define _XEN_DOMCTL_CDF_noemu         5
+#define XEN_DOMCTL_CDF_noemu          (1U<<_XEN_DOMCTL_CDF_noemu)
     uint32_t flags;
     struct xen_arch_domainconfig config;
 };
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 604d047..0aaff1e 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -552,6 +552,10 @@ struct domain *domain_create(domid_t domid, unsigned int domcr_flags,
  /* DOMCRF_pvh: Create PV domain in HVM container. */
 #define _DOMCRF_pvh             5
 #define DOMCRF_pvh              (1U<<_DOMCRF_pvh)
+/* DOMCRF_noemu: Create a HVM domain without emulated devices. */
+/* XXX: Should be split into smaller flags that disable specific devices? */
+#define _DOMCRF_noemu           6
+#define DOMCRF_noemu            (1U<<_DOMCRF_noemu)
 
 /*
  * rcu_lock_domain_by_id() is more efficient than get_domain_by_id().
-- 
1.9.5 (Apple Git-50.3)




* [PATCH RFC v1 12/13] xen: allow 64bit HVM guests to use XENMEM_memory_map
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI Roger Pau Monne
                   ` (10 preceding siblings ...)
  2015-06-22 16:11 ` [PATCH RFC v1 11/13] xen/libxl: allow creating HVM guests without a device model Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-23  9:43   ` Jan Beulich
  2015-06-22 16:11 ` [PATCH RFC v1 13/13] xenconsole: try to attach to PV console if HVM fails Roger Pau Monne
  2015-06-22 17:55 ` [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI Stefano Stabellini
  13 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

Enable this hypercall for 64bit HVM guests in order to fetch the e820 memory
map in the absence of an emulated BIOS. The memory map is built and
reported to Xen in arch_setup_meminit_hvm.
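The map the builder constructs can be sketched like this: an optional low RAM entry covering the pages below the VGA hole, plus one RAM entry per vmemrange. This is a hedged standalone sketch, not the actual libxc code; the helper name and the `ranges` parameter shape are invented for the example.

```c
/* Sketch of building the e820 map handed to Xen via
 * xc_domain_set_memory_map(): 0xa0 pages of low RAM when the VGA hole is
 * present, then one E820_RAM entry per [start, end) memory range.
 * PAGE_SHIFT and E820_RAM follow the usual x86 values. */
#include <stdint.h>
#include <stddef.h>

#define PAGE_SHIFT 12
#define E820_RAM   1

struct e820entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

static size_t build_e820(struct e820entry *e, size_t max, int vga_hole,
                         uint64_t (*ranges)[2], size_t nr_ranges)
{
    size_t n = 0;

    if (vga_hole && n < max) {
        e[n].addr = 0;
        e[n].size = 0xa0ULL << PAGE_SHIFT; /* RAM below the VGA hole */
        e[n++].type = E820_RAM;
    }
    for (size_t i = 0; i < nr_ranges && n < max; i++) {
        e[n].addr = ranges[i][0];
        e[n].size = ranges[i][1] - ranges[i][0];
        e[n++].type = E820_RAM;
    }
    return n;
}
```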

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
I see no reason not to enable it for 32bit HVM guests as well, but let's
leave that for later.
---
 tools/libxc/xc_dom_x86.c | 29 ++++++++++++++++++++++++++++-
 xen/arch/x86/hvm/hvm.c   |  1 -
 xen/arch/x86/mm.c        |  6 ------
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 0d9ec42..3a57ded 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -1154,6 +1154,7 @@ static int check_mmio_hole(uint64_t start, uint64_t memsize,
         return 1;
 }
 
+#define MAX_E820_ENTRIES    128
 static int arch_setup_meminit_hvm(struct xc_dom_image *dom)
 {
     unsigned long i, vmemid, nr_pages = dom->total_pages;
@@ -1174,6 +1175,8 @@ static int arch_setup_meminit_hvm(struct xc_dom_image *dom)
     unsigned int nr_vmemranges, nr_vnodes;
     xc_interface *xch = dom->xch;
     uint32_t domid = dom->guest_domid;
+    struct e820entry entries[MAX_E820_ENTRIES];
+    int e820_index = 0;
 
     if ( nr_pages > target_pages )
         memflags |= XENMEMF_populate_on_demand;
@@ -1224,6 +1227,13 @@ static int arch_setup_meminit_hvm(struct xc_dom_image *dom)
         vnode_to_pnode = dom->vnode_to_pnode;
     }
 
+    /* Add one additional memory range to account for the VGA hole */
+    if ( (nr_vmemranges + (dom->vga_hole ? 1 : 0)) > MAX_E820_ENTRIES )
+    {
+        DOMPRINTF("Too many memory ranges");
+        goto error_out;
+    }
+
     total_pages = 0;
     p2m_size = 0;
     for ( i = 0; i < nr_vmemranges; i++ )
@@ -1313,9 +1323,13 @@ static int arch_setup_meminit_hvm(struct xc_dom_image *dom)
      * Under 2MB mode, we allocate pages in batches of no more than 8MB to 
      * ensure that we can be preempted and hence dom0 remains responsive.
      */
-    if ( dom->vga_hole )
+    if ( dom->vga_hole ) {
         rc = xc_domain_populate_physmap_exact(
             xch, domid, 0xa0, 0, memflags, &dom->p2m_host[0x00]);
+        entries[e820_index].addr = 0;
+        entries[e820_index].size = 0xa0 << PAGE_SHIFT;
+        entries[e820_index++].type = E820_RAM;
+    }
 
     stat_normal_pages = 0;
     for ( vmemid = 0; vmemid < nr_vmemranges; vmemid++ )
@@ -1342,6 +1356,12 @@ static int arch_setup_meminit_hvm(struct xc_dom_image *dom)
         else
             cur_pages = vmemranges[vmemid].start >> PAGE_SHIFT;
 
+        /* Build an e820 map. */
+        entries[e820_index].addr = cur_pages << PAGE_SHIFT;
+        entries[e820_index].size = vmemranges[vmemid].end -
+                                   entries[e820_index].addr;
+        entries[e820_index++].type = E820_RAM;
+
         while ( (rc == 0) && (end_pages > cur_pages) )
         {
             /* Clip count to maximum 1GB extent. */
@@ -1459,6 +1479,13 @@ static int arch_setup_meminit_hvm(struct xc_dom_image *dom)
     DPRINTF("  2MB PAGES: 0x%016lx\n", stat_2mb_pages);
     DPRINTF("  1GB PAGES: 0x%016lx\n", stat_1gb_pages);
 
+    rc = xc_domain_set_memory_map(xch, domid, entries, e820_index);
+    if ( rc != 0 )
+    {
+        DOMPRINTF("unable to set memory map.");
+        goto error_out;
+    }
+
     rc = 0;
     goto out;
  error_out:
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7694c9e..98109e2 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4749,7 +4749,6 @@ static long hvm_memory_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     switch ( cmd & MEMOP_CMD_MASK )
     {
-    case XENMEM_memory_map:
     case XENMEM_machine_memory_map:
     case XENMEM_machphys_mapping:
         return -ENOSYS;
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 9e08c9b..fcb8682 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4717,12 +4717,6 @@ long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
             return rc;
         }
 
-        if ( is_hvm_domain(d) )
-        {
-            rcu_unlock_domain(d);
-            return -EPERM;
-        }
-
         e820 = xmalloc_array(e820entry_t, fmap.map.nr_entries);
         if ( e820 == NULL )
         {
-- 
1.9.5 (Apple Git-50.3)




* [PATCH RFC v1 13/13] xenconsole: try to attach to PV console if HVM fails
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI Roger Pau Monne
                   ` (11 preceding siblings ...)
  2015-06-22 16:11 ` [PATCH RFC v1 12/13] xen: allow 64bit HVM guests to use XENMEM_memory_map Roger Pau Monne
@ 2015-06-22 16:11 ` Roger Pau Monne
  2015-06-22 17:55 ` [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI Stefano Stabellini
  13 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monne @ 2015-06-22 16:11 UTC (permalink / raw)
  To: xen-devel
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, Boris Ostrovsky,
	Roger Pau Monne

HVM guests have always used the emulated serial console by default, but if
the emulated serial pty cannot be fetched from xenstore, fall back to the
PV console instead.
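The fallback logic can be sketched as below. This is a hedged illustration: `xs_read_stub` stands in for the real `xs_read()` xenstore lookup, and the PV console path layout follows the existing client code's `%s/console/tty` / `%s/device/console/%d/tty` convention.

```c
/* Sketch of the console-path fallback: build the emulated-serial tty path
 * first; if xenstore has no node there, fall back to the PV console path.
 * The stub below always reports "no emulated serial", exercising the
 * fallback branch. */
#include <stdio.h>
#include <string.h>

static char *xs_read_stub(const char *path) /* pretend xenstore lookup */
{
    (void)path;
    return NULL; /* simulate: no emulated serial pty present */
}

static void console_path(char *buf, size_t len, const char *dom_path, int num)
{
    snprintf(buf, len, "%s/serial/%d/tty", dom_path, num);
    if (xs_read_stub(buf) != NULL)
        return;                        /* emulated serial exists, use it */

    /* No HVM serial: fall back to the PV console. */
    if (num == 0)
        snprintf(buf, len, "%s/console/tty", dom_path);
    else
        snprintf(buf, len, "%s/device/console/%d/tty", dom_path, num);
}
```

Structuring the fallback as a plain if/else avoids the cross-block `goto` used in the patch itself, which the cover letter acknowledges as duct tape.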

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
---
 tools/console/client/main.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/tools/console/client/main.c b/tools/console/client/main.c
index f4c783b..c92553e 100644
--- a/tools/console/client/main.c
+++ b/tools/console/client/main.c
@@ -279,7 +279,7 @@ int main(int argc, char **argv)
 		{ 0 },
 
 	};
-	char *dom_path = NULL, *path = NULL;
+	char *dom_path = NULL, *path = NULL, *test = NULL;
 	int spty, xsfd;
 	struct xs_handle *xs;
 	char *end;
@@ -357,9 +357,14 @@ int main(int argc, char **argv)
 	path = malloc(strlen(dom_path) + strlen("/device/console/0/tty") + 5);
 	if (path == NULL)
 		err(ENOMEM, "malloc");
-	if (type == CONSOLE_SERIAL)
+	if (type == CONSOLE_SERIAL) {
 		snprintf(path, strlen(dom_path) + strlen("/serial/0/tty") + 5, "%s/serial/%d/tty", dom_path, num);
-	else {
+		test = xs_read(xs, XBT_NULL, path, NULL);
+		free(test);
+		if (test == NULL)
+			goto pv_console;
+	} else {
+pv_console:
 		if (num == 0)
 			snprintf(path, strlen(dom_path) + strlen("/console/tty") + 1, "%s/console/tty", dom_path);
 		else
-- 
1.9.5 (Apple Git-50.3)




* Re: [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI
  2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI Roger Pau Monne
                   ` (12 preceding siblings ...)
  2015-06-22 16:11 ` [PATCH RFC v1 13/13] xenconsole: try to attach to PV console if HVM fails Roger Pau Monne
@ 2015-06-22 17:55 ` Stefano Stabellini
  2015-06-22 18:05   ` Konrad Rzeszutek Wilk
  2015-06-23  7:14   ` Roger Pau Monné
  13 siblings, 2 replies; 43+ messages in thread
From: Stefano Stabellini @ 2015-06-22 17:55 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, andrew.cooper3,
	Stefano Stabellini, ian.jackson, xen-devel, boris.ostrovsky

Hi Roger,

given that this patch series is actually using the Xen "hvm" builder, I
take it that all the PVH code paths in Xen or the guest kernel are not
actually used, correct? This is more like PV on HVM without QEMU, right?

Do you think this can work for Dom0 too?

Would that make all the PVH stuff in Xen and Linux effectively useless?

Thanks,

Stefano

On Mon, 22 Jun 2015, Roger Pau Monne wrote:
> Before reading any further, keep in mind this is a VERY initial RFC 
> prototype series. Many things are not finished, and those that are done 
> make heavy use of duct tape to keep things in place.
> 
> Now that you are warned, this series is split in the following order:
> 
>  - Patches from 1 to 7 switch HVM domain construction to use the xc_dom_* 
>    family of functions, like they are used to build PV domains. 
>  - Patches from 8 to 13 introduce the creation of HVM domains without 
>    firmware, which is replaced by directly loading a kernel like it's done 
>    for PV guests. A new boot ABI based on the discussion in the thread "RFC: 
>    making the PVH 64bit ABI as stable" is also introduced, although it's not 
>    finished.
> 
> Some things that are missing from the new boot ABI:
> 
>  - Although it supports loading a ramdisk, there's still no defined way 
>    to pass this ramdisk to the guest. I'm thinking of using a 
>    HVMPARAM or simply setting a GP register to contain the address of the 
>    ramdisk. Ideally I would like to support loading more than one ramdisk.
> 
> Some patches contain comments after the SoB, which in general describe the 
> shortcomings of the implementation. The aim of those is to initiate 
> discussion about whether the approach taken is TRTTD.
> 
> I've only tested this on Intel hw, but I see no reason why it shouldn't work 
> on AMD. I've managed to boot FreeBSD up to the point where it should 
> jump into user-space (I just didn't have a VBD attached to the VM so it just 
> sits waiting for a valid disk). I have not tried to boot it any further 
> since I think that's fine for the purpose of this series. 
> 
> The series can also be found in the following git repo:
> 
> git://xenbits.xen.org/people/royger/xen.git branch hvm_without_dm_v1
> 
> And for the FreeBSD part:
> 
> git://xenbits.xen.org/people/royger/freebsd.git branch new_entry_point_v1
> 
> In case someone wants to give it a try, I've uploaded a FreeBSD kernel that 
> should work when booted into this mode:
> 
> https://people.freebsd.org/~royger/kernel_no_dm
> 
> The config file that I've used is:
> 
> <config>
> kernel="/path/to/kernel_no_dm"
> 
> builder="hvm"
> device_model_version="none"
> 
> memory=128
> vcpus=1
> name = "freebsd"
> </config>
> 
> Thanks, Roger.
> 


* Re: [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI
  2015-06-22 17:55 ` [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI Stefano Stabellini
@ 2015-06-22 18:05   ` Konrad Rzeszutek Wilk
  2015-06-23  8:14     ` Roger Pau Monné
  2015-06-23 10:55     ` Stefano Stabellini
  2015-06-23  7:14   ` Roger Pau Monné
  1 sibling, 2 replies; 43+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-06-22 18:05 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, andrew.cooper3,
	ian.jackson, xen-devel, boris.ostrovsky, Roger Pau Monne

On Mon, Jun 22, 2015 at 06:55:12PM +0100, Stefano Stabellini wrote:
> Hi Roger,
> 
> given that this patch series is actually using the Xen "hvm" builder, I
> take it that all the PVH code paths in Xen or the guest kernel are not
> actually used, correct? This is more like PV on HVM without QEMU, right?

Are you saying it should now be called 'HVM-diet'? Or 'HVMlite' instead
of PVH, since it is looking at this from the HVM perspective rather than PVH's?


> 
> Do you think this can work for Dom0 too?
> 
> Would that make all the PVH stuff in Xen and Linux effectively useless?

No. The AP bootup is still the same, and so are all the hypercalls, I think.
> 
> Thanks,
> 
> Stefano
> 
> On Mon, 22 Jun 2015, Roger Pau Monne wrote:
> > Before reading any further, keep in mind this is a VERY initial RFC 
> > prototype series. Many things are not finished, and those that are done 
> > make heavy use of duct tape to keep things in place.
> > 
> > Now that you are warned, this series is split in the following order:
> > 
> >  - Patches from 1 to 7 switch HVM domain construction to use the xc_dom_* 
> >    family of functions, like they are used to build PV domains. 
> >  - Patches from 8 to 13 introduce the creation of HVM domains without 
> >    firmware, which is replaced by directly loading a kernel like it's done 
> >    for PV guests. A new boot ABI based on the discussion in the thread "RFC: 
> >    making the PVH 64bit ABI as stable" is also introduced, although it's not 
> >    finished.
> > 
> > Some things that are missing from the new boot ABI:
> > 
> >  - Although it supports loading a ramdisk, there's still no defined way 
> >    to pass this ramdisk to the guest. I'm thinking of using a 
> >    HVMPARAM or simply setting a GP register to contain the address of the 
> >    ramdisk. Ideally I would like to support loading more than one ramdisk.
> > 
> > Some patches contain comments after the SoB, which in general describe the 
> > shortcomings of the implementation. The aim of those is to initiate 
> > discussion about whether the approach taken is TRTTD.
> > 
> > I've only tested this on Intel hw, but I see no reason why it shouldn't work 
> > on AMD. I've managed to boot FreeBSD up to the point where it should 
> > jump into user-space (I just didn't have a VBD attached to the VM so it just 
> > sits waiting for a valid disk). I have not tried to boot it any further 
> > since I think that's fine for the purpose of this series. 
> > 
> > The series can also be found in the following git repo:
> > 
> > git://xenbits.xen.org/people/royger/xen.git branch hvm_without_dm_v1
> > 
> > And for the FreeBSD part:
> > 
> > git://xenbits.xen.org/people/royger/freebsd.git branch new_entry_point_v1
> > 
> > In case someone wants to give it a try, I've uploaded a FreeBSD kernel that 
> > should work when booted into this mode:
> > 
> > https://people.freebsd.org/~royger/kernel_no_dm
> > 
> > The config file that I've used is:
> > 
> > <config>
> > kernel="/path/to/kernel_no_dm"
> > 
> > builder="hvm"
> > device_model_version="none"
> > 
> > memory=128
> > vcpus=1
> > name = "freebsd"
> > </config>
> > 
> > Thanks, Roger.
> > 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI
  2015-06-22 17:55 ` [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI Stefano Stabellini
  2015-06-22 18:05   ` Konrad Rzeszutek Wilk
@ 2015-06-23  7:14   ` Roger Pau Monné
  1 sibling, 0 replies; 43+ messages in thread
From: Roger Pau Monné @ 2015-06-23  7:14 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, andrew.cooper3,
	ian.jackson, xen-devel, boris.ostrovsky

Hello,

El 22/06/15 a les 19.55, Stefano Stabellini ha escrit:
> Hi Roger,
> 
> given that this patch series is actually using the Xen "hvm" builder, I
> take that all the PVH code paths in Xen or the guest kernel are not
> actually used, correct? This is more like PV on HVM without QEMU, right?

From a Xen POV this is not a PVH guest (it's an HVM guest), although I'm
using some code paths which are shared with PVH. In the guest (in this
case FreeBSD) I'm using the same paths as PVH, since the exposed
interface is very similar. We should probably aim at enabling the same
set of hypercalls that are enabled for PVH.

> Do you think this can work for Dom0 too?

I don't see why it couldn't be made to work.

> Would that make all the PVH stuff in Xen and Linux effectively useless?

No, I expect that some code paths inside of Xen will be shared between
PVH and this HVM-without-dm guest type.

Then from a guest POV the interface is quite similar, so most of the
code present in Linux and FreeBSD in order to run as a PVH guest can be
reused for this new guest mode. If you take a look at the FreeBSD patch,
the change is mostly confined to early boot, in order to set up the page
tables and jump into long mode; the rest is quite similar to PVH.

Forgot to mention, I've also tested it with hap=0 (so using shadow) and
it seems to work fine.

Roger.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI
  2015-06-22 18:05   ` Konrad Rzeszutek Wilk
@ 2015-06-23  8:14     ` Roger Pau Monné
  2015-06-23 10:55     ` Stefano Stabellini
  1 sibling, 0 replies; 43+ messages in thread
From: Roger Pau Monné @ 2015-06-23  8:14 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Stefano Stabellini
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, andrew.cooper3,
	ian.jackson, xen-devel, boris.ostrovsky

El 22/06/15 a les 20.05, Konrad Rzeszutek Wilk ha escrit:
> On Mon, Jun 22, 2015 at 06:55:12PM +0100, Stefano Stabellini wrote:
>> Hi Roger,
>>
>> given that this patch series is actually using the Xen "hvm" builder, I
>> take that all the PVH code paths in Xen or the guest kernel are not
>> actually used, correct? This is more like PV on HVM without QEMU, right?
> 
> Are you saying it should be called now 'HVM-diet' ? Or 'HVMlite' instead
> of PVH since it is looking at this from the HVM perspective instead of PVH?
> 
> 
>>
>> Do you think this can work for Dom0 too?
>>
>> Would that make all the PVH stuff in Xen and Linux effectively useless?
> 
> No. The AP bootup is still the same. So would all the hypercalls I think.

This is something that we have not discussed, but for HVM domains the
vCPU initialize hypercall only allows starting the vCPU in 32bit mode
with paging disabled (just like how we are starting the BSP). I'm fine
with that and I don't mind implementing a small trampoline for APs also,
but would like to hear opinions about it.

Roger.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 05/13] libxc: introduce a domain loader for HVM guest firmware
  2015-06-22 16:11 ` [PATCH RFC v1 05/13] libxc: introduce a domain loader for HVM guest firmware Roger Pau Monne
@ 2015-06-23  9:29   ` Jan Beulich
  2015-06-23  9:36     ` Roger Pau Monné
  2015-07-10 19:09   ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2015-06-23  9:29 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, xen-devel, Boris Ostrovsky

>>> On 22.06.15 at 18:11, <roger.pau@citrix.com> wrote:
> Introduce a very simple (and dummy) domain loader to be used to load the
> firmware (hvmloader) into HVM guests. Since hvmloader is just a 32bit ELF
> executable, the loader is fairly simple.

But hvmloader gets loaded fine today - why is this needed?

Jan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 09/13] elfnotes: introduce a new PHYS_ENTRY elfnote
  2015-06-22 16:11 ` [PATCH RFC v1 09/13] elfnotes: introduce a new PHYS_ENTRY elfnote Roger Pau Monne
@ 2015-06-23  9:35   ` Jan Beulich
  2015-06-23  9:40     ` Roger Pau Monné
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2015-06-23  9:35 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, xen-devel, Boris Ostrovsky

>>> On 22.06.15 at 18:11, <roger.pau@citrix.com> wrote:
> @@ -213,6 +214,9 @@ elf_errorstatus elf_xen_parse_note(struct elf_binary *elf,
>                  elf, note, sizeof(*parms->f_supported), i);
>          break;
>  
> +    case XEN_ELFNOTE_PHYS_ENTRY:
> +        parms->phys_entry = val;

I don't think I've seen this field having got added in an earlier patch,
and it's also not getting added here.

> --- a/xen/include/public/elfnote.h
> +++ b/xen/include/public/elfnote.h
> @@ -200,9 +200,18 @@
>  #define XEN_ELFNOTE_SUPPORTED_FEATURES 17
>  
>  /*
> + * Physical entry point into the kernel.
> + *
> + * 32bit entry point into the kernel. Xen will use this entry point
> + * in order to launch the guest kernel in 32bit protected mode
> + * with paging disabled.
> + */
> +#define XEN_ELFNOTE_PHYS_ENTRY 18

Perhaps XEN_ELFNOTE_PHYS_ENTRY32 or XEN_ELFNOTE_PHYS32_ENTRY
then?

Jan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 05/13] libxc: introduce a domain loader for HVM guest firmware
  2015-06-23  9:29   ` Jan Beulich
@ 2015-06-23  9:36     ` Roger Pau Monné
  0 siblings, 0 replies; 43+ messages in thread
From: Roger Pau Monné @ 2015-06-23  9:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, xen-devel, Boris Ostrovsky

El 23/06/15 a les 11.29, Jan Beulich ha escrit:
>>>> On 22.06.15 at 18:11, <roger.pau@citrix.com> wrote:
>> Introduce a very simple (and dummy) domain loader to be used to load the
>> firmware (hvmloader) into HVM guests. Since hvmloader is just a 32bit ELF
>> executable, the loader is fairly simple.
> 
> But hvmloader gets loaded fine today - why is this needed?

So that we can use the same set of xc_dom_* functions that we currently
use on x86 to build PV guests, instead of the crappy setup_guest stub
that we have in xc_hvm_build_x86.c.

As said in the cover letter, patches 1 to 7 aim at unifying the way HVM
and PV guests are created on x86.

Roger.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 09/13] elfnotes: introduce a new PHYS_ENTRY elfnote
  2015-06-23  9:35   ` Jan Beulich
@ 2015-06-23  9:40     ` Roger Pau Monné
  2015-06-23 10:01       ` Jan Beulich
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2015-06-23  9:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, xen-devel, Boris Ostrovsky

El 23/06/15 a les 11.35, Jan Beulich ha escrit:
>>>> On 22.06.15 at 18:11, <roger.pau@citrix.com> wrote:
>> @@ -213,6 +214,9 @@ elf_errorstatus elf_xen_parse_note(struct elf_binary *elf,
>>                  elf, note, sizeof(*parms->f_supported), i);
>>          break;
>>  
>> +    case XEN_ELFNOTE_PHYS_ENTRY:
>> +        parms->phys_entry = val;
> 
> I don't think I've seen this field having got added in an earlier patch,
> and it's also not getting added here.

Yes, it's added in patch 5, because it's also used by the HVM-generic
loader.

>> --- a/xen/include/public/elfnote.h
>> +++ b/xen/include/public/elfnote.h
>> @@ -200,9 +200,18 @@
>>  #define XEN_ELFNOTE_SUPPORTED_FEATURES 17
>>  
>>  /*
>> + * Physical entry point into the kernel.
>> + *
>> + * 32bit entry point into the kernel. Xen will use this entry point
>> + * in order to launch the guest kernel in 32bit protected mode
>> + * with paging disabled.
>> + */
>> +#define XEN_ELFNOTE_PHYS_ENTRY 18
> 
> Perhaps XEN_ELFNOTE_PHYS_ENTRY32 or XEN_ELFNOTE_PHYS32_ENTRY
> then?

That's fine, I don't mind changing it. Although AFAIK it's not possible
to have a 64bit physical entry point (paging-disabled).

Roger.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 11/13] xen/libxl: allow creating HVM guests without a device model
  2015-06-22 16:11 ` [PATCH RFC v1 11/13] xen/libxl: allow creating HVM guests without a device model Roger Pau Monne
@ 2015-06-23  9:41   ` Jan Beulich
  0 siblings, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2015-06-23  9:41 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, xen-devel, Boris Ostrovsky

>>> On 22.06.15 at 18:11, <roger.pau@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -343,7 +343,7 @@ u64 hvm_get_guest_tsc_adjust(struct vcpu *v)
>  void hvm_migrate_timers(struct vcpu *v)
>  {
>      /* PVH doesn't use rtc and emulated timers, it uses pvclock mechanism. */
> -    if ( is_pvh_vcpu(v) )
> +    if ( is_pvh_vcpu(v) || v->domain->arch.hvm_domain.no_emu )

Why would you need to keep the is-PVH check when you have ...

> @@ -1485,9 +1485,10 @@ int hvm_domain_initialise(struct domain *d)
>      else
>          d->arch.hvm_domain.io_bitmap = hvm_io_bitmap;
>  
> -    if ( is_pvh_domain(d) )
> +    if ( is_pvh_domain(d) || domcr_flags & DOMCRF_noemu )
>      {
>          register_portio_handler(d, 0, 0x10003, handle_pvh_io);
> +        d->arch.hvm_domain.no_emu = TRUE;

... this (which of course shouldn't use TRUE).

> @@ -2327,6 +2328,9 @@ int hvm_vcpu_initialise(struct vcpu *v)
>          return 0;
>      }
>  
> +    if ( d->arch.hvm_domain.no_emu )
> +        return 0;
> +
>      rc = setup_compat_arg_xlat(v); /* teardown: free_compat_arg_xlat() */

How are you going to get away without an argument translation
area?

Jan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 12/13] xen: allow 64bit HVM guests to use XENMEM_memory_map
  2015-06-22 16:11 ` [PATCH RFC v1 12/13] xen: allow 64bit HVM guests to use XENMEM_memory_map Roger Pau Monne
@ 2015-06-23  9:43   ` Jan Beulich
  0 siblings, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2015-06-23  9:43 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, xen-devel, Boris Ostrovsky

>>> On 22.06.15 at 18:11, <roger.pau@citrix.com> wrote:
> Enable this hypercall for 64bit HVM guests in order to fetch the e820 memory
> map in the absence of an emulated BIOS. The memory map is populated and
> notified to Xen in arch_setup_meminit_hvm.

I see no reason why this should be limited to 64-bit guests.

Jan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 09/13] elfnotes: introduce a new PHYS_ENTRY elfnote
  2015-06-23  9:40     ` Roger Pau Monné
@ 2015-06-23 10:01       ` Jan Beulich
  0 siblings, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2015-06-23 10:01 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, xen-devel, BorisOstrovsky

>>> On 23.06.15 at 11:40, <roger.pau@citrix.com> wrote:
> El 23/06/15 a les 11.35, Jan Beulich ha escrit:
>>>>> On 22.06.15 at 18:11, <roger.pau@citrix.com> wrote:
>>> @@ -213,6 +214,9 @@ elf_errorstatus elf_xen_parse_note(struct elf_binary 
> *elf,
>>>                  elf, note, sizeof(*parms->f_supported), i);
>>>          break;
>>>  
>>> +    case XEN_ELFNOTE_PHYS_ENTRY:
>>> +        parms->phys_entry = val;
>> 
>> I don't think I've seen this field having got added in an earlier patch,
>> and it's also not getting added here.
> 
> Yes, it's added in patch 5, because it's also used by the HVM-generic
> loader.

So indeed missed it (being in an otherwise tools only patch). Sorry.

>>> --- a/xen/include/public/elfnote.h
>>> +++ b/xen/include/public/elfnote.h
>>> @@ -200,9 +200,18 @@
>>>  #define XEN_ELFNOTE_SUPPORTED_FEATURES 17
>>>  
>>>  /*
>>> + * Physical entry point into the kernel.
>>> + *
>>> + * 32bit entry point into the kernel. Xen will use this entry point
>>> + * in order to launch the guest kernel in 32bit protected mode
>>> + * with paging disabled.
>>> + */
>>> +#define XEN_ELFNOTE_PHYS_ENTRY 18
>> 
>> Perhaps XEN_ELFNOTE_PHYS_ENTRY32 or XEN_ELFNOTE_PHYS32_ENTRY
>> then?
> 
> That's fine, I don't mind changing it. Although AFAIK it's not possible
> to have a 64bit physical entry point (paging-disabled).

That depends on the perspective you take: Under UEFI the kernel
would be entered in pseudo-physical (1:1 mapped virtual) mode.
And that's certainly a model to at least keep in mind.

Jan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI
  2015-06-22 18:05   ` Konrad Rzeszutek Wilk
  2015-06-23  8:14     ` Roger Pau Monné
@ 2015-06-23 10:55     ` Stefano Stabellini
  2015-06-23 12:50       ` Ian Campbell
  2015-06-24  9:47       ` Roger Pau Monné
  1 sibling, 2 replies; 43+ messages in thread
From: Stefano Stabellini @ 2015-06-23 10:55 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, Stefano Stabellini,
	andrew.cooper3, ian.jackson, xen-devel, boris.ostrovsky,
	Roger Pau Monne

On Mon, 22 Jun 2015, Konrad Rzeszutek Wilk wrote:
> On Mon, Jun 22, 2015 at 06:55:12PM +0100, Stefano Stabellini wrote:
> > Hi Roger,
> > 
> > given that this patch series is actually using the Xen "hvm" builder, I
> > take that all the PVH code paths in Xen or the guest kernel are not
> > actually used, correct? This is more like PV on HVM without QEMU, right?
> 
> Are you saying it should be called now 'HVM-diet' ? Or 'HVMlite' instead
> of PVH since it is looking at this from the HVM perspective instead of PVH?

HVMlite doesn't sound bad :-)

I don't know if we should introduce a new name for this, but I wanted to
point out that this is different from PVH from Xen point of view. In
particular most of the outstanding PVH work items (32bit, AMD) on the
hypervisor would be redundant if we get this to work, right? If that is
the case, then I think it is best we agree on which road we want to take
going forward as soon as possible to avoid duplicated or wasted efforts.

In fact it is not clear to me, if/when we get this to work, what the
remaining open issues to complete "HVMlite" would be. Roger?


> > Do you think this can work for Dom0 too?
> > 
> > Would that make all the PVH stuff in Xen and Linux effectively useless?
> 
> No. The AP bootup is still the same. So would all the hypercalls I think.
> > 
> > Thanks,
> > 
> > Stefano
> > 
> > On Mon, 22 Jun 2015, Roger Pau Monne wrote:
> > > Before reading any further, keep in mind this is a VERY initial RFC 
> > > prototype series. Many things are not finished, and those that are done 
> > > make heavy use of duct tape in order to keep things in place.
> > > 
> > > Now that you are warned, this series is split in the following order:
> > > 
> > >  - Patches from 1 to 7 switch HVM domain construction to use the xc_dom_* 
> > >    family of functions, like they are used to build PV domains. 
> > >  - Patches from 8 to 13 introduce the creation of HVM domains without 
> > >    firmware, which is replaced by directly loading a kernel like it's done 
> > >    for PV guests. A new boot ABI based on the discussion in the thread "RFC: 
> > >    making the PVH 64bit ABI as stable" is also introduced, although it's not 
> > >    finished.
> > > 
> > > Some things that are missing from the new boot ABI:
> > > 
> > >  - Although it supports loading a ramdisk, there's still no defined way 
> > >    to pass this ramdisk to the guest. I'm thinking of using an 
> > >    HVMPARAM or simply setting a GP register to contain the address of the 
> > >    ramdisk. Ideally I would like to support loading more than one ramdisk.
> > > 
> > > Some patches contain comments after the SoB, which in general describe the 
> > > shortcomings of the implementation. The aim of those is to initiate 
> > > discussion about whether the approach taken is TRTTD.
> > > 
> > > I've only tested this on Intel hw, but I see no reason why it shouldn't work 
> > > on AMD. I've managed to boot FreeBSD up to the point where it should 
> > > jump into user-space (I just didn't have a VBD attached to the VM so it just 
> > > sits waiting for a valid disk). I have not tried to boot it any further 
> > > since I think that's fine for the purpose of this series. 
> > > 
> > > The series can also be found in the following git repo:
> > > 
> > > git://xenbits.xen.org/people/royger/xen.git branch hvm_without_dm_v1
> > > 
> > > And for the FreeBSD part:
> > > 
> > > git://xenbits.xen.org/people/royger/freebsd.git branch new_entry_point_v1
> > > 
> > > In case someone wants to give it a try, I've uploaded a FreeBSD kernel that 
> > > should work when booted into this mode:
> > > 
> > > https://people.freebsd.org/~royger/kernel_no_dm
> > > 
> > > The config file that I've used is:
> > > 
> > > <config>
> > > kernel="/path/to/kernel_no_dm"
> > > 
> > > builder="hvm"
> > > device_model_version="none"
> > > 
> > > memory=128
> > > vcpus=1
> > > name = "freebsd"
> > > </config>
> > > 
> > > Thanks, Roger.
> > > 
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@lists.xen.org
> > > http://lists.xen.org/xen-devel
> > > 
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI
  2015-06-23 10:55     ` Stefano Stabellini
@ 2015-06-23 12:50       ` Ian Campbell
  2015-06-23 13:12         ` Stefano Stabellini
  2015-06-24  9:47       ` Roger Pau Monné
  1 sibling, 1 reply; 43+ messages in thread
From: Ian Campbell @ 2015-06-23 12:50 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: elena.ufimtseva, wei.liu2, andrew.cooper3, ian.jackson,
	xen-devel, boris.ostrovsky, Roger Pau Monne

On Tue, 2015-06-23 at 11:55 +0100, Stefano Stabellini wrote:
> I don't know if we should introduce a new name for this, but I wanted to
> point out that this is different from PVH from Xen point of view. In
> particular most of the outstanding PVH work items (32bit, AMD) on the
> hypervisor would be redundant if we get this to work, right? If that is
> the case, then I think it is best we agree on which road we want to take
> going forward as soon as possible to avoid duplicated or wasted efforts.

I think what you are saying is we either want to pursue this path _or_
PVH, but not both, and I would be inclined to agree, it seems to me like
duplication of both effort and functionality to do both.

Ian.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI
  2015-06-23 12:50       ` Ian Campbell
@ 2015-06-23 13:12         ` Stefano Stabellini
  2015-06-24  2:45           ` Boris Ostrovsky
  0 siblings, 1 reply; 43+ messages in thread
From: Stefano Stabellini @ 2015-06-23 13:12 UTC (permalink / raw)
  To: Ian Campbell
  Cc: elena.ufimtseva, wei.liu2, andrew.cooper3, Stefano Stabellini,
	ian.jackson, xen-devel, boris.ostrovsky, Roger Pau Monne

On Tue, 23 Jun 2015, Ian Campbell wrote:
> On Tue, 2015-06-23 at 11:55 +0100, Stefano Stabellini wrote:
> > I don't know if we should introduce a new name for this, but I wanted to
> > point out that this is different from PVH from Xen point of view. In
> > particular most of the outstanding PVH work items (32bit, AMD) on the
> > hypervisor would be redundant if we get this to work, right? If that is
> > the case, then I think it is best we agree on which road we want to take
> > going forward as soon as possible to avoid duplicated or wasted efforts.
> 
> I think what you are saying is we either want to pursue this path _or_
> PVH, but not both, and I would be inclined to agree, it seems to me like
> duplication of both effort and functionality to do both.

Right, especially given that they both seem to provide similar
functionalities.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI
  2015-06-23 13:12         ` Stefano Stabellini
@ 2015-06-24  2:45           ` Boris Ostrovsky
  0 siblings, 0 replies; 43+ messages in thread
From: Boris Ostrovsky @ 2015-06-24  2:45 UTC (permalink / raw)
  To: Stefano Stabellini, Ian Campbell
  Cc: elena.ufimtseva, wei.liu2, andrew.cooper3, ian.jackson,
	xen-devel, Roger Pau Monne

On 06/23/2015 09:12 AM, Stefano Stabellini wrote:
> On Tue, 23 Jun 2015, Ian Campbell wrote:
>> On Tue, 2015-06-23 at 11:55 +0100, Stefano Stabellini wrote:
>>> I don't know if we should introduce a new name for this, but I wanted to
>>> point out that this is different from PVH from Xen point of view. In
>>> particular most of the outstanding PVH work items (32bit, AMD) on the
>>> hypervisor would be redundant if we get this to work, right? If that is
>>> the case, then I think it is best we agree on which road we want to take
>>> going forward as soon as possible to avoid duplicated or wasted efforts.
>> I think what you are saying is we either want to pursue this path _or_
>> PVH, but not both, and I would be inclined to agree, it seems to me like
>> duplication of both effort and functionality to do both.
> Right, especially given that they both seem to provide similar
> functionalities.

Given that 32-bit support for the existing PVH model looks pretty simple, 
and the required AMD changes are also well understood (right?) and do not 
appear particularly invasive, I would argue for finishing those two while 
continuing to work on the unified boot model.

-boris

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI
  2015-06-23 10:55     ` Stefano Stabellini
  2015-06-23 12:50       ` Ian Campbell
@ 2015-06-24  9:47       ` Roger Pau Monné
  2015-06-24 10:05         ` Jan Beulich
  2015-06-24 13:26         ` Stefano Stabellini
  1 sibling, 2 replies; 43+ messages in thread
From: Roger Pau Monné @ 2015-06-24  9:47 UTC (permalink / raw)
  To: Stefano Stabellini, Konrad Rzeszutek Wilk
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, andrew.cooper3,
	ian.jackson, xen-devel, boris.ostrovsky

El 23/06/15 a les 12.55, Stefano Stabellini ha escrit:
> On Mon, 22 Jun 2015, Konrad Rzeszutek Wilk wrote:
>> On Mon, Jun 22, 2015 at 06:55:12PM +0100, Stefano Stabellini wrote:
>>> Hi Roger,
>>>
>>> given that this patch series is actually using the Xen "hvm" builder, I
>>> take that all the PVH code paths in Xen or the guest kernel are not
>>> actually used, correct? This is more like PV on HVM without QEMU, right?
>>
>> Are you saying it should be called now 'HVM-diet' ? Or 'HVMlite' instead
>> of PVH since it is looking at this from the HVM perspective instead of PVH?
> 
> HVMlite doesn't sound bad :-)
> 
> I don't know if we should introduce a new name for this, but I wanted to
> point out that this is different from PVH from Xen point of view. In
> particular most of the outstanding PVH work items (32bit, AMD) on the
> hypervisor would be redundant if we get this to work, right? If that is
> the case, then I think it is best we agree on which road we want to take
> going forward as soon as possible to avoid duplicated or wasted efforts.
> 
> In fact it is not clear to me, if/when we get this to work, what the
> remaining open issues to complete "HVMlite" would be. Roger?

The following items are already working out of the box with the current
patch set:

 - 32bit* and 64bit guest modes.
 - Intel support.
 - AMD support**.
 - HAP support.
 - Shadow support.

*  32bit CPU mode works, but I don't think 32bit hypercalls will work,
   see Jan's reply to patch 11. I plan to fix this in the next
   iteration.
** Untested.


What needs to be done (ordered by priority):

 - Clean up the patches, this patch series was done in less than a week.
 - Finish the boot ABI (this would also be needed for PVH anyway).
 - Convert the rest of xc_dom_*loaders in order to use the physical
   entry point when present, right now xc_dom_elfloader is the only one
   usable with HVMlite. This is quite trivial (see patch 10, it's a 4
   LOC change).
 - Dom0 support.
 - Migration.
 - PCI pass-through.

IMHO this is what we agreed to do with PVH: make it an HVM guest without
a device model and without the emulated devices inside of Xen. Sooner or
later we would need to make that change anyway in order to properly
integrate PVH into Xen, and we get a bunch of new features for free as
compared to PVH.

I don't think of this as "throw PVH out of the window and start
something completely new from scratch", we are going to reuse some of
the code paths used by PVH inside of Xen. From a guest POV the changes
needed to move from PVH into HVMlite are regarding the boot ABI only,
which we already agreed that should be changed anyway.

Roger.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI
  2015-06-24  9:47       ` Roger Pau Monné
@ 2015-06-24 10:05         ` Jan Beulich
  2015-06-24 10:14           ` Roger Pau Monné
  2015-06-24 13:26         ` Stefano Stabellini
  1 sibling, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2015-06-24 10:05 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, Stefano Stabellini,
	andrew.cooper3, ian.jackson, xen-devel, boris.ostrovsky

>>> On 24.06.15 at 11:47, <roger.pau@citrix.com> wrote:
> What needs to be done (ordered by priority):
> 
>  - Clean up the patches, this patch series was done in less than a week.
>  - Finish the boot ABI (this would also be needed for PVH anyway).
>  - Convert the rest of xc_dom_*loaders in order to use the physical
>    entry point when present, right now xc_dom_elfloader is the only one
>    usable with HVMlite. This is quite trivial (see patch 10, it's a 4
>    LOC change).
>  - Dom0 support.
>  - Migration.
>  - PCI pass-through.
> 
> IMHO this is what we agreed to do with PVH, make it an HVM guest without
> a device model and without the emulated devices inside of Xen. Sooner or
> later we would need to make that change anyway in order to properly
> integrate PVH into Xen, and we get a bunch of new features for free as
> compared to PVH.
> 
> I don't think of this as "throw PVH out of the window and start
> something completely new from scratch", we are going to reuse some of
> the code paths used by PVH inside of Xen. From a guest POV the changes
> needed to move from PVH into HVMlite are regarding the boot ABI only,
> which we already agreed that should be changed anyway.

I have to admit that I'm having a hard time making myself a clear
picture of what the intention now is, namely with feature freeze
being in about 2.5 weeks: If we assume that this series gets ready
> in time, should we drop Boris' 32-bit support patches? It would then
> be unfortunate if the series here didn't get ready.

> Otoh I don't think this and Boris' code conflict, and what we got in
> the tree PVH-wise is kind of a mess right now anyway, so adding just a
> few more bits to it (actually getting rid of some fixme-s, i.e.
> reducing messiness) wouldn't hurt. I'd be inclined to take the rest of
> Boris' series once ready, and if the series here gets ready too it could
then also go in. Which would then mean for someone (perhaps
after 4.6 was branched) to clean up any no longer necessary
PVH special cases, unifying things towards what we seem to now
call HVMlite.

Jan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI
  2015-06-24 10:05         ` Jan Beulich
@ 2015-06-24 10:14           ` Roger Pau Monné
  2015-06-24 11:52             ` Boris Ostrovsky
  0 siblings, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2015-06-24 10:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, Stefano Stabellini,
	andrew.cooper3, ian.jackson, xen-devel, boris.ostrovsky

El 24/06/15 a les 12.05, Jan Beulich ha escrit:
>>>> On 24.06.15 at 11:47, <roger.pau@citrix.com> wrote:
>> What needs to be done (ordered by priority):
>>
>>  - Clean up the patches, this patch series was done in less than a week.
>>  - Finish the boot ABI (this would also be needed for PVH anyway).
>>  - Convert the rest of xc_dom_*loaders in order to use the physical
>>    entry point when present, right now xc_dom_elfloader is the only one
>>    usable with HVMlite. This is quite trivial (see patch 10, it's a 4
>>    LOC change).
>>  - Dom0 support.
>>  - Migration.
>>  - PCI pass-through.
>>
>> IMHO this is what we agreed to do with PVH, make it an HVM guest without
>> a device model and without the emulated devices inside of Xen. Sooner or
>> later we would need to make that change anyway in order to properly
>> integrate PVH into Xen, and we get a bunch of new features for free as
>> compared to PVH.
>>
>> I don't think of this as "throw PVH out of the window and start
>> something completely new from scratch", we are going to reuse some of
>> the code paths used by PVH inside of Xen. From a guest POV the changes
>> needed to move from PVH into HVMlite are regarding the boot ABI only,
>> which we already agreed that should be changed anyway.
> 
> I have to admit that I'm having a hard time making myself a clear
> picture of what the intention now is, namely with feature freeze
> being in about 2.5 weeks: If we assume that this series gets ready
> in time, should we drop Boris' 32-bit support patches? It would then
> be unfortunate if the series here didn't get ready.

TBH I'm not going to make any promises of this being ready before the
4.6 feature freeze, not until I get some feedback from the tools
maintainers regarding the libxc changes to unify the PV and HVM domain
creation paths.

> Otoh I don't think this and Boris' code conflict, and what we got in
> the tree PVH-wise is kind of a mess right now anyway, so adding just a
> few more bits to it (actually getting rid of some fixme-s, i.e.
> reducing messiness) wouldn't hurt. I'd be inclined to take the rest of
> Boris' series once ready, and if the series here gets ready too it could
> then also go in. Which would then mean for someone (perhaps
> after 4.6 was branched) to clean up any no longer necessary
> PVH special cases, unifying things towards what we seem to now
> call HVMlite.

I'm not against merging the 32bit support series for PVH, but I'm
certainly not going to invest time in adding 32bit PVH entry points to
any OSes.

Roger.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI
  2015-06-24 10:14           ` Roger Pau Monné
@ 2015-06-24 11:52             ` Boris Ostrovsky
  2015-06-24 12:04               ` Roger Pau Monné
  2015-07-03 16:22               ` Tim Deegan
  0 siblings, 2 replies; 43+ messages in thread
From: Boris Ostrovsky @ 2015-06-24 11:52 UTC (permalink / raw)
  To: Roger Pau Monné, Jan Beulich
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, Stefano Stabellini,
	andrew.cooper3, ian.jackson, xen-devel

On 06/24/2015 06:14 AM, Roger Pau Monné wrote:
> El 24/06/15 a les 12.05, Jan Beulich ha escrit:
>>>>> On 24.06.15 at 11:47, <roger.pau@citrix.com> wrote:
>>> What needs to be done (ordered by priority):
>>>
>>>   - Clean up the patches, this patch series was done in less than a week.
>>>   - Finish the boot ABI (this would also be needed for PVH anyway).
>>>   - Convert the rest of xc_dom_*loaders in order to use the physical
>>>     entry point when present, right now xc_dom_elfloader is the only one
>>>     usable with HVMlite. This is quite trivial (see patch 10, it's a 4
>>>     LOC change).
>>>   - Dom0 support.
>>>   - Migration.
>>>   - PCI pass-through.
>>>
>>> IMHO this is what we agreed to do with PVH, make it an HVM guest without
>>> a device model and without the emulated devices inside of Xen. Sooner or
>>> later we would need to make that change anyway in order to properly
>>> integrate PVH into Xen, and we get a bunch of new features for free as
>>> compared to PVH.
>>>
>>> I don't think of this as "throw PVH out of the window and start
>>> something completely new from scratch", we are going to reuse some of
>>> the code paths used by PVH inside of Xen. From a guest POV the changes
>>> needed to move from PVH into HVMlite are regarding the boot ABI only,
>>> which we already agreed that should be changed anyway.
>> I have to admit that I'm having a hard time making myself a clear
>> picture of what the intention now is, namely with feature freeze
>> being in about 2.5 weeks: If we assume that this series gets ready
>> in time, should we drop Boris' 32-bit support patches? Would then
>> be unfortunate if the series here didn't get ready.
> TBH I'm not going to make any promises of this being ready before the
> 4.6 feature freeze, not until I get some feedback from the tools
> maintainers regarding the libxc changes to unify the PV and HVM domain
> creation paths.

FWIW, I gave this a quick spin on Monday and crashed the hypervisor on a 
NULL pointer right away in vapic code. Which, I assume, is not 
surprising since we are not supposed to be there in the first place.

I'll try it again later today (I was out yesterday), maybe I messed 
something up.

>
>> Otoh I don't think this and Boris' code conflict, and what we got in
>> the tree PVH-wise is kind of a mess right now anyway, so adding to
>> it just a few more bits (actually getting rid of some fixme-s, i.e.
>> reducing messiness), so I'd be inclined to take the rest of Boris'
>> series once ready, and if the series here gets ready too it could
>> then also go in. Which would then mean for someone (perhaps
>> after 4.6 was branched) to clean up any no longer necessary
>> PVH special cases, unifying things towards what we seem to now
>> call HVMlite.
> I'm not against merging the 32bit support series for PVH, but I'm
> certainly not going to invest time in adding 32bit PVH entry points to
> any OSes.

What about Tim's proposal 
(http://lists.xen.org/archives/html/xen-devel/2014-12/msg00596.html)? 
Can this work be made part of it? At least, make it extendable to that?

-boris

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI
  2015-06-24 11:52             ` Boris Ostrovsky
@ 2015-06-24 12:04               ` Roger Pau Monné
  2015-06-24 13:36                 ` Konrad Rzeszutek Wilk
  2015-07-03 16:22               ` Tim Deegan
  1 sibling, 1 reply; 43+ messages in thread
From: Roger Pau Monné @ 2015-06-24 12:04 UTC (permalink / raw)
  To: Boris Ostrovsky, Jan Beulich
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, Stefano Stabellini,
	andrew.cooper3, ian.jackson, xen-devel

El 24/06/15 a les 13.52, Boris Ostrovsky ha escrit:
> On 06/24/2015 06:14 AM, Roger Pau Monné wrote:
>> El 24/06/15 a les 12.05, Jan Beulich ha escrit:
>>>>>> On 24.06.15 at 11:47, <roger.pau@citrix.com> wrote:
>>>> What needs to be done (ordered by priority):
>>>>
>>>>   - Clean up the patches, this patch series was done in less than a
>>>> week.
>>>>   - Finish the boot ABI (this would also be needed for PVH anyway).
>>>>   - Convert the rest of xc_dom_*loaders in order to use the physical
>>>>     entry point when present, right now xc_dom_elfloader is the only
>>>> one
>>>>     usable with HVMlite. This is quite trivial (see patch 10, it's a 4
>>>>     LOC change).
>>>>   - Dom0 support.
>>>>   - Migration.
>>>>   - PCI pass-through.
>>>>
>>>> IMHO this is what we agreed to do with PVH, make it an HVM guest
>>>> without
>>>> a device model and without the emulated devices inside of Xen.
>>>> Sooner or
>>>> later we would need to make that change anyway in order to properly
>>>> integrate PVH into Xen, and we get a bunch of new features for free as
>>>> compared to PVH.
>>>>
>>>> I don't think of this as "throw PVH out of the window and start
>>>> something completely new from scratch", we are going to reuse some of
>>>> the code paths used by PVH inside of Xen. From a guest POV the changes
>>>> needed to move from PVH into HVMlite are regarding the boot ABI only,
>>>> which we already agreed that should be changed anyway.
>>> I have to admit that I'm having a hard time making myself a clear
>>> picture of what the intention now is, namely with feature freeze
>>> being in about 2.5 weeks: If we assume that this series gets ready
>>> in time, should we drop Boris' 32-bit support patches? Would then
>>> be unfortunate if the series here didn't get ready.
>> TBH I'm not going to make any promises of this being ready before the
>> 4.6 feature freeze, not until I get some feedback from the tools
>> maintainers regarding the libxc changes to unify the PV and HVM domain
>> creation paths.
> 
> FWIW, I gave this a quick spin on Monday and crashed the hypervisor on a
> NULL pointer right away in vapic code. Which, I assume, is not
> surprising since we are not supposed to be there in the first place.
> 
> I'll try it again later today (I was out yesterday), maybe I messed
> something up.

Yes, feature disabling is still not 100% done, I'm afraid. For example,
if your hw supports vAPIC it will be enabled anyway, which can then lead
to all kinds of trouble. As said, this is very initial and I've only
tested it on one Nehalem box which doesn't have vAPIC.

>>
>>> Otoh I don't think this and Boris' code conflict, and what we got in
>>> the tree PVH-wise is kind of a mess right now anyway, so adding to
>>> it just a few more bits (actually getting rid of some fixme-s, i.e.
>>> reducing messiness), so I'd be inclined to take the rest of Boris'
>>> series once ready, and if the series here gets ready too it could
>>> then also go in. Which would then mean for someone (perhaps
>>> after 4.6 was branched) to clean up any no longer necessary
>>> PVH special cases, unifying things towards what we seem to now
>>> call HVMlite.
>> I'm not against merging the 32bit support series for PVH, but I'm
>> certainly not going to invest time in adding 32bit PVH entry points to
>> any OSes.
> 
> What about Tim's proposal
> (http://lists.xen.org/archives/html/xen-devel/2014-12/msg00596.html)?
> Can this work be made part of it? At least, make it extendable to that?

Yes, the aim of this work is to address some of the points expressed in
that email, mainly merging PVH into HVM. But as we have already
discussed, the entry point of HVMlite or whatever we call it is going to
be different from the traditional PV/PVH entry point.

Roger.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI
  2015-06-24  9:47       ` Roger Pau Monné
  2015-06-24 10:05         ` Jan Beulich
@ 2015-06-24 13:26         ` Stefano Stabellini
  2015-06-24 16:30           ` Boris Ostrovsky
  1 sibling, 1 reply; 43+ messages in thread
From: Stefano Stabellini @ 2015-06-24 13:26 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, andrew.cooper3,
	Stefano Stabellini, ian.jackson, xen-devel, boris.ostrovsky

On Wed, 24 Jun 2015, Roger Pau Monné wrote:
> El 23/06/15 a les 12.55, Stefano Stabellini ha escrit:
> > On Mon, 22 Jun 2015, Konrad Rzeszutek Wilk wrote:
> >> On Mon, Jun 22, 2015 at 06:55:12PM +0100, Stefano Stabellini wrote:
> >>> Hi Roger,
> >>>
> >>> given that this patch series is actually using the Xen "hvm" builder, I
> >>> take that all the PVH code paths in Xen or the guest kernel are not
> >>> actually used, correct? This is more like PV on HVM without QEMU, right?
> >>
> >> Are you saying it should be called now 'HVM-diet' ? Or 'HVMlite' instead
> >> of PVH since it is looking at this from the HVM perspective instead of PVH?
> >
> > HVMlite doesn't sound bad :-)
> >
> > I don't know if we should introduce a new name for this, but I wanted to
> > point out that this is different from PVH from Xen point of view. In
> > particular most of the outstanding PVH work items (32bit, AMD) on the
> > hypervisor would be redudant if we get this to work, right? If that is
> > the case, then I think it is best we agree on which road we want to take
> > going forward as soon as possible to avoid duplicated or wasted efforts.
> >
> > In fact it is not clear to me if/when we get this to work, what would be
> > the remaining open issues to complete "HVMlite". Roger?
>
> The following items are already working out of the box with the current
> patch set:
>
>  - 32bit* and 64bit guest modes.
>  - Intel support.
>  - AMD support**.
>  - HAP support.
>  - Shadow support.
>
> *  32bit CPU mode works, but I don't think 32bit hypercalls will work,
>    see Jan's reply to patch 11. I plan to fix this in the next
>    iteration.
> ** Untested.
>
>
> What needs to be done (ordered by priority):
>
>  - Clean up the patches, this patch series was done in less than a week.
>  - Finish the boot ABI (this would also be needed for PVH anyway).
>  - Convert the rest of xc_dom_*loaders in order to use the physical
>    entry point when present, right now xc_dom_elfloader is the only one
>    usable with HVMlite. This is quite trivial (see patch 10, it's a 4
>    LOC change).
>  - Dom0 support.

What do you think that Dom0 support is going to entail?


>  - Migration.

This is just HVM migration minus saving/restoring the QEMU state file,
isn't it?


>  - PCI pass-through.

Do we really need PCI pass-through? I see HVMlite mostly useful for
Dom0, but also for higher security Linux and BSD guests. If a user wants
PCI pass-through, she can always use PV on HVM.

Or do you mean allowing PCI pass-through to HVM guests on an HVMlite
Dom0?


> IMHO this is what we agreed to do with PVH, make it an HVM guest without
> a device model and without the emulated devices inside of Xen. Sooner or
> later we would need to make that change anyway in order to properly
> integrate PVH into Xen, and we get a bunch of new features for free as
> compared to PVH.
>
> I don't think of this as "throw PVH out of the window and start
> something completely new from scratch", we are going to reuse some of
> the code paths used by PVH inside of Xen. From a guest POV the changes
> needed to move from PVH into HVMlite are regarding the boot ABI only,
> which we already agreed that should be changed anyway.

Sure, I just wanted to highlight that some work items could become
redundant so that we don't waste any time on those.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI
  2015-06-24 12:04               ` Roger Pau Monné
@ 2015-06-24 13:36                 ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 43+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-06-24 13:36 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, Stefano Stabellini,
	andrew.cooper3, ian.jackson, Jan Beulich, xen-devel,
	Boris Ostrovsky

On Wed, Jun 24, 2015 at 02:04:45PM +0200, Roger Pau Monné wrote:
> El 24/06/15 a les 13.52, Boris Ostrovsky ha escrit:
> > On 06/24/2015 06:14 AM, Roger Pau Monné wrote:
> >> El 24/06/15 a les 12.05, Jan Beulich ha escrit:
> >>>>>> On 24.06.15 at 11:47, <roger.pau@citrix.com> wrote:
> >>>> What needs to be done (ordered by priority):
> >>>>
> >>>>   - Clean up the patches, this patch series was done in less than a
> >>>> week.
> >>>>   - Finish the boot ABI (this would also be needed for PVH anyway).
> >>>>   - Convert the rest of xc_dom_*loaders in order to use the physical
> >>>>     entry point when present, right now xc_dom_elfloader is the only
> >>>> one
> >>>>     usable with HVMlite. This is quite trivial (see patch 10, it's a 4
> >>>>     LOC change).
> >>>>   - Dom0 support.
> >>>>   - Migration.
> >>>>   - PCI pass-through.
> >>>>
> >>>> IMHO this is what we agreed to do with PVH, make it an HVM guest
> >>>> without
> >>>> a device model and without the emulated devices inside of Xen.
> >>>> Sooner or
> >>>> later we would need to make that change anyway in order to properly
> >>>> integrate PVH into Xen, and we get a bunch of new features for free as
> >>>> compared to PVH.
> >>>>
> >>>> I don't think of this as "throw PVH out of the window and start
> >>>> something completely new from scratch", we are going to reuse some of
> >>>> the code paths used by PVH inside of Xen. From a guest POV the changes
> >>>> needed to move from PVH into HVMlite are regarding the boot ABI only,
> >>>> which we already agreed that should be changed anyway.
> >>> I have to admit that I'm having a hard time making myself a clear
> >>> picture of what the intention now is, namely with feature freeze
> >>> being in about 2.5 weeks: If we assume that this series gets ready
> >>> in time, should we drop Boris' 32-bit support patches? Would then
> >>> be unfortunate if the series here didn't get ready.
> >> TBH I'm not going to make any promises of this being ready before the
> >> 4.6 feature freeze, not until I get some feedback from the tools
> >> maintainers regarding the libxc changes to unify the PV and HVM domain
> >> creation paths.
> > 
> > FWIW, I gave this a quick spin on Monday and crashed the hypervisor on a
> > NULL pointer right away in vapic code. Which, I assume, is not
> > surprising since we are not supposed to be there in the first place.
> > 
> > I'll try it again later today (I was out yesterday), maybe I messed
> > something up.
> 
> Yes, feature disabling is still not 100% done I'm afraid. For example if
> your hw supports vAPIC it will be enabled anyway, which can then lead to
> all kinds of trouble. As said, this is very initial and I've only tested
> it on one Nehalem box which doesn't have vAPIC.
> 
> >>
> >>> Otoh I don't think this and Boris' code conflict, and what we got in
> >>> the tree PVH-wise is kind of a mess right now anyway, so adding to
> >>> it just a few more bits (actually getting rid of some fixme-s, i.e.
> >>> reducing messiness), so I'd be inclined to take the rest of Boris'
> >>> series once ready, and if the series here gets ready too it could
> >>> then also go in. Which would then mean for someone (perhaps
> >>> after 4.6 was branched) to clean up any no longer necessary
> >>> PVH special cases, unifying things towards what we seem to now
> >>> call HVMlite.
> >> I'm not against merging the 32bit support series for PVH, but I'm
> >> certainly not going to invest time in adding 32bit PVH entry points to
> >> any OSes.
> > 
> > What about Tim's proposal
> > (http://lists.xen.org/archives/html/xen-devel/2014-12/msg00596.html)?
> > Can this work be made part of it? At least, make it extendable to that?
> 
> Yes, the aim of this work is to address some of the points expressed in
> that email, mainly merge PVH into HVM. But as we have already spoken,
> the entry point of HVMlite or whatever we call it is going to be
> different from the traditional PV/PVH entry point.

Right. If you were to take a sharpie to the list that Tim had written
out, where would you put this work in it?

And to my eye - it looks as if the HVMlite boot entry and the old PVH
boot paths can co-exist for some time until they get merged together?

This would have the nice benefit of making issues easier to
troubleshoot, as you would have both the existing 'old-PVH' code path
and the new bootup path, and could figure out what is missing to make
the latter work (with the Linux kernel at least).

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI
  2015-06-24 13:26         ` Stefano Stabellini
@ 2015-06-24 16:30           ` Boris Ostrovsky
  2015-06-24 17:54             ` Stefano Stabellini
  0 siblings, 1 reply; 43+ messages in thread
From: Boris Ostrovsky @ 2015-06-24 16:30 UTC (permalink / raw)
  To: Stefano Stabellini, Roger Pau Monné
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, andrew.cooper3,
	ian.jackson, xen-devel

On 06/24/2015 09:26 AM, Stefano Stabellini wrote:
> On Wed, 24 Jun 2015, Roger Pau Monné wrote:
>
>>   - PCI pass-through.
> Do we really need PCI pass-through? I see HVMlite mostly useful for
> Dom0, but also for higher security Linux and BSD guests. If a user wants
> PCI pass-through, she can always use PV on HVM.

Why is this model not useful for a generic domU? I thought that it 
should eventually become a replacement for what we now have as PVH?

-boris


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI
  2015-06-24 16:30           ` Boris Ostrovsky
@ 2015-06-24 17:54             ` Stefano Stabellini
  0 siblings, 0 replies; 43+ messages in thread
From: Stefano Stabellini @ 2015-06-24 17:54 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, andrew.cooper3,
	Stefano Stabellini, ian.jackson, xen-devel, Roger Pau Monné

On Wed, 24 Jun 2015, Boris Ostrovsky wrote:
> On 06/24/2015 09:26 AM, Stefano Stabellini wrote:
> > On Wed, 24 Jun 2015, Roger Pau Monné wrote:
> >
> > >   - PCI pass-through.
> > Do we really need PCI pass-through? I see HVMlite mostly useful for
> > Dom0, but also for higher security Linux and BSD guests. If a user wants
> > PCI pass-through, she can always use PV on HVM.
>
> Why is this model not useful for a generic domU? I thought that it should
> eventually become a replacement for what we now have as PVH?

It is useful as a generic DomU because it provides a smaller attack
surface compared to PV on HVM guests. At the same time, using PCI
pass-through increases that attack surface again, decreasing the
value of HVMlite/PVH, at least in my view. But you are right that it
might be nice to have it for feature completeness, if nothing else.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 04/13] libxc: allow arch_setup_meminit to populate HVM domain memory
  2015-06-22 16:11 ` [PATCH RFC v1 04/13] libxc: allow arch_setup_meminit to populate HVM domain memory Roger Pau Monne
@ 2015-06-25 10:29   ` Wei Liu
  2015-06-25 10:33     ` Wei Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Wei Liu @ 2015-06-25 10:29 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, xen-devel,
	Boris Ostrovsky

On Mon, Jun 22, 2015 at 06:11:18PM +0200, Roger Pau Monne wrote:
> Introduce a new arch_setup_meminit_hvm that's going to be used to populate
> HVM domain memory. Rename arch_setup_meminit to arch_setup_meminit_hvm_pv

arch_setup_meminit_hvm/pv

> and introduce a stub arch_setup_meminit that will call the right meminit
> function depending on the contains type.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> ---
> I think that both arch_setup_meminit_hvm and arch_setup_meminit_pv could be
> unified into a single meminit function. I have however not looked into it,
> and just created arch_setup_meminit_hvm based on the code in
> xc_hvm_populate_memory.
> ---
>  tools/libxc/include/xc_dom.h |   8 +
>  tools/libxc/xc_dom_x86.c     | 365 +++++++++++++++++++++++++++++++++++++++++--
>  tools/libxl/libxl_dom.c      |   1 +
>  3 files changed, 362 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
> index f7b5f0f..051a7de 100644
> --- a/tools/libxc/include/xc_dom.h
> +++ b/tools/libxc/include/xc_dom.h
> @@ -186,6 +186,14 @@ struct xc_dom_image {
>          XC_DOM_PV_CONTAINER,
>          XC_DOM_HVM_CONTAINER,
>      } container_type;
> +
> +    /* HVM specific fields. */
> +    xen_pfn_t target_pages;
> +    xen_pfn_t mmio_start;
> +    xen_pfn_t mmio_size;
> +    xen_pfn_t lowmem_end;
> +    xen_pfn_t highmem_end;

These fields are in xc_hvm_build_args. I think you can use that
structure.

> +    int vga_hole;

bool vga_hole.

Wei.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 04/13] libxc: allow arch_setup_meminit to populate HVM domain memory
  2015-06-25 10:29   ` Wei Liu
@ 2015-06-25 10:33     ` Wei Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Wei Liu @ 2015-06-25 10:33 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, xen-devel,
	Boris Ostrovsky

On Thu, Jun 25, 2015 at 11:29:34AM +0100, Wei Liu wrote:
> On Mon, Jun 22, 2015 at 06:11:18PM +0200, Roger Pau Monne wrote:
> > Introduce a new arch_setup_meminit_hvm that's going to be used to populate
> > HVM domain memory. Rename arch_setup_meminit to arch_setup_meminit_hvm_pv
> 
> arch_setup_meminit_hvm/pv
> 
> > and introduce a stub arch_setup_meminit that will call the right meminit
> > function depending on the contains type.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > Cc: Ian Campbell <ian.campbell@citrix.com>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> > ---
> > I think that both arch_setup_meminit_hvm and arch_setup_meminit_pv could be
> > unified into a single meminit function. I have however not looked into it,
> > and just created arch_setup_meminit_hvm based on the code in
> > xc_hvm_populate_memory.
> > ---
> >  tools/libxc/include/xc_dom.h |   8 +
> >  tools/libxc/xc_dom_x86.c     | 365 +++++++++++++++++++++++++++++++++++++++++--
> >  tools/libxl/libxl_dom.c      |   1 +
> >  3 files changed, 362 insertions(+), 12 deletions(-)
> > 
> > diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
> > index f7b5f0f..051a7de 100644
> > --- a/tools/libxc/include/xc_dom.h
> > +++ b/tools/libxc/include/xc_dom.h
> > @@ -186,6 +186,14 @@ struct xc_dom_image {
> >          XC_DOM_PV_CONTAINER,
> >          XC_DOM_HVM_CONTAINER,
> >      } container_type;
> > +
> > +    /* HVM specific fields. */
> > +    xen_pfn_t target_pages;
> > +    xen_pfn_t mmio_start;
> > +    xen_pfn_t mmio_size;
> > +    xen_pfn_t lowmem_end;
> > +    xen_pfn_t highmem_end;
> 
> These fields are in xc_hvm_build_args. I think you can  use that
> structure.

Never mind this. I see you delete that structure later.

> 
> > +    int vga_hole;
> 
> bool vga_hole.
> 
> Wei.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 10/13] lib{xc/xl}: allow the creation of HVM domains with a kernel
  2015-06-22 16:11 ` [PATCH RFC v1 10/13] lib{xc/xl}: allow the creation of HVM domains with a kernel Roger Pau Monne
@ 2015-06-25 10:39   ` Wei Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Wei Liu @ 2015-06-25 10:39 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, xen-devel,
	Boris Ostrovsky

I think the subject line should be changed a bit.

We already support HVM direct kernel boot with QEMU. Now you're
implementing that without QEMU.

Wei.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI
  2015-06-24 11:52             ` Boris Ostrovsky
  2015-06-24 12:04               ` Roger Pau Monné
@ 2015-07-03 16:22               ` Tim Deegan
  1 sibling, 0 replies; 43+ messages in thread
From: Tim Deegan @ 2015-07-03 16:22 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: elena.ufimtseva, wei.liu2, Ian Campbell, Stefano Stabellini,
	andrew.cooper3, ian.jackson, Jan Beulich, xen-devel,
	Roger Pau Monné

At 07:52 -0400 on 24 Jun (1435132373), Boris Ostrovsky wrote:
> On 06/24/2015 06:14 AM, Roger Pau Monné wrote:
> > El 24/06/15 a les 12.05, Jan Beulich ha escrit:
> >>>>> On 24.06.15 at 11:47, <roger.pau@citrix.com> wrote:
> >>> What needs to be done (ordered by priority):
> >>>
> >>>   - Clean up the patches, this patch series was done in less than a week.
> >>>   - Finish the boot ABI (this would also be needed for PVH anyway).
> >>>   - Convert the rest of xc_dom_*loaders in order to use the physical
> >>>     entry point when present, right now xc_dom_elfloader is the only one
> >>>     usable with HVMlite. This is quite trivial (see patch 10, it's a 4
> >>>     LOC change).
> >>>   - Dom0 support.
> >>>   - Migration.
> >>>   - PCI pass-through.
> >>>
> >>> IMHO this is what we agreed to do with PVH, make it an HVM guest without
> >>> a device model and without the emulated devices inside of Xen. Sooner or
> >>> later we would need to make that change anyway in order to properly
> >>> integrate PVH into Xen, and we get a bunch of new features for free as
> >>> compared to PVH.
> >>>
> >>> I don't think of this as "throw PVH out of the window and start
> >>> something completely new from scratch", we are going to reuse some of
> >>> the code paths used by PVH inside of Xen. From a guest POV the changes
> >>> needed to move from PVH into HVMlite are regarding the boot ABI only,
> >>> which we already agreed that should be changed anyway.
> >> I have to admit that I'm having a hard time making myself a clear
> >> picture of what the intention now is, namely with feature freeze
> >> being in about 2.5 weeks: If we assume that this series gets ready
> >> in time, should we drop Boris' 32-bit support patches? Would then
> >> be unfortunate if the series here didn't get ready.
> > TBH I'm not going to make any promises of this being ready before the
> > 4.6 feature freeze, not until I get some feedback from the tools
> > maintainers regarding the libxc changes to unify the PV and HVM domain
> > creation paths.
> 
> FWIW, I gave this a quick spin on Monday and crashed the hypervisor on a 
> NULL pointer right away in vapic code. Which, I assume, is not 
> surprising since we are not supposed to be there in the first place.
> 
> I'll try it again later today (I was out yesterday), maybe I messed 
> something up.
> 
> >
> >> Otoh I don't think this and Boris' code conflict, and what we got in
> >> the tree PVH-wise is kind of a mess right now anyway, so adding to
> >> it just a few more bits (actually getting rid of some fixme-s, i.e.
> >> reducing messiness), so I'd be inclined to take the rest of Boris'
> >> series once ready, and if the series here gets ready too it could
> >> then also go in. Which would then mean for someone (perhaps
> >> after 4.6 was branched) to clean up any no longer necessary
> >> PVH special cases, unifying things towards what we seem to now
> >> call HVMlite.
> > I'm not against merging the 32bit support series for PVH, but I'm
> > certainly not going to invest time in adding 32bit PVH entry points to
> > any OSes.
> 
> What about Tim's proposal 
> (http://lists.xen.org/archives/html/xen-devel/2014-12/msg00596.html)? 
> Can this work be made part of it? At least, make it extendable to that?

FWIW, I think this new scheme (start at HVM and remove things) is
approaching the same end result from the other direction: Xen itself
no longer needs to have a concept of PVH, and the guests get to keep
their useful new interface.  As such, I heartily approve. :)

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC v1 05/13] libxc: introduce a domain loader for HVM guest firmware
  2015-06-22 16:11 ` [PATCH RFC v1 05/13] libxc: introduce a domain loader for HVM guest firmware Roger Pau Monne
  2015-06-23  9:29   ` Jan Beulich
@ 2015-07-10 19:09   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 43+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-07-10 19:09 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Elena Ufimtseva, Wei Liu, Ian Campbell, Stefano Stabellini,
	Andrew Cooper, Ian Jackson, Jan Beulich, xen-devel,
	Boris Ostrovsky

. snip..
> +static elf_negerrnoval check_elf_kernel(struct xc_dom_image *dom, bool verbose)

why don't we want verbose to always be true?

> +{
> +    if ( dom->kernel_blob == NULL )
> +    {
> +        if ( verbose )
> +            xc_dom_panic(dom->xch,
> +                         XC_INTERNAL_ERROR, "%s: no kernel image loaded",
> +                         __FUNCTION__);
> +        return -EINVAL;
> +    }
> +
> +    if ( !elf_is_elfbinary(dom->kernel_blob, dom->kernel_size) )
> +    {
> +        if ( verbose )
> +            xc_dom_panic(dom->xch,
> +                         XC_INVALID_KERNEL, "%s: kernel is not an ELF image",
> +                         __FUNCTION__);
> +        return -EINVAL;
> +    }
> +    return 0;
> +}
> +
> +static elf_negerrnoval xc_dom_probe_hvm_kernel(struct xc_dom_image *dom)
> +{
> +    struct elf_binary elf;
> +    int rc;
> +
> +    /* This loader is designed for HVM guest firmware. */
> +    if ( dom->container_type != XC_DOM_HVM_CONTAINER )
> +        return -EINVAL;
> +
> +    rc = check_elf_kernel(dom, 0);
> +    if ( rc != 0 )
> +        return rc;
> +
> +    rc = elf_init(&elf, dom->kernel_blob, dom->kernel_size);
> +    if ( rc != 0 )
> +        return rc;
> +
> +    /*
> +     * We need to check that there are no Xen ELFNOTES, or
> +     * else we might be trying to load a PV kernel.
> +     */
> +    elf_parse_binary(&elf);
> +    rc = elf_xen_parse(&elf, &dom->parms);
> +    if ( rc == 0 )
> +        return -EINVAL;
> +
> +    return 0;
> +}
> +
> +static elf_errorstatus xc_dom_parse_hvm_kernel(struct xc_dom_image *dom)
> +    /*
> +     * This function sometimes returns -1 for error and sometimes
> +     * an errno value.  ?!?!

The definition for this error type says:

include/xen/libelf.h:typedef int elf_negerrnoval; /* 0: ok; -EFOO: error */

so it should be -EXX, though that is at odds with the libxc API - which
is -1 for errors with errno carrying the error code. But that would
require the xc_dom_parse return value to be 'int', not elf_errorstatus.

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2015-07-10 19:09 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-22 16:11 [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI Roger Pau Monne
2015-06-22 16:11 ` [PATCH RFC v1 01/13] libxc: split x86 HVM setup_guest into smaller logical functions Roger Pau Monne
2015-06-22 16:11 ` [PATCH RFC v1 02/13] libxc: unify xc_dom_p2m_{host/guest} Roger Pau Monne
2015-06-22 16:11 ` [PATCH RFC v1 03/13] libxc: introduce the notion of a container type Roger Pau Monne
2015-06-22 16:11 ` [PATCH RFC v1 04/13] libxc: allow arch_setup_meminit to populate HVM domain memory Roger Pau Monne
2015-06-25 10:29   ` Wei Liu
2015-06-25 10:33     ` Wei Liu
2015-06-22 16:11 ` [PATCH RFC v1 05/13] libxc: introduce a domain loader for HVM guest firmware Roger Pau Monne
2015-06-23  9:29   ` Jan Beulich
2015-06-23  9:36     ` Roger Pau Monné
2015-07-10 19:09   ` Konrad Rzeszutek Wilk
2015-06-22 16:11 ` [PATCH RFC v1 06/13] libxc: introduce a xc_dom_arch for hvm-3.0-x86_32 guests Roger Pau Monne
2015-06-22 16:11 ` [PATCH RFC v1 07/13] libxl: switch HVM domain building to use xc_dom_* helpers Roger Pau Monne
2015-06-22 16:11 ` [PATCH RFC v1 08/13] libxc: remove dead x86 HVM code Roger Pau Monne
2015-06-22 16:11 ` [PATCH RFC v1 09/13] elfnotes: introduce a new PHYS_ENTRY elfnote Roger Pau Monne
2015-06-23  9:35   ` Jan Beulich
2015-06-23  9:40     ` Roger Pau Monné
2015-06-23 10:01       ` Jan Beulich
2015-06-22 16:11 ` [PATCH RFC v1 10/13] lib{xc/xl}: allow the creation of HVM domains with a kernel Roger Pau Monne
2015-06-25 10:39   ` Wei Liu
2015-06-22 16:11 ` [PATCH RFC v1 11/13] xen/libxl: allow creating HVM guests without a device model Roger Pau Monne
2015-06-23  9:41   ` Jan Beulich
2015-06-22 16:11 ` [PATCH RFC v1 12/13] xen: allow 64bit HVM guests to use XENMEM_memory_map Roger Pau Monne
2015-06-23  9:43   ` Jan Beulich
2015-06-22 16:11 ` [PATCH RFC v1 13/13] xenconsole: try to attach to PV console if HVM fails Roger Pau Monne
2015-06-22 17:55 ` [PATCH RFC v1 00/13] Introduce HVM without dm and new boot ABI Stefano Stabellini
2015-06-22 18:05   ` Konrad Rzeszutek Wilk
2015-06-23  8:14     ` Roger Pau Monné
2015-06-23 10:55     ` Stefano Stabellini
2015-06-23 12:50       ` Ian Campbell
2015-06-23 13:12         ` Stefano Stabellini
2015-06-24  2:45           ` Boris Ostrovsky
2015-06-24  9:47       ` Roger Pau Monné
2015-06-24 10:05         ` Jan Beulich
2015-06-24 10:14           ` Roger Pau Monné
2015-06-24 11:52             ` Boris Ostrovsky
2015-06-24 12:04               ` Roger Pau Monné
2015-06-24 13:36                 ` Konrad Rzeszutek Wilk
2015-07-03 16:22               ` Tim Deegan
2015-06-24 13:26         ` Stefano Stabellini
2015-06-24 16:30           ` Boris Ostrovsky
2015-06-24 17:54             ` Stefano Stabellini
2015-06-23  7:14   ` Roger Pau Monné
