All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/9] libxc: support building large pv-domains
@ 2015-11-12 13:43 Juergen Gross
  2015-11-12 13:43 ` [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator Juergen Gross
                   ` (8 more replies)
  0 siblings, 9 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 13:43 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, ian.jackson, stefano.stabellini,
	wei.liu2, roger.pau
  Cc: Juergen Gross

The Xen hypervisor supports starting a dom0 with large memory (up to
the TB range) by not including the initrd and p2m list in the initial
kernel mapping. Especially the p2m list can grow larger than the
available virtual space in the initial mapping.

The started kernel is indicating the support of each feature via
elf notes.

This series enables the domain builder in libxc to do the same as the
hypervisor. This enables starting of huge pv-domUs via xl.

Unmapped initrd is supported for 64 and 32 bit domains, omitting the
p2m from initial kernel mapping is possible for 64 bit domains only.

Tested with:
- 32 bit domU (kernel not supporting unmapped initrd)
- 32 bit domU (kernel supporting unmapped initrd)
- 1 GB 64 bit domU (kernel supporting unmapped initrd, not p2m)
- 1 GB 64 bit domU (kernel supporting unmapped initrd and p2m)
- 900GB 64 bit domU (kernel supporting unmapped initrd and p2m)
- HVM domU

Changes in V5:
- minor modifications in patch 8 as suggested by Wei Liu

Changes in v4:
- updated patch 1 as suggested by Wei Liu (comment and variable name)
- modify comment in patch 6 as suggested by Wei Liu
- rework of patch 8 reducing line count by nearly 100
- added some additional plausibility checks to patch 8 as suggested by
  Wei Liu
- renamed round_pg() to round_pg_up() in patch 9 as suggested by Wei Liu

Changes in v3:
- Rebased the complete series to new staging (hvm builder patches by
  Roger Pau Monne)
- Removed old patch 1 as it broke stubdom build
- Introduced new Patch 1 to make allocation of guest memory more clear
  regarding virtual/physical memory allocation (requested by Ian Campbell)
- Change name of flag to indicate support of unmapped initrd in patch 2
  (requested by Ian Campbell)
- Introduce new patches 3, 4, 5 ("rename domain builder count_pgtables to
  alloc_pgtables", "introduce domain builder architecture specific data",
  "use domain builder architecture private data for x86 pv domains") to
  assist later page table work
- don't fiddle with initrd virtual address in patch 6 (was patch 3 in v2),
  add explicit initrd parameters for start_info in struct xc_dom_image
  instead (requested by Ian Campbell)
- Introduce new patch 8 ("rework of domain builder's page table handler")
  to be able to use common helpers for unmapped p2m list (requested by
  Ian Campbell)
- use now known segment size in pages for p2m list in patch 9 (was patch
  5 in v2) instead of fiddling with segment end address (requested by
  Ian Campbell)
- split alloc_p2m_list() in patch 9 (was patch 5 in v2) to 32/64 bit
  variants (requested by Ian Campbell)

Changes in v2:
- patch 2 has been removed as it has been applied already
- introduced new patch 2 as suggested by Ian Campbell: add a flag
  indicating support of an unmapped initrd to the parsed elf data of
  the elf_dom_parms structure
- updated patch description of patch 3 as requested by Ian Campbell


Juergen Gross (9):
  libxc: reorganize domain builder guest memory allocator
  xen: add generic flag to elf_dom_parms indicating support of unmapped
    initrd
  libxc: rename domain builder count_pgtables to alloc_pgtables
  libxc: introduce domain builder architecture specific data
  libxc: use domain builder architecture private data for x86 pv domains
  libxc: create unmapped initrd in domain builder if supported
  libxc: split p2m allocation in domain builder from other magic pages
  libxc: rework of domain builder's page table handler
  libxc: create p2m list outside of kernel mapping if supported

 stubdom/grub/kexec.c               |  12 +-
 tools/libxc/include/xc_dom.h       |  34 +--
 tools/libxc/xc_dom_arm.c           |   6 +-
 tools/libxc/xc_dom_core.c          | 180 ++++++++----
 tools/libxc/xc_dom_x86.c           | 564 +++++++++++++++++++++----------------
 tools/libxc/xg_private.h           |  39 +--
 xen/arch/x86/domain_build.c        |   4 +-
 xen/common/libelf/libelf-dominfo.c |   3 +
 xen/include/xen/libelf.h           |   1 +
 9 files changed, 491 insertions(+), 352 deletions(-)

-- 
2.6.2

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator
  2015-11-12 13:43 [PATCH v5 0/9] libxc: support building large pv-domains Juergen Gross
@ 2015-11-12 13:43 ` Juergen Gross
  2015-11-12 13:48   ` Wei Liu
  2015-11-12 13:43 ` [PATCH v5 2/9] xen: add generic flag to elf_dom_parms indicating support of unmapped initrd Juergen Gross
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 13:43 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, ian.jackson, stefano.stabellini,
	wei.liu2, roger.pau
  Cc: Juergen Gross

Guest memory allocation in the domain builder of libxc is done via
virtual addresses only. In order to be able to support preallocated
areas not virtually mapped reorganize the memory allocator to keep
track of allocated pages globally and in allocated segments.

This requires an interface change of the allocate callback of the
domain builder which currently is using the last mapped virtual
address as a parameter. This is no problem as the only user of this
callback is stubdom/grub/kexec.c using this virtual address to
calculate the last used pfn.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 stubdom/grub/kexec.c         |   6 +--
 tools/libxc/include/xc_dom.h |  13 +++---
 tools/libxc/xc_dom_core.c    | 107 ++++++++++++++++++++++++++++---------------
 3 files changed, 79 insertions(+), 47 deletions(-)

diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c
index 0b2f4f3..2300318 100644
--- a/stubdom/grub/kexec.c
+++ b/stubdom/grub/kexec.c
@@ -100,9 +100,9 @@ static void do_exchange(struct xc_dom_image *dom, xen_pfn_t target_pfn, xen_pfn_
     dom->p2m_host[target_pfn] = source_mfn;
 }
 
-int kexec_allocate(struct xc_dom_image *dom, xen_vaddr_t up_to)
+int kexec_allocate(struct xc_dom_image *dom)
 {
-    unsigned long new_allocated = (up_to - dom->parms.virt_base) / PAGE_SIZE;
+    unsigned long new_allocated = dom->pfn_alloc_end - dom->rambase_pfn;
     unsigned long i;
 
     pages = realloc(pages, new_allocated * sizeof(*pages));
@@ -319,8 +319,6 @@ void kexec(void *kernel, long kernel_size, void *module, long module_size, char
 
     /* Make sure the bootstrap page table does not RW-map any of our current
      * page table frames */
-    kexec_allocate(dom, dom->virt_pgtab_end);
-
     if ( (rc = xc_dom_update_guest_p2m(dom))) {
         grub_printf("xc_dom_update_guest_p2m returned %d\n", rc);
         errnum = ERR_BOOT_FAILURE;
diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index ccc5926..68d6848 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -29,6 +29,7 @@ struct xc_dom_seg {
     xen_vaddr_t vstart;
     xen_vaddr_t vend;
     xen_pfn_t pfn;
+    xen_pfn_t pages;
 };
 
 struct xc_dom_mem {
@@ -90,6 +91,7 @@ struct xc_dom_image {
     xen_pfn_t xenstore_pfn;
     xen_pfn_t shared_info_pfn;
     xen_pfn_t bootstack_pfn;
+    xen_pfn_t pfn_alloc_end;
     xen_vaddr_t virt_alloc_end;
     xen_vaddr_t bsd_symtab_start;
 
@@ -175,8 +177,8 @@ struct xc_dom_image {
 
     /* kernel loader */
     struct xc_dom_arch *arch_hooks;
-    /* allocate up to virt_alloc_end */
-    int (*allocate) (struct xc_dom_image * dom, xen_vaddr_t up_to);
+    /* allocate up to pfn_alloc_end */
+    int (*allocate) (struct xc_dom_image * dom);
 
     /* Container type (HVM or PV). */
     enum {
@@ -360,14 +362,11 @@ static inline void *xc_dom_seg_to_ptr_pages(struct xc_dom_image *dom,
                                       struct xc_dom_seg *seg,
                                       xen_pfn_t *pages_out)
 {
-    xen_vaddr_t segsize = seg->vend - seg->vstart;
-    unsigned int page_size = XC_DOM_PAGE_SIZE(dom);
-    xen_pfn_t pages = (segsize + page_size - 1) / page_size;
     void *retval;
 
-    retval = xc_dom_pfn_to_ptr(dom, seg->pfn, pages);
+    retval = xc_dom_pfn_to_ptr(dom, seg->pfn, seg->pages);
 
-    *pages_out = retval ? pages : 0;
+    *pages_out = retval ? seg->pages : 0;
     return retval;
 }
 
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index fbe4464..a14d477 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -535,56 +535,75 @@ void *xc_dom_pfn_to_ptr_retcount(struct xc_dom_image *dom, xen_pfn_t pfn,
     return phys->ptr;
 }
 
-int xc_dom_alloc_segment(struct xc_dom_image *dom,
-                         struct xc_dom_seg *seg, char *name,
-                         xen_vaddr_t start, xen_vaddr_t size)
+static int xc_dom_chk_alloc_pages(struct xc_dom_image *dom, char *name,
+                                  xen_pfn_t pages)
 {
     unsigned int page_size = XC_DOM_PAGE_SIZE(dom);
-    xen_pfn_t pages = (size + page_size - 1) / page_size;
-    xen_pfn_t pfn;
-    void *ptr;
 
-    if ( start == 0 )
-        start = dom->virt_alloc_end;
+    if ( pages > dom->total_pages || /* multiple test avoids overflow probs */
+         dom->pfn_alloc_end - dom->rambase_pfn > dom->total_pages ||
+         pages > dom->total_pages - dom->pfn_alloc_end + dom->rambase_pfn )
+    {
+        xc_dom_panic(dom->xch, XC_OUT_OF_MEMORY,
+                     "%s: segment %s too large (0x%"PRIpfn" > "
+                     "0x%"PRIpfn" - 0x%"PRIpfn" pages)", __FUNCTION__, name,
+                     pages, dom->total_pages,
+                     dom->pfn_alloc_end - dom->rambase_pfn);
+        return -1;
+    }
+
+    dom->pfn_alloc_end += pages;
+    dom->virt_alloc_end += pages * page_size;
+
+    return 0;
+}
 
-    if ( start & (page_size - 1) )
+static int xc_dom_alloc_pad(struct xc_dom_image *dom, xen_vaddr_t boundary)
+{
+    unsigned int page_size = XC_DOM_PAGE_SIZE(dom);
+    xen_pfn_t pages;
+
+    if ( boundary & (page_size - 1) )
     {
         xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
-                     "%s: segment start isn't page aligned (0x%" PRIx64 ")",
-                     __FUNCTION__, start);
+                     "%s: segment boundary isn't page aligned (0x%" PRIx64 ")",
+                     __FUNCTION__, boundary);
         return -1;
     }
-    if ( start < dom->virt_alloc_end )
+    if ( boundary < dom->virt_alloc_end )
     {
         xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
-                     "%s: segment start too low (0x%" PRIx64 " < 0x%" PRIx64
-                     ")", __FUNCTION__, start, dom->virt_alloc_end);
+                     "%s: segment boundary too low (0x%" PRIx64 " < 0x%" PRIx64
+                     ")", __FUNCTION__, boundary, dom->virt_alloc_end);
         return -1;
     }
+    pages = (boundary - dom->virt_alloc_end) / page_size;
 
-    seg->vstart = start;
-    pfn = (seg->vstart - dom->parms.virt_base) / page_size;
-    seg->pfn = pfn + dom->rambase_pfn;
+    return xc_dom_chk_alloc_pages(dom, "padding", pages);
+}
 
-    if ( pages > dom->total_pages || /* multiple test avoids overflow probs */
-         pfn > dom->total_pages ||
-         pages > dom->total_pages - pfn)
-    {
-        xc_dom_panic(dom->xch, XC_OUT_OF_MEMORY,
-                     "%s: segment %s too large (0x%"PRIpfn" > "
-                     "0x%"PRIpfn" - 0x%"PRIpfn" pages)",
-                     __FUNCTION__, name, pages, dom->total_pages, pfn);
+int xc_dom_alloc_segment(struct xc_dom_image *dom,
+                         struct xc_dom_seg *seg, char *name,
+                         xen_vaddr_t start, xen_vaddr_t size)
+{
+    unsigned int page_size = XC_DOM_PAGE_SIZE(dom);
+    xen_pfn_t pages;
+    void *ptr;
+
+    if ( start && xc_dom_alloc_pad(dom, start) )
         return -1;
-    }
 
-    seg->vend = start + pages * page_size;
-    dom->virt_alloc_end = seg->vend;
-    if (dom->allocate)
-        dom->allocate(dom, dom->virt_alloc_end);
+    pages = (size + page_size - 1) / page_size;
+    start = dom->virt_alloc_end;
 
-    DOMPRINTF("%-20s:   %-12s : 0x%" PRIx64 " -> 0x%" PRIx64
-              "  (pfn 0x%" PRIpfn " + 0x%" PRIpfn " pages)",
-              __FUNCTION__, name, seg->vstart, seg->vend, seg->pfn, pages);
+    seg->pfn = dom->pfn_alloc_end;
+    seg->pages = pages;
+
+    if ( xc_dom_chk_alloc_pages(dom, name, pages) )
+        return -1;
+
+    if (dom->allocate)
+        dom->allocate(dom);
 
     /* map and clear pages */
     ptr = xc_dom_seg_to_ptr(dom, seg);
@@ -592,6 +611,13 @@ int xc_dom_alloc_segment(struct xc_dom_image *dom,
         return -1;
     memset(ptr, 0, pages * page_size);
 
+    seg->vstart = start;
+    seg->vend = dom->virt_alloc_end;
+
+    DOMPRINTF("%-20s:   %-12s : 0x%" PRIx64 " -> 0x%" PRIx64
+              "  (pfn 0x%" PRIpfn " + 0x%" PRIpfn " pages)",
+              __FUNCTION__, name, seg->vstart, seg->vend, seg->pfn, pages);
+
     return 0;
 }
 
@@ -602,10 +628,11 @@ int xc_dom_alloc_page(struct xc_dom_image *dom, char *name)
     xen_pfn_t pfn;
 
     start = dom->virt_alloc_end;
+    pfn = dom->pfn_alloc_end - dom->rambase_pfn;
     dom->virt_alloc_end += page_size;
-    if (dom->allocate)
-        dom->allocate(dom, dom->virt_alloc_end);
-    pfn = (start - dom->parms.virt_base) / page_size;
+    dom->pfn_alloc_end++;
+    if ( dom->allocate )
+        dom->allocate(dom);
 
     DOMPRINTF("%-20s:   %-12s : 0x%" PRIx64 " (pfn 0x%" PRIpfn ")",
               __FUNCTION__, name, start, pfn);
@@ -886,6 +913,7 @@ int xc_dom_parse_image(struct xc_dom_image *dom)
 int xc_dom_rambase_init(struct xc_dom_image *dom, uint64_t rambase)
 {
     dom->rambase_pfn = rambase >> XC_PAGE_SHIFT;
+    dom->pfn_alloc_end = dom->rambase_pfn;
     DOMPRINTF("%s: RAM starts at %"PRI_xen_pfn,
               __FUNCTION__, dom->rambase_pfn);
     return 0;
@@ -1013,6 +1041,8 @@ int xc_dom_build_image(struct xc_dom_image *dom)
         goto err;
     }
     page_size = XC_DOM_PAGE_SIZE(dom);
+    if ( dom->parms.virt_base != UNSET_ADDR )
+        dom->virt_alloc_end = dom->parms.virt_base;
 
     /* load kernel */
     if ( xc_dom_alloc_segment(dom, &dom->kernel_seg, "kernel",
@@ -1067,6 +1097,11 @@ int xc_dom_build_image(struct xc_dom_image *dom)
               __FUNCTION__, dom->virt_alloc_end);
     DOMPRINTF("%-20s: virt_pgtab_end : 0x%" PRIx64 "",
               __FUNCTION__, dom->virt_pgtab_end);
+
+    /* Make sure all memory mapped by initial page tables is available */
+    if ( dom->virt_pgtab_end && xc_dom_alloc_pad(dom, dom->virt_pgtab_end) )
+        return -1;
+
     return 0;
 
  err:
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v5 2/9] xen: add generic flag to elf_dom_parms indicating support of unmapped initrd
  2015-11-12 13:43 [PATCH v5 0/9] libxc: support building large pv-domains Juergen Gross
  2015-11-12 13:43 ` [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator Juergen Gross
@ 2015-11-12 13:43 ` Juergen Gross
  2015-11-12 13:43 ` [PATCH v5 3/9] libxc: rename domain builder count_pgtables to alloc_pgtables Juergen Gross
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 13:43 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, ian.jackson, stefano.stabellini,
	wei.liu2, roger.pau
  Cc: Juergen Gross, andrew.cooper3, keir, jbeulich

Support of an unmapped initrd is indicated by the kernel of the domain
via elf notes. In order not to have to use raw elf data in the tools
for support of an unmapped initrd add a flag to the parsed data area
to indicate the kernel supporting this feature.

Switch using this flag in the hypervisor domain builder.

Cc: andrew.cooper3@citrix.com
Cc: jbeulich@suse.com
Cc: keir@xen.org
Suggested-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/domain_build.c        | 4 ++--
 xen/common/libelf/libelf-dominfo.c | 3 +++
 xen/include/xen/libelf.h           | 1 +
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index c2ef87a..d02dc4b 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -353,7 +353,7 @@ static unsigned long __init compute_dom0_nr_pages(
 
         vstart = parms->virt_base;
         vend = round_pgup(parms->virt_kend);
-        if ( !parms->elf_notes[XEN_ELFNOTE_MOD_START_PFN].data.num )
+        if ( !parms->unmapped_initrd )
             vend += round_pgup(initrd_len);
         end = vend + nr_pages * sizeof_long;
 
@@ -1037,7 +1037,7 @@ int __init construct_dom0(
     v_start          = parms.virt_base;
     vkern_start      = parms.virt_kstart;
     vkern_end        = parms.virt_kend;
-    if ( parms.elf_notes[XEN_ELFNOTE_MOD_START_PFN].data.num )
+    if ( parms.unmapped_initrd )
     {
         vinitrd_start  = vinitrd_end = 0;
         vphysmap_start = round_pgup(vkern_end);
diff --git a/xen/common/libelf/libelf-dominfo.c b/xen/common/libelf/libelf-dominfo.c
index 3de1c23..c9243e4 100644
--- a/xen/common/libelf/libelf-dominfo.c
+++ b/xen/common/libelf/libelf-dominfo.c
@@ -190,6 +190,9 @@ elf_errorstatus elf_xen_parse_note(struct elf_binary *elf,
     case XEN_ELFNOTE_INIT_P2M:
         parms->p2m_base = val;
         break;
+    case XEN_ELFNOTE_MOD_START_PFN:
+        parms->unmapped_initrd = !!val;
+        break;
     case XEN_ELFNOTE_PADDR_OFFSET:
         parms->elf_paddr_offset = val;
         break;
diff --git a/xen/include/xen/libelf.h b/xen/include/xen/libelf.h
index de788c7..6da4cc0 100644
--- a/xen/include/xen/libelf.h
+++ b/xen/include/xen/libelf.h
@@ -423,6 +423,7 @@ struct elf_dom_parms {
     char loader[16];
     enum xen_pae_type pae;
     bool bsd_symtab;
+    bool unmapped_initrd;
     uint64_t virt_base;
     uint64_t virt_entry;
     uint64_t virt_hypercall;
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v5 3/9] libxc: rename domain builder count_pgtables to alloc_pgtables
  2015-11-12 13:43 [PATCH v5 0/9] libxc: support building large pv-domains Juergen Gross
  2015-11-12 13:43 ` [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator Juergen Gross
  2015-11-12 13:43 ` [PATCH v5 2/9] xen: add generic flag to elf_dom_parms indicating support of unmapped initrd Juergen Gross
@ 2015-11-12 13:43 ` Juergen Gross
  2015-11-12 13:43 ` [PATCH v5 4/9] libxc: introduce domain builder architecture specific data Juergen Gross
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 13:43 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, ian.jackson, stefano.stabellini,
	wei.liu2, roger.pau
  Cc: Juergen Gross

Rename the count_pgtables hook of the domain builder to alloc_pgtables
and do the allocation of the guest memory for page tables inside this
hook. This will remove the need for accessing the x86 specific pgtables
member of struct xc_dom_image in the generic domain builder code.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/include/xc_dom.h |  2 +-
 tools/libxc/xc_dom_arm.c     |  6 +++---
 tools/libxc/xc_dom_core.c    | 11 ++---------
 tools/libxc/xc_dom_x86.c     | 26 +++++++++++++++++---------
 4 files changed, 23 insertions(+), 22 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 68d6848..19d45f4 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -220,7 +220,7 @@ void xc_dom_register_loader(struct xc_dom_loader *loader);
 struct xc_dom_arch {
     /* pagetable setup */
     int (*alloc_magic_pages) (struct xc_dom_image * dom);
-    int (*count_pgtables) (struct xc_dom_image * dom);
+    int (*alloc_pgtables) (struct xc_dom_image * dom);
     int (*setup_pgtables) (struct xc_dom_image * dom);
 
     /* arch-specific data structs setup */
diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index 397eef0..d9a6371 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -49,7 +49,7 @@ const char *xc_domain_get_native_protocol(xc_interface *xch,
  * arm guests are hybrid and start off with paging disabled, therefore no
  * pagetables and nothing to do here.
  */
-static int count_pgtables_arm(struct xc_dom_image *dom)
+static int alloc_pgtables_arm(struct xc_dom_image *dom)
 {
     DOMPRINTF_CALLED(dom->xch);
     return 0;
@@ -534,7 +534,7 @@ static struct xc_dom_arch xc_dom_32 = {
     .page_shift = PAGE_SHIFT_ARM,
     .sizeof_pfn = 8,
     .alloc_magic_pages = alloc_magic_pages,
-    .count_pgtables = count_pgtables_arm,
+    .alloc_pgtables = alloc_pgtables_arm,
     .setup_pgtables = setup_pgtables_arm,
     .start_info = start_info_arm,
     .shared_info = shared_info_arm,
@@ -550,7 +550,7 @@ static struct xc_dom_arch xc_dom_64 = {
     .page_shift = PAGE_SHIFT_ARM,
     .sizeof_pfn = 8,
     .alloc_magic_pages = alloc_magic_pages,
-    .count_pgtables = count_pgtables_arm,
+    .alloc_pgtables = alloc_pgtables_arm,
     .setup_pgtables = setup_pgtables_arm,
     .start_info = start_info_arm,
     .shared_info = shared_info_arm,
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index a14d477..74de3c3 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -1082,15 +1082,8 @@ int xc_dom_build_image(struct xc_dom_image *dom)
     /* allocate other pages */
     if ( dom->arch_hooks->alloc_magic_pages(dom) != 0 )
         goto err;
-    if ( dom->arch_hooks->count_pgtables )
-    {
-        if ( dom->arch_hooks->count_pgtables(dom) != 0 )
-            goto err;
-        if ( (dom->pgtables > 0) &&
-             (xc_dom_alloc_segment(dom, &dom->pgtables_seg, "page tables", 0,
-                                   dom->pgtables * page_size) != 0) )
-                goto err;
-    }
+    if ( dom->arch_hooks->alloc_pgtables(dom) != 0 )
+        goto err;
     if ( dom->alloc_bootstack )
         dom->bootstack_pfn = xc_dom_alloc_page(dom, "boot stack");
     DOMPRINTF("%-20s: virt_alloc_end : 0x%" PRIx64 "",
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index ed43c28..ea32b00 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -126,7 +126,7 @@ nr_page_tables(struct xc_dom_image *dom,
     return tables;
 }
 
-static int count_pgtables(struct xc_dom_image *dom, int pae,
+static int alloc_pgtables(struct xc_dom_image *dom, int pae,
                           int l4_bits, int l3_bits, int l2_bits, int l1_bits)
 {
     int pages, extra_pages;
@@ -172,7 +172,9 @@ static int count_pgtables(struct xc_dom_image *dom, int pae,
             break;
     }
     dom->virt_pgtab_end = try_virt_end + 1;
-    return 0;
+
+    return xc_dom_alloc_segment(dom, &dom->pgtables_seg, "page tables", 0,
+                                dom->pgtables * PAGE_SIZE_X86);
 }
 
 /* ------------------------------------------------------------------------ */
@@ -182,9 +184,9 @@ static int count_pgtables(struct xc_dom_image *dom, int pae,
 #define L2_PROT (_PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_DIRTY|_PAGE_USER)
 #define L3_PROT (_PAGE_PRESENT)
 
-static int count_pgtables_x86_32_pae(struct xc_dom_image *dom)
+static int alloc_pgtables_x86_32_pae(struct xc_dom_image *dom)
 {
-    return count_pgtables(dom, 1, 0, 32,
+    return alloc_pgtables(dom, 1, 0, 32,
                           L3_PAGETABLE_SHIFT_PAE, L2_PAGETABLE_SHIFT_PAE);
 }
 
@@ -355,9 +357,9 @@ pfn_error:
 /* ------------------------------------------------------------------------ */
 /* x86_64 pagetables                                                        */
 
-static int count_pgtables_x86_64(struct xc_dom_image *dom)
+static int alloc_pgtables_x86_64(struct xc_dom_image *dom)
 {
-    return count_pgtables(dom, 0,
+    return alloc_pgtables(dom, 0,
                           L4_PAGETABLE_SHIFT_X86_64 + 9,
                           L4_PAGETABLE_SHIFT_X86_64,
                           L3_PAGETABLE_SHIFT_X86_64,
@@ -1620,6 +1622,12 @@ static int bootlate_pv(struct xc_dom_image *dom)
     return 0;
 }
 
+static int alloc_pgtables_hvm(struct xc_dom_image *dom)
+{
+    DOMPRINTF("%s: doing nothing", __func__);
+    return 0;
+}
+
 static int bootlate_hvm(struct xc_dom_image *dom)
 {
     DOMPRINTF("%s: doing nothing", __func__);
@@ -1643,7 +1651,7 @@ static struct xc_dom_arch xc_dom_32_pae = {
     .page_shift = PAGE_SHIFT_X86,
     .sizeof_pfn = 4,
     .alloc_magic_pages = alloc_magic_pages,
-    .count_pgtables = count_pgtables_x86_32_pae,
+    .alloc_pgtables = alloc_pgtables_x86_32_pae,
     .setup_pgtables = setup_pgtables_x86_32_pae,
     .start_info = start_info_x86_32,
     .shared_info = shared_info_x86_32,
@@ -1659,7 +1667,7 @@ static struct xc_dom_arch xc_dom_64 = {
     .page_shift = PAGE_SHIFT_X86,
     .sizeof_pfn = 8,
     .alloc_magic_pages = alloc_magic_pages,
-    .count_pgtables = count_pgtables_x86_64,
+    .alloc_pgtables = alloc_pgtables_x86_64,
     .setup_pgtables = setup_pgtables_x86_64,
     .start_info = start_info_x86_64,
     .shared_info = shared_info_x86_64,
@@ -1675,7 +1683,7 @@ static struct xc_dom_arch xc_hvm_32 = {
     .page_shift = PAGE_SHIFT_X86,
     .sizeof_pfn = 4,
     .alloc_magic_pages = alloc_magic_pages_hvm,
-    .count_pgtables = NULL,
+    .alloc_pgtables = alloc_pgtables_hvm,
     .setup_pgtables = NULL,
     .start_info = NULL,
     .shared_info = NULL,
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v5 4/9] libxc: introduce domain builder architecture specific data
  2015-11-12 13:43 [PATCH v5 0/9] libxc: support building large pv-domains Juergen Gross
                   ` (2 preceding siblings ...)
  2015-11-12 13:43 ` [PATCH v5 3/9] libxc: rename domain builder count_pgtables to alloc_pgtables Juergen Gross
@ 2015-11-12 13:43 ` Juergen Gross
  2015-11-12 13:43 ` [PATCH v5 5/9] libxc: use domain builder architecture private data for x86 pv domains Juergen Gross
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 13:43 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, ian.jackson, stefano.stabellini,
	wei.liu2, roger.pau
  Cc: Juergen Gross

Reorganize struct xc_dom_image to contain a pointer to domain builder
architecture specific private data. This will abstract the architecture
or domain type specific data from the general used data.

The new area is allocated as soon as the domain type is known.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 stubdom/grub/kexec.c         |  6 +++++-
 tools/libxc/include/xc_dom.h |  6 +++++-
 tools/libxc/xc_dom_core.c    | 27 +++++++++++++++++++--------
 3 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c
index 2300318..8fd9ff9 100644
--- a/stubdom/grub/kexec.c
+++ b/stubdom/grub/kexec.c
@@ -272,7 +272,11 @@ void kexec(void *kernel, long kernel_size, void *module, long module_size, char
 #endif
 
     /* equivalent of xc_dom_mem_init */
-    dom->arch_hooks = xc_dom_find_arch_hooks(xc_handle, dom->guest_type);
+    if (xc_dom_set_arch_hooks(dom)) {
+        grub_printf("xc_dom_set_arch_hooks failed\n");
+        errnum = ERR_EXEC_FORMAT;
+        goto out;
+    }
     dom->total_pages = start_info.nr_pages;
 
     /* equivalent of arch_setup_meminit */
diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 19d45f4..09f73cd 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -175,6 +175,9 @@ struct xc_dom_image {
     unsigned int *vnode_to_pnode;
     unsigned int nr_vnodes;
 
+    /* domain type/architecture specific data */
+    void *arch_private;
+
     /* kernel loader */
     struct xc_dom_arch *arch_hooks;
     /* allocate up to pfn_alloc_end */
@@ -237,6 +240,7 @@ struct xc_dom_arch {
     char *native_protocol;
     int page_shift;
     int sizeof_pfn;
+    int arch_private_size;
 
     struct xc_dom_arch *next;
 };
@@ -290,7 +294,7 @@ int xc_dom_devicetree_mem(struct xc_dom_image *dom, const void *mem,
                           size_t memsize);
 
 int xc_dom_parse_image(struct xc_dom_image *dom);
-struct xc_dom_arch *xc_dom_find_arch_hooks(xc_interface *xch, char *guest_type);
+int xc_dom_set_arch_hooks(struct xc_dom_image *dom);
 int xc_dom_build_image(struct xc_dom_image *dom);
 int xc_dom_update_guest_p2m(struct xc_dom_image *dom);
 
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index 74de3c3..3a31222 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -710,19 +710,30 @@ void xc_dom_register_arch_hooks(struct xc_dom_arch *hooks)
     first_hook = hooks;
 }
 
-struct xc_dom_arch *xc_dom_find_arch_hooks(xc_interface *xch, char *guest_type)
+int xc_dom_set_arch_hooks(struct xc_dom_image *dom)
 {
     struct xc_dom_arch *hooks = first_hook;
 
     while (  hooks != NULL )
     {
-        if ( !strcmp(hooks->guest_type, guest_type))
-            return hooks;
+        if ( !strcmp(hooks->guest_type, dom->guest_type) )
+        {
+            if ( hooks->arch_private_size )
+            {
+                dom->arch_private = malloc(hooks->arch_private_size);
+                if ( dom->arch_private == NULL )
+                    return -1;
+                memset(dom->arch_private, 0, hooks->arch_private_size);
+                dom->alloc_malloc += hooks->arch_private_size;
+            }
+            dom->arch_hooks = hooks;
+            return 0;
+        }
         hooks = hooks->next;
     }
-    xc_dom_panic(xch, XC_INVALID_KERNEL,
-                 "%s: not found (type %s)", __FUNCTION__, guest_type);
-    return NULL;
+    xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
+                 "%s: not found (type %s)", __FUNCTION__, dom->guest_type);
+    return -1;
 }
 
 /* ------------------------------------------------------------------------ */
@@ -734,6 +745,7 @@ void xc_dom_release(struct xc_dom_image *dom)
     if ( dom->phys_pages )
         xc_dom_unmap_all(dom);
     xc_dom_free_all(dom);
+    free(dom->arch_private);
     free(dom);
 }
 
@@ -924,8 +936,7 @@ int xc_dom_mem_init(struct xc_dom_image *dom, unsigned int mem_mb)
     unsigned int page_shift;
     xen_pfn_t nr_pages;
 
-    dom->arch_hooks = xc_dom_find_arch_hooks(dom->xch, dom->guest_type);
-    if ( dom->arch_hooks == NULL )
+    if ( xc_dom_set_arch_hooks(dom) )
     {
         xc_dom_panic(dom->xch, XC_INTERNAL_ERROR, "%s: arch hooks not set",
                      __FUNCTION__);
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v5 5/9] libxc: use domain builder architecture private data for x86 pv domains
  2015-11-12 13:43 [PATCH v5 0/9] libxc: support building large pv-domains Juergen Gross
                   ` (3 preceding siblings ...)
  2015-11-12 13:43 ` [PATCH v5 4/9] libxc: introduce domain builder architecture specific data Juergen Gross
@ 2015-11-12 13:43 ` Juergen Gross
  2015-11-12 13:43 ` [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported Juergen Gross
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 13:43 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, ian.jackson, stefano.stabellini,
	wei.liu2, roger.pau
  Cc: Juergen Gross

Move some data private to the x86 domain builder to the private data
section. Remove extra_pages as they are used nowhere.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/include/xc_dom.h |  8 --------
 tools/libxc/xc_dom_x86.c     | 48 +++++++++++++++++++++++++++++---------------
 2 files changed, 32 insertions(+), 24 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 09f73cd..0ba9821 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -94,15 +94,7 @@ struct xc_dom_image {
     xen_pfn_t pfn_alloc_end;
     xen_vaddr_t virt_alloc_end;
     xen_vaddr_t bsd_symtab_start;
-
-    /* initial page tables */
-    unsigned int pgtables;
-    unsigned int pg_l4;
-    unsigned int pg_l3;
-    unsigned int pg_l2;
-    unsigned int pg_l1;
     unsigned int alloc_bootstack;
-    unsigned int extra_pages;
     xen_vaddr_t virt_pgtab_end;
 
     /* other state info */
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index ea32b00..aba50df 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -69,6 +69,15 @@
 #define round_down(addr, mask)   ((addr) & ~(mask))
 #define round_up(addr, mask)     ((addr) | (mask))
 
+struct xc_dom_image_x86 {
+    /* initial page tables */
+    unsigned int pgtables;
+    unsigned int pg_l4;
+    unsigned int pg_l3;
+    unsigned int pg_l2;
+    unsigned int pg_l1;
+};
+
 /* get guest IO ABI protocol */
 const char *xc_domain_get_native_protocol(xc_interface *xch,
                                           uint32_t domid)
@@ -132,9 +141,9 @@ static int alloc_pgtables(struct xc_dom_image *dom, int pae,
     int pages, extra_pages;
     xen_vaddr_t try_virt_end;
     xen_pfn_t try_pfn_end;
+    struct xc_dom_image_x86 *domx86 = dom->arch_private;
 
     extra_pages = dom->alloc_bootstack ? 1 : 0;
-    extra_pages += dom->extra_pages;
     extra_pages += 128; /* 512kB padding */
     pages = extra_pages;
     for ( ; ; )
@@ -152,29 +161,30 @@ static int alloc_pgtables(struct xc_dom_image *dom, int pae,
             return -ENOMEM;
         }
 
-        dom->pg_l4 =
+        domx86->pg_l4 =
             nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l4_bits);
-        dom->pg_l3 =
+        domx86->pg_l3 =
             nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l3_bits);
-        dom->pg_l2 =
+        domx86->pg_l2 =
             nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l2_bits);
-        dom->pg_l1 =
+        domx86->pg_l1 =
             nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l1_bits);
         if (pae && try_virt_end < 0xc0000000)
         {
             DOMPRINTF("%s: PAE: extra l2 page table for l3#3",
                       __FUNCTION__);
-            dom->pg_l2++;
+            domx86->pg_l2++;
         }
-        dom->pgtables = dom->pg_l4 + dom->pg_l3 + dom->pg_l2 + dom->pg_l1;
-        pages = dom->pgtables + extra_pages;
+        domx86->pgtables = domx86->pg_l4 + domx86->pg_l3 +
+                           domx86->pg_l2 + domx86->pg_l1;
+        pages = domx86->pgtables + extra_pages;
         if ( dom->virt_alloc_end + pages * PAGE_SIZE_X86 <= try_virt_end + 1 )
             break;
     }
     dom->virt_pgtab_end = try_virt_end + 1;
 
     return xc_dom_alloc_segment(dom, &dom->pgtables_seg, "page tables", 0,
-                                dom->pgtables * PAGE_SIZE_X86);
+                                domx86->pgtables * PAGE_SIZE_X86);
 }
 
 /* ------------------------------------------------------------------------ */
@@ -262,9 +272,10 @@ static xen_pfn_t move_l3_below_4G(struct xc_dom_image *dom,
 
 static int setup_pgtables_x86_32_pae(struct xc_dom_image *dom)
 {
+    struct xc_dom_image_x86 *domx86 = dom->arch_private;
     xen_pfn_t l3pfn = dom->pgtables_seg.pfn;
-    xen_pfn_t l2pfn = l3pfn + dom->pg_l3;
-    xen_pfn_t l1pfn = l2pfn + dom->pg_l2;
+    xen_pfn_t l2pfn = l3pfn + domx86->pg_l3;
+    xen_pfn_t l1pfn = l2pfn + domx86->pg_l2;
     l3_pgentry_64_t *l3tab;
     l2_pgentry_64_t *l2tab = NULL;
     l1_pgentry_64_t *l1tab = NULL;
@@ -373,10 +384,11 @@ static int alloc_pgtables_x86_64(struct xc_dom_image *dom)
 
 static int setup_pgtables_x86_64(struct xc_dom_image *dom)
 {
+    struct xc_dom_image_x86 *domx86 = dom->arch_private;
     xen_pfn_t l4pfn = dom->pgtables_seg.pfn;
-    xen_pfn_t l3pfn = l4pfn + dom->pg_l4;
-    xen_pfn_t l2pfn = l3pfn + dom->pg_l3;
-    xen_pfn_t l1pfn = l2pfn + dom->pg_l2;
+    xen_pfn_t l3pfn = l4pfn + domx86->pg_l4;
+    xen_pfn_t l2pfn = l3pfn + domx86->pg_l3;
+    xen_pfn_t l1pfn = l2pfn + domx86->pg_l2;
     l4_pgentry_64_t *l4tab = xc_dom_pfn_to_ptr(dom, l4pfn, 1);
     l3_pgentry_64_t *l3tab = NULL;
     l2_pgentry_64_t *l2tab = NULL;
@@ -619,6 +631,7 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
 
 static int start_info_x86_32(struct xc_dom_image *dom)
 {
+    struct xc_dom_image_x86 *domx86 = dom->arch_private;
     start_info_x86_32_t *start_info =
         xc_dom_pfn_to_ptr(dom, dom->start_info_pfn, 1);
     xen_pfn_t shinfo =
@@ -639,7 +652,7 @@ static int start_info_x86_32(struct xc_dom_image *dom)
     start_info->nr_pages = dom->total_pages;
     start_info->shared_info = shinfo << PAGE_SHIFT_X86;
     start_info->pt_base = dom->pgtables_seg.vstart;
-    start_info->nr_pt_frames = dom->pgtables;
+    start_info->nr_pt_frames = domx86->pgtables;
     start_info->mfn_list = dom->p2m_seg.vstart;
 
     start_info->flags = dom->flags;
@@ -665,6 +678,7 @@ static int start_info_x86_32(struct xc_dom_image *dom)
 
 static int start_info_x86_64(struct xc_dom_image *dom)
 {
+    struct xc_dom_image_x86 *domx86 = dom->arch_private;
     start_info_x86_64_t *start_info =
         xc_dom_pfn_to_ptr(dom, dom->start_info_pfn, 1);
     xen_pfn_t shinfo =
@@ -685,7 +699,7 @@ static int start_info_x86_64(struct xc_dom_image *dom)
     start_info->nr_pages = dom->total_pages;
     start_info->shared_info = shinfo << PAGE_SHIFT_X86;
     start_info->pt_base = dom->pgtables_seg.vstart;
-    start_info->nr_pt_frames = dom->pgtables;
+    start_info->nr_pt_frames = domx86->pgtables;
     start_info->mfn_list = dom->p2m_seg.vstart;
 
     start_info->flags = dom->flags;
@@ -1650,6 +1664,7 @@ static struct xc_dom_arch xc_dom_32_pae = {
     .native_protocol = XEN_IO_PROTO_ABI_X86_32,
     .page_shift = PAGE_SHIFT_X86,
     .sizeof_pfn = 4,
+    .arch_private_size = sizeof(struct xc_dom_image_x86),
     .alloc_magic_pages = alloc_magic_pages,
     .alloc_pgtables = alloc_pgtables_x86_32_pae,
     .setup_pgtables = setup_pgtables_x86_32_pae,
@@ -1666,6 +1681,7 @@ static struct xc_dom_arch xc_dom_64 = {
     .native_protocol = XEN_IO_PROTO_ABI_X86_64,
     .page_shift = PAGE_SHIFT_X86,
     .sizeof_pfn = 8,
+    .arch_private_size = sizeof(struct xc_dom_image_x86),
     .alloc_magic_pages = alloc_magic_pages,
     .alloc_pgtables = alloc_pgtables_x86_64,
     .setup_pgtables = setup_pgtables_x86_64,
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-12 13:43 [PATCH v5 0/9] libxc: support building large pv-domains Juergen Gross
                   ` (4 preceding siblings ...)
  2015-11-12 13:43 ` [PATCH v5 5/9] libxc: use domain builder architecture private data for x86 pv domains Juergen Gross
@ 2015-11-12 13:43 ` Juergen Gross
  2015-11-25 16:12   ` Boris Ostrovsky
  2015-11-12 13:43 ` [PATCH v5 7/9] libxc: split p2m allocation in domain builder from other magic pages Juergen Gross
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 13:43 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, ian.jackson, stefano.stabellini,
	wei.liu2, roger.pau
  Cc: Juergen Gross

In case the kernel of a new pv-domU indicates it is supporting an
unmapped initrd, don't waste precious virtual space for the initrd,
but allocate only guest physical memory for it.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/include/xc_dom.h |  5 +++++
 tools/libxc/xc_dom_core.c    | 19 +++++++++++++++++--
 tools/libxc/xc_dom_x86.c     |  8 ++++----
 3 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 0ba9821..2358012 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -94,6 +94,11 @@ struct xc_dom_image {
     xen_pfn_t pfn_alloc_end;
     xen_vaddr_t virt_alloc_end;
     xen_vaddr_t bsd_symtab_start;
+
+    /* initrd parameters as specified in start_info page */
+    unsigned long initrd_start;
+    unsigned long initrd_len;
+
     unsigned int alloc_bootstack;
     xen_vaddr_t virt_pgtab_end;
 
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index 3a31222..7b48b1f 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -1041,6 +1041,7 @@ static int xc_dom_build_ramdisk(struct xc_dom_image *dom)
 int xc_dom_build_image(struct xc_dom_image *dom)
 {
     unsigned int page_size;
+    bool unmapped_initrd;
 
     DOMPRINTF_CALLED(dom->xch);
 
@@ -1064,11 +1065,15 @@ int xc_dom_build_image(struct xc_dom_image *dom)
     if ( dom->kernel_loader->loader(dom) != 0 )
         goto err;
 
-    /* load ramdisk */
-    if ( dom->ramdisk_blob )
+    /* Don't load ramdisk now if no initial mapping required. */
+    unmapped_initrd = dom->parms.unmapped_initrd && !dom->ramdisk_seg.vstart;
+
+    if ( dom->ramdisk_blob && !unmapped_initrd )
     {
         if ( xc_dom_build_ramdisk(dom) != 0 )
             goto err;
+        dom->initrd_start = dom->ramdisk_seg.vstart;
+        dom->initrd_len = dom->ramdisk_seg.vend - dom->ramdisk_seg.vstart;
     }
 
     /* load devicetree */
@@ -1106,6 +1111,16 @@ int xc_dom_build_image(struct xc_dom_image *dom)
     if ( dom->virt_pgtab_end && xc_dom_alloc_pad(dom, dom->virt_pgtab_end) )
         return -1;
 
+    /* Load ramdisk if no initial mapping required. */
+    if ( dom->ramdisk_blob && unmapped_initrd )
+    {
+        if ( xc_dom_build_ramdisk(dom) != 0 )
+            goto err;
+        dom->flags |= SIF_MOD_START_PFN;
+        dom->initrd_start = dom->ramdisk_seg.pfn;
+        dom->initrd_len = page_size * dom->ramdisk_seg.pages;
+    }
+
     return 0;
 
  err:
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index aba50df..3c6bb9c 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -663,8 +663,8 @@ static int start_info_x86_32(struct xc_dom_image *dom)
 
     if ( dom->ramdisk_blob )
     {
-        start_info->mod_start = dom->ramdisk_seg.vstart;
-        start_info->mod_len = dom->ramdisk_seg.vend - dom->ramdisk_seg.vstart;
+        start_info->mod_start = dom->initrd_start;
+        start_info->mod_len = dom->initrd_len;
     }
 
     if ( dom->cmdline )
@@ -710,8 +710,8 @@ static int start_info_x86_64(struct xc_dom_image *dom)
 
     if ( dom->ramdisk_blob )
     {
-        start_info->mod_start = dom->ramdisk_seg.vstart;
-        start_info->mod_len = dom->ramdisk_seg.vend - dom->ramdisk_seg.vstart;
+        start_info->mod_start = dom->initrd_start;
+        start_info->mod_len = dom->initrd_len;
     }
 
     if ( dom->cmdline )
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v5 7/9] libxc: split p2m allocation in domain builder from other magic pages
  2015-11-12 13:43 [PATCH v5 0/9] libxc: support building large pv-domains Juergen Gross
                   ` (5 preceding siblings ...)
  2015-11-12 13:43 ` [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported Juergen Gross
@ 2015-11-12 13:43 ` Juergen Gross
  2015-11-12 13:43 ` [PATCH v5 8/9] libxc: rework of domain builder's page table handler Juergen Gross
  2015-11-12 13:43 ` [PATCH v5 9/9] libxc: create p2m list outside of kernel mapping if supported Juergen Gross
  8 siblings, 0 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 13:43 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, ian.jackson, stefano.stabellini,
	wei.liu2, roger.pau
  Cc: Juergen Gross

Carve out the p2m list allocation from the .alloc_magic_pages hook of
the domain builder in order to prepare allocating the p2m list outside
of the initial kernel mapping. This will be needed to support loading
domains with huge memory (>512 GB).

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/include/xc_dom.h |  1 +
 tools/libxc/xc_dom_core.c    |  3 +++
 tools/libxc/xc_dom_x86.c     | 11 ++++++++++-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 2358012..7c157c3 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -221,6 +221,7 @@ struct xc_dom_arch {
     /* pagetable setup */
     int (*alloc_magic_pages) (struct xc_dom_image * dom);
     int (*alloc_pgtables) (struct xc_dom_image * dom);
+    int (*alloc_p2m_list) (struct xc_dom_image * dom);
     int (*setup_pgtables) (struct xc_dom_image * dom);
 
     /* arch-specific data structs setup */
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index 7b48b1f..ad91b35 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -1096,6 +1096,9 @@ int xc_dom_build_image(struct xc_dom_image *dom)
     }
 
     /* allocate other pages */
+    if ( dom->arch_hooks->alloc_p2m_list &&
+         dom->arch_hooks->alloc_p2m_list(dom) != 0 )
+        goto err;
     if ( dom->arch_hooks->alloc_magic_pages(dom) != 0 )
         goto err;
     if ( dom->arch_hooks->alloc_pgtables(dom) != 0 )
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 3c6bb9c..dd448cb 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -475,7 +475,7 @@ pfn_error:
 
 /* ------------------------------------------------------------------------ */
 
-static int alloc_magic_pages(struct xc_dom_image *dom)
+static int alloc_p2m_list(struct xc_dom_image *dom)
 {
     size_t p2m_alloc_size = dom->p2m_size * dom->arch_hooks->sizeof_pfn;
 
@@ -487,6 +487,13 @@ static int alloc_magic_pages(struct xc_dom_image *dom)
     if ( dom->p2m_guest == NULL )
         return -1;
 
+    return 0;
+}
+
+/* ------------------------------------------------------------------------ */
+
+static int alloc_magic_pages(struct xc_dom_image *dom)
+{
     /* allocate special pages */
     dom->start_info_pfn = xc_dom_alloc_page(dom, "start info");
     dom->xenstore_pfn = xc_dom_alloc_page(dom, "xenstore");
@@ -1667,6 +1674,7 @@ static struct xc_dom_arch xc_dom_32_pae = {
     .arch_private_size = sizeof(struct xc_dom_image_x86),
     .alloc_magic_pages = alloc_magic_pages,
     .alloc_pgtables = alloc_pgtables_x86_32_pae,
+    .alloc_p2m_list = alloc_p2m_list,
     .setup_pgtables = setup_pgtables_x86_32_pae,
     .start_info = start_info_x86_32,
     .shared_info = shared_info_x86_32,
@@ -1684,6 +1692,7 @@ static struct xc_dom_arch xc_dom_64 = {
     .arch_private_size = sizeof(struct xc_dom_image_x86),
     .alloc_magic_pages = alloc_magic_pages,
     .alloc_pgtables = alloc_pgtables_x86_64,
+    .alloc_p2m_list = alloc_p2m_list,
     .setup_pgtables = setup_pgtables_x86_64,
     .start_info = start_info_x86_64,
     .shared_info = shared_info_x86_64,
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v5 8/9] libxc: rework of domain builder's page table handler
  2015-11-12 13:43 [PATCH v5 0/9] libxc: support building large pv-domains Juergen Gross
                   ` (6 preceding siblings ...)
  2015-11-12 13:43 ` [PATCH v5 7/9] libxc: split p2m allocation in domain builder from other magic pages Juergen Gross
@ 2015-11-12 13:43 ` Juergen Gross
  2015-11-12 13:47   ` Wei Liu
  2015-11-18 16:11   ` Boris Ostrovsky
  2015-11-12 13:43 ` [PATCH v5 9/9] libxc: create p2m list outside of kernel mapping if supported Juergen Gross
  8 siblings, 2 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 13:43 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, ian.jackson, stefano.stabellini,
	wei.liu2, roger.pau
  Cc: Juergen Gross

In order to prepare a p2m list outside of the initial kernel mapping
do a rework of the domain builder's page table handler. The goal is
to be able to use common helpers for page table allocation and setup
for initial kernel page tables and page tables mapping the p2m list.
This is achieved by supporting multiple mapping areas. The mapped
virtual addresses of the single areas must not overlap, while the
page tables of a new area added might already be partially present.
Especially the top level page table is existing only once, of course.

Currently restrict the number of mappings to 1 because the only mapping
now is the initial mapping created by toolstack. There should not be
behaviour change and guest visible change introduced.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com
---
 tools/libxc/xc_dom_x86.c | 479 ++++++++++++++++++++++++-----------------------
 tools/libxc/xg_private.h |  39 +---
 2 files changed, 252 insertions(+), 266 deletions(-)

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index dd448cb..1614354 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -26,6 +26,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <inttypes.h>
+#include <assert.h>
 
 #include <xen/xen.h>
 #include <xen/foreign/x86_32.h>
@@ -69,13 +70,29 @@
 #define round_down(addr, mask)   ((addr) & ~(mask))
 #define round_up(addr, mask)     ((addr) | (mask))
 
-struct xc_dom_image_x86 {
-    /* initial page tables */
+struct xc_dom_params {
+    unsigned levels;
+    xen_vaddr_t vaddr_mask;
+    x86_pgentry_t lvl_prot[4];
+};
+
+struct xc_dom_x86_mapping_lvl {
+    xen_vaddr_t from;
+    xen_vaddr_t to;
+    xen_pfn_t pfn;
     unsigned int pgtables;
-    unsigned int pg_l4;
-    unsigned int pg_l3;
-    unsigned int pg_l2;
-    unsigned int pg_l1;
+};
+
+struct xc_dom_x86_mapping {
+    struct xc_dom_x86_mapping_lvl area;
+    struct xc_dom_x86_mapping_lvl lvls[4];
+};
+
+struct xc_dom_image_x86 {
+    unsigned n_mappings;
+#define MAPPING_MAX 1
+    struct xc_dom_x86_mapping maps[MAPPING_MAX];
+    struct xc_dom_params *params;
 };
 
 /* get guest IO ABI protocol */
@@ -105,102 +122,160 @@ const char *xc_domain_get_native_protocol(xc_interface *xch,
     return protocol;
 }
 
-static unsigned long
-nr_page_tables(struct xc_dom_image *dom,
-               xen_vaddr_t start, xen_vaddr_t end, unsigned long bits)
+static int count_pgtables(struct xc_dom_image *dom, xen_vaddr_t from,
+                          xen_vaddr_t to, xen_pfn_t pfn)
 {
-    xen_vaddr_t mask = bits_to_mask(bits);
-    int tables;
+    struct xc_dom_image_x86 *domx86 = dom->arch_private;
+    struct xc_dom_x86_mapping *map, *map_cmp;
+    xen_pfn_t pfn_end;
+    xen_vaddr_t mask;
+    unsigned bits;
+    int l, m;
 
-    if ( bits == 0 )
-        return 0;  /* unused */
+    if ( domx86->n_mappings == MAPPING_MAX )
+    {
+        xc_dom_panic(dom->xch, XC_OUT_OF_MEMORY,
+                     "%s: too many mappings\n", __FUNCTION__);
+        return -ENOMEM;
+    }
+    map = domx86->maps + domx86->n_mappings;
 
-    if ( bits == (8 * sizeof(unsigned long)) )
+    pfn_end = pfn + ((to - from) >> PAGE_SHIFT_X86);
+    if ( pfn_end >= dom->p2m_size )
     {
-        /* must be pgd, need one */
-        start = 0;
-        end = -1;
-        tables = 1;
+        xc_dom_panic(dom->xch, XC_OUT_OF_MEMORY,
+                     "%s: not enough memory for initial mapping (%#"PRIpfn" > %#"PRIpfn")",
+                     __FUNCTION__, pfn_end, dom->p2m_size);
+        return -ENOMEM;
     }
-    else
+    for ( m = 0; m < domx86->n_mappings; m++ )
+    {
+        map_cmp = domx86->maps + m;
+        if ( from < map_cmp->area.to && to > map_cmp->area.from )
+        {
+            xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                         "%s: overlapping mappings\n", __FUNCTION__);
+            return -EINVAL;
+        }
+    }
+
+    memset(map, 0, sizeof(*map));
+    map->area.from = from & domx86->params->vaddr_mask;
+    map->area.to = to & domx86->params->vaddr_mask;
+
+    for ( l = domx86->params->levels - 1; l >= 0; l-- )
     {
-        start = round_down(start, mask);
-        end = round_up(end, mask);
-        tables = ((end - start) >> bits) + 1;
+        map->lvls[l].pfn = pfn + map->area.pgtables;
+        if ( l == domx86->params->levels - 1 )
+        {
+            /* Top level page table in first mapping only. */
+            if ( domx86->n_mappings == 0 )
+            {
+                map->lvls[l].from = 0;
+                map->lvls[l].to = domx86->params->vaddr_mask;
+                map->lvls[l].pgtables = 1;
+                map->area.pgtables++;
+            }
+            continue;
+        }
+
+        bits = PAGE_SHIFT_X86 + (l + 1) * PGTBL_LEVEL_SHIFT_X86;
+        mask = bits_to_mask(bits);
+        map->lvls[l].from = map->area.from & ~mask;
+        map->lvls[l].to = map->area.to | mask;
+
+        if ( domx86->params->levels == PGTBL_LEVELS_I386 &&
+             domx86->n_mappings == 0 && to < 0xc0000000 && l == 1 )
+        {
+            DOMPRINTF("%s: PAE: extra l2 page table for l3#3", __FUNCTION__);
+            map->lvls[l].to = domx86->params->vaddr_mask;
+        }
+
+        for ( m = 0; m < domx86->n_mappings; m++ )
+        {
+            map_cmp = domx86->maps + m;
+            if ( map_cmp->lvls[l].from == map_cmp->lvls[l].to )
+                continue;
+            if ( map->lvls[l].from >= map_cmp->lvls[l].from &&
+                 map->lvls[l].to <= map_cmp->lvls[l].to )
+            {
+                map->lvls[l].from = 0;
+                map->lvls[l].to = 0;
+                break;
+            }
+            assert(map->lvls[l].from >= map_cmp->lvls[l].from ||
+                   map->lvls[l].to <= map_cmp->lvls[l].to);
+            if ( map->lvls[l].from >= map_cmp->lvls[l].from &&
+                 map->lvls[l].from <= map_cmp->lvls[l].to )
+                map->lvls[l].from = map_cmp->lvls[l].to + 1;
+            if ( map->lvls[l].to >= map_cmp->lvls[l].from &&
+                 map->lvls[l].to <= map_cmp->lvls[l].to )
+                map->lvls[l].to = map_cmp->lvls[l].from - 1;
+        }
+        if ( map->lvls[l].from < map->lvls[l].to )
+            map->lvls[l].pgtables =
+                ((map->lvls[l].to - map->lvls[l].from) >> bits) + 1;
+        DOMPRINTF("%s: 0x%016" PRIx64 "/%d: 0x%016" PRIx64 " -> 0x%016" PRIx64
+                  ", %d table(s)", __FUNCTION__, mask, bits,
+                  map->lvls[l].from, map->lvls[l].to, map->lvls[l].pgtables);
+        map->area.pgtables += map->lvls[l].pgtables;
     }
 
-    DOMPRINTF("%s: 0x%016" PRIx64 "/%ld: 0x%016" PRIx64
-              " -> 0x%016" PRIx64 ", %d table(s)",
-              __FUNCTION__, mask, bits, start, end, tables);
-    return tables;
+    return 0;
 }
 
-static int alloc_pgtables(struct xc_dom_image *dom, int pae,
-                          int l4_bits, int l3_bits, int l2_bits, int l1_bits)
+static int alloc_pgtables(struct xc_dom_image *dom)
 {
     int pages, extra_pages;
     xen_vaddr_t try_virt_end;
-    xen_pfn_t try_pfn_end;
     struct xc_dom_image_x86 *domx86 = dom->arch_private;
+    struct xc_dom_x86_mapping *map = domx86->maps + domx86->n_mappings;
 
     extra_pages = dom->alloc_bootstack ? 1 : 0;
-    extra_pages += 128; /* 512kB padding */
+    extra_pages += (512 * 1024) / PAGE_SIZE_X86; /* 512kB padding */
     pages = extra_pages;
     for ( ; ; )
     {
         try_virt_end = round_up(dom->virt_alloc_end + pages * PAGE_SIZE_X86,
                                 bits_to_mask(22)); /* 4MB alignment */
 
-        try_pfn_end = (try_virt_end - dom->parms.virt_base) >> PAGE_SHIFT_X86;
-
-        if ( try_pfn_end > dom->p2m_size )
-        {
-            xc_dom_panic(dom->xch, XC_OUT_OF_MEMORY,
-                         "%s: not enough memory for initial mapping (%#"PRIpfn" > %#"PRIpfn")",
-                         __FUNCTION__, try_pfn_end, dom->p2m_size);
-            return -ENOMEM;
-        }
+        if ( count_pgtables(dom, dom->parms.virt_base, try_virt_end,
+                            dom->pfn_alloc_end) )
+            return -1;
 
-        domx86->pg_l4 =
-            nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l4_bits);
-        domx86->pg_l3 =
-            nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l3_bits);
-        domx86->pg_l2 =
-            nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l2_bits);
-        domx86->pg_l1 =
-            nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l1_bits);
-        if (pae && try_virt_end < 0xc0000000)
-        {
-            DOMPRINTF("%s: PAE: extra l2 page table for l3#3",
-                      __FUNCTION__);
-            domx86->pg_l2++;
-        }
-        domx86->pgtables = domx86->pg_l4 + domx86->pg_l3 +
-                           domx86->pg_l2 + domx86->pg_l1;
-        pages = domx86->pgtables + extra_pages;
+        pages = map->area.pgtables + extra_pages;
         if ( dom->virt_alloc_end + pages * PAGE_SIZE_X86 <= try_virt_end + 1 )
             break;
     }
+    map->area.pfn = 0;
+    domx86->n_mappings++;
     dom->virt_pgtab_end = try_virt_end + 1;
 
     return xc_dom_alloc_segment(dom, &dom->pgtables_seg, "page tables", 0,
-                                domx86->pgtables * PAGE_SIZE_X86);
+                                map->area.pgtables * PAGE_SIZE_X86);
 }
 
 /* ------------------------------------------------------------------------ */
 /* i386 pagetables                                                          */
 
-#define L1_PROT (_PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED)
-#define L2_PROT (_PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_DIRTY|_PAGE_USER)
-#define L3_PROT (_PAGE_PRESENT)
+static struct xc_dom_params x86_32_params = {
+    .levels = PGTBL_LEVELS_I386,
+    .vaddr_mask = bits_to_mask(VIRT_BITS_I386),
+    .lvl_prot[0] = _PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED,
+    .lvl_prot[1] = _PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_DIRTY|_PAGE_USER,
+    .lvl_prot[2] = _PAGE_PRESENT,
+};
 
 static int alloc_pgtables_x86_32_pae(struct xc_dom_image *dom)
 {
-    return alloc_pgtables(dom, 1, 0, 32,
-                          L3_PAGETABLE_SHIFT_PAE, L2_PAGETABLE_SHIFT_PAE);
+    struct xc_dom_image_x86 *domx86 = dom->arch_private;
+
+    domx86->params = &x86_32_params;
+    return alloc_pgtables(dom);
 }
 
 #define pfn_to_paddr(pfn) ((xen_paddr_t)(pfn) << PAGE_SHIFT_X86)
+#define pgentry_to_pfn(entry) ((xen_pfn_t)((entry) >> PAGE_SHIFT_X86))
 
 /*
  * Move the l3 page table page below 4G for guests which do not
@@ -270,20 +345,100 @@ static xen_pfn_t move_l3_below_4G(struct xc_dom_image *dom,
     return l3mfn;
 }
 
+static x86_pgentry_t *get_pg_table_x86(struct xc_dom_image *dom, int m, int l)
+{
+    struct xc_dom_image_x86 *domx86 = dom->arch_private;
+    struct xc_dom_x86_mapping *map;
+    x86_pgentry_t *pg;
+
+    map = domx86->maps + m;
+    pg = xc_dom_pfn_to_ptr(dom, map->lvls[l].pfn, 0);
+    if ( pg )
+        return pg;
+
+    xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                 "%s: xc_dom_pfn_to_ptr failed", __FUNCTION__);
+    return NULL;
+}
+
+static x86_pgentry_t get_pg_prot_x86(struct xc_dom_image *dom, int l,
+                                     xen_pfn_t pfn)
+{
+    struct xc_dom_image_x86 *domx86 = dom->arch_private;
+    struct xc_dom_x86_mapping *map;
+    xen_pfn_t pfn_s, pfn_e;
+    x86_pgentry_t prot;
+    unsigned m;
+
+    prot = domx86->params->lvl_prot[l];
+    if ( l > 0 )
+        return prot;
+
+    for ( m = 0; m < domx86->n_mappings; m++ )
+    {
+        map = domx86->maps + m;
+        pfn_s = map->lvls[domx86->params->levels - 1].pfn;
+        pfn_e = map->area.pgtables + pfn_s;
+        if ( pfn >= pfn_s && pfn < pfn_e )
+            return prot & ~_PAGE_RW;
+    }
+
+    return prot;
+}
+
+static int setup_pgtables_x86(struct xc_dom_image *dom)
+{
+    struct xc_dom_image_x86 *domx86 = dom->arch_private;
+    struct xc_dom_x86_mapping *map1, *map2;
+    struct xc_dom_x86_mapping_lvl *lvl;
+    xen_vaddr_t from, to;
+    xen_pfn_t pfn, p, p_s, p_e;
+    x86_pgentry_t *pg;
+    unsigned m1, m2;
+    int l;
+
+    for ( l = domx86->params->levels - 1; l >= 0; l-- )
+        for ( m1 = 0; m1 < domx86->n_mappings; m1++ )
+        {
+            map1 = domx86->maps + m1;
+            from = map1->lvls[l].from;
+            to = map1->lvls[l].to;
+            pg = get_pg_table_x86(dom, m1, l);
+            if ( !pg )
+                return -1;
+            for ( m2 = 0; m2 < domx86->n_mappings; m2++ )
+            {
+                map2 = domx86->maps + m2;
+                lvl = (l > 0) ? map2->lvls + l - 1 : &map2->area;
+                if ( l > 0 && lvl->pgtables == 0 )
+                    continue;
+                if ( lvl->from >= to || lvl->to <= from )
+                    continue;
+                p_s = (max(from, lvl->from) - from) >>
+                      (PAGE_SHIFT_X86 + l * PGTBL_LEVEL_SHIFT_X86);
+                p_e = (min(to, lvl->to) - from) >>
+                      (PAGE_SHIFT_X86 + l * PGTBL_LEVEL_SHIFT_X86);
+                pfn = ((max(from, lvl->from) - lvl->from) >>
+                      (PAGE_SHIFT_X86 + l * PGTBL_LEVEL_SHIFT_X86)) + lvl->pfn;
+                for ( p = p_s; p <= p_e; p++ )
+                {
+                    pg[p] = pfn_to_paddr(xc_dom_p2m(dom, pfn)) |
+                            get_pg_prot_x86(dom, l, pfn);
+                    pfn++;
+                }
+            }
+        }
+
+    return 0;
+}
+
 static int setup_pgtables_x86_32_pae(struct xc_dom_image *dom)
 {
     struct xc_dom_image_x86 *domx86 = dom->arch_private;
-    xen_pfn_t l3pfn = dom->pgtables_seg.pfn;
-    xen_pfn_t l2pfn = l3pfn + domx86->pg_l3;
-    xen_pfn_t l1pfn = l2pfn + domx86->pg_l2;
-    l3_pgentry_64_t *l3tab;
-    l2_pgentry_64_t *l2tab = NULL;
-    l1_pgentry_64_t *l1tab = NULL;
-    unsigned long l3off, l2off = 0, l1off;
-    xen_vaddr_t addr;
-    xen_pfn_t pgpfn;
-    xen_pfn_t l3mfn = xc_dom_p2m(dom, l3pfn);
+    xen_pfn_t l3mfn, l3pfn;
 
+    l3pfn = domx86->maps[0].lvls[2].pfn;
+    l3mfn = xc_dom_p2m(dom, l3pfn);
     if ( dom->parms.pae == XEN_PAE_YES )
     {
         if ( l3mfn >= 0x100000 )
@@ -299,180 +454,34 @@ static int setup_pgtables_x86_32_pae(struct xc_dom_image *dom)
         }
     }
 
-    l3tab = xc_dom_pfn_to_ptr(dom, l3pfn, 1);
-    if ( l3tab == NULL )
-        goto pfn_error;
-
-    for ( addr = dom->parms.virt_base; addr < dom->virt_pgtab_end;
-          addr += PAGE_SIZE_X86 )
-    {
-        if ( l2tab == NULL )
-        {
-            /* get L2 tab, make L3 entry */
-            l2tab = xc_dom_pfn_to_ptr(dom, l2pfn, 1);
-            if ( l2tab == NULL )
-                goto pfn_error;
-            l3off = l3_table_offset_pae(addr);
-            l3tab[l3off] =
-                pfn_to_paddr(xc_dom_p2m(dom, l2pfn)) | L3_PROT;
-            l2pfn++;
-        }
-
-        if ( l1tab == NULL )
-        {
-            /* get L1 tab, make L2 entry */
-            l1tab = xc_dom_pfn_to_ptr(dom, l1pfn, 1);
-            if ( l1tab == NULL )
-                goto pfn_error;
-            l2off = l2_table_offset_pae(addr);
-            l2tab[l2off] =
-                pfn_to_paddr(xc_dom_p2m(dom, l1pfn)) | L2_PROT;
-            l1pfn++;
-        }
-
-        /* make L1 entry */
-        l1off = l1_table_offset_pae(addr);
-        pgpfn = (addr - dom->parms.virt_base) >> PAGE_SHIFT_X86;
-        l1tab[l1off] =
-            pfn_to_paddr(xc_dom_p2m(dom, pgpfn)) | L1_PROT;
-        if ( (!dom->pvh_enabled)                &&
-             (addr >= dom->pgtables_seg.vstart) &&
-             (addr < dom->pgtables_seg.vend) )
-            l1tab[l1off] &= ~_PAGE_RW; /* page tables are r/o */
-
-        if ( l1off == (L1_PAGETABLE_ENTRIES_PAE - 1) )
-        {
-            l1tab = NULL;
-            if ( l2off == (L2_PAGETABLE_ENTRIES_PAE - 1) )
-                l2tab = NULL;
-        }
-    }
-
-    if ( dom->virt_pgtab_end <= 0xc0000000 )
-    {
-        DOMPRINTF("%s: PAE: extra l2 page table for l3#3", __FUNCTION__);
-        l3tab[3] = pfn_to_paddr(xc_dom_p2m(dom, l2pfn)) | L3_PROT;
-    }
-    return 0;
-
-pfn_error:
-    xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
-                 "%s: xc_dom_pfn_to_ptr failed", __FUNCTION__);
-    return -EINVAL;
+    return setup_pgtables_x86(dom);
 }
 
-#undef L1_PROT
-#undef L2_PROT
-#undef L3_PROT
-
 /* ------------------------------------------------------------------------ */
 /* x86_64 pagetables                                                        */
 
+static struct xc_dom_params x86_64_params = {
+    .levels = PGTBL_LEVELS_X86_64,
+    .vaddr_mask = bits_to_mask(VIRT_BITS_X86_64),
+    .lvl_prot[0] = _PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED,
+    .lvl_prot[1] = _PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_DIRTY|_PAGE_USER,
+    .lvl_prot[2] = _PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_DIRTY|_PAGE_USER,
+    .lvl_prot[3] = _PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_DIRTY|_PAGE_USER,
+};
+
 static int alloc_pgtables_x86_64(struct xc_dom_image *dom)
 {
-    return alloc_pgtables(dom, 0,
-                          L4_PAGETABLE_SHIFT_X86_64 + 9,
-                          L4_PAGETABLE_SHIFT_X86_64,
-                          L3_PAGETABLE_SHIFT_X86_64,
-                          L2_PAGETABLE_SHIFT_X86_64);
-}
+    struct xc_dom_image_x86 *domx86 = dom->arch_private;
 
-#define L1_PROT (_PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED)
-#define L2_PROT (_PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_DIRTY|_PAGE_USER)
-#define L3_PROT (_PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_DIRTY|_PAGE_USER)
-#define L4_PROT (_PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_DIRTY|_PAGE_USER)
+    domx86->params = &x86_64_params;
+    return alloc_pgtables(dom);
+}
 
 static int setup_pgtables_x86_64(struct xc_dom_image *dom)
 {
-    struct xc_dom_image_x86 *domx86 = dom->arch_private;
-    xen_pfn_t l4pfn = dom->pgtables_seg.pfn;
-    xen_pfn_t l3pfn = l4pfn + domx86->pg_l4;
-    xen_pfn_t l2pfn = l3pfn + domx86->pg_l3;
-    xen_pfn_t l1pfn = l2pfn + domx86->pg_l2;
-    l4_pgentry_64_t *l4tab = xc_dom_pfn_to_ptr(dom, l4pfn, 1);
-    l3_pgentry_64_t *l3tab = NULL;
-    l2_pgentry_64_t *l2tab = NULL;
-    l1_pgentry_64_t *l1tab = NULL;
-    uint64_t l4off, l3off = 0, l2off = 0, l1off;
-    uint64_t addr;
-    xen_pfn_t pgpfn;
-
-    if ( l4tab == NULL )
-        goto pfn_error;
-
-    for ( addr = dom->parms.virt_base; addr < dom->virt_pgtab_end;
-          addr += PAGE_SIZE_X86 )
-    {
-        if ( l3tab == NULL )
-        {
-            /* get L3 tab, make L4 entry */
-            l3tab = xc_dom_pfn_to_ptr(dom, l3pfn, 1);
-            if ( l3tab == NULL )
-                goto pfn_error;
-            l4off = l4_table_offset_x86_64(addr);
-            l4tab[l4off] =
-                pfn_to_paddr(xc_dom_p2m(dom, l3pfn)) | L4_PROT;
-            l3pfn++;
-        }
-
-        if ( l2tab == NULL )
-        {
-            /* get L2 tab, make L3 entry */
-            l2tab = xc_dom_pfn_to_ptr(dom, l2pfn, 1);
-            if ( l2tab == NULL )
-                goto pfn_error;
-            l3off = l3_table_offset_x86_64(addr);
-            l3tab[l3off] =
-                pfn_to_paddr(xc_dom_p2m(dom, l2pfn)) | L3_PROT;
-            l2pfn++;
-        }
-
-        if ( l1tab == NULL )
-        {
-            /* get L1 tab, make L2 entry */
-            l1tab = xc_dom_pfn_to_ptr(dom, l1pfn, 1);
-            if ( l1tab == NULL )
-                goto pfn_error;
-            l2off = l2_table_offset_x86_64(addr);
-            l2tab[l2off] =
-                pfn_to_paddr(xc_dom_p2m(dom, l1pfn)) | L2_PROT;
-            l1pfn++;
-        }
-
-        /* make L1 entry */
-        l1off = l1_table_offset_x86_64(addr);
-        pgpfn = (addr - dom->parms.virt_base) >> PAGE_SHIFT_X86;
-        l1tab[l1off] =
-            pfn_to_paddr(xc_dom_p2m(dom, pgpfn)) | L1_PROT;
-        if ( (!dom->pvh_enabled)                &&
-             (addr >= dom->pgtables_seg.vstart) &&
-             (addr < dom->pgtables_seg.vend) )
-            l1tab[l1off] &= ~_PAGE_RW; /* page tables are r/o */
-
-        if ( l1off == (L1_PAGETABLE_ENTRIES_X86_64 - 1) )
-        {
-            l1tab = NULL;
-            if ( l2off == (L2_PAGETABLE_ENTRIES_X86_64 - 1) )
-            {
-                l2tab = NULL;
-                if ( l3off == (L3_PAGETABLE_ENTRIES_X86_64 - 1) )
-                    l3tab = NULL;
-            }
-        }
-    }
-    return 0;
-
-pfn_error:
-    xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
-                 "%s: xc_dom_pfn_to_ptr failed", __FUNCTION__);
-    return -EINVAL;
+    return setup_pgtables_x86(dom);
 }
 
-#undef L1_PROT
-#undef L2_PROT
-#undef L3_PROT
-#undef L4_PROT
-
 /* ------------------------------------------------------------------------ */
 
 static int alloc_p2m_list(struct xc_dom_image *dom)
@@ -659,7 +668,7 @@ static int start_info_x86_32(struct xc_dom_image *dom)
     start_info->nr_pages = dom->total_pages;
     start_info->shared_info = shinfo << PAGE_SHIFT_X86;
     start_info->pt_base = dom->pgtables_seg.vstart;
-    start_info->nr_pt_frames = domx86->pgtables;
+    start_info->nr_pt_frames = domx86->maps[0].area.pgtables;
     start_info->mfn_list = dom->p2m_seg.vstart;
 
     start_info->flags = dom->flags;
@@ -706,7 +715,7 @@ static int start_info_x86_64(struct xc_dom_image *dom)
     start_info->nr_pages = dom->total_pages;
     start_info->shared_info = shinfo << PAGE_SHIFT_X86;
     start_info->pt_base = dom->pgtables_seg.vstart;
-    start_info->nr_pt_frames = domx86->pgtables;
+    start_info->nr_pt_frames = domx86->maps[0].area.pgtables;
     start_info->mfn_list = dom->p2m_seg.vstart;
 
     start_info->flags = dom->flags;
diff --git a/tools/libxc/xg_private.h b/tools/libxc/xg_private.h
index 5544897..a76d6b3 100644
--- a/tools/libxc/xg_private.h
+++ b/tools/libxc/xg_private.h
@@ -62,10 +62,13 @@ unsigned long csum_page (void * page);
 #define _PAGE_PSE       0x080
 #define _PAGE_GLOBAL    0x100
 
-#define L1_PAGETABLE_SHIFT_I386       12
-#define L2_PAGETABLE_SHIFT_I386       22
-#define L1_PAGETABLE_ENTRIES_I386   1024
-#define L2_PAGETABLE_ENTRIES_I386   1024
+#define VIRT_BITS_I386     32
+#define VIRT_BITS_X86_64   48
+
+#define PGTBL_LEVELS_I386       3
+#define PGTBL_LEVELS_X86_64     4
+
+#define PGTBL_LEVEL_SHIFT_X86   9
 
 #define L1_PAGETABLE_SHIFT_PAE        12
 #define L2_PAGETABLE_SHIFT_PAE        21
@@ -83,33 +86,7 @@ unsigned long csum_page (void * page);
 #define L3_PAGETABLE_ENTRIES_X86_64  512
 #define L4_PAGETABLE_ENTRIES_X86_64  512
 
-typedef uint32_t l1_pgentry_32_t;
-typedef uint32_t l2_pgentry_32_t;
-typedef uint64_t l1_pgentry_64_t;
-typedef uint64_t l2_pgentry_64_t;
-typedef uint64_t l3_pgentry_64_t;
-typedef uint64_t l4_pgentry_64_t;
-
-#define l1_table_offset_i386(_a) \
-  (((_a) >> L1_PAGETABLE_SHIFT_I386) & (L1_PAGETABLE_ENTRIES_I386 - 1))
-#define l2_table_offset_i386(_a) \
-  (((_a) >> L2_PAGETABLE_SHIFT_I386) & (L2_PAGETABLE_ENTRIES_I386 - 1))
-
-#define l1_table_offset_pae(_a) \
-  (((_a) >> L1_PAGETABLE_SHIFT_PAE) & (L1_PAGETABLE_ENTRIES_PAE - 1))
-#define l2_table_offset_pae(_a) \
-  (((_a) >> L2_PAGETABLE_SHIFT_PAE) & (L2_PAGETABLE_ENTRIES_PAE - 1))
-#define l3_table_offset_pae(_a) \
-  (((_a) >> L3_PAGETABLE_SHIFT_PAE) & (L3_PAGETABLE_ENTRIES_PAE - 1))
-
-#define l1_table_offset_x86_64(_a) \
-  (((_a) >> L1_PAGETABLE_SHIFT_X86_64) & (L1_PAGETABLE_ENTRIES_X86_64 - 1))
-#define l2_table_offset_x86_64(_a) \
-  (((_a) >> L2_PAGETABLE_SHIFT_X86_64) & (L2_PAGETABLE_ENTRIES_X86_64 - 1))
-#define l3_table_offset_x86_64(_a) \
-  (((_a) >> L3_PAGETABLE_SHIFT_X86_64) & (L3_PAGETABLE_ENTRIES_X86_64 - 1))
-#define l4_table_offset_x86_64(_a) \
-  (((_a) >> L4_PAGETABLE_SHIFT_X86_64) & (L4_PAGETABLE_ENTRIES_X86_64 - 1))
+typedef uint64_t x86_pgentry_t;
 
 #define PAGE_SHIFT_ARM          12
 #define PAGE_SIZE_ARM           (1UL << PAGE_SHIFT_ARM)
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v5 9/9] libxc: create p2m list outside of kernel mapping if supported
  2015-11-12 13:43 [PATCH v5 0/9] libxc: support building large pv-domains Juergen Gross
                   ` (7 preceding siblings ...)
  2015-11-12 13:43 ` [PATCH v5 8/9] libxc: rework of domain builder's page table handler Juergen Gross
@ 2015-11-12 13:43 ` Juergen Gross
  8 siblings, 0 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 13:43 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, ian.jackson, stefano.stabellini,
	wei.liu2, roger.pau
  Cc: Juergen Gross

In case the kernel of a new pv-domU indicates it is supporting a p2m
list outside the initial kernel mapping by specifying INIT_P2M, let
the domain builder allocate the memory for the p2m list from physical
guest memory only and map it to the address the kernel is expecting.

This will enable loading pv-domUs larger than 512 GB.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/include/xc_dom.h |  1 +
 tools/libxc/xc_dom_core.c    | 15 +++++++++++-
 tools/libxc/xc_dom_x86.c     | 56 ++++++++++++++++++++++++++++++++++++++------
 3 files changed, 64 insertions(+), 8 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 7c157c3..ad8e47e 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -238,6 +238,7 @@ struct xc_dom_arch {
     char *native_protocol;
     int page_shift;
     int sizeof_pfn;
+    int p2m_base_supported;
     int arch_private_size;
 
     struct xc_dom_arch *next;
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index ad91b35..5d6c3ba 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -777,6 +777,7 @@ struct xc_dom_image *xc_dom_allocate(xc_interface *xch,
     dom->parms.virt_hypercall = UNSET_ADDR;
     dom->parms.virt_hv_start_low = UNSET_ADDR;
     dom->parms.elf_paddr_offset = UNSET_ADDR;
+    dom->parms.p2m_base = UNSET_ADDR;
 
     dom->alloc_malloc += sizeof(*dom);
     return dom;
@@ -1096,7 +1097,11 @@ int xc_dom_build_image(struct xc_dom_image *dom)
     }
 
     /* allocate other pages */
-    if ( dom->arch_hooks->alloc_p2m_list &&
+    if ( !dom->arch_hooks->p2m_base_supported ||
+         dom->parms.p2m_base >= dom->parms.virt_base ||
+         (dom->parms.p2m_base & (XC_DOM_PAGE_SIZE(dom) - 1)) )
+        dom->parms.p2m_base = UNSET_ADDR;
+    if ( dom->arch_hooks->alloc_p2m_list && dom->parms.p2m_base == UNSET_ADDR &&
          dom->arch_hooks->alloc_p2m_list(dom) != 0 )
         goto err;
     if ( dom->arch_hooks->alloc_magic_pages(dom) != 0 )
@@ -1124,6 +1129,14 @@ int xc_dom_build_image(struct xc_dom_image *dom)
         dom->initrd_len = page_size * dom->ramdisk_seg.pages;
     }
 
+    /* Allocate p2m list if outside of initial kernel mapping. */
+    if ( dom->arch_hooks->alloc_p2m_list && dom->parms.p2m_base != UNSET_ADDR )
+    {
+        if ( dom->arch_hooks->alloc_p2m_list(dom) != 0 )
+            goto err;
+        dom->p2m_seg.vstart = dom->parms.p2m_base;
+    }
+
     return 0;
 
  err:
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 1614354..7279fa2 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -69,6 +69,7 @@
 #define bits_to_mask(bits)       (((xen_vaddr_t)1 << (bits))-1)
 #define round_down(addr, mask)   ((addr) & ~(mask))
 #define round_up(addr, mask)     ((addr) | (mask))
+#define round_pg_up(addr)  (((addr) + PAGE_SIZE_X86 - 1) & ~(PAGE_SIZE_X86 - 1))
 
 struct xc_dom_params {
     unsigned levels;
@@ -90,7 +91,7 @@ struct xc_dom_x86_mapping {
 
 struct xc_dom_image_x86 {
     unsigned n_mappings;
-#define MAPPING_MAX 1
+#define MAPPING_MAX 2
     struct xc_dom_x86_mapping maps[MAPPING_MAX];
     struct xc_dom_params *params;
 };
@@ -484,11 +485,8 @@ static int setup_pgtables_x86_64(struct xc_dom_image *dom)
 
 /* ------------------------------------------------------------------------ */
 
-static int alloc_p2m_list(struct xc_dom_image *dom)
+static int alloc_p2m_list(struct xc_dom_image *dom, size_t p2m_alloc_size)
 {
-    size_t p2m_alloc_size = dom->p2m_size * dom->arch_hooks->sizeof_pfn;
-
-    /* allocate phys2mach table */
     if ( xc_dom_alloc_segment(dom, &dom->p2m_seg, "phys2mach",
                               0, p2m_alloc_size) )
         return -1;
@@ -499,6 +497,40 @@ static int alloc_p2m_list(struct xc_dom_image *dom)
     return 0;
 }
 
+static int alloc_p2m_list_x86_32(struct xc_dom_image *dom)
+{
+    size_t p2m_alloc_size = dom->p2m_size * dom->arch_hooks->sizeof_pfn;
+
+    p2m_alloc_size = round_pg_up(p2m_alloc_size);
+    return alloc_p2m_list(dom, p2m_alloc_size);
+}
+
+static int alloc_p2m_list_x86_64(struct xc_dom_image *dom)
+{
+    struct xc_dom_image_x86 *domx86 = dom->arch_private;
+    struct xc_dom_x86_mapping *map = domx86->maps + domx86->n_mappings;
+    size_t p2m_alloc_size = dom->p2m_size * dom->arch_hooks->sizeof_pfn;
+    xen_vaddr_t from, to;
+    unsigned lvl;
+
+    p2m_alloc_size = round_pg_up(p2m_alloc_size);
+    if ( dom->parms.p2m_base != UNSET_ADDR )
+    {
+        from = dom->parms.p2m_base;
+        to = from + p2m_alloc_size - 1;
+        if ( count_pgtables(dom, from, to, dom->pfn_alloc_end) )
+            return -1;
+
+        map->area.pfn = dom->pfn_alloc_end;
+        for ( lvl = 0; lvl < 4; lvl++ )
+            map->lvls[lvl].pfn += p2m_alloc_size >> PAGE_SHIFT_X86;
+        domx86->n_mappings++;
+        p2m_alloc_size += map->area.pgtables << PAGE_SHIFT_X86;
+    }
+
+    return alloc_p2m_list(dom, p2m_alloc_size);
+}
+
 /* ------------------------------------------------------------------------ */
 
 static int alloc_magic_pages(struct xc_dom_image *dom)
@@ -717,6 +749,11 @@ static int start_info_x86_64(struct xc_dom_image *dom)
     start_info->pt_base = dom->pgtables_seg.vstart;
     start_info->nr_pt_frames = domx86->maps[0].area.pgtables;
     start_info->mfn_list = dom->p2m_seg.vstart;
+    if ( dom->parms.p2m_base != UNSET_ADDR )
+    {
+        start_info->first_p2m_pfn = dom->p2m_seg.pfn;
+        start_info->nr_p2m_frames = dom->p2m_seg.pages;
+    }
 
     start_info->flags = dom->flags;
     start_info->store_mfn = xc_dom_p2m(dom, dom->xenstore_pfn);
@@ -1601,7 +1638,10 @@ static int bootlate_pv(struct xc_dom_image *dom)
     if ( !xc_dom_feature_translated(dom) )
     {
         /* paravirtualized guest */
+
+        /* Drop references to all initial page tables before pinning. */
         xc_dom_unmap_one(dom, dom->pgtables_seg.pfn);
+        xc_dom_unmap_one(dom, dom->p2m_seg.pfn);
         rc = pin_table(dom->xch, pgd_type,
                        xc_dom_p2m(dom, dom->pgtables_seg.pfn),
                        dom->guest_domid);
@@ -1680,10 +1720,11 @@ static struct xc_dom_arch xc_dom_32_pae = {
     .native_protocol = XEN_IO_PROTO_ABI_X86_32,
     .page_shift = PAGE_SHIFT_X86,
     .sizeof_pfn = 4,
+    .p2m_base_supported = 0,
     .arch_private_size = sizeof(struct xc_dom_image_x86),
     .alloc_magic_pages = alloc_magic_pages,
     .alloc_pgtables = alloc_pgtables_x86_32_pae,
-    .alloc_p2m_list = alloc_p2m_list,
+    .alloc_p2m_list = alloc_p2m_list_x86_32,
     .setup_pgtables = setup_pgtables_x86_32_pae,
     .start_info = start_info_x86_32,
     .shared_info = shared_info_x86_32,
@@ -1698,10 +1739,11 @@ static struct xc_dom_arch xc_dom_64 = {
     .native_protocol = XEN_IO_PROTO_ABI_X86_64,
     .page_shift = PAGE_SHIFT_X86,
     .sizeof_pfn = 8,
+    .p2m_base_supported = 1,
     .arch_private_size = sizeof(struct xc_dom_image_x86),
     .alloc_magic_pages = alloc_magic_pages,
     .alloc_pgtables = alloc_pgtables_x86_64,
-    .alloc_p2m_list = alloc_p2m_list,
+    .alloc_p2m_list = alloc_p2m_list_x86_64,
     .setup_pgtables = setup_pgtables_x86_64,
     .start_info = start_info_x86_64,
     .shared_info = shared_info_x86_64,
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 8/9] libxc: rework of domain builder's page table handler
  2015-11-12 13:43 ` [PATCH v5 8/9] libxc: rework of domain builder's page table handler Juergen Gross
@ 2015-11-12 13:47   ` Wei Liu
  2015-11-12 13:48     ` Juergen Gross
  2015-11-18 16:11   ` Boris Ostrovsky
  1 sibling, 1 reply; 72+ messages in thread
From: Wei Liu @ 2015-11-12 13:47 UTC (permalink / raw)
  To: Juergen Gross
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, ian.jackson,
	xen-devel, roger.pau

On Thu, Nov 12, 2015 at 02:43:35PM +0100, Juergen Gross wrote:
> In order to prepare a p2m list outside of the initial kernel mapping
> do a rework of the domain builder's page table handler. The goal is
> to be able to use common helpers for page table allocation and setup
> for initial kernel page tables and page tables mapping the p2m list.
> This is achieved by supporting multiple mapping areas. The mapped
> virtual addresses of the single areas must not overlap, while the
> page tables of a new area added might already be partially present.
> Especially the top level page table is existing only once, of course.
> 
> Currently restrict the number of mappings to 1 because the only mapping
> now is the initial mapping created by toolstack. There should not be
> behaviour change and guest visible change introduced.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com

Missing closing ">". :-)

I don't think you need to resend though.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator
  2015-11-12 13:43 ` [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator Juergen Gross
@ 2015-11-12 13:48   ` Wei Liu
  2015-11-12 14:03     ` Juergen Gross
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2015-11-12 13:48 UTC (permalink / raw)
  To: Juergen Gross
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, ian.jackson,
	xen-devel, roger.pau

On Thu, Nov 12, 2015 at 02:43:28PM +0100, Juergen Gross wrote:
> Guest memory allocation in the domain builder of libxc is done via
> virtual addresses only. In order to be able to support preallocated
> areas not virtually mapped reorganize the memory allocator to keep
> track of allocated pages globally and in allocated segments.
> 
> This requires an interface change of the allocate callback of the
> domain builder which currently is using the last mapped virtual
> address as a parameter. This is no problem as the only user of this
> callback is stubdom/grub/kexec.c using this virtual address to
> calculate the last used pfn.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>

If you've tested and confirmed pvgrub (stubdom based grub) doesn't
break:

Acked-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 8/9] libxc: rework of domain builder's page table handler
  2015-11-12 13:47   ` Wei Liu
@ 2015-11-12 13:48     ` Juergen Gross
  2015-11-16 13:40       ` Ian Campbell
  0 siblings, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 13:48 UTC (permalink / raw)
  To: Wei Liu
  Cc: roger.pau, stefano.stabellini, ian.jackson, Ian.Campbell, xen-devel

On 12/11/15 14:47, Wei Liu wrote:
> On Thu, Nov 12, 2015 at 02:43:35PM +0100, Juergen Gross wrote:
>> In order to prepare a p2m list outside of the initial kernel mapping
>> do a rework of the domain builder's page table handler. The goal is
>> to be able to use common helpers for page table allocation and setup
>> for initial kernel page tables and page tables mapping the p2m list.
>> This is achieved by supporting multiple mapping areas. The mapped
>> virtual addresses of the single areas must not overlap, while the
>> page tables of a new area added might already be partially present.
>> Especially the top level page table is existing only once, of course.
>>
>> Currently restrict the number of mappings to 1 because the only mapping
>> now is the initial mapping created by toolstack. There should not be
>> behaviour change and guest visible change introduced.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> Reviewed-by: Wei Liu <wei.liu2@citrix.com
> 
> Missing closing ">". :-)

Uuh, mouse buffer underrun, I guess ;-)

> 
> I don't think you need to resend though.

Thanks,


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator
  2015-11-12 13:48   ` Wei Liu
@ 2015-11-12 14:03     ` Juergen Gross
  2015-11-12 14:47       ` Wei Liu
  2015-11-12 14:47       ` Ian Campbell
  0 siblings, 2 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 14:03 UTC (permalink / raw)
  To: Wei Liu
  Cc: roger.pau, stefano.stabellini, ian.jackson, Ian.Campbell, xen-devel

On 12/11/15 14:48, Wei Liu wrote:
> On Thu, Nov 12, 2015 at 02:43:28PM +0100, Juergen Gross wrote:
>> Guest memory allocation in the domain builder of libxc is done via
>> virtual addresses only. In order to be able to support preallocated
>> areas not virtually mapped reorganize the memory allocator to keep
>> track of allocated pages globally and in allocated segments.
>>
>> This requires an interface change of the allocate callback of the
>> domain builder which currently is using the last mapped virtual
>> address as a parameter. This is no problem as the only user of this
>> callback is stubdom/grub/kexec.c using this virtual address to
>> calculate the last used pfn.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
> 
> If you've tested and confirmed pvgrub (stubdom based grub) doesn't
> break:
> 
> Acked-by: Wei Liu <wei.liu2@citrix.com>
> 

Hmm, difficult. Is this ever tested automatically? I just tried and
it died:

...
vbd 51712 is hd0
******************* BLKFRONT for device/vbd/51712 **********
backend at /local/domain/0/backend/qdisk/3/51712
Failed to read /local/domain/0/backend/qdisk/3/51712/feature-barrier.
16777216 sectors of 0 bytes
**************************

I guess I need a special setup to make it work. :-(


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator
  2015-11-12 14:03     ` Juergen Gross
@ 2015-11-12 14:47       ` Wei Liu
  2015-11-12 14:47       ` Ian Campbell
  1 sibling, 0 replies; 72+ messages in thread
From: Wei Liu @ 2015-11-12 14:47 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Wei Liu, Ian.Campbell, stefano.stabellini, ian.jackson,
	xen-devel, roger.pau

On Thu, Nov 12, 2015 at 03:03:32PM +0100, Juergen Gross wrote:
> On 12/11/15 14:48, Wei Liu wrote:
> > On Thu, Nov 12, 2015 at 02:43:28PM +0100, Juergen Gross wrote:
> >> Guest memory allocation in the domain builder of libxc is done via
> >> virtual addresses only. In order to be able to support preallocated
> >> areas not virtually mapped reorganize the memory allocator to keep
> >> track of allocated pages globally and in allocated segments.
> >>
> >> This requires an interface change of the allocate callback of the
> >> domain builder which currently is using the last mapped virtual
> >> address as a parameter. This is no problem as the only user of this
> >> callback is stubdom/grub/kexec.c using this virtual address to
> >> calculate the last used pfn.
> >>
> >> Signed-off-by: Juergen Gross <jgross@suse.com>
> > 
> > If you've tested and confirmed pvgrub (stubdom based grub) doesn't
> > break:
> > 
> > Acked-by: Wei Liu <wei.liu2@citrix.com>
> > 
> 
> Hmm, difficult. Is this ever tested automatically? I just tried and
> it died:
> 

Unfortunately it is not tested in osstest.

> ...
> vbd 51712 is hd0
> ******************* BLKFRONT for device/vbd/51712 **********
> backend at /local/domain/0/backend/qdisk/3/51712
> Failed to read /local/domain/0/backend/qdisk/3/51712/feature-barrier.
> 16777216 sectors of 0 bytes
> **************************
> 
> I guess I need a special setup to make it work. :-(
> 

Can you paste in your guest config here?


Wei.

> 
> Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator
  2015-11-12 14:03     ` Juergen Gross
  2015-11-12 14:47       ` Wei Liu
@ 2015-11-12 14:47       ` Ian Campbell
  2015-11-12 14:48         ` Wei Liu
  2015-11-12 15:27         ` Juergen Gross
  1 sibling, 2 replies; 72+ messages in thread
From: Ian Campbell @ 2015-11-12 14:47 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: stefano.stabellini, ian.jackson, roger.pau, xen-devel

On Thu, 2015-11-12 at 15:03 +0100, Juergen Gross wrote:
> On 12/11/15 14:48, Wei Liu wrote:
> > On Thu, Nov 12, 2015 at 02:43:28PM +0100, Juergen Gross wrote:
> > > Guest memory allocation in the domain builder of libxc is done via
> > > virtual addresses only. In order to be able to support preallocated
> > > areas not virtually mapped reorganize the memory allocator to keep
> > > track of allocated pages globally and in allocated segments.
> > > 
> > > This requires an interface change of the allocate callback of the
> > > domain builder which currently is using the last mapped virtual
> > > address as a parameter. This is no problem as the only user of this
> > > callback is stubdom/grub/kexec.c using this virtual address to
> > > calculate the last used pfn.
> > > 
> > > Signed-off-by: Juergen Gross <jgross@suse.com>
> > 
> > If you've tested and confirmed pvgrub (stubdom based grub) doesn't
> > break:
> > 
> > Acked-by: Wei Liu <wei.liu2@citrix.com>
> > 
> 
> Hmm, difficult. Is this ever tested automatically?

Yes, see the test-amd64-amd64-amd64-pvgrub and test-amd64-amd64-i386-pvgrub 
jobs in any recent osstest flight (it was added a couple of months back.

http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-amd64-pvgrub/xen-unstable.html

suggests it works ok in general.

Ian.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator
  2015-11-12 14:47       ` Ian Campbell
@ 2015-11-12 14:48         ` Wei Liu
  2015-11-12 15:27         ` Juergen Gross
  1 sibling, 0 replies; 72+ messages in thread
From: Wei Liu @ 2015-11-12 14:48 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Juergen Gross, Wei Liu, stefano.stabellini, ian.jackson,
	xen-devel, roger.pau

On Thu, Nov 12, 2015 at 02:47:24PM +0000, Ian Campbell wrote:
> On Thu, 2015-11-12 at 15:03 +0100, Juergen Gross wrote:
> > On 12/11/15 14:48, Wei Liu wrote:
> > > On Thu, Nov 12, 2015 at 02:43:28PM +0100, Juergen Gross wrote:
> > > > Guest memory allocation in the domain builder of libxc is done via
> > > > virtual addresses only. In order to be able to support preallocated
> > > > areas not virtually mapped reorganize the memory allocator to keep
> > > > track of allocated pages globally and in allocated segments.
> > > > 
> > > > This requires an interface change of the allocate callback of the
> > > > domain builder which currently is using the last mapped virtual
> > > > address as a parameter. This is no problem as the only user of this
> > > > callback is stubdom/grub/kexec.c using this virtual address to
> > > > calculate the last used pfn.
> > > > 
> > > > Signed-off-by: Juergen Gross <jgross@suse.com>
> > > 
> > > If you've tested and confirmed pvgrub (stubdom based grub) doesn't
> > > break:
> > > 
> > > Acked-by: Wei Liu <wei.liu2@citrix.com>
> > > 
> > 
> > Hmm, difficult. Is this ever tested automatically?
> 
> Yes, see the test-amd64-amd64-amd64-pvgrub and test-amd64-amd64-i386-pvgrub 
> jobs in any recent osstest flight (it was added a couple of months back.
> 
> http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-amd64-pvgrub/xen-unstable.html
> 
> suggests it works ok in general.
> 

Ah, so I was wrong. Thanks for correction.

Wei.

> Ian.
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator
  2015-11-12 14:47       ` Ian Campbell
  2015-11-12 14:48         ` Wei Liu
@ 2015-11-12 15:27         ` Juergen Gross
  2015-11-12 15:55           ` Wei Liu
  1 sibling, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-11-12 15:27 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: xen-devel, roger.pau, ian.jackson, stefano.stabellini

On 12/11/15 15:47, Ian Campbell wrote:
> On Thu, 2015-11-12 at 15:03 +0100, Juergen Gross wrote:
>> On 12/11/15 14:48, Wei Liu wrote:
>>> On Thu, Nov 12, 2015 at 02:43:28PM +0100, Juergen Gross wrote:
>>>> Guest memory allocation in the domain builder of libxc is done via
>>>> virtual addresses only. In order to be able to support preallocated
>>>> areas not virtually mapped reorganize the memory allocator to keep
>>>> track of allocated pages globally and in allocated segments.
>>>>
>>>> This requires an interface change of the allocate callback of the
>>>> domain builder which currently is using the last mapped virtual
>>>> address as a parameter. This is no problem as the only user of this
>>>> callback is stubdom/grub/kexec.c using this virtual address to
>>>> calculate the last used pfn.
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>
>>> If you've tested and confirmed pvgrub (stubdom based grub) doesn't
>>> break:
>>>
>>> Acked-by: Wei Liu <wei.liu2@citrix.com>
>>>
>>
>> Hmm, difficult. Is this ever tested automatically?
> 
> Yes, see the test-amd64-amd64-amd64-pvgrub and test-amd64-amd64-i386-pvgrub 
> jobs in any recent osstest flight (it was added a couple of months back.
> 
> http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-amd64-pvgrub/xen-unstable.html
> 
> suggests it works ok in general.

It doesn't work on any of my test installations I've tried
so far (3 different installations with 3 different domUs).
And that's without my patches being active.

In case somebody else is capable of testing pvgrub with
xen-unstable I'd appreciate testing my patches there!


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator
  2015-11-12 15:27         ` Juergen Gross
@ 2015-11-12 15:55           ` Wei Liu
  2015-11-13  4:41             ` Juergen Gross
  2015-11-13  9:28             ` Ian Campbell
  0 siblings, 2 replies; 72+ messages in thread
From: Wei Liu @ 2015-11-12 15:55 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Wei Liu, Ian Campbell, stefano.stabellini, ian.jackson,
	xen-devel, roger.pau

On Thu, Nov 12, 2015 at 04:27:14PM +0100, Juergen Gross wrote:
> On 12/11/15 15:47, Ian Campbell wrote:
> > On Thu, 2015-11-12 at 15:03 +0100, Juergen Gross wrote:
> >> On 12/11/15 14:48, Wei Liu wrote:
> >>> On Thu, Nov 12, 2015 at 02:43:28PM +0100, Juergen Gross wrote:
> >>>> Guest memory allocation in the domain builder of libxc is done via
> >>>> virtual addresses only. In order to be able to support preallocated
> >>>> areas not virtually mapped reorganize the memory allocator to keep
> >>>> track of allocated pages globally and in allocated segments.
> >>>>
> >>>> This requires an interface change of the allocate callback of the
> >>>> domain builder which currently is using the last mapped virtual
> >>>> address as a parameter. This is no problem as the only user of this
> >>>> callback is stubdom/grub/kexec.c using this virtual address to
> >>>> calculate the last used pfn.
> >>>>
> >>>> Signed-off-by: Juergen Gross <jgross@suse.com>
> >>>
> >>> If you've tested and confirmed pvgrub (stubdom based grub) doesn't
> >>> break:
> >>>
> >>> Acked-by: Wei Liu <wei.liu2@citrix.com>
> >>>
> >>
> >> Hmm, difficult. Is this ever tested automatically?
> > 
> > Yes, see the test-amd64-amd64-amd64-pvgrub and test-amd64-amd64-i386-pvgrub 
> > jobs in any recent osstest flight (it was added a couple of months back.
> > 
> > http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-amd64-pvgrub/xen-unstable.html
> > 
> > suggests it works ok in general.
> 
> It doesn't work on any of my test installations I've tried
> so far (3 different installations with 3 different domUs).
> And that's without my patches being active.
> 
> In case somebody else is capable of testing pvgrub with
> xen-unstable I'd appreciate testing my patches there!
> 

I played with the pvgrub generated with this series, it can boot Debian
Wheezy stock 3.2 kernel.

Note that my Wheezy  PV installation is using grub2 so pvgrub doesn't
seem to be able to parse its config file so I had to manually enter
some grub commands. But I don't think that affect the big picture.

Wei.

> 
> Juergen
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator
  2015-11-12 15:55           ` Wei Liu
@ 2015-11-13  4:41             ` Juergen Gross
  2015-11-13  9:28             ` Ian Campbell
  1 sibling, 0 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-13  4:41 UTC (permalink / raw)
  To: Wei Liu
  Cc: roger.pau, xen-devel, ian.jackson, Ian Campbell, stefano.stabellini

On 12/11/15 16:55, Wei Liu wrote:
> On Thu, Nov 12, 2015 at 04:27:14PM +0100, Juergen Gross wrote:
>> On 12/11/15 15:47, Ian Campbell wrote:
>>> On Thu, 2015-11-12 at 15:03 +0100, Juergen Gross wrote:
>>>> On 12/11/15 14:48, Wei Liu wrote:
>>>>> On Thu, Nov 12, 2015 at 02:43:28PM +0100, Juergen Gross wrote:
>>>>>> Guest memory allocation in the domain builder of libxc is done via
>>>>>> virtual addresses only. In order to be able to support preallocated
>>>>>> areas not virtually mapped reorganize the memory allocator to keep
>>>>>> track of allocated pages globally and in allocated segments.
>>>>>>
>>>>>> This requires an interface change of the allocate callback of the
>>>>>> domain builder which currently is using the last mapped virtual
>>>>>> address as a parameter. This is no problem as the only user of this
>>>>>> callback is stubdom/grub/kexec.c using this virtual address to
>>>>>> calculate the last used pfn.
>>>>>>
>>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>>
>>>>> If you've tested and confirmed pvgrub (stubdom based grub) doesn't
>>>>> break:
>>>>>
>>>>> Acked-by: Wei Liu <wei.liu2@citrix.com>
>>>>>
>>>>
>>>> Hmm, difficult. Is this ever tested automatically?
>>>
>>> Yes, see the test-amd64-amd64-amd64-pvgrub and test-amd64-amd64-i386-pvgrub 
>>> jobs in any recent osstest flight (it was added a couple of months back.
>>>
>>> http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-amd64-pvgrub/xen-unstable.html
>>>
>>> suggests it works ok in general.
>>
>> It doesn't work on any of my test installations I've tried
>> so far (3 different installations with 3 different domUs).
>> And that's without my patches being active.
>>
>> In case somebody else is capable of testing pvgrub with
>> xen-unstable I'd appreciate testing my patches there!
>>
> 
> I played with the pvgrub generated with this series, it can boot Debian
> Wheezy stock 3.2 kernel.

Thanks for testing this!


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator
  2015-11-12 15:55           ` Wei Liu
  2015-11-13  4:41             ` Juergen Gross
@ 2015-11-13  9:28             ` Ian Campbell
  2015-11-13 11:13               ` Wei Liu
  1 sibling, 1 reply; 72+ messages in thread
From: Ian Campbell @ 2015-11-13  9:28 UTC (permalink / raw)
  To: Wei Liu, Juergen Gross
  Cc: xen-devel, roger.pau, ian.jackson, stefano.stabellini

On Thu, 2015-11-12 at 15:55 +0000, Wei Liu wrote:
> 
> Note that my Wheezy  PV installation is using grub2 so pvgrub doesn't
> seem to be able to parse its config file

FYI you can workaround this by installing the "pv-grub-menu" package (in
wheezy-backports and from Jessie onwards) which provides a grub1 compatible
menu.lst. I think (bu I'm not sure) that it can be coinstalled with the
regular grub2 packages for HVM booting purposes.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator
  2015-11-13  9:28             ` Ian Campbell
@ 2015-11-13 11:13               ` Wei Liu
  0 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2015-11-13 11:13 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Juergen Gross, Wei Liu, stefano.stabellini, ian.jackson,
	xen-devel, roger.pau

On Fri, Nov 13, 2015 at 09:28:11AM +0000, Ian Campbell wrote:
> On Thu, 2015-11-12 at 15:55 +0000, Wei Liu wrote:
> > 
> > Note that my Wheezy  PV installation is using grub2 so pvgrub doesn't
> > seem to be able to parse its config file
> 
> FYI you can workaround this by installing the "pv-grub-menu" package (in
> wheezy-backports and from Jessie onwards) which provides a grub1 compatible
> menu.lst. I think (bu I'm not sure) that it can be coinstalled with the
> regular grub2 packages for HVM booting purposes.
> 

Thanks for the information.

I installed this package. Pvgrub worked fine. So I think this series
should be at least be able to pass our own push-gate.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 8/9] libxc: rework of domain builder's page table handler
  2015-11-12 13:48     ` Juergen Gross
@ 2015-11-16 13:40       ` Ian Campbell
  2015-11-16 14:32         ` Juergen Gross
  0 siblings, 1 reply; 72+ messages in thread
From: Ian Campbell @ 2015-11-16 13:40 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: stefano.stabellini, ian.jackson, roger.pau, xen-devel

On Thu, 2015-11-12 at 14:48 +0100, Juergen Gross wrote:
> On 12/11/15 14:47, Wei Liu wrote:
> > On Thu, Nov 12, 2015 at 02:43:35PM +0100, Juergen Gross wrote:
> > > In order to prepare a p2m list outside of the initial kernel mapping
> > > do a rework of the domain builder's page table handler. The goal is
> > > to be able to use common helpers for page table allocation and setup
> > > for initial kernel page tables and page tables mapping the p2m list.
> > > This is achieved by supporting multiple mapping areas. The mapped
> > > virtual addresses of the single areas must not overlap, while the
> > > page tables of a new area added might already be partially present.
> > > Especially the top level page table is existing only once, of course.
> > > 
> > > Currently restrict the number of mappings to 1 because the only
> > > mapping
> > > now is the initial mapping created by toolstack. There should not be
> > > behaviour change and guest visible change introduced.
> > > 
> > > Signed-off-by: Juergen Gross <jgross@suse.com>
> > > Reviewed-by: Wei Liu <wei.liu2@citrix.com
> > 
> > Missing closing ">". :-)
> 
> Uuh, mouse buffer underrun, I guess ;-)
> 
> > 
> > I don't think you need to resend though.
> 
> Thanks,

I fixed while I was applying the whole series.

Ian.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 8/9] libxc: rework of domain builder's page table handler
  2015-11-16 13:40       ` Ian Campbell
@ 2015-11-16 14:32         ` Juergen Gross
  0 siblings, 0 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-16 14:32 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: stefano.stabellini, ian.jackson, roger.pau, xen-devel

On 16/11/15 14:40, Ian Campbell wrote:
> On Thu, 2015-11-12 at 14:48 +0100, Juergen Gross wrote:
>> On 12/11/15 14:47, Wei Liu wrote:
>>> On Thu, Nov 12, 2015 at 02:43:35PM +0100, Juergen Gross wrote:
>>>> In order to prepare a p2m list outside of the initial kernel mapping
>>>> do a rework of the domain builder's page table handler. The goal is
>>>> to be able to use common helpers for page table allocation and setup
>>>> for initial kernel page tables and page tables mapping the p2m list.
>>>> This is achieved by supporting multiple mapping areas. The mapped
>>>> virtual addresses of the single areas must not overlap, while the
>>>> page tables of a new area added might already be partially present.
>>>> Especially the top level page table is existing only once, of course.
>>>>
>>>> Currently restrict the number of mappings to 1 because the only
>>>> mapping
>>>> now is the initial mapping created by toolstack. There should not be
>>>> behaviour change and guest visible change introduced.
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> Reviewed-by: Wei Liu <wei.liu2@citrix.com
>>>
>>> Missing closing ">". :-)
>>
>> Uuh, mouse buffer underrun, I guess ;-)
>>
>>>
>>> I don't think you need to resend though.
>>
>> Thanks,
> 
> I fixed while I was applying the whole series.

Thanks,


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 8/9] libxc: rework of domain builder's page table handler
  2015-11-12 13:43 ` [PATCH v5 8/9] libxc: rework of domain builder's page table handler Juergen Gross
  2015-11-12 13:47   ` Wei Liu
@ 2015-11-18 16:11   ` Boris Ostrovsky
  2015-11-18 16:16     ` Wei Liu
  1 sibling, 1 reply; 72+ messages in thread
From: Boris Ostrovsky @ 2015-11-18 16:11 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, Ian.Campbell, ian.jackson,
	stefano.stabellini, wei.liu2, roger.pau

On 11/12/2015 08:43 AM, Juergen Gross wrote:
> In order to prepare a p2m list outside of the initial kernel mapping
> do a rework of the domain builder's page table handler. The goal is
> to be able to use common helpers for page table allocation and setup
> for initial kernel page tables and page tables mapping the p2m list.
> This is achieved by supporting multiple mapping areas. The mapped
> virtual addresses of the single areas must not overlap, while the
> page tables of a new area added might already be partially present.
> Especially the top level page table is existing only once, of course.
>
> Currently restrict the number of mappings to 1 because the only mapping
> now is the initial mapping created by toolstack. There should not be
> behaviour change and guest visible change introduced.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com
> ---
>   tools/libxc/xc_dom_x86.c | 479 ++++++++++++++++++++++++-----------------------
>   tools/libxc/xg_private.h |  39 +---
>   2 files changed, 252 insertions(+), 266 deletions(-)

This broke PVH. I get a triple fault (somewhere in clear_page(), so the 
guest have run a little)

-boris

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 8/9] libxc: rework of domain builder's page table handler
  2015-11-18 16:11   ` Boris Ostrovsky
@ 2015-11-18 16:16     ` Wei Liu
  2015-11-18 16:21       ` Boris Ostrovsky
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2015-11-18 16:16 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Juergen Gross, wei.liu2, Ian.Campbell, stefano.stabellini,
	ian.jackson, xen-devel, roger.pau

On Wed, Nov 18, 2015 at 11:11:16AM -0500, Boris Ostrovsky wrote:
> On 11/12/2015 08:43 AM, Juergen Gross wrote:
> >In order to prepare a p2m list outside of the initial kernel mapping
> >do a rework of the domain builder's page table handler. The goal is
> >to be able to use common helpers for page table allocation and setup
> >for initial kernel page tables and page tables mapping the p2m list.
> >This is achieved by supporting multiple mapping areas. The mapped
> >virtual addresses of the single areas must not overlap, while the
> >page tables of a new area added might already be partially present.
> >Especially the top level page table is existing only once, of course.
> >
> >Currently restrict the number of mappings to 1 because the only mapping
> >now is the initial mapping created by toolstack. There should not be
> >behaviour change and guest visible change introduced.
> >
> >Signed-off-by: Juergen Gross <jgross@suse.com>
> >Reviewed-by: Wei Liu <wei.liu2@citrix.com
> >---
> >  tools/libxc/xc_dom_x86.c | 479 ++++++++++++++++++++++++-----------------------
> >  tools/libxc/xg_private.h |  39 +---
> >  2 files changed, 252 insertions(+), 266 deletions(-)
> 
> This broke PVH. I get a triple fault (somewhere in clear_page(), so the
> guest have run a little)
> 

Is there any output? We just noticed 32 bit pvgrub is broken. Not sure
these two problems are related but the more information the better.

Wei.

> -boris
> 
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 8/9] libxc: rework of domain builder's page table handler
  2015-11-18 16:16     ` Wei Liu
@ 2015-11-18 16:21       ` Boris Ostrovsky
  2015-11-19  6:09         ` Juergen Gross
  0 siblings, 1 reply; 72+ messages in thread
From: Boris Ostrovsky @ 2015-11-18 16:21 UTC (permalink / raw)
  To: Wei Liu
  Cc: Juergen Gross, Ian.Campbell, stefano.stabellini, ian.jackson,
	xen-devel, roger.pau

On 11/18/2015 11:16 AM, Wei Liu wrote:
> On Wed, Nov 18, 2015 at 11:11:16AM -0500, Boris Ostrovsky wrote:
>> On 11/12/2015 08:43 AM, Juergen Gross wrote:
>>> In order to prepare a p2m list outside of the initial kernel mapping
>>> do a rework of the domain builder's page table handler. The goal is
>>> to be able to use common helpers for page table allocation and setup
>>> for initial kernel page tables and page tables mapping the p2m list.
>>> This is achieved by supporting multiple mapping areas. The mapped
>>> virtual addresses of the single areas must not overlap, while the
>>> page tables of a new area added might already be partially present.
>>> Especially the top level page table is existing only once, of course.
>>>
>>> Currently restrict the number of mappings to 1 because the only mapping
>>> now is the initial mapping created by toolstack. There should not be
>>> behaviour change and guest visible change introduced.
>>>
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>> Reviewed-by: Wei Liu <wei.liu2@citrix.com
>>> ---
>>>   tools/libxc/xc_dom_x86.c | 479 ++++++++++++++++++++++++-----------------------
>>>   tools/libxc/xg_private.h |  39 +---
>>>   2 files changed, 252 insertions(+), 266 deletions(-)
>> This broke PVH. I get a triple fault (somewhere in clear_page(), so the
>> guest have run a little)
>>
> Is there any output? We just noticed 32 bit pvgrub is broken. Not sure
> these two problems are related but the more information the better.


Not much:

(d3) mapping kernel into physical memory
(XEN) d3v0 Triple fault - invoking HVM shutdown action 0

I then looked at RIP and it was pointing to 'mov    %rax,(%rdi)' in 
clear_page_orig(). I didn't check what %rdi was.

-boris

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 8/9] libxc: rework of domain builder's page table handler
  2015-11-18 16:21       ` Boris Ostrovsky
@ 2015-11-19  6:09         ` Juergen Gross
  2015-11-19 13:41           ` Boris Ostrovsky
  0 siblings, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-11-19  6:09 UTC (permalink / raw)
  To: Boris Ostrovsky, Wei Liu
  Cc: roger.pau, stefano.stabellini, ian.jackson, Ian.Campbell, xen-devel

[-- Attachment #1: Type: text/plain, Size: 1952 bytes --]

On 18/11/15 17:21, Boris Ostrovsky wrote:
> On 11/18/2015 11:16 AM, Wei Liu wrote:
>> On Wed, Nov 18, 2015 at 11:11:16AM -0500, Boris Ostrovsky wrote:
>>> On 11/12/2015 08:43 AM, Juergen Gross wrote:
>>>> In order to prepare a p2m list outside of the initial kernel mapping
>>>> do a rework of the domain builder's page table handler. The goal is
>>>> to be able to use common helpers for page table allocation and setup
>>>> for initial kernel page tables and page tables mapping the p2m list.
>>>> This is achieved by supporting multiple mapping areas. The mapped
>>>> virtual addresses of the single areas must not overlap, while the
>>>> page tables of a new area added might already be partially present.
>>>> Especially the top level page table is existing only once, of course.
>>>>
>>>> Currently restrict the number of mappings to 1 because the only mapping
>>>> now is the initial mapping created by toolstack. There should not be
>>>> behaviour change and guest visible change introduced.
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> Reviewed-by: Wei Liu <wei.liu2@citrix.com
>>>> ---
>>>>   tools/libxc/xc_dom_x86.c | 479
>>>> ++++++++++++++++++++++++-----------------------
>>>>   tools/libxc/xg_private.h |  39 +---
>>>>   2 files changed, 252 insertions(+), 266 deletions(-)
>>> This broke PVH. I get a triple fault (somewhere in clear_page(), so the
>>> guest have run a little)
>>>
>> Is there any output? We just noticed 32 bit pvgrub is broken. Not sure
>> these two problems are related but the more information the better.
> 
> 
> Not much:
> 
> (d3) mapping kernel into physical memory
> (XEN) d3v0 Triple fault - invoking HVM shutdown action 0
> 
> I then looked at RIP and it was pointing to 'mov    %rax,(%rdi)' in
> clear_page_orig(). I didn't check what %rdi was.

I think I've found the bug. I dropped the special case for pvh to
map page tables writable. Can you try the attached patch, please?


Juergen


[-- Attachment #2: pvh.patch --]
[-- Type: text/x-patch, Size: 541 bytes --]

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 7279fa2..d529518 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -380,7 +380,7 @@ static x86_pgentry_t get_pg_prot_x86(struct xc_dom_image *dom, int l,
         map = domx86->maps + m;
         pfn_s = map->lvls[domx86->params->levels - 1].pfn;
         pfn_e = map->area.pgtables + pfn_s;
-        if ( pfn >= pfn_s && pfn < pfn_e )
+        if ( !dom->pvh_enabled && pfn >= pfn_s && pfn < pfn_e )
             return prot & ~_PAGE_RW;
     }
 

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 8/9] libxc: rework of domain builder's page table handler
  2015-11-19  6:09         ` Juergen Gross
@ 2015-11-19 13:41           ` Boris Ostrovsky
  0 siblings, 0 replies; 72+ messages in thread
From: Boris Ostrovsky @ 2015-11-19 13:41 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: roger.pau, stefano.stabellini, ian.jackson, Ian.Campbell, xen-devel

On 11/19/2015 01:09 AM, Juergen Gross wrote:
> On 18/11/15 17:21, Boris Ostrovsky wrote:
>> On 11/18/2015 11:16 AM, Wei Liu wrote:
>>> On Wed, Nov 18, 2015 at 11:11:16AM -0500, Boris Ostrovsky wrote:
>>>> On 11/12/2015 08:43 AM, Juergen Gross wrote:
>>>>> In order to prepare a p2m list outside of the initial kernel mapping
>>>>> do a rework of the domain builder's page table handler. The goal is
>>>>> to be able to use common helpers for page table allocation and setup
>>>>> for initial kernel page tables and page tables mapping the p2m list.
>>>>> This is achieved by supporting multiple mapping areas. The mapped
>>>>> virtual addresses of the single areas must not overlap, while the
>>>>> page tables of a new area added might already be partially present.
>>>>> Especially the top level page table is existing only once, of course.
>>>>>
>>>>> Currently restrict the number of mappings to 1 because the only mapping
>>>>> now is the initial mapping created by toolstack. There should not be
>>>>> behaviour change and guest visible change introduced.
>>>>>
>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>> Reviewed-by: Wei Liu <wei.liu2@citrix.com
>>>>> ---
>>>>>    tools/libxc/xc_dom_x86.c | 479
>>>>> ++++++++++++++++++++++++-----------------------
>>>>>    tools/libxc/xg_private.h |  39 +---
>>>>>    2 files changed, 252 insertions(+), 266 deletions(-)
>>>> This broke PVH. I get a triple fault (somewhere in clear_page(), so the
>>>> guest have run a little)
>>>>
>>> Is there any output? We just noticed 32 bit pvgrub is broken. Not sure
>>> these two problems are related but the more information the better.
>>
>> Not much:
>>
>> (d3) mapping kernel into physical memory
>> (XEN) d3v0 Triple fault - invoking HVM shutdown action 0
>>
>> I then looked at RIP and it was pointing to 'mov    %rax,(%rdi)' in
>> clear_page_orig(). I didn't check what %rdi was.
> I think I've found the bug. I dropped the special case for pvh to
> map page tables writable. Can you try the attached patch, please?


Yes, that fixes it. Thanks.

-boris

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-12 13:43 ` [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported Juergen Gross
@ 2015-11-25 16:12   ` Boris Ostrovsky
  2015-11-25 16:18     ` Wei Liu
                       ` (2 more replies)
  0 siblings, 3 replies; 72+ messages in thread
From: Boris Ostrovsky @ 2015-11-25 16:12 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, Ian.Campbell, ian.jackson,
	stefano.stabellini, wei.liu2, roger.pau

On 11/12/2015 08:43 AM, Juergen Gross wrote:
> In case the kernel of a new pv-domU indicates it is supporting an
> unmapped initrd, don't waste precious virtual space for the initrd,
> but allocate only guest physical memory for it.

This patch breaks 32-bit pygrub.

I am not 100% sure yet but it may be that only 64-bit guests are affected.

With RHEL5 I get
     initrd extends beyond end of memory (0x780080eda000 > 0x40000000)


-boris

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-25 16:12   ` Boris Ostrovsky
@ 2015-11-25 16:18     ` Wei Liu
  2015-11-25 16:24       ` Boris Ostrovsky
  2015-11-25 16:29       ` Ian Campbell
  2015-11-26  5:06     ` Juergen Gross
  2015-11-26  7:35     ` Juergen Gross
  2 siblings, 2 replies; 72+ messages in thread
From: Wei Liu @ 2015-11-25 16:18 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Juergen Gross, wei.liu2, Ian.Campbell, stefano.stabellini,
	ian.jackson, xen-devel, roger.pau

On Wed, Nov 25, 2015 at 11:12:12AM -0500, Boris Ostrovsky wrote:
> On 11/12/2015 08:43 AM, Juergen Gross wrote:
> >In case the kernel of a new pv-domU indicates it is supporting an
> >unmapped initrd, don't waste precious virtual space for the initrd,
> >but allocate only guest physical memory for it.
> 
> This patch breaks 32-bit pygrub.
> 

This particular patch?

We discovered a bug in mini-os that caused 32-bit pygrub to break withi
this series. It's now fixed in mini-os upstream.  Check Config.mk for
mini-os commit that fixes the bug.

> I am not 100% sure yet but it may be that only 64-bit guests are affected.
> 
> With RHEL5 I get
>     initrd extends beyond end of memory (0x780080eda000 > 0x40000000)
> 

But this is different from what we found. We need more information.

My wild guess is RHEL5 advertise it supports unmapped initrd but it was
buggy in some way.

Wei.

> 
> -boris

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-25 16:18     ` Wei Liu
@ 2015-11-25 16:24       ` Boris Ostrovsky
  2015-11-25 16:29       ` Ian Campbell
  1 sibling, 0 replies; 72+ messages in thread
From: Boris Ostrovsky @ 2015-11-25 16:24 UTC (permalink / raw)
  To: Wei Liu
  Cc: Juergen Gross, Ian.Campbell, stefano.stabellini, ian.jackson,
	xen-devel, roger.pau

On 11/25/2015 11:18 AM, Wei Liu wrote:
> On Wed, Nov 25, 2015 at 11:12:12AM -0500, Boris Ostrovsky wrote:
>> On 11/12/2015 08:43 AM, Juergen Gross wrote:
>>> In case the kernel of a new pv-domU indicates it is supporting an
>>> unmapped initrd, don't waste precious virtual space for the initrd,
>>> but allocate only guest physical memory for it.
>> This patch breaks 32-bit pygrub.
>>
> This particular patch?
>
> We discovered a bug in mini-os that caused 32-bit pygrub to break withi
> this series. It's now fixed in mini-os upstream.  Check Config.mk for
> mini-os commit that fixes the bug.

Yes, I was waiting for that patch because somehow I thought it was going 
to fix this. And it was unrelated.

>
>> I am not 100% sure yet but it may be that only 64-bit guests are affected.
>>
>> With RHEL5 I get
>>      initrd extends beyond end of memory (0x780080eda000 > 0x40000000)
>>
> But this is different from what we found. We need more information.
>
> My wild guess is RHEL5 advertise it supports unmapped initrd but it was
> buggy in some way.

Maybe. Unfortunately RHEL5 is the only guest I have available to me 
right now that I can test with pygrub on 32 bit. I have others and they 
also fail but I am not convinced they fail because of this issue.

I was going to look some more at this but I am not sure how much I will 
be able to do before next Monday (it's Thanksgiving 4-day weekend in US) 
so I figured I'd post this now.

-boris

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-25 16:18     ` Wei Liu
  2015-11-25 16:24       ` Boris Ostrovsky
@ 2015-11-25 16:29       ` Ian Campbell
  2015-11-25 16:31         ` Wei Liu
  2015-11-25 16:34         ` Boris Ostrovsky
  1 sibling, 2 replies; 72+ messages in thread
From: Ian Campbell @ 2015-11-25 16:29 UTC (permalink / raw)
  To: Wei Liu, Boris Ostrovsky
  Cc: Juergen Gross, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On Wed, 2015-11-25 at 16:18 +0000, Wei Liu wrote:
> On Wed, Nov 25, 2015 at 11:12:12AM -0500, Boris Ostrovsky wrote:
> > On 11/12/2015 08:43 AM, Juergen Gross wrote:
> > > In case the kernel of a new pv-domU indicates it is supporting an
> > > unmapped initrd, don't waste precious virtual space for the initrd,
> > > but allocate only guest physical memory for it.
> > 
> > This patch breaks 32-bit pygrub.
> > 
> 
> This particular patch?
> 
> We discovered a bug in mini-os that caused 32-bit pygrub to break withi
> this series. It's now fixed in mini-os upstream.  Check Config.mk for
> mini-os commit that fixes the bug.

Are we really talking about pygrub here, or pvgrub? (the former is
unaffected by mini-os).

If we are really talking about pygrub then we are _actually_ talking about
the domain builder when operating on the RHEL5 kernel+initrd -- the fact
ithat they were extracted from the guest filesystem by pygrub is
irrelevant.

> > With RHEL5 I get
> >     initrd extends beyond end of memory (0x780080eda000 > 0x40000000)

This is reported within the guest, right?

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-25 16:29       ` Ian Campbell
@ 2015-11-25 16:31         ` Wei Liu
  2015-11-25 16:34         ` Boris Ostrovsky
  1 sibling, 0 replies; 72+ messages in thread
From: Wei Liu @ 2015-11-25 16:31 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Juergen Gross, Wei Liu, stefano.stabellini, ian.jackson,
	xen-devel, Boris Ostrovsky, roger.pau

On Wed, Nov 25, 2015 at 04:29:13PM +0000, Ian Campbell wrote:
> On Wed, 2015-11-25 at 16:18 +0000, Wei Liu wrote:
> > On Wed, Nov 25, 2015 at 11:12:12AM -0500, Boris Ostrovsky wrote:
> > > On 11/12/2015 08:43 AM, Juergen Gross wrote:
> > > > In case the kernel of a new pv-domU indicates it is supporting an
> > > > unmapped initrd, don't waste precious virtual space for the initrd,
> > > > but allocate only guest physical memory for it.
> > > 
> > > This patch breaks 32-bit pygrub.
> > > 
> > 
> > This particular patch?
> > 
> > We discovered a bug in mini-os that caused 32-bit pygrub to break withi
> > this series. It's now fixed in mini-os upstream.  Check Config.mk for
> > mini-os commit that fixes the bug.
> 
> Are we really talking about pygrub here, or pvgrub? (the former is
> unaffected by mini-os).
> 

Duh! I misread again.

Boris, can you clarify this?

> If we are really talking about pygrub then we are _actually_ talking about
> the domain builder when operating on the RHEL5 kernel+initrd -- the fact
> ithat they were extracted from the guest filesystem by pygrub is
> irrelevant.
> 
> > > With RHEL5 I get
> > >     initrd extends beyond end of memory (0x780080eda000 > 0x40000000)
> 
> This is reported within the guest, right?
> 
> Ian.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-25 16:29       ` Ian Campbell
  2015-11-25 16:31         ` Wei Liu
@ 2015-11-25 16:34         ` Boris Ostrovsky
  1 sibling, 0 replies; 72+ messages in thread
From: Boris Ostrovsky @ 2015-11-25 16:34 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: Juergen Gross, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On 11/25/2015 11:29 AM, Ian Campbell wrote:
> On Wed, 2015-11-25 at 16:18 +0000, Wei Liu wrote:
>> On Wed, Nov 25, 2015 at 11:12:12AM -0500, Boris Ostrovsky wrote:
>>> On 11/12/2015 08:43 AM, Juergen Gross wrote:
>>>> In case the kernel of a new pv-domU indicates it is supporting an
>>>> unmapped initrd, don't waste precious virtual space for the initrd,
>>>> but allocate only guest physical memory for it.
>>> This patch breaks 32-bit pygrub.
>>>
>> This particular patch?
>>
>> We discovered a bug in mini-os that caused 32-bit pygrub to break withi
>> this series. It's now fixed in mini-os upstream.  Check Config.mk for
>> mini-os commit that fixes the bug.
> Are we really talking about pygrub here, or pvgrub? (the former is
> unaffected by mini-os).

That's exactly what confused me into thinking earlier that the fix that 
Wei is talking about would resolve my problem.

It's pYgrub.

>
> If we are really talking about pygrub then we are _actually_ talking about
> the domain builder when operating on the RHEL5 kernel+initrd -- the fact
> ithat they were extracted from the guest filesystem by pygrub is
> irrelevant.
>
>>> With RHEL5 I get
>>>       initrd extends beyond end of memory (0x780080eda000 > 0x40000000)
> This is reported within the guest, right?


Right. I don't have RHEL5 sources to see what exactly the code is doing 
but if it prints what it says it does ;-) then the address looks pretty 
bogus.

-boris

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-25 16:12   ` Boris Ostrovsky
  2015-11-25 16:18     ` Wei Liu
@ 2015-11-26  5:06     ` Juergen Gross
  2015-11-26  5:19       ` Juergen Gross
  2015-11-26  7:35     ` Juergen Gross
  2 siblings, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-11-26  5:06 UTC (permalink / raw)
  To: Boris Ostrovsky, xen-devel, Ian.Campbell, ian.jackson,
	stefano.stabellini, wei.liu2, roger.pau

On 25/11/15 17:12, Boris Ostrovsky wrote:
> On 11/12/2015 08:43 AM, Juergen Gross wrote:
>> In case the kernel of a new pv-domU indicates it is supporting an
>> unmapped initrd, don't waste precious virtual space for the initrd,
>> but allocate only guest physical memory for it.
> 
> This patch breaks 32-bit pygrub.
> 
> I am not 100% sure yet but it may be that only 64-bit guests are affected.
> 
> With RHEL5 I get
>     initrd extends beyond end of memory (0x780080eda000 > 0x40000000)

Let me summarize your findings:

You are using a 32 bit dom0 to start a 64 bit RHEL5 guest via pygrub
(not pvgrub). The guest then barfs about the initrd position in
memory.

Can you get the debug output of the domain builder? This would help
to see what is really happening.


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-26  5:06     ` Juergen Gross
@ 2015-11-26  5:19       ` Juergen Gross
  0 siblings, 0 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-26  5:19 UTC (permalink / raw)
  To: Boris Ostrovsky, xen-devel, Ian.Campbell, ian.jackson,
	stefano.stabellini, wei.liu2, roger.pau

On 26/11/15 06:06, Juergen Gross wrote:
> On 25/11/15 17:12, Boris Ostrovsky wrote:
>> On 11/12/2015 08:43 AM, Juergen Gross wrote:
>>> In case the kernel of a new pv-domU indicates it is supporting an
>>> unmapped initrd, don't waste precious virtual space for the initrd,
>>> but allocate only guest physical memory for it.
>>
>> This patch breaks 32-bit pygrub.
>>
>> I am not 100% sure yet but it may be that only 64-bit guests are affected.
>>
>> With RHEL5 I get
>>     initrd extends beyond end of memory (0x780080eda000 > 0x40000000)
> 
> Let me summarize your findings:
> 
> You are using a 32 bit dom0 to start a 64 bit RHEL5 guest via pygrub
> (not pvgrub). The guest then barfs about the initrd position in
> memory.
> 
> Can you get the debug output of the domain builder? This would help
> to see what is really happening.

I think I have found a potential problem not (directly) related to my
patch:

The domain builder is using xen_pfn_t for pfns. With a 32 bit toolstack
this will lead to problems with 64 bit guests, as xen_pfn_t on x86 is:

typedef unsigned long xen_pfn_t;

I guess we have to modify the domain builder to use a 64 bit type
instead.


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-25 16:12   ` Boris Ostrovsky
  2015-11-25 16:18     ` Wei Liu
  2015-11-26  5:06     ` Juergen Gross
@ 2015-11-26  7:35     ` Juergen Gross
  2015-11-30 10:20       ` Wei Liu
  2015-11-30 18:00       ` Boris Ostrovsky
  2 siblings, 2 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-26  7:35 UTC (permalink / raw)
  To: Boris Ostrovsky, xen-devel, Ian.Campbell, ian.jackson,
	stefano.stabellini, wei.liu2, roger.pau

[-- Attachment #1: Type: text/plain, Size: 584 bytes --]

On 25/11/15 17:12, Boris Ostrovsky wrote:
> On 11/12/2015 08:43 AM, Juergen Gross wrote:
>> In case the kernel of a new pv-domU indicates it is supporting an
>> unmapped initrd, don't waste precious virtual space for the initrd,
>> but allocate only guest physical memory for it.
> 
> This patch breaks 32-bit pygrub.
> 
> I am not 100% sure yet but it may be that only 64-bit guests are affected.
> 
> With RHEL5 I get
>     initrd extends beyond end of memory (0x780080eda000 > 0x40000000)

I think I have found the problem. Can you verify the attached patch is
working?


Juergen


[-- Attachment #2: 0001-libxc-correct-domain-builder-for-64-bit-guest-with-3.patch --]
[-- Type: text/x-patch, Size: 1160 bytes --]

>From 11eaee2aa2291a1d56556d538ac23b8156cf3388 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Thu, 26 Nov 2015 08:32:26 +0100
Subject: [PATCH] libxc: correct domain builder for 64 bit guest with 32 bit
 tools

Commit 8c45adec18e0512c3d34dcafb13414ecba21be6a ("create unmapped
initrd in domain builder if supported") introduced an error for
building a 64 bit guest with a 32 bit toolset.

The initrd start address and size where stored in an unsigned long
instead of using a 64 bit type.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/libxc/include/xc_dom.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 2176216..370dddd 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -99,8 +99,8 @@ struct xc_dom_image {
     xen_vaddr_t bsd_symtab_start;
 
     /* initrd parameters as specified in start_info page */
-    unsigned long initrd_start;
-    unsigned long initrd_len;
+    uint64_t initrd_start;
+    uint64_t initrd_len;
 
     unsigned int alloc_bootstack;
     xen_vaddr_t virt_pgtab_end;
-- 
2.6.2


[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-26  7:35     ` Juergen Gross
@ 2015-11-30 10:20       ` Wei Liu
  2015-11-30 10:23         ` Juergen Gross
  2015-11-30 18:00       ` Boris Ostrovsky
  1 sibling, 1 reply; 72+ messages in thread
From: Wei Liu @ 2015-11-30 10:20 UTC (permalink / raw)
  To: Juergen Gross
  Cc: wei.liu2, Ian.Campbell, stefano.stabellini, ian.jackson,
	xen-devel, Boris Ostrovsky, roger.pau

On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross wrote:
> On 25/11/15 17:12, Boris Ostrovsky wrote:
> > On 11/12/2015 08:43 AM, Juergen Gross wrote:
> >> In case the kernel of a new pv-domU indicates it is supporting an
> >> unmapped initrd, don't waste precious virtual space for the initrd,
> >> but allocate only guest physical memory for it.
> > 
> > This patch breaks 32-bit pygrub.
> > 
> > I am not 100% sure yet but it may be that only 64-bit guests are affected.
> > 
> > With RHEL5 I get
> >     initrd extends beyond end of memory (0x780080eda000 > 0x40000000)
> 
> I think I have found the problem. Can you verify the attached patch is
> working?
> 
> 
> Juergen
> 

> From 11eaee2aa2291a1d56556d538ac23b8156cf3388 Mon Sep 17 00:00:00 2001
> From: Juergen Gross <jgross@suse.com>
> Date: Thu, 26 Nov 2015 08:32:26 +0100
> Subject: [PATCH] libxc: correct domain builder for 64 bit guest with 32 bit
>  tools
> 
> Commit 8c45adec18e0512c3d34dcafb13414ecba21be6a ("create unmapped
> initrd in domain builder if supported") introduced an error for
> building a 64 bit guest with a 32 bit toolset.
> 
> The initrd start address and size where stored in an unsigned long
> instead of using a 64 bit type.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>  tools/libxc/include/xc_dom.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
> index 2176216..370dddd 100644
> --- a/tools/libxc/include/xc_dom.h
> +++ b/tools/libxc/include/xc_dom.h
> @@ -99,8 +99,8 @@ struct xc_dom_image {
>      xen_vaddr_t bsd_symtab_start;
>  
>      /* initrd parameters as specified in start_info page */
> -    unsigned long initrd_start;
> -    unsigned long initrd_len;
> +    uint64_t initrd_start;
> +    uint64_t initrd_len;
>  

I think these should be of type xen_vaddr_t. Doesn't make a difference
in the end though.

Wei.

>      unsigned int alloc_bootstack;
>      xen_vaddr_t virt_pgtab_end;
> -- 
> 2.6.2
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-30 10:20       ` Wei Liu
@ 2015-11-30 10:23         ` Juergen Gross
  2015-11-30 10:29           ` Wei Liu
  2015-11-30 10:34           ` Ian Campbell
  0 siblings, 2 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-30 10:23 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ian.Campbell, stefano.stabellini, ian.jackson, xen-devel,
	Boris Ostrovsky, roger.pau

On 30/11/15 11:20, Wei Liu wrote:
> On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross wrote:
>> On 25/11/15 17:12, Boris Ostrovsky wrote:
>>> On 11/12/2015 08:43 AM, Juergen Gross wrote:
>>>> In case the kernel of a new pv-domU indicates it is supporting an
>>>> unmapped initrd, don't waste precious virtual space for the initrd,
>>>> but allocate only guest physical memory for it.
>>>
>>> This patch breaks 32-bit pygrub.
>>>
>>> I am not 100% sure yet but it may be that only 64-bit guests are affected.
>>>
>>> With RHEL5 I get
>>>     initrd extends beyond end of memory (0x780080eda000 > 0x40000000)
>>
>> I think I have found the problem. Can you verify the attached patch is
>> working?
>>
>>
>> Juergen
>>
> 
>> From 11eaee2aa2291a1d56556d538ac23b8156cf3388 Mon Sep 17 00:00:00 2001
>> From: Juergen Gross <jgross@suse.com>
>> Date: Thu, 26 Nov 2015 08:32:26 +0100
>> Subject: [PATCH] libxc: correct domain builder for 64 bit guest with 32 bit
>>  tools
>>
>> Commit 8c45adec18e0512c3d34dcafb13414ecba21be6a ("create unmapped
>> initrd in domain builder if supported") introduced an error for
>> building a 64 bit guest with a 32 bit toolset.
>>
>> The initrd start address and size where stored in an unsigned long
>> instead of using a 64 bit type.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>  tools/libxc/include/xc_dom.h | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
>> index 2176216..370dddd 100644
>> --- a/tools/libxc/include/xc_dom.h
>> +++ b/tools/libxc/include/xc_dom.h
>> @@ -99,8 +99,8 @@ struct xc_dom_image {
>>      xen_vaddr_t bsd_symtab_start;
>>  
>>      /* initrd parameters as specified in start_info page */
>> -    unsigned long initrd_start;
>> -    unsigned long initrd_len;
>> +    uint64_t initrd_start;
>> +    uint64_t initrd_len;
>>  
> 
> I think these should be of type xen_vaddr_t. Doesn't make a difference
> in the end though.

xen_vaddr_t seems not to be appropriate. It can be either a virtual
address or a pfn. So the type should be agnostic to any special
semantics and just needs to be big enough for all cases.


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-30 10:23         ` Juergen Gross
@ 2015-11-30 10:29           ` Wei Liu
  2015-11-30 10:34           ` Ian Campbell
  1 sibling, 0 replies; 72+ messages in thread
From: Wei Liu @ 2015-11-30 10:29 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Wei Liu, Ian.Campbell, stefano.stabellini, ian.jackson,
	xen-devel, Boris Ostrovsky, roger.pau

On Mon, Nov 30, 2015 at 11:23:34AM +0100, Juergen Gross wrote:
> On 30/11/15 11:20, Wei Liu wrote:
> > On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross wrote:
> >> On 25/11/15 17:12, Boris Ostrovsky wrote:
> >>> On 11/12/2015 08:43 AM, Juergen Gross wrote:
> >>>> In case the kernel of a new pv-domU indicates it is supporting an
> >>>> unmapped initrd, don't waste precious virtual space for the initrd,
> >>>> but allocate only guest physical memory for it.
> >>>
> >>> This patch breaks 32-bit pygrub.
> >>>
> >>> I am not 100% sure yet but it may be that only 64-bit guests are affected.
> >>>
> >>> With RHEL5 I get
> >>>     initrd extends beyond end of memory (0x780080eda000 > 0x40000000)
> >>
> >> I think I have found the problem. Can you verify the attached patch is
> >> working?
> >>
> >>
> >> Juergen
> >>
> > 
> >> From 11eaee2aa2291a1d56556d538ac23b8156cf3388 Mon Sep 17 00:00:00 2001
> >> From: Juergen Gross <jgross@suse.com>
> >> Date: Thu, 26 Nov 2015 08:32:26 +0100
> >> Subject: [PATCH] libxc: correct domain builder for 64 bit guest with 32 bit
> >>  tools
> >>
> >> Commit 8c45adec18e0512c3d34dcafb13414ecba21be6a ("create unmapped
> >> initrd in domain builder if supported") introduced an error for
> >> building a 64 bit guest with a 32 bit toolset.
> >>
> >> The initrd start address and size where stored in an unsigned long
> >> instead of using a 64 bit type.
> >>
> >> Signed-off-by: Juergen Gross <jgross@suse.com>
> >> ---
> >>  tools/libxc/include/xc_dom.h | 4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
> >> index 2176216..370dddd 100644
> >> --- a/tools/libxc/include/xc_dom.h
> >> +++ b/tools/libxc/include/xc_dom.h
> >> @@ -99,8 +99,8 @@ struct xc_dom_image {
> >>      xen_vaddr_t bsd_symtab_start;
> >>  
> >>      /* initrd parameters as specified in start_info page */
> >> -    unsigned long initrd_start;
> >> -    unsigned long initrd_len;
> >> +    uint64_t initrd_start;
> >> +    uint64_t initrd_len;
> >>  
> > 
> > I think these should be of type xen_vaddr_t. Doesn't make a difference
> > in the end though.
> 
> xen_vaddr_t seems not to be appropriate. It can be either a virtual
> address or a pfn. So the type should be agnostic to any special
> semantics and just needs to be big enough for all cases.
> 

Fair enough.  I forgot that it can also be a pfn.

Wei.

> 
> Juergen
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-30 10:23         ` Juergen Gross
  2015-11-30 10:29           ` Wei Liu
@ 2015-11-30 10:34           ` Ian Campbell
  2015-11-30 10:47             ` Juergen Gross
  1 sibling, 1 reply; 72+ messages in thread
From: Ian Campbell @ 2015-11-30 10:34 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On Mon, 2015-11-30 at 11:23 +0100, Juergen Gross wrote:
> On 30/11/15 11:20, Wei Liu wrote:
> > On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross wrote:
> > >  
> > >      /* initrd parameters as specified in start_info page */
> > > -    unsigned long initrd_start;
> > > -    unsigned long initrd_len;
> > > +    uint64_t initrd_start;
> > > +    uint64_t initrd_len;
> > >  
> > 
> > I think these should be of type xen_vaddr_t. Doesn't make a difference
> > in the end though.
> 
> xen_vaddr_t seems not to be appropriate. It can be either a virtual
> address or a pfn.

Did you mean a virtual address or a physical _address_? Potentially mixing
addresses and frame numbers in a single variable seems liable to be
confusing, at best.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-30 10:34           ` Ian Campbell
@ 2015-11-30 10:47             ` Juergen Gross
  2015-11-30 10:51               ` Ian Campbell
  0 siblings, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-11-30 10:47 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: Boris Ostrovsky, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On 30/11/15 11:34, Ian Campbell wrote:
> On Mon, 2015-11-30 at 11:23 +0100, Juergen Gross wrote:
>> On 30/11/15 11:20, Wei Liu wrote:
>>> On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross wrote:
>>>>  
>>>>      /* initrd parameters as specified in start_info page */
>>>> -    unsigned long initrd_start;
>>>> -    unsigned long initrd_len;
>>>> +    uint64_t initrd_start;
>>>> +    uint64_t initrd_len;
>>>>  
>>>
>>> I think these should be of type xen_vaddr_t. Doesn't make a difference
>>> in the end though.
>>
>> xen_vaddr_t seems not to be appropriate. It can be either a virtual
>> address or a pfn.
> 
> Did you mean a virtual address or a physical _address_? Potentially mixing
> addresses and frame numbers in a single variable seems liable to be
> confusing, at best.

No, it's really a pfn. And this is part of the stable interface between
hypervisor and the pv-domU since more than 5 years now.


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-30 10:47             ` Juergen Gross
@ 2015-11-30 10:51               ` Ian Campbell
  2015-11-30 10:52                 ` Ian Campbell
  2015-11-30 10:57                 ` [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported Juergen Gross
  0 siblings, 2 replies; 72+ messages in thread
From: Ian Campbell @ 2015-11-30 10:51 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On Mon, 2015-11-30 at 11:47 +0100, Juergen Gross wrote:
> On 30/11/15 11:34, Ian Campbell wrote:
> > On Mon, 2015-11-30 at 11:23 +0100, Juergen Gross wrote:
> > > On 30/11/15 11:20, Wei Liu wrote:
> > > > On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross wrote:
> > > > >  
> > > > >      /* initrd parameters as specified in start_info page */
> > > > > -    unsigned long initrd_start;
> > > > > -    unsigned long initrd_len;
> > > > > +    uint64_t initrd_start;
> > > > > +    uint64_t initrd_len;
> > > > >  
> > > > 
> > > > I think these should be of type xen_vaddr_t. Doesn't make a
> > > > difference
> > > > in the end though.
> > > 
> > > xen_vaddr_t seems not to be appropriate. It can be either a virtual
> > > address or a pfn.
> > 
> > Did you mean a virtual address or a physical _address_? Potentially
> > mixing
> > addresses and frame numbers in a single variable seems liable to be
> > confusing, at best.
> 
> No, it's really a pfn. And this is part of the stable interface between
> hypervisor and the pv-domU since more than 5 years now.

Including the virtual address bit?

That's a shame.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-30 10:51               ` Ian Campbell
@ 2015-11-30 10:52                 ` Ian Campbell
  2015-11-30 11:03                   ` Juergen Gross
  2015-11-30 10:57                 ` [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported Juergen Gross
  1 sibling, 1 reply; 72+ messages in thread
From: Ian Campbell @ 2015-11-30 10:52 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On Mon, 2015-11-30 at 10:51 +0000, Ian Campbell wrote:
> On Mon, 2015-11-30 at 11:47 +0100, Juergen Gross wrote:
> > On 30/11/15 11:34, Ian Campbell wrote:
> > > On Mon, 2015-11-30 at 11:23 +0100, Juergen Gross wrote:
> > > > On 30/11/15 11:20, Wei Liu wrote:
> > > > > On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross wrote:
> > > > > >  
> > > > > >      /* initrd parameters as specified in start_info page */
> > > > > > -    unsigned long initrd_start;
> > > > > > -    unsigned long initrd_len;
> > > > > > +    uint64_t initrd_start;
> > > > > > +    uint64_t initrd_len;
> > > > > >  
> > > > > 
> > > > > I think these should be of type xen_vaddr_t. Doesn't make a
> > > > > difference
> > > > > in the end though.
> > > > 
> > > > xen_vaddr_t seems not to be appropriate. It can be either a virtual
> > > > address or a pfn.
> > > 
> > > Did you mean a virtual address or a physical _address_? Potentially
> > > mixing
> > > addresses and frame numbers in a single variable seems liable to be
> > > confusing, at best.
> > 
> > No, it's really a pfn. And this is part of the stable interface between
> > hypervisor and the pv-domU since more than 5 years now.
> 
> Including the virtual address bit?
> 
> That's a shame.

... and that being the case would you mind adding a comment here explaining
the two forms of these variables and the flag which indicates which one is
"in force" at a given moment.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-30 10:51               ` Ian Campbell
  2015-11-30 10:52                 ` Ian Campbell
@ 2015-11-30 10:57                 ` Juergen Gross
  1 sibling, 0 replies; 72+ messages in thread
From: Juergen Gross @ 2015-11-30 10:57 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: Boris Ostrovsky, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On 30/11/15 11:51, Ian Campbell wrote:
> On Mon, 2015-11-30 at 11:47 +0100, Juergen Gross wrote:
>> On 30/11/15 11:34, Ian Campbell wrote:
>>> On Mon, 2015-11-30 at 11:23 +0100, Juergen Gross wrote:
>>>> On 30/11/15 11:20, Wei Liu wrote:
>>>>> On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross wrote:
>>>>>>  
>>>>>>      /* initrd parameters as specified in start_info page */
>>>>>> -    unsigned long initrd_start;
>>>>>> -    unsigned long initrd_len;
>>>>>> +    uint64_t initrd_start;
>>>>>> +    uint64_t initrd_len;
>>>>>>  
>>>>>
>>>>> I think these should be of type xen_vaddr_t. Doesn't make a
>>>>> difference
>>>>> in the end though.
>>>>
>>>> xen_vaddr_t seems not to be appropriate. It can be either a virtual
>>>> address or a pfn.
>>>
>>> Did you mean a virtual address or a physical _address_? Potentially
>>> mixing
>>> addresses and frame numbers in a single variable seems liable to be
>>> confusing, at best.
>>
>> No, it's really a pfn. And this is part of the stable interface between
>> hypervisor and the pv-domU since more than 5 years now.
> 
> Including the virtual address bit?
> 
> That's a shame.

Before commit 9f41c22a559a7ec039a195f6dc8bca32c66fcd5a it could only be
a virtual address. This commit added the pfn possibility.

I think it was done via pfn rather than physical address as this would
enable the feature to be used for 32 bit domains as well without having
to limit the initrd position to the first 4 GB or having to change the
structure layout.


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-30 10:52                 ` Ian Campbell
@ 2015-11-30 11:03                   ` Juergen Gross
  2015-11-30 11:23                     ` Ian Campbell
  0 siblings, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-11-30 11:03 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: Boris Ostrovsky, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On 30/11/15 11:52, Ian Campbell wrote:
> On Mon, 2015-11-30 at 10:51 +0000, Ian Campbell wrote:
>> On Mon, 2015-11-30 at 11:47 +0100, Juergen Gross wrote:
>>> On 30/11/15 11:34, Ian Campbell wrote:
>>>> On Mon, 2015-11-30 at 11:23 +0100, Juergen Gross wrote:
>>>>> On 30/11/15 11:20, Wei Liu wrote:
>>>>>> On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross wrote:
>>>>>>>  
>>>>>>>      /* initrd parameters as specified in start_info page */
>>>>>>> -    unsigned long initrd_start;
>>>>>>> -    unsigned long initrd_len;
>>>>>>> +    uint64_t initrd_start;
>>>>>>> +    uint64_t initrd_len;
>>>>>>>  
>>>>>>
>>>>>> I think these should be of type xen_vaddr_t. Doesn't make a
>>>>>> difference
>>>>>> in the end though.
>>>>>
>>>>> xen_vaddr_t seems not to be appropriate. It can be either a virtual
>>>>> address or a pfn.
>>>>
>>>> Did you mean a virtual address or a physical _address_? Potentially
>>>> mixing
>>>> addresses and frame numbers in a single variable seems liable to be
>>>> confusing, at best.
>>>
>>> No, it's really a pfn. And this is part of the stable interface between
>>> hypervisor and the pv-domU since more than 5 years now.
>>
>> Including the virtual address bit?
>>
>> That's a shame.
> 
> ... and that being the case would you mind adding a comment here explaining
> the two forms of these variables and the flag which indicates which one is
> "in force" at a given moment.

The comment in the struct already tells us that initrd_start and
initrd_len are in the very same format as in the start_info page.
Both fields are meant to be opaque to most of the domain builder
parts.

The only function dealing with the differences is xc_dom_build_image()
which already contains the appropriate flag. I added this on your
request. You acked the resulting patch. So why do you want to add
another comment now?


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-30 11:03                   ` Juergen Gross
@ 2015-11-30 11:23                     ` Ian Campbell
  2015-11-30 12:20                       ` Juergen Gross
  0 siblings, 1 reply; 72+ messages in thread
From: Ian Campbell @ 2015-11-30 11:23 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On Mon, 2015-11-30 at 12:03 +0100, Juergen Gross wrote:
> On 30/11/15 11:52, Ian Campbell wrote:
> > On Mon, 2015-11-30 at 10:51 +0000, Ian Campbell wrote:
> > > On Mon, 2015-11-30 at 11:47 +0100, Juergen Gross wrote:
> > > > On 30/11/15 11:34, Ian Campbell wrote:
> > > > > On Mon, 2015-11-30 at 11:23 +0100, Juergen Gross wrote:
> > > > > > On 30/11/15 11:20, Wei Liu wrote:
> > > > > > > On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross
> > > > > > > wrote:
> > > > > > > >  
> > > > > > > >      /* initrd parameters as specified in start_info page
> > > > > > > > */
> > > > > > > > -    unsigned long initrd_start;
> > > > > > > > -    unsigned long initrd_len;
> > > > > > > > +    uint64_t initrd_start;
> > > > > > > > +    uint64_t initrd_len;
> > > > > > > >  
> > > > > > > 
> > > > > > > I think these should be of type xen_vaddr_t. Doesn't make a
> > > > > > > difference
> > > > > > > in the end though.
> > > > > > 
> > > > > > xen_vaddr_t seems not to be appropriate. It can be either a
> > > > > > virtual
> > > > > > address or a pfn.
> > > > > 
> > > > > Did you mean a virtual address or a physical _address_?
> > > > > Potentially
> > > > > mixing
> > > > > addresses and frame numbers in a single variable seems liable to
> > > > > be
> > > > > confusing, at best.
> > > > 
> > > > No, it's really a pfn. And this is part of the stable interface
> > > > between
> > > > hypervisor and the pv-domU since more than 5 years now.
> > > 
> > > Including the virtual address bit?
> > > 
> > > That's a shame.
> > 
> > ... and that being the case would you mind adding a comment here
> > explaining
> > the two forms of these variables and the flag which indicates which one
> > is
> > "in force" at a given moment.
> 
> The comment in the struct already tells us that initrd_start and
> initrd_len are in the very same format as in the start_info page.
> Both fields are meant to be opaque to most of the domain builder
> parts.
> 
> The only function dealing with the differences is xc_dom_build_image()
> which already contains the appropriate flag. I added this on your
> request. You acked the resulting patch. So why do you want to add
> another comment now?

I hadn't realised at the time that the semantics of these fields was so,
uh, interesting.

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-30 11:23                     ` Ian Campbell
@ 2015-11-30 12:20                       ` Juergen Gross
  2015-11-30 12:35                         ` Ian Campbell
  0 siblings, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-11-30 12:20 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: Boris Ostrovsky, roger.pau, ian.jackson, xen-devel, stefano.stabellini

On 30/11/15 12:23, Ian Campbell wrote:
> On Mon, 2015-11-30 at 12:03 +0100, Juergen Gross wrote:
>> On 30/11/15 11:52, Ian Campbell wrote:
>>> On Mon, 2015-11-30 at 10:51 +0000, Ian Campbell wrote:
>>>> On Mon, 2015-11-30 at 11:47 +0100, Juergen Gross wrote:
>>>>> On 30/11/15 11:34, Ian Campbell wrote:
>>>>>> On Mon, 2015-11-30 at 11:23 +0100, Juergen Gross wrote:
>>>>>>> On 30/11/15 11:20, Wei Liu wrote:
>>>>>>>> On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross
>>>>>>>> wrote:
>>>>>>>>>  
>>>>>>>>>      /* initrd parameters as specified in start_info page
>>>>>>>>> */
>>>>>>>>> -    unsigned long initrd_start;
>>>>>>>>> -    unsigned long initrd_len;
>>>>>>>>> +    uint64_t initrd_start;
>>>>>>>>> +    uint64_t initrd_len;
>>>>>>>>>  
>>>>>>>>
>>>>>>>> I think these should be of type xen_vaddr_t. Doesn't make a
>>>>>>>> difference
>>>>>>>> in the end though.
>>>>>>>
>>>>>>> xen_vaddr_t seems not to be appropriate. It can be either a
>>>>>>> virtual
>>>>>>> address or a pfn.
>>>>>>
>>>>>> Did you mean a virtual address or a physical _address_?
>>>>>> Potentially
>>>>>> mixing
>>>>>> addresses and frame numbers in a single variable seems liable to
>>>>>> be
>>>>>> confusing, at best.
>>>>>
>>>>> No, it's really a pfn. And this is part of the stable interface
>>>>> between
>>>>> hypervisor and the pv-domU since more than 5 years now.
>>>>
>>>> Including the virtual address bit?
>>>>
>>>> That's a shame.
>>>
>>> ... and that being the case would you mind adding a comment here
>>> explaining
>>> the two forms of these variables and the flag which indicates which one
>>> is
>>> "in force" at a given moment.
>>
>> The comment in the struct already tells us that initrd_start and
>> initrd_len are in the very same format as in the start_info page.
>> Both fields are meant to be opaque to most of the domain builder
>> parts.
>>
>> The only function dealing with the differences is xc_dom_build_image()
>> which already contains the appropriate flag. I added this on your
>> request. You acked the resulting patch. So why do you want to add
>> another comment now?
> 
> I hadn't realised at the time that the semantics of these fields was so,
> uh, interesting.

:-)

I guess due to the lack of a comment? ;-)

Okay, I'll add one when submitting the patch after (hopefully) Boris
confirmed it is fixing his problem.


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-30 12:20                       ` Juergen Gross
@ 2015-11-30 12:35                         ` Ian Campbell
  2015-11-30 12:59                           ` Juergen Gross
  0 siblings, 1 reply; 72+ messages in thread
From: Ian Campbell @ 2015-11-30 12:35 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, roger.pau, ian.jackson, xen-devel, stefano.stabellini

On Mon, 2015-11-30 at 13:20 +0100, Juergen Gross wrote:
> On 30/11/15 12:23, Ian Campbell wrote:
> > On Mon, 2015-11-30 at 12:03 +0100, Juergen Gross wrote:
> > > On 30/11/15 11:52, Ian Campbell wrote:
> > > > On Mon, 2015-11-30 at 10:51 +0000, Ian Campbell wrote:
> > > > > On Mon, 2015-11-30 at 11:47 +0100, Juergen Gross wrote:
> > > > > > On 30/11/15 11:34, Ian Campbell wrote:
> > > > > > > On Mon, 2015-11-30 at 11:23 +0100, Juergen Gross wrote:
> > > > > > > > On 30/11/15 11:20, Wei Liu wrote:
> > > > > > > > > On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross
> > > > > > > > > wrote:
> > > > > > > > > >  
> > > > > > > > > >      /* initrd parameters as specified in start_info
> > > > > > > > > > page
> > > > > > > > > > */
> > > > > > > > > > -    unsigned long initrd_start;
> > > > > > > > > > -    unsigned long initrd_len;
> > > > > > > > > > +    uint64_t initrd_start;
> > > > > > > > > > +    uint64_t initrd_len;
> > > > > > > > > >  
> > > > > > > > > 
> > > > > > > > > I think these should be of type xen_vaddr_t. Doesn't make
> > > > > > > > > a
> > > > > > > > > difference
> > > > > > > > > in the end though.
> > > > > > > > 
> > > > > > > > xen_vaddr_t seems not to be appropriate. It can be either a
> > > > > > > > virtual
> > > > > > > > address or a pfn.
> > > > > > > 
> > > > > > > Did you mean a virtual address or a physical _address_?
> > > > > > > Potentially
> > > > > > > mixing
> > > > > > > addresses and frame numbers in a single variable seems liable
> > > > > > > to
> > > > > > > be
> > > > > > > confusing, at best.
> > > > > > 
> > > > > > No, it's really a pfn. And this is part of the stable interface
> > > > > > between
> > > > > > hypervisor and the pv-domU since more than 5 years now.
> > > > > 
> > > > > Including the virtual address bit?
> > > > > 
> > > > > That's a shame.
> > > > 
> > > > ... and that being the case would you mind adding a comment here
> > > > explaining
> > > > the two forms of these variables and the flag which indicates which
> > > > one
> > > > is
> > > > "in force" at a given moment.
> > > 
> > > The comment in the struct already tells us that initrd_start and
> > > initrd_len are in the very same format as in the start_info page.
> > > Both fields are meant to be opaque to most of the domain builder
> > > parts.
> > > 
> > > The only function dealing with the differences is
> > > xc_dom_build_image()
> > > which already contains the appropriate flag. I added this on your
> > > request. You acked the resulting patch. So why do you want to add
> > > another comment now?
> > 
> > I hadn't realised at the time that the semantics of these fields was
> > so,
> > uh, interesting.
> 
> :-)
> 
> I guess due to the lack of a comment? ;-)

;-)

> Okay, I'll add one when submitting the patch after (hopefully) Boris
> confirmed it is fixing his problem.

Thanks!

FYI attempting to upgrade osstest to use Debian Jessie in the guest seems
to have exposed another issue here.

http://logs.test-lab.xenproject.org/osstest/logs/65172/test-amd64-amd64-amd64-pvgrub/info.html

      Booting 'Debian GNU/Linux, kernel 3.16.0-4-amd64'

    root  (hd0,0)
     Filesystem type is ext2fs, partition type 0x83
    kernel  /boot/vmlinuz-3.16.0-4-amd64 root=UUID=12447529-e85a-4b41-86b6-3e83ccfc
    1377 ro 
    initrd  /boot/initrd.img-3.16.0-4-amd64

    ============= Init TPM Front ================
    Tpmfront:Error Unable to read device/vtpm/0/backend-id during tpmfront initialization! error = ENOENT
    Tpmfront:Info Shutting down tpmfront
    pin_table(x) returned 1357193
    close(3)

    Error 9: Unknown boot failure

    Press any key to continue...

xen.git 6853c9bf9ff0 is OK, whereas 713b7e4ef2aa is not. Adding your two
outstanding patches:
    libxc: correct domain builder for 64 bit guest with 32 bit tools
    libxc: use correct return type for do_memory_op()

Doesn't appear to have helped. Anyway, I was in the process of
investigating/bisecting etc but since I was mailing you any way I thought
I'd mention it. I'll start a fresh thread once I have some more to go on.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-30 12:35                         ` Ian Campbell
@ 2015-11-30 12:59                           ` Juergen Gross
  2015-11-30 13:16                             ` pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported) Ian Campbell
  0 siblings, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-11-30 12:59 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: Boris Ostrovsky, roger.pau, ian.jackson, xen-devel, stefano.stabellini

On 30/11/15 13:35, Ian Campbell wrote:
> On Mon, 2015-11-30 at 13:20 +0100, Juergen Gross wrote:
>> On 30/11/15 12:23, Ian Campbell wrote:
>>> On Mon, 2015-11-30 at 12:03 +0100, Juergen Gross wrote:
>>>> On 30/11/15 11:52, Ian Campbell wrote:
>>>>> On Mon, 2015-11-30 at 10:51 +0000, Ian Campbell wrote:
>>>>>> On Mon, 2015-11-30 at 11:47 +0100, Juergen Gross wrote:
>>>>>>> On 30/11/15 11:34, Ian Campbell wrote:
>>>>>>>> On Mon, 2015-11-30 at 11:23 +0100, Juergen Gross wrote:
>>>>>>>>> On 30/11/15 11:20, Wei Liu wrote:
>>>>>>>>>> On Thu, Nov 26, 2015 at 08:35:02AM +0100, Juergen Gross
>>>>>>>>>> wrote:
>>>>>>>>>>>  
>>>>>>>>>>>      /* initrd parameters as specified in start_info
>>>>>>>>>>> page
>>>>>>>>>>> */
>>>>>>>>>>> -    unsigned long initrd_start;
>>>>>>>>>>> -    unsigned long initrd_len;
>>>>>>>>>>> +    uint64_t initrd_start;
>>>>>>>>>>> +    uint64_t initrd_len;
>>>>>>>>>>>  
>>>>>>>>>>
>>>>>>>>>> I think these should be of type xen_vaddr_t. Doesn't make
>>>>>>>>>> a
>>>>>>>>>> difference
>>>>>>>>>> in the end though.
>>>>>>>>>
>>>>>>>>> xen_vaddr_t seems not to be appropriate. It can be either a
>>>>>>>>> virtual
>>>>>>>>> address or a pfn.
>>>>>>>>
>>>>>>>> Did you mean a virtual address or a physical _address_?
>>>>>>>> Potentially
>>>>>>>> mixing
>>>>>>>> addresses and frame numbers in a single variable seems liable
>>>>>>>> to
>>>>>>>> be
>>>>>>>> confusing, at best.
>>>>>>>
>>>>>>> No, it's really a pfn. And this is part of the stable interface
>>>>>>> between
>>>>>>> hypervisor and the pv-domU since more than 5 years now.
>>>>>>
>>>>>> Including the virtual address bit?
>>>>>>
>>>>>> That's a shame.
>>>>>
>>>>> ... and that being the case would you mind adding a comment here
>>>>> explaining
>>>>> the two forms of these variables and the flag which indicates which
>>>>> one
>>>>> is
>>>>> "in force" at a given moment.
>>>>
>>>> The comment in the struct already tells us that initrd_start and
>>>> initrd_len are in the very same format as in the start_info page.
>>>> Both fields are meant to be opaque to most of the domain builder
>>>> parts.
>>>>
>>>> The only function dealing with the differences is
>>>> xc_dom_build_image()
>>>> which already contains the appropriate flag. I added this on your
>>>> request. You acked the resulting patch. So why do you want to add
>>>> another comment now?
>>>
>>> I hadn't realised at the time that the semantics of these fields was
>>> so,
>>> uh, interesting.
>>
>> :-)
>>
>> I guess due to the lack of a comment? ;-)
> 
> ;-)
> 
>> Okay, I'll add one when submitting the patch after (hopefully) Boris
>> confirmed it is fixing his problem.
> 
> Thanks!
> 
> FYI attempting to upgrade osstest to use Debian Jessie in the guest seems
> to have exposed another issue here.
> 
> http://logs.test-lab.xenproject.org/osstest/logs/65172/test-amd64-amd64-amd64-pvgrub/info.html
> 
>       Booting 'Debian GNU/Linux, kernel 3.16.0-4-amd64'
> 
>     root  (hd0,0)
>      Filesystem type is ext2fs, partition type 0x83
>     kernel  /boot/vmlinuz-3.16.0-4-amd64 root=UUID=12447529-e85a-4b41-86b6-3e83ccfc
>     1377 ro 
>     initrd  /boot/initrd.img-3.16.0-4-amd64
> 
>     ============= Init TPM Front ================
>     Tpmfront:Error Unable to read device/vtpm/0/backend-id during tpmfront initialization! error = ENOENT
>     Tpmfront:Info Shutting down tpmfront
>     pin_table(x) returned 1357193
>     close(3)
> 
>     Error 9: Unknown boot failure
> 
>     Press any key to continue...
> 
> xen.git 6853c9bf9ff0 is OK, whereas 713b7e4ef2aa is not. Adding your two
> outstanding patches:
>     libxc: correct domain builder for 64 bit guest with 32 bit tools
>     libxc: use correct return type for do_memory_op()
> 
> Doesn't appear to have helped. Anyway, I was in the process of
> investigating/bisecting etc but since I was mailing you any way I thought
> I'd mention it. I'll start a fresh thread once I have some more to go on.

When something was wrong with pvgrub in my tests of the patches it died
right away and didn't show random errors later. I don't think the
problem you are seeing is related to my recent changes. OTOH I have been
wrong before. :-(


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-11-30 12:59                           ` Juergen Gross
@ 2015-11-30 13:16                             ` Ian Campbell
  2015-11-30 13:41                               ` Ian Campbell
  0 siblings, 1 reply; 72+ messages in thread
From: Ian Campbell @ 2015-11-30 13:16 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, roger.pau, ian.jackson, xen-devel, stefano.stabellini

On Mon, 2015-11-30 at 13:59 +0100, Juergen Gross wrote:
> On 30/11/15 13:35, Ian Campbell wrote:
> > FYI attempting to upgrade osstest to use Debian Jessie in the guest
> > seems
> > to have exposed another issue here.
> > 
> > http://logs.test-lab.xenproject.org/osstest/logs/65172/test-amd64-amd64-amd64-pvgrub/info.html
> > 
> >       Booting 'Debian GNU/Linux, kernel 3.16.0-4-amd64'
> > 
> >     root  (hd0,0)
> >      Filesystem type is ext2fs, partition type 0x83
> >     kernel  /boot/vmlinuz-3.16.0-4-amd64 root=UUID=12447529-e85a-4b41-86b6-3e83ccfc
> >     1377 ro 
> >     initrd  /boot/initrd.img-3.16.0-4-amd64
> > 
> >     ============= Init TPM Front ================
> >     Tpmfront:Error Unable to read device/vtpm/0/backend-id during tpmfront initialization! error = ENOENT
> >     Tpmfront:Info Shutting down tpmfront
> >     pin_table(x) returned 1357193
> >     close(3)
> > 
> >     Error 9: Unknown boot failure
> > 
> >     Press any key to continue...
> > 
> > xen.git 6853c9bf9ff0 is OK, whereas 713b7e4ef2aa is not. Adding your two
> > outstanding patches:
> >     libxc: correct domain builder for 64 bit guest with 32 bit tools
> >     libxc: use correct return type for do_memory_op()
> > 
> > Doesn't appear to have helped. Anyway, I was in the process of
> > investigating/bisecting etc but since I was mailing you any way I thought
> > I'd mention it. I'll start a fresh thread once I have some more to go on.
> 
> When something was wrong with pvgrub in my tests of the patches it died
> right away and didn't show random errors later. I don't think the
> problem you are seeing is related to my recent changes. OTOH I have been
> wrong before. :-(

Bisection has blamed 06954411ee14 "xen: add generic flag to elf_dom_parms
indicating support of unmapped initrd" which I'm struggling to explain
right now...

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-11-30 13:16                             ` pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported) Ian Campbell
@ 2015-11-30 13:41                               ` Ian Campbell
  2015-11-30 14:10                                 ` Ian Campbell
  2015-11-30 16:15                                 ` Juergen Gross
  0 siblings, 2 replies; 72+ messages in thread
From: Ian Campbell @ 2015-11-30 13:41 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, xen-devel, ian.jackson, stefano.stabellini, roger.pau

On Mon, 2015-11-30 at 13:16 +0000, Ian Campbell wrote:
> On Mon, 2015-11-30 at 13:59 +0100, Juergen Gross wrote:
> > On 30/11/15 13:35, Ian Campbell wrote:
> > > FYI attempting to upgrade osstest to use Debian Jessie in the guest
> > > seems
> > > to have exposed another issue here.
> > > 
> > > http://logs.test-lab.xenproject.org/osstest/logs/65172/test-amd64-amd
> > > 64-amd64-pvgrub/info.html
> > > 
> > >       Booting 'Debian GNU/Linux, kernel 3.16.0-4-amd64'
> > > 
> > >     root  (hd0,0)
> > >      Filesystem type is ext2fs, partition type 0x83
> > >     kernel  /boot/vmlinuz-3.16.0-4-amd64 root=UUID=12447529-e85a-
> > > 4b41-86b6-3e83ccfc
> > >     1377 ro 
> > >     initrd  /boot/initrd.img-3.16.0-4-amd64
> > > 
> > >     ============= Init TPM Front ================
> > >     Tpmfront:Error Unable to read device/vtpm/0/backend-id during
> > > tpmfront initialization! error = ENOENT
> > >     Tpmfront:Info Shutting down tpmfront
> > >     pin_table(x) returned 1357193
> > >     close(3)
> > > 
> > >     Error 9: Unknown boot failure
> > > 
> > >     Press any key to continue...
> > > 
> > > xen.git 6853c9bf9ff0 is OK, whereas 713b7e4ef2aa is not. Adding your
> > > two
> > > outstanding patches:
> > >     libxc: correct domain builder for 64 bit guest with 32 bit tools
> > >     libxc: use correct return type for do_memory_op()
> > > 
> > > Doesn't appear to have helped. Anyway, I was in the process of
> > > investigating/bisecting etc but since I was mailing you any way I
> > > thought
> > > I'd mention it. I'll start a fresh thread once I have some more to go
> > > on.
> > 
> > When something was wrong with pvgrub in my tests of the patches it died
> > right away and didn't show random errors later. I don't think the
> > problem you are seeing is related to my recent changes. OTOH I have
> > been
> > wrong before. :-(
> 
> Bisection has blamed 06954411ee14 "xen: add generic flag to elf_dom_parms
> indicating support of unmapped initrd" which I'm struggling to explain
> right now...

And trying again it has blamed ea7c8a3d0e82 "libxc: reorganize domain
builder guest memory allocator" which is far more likely.

I double checked clean rebuilds twice this time and ea7c8a3d0e82 is bad
while 6853c9bf9ff0 "MINIOS_UPSTREAM_REVISION Update" is good.

Now, the question is why...

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-11-30 13:41                               ` Ian Campbell
@ 2015-11-30 14:10                                 ` Ian Campbell
  2015-11-30 16:15                                 ` Juergen Gross
  1 sibling, 0 replies; 72+ messages in thread
From: Ian Campbell @ 2015-11-30 14:10 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On Mon, 2015-11-30 at 13:41 +0000, Ian Campbell wrote:
> On Mon, 2015-11-30 at 13:16 +0000, Ian Campbell wrote:
> > On Mon, 2015-11-30 at 13:59 +0100, Juergen Gross wrote:
> > > On 30/11/15 13:35, Ian Campbell wrote:
> > > > FYI attempting to upgrade osstest to use Debian Jessie in the guest
> > > > seems
> > > > to have exposed another issue here.
> > > > 
> > > > http://logs.test-lab.xenproject.org/osstest/logs/65172/test-amd64-a
> > > > md
> > > > 64-amd64-pvgrub/info.html
> > > > 
> > > >       Booting 'Debian GNU/Linux, kernel 3.16.0-4-amd64'
> > > > 
> > > >     root  (hd0,0)
> > > >      Filesystem type is ext2fs, partition type 0x83
> > > >     kernel  /boot/vmlinuz-3.16.0-4-amd64 root=UUID=12447529-e85a-
> > > > 4b41-86b6-3e83ccfc
> > > >     1377 ro 
> > > >     initrd  /boot/initrd.img-3.16.0-4-amd64
> > > > 
> > > >     ============= Init TPM Front ================
> > > >     Tpmfront:Error Unable to read device/vtpm/0/backend-id during
> > > > tpmfront initialization! error = ENOENT
> > > >     Tpmfront:Info Shutting down tpmfront
> > > >     pin_table(x) returned 1357193
> > > >     close(3)
> > > > 
> > > >     Error 9: Unknown boot failure
> > > > 
> > > >     Press any key to continue...
> > > > 
> > > > 
[...]
I forgot to mention, there are corresposding logs from the h/visor too:

(XEN) mm.c:2417:d29v0 Bad type (saw 1400000000000001 != exp 7000000000000000) for mfn 165b80 (pfn 4d80)
(XEN) mm.c:892:d29v0 Could not get page type PGT_writable_page
(XEN) mm.c:944:d29v0 Error getting mfn 165b80 (pfn 4d80) from L1 entry 0000000165b80023 for l1e_owner=29, pg_owner=29
(XEN) mm.c:1258:d29v0 Failure in alloc_l1_table: entry 384
(XEN) mm.c:2164:d29v0 Error while validating mfn 13fd99 (pfn ab99) for type 1000000000000000: caf=8000000000000002 taf=1000000000000001
(XEN) mm.c:952:d29v0 Attempt to create linear p.t. with write perms
(XEN) mm.c:1334:d29v0 Failure in alloc_l2_table: entry 38
(XEN) mm.c:2164:d29v0 Error while validating mfn 13fd72 (pfn ab72) for type 2000000000000000: caf=8000000000000002 taf=2000000000000001
(XEN) mm.c:994:d29v0 Attempt to create linear p.t. with write perms
(XEN) mm.c:1416:d29v0 Failure in alloc_l3_table: entry 510
(XEN) mm.c:2164:d29v0 Error while validating mfn 13fd71 (pfn ab71) for type 3000000000000000: caf=8000000000000002 taf=3000000000000001
(XEN) mm.c:1018:d29v0 Attempt to create linear p.t. with write perms
(XEN) mm.c:1495:d29v0 Failure in alloc_l4_table: entry 511
(XEN) mm.c:2164:d29v0 Error while validating mfn 13fd70 (pfn ab70) for type 4000000000000000: caf=8000000000000002 taf=4000000000000001
(XEN) mm.c:3067:d29v0 Error while pinning mfn 13fd70

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-11-30 13:41                               ` Ian Campbell
  2015-11-30 14:10                                 ` Ian Campbell
@ 2015-11-30 16:15                                 ` Juergen Gross
  2015-11-30 16:25                                   ` Ian Campbell
  1 sibling, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-11-30 16:15 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: Boris Ostrovsky, xen-devel, ian.jackson, stefano.stabellini, roger.pau

On 30/11/15 14:41, Ian Campbell wrote:
> On Mon, 2015-11-30 at 13:16 +0000, Ian Campbell wrote:
>> On Mon, 2015-11-30 at 13:59 +0100, Juergen Gross wrote:
>>> On 30/11/15 13:35, Ian Campbell wrote:
>>>> FYI attempting to upgrade osstest to use Debian Jessie in the guest
>>>> seems
>>>> to have exposed another issue here.
>>>>
>>>> http://logs.test-lab.xenproject.org/osstest/logs/65172/test-amd64-amd
>>>> 64-amd64-pvgrub/info.html
>>>>
>>>>       Booting 'Debian GNU/Linux, kernel 3.16.0-4-amd64'
>>>>
>>>>     root  (hd0,0)
>>>>      Filesystem type is ext2fs, partition type 0x83
>>>>     kernel  /boot/vmlinuz-3.16.0-4-amd64 root=UUID=12447529-e85a-
>>>> 4b41-86b6-3e83ccfc
>>>>     1377 ro 
>>>>     initrd  /boot/initrd.img-3.16.0-4-amd64
>>>>
>>>>     ============= Init TPM Front ================
>>>>     Tpmfront:Error Unable to read device/vtpm/0/backend-id during
>>>> tpmfront initialization! error = ENOENT
>>>>     Tpmfront:Info Shutting down tpmfront
>>>>     pin_table(x) returned 1357193
>>>>     close(3)
>>>>
>>>>     Error 9: Unknown boot failure
>>>>
>>>>     Press any key to continue...
>>>>
>>>> xen.git 6853c9bf9ff0 is OK, whereas 713b7e4ef2aa is not. Adding your
>>>> two
>>>> outstanding patches:
>>>>     libxc: correct domain builder for 64 bit guest with 32 bit tools
>>>>     libxc: use correct return type for do_memory_op()
>>>>
>>>> Doesn't appear to have helped. Anyway, I was in the process of
>>>> investigating/bisecting etc but since I was mailing you any way I
>>>> thought
>>>> I'd mention it. I'll start a fresh thread once I have some more to go
>>>> on.
>>>
>>> When something was wrong with pvgrub in my tests of the patches it died
>>> right away and didn't show random errors later. I don't think the
>>> problem you are seeing is related to my recent changes. OTOH I have
>>> been
>>> wrong before. :-(
>>
>> Bisection has blamed 06954411ee14 "xen: add generic flag to elf_dom_parms
>> indicating support of unmapped initrd" which I'm struggling to explain
>> right now...
> 
> And trying again it has blamed ea7c8a3d0e82 "libxc: reorganize domain
> builder guest memory allocator" which is far more likely.
> 
> I double checked clean rebuilds twice this time and ea7c8a3d0e82 is bad
> while 6853c9bf9ff0 "MINIOS_UPSTREAM_REVISION Update" is good.
> 
> Now, the question is why...

Indeed.

I can't spot anything wrong with that patch. I could imagine the change
is triggering an issue which has been present in pvgrub before but
didn't show up.

I'll try to trigger it on my machine and see if I can get some more
debug information.


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-11-30 16:15                                 ` Juergen Gross
@ 2015-11-30 16:25                                   ` Ian Campbell
  2015-11-30 16:56                                     ` Ian Campbell
  2015-12-01 11:12                                     ` dom builder logging from pvgrub Ian Campbell
  0 siblings, 2 replies; 72+ messages in thread
From: Ian Campbell @ 2015-11-30 16:25 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, xen-devel, ian.jackson, stefano.stabellini, roger.pau

On Mon, 2015-11-30 at 17:15 +0100, Juergen Gross wrote:
> I'll try to trigger it on my machine and see if I can get some more
> debug information.

Thanks. Below is what I have so far.

It seems to thing mfn=165b81/pfn=4d81 is an L1 PT, when it needs to be a
writeable page due to the reference from the PT at mfn=1bfd9a.

But I don't see anything in the logging which mentioned pfns > 0x4be0+0x1,
i.e. far short of 0x4d81.

I'm also a little confused that the PFNs which the Xen side logging is
arriving at do not fall into the range which has supposedly been allocated
as the guest page tables. Perhaps the m2p isn't expected to be correct at
this point?

e.g. we think the root PT is pfn=4be3/mfn=1bfd71, but when Xen converts
1bfd71 back to a pfn it gets ab71.

0x4be3 and 0xab71 are almost, but not quite, one bit shift offset from each
other.

TBH I suspect the logging here is simply misleading me.

Ian.


(d54) domainbuilder: xc_dom_boot_xen_init: ver 4.7, caps xen-3.0-x86_hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
(d54)  
(d54) domainbuilder: xc_dom_parse_image: called
(d54) domainbuilder: xc_dom_find_loader: trying multiboot-binary loader ... 
(d54) domainbuilder: loader probe failed
(d54) domainbuilder: xc_dom_find_loader: trying HVM-generic loader ... 
(d54) domainbuilder: loader probe failed
(d54) domainbuilder: xc_dom_find_loader: trying Linux bzImage loader ... 
(d54) domainbuilder: loader probe OK
(d54) xc: elf_parse_binary: phdr: paddr=0x1000000 memsz=0x7ca000
(d54) xc: elf_parse_binary: phdr: paddr=0x1800000 memsz=0xee000
(d54) xc: elf_parse_binary: phdr: paddr=0x18ee000 memsz=0x13c00
(d54) xc: elf_parse_binary: phdr: paddr=0x1902000 memsz=0x616000
(d54) xc: elf_parse_binary: memory: 0x1000000 -> 0x1f18000
(d54) xc: elf_xen_parse_note: GUEST_OS = "linux"
(d54) xc: elf_xen_parse_note: GUEST_VERSION = "2.6"
(d54) xc: elf_xen_parse_note: XEN_VERSION = "xen-3.0"
(d54) xc: elf_xen_parse_note: VIRT_BASE = 0xffffffff80000000
(d54) xc: elf_xen_parse_note: ENTRY = 0xffffffff819021f0
(d54) xc: elf_xen_parse_note: HYPERCALL_PAGE = 0xffffffff81001000
(d54) xc: elf_xen_parse_note: FEATURES = "!writable_page_tables|pae_pgdir_above_4gb|writable_descriptor_tables|auto_translated_physma
(d54) p|supervisor_mode_kernel"
(d54) xc: elf_xen_parse_note: SUPPORTED_FEATURES = 0x90d
(d54) xc: elf_xen_parse_note: PAE_MODE = "yes"
(d54) xc: elf_xen_parse_note: LOADER = "generic"
(d54) xc: elf_xen_parse_note: unknown xen elf note (0xd)
(d54) xc: elf_xen_parse_note: SUSPEND_CANCEL = 0x1
(d54) xc: elf_xen_parse_note: HV_START_LOW = 0xffff800000000000
(d54) xc: elf_xen_parse_note: PADDR_OFFSET = 0x0
(d54) xc: elf_xen_addr_calc_check: addresses:
(d54) xc:     virt_base        = 0xffffffff80000000
(d54) xc:     elf_paddr_offset = 0x0
(d54) xc:     virt_offset      = 0xffffffff80000000
(d54) xc:     virt_kstart      = 0xffffffff81000000
(d54) xc:     virt_kend        = 0xffffffff81f18000
(d54) xc:     virt_entry       = 0xffffffff819021f0
(d54) xc:     p2m_base         = 0xffffffffffffffff
(d54) domainbuilder: xc_dom_parse_elf_kernel: xen-3.0-x86_64: 0xffffffff81000000 -> 0xffffffff81f18000
(d54) domainbuilder: xc_dom_build_image: called
(d54) domainbuilder: xc_dom_alloc_pad: called
(d54) domainbuilder: xc_dom_chk_alloc_pages: limits after padding: pfn=1000 virt=ffffffff81000000
(d54) domainbuilder: xc_dom_chk_alloc_pages: limits after kernel: pfn=1f18 virt=ffffffff81f18000
(d54) domainbuilder: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x1000+0xf18 at 0000000020005000
(d54) domainbuilder: xc_dom_alloc_segment:   kernel       : 0xffffffff81000000 -> 0xffffffff81f18000  (pfn 0x1000 + 0xf18 pages)
(d54) xc: elf_load_binary: phdr 0 at 0x20005000 -> 0x207cf000
(d54) xc: elf_load_binary: phdr 1 at 0x20805000 -> 0x208f3000
(d54) xc: elf_load_binary: phdr 2 at 0x208f3000 -> 0x20906c00
(d54) xc: elf_load_binary: phdr 3 at 0x20907000 -> 0x20a26000
(d54) domainbuilder: xc_dom_chk_alloc_pages: limits after ramdisk: pfn=4ae0 virt=ffffffff84ae0000
(d54) domainbuilder: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x1f18+0x2bc8 at 0000000020f1d000
(d54) domainbuilder: xc_dom_alloc_segment:   ramdisk      : 0xffffffff81f18000 -> 0xffffffff84ae0000  (pfn 0x1f18 + 0x2bc8 pages)
(d54) domainbuilder: alloc_p2m_list_x86_64: called
(d54) domainbuilder: alloc_p2m_list: called
(d54) domainbuilder: xc_dom_chk_alloc_pages: limits after phys2mach: pfn=4be0 virt=ffffffff84be0000
(d54) domainbuilder: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x4ae0+0x100 at 0000000023ae5000
(d54) domainbuilder: xc_dom_alloc_segment:   phys2mach    : 0xffffffff84ae0000 -> 0xffffffff84be0000  (pfn 0x4ae0 + 0x100 pages)
(d54) domainbuilder: xc_dom_alloc_page   :   start info   : 0xffffffff84be0000 (pfn 0x4be0)
(d54) domainbuilder: xc_dom_alloc_page   :   xenstore     : 0xffffffff84be1000 (pfn 0x4be1)
(d54) domainbuilder: xc_dom_alloc_page   :   console      : 0xffffffff84be2000 (pfn 0x4be2)
(d54) domainbuilder: alloc_pgtables: called
(d54) domainbuilder: count_pgtables: 0x0000007fffffffff/39: 0x0000ff8000000000 -> 0x0000ffffffffffff, 1 table(s)
(d54) domainbuilder: count_pgtables: 0x000000003fffffff/30: 0x0000ffff80000000 -> 0x0000ffffbfffffff, 1 table(s)
(d54) domainbuilder: count_pgtables: 0x00000000001fffff/21: 0x0000ffff80000000 -> 0x0000ffff84ffffff, 40 table(s)
(d54) domainbuilder: alloc_pgtables, wants 43 PT pages
(d54) domainbuilder: xc_dom_chk_alloc_pages: limits after page tables: pfn=4c0e virt=ffffffff84c0e000
(d54) domainbuilder: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x4be3+0x2b at 0000000023be5000
(d54) domainbuilder: xc_dom_alloc_segment:   page tables  : 0xffffffff84be3000 -> 0xffffffff84c0e000  (pfn 0x4be3 + 0x2b pages)
(d54) domainbuilder: xc_dom_alloc_page   :   boot stack   : 0xffffffff84c0e000 (pfn 0x4c0e)
(d54) domainbuilder: xc_dom_build_image  : virt_alloc_end : 0xffffffff84c0f000
(d54) domainbuilder: xc_dom_build_image  : virt_pgtab_end : 0xffffffff85000000
(d54) domainbuilder: xc_dom_alloc_pad: called
(d54) domainbuilder: xc_dom_chk_alloc_pages: limits after padding: pfn=5000 virt=ffffffff85000000
(d54) domainbuilder: xc_dom_compat_check: supported guest type: xen-3.0-x86_64 <= matches
(d54) domainbuilder: xc_dom_compat_check: supported guest type: xen-3.0-x86_32p
(d54) domainbuilder: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32
(d54) domainbuilder: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32p
(d54) domainbuilder: xc_dom_compat_check: supported guest type: hvm-3.0-x86_64
(d54) virt base at ffffffff80000000
(d54) bootstack_pfn 4c0e
(d54) _boot_target ffffffff84c0e000
(d54) domainbuilder: xc_dom_update_guest_p2m: dst 64bit, pages 0x20000
(d54) domainbuilder: get_pg_table_x86: PFN 0x4be3
(d54) domainbuilder: get_pg_table_x86: PFN 0x4be4
(d54) domainbuilder: get_pg_table_x86: PFN 0x4be5
(d54) domainbuilder: get_pg_table_x86: PFN 0x4be6
(d54) domainbuilder: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x4be0+0x1 at 0000000023c10000
(d54) domainbuilder: start_info_x86_64: called
(d54) domainbuilder: domain builder memory footprint
(d54) domainbuilder:    allocated
(d54) domainbuilder:       malloc             : 123 kB
(d54) domainbuilder:       anon mmap          : 0 bytes
(d54) domainbuilder:    mapped
(d54) domainbuilder:       file mmap          : 0 bytes
(d54) domainbuilder:       domU mmap          : 60 MB
(d54) Unmap pfns 0 .. 0x4c0f
(d54) REBUILT Nov 30 2015 16:09:45
(d54) Pinning the boot page table pfn 4be3 / mfn 1bfd71/1bfd71
(d54) pin_table: MFN 1bfd71
(XEN) mm.c:2417:d54v0 Bad type (saw 1400000000000001 != exp 7000000000000000) for mfn 165b81 (pfn 4d81)
(XEN) mm.c:892:d54v0 Could not get page type PGT_writable_page
(XEN) mm.c:944:d54v0 Error getting mfn 165b81 (pfn 4d81) from L1 entry 0000000165b81023 for l1e_owner=54, pg_owner=54
(XEN) mm.c:1258:d54v0 Failure in alloc_l1_table: entry 385
(XEN) mm.c:2164:d54v0 Error while validating mfn 1bfd9a (pfn ab9a) for type 1000000000000000: caf=8000000000000002 taf=1000000000000001
(XEN) mm.c:952:d54v0 Attempt to create linear p.t. with write perms
(XEN) mm.c:1334:d54v0 Failure in alloc_l2_table: entry 38
(XEN) mm.c:2164:d54v0 Error while validating mfn 1bfd73 (pfn ab73) for type 2000000000000000: caf=8000000000000002 taf=2000000000000001
(XEN) mm.c:994:d54v0 Attempt to create linear p.t. with write perms
(XEN) mm.c:1416:d54v0 Failure in alloc_l3_table: entry 510
(XEN) mm.c:2164:d54v0 Error while validating mfn 1bfd72 (pfn ab72) for type 3000000000000000: caf=8000000000000002 taf=3000000000000001
(XEN) mm.c:1018:d54v0 Attempt to create linear p.t. with write perms
(XEN) mm.c:1495:d54v0 Failure in alloc_l4_table: entry 511
(XEN) mm.c:2164:d54v0 Error while validating mfn 1bfd71 (pfn ab71) for type 4000000000000000: caf=8000000000000002 taf=4000000000000001
(XEN) mm.c:3067:d54v0 Error while pinning mfn 1bfd71
(d54) domainbuilder: xc_dom_release: called
(d54) close(3)
> 
> 
> Juergen
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-11-30 16:25                                   ` Ian Campbell
@ 2015-11-30 16:56                                     ` Ian Campbell
  2015-12-01  7:15                                       ` Juergen Gross
  2015-12-01 11:12                                     ` dom builder logging from pvgrub Ian Campbell
  1 sibling, 1 reply; 72+ messages in thread
From: Ian Campbell @ 2015-11-30 16:56 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On Mon, 2015-11-30 at 16:25 +0000, Ian Campbell wrote:
> (d54) Pinning the boot page table pfn 4be3 / mfn 1bfd71/1bfd71
> (d54) pin_table: MFN 1bfd71
> (XEN) mm.c:2417:d54v0 Bad type (saw 1400000000000001 != exp 7000000000000000) for mfn 165b81 (pfn 4d81)

I added a "BUG_ON(*pt_pfn == 0x4d81);" to mini-os's new_pt_frame, which
after some messing with gdb to decode produced this stack trace:

716e7: arch_do_exit + 9 in section .text
66176: do_exit + 28 in section .text
6ff68: new_pt_frame + 134 in section .text
70401: need_pgt + 410 in section .text
706ec: do_map_frames + 284 in section .text
66e72: sbrk + 130 in section .text
7768e: _sbrk_r + 30 in section .text
74fa3: _malloc_r + 1219 in section .text
76f3f: _realloc_r + 511 in section .text
31035: unsafe_flush + 46 in section .text
38bc7: unxz + 480 in section .text
310fa: xc_dom_decompress_unsafe + 110 in section .text
38cec: xc_try_xz_decode + 45 in section .text
286ff: xc_dom_probe_bzimage_kernel + 891 in section .text
24613: xc_dom_find_loader + 89 in section .text
24d83: xc_dom_parse_image + 58 in section .text
19d06: kexec + 1012 in section .text
03c27: pv_boot + 97 in section .text
08e4b: boot_func + 52 in section .text
0ab16: run_script + 294 in section .text
10848: run_menu + 3133 in section .text
10fb2: cmain + 1444 in section .text
04447: main + 303 in section .text
66991: call_main + 581 in section .text
03423: thread_starter + 9 in section .text

I'm not quite sure what to make of this, in particular I don't see anything
in kexec.c which obviously looks after unmapping the heap or brk areas.

Ian.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported
  2015-11-26  7:35     ` Juergen Gross
  2015-11-30 10:20       ` Wei Liu
@ 2015-11-30 18:00       ` Boris Ostrovsky
  1 sibling, 0 replies; 72+ messages in thread
From: Boris Ostrovsky @ 2015-11-30 18:00 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, Ian.Campbell, ian.jackson,
	stefano.stabellini, wei.liu2, roger.pau

On 11/26/2015 02:35 AM, Juergen Gross wrote:
> On 25/11/15 17:12, Boris Ostrovsky wrote:
>> On 11/12/2015 08:43 AM, Juergen Gross wrote:
>>> In case the kernel of a new pv-domU indicates it is supporting an
>>> unmapped initrd, don't waste precious virtual space for the initrd,
>>> but allocate only guest physical memory for it.
>> This patch breaks 32-bit pygrub.
>>
>> I am not 100% sure yet but it may be that only 64-bit guests are affected.
>>
>> With RHEL5 I get
>>      initrd extends beyond end of memory (0x780080eda000 > 0x40000000)
> I think I have found the problem. Can you verify the attached patch is
> working?


Yes, this does fix the problem. Thank you!

Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-11-30 16:56                                     ` Ian Campbell
@ 2015-12-01  7:15                                       ` Juergen Gross
  2015-12-01  7:41                                         ` Juergen Gross
  2015-12-01  8:32                                         ` Ian Campbell
  0 siblings, 2 replies; 72+ messages in thread
From: Juergen Gross @ 2015-12-01  7:15 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: Boris Ostrovsky, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On 30/11/15 17:56, Ian Campbell wrote:
> On Mon, 2015-11-30 at 16:25 +0000, Ian Campbell wrote:
>> (d54) Pinning the boot page table pfn 4be3 / mfn 1bfd71/1bfd71
>> (d54) pin_table: MFN 1bfd71
>> (XEN) mm.c:2417:d54v0 Bad type (saw 1400000000000001 != exp 7000000000000000) for mfn 165b81 (pfn 4d81)
> 
> I added a "BUG_ON(*pt_pfn == 0x4d81);" to mini-os's new_pt_frame, which
> after some messing with gdb to decode produced this stack trace:
> 
> 716e7: arch_do_exit + 9 in section .text
> 66176: do_exit + 28 in section .text
> 6ff68: new_pt_frame + 134 in section .text
> 70401: need_pgt + 410 in section .text
> 706ec: do_map_frames + 284 in section .text
> 66e72: sbrk + 130 in section .text
> 7768e: _sbrk_r + 30 in section .text
> 74fa3: _malloc_r + 1219 in section .text
> 76f3f: _realloc_r + 511 in section .text
> 31035: unsafe_flush + 46 in section .text
> 38bc7: unxz + 480 in section .text
> 310fa: xc_dom_decompress_unsafe + 110 in section .text
> 38cec: xc_try_xz_decode + 45 in section .text
> 286ff: xc_dom_probe_bzimage_kernel + 891 in section .text
> 24613: xc_dom_find_loader + 89 in section .text
> 24d83: xc_dom_parse_image + 58 in section .text
> 19d06: kexec + 1012 in section .text
> 03c27: pv_boot + 97 in section .text
> 08e4b: boot_func + 52 in section .text
> 0ab16: run_script + 294 in section .text
> 10848: run_menu + 3133 in section .text
> 10fb2: cmain + 1444 in section .text
> 04447: main + 303 in section .text
> 66991: call_main + 581 in section .text
> 03423: thread_starter + 9 in section .text
> 
> I'm not quite sure what to make of this, in particular I don't see anything
> in kexec.c which obviously looks after unmapping the heap or brk areas.

Nah, this backtrace shows a normal allocation path while
uncompressing the kernel image. I'd expect something like that.
Why shouldn't mini-os make use of pfn 4d81 somewhere?

I guess there is something wrong either in mini-os's memory
allocator (not very likely) or in kexec_allocate(). I'll try to
check those.

BTW: up to now I haven't managed to reproduce your problem. My
domains are just booting fine up to now. Is there a way I could
get the domain image which is failing? Maybe I could just try
to use that on my test machine with the same domain config you are
using.

Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-12-01  7:15                                       ` Juergen Gross
@ 2015-12-01  7:41                                         ` Juergen Gross
  2015-12-01  8:30                                           ` Ian Campbell
  2015-12-01  8:32                                         ` Ian Campbell
  1 sibling, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-12-01  7:41 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: Boris Ostrovsky, roger.pau, ian.jackson, xen-devel, stefano.stabellini

On 01/12/15 08:15, Juergen Gross wrote:
> On 30/11/15 17:56, Ian Campbell wrote:
>> On Mon, 2015-11-30 at 16:25 +0000, Ian Campbell wrote:
>>> (d54) Pinning the boot page table pfn 4be3 / mfn 1bfd71/1bfd71
>>> (d54) pin_table: MFN 1bfd71
>>> (XEN) mm.c:2417:d54v0 Bad type (saw 1400000000000001 != exp 7000000000000000) for mfn 165b81 (pfn 4d81)
>>
>> I added a "BUG_ON(*pt_pfn == 0x4d81);" to mini-os's new_pt_frame, which
>> after some messing with gdb to decode produced this stack trace:
>>
>> 716e7: arch_do_exit + 9 in section .text
>> 66176: do_exit + 28 in section .text
>> 6ff68: new_pt_frame + 134 in section .text
>> 70401: need_pgt + 410 in section .text
>> 706ec: do_map_frames + 284 in section .text
>> 66e72: sbrk + 130 in section .text
>> 7768e: _sbrk_r + 30 in section .text
>> 74fa3: _malloc_r + 1219 in section .text
>> 76f3f: _realloc_r + 511 in section .text
>> 31035: unsafe_flush + 46 in section .text
>> 38bc7: unxz + 480 in section .text
>> 310fa: xc_dom_decompress_unsafe + 110 in section .text
>> 38cec: xc_try_xz_decode + 45 in section .text
>> 286ff: xc_dom_probe_bzimage_kernel + 891 in section .text
>> 24613: xc_dom_find_loader + 89 in section .text
>> 24d83: xc_dom_parse_image + 58 in section .text
>> 19d06: kexec + 1012 in section .text
>> 03c27: pv_boot + 97 in section .text
>> 08e4b: boot_func + 52 in section .text
>> 0ab16: run_script + 294 in section .text
>> 10848: run_menu + 3133 in section .text
>> 10fb2: cmain + 1444 in section .text
>> 04447: main + 303 in section .text
>> 66991: call_main + 581 in section .text
>> 03423: thread_starter + 9 in section .text
>>
>> I'm not quite sure what to make of this, in particular I don't see anything
>> in kexec.c which obviously looks after unmapping the heap or brk areas.
> 
> Nah, this backtrace shows a normal allocation path while
> uncompressing the kernel image. I'd expect something like that.
> Why shouldn't mini-os make use of pfn 4d81 somewhere?
> 
> I guess there is something wrong either in mini-os's memory
> allocator (not very likely) or in kexec_allocate(). I'll try to
> check those.

Hmm, kexec_allocate() seems to be a little bit fishy.

I suspect a problem in the loop for the case new_pfn == i. I think
in this case the p2m list will be written with a wrong entry.

Ian, could you verify via:

diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c
index 8fd9ff9..9421023 100644
--- a/stubdom/grub/kexec.c
+++ b/stubdom/grub/kexec.c
@@ -131,6 +131,8 @@ int kexec_allocate(struct xc_dom_image *dom)
	/* Store destination PFN of currently requested page. */
	pages_moved2pfns[i] = new_pfn;

+	BUG_ON(new_pfn == i);
+
	/* Put old page at new PFN */
	dom->p2m_host[new_pfn] = old_mfn;


Juergen

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-12-01  7:41                                         ` Juergen Gross
@ 2015-12-01  8:30                                           ` Ian Campbell
  2015-12-01  8:53                                             ` Juergen Gross
  0 siblings, 1 reply; 72+ messages in thread
From: Ian Campbell @ 2015-12-01  8:30 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, roger.pau, ian.jackson, xen-devel, stefano.stabellini

On Tue, 2015-12-01 at 08:41 +0100, Juergen Gross wrote:
> >> I'm not quite sure what to make of this, in particular I don't see
> anything
> >> in kexec.c which obviously looks after unmapping the heap or brk
> areas.
> > 
> > Nah, this backtrace shows a normal allocation path while
> > uncompressing the kernel image. I'd expect something like that.
> > Why shouldn't mini-os make use of pfn 4d81 somewhere?

That pfn ends up right in the middle of the next-kernels vaddr mapping,
so at best it indicates some sort of disconnect/overlap between the
mini-os memory allocator and the domain-builder memory allocator.

Since it seems to be in the middle of the padding area (which might
have been new in ea7c8a3d0e82, I'm not sure, it seems to be more
explicit at the least) it occurred to me on the way home last night
that maybe we need to unmap the padding area as well.

I'll try that and your suggested patch below as well once I get to the
office this morning.

> > 
> > I guess there is something wrong either in mini-os's memory
> > allocator (not very likely) or in kexec_allocate(). I'll try to
> > check those.
> 
> Hmm, kexec_allocate() seems to be a little bit fishy.
> 
> I suspect a problem in the loop for the case new_pfn == i. I think
> in this case the p2m list will be written with a wrong entry.
> 
> Ian, could you verify via:
> 
> diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c
> index 8fd9ff9..9421023 100644
> --- a/stubdom/grub/kexec.c
> +++ b/stubdom/grub/kexec.c
> @@ -131,6 +131,8 @@ int kexec_allocate(struct xc_dom_image *dom)
> 	/* Store destination PFN of currently requested page. */
> 	pages_moved2pfns[i] = new_pfn;
> 
> +	BUG_ON(new_pfn == i);
> +
> 	/* Put old page at new PFN */
> 	dom->p2m_host[new_pfn] = old_mfn;
> 
> 
> Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-12-01  7:15                                       ` Juergen Gross
  2015-12-01  7:41                                         ` Juergen Gross
@ 2015-12-01  8:32                                         ` Ian Campbell
  1 sibling, 0 replies; 72+ messages in thread
From: Ian Campbell @ 2015-12-01  8:32 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, stefano.stabellini, ian.jackson, roger.pau, xen-devel

On Tue, 2015-12-01 at 08:15 +0100, Juergen Gross wrote:
> 
> BTW: up to now I haven't managed to reproduce your problem. My
> domains are just booting fine up to now. Is there a way I could
> get the domain image which is failing? Maybe I could just try
> to use that on my test machine with the same domain config you are
> using.

I think it should be sufficient just to drop the Debian kernel into an
empty filesystem attached to the guest, just doing "kernel
(hd0,0)/boot/vmlinuz ; boot" from the grub command line seems to
replicate the issue for me. Probably easier than actually going through
an installation.

I can send you the vmlinuz if you think that approach is worth trying.

Ian.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-12-01  8:30                                           ` Ian Campbell
@ 2015-12-01  8:53                                             ` Juergen Gross
  2015-12-01 10:01                                               ` Ian Campbell
  0 siblings, 1 reply; 72+ messages in thread
From: Juergen Gross @ 2015-12-01  8:53 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: Boris Ostrovsky, roger.pau, ian.jackson, xen-devel, stefano.stabellini

On 01/12/15 09:30, Ian Campbell wrote:
> On Tue, 2015-12-01 at 08:41 +0100, Juergen Gross wrote:
>>>> I'm not quite sure what to make of this, in particular I don't see
>> anything
>>>> in kexec.c which obviously looks after unmapping the heap or brk
>> areas.
>>>
>>> Nah, this backtrace shows a normal allocation path while
>>> uncompressing the kernel image. I'd expect something like that.
>>> Why shouldn't mini-os make use of pfn 4d81 somewhere?
> 
> That pfn ends up right in the middle of the next-kernels vaddr mapping,
> so at best it indicates some sort of disconnect/overlap between the
> mini-os memory allocator and the domain-builder memory allocator.

I don't think so.

mini-os just allocates single pages and keeps the relation to the
(future) pfn of that page. The p2m list is adjusted later to move the
allocated page to that pfn before activating the new kernel.

> Since it seems to be in the middle of the padding area (which might
> have been new in ea7c8a3d0e82, I'm not sure, it seems to be more
> explicit at the least) it occurred to me on the way home last night
> that maybe we need to unmap the padding area as well.

We do. The page tables need to be unmapped independently as they
have been mapped explicitly during setup_pgtables(dom). All the
mini-os mappings are removed in a loop just after that.

> I'll try that and your suggested patch below as well once I get to the
> office this morning.

Thanks.


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-12-01  8:53                                             ` Juergen Gross
@ 2015-12-01 10:01                                               ` Ian Campbell
  2015-12-01 10:04                                                 ` Ian Campbell
  2015-12-01 10:47                                                 ` Juergen Gross
  0 siblings, 2 replies; 72+ messages in thread
From: Ian Campbell @ 2015-12-01 10:01 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, roger.pau, ian.jackson, xen-devel, stefano.stabellini

On Tue, 2015-12-01 at 09:53 +0100, Juergen Gross wrote:
> On 01/12/15 09:30, Ian Campbell wrote:
> > On Tue, 2015-12-01 at 08:41 +0100, Juergen Gross wrote:
> > > > > I'm not quite sure what to make of this, in particular I don't
> > > > > see
> > > anything
> > > > > in kexec.c which obviously looks after unmapping the heap or brk
> > > areas.
> > > > 
> > > > Nah, this backtrace shows a normal allocation path while
> > > > uncompressing the kernel image. I'd expect something like that.
> > > > Why shouldn't mini-os make use of pfn 4d81 somewhere?
> > 
> > That pfn ends up right in the middle of the next-kernels vaddr mapping,
> > so at best it indicates some sort of disconnect/overlap between the
> > mini-os memory allocator and the domain-builder memory allocator.
> 
> I don't think so.
> 
> mini-os just allocates single pages and keeps the relation to the
> (future) pfn of that page. The p2m list is adjusted later to move the
> allocated page to that pfn before activating the new kernel.

Ah, I was wondering how it could possibly work so I half expected I must be
missing something.

> > Since it seems to be in the middle of the padding area (which might
> > have been new in ea7c8a3d0e82, I'm not sure, it seems to be more
> > explicit at the least) it occurred to me on the way home last night
> > that maybe we need to unmap the padding area as well.
> 
> We do. The page tables need to be unmapped independently as they
> have been mapped explicitly during setup_pgtables(dom). All the
> mini-os mappings are removed in a loop just after that.

"a loop" is this:
    /* Unmap day0 pages to avoid having a r/w mapping of the future page table */
   for (pfn = 0; pfn < allocated; pfn++)
        munmap((void*) pages[pfn], PAGE_SIZE);

In my debugging this extends only to the end of the actual mappings, not to
the end of the padding, e.g. for me it is extending to "Unmap pfns 0 ..
0x4c0f" while the unexpected PT pfn is at 0x4d80 and the padding area
extends to pfn 0x5000.

> > I'll try that and your suggested patch below as well once I get to the
> > office this morning.
> 
> Thanks.

The BUG_ON doesn't seem to be triggering. I'm not seeing pfn==0x4d80 going
anywhere near kexec_allocate, the highest is 0x4c0f.

Maybe the issue is that the ->allocate hook (==kexec_allocate) isn't called
from xc_dom_alloc_pad?

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-12-01 10:01                                               ` Ian Campbell
@ 2015-12-01 10:04                                                 ` Ian Campbell
  2015-12-01 10:21                                                   ` Wei Liu
  2015-12-01 10:47                                                 ` Juergen Gross
  1 sibling, 1 reply; 72+ messages in thread
From: Ian Campbell @ 2015-12-01 10:04 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, roger.pau, ian.jackson, xen-devel, stefano.stabellini

On Tue, 2015-12-01 at 10:01 +0000, Ian Campbell wrote:
> 
> > > I'll try that and your suggested patch below as well once I get to the
> > > office this morning.
> > 
> > Thanks.
> 
> The BUG_ON doesn't seem to be triggering. I'm not seeing pfn==0x4d80 going
> anywhere near kexec_allocate, the highest is 0x4c0f.
> 
> Maybe the issue is that the ->allocate hook (==kexec_allocate) isn't called
> from xc_dom_alloc_pad?

That seems like it might be the answer, this patchlet fixes it for me:

diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index 5d6c3ba..6d3f97a 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -579,7 +579,13 @@ static int xc_dom_alloc_pad(struct xc_dom_image *dom, xen_vaddr_t boundary)
     }
     pages = (boundary - dom->virt_alloc_end) / page_size;
 
-    return xc_dom_chk_alloc_pages(dom, "padding", pages);
+    if ( xc_dom_chk_alloc_pages(dom, "padding", pages) )
+        return -1;
+
+    if (dom->allocate)
+        dom->allocate(dom);
+
+    return 0;
 }
 
 int xc_dom_alloc_segment(struct xc_dom_image *dom,

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-12-01 10:04                                                 ` Ian Campbell
@ 2015-12-01 10:21                                                   ` Wei Liu
  2015-12-01 10:31                                                     ` Ian Campbell
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2015-12-01 10:21 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Juergen Gross, Wei Liu, stefano.stabellini, ian.jackson,
	xen-devel, Boris Ostrovsky, roger.pau

On Tue, Dec 01, 2015 at 10:04:38AM +0000, Ian Campbell wrote:
> On Tue, 2015-12-01 at 10:01 +0000, Ian Campbell wrote:
> > 
> > > > I'll try that and your suggested patch below as well once I get to the
> > > > office this morning.
> > > 
> > > Thanks.
> > 
> > The BUG_ON doesn't seem to be triggering. I'm not seeing pfn==0x4d80 going
> > anywhere near kexec_allocate, the highest is 0x4c0f.
> > 
> > Maybe the issue is that the ->allocate hook (==kexec_allocate) isn't called
> > from xc_dom_alloc_pad?
> 
> That seems like it might be the answer, this patchlet fixes it for me:
> 
> diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
> index 5d6c3ba..6d3f97a 100644
> --- a/tools/libxc/xc_dom_core.c
> +++ b/tools/libxc/xc_dom_core.c
> @@ -579,7 +579,13 @@ static int xc_dom_alloc_pad(struct xc_dom_image *dom, xen_vaddr_t boundary)
>      }
>      pages = (boundary - dom->virt_alloc_end) / page_size;
>  
> -    return xc_dom_chk_alloc_pages(dom, "padding", pages);
> +    if ( xc_dom_chk_alloc_pages(dom, "padding", pages) )
> +        return -1;
> +
> +    if (dom->allocate)
> +        dom->allocate(dom);
> +
> +    return 0;
>  }
>  
>  int xc_dom_alloc_segment(struct xc_dom_image *dom,

Currently there are three places that call dom->allocate (if we include
the call in the proposed diff). I think it would be better if we push
dom->allocate down to xc_dom_chk_alloc_pages, then refactor
xc_dom_alloc_page to use xc_dom_chk_alloc_pages.

Just some thought after a quick look at the code. I will see what I can
do after confirming this is the culprit.

Wei.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-12-01 10:21                                                   ` Wei Liu
@ 2015-12-01 10:31                                                     ` Ian Campbell
  2015-12-01 10:33                                                       ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Ian Campbell @ 2015-12-01 10:31 UTC (permalink / raw)
  To: Wei Liu
  Cc: Juergen Gross, stefano.stabellini, ian.jackson, xen-devel,
	Boris Ostrovsky, roger.pau

On Tue, 2015-12-01 at 10:21 +0000, Wei Liu wrote:
> On Tue, Dec 01, 2015 at 10:04:38AM +0000, Ian Campbell wrote:
> > On Tue, 2015-12-01 at 10:01 +0000, Ian Campbell wrote:
> > > 
> > > > > I'll try that and your suggested patch below as well once I get
> > > > > to the
> > > > > office this morning.
> > > > 
> > > > Thanks.
> > > 
> > > The BUG_ON doesn't seem to be triggering. I'm not seeing pfn==0x4d80
> > > going
> > > anywhere near kexec_allocate, the highest is 0x4c0f.
> > > 
> > > Maybe the issue is that the ->allocate hook (==kexec_allocate) isn't
> > > called
> > > from xc_dom_alloc_pad?
> > 
> > That seems like it might be the answer, this patchlet fixes it for me:
> > 
> > diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
> > index 5d6c3ba..6d3f97a 100644
> > --- a/tools/libxc/xc_dom_core.c
> > +++ b/tools/libxc/xc_dom_core.c
> > @@ -579,7 +579,13 @@ static int xc_dom_alloc_pad(struct xc_dom_image
> > *dom, xen_vaddr_t boundary)
> >      }
> >      pages = (boundary - dom->virt_alloc_end) / page_size;
> >  
> > -    return xc_dom_chk_alloc_pages(dom, "padding", pages);
> > +    if ( xc_dom_chk_alloc_pages(dom, "padding", pages) )
> > +        return -1;
> > +
> > +    if (dom->allocate)
> > +        dom->allocate(dom);
> > +
> > +    return 0;
> >  }
> >  
> >  int xc_dom_alloc_segment(struct xc_dom_image *dom,
> 
> Currently there are three places that call dom->allocate (if we include
> the call in the proposed diff). I think it would be better if we push
> dom->allocate down to xc_dom_chk_alloc_pages, then refactor
> xc_dom_alloc_page to use xc_dom_chk_alloc_pages.

Yes, I was thinking along the same lines after the quick hack above
confirmed it.

> Just some thought after a quick look at the code. I will see what I can
> do after confirming this is the culprit.

I don't mind finishing this one off (although I'm also equally happy to
hand it over ;-))

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-12-01 10:31                                                     ` Ian Campbell
@ 2015-12-01 10:33                                                       ` Wei Liu
  2015-12-01 10:35                                                         ` Ian Campbell
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2015-12-01 10:33 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Juergen Gross, Wei Liu, stefano.stabellini, ian.jackson,
	xen-devel, Boris Ostrovsky, roger.pau

On Tue, Dec 01, 2015 at 10:31:20AM +0000, Ian Campbell wrote:
> On Tue, 2015-12-01 at 10:21 +0000, Wei Liu wrote:
> > On Tue, Dec 01, 2015 at 10:04:38AM +0000, Ian Campbell wrote:
> > > On Tue, 2015-12-01 at 10:01 +0000, Ian Campbell wrote:
> > > > 
> > > > > > I'll try that and your suggested patch below as well once I get
> > > > > > to the
> > > > > > office this morning.
> > > > > 
> > > > > Thanks.
> > > > 
> > > > The BUG_ON doesn't seem to be triggering. I'm not seeing pfn==0x4d80
> > > > going
> > > > anywhere near kexec_allocate, the highest is 0x4c0f.
> > > > 
> > > > Maybe the issue is that the ->allocate hook (==kexec_allocate) isn't
> > > > called
> > > > from xc_dom_alloc_pad?
> > > 
> > > That seems like it might be the answer, this patchlet fixes it for me:
> > > 
> > > diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
> > > index 5d6c3ba..6d3f97a 100644
> > > --- a/tools/libxc/xc_dom_core.c
> > > +++ b/tools/libxc/xc_dom_core.c
> > > @@ -579,7 +579,13 @@ static int xc_dom_alloc_pad(struct xc_dom_image
> > > *dom, xen_vaddr_t boundary)
> > >      }
> > >      pages = (boundary - dom->virt_alloc_end) / page_size;
> > >  
> > > -    return xc_dom_chk_alloc_pages(dom, "padding", pages);
> > > +    if ( xc_dom_chk_alloc_pages(dom, "padding", pages) )
> > > +        return -1;
> > > +
> > > +    if (dom->allocate)
> > > +        dom->allocate(dom);
> > > +
> > > +    return 0;
> > >  }
> > >  
> > >  int xc_dom_alloc_segment(struct xc_dom_image *dom,
> > 
> > Currently there are three places that call dom->allocate (if we include
> > the call in the proposed diff). I think it would be better if we push
> > dom->allocate down to xc_dom_chk_alloc_pages, then refactor
> > xc_dom_alloc_page to use xc_dom_chk_alloc_pages.
> 
> Yes, I was thinking along the same lines after the quick hack above
> confirmed it.
> 
> > Just some thought after a quick look at the code. I will see what I can
> > do after confirming this is the culprit.
> 
> I don't mind finishing this one off (although I'm also equally happy to
> hand it over ;-))

Already working on a patch. :-)

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-12-01 10:33                                                       ` Wei Liu
@ 2015-12-01 10:35                                                         ` Ian Campbell
  0 siblings, 0 replies; 72+ messages in thread
From: Ian Campbell @ 2015-12-01 10:35 UTC (permalink / raw)
  To: Wei Liu
  Cc: Juergen Gross, stefano.stabellini, ian.jackson, xen-devel,
	Boris Ostrovsky, roger.pau

On Tue, 2015-12-01 at 10:33 +0000, Wei Liu wrote:
> 
> > > Just some thought after a quick look at the code. I will see what I
> > > can
> > > do after confirming this is the culprit.
> > 
> > I don't mind finishing this one off (although I'm also equally happy to
> > hand it over ;-))
> 
> Already working on a patch. :-)

OK, good thing I hadn't started on it yet then.

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported)
  2015-12-01 10:01                                               ` Ian Campbell
  2015-12-01 10:04                                                 ` Ian Campbell
@ 2015-12-01 10:47                                                 ` Juergen Gross
  1 sibling, 0 replies; 72+ messages in thread
From: Juergen Gross @ 2015-12-01 10:47 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu
  Cc: Boris Ostrovsky, roger.pau, ian.jackson, xen-devel, stefano.stabellini

On 01/12/15 11:01, Ian Campbell wrote:
> On Tue, 2015-12-01 at 09:53 +0100, Juergen Gross wrote:
>> On 01/12/15 09:30, Ian Campbell wrote:
>>> On Tue, 2015-12-01 at 08:41 +0100, Juergen Gross wrote:
>>>>>> I'm not quite sure what to make of this, in particular I don't
>>>>>> see
>>>> anything
>>>>>> in kexec.c which obviously looks after unmapping the heap or brk
>>>> areas.
>>>>>
>>>>> Nah, this backtrace shows a normal allocation path while
>>>>> uncompressing the kernel image. I'd expect something like that.
>>>>> Why shouldn't mini-os make use of pfn 4d81 somewhere?
>>>
>>> That pfn ends up right in the middle of the next-kernels vaddr mapping,
>>> so at best it indicates some sort of disconnect/overlap between the
>>> mini-os memory allocator and the domain-builder memory allocator.
>>
>> I don't think so.
>>
>> mini-os just allocates single pages and keeps the relation to the
>> (future) pfn of that page. The p2m list is adjusted later to move the
>> allocated page to that pfn before activating the new kernel.
> 
> Ah, I was wondering how it could possibly work so I half expected I must be
> missing something.
> 
>>> Since it seems to be in the middle of the padding area (which might
>>> have been new in ea7c8a3d0e82, I'm not sure, it seems to be more
>>> explicit at the least) it occurred to me on the way home last night
>>> that maybe we need to unmap the padding area as well.
>>
>> We do. The page tables need to be unmapped independently as they
>> have been mapped explicitly during setup_pgtables(dom). All the
>> mini-os mappings are removed in a loop just after that.
> 
> "a loop" is this:
>     /* Unmap day0 pages to avoid having a r/w mapping of the future page table */
>    for (pfn = 0; pfn < allocated; pfn++)
>         munmap((void*) pages[pfn], PAGE_SIZE);
> 
> In my debugging this extends only to the end of the actual mappings, not to
> the end of the padding, e.g. for me it is extending to "Unmap pfns 0 ..
> 0x4c0f" while the unexpected PT pfn is at 0x4d80 and the padding area
> extends to pfn 0x5000.
> 
>>> I'll try that and your suggested patch below as well once I get to the
>>> office this morning.
>>
>> Thanks.
> 
> The BUG_ON doesn't seem to be triggering. I'm not seeing pfn==0x4d80 going
> anywhere near kexec_allocate, the highest is 0x4c0f.
> 
> Maybe the issue is that the ->allocate hook (==kexec_allocate) isn't called
> from xc_dom_alloc_pad?

OMG! How could I miss that? Thanks for finding this!


Juergen

^ permalink raw reply	[flat|nested] 72+ messages in thread

* dom builder logging from pvgrub.
  2015-11-30 16:25                                   ` Ian Campbell
  2015-11-30 16:56                                     ` Ian Campbell
@ 2015-12-01 11:12                                     ` Ian Campbell
  2015-12-01 12:17                                       ` Samuel Thibault
  1 sibling, 1 reply; 72+ messages in thread
From: Ian Campbell @ 2015-12-01 11:12 UTC (permalink / raw)
  To: Juergen Gross, Wei Liu
  Cc: Boris Ostrovsky, stefano.stabellini, ian.jackson, roger.pau, xen-devel

I found the following hack quite handy here. Posting it for the
archives/posterity, I don't really plan to clean it up and submit it
properly (since I'm not really sure what a non-hack version would be).

In case anyone wants to pick it up:
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

(lightly edited in patch form to remove some of the worst bodging)


diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c
index 8fd9ff9..761ca7f 100644
--- a/stubdom/grub/kexec.c
+++ b/stubdom/grub/kexec.c
@@ -209,6 +209,35 @@ static void tpm_hash2pcr(struct xc_dom_image *dom, char *cmdline)
 	shutdown_tpmfront(tpm);
 }
 
+static void builder_vmessage(struct xentoollog_logger *logger,
+                     xentoollog_level level,
+                     int errnoval,
+                     const char *context,
+                     const char *format,
+                     va_list ap)
+{
+  char buf[1024];
+
+  vsnprintf(buf, sizeof(buf), format, ap);
+
+  printk("%s%s%s\n",
+	  context ? : "",
+	  context ? ": " : "",
+	  buf);
+}
+static void builder_progress(struct xentoollog_logger *logger,
+                     const char *context,
+                     const char *doing_what,
+                     int percent, unsigned long done, unsigned long total)
+{
+  printk("%s\n", __func__);
+}
+
+static void builder_destroy(struct xentoollog_logger *logger)
+{
+  printk("%s\n", __func__);
+}
+
 void kexec(void *kernel, long kernel_size, void *module, long module_size, char *cmdline, unsigned long flags)
 {
     struct xc_dom_image *dom;
@@ -223,8 +253,14 @@ void kexec(void *kernel, long kernel_size, void *module, long module_size, char
     struct mmu_update *m2p_updates;
     unsigned long nr_m2p_updates;
 
+    struct xentoollog_logger lg = {
+	    .vmessage = builder_vmessage,
+	    .progress = builder_progress,
+	    .destroy = builder_destroy,
+    };
+
     DEBUG("booting with cmdline %s\n", cmdline);
-    xc_handle = xc_interface_open(0,0,0);
+    xc_handle = xc_interface_open(0,&lg,&lg);
 
     dom = xc_dom_allocate(xc_handle, cmdline, features);
     dom->allocate = kexec_allocate;
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: dom builder logging from pvgrub.
  2015-12-01 11:12                                     ` dom builder logging from pvgrub Ian Campbell
@ 2015-12-01 12:17                                       ` Samuel Thibault
  0 siblings, 0 replies; 72+ messages in thread
From: Samuel Thibault @ 2015-12-01 12:17 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Juergen Gross, Wei Liu, stefano.stabellini, ian.jackson,
	xen-devel, Boris Ostrovsky, roger.pau

Ian Campbell, on Tue 01 Dec 2015 11:12:36 +0000, wrote:
> I found the following hack quite handy here. Posting it for the
> archives/posterity, I don't really plan to clean it up and submit it
> properly (since I'm not really sure what a non-hack version would be).

I would say a non-hack version could be simply one with #ifdef DEBUG. We
don't really want to make pv-grub be so verbose by default, but having
the code off-hand to debug things would be really welcome.

Samuel

^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2015-12-01 12:17 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-12 13:43 [PATCH v5 0/9] libxc: support building large pv-domains Juergen Gross
2015-11-12 13:43 ` [PATCH v5 1/9] libxc: reorganize domain builder guest memory allocator Juergen Gross
2015-11-12 13:48   ` Wei Liu
2015-11-12 14:03     ` Juergen Gross
2015-11-12 14:47       ` Wei Liu
2015-11-12 14:47       ` Ian Campbell
2015-11-12 14:48         ` Wei Liu
2015-11-12 15:27         ` Juergen Gross
2015-11-12 15:55           ` Wei Liu
2015-11-13  4:41             ` Juergen Gross
2015-11-13  9:28             ` Ian Campbell
2015-11-13 11:13               ` Wei Liu
2015-11-12 13:43 ` [PATCH v5 2/9] xen: add generic flag to elf_dom_parms indicating support of unmapped initrd Juergen Gross
2015-11-12 13:43 ` [PATCH v5 3/9] libxc: rename domain builder count_pgtables to alloc_pgtables Juergen Gross
2015-11-12 13:43 ` [PATCH v5 4/9] libxc: introduce domain builder architecture specific data Juergen Gross
2015-11-12 13:43 ` [PATCH v5 5/9] libxc: use domain builder architecture private data for x86 pv domains Juergen Gross
2015-11-12 13:43 ` [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported Juergen Gross
2015-11-25 16:12   ` Boris Ostrovsky
2015-11-25 16:18     ` Wei Liu
2015-11-25 16:24       ` Boris Ostrovsky
2015-11-25 16:29       ` Ian Campbell
2015-11-25 16:31         ` Wei Liu
2015-11-25 16:34         ` Boris Ostrovsky
2015-11-26  5:06     ` Juergen Gross
2015-11-26  5:19       ` Juergen Gross
2015-11-26  7:35     ` Juergen Gross
2015-11-30 10:20       ` Wei Liu
2015-11-30 10:23         ` Juergen Gross
2015-11-30 10:29           ` Wei Liu
2015-11-30 10:34           ` Ian Campbell
2015-11-30 10:47             ` Juergen Gross
2015-11-30 10:51               ` Ian Campbell
2015-11-30 10:52                 ` Ian Campbell
2015-11-30 11:03                   ` Juergen Gross
2015-11-30 11:23                     ` Ian Campbell
2015-11-30 12:20                       ` Juergen Gross
2015-11-30 12:35                         ` Ian Campbell
2015-11-30 12:59                           ` Juergen Gross
2015-11-30 13:16                             ` pvgrub "Error 9: Unknown boot failure" booting Debian Jessie kernel (Was: Re: [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported) Ian Campbell
2015-11-30 13:41                               ` Ian Campbell
2015-11-30 14:10                                 ` Ian Campbell
2015-11-30 16:15                                 ` Juergen Gross
2015-11-30 16:25                                   ` Ian Campbell
2015-11-30 16:56                                     ` Ian Campbell
2015-12-01  7:15                                       ` Juergen Gross
2015-12-01  7:41                                         ` Juergen Gross
2015-12-01  8:30                                           ` Ian Campbell
2015-12-01  8:53                                             ` Juergen Gross
2015-12-01 10:01                                               ` Ian Campbell
2015-12-01 10:04                                                 ` Ian Campbell
2015-12-01 10:21                                                   ` Wei Liu
2015-12-01 10:31                                                     ` Ian Campbell
2015-12-01 10:33                                                       ` Wei Liu
2015-12-01 10:35                                                         ` Ian Campbell
2015-12-01 10:47                                                 ` Juergen Gross
2015-12-01  8:32                                         ` Ian Campbell
2015-12-01 11:12                                     ` dom builder logging from pvgrub Ian Campbell
2015-12-01 12:17                                       ` Samuel Thibault
2015-11-30 10:57                 ` [PATCH v5 6/9] libxc: create unmapped initrd in domain builder if supported Juergen Gross
2015-11-30 18:00       ` Boris Ostrovsky
2015-11-12 13:43 ` [PATCH v5 7/9] libxc: split p2m allocation in domain builder from other magic pages Juergen Gross
2015-11-12 13:43 ` [PATCH v5 8/9] libxc: rework of domain builder's page table handler Juergen Gross
2015-11-12 13:47   ` Wei Liu
2015-11-12 13:48     ` Juergen Gross
2015-11-16 13:40       ` Ian Campbell
2015-11-16 14:32         ` Juergen Gross
2015-11-18 16:11   ` Boris Ostrovsky
2015-11-18 16:16     ` Wei Liu
2015-11-18 16:21       ` Boris Ostrovsky
2015-11-19  6:09         ` Juergen Gross
2015-11-19 13:41           ` Boris Ostrovsky
2015-11-12 13:43 ` [PATCH v5 9/9] libxc: create p2m list outside of kernel mapping if supported Juergen Gross

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.