* [PATCH 00/20][V5]: PVH xen: version 5 patches...
@ 2013-05-15  0:52 Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 01/20] PVH xen: turn gdt_frames/gdt_ents into union Mukesh Rathor
                   ` (19 more replies)
  0 siblings, 20 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

Here is version 5 of my patches for a 64-bit PVH guest for Xen. This is Phase I.
These patches are built on top of git
c/s: 4de97462d34f7b74c748ab67600fe2386131b778

Phase I:
   - Establish a baseline of something working. These patches allow
     dom0 to be booted in PVH mode, and after that guests to be started
     in PV, PVH, and HVM modes. I also tested booting dom0 in PV mode,
     and starting PV, PVH, and HVM guests.

     Also, the disk must be specified as phy: in the vm.cfg file:
         > losetup /dev/loop1 guest.img
         > vm.cfg file: disk = ['phy:/dev/loop1,xvda,w']        

     I have not tested anything else.
     Note: HAP and an IOMMU are required for PVH.

As a result of V3, there are two new action items on the Linux side before
it will boot as PVH: 1) MSI-X fixup, and 2) load KERNEL_CS right after the
GDT switch.

As a result of V5, there is a new fixme:
  - MMIO ranges above the highest covered e820 address must be mapped for dom0.

The following fixmes exist in the code:
  - Add support for more memory types in arch/x86/hvm/mtrr.c.
  - arch/x86/time.c: support more TSC modes.
  - check_guest_io_breakpoint(): check/add support for IO breakpoints.
  - Implement arch_get_info_guest() for PVH.
  - vmxit_msr_read(): during the AMD port, go through hvm_msr_read_intercept()
    again.
  - Verify that breakpoint matching on emulated instructions works the same
    for a PVH guest as for HVM; see instruction_done() and
    check_guest_io_breakpoint().

The following remain to be done for PVH:
   - AMD port.
   - Make posted interrupts available to PVH dom0 (this will be a big win).
   - 32-bit support in both Linux and Xen. Xen changes are tagged "32bitfixme".
   - Add support for monitoring guest behavior; see the hvm_memory_event*
     functions in hvm.c.
   - Change xl to support modes other than "phy:".
   - Hotplug support.
   - Migration of PVH guests.

Thanks for all the help,
Mukesh


* [PATCH 01/20] PVH xen: turn gdt_frames/gdt_ents into union
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range Mukesh Rathor
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

Changes in V2:
  - Add __XEN_INTERFACE_VERSION__.

Changes in V3:
  - Rename union to 'gdt' and rename field names.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 tools/libxc/xc_domain_restore.c   |    8 ++++----
 tools/libxc/xc_domain_save.c      |    6 +++---
 xen/arch/x86/domain.c             |   12 ++++++------
 xen/arch/x86/domctl.c             |   12 ++++++------
 xen/include/public/arch-x86/xen.h |   14 ++++++++++++++
 5 files changed, 33 insertions(+), 19 deletions(-)

diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
index a15f86a..5530631 100644
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -2020,15 +2020,15 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
             munmap(start_info, PAGE_SIZE);
         }
         /* Uncanonicalise each GDT frame number. */
-        if ( GET_FIELD(ctxt, gdt_ents) > 8192 )
+        if ( GET_FIELD(ctxt, gdt.pv.num_ents) > 8192 )
         {
             ERROR("GDT entry count out of range");
             goto out;
         }
 
-        for ( j = 0; (512*j) < GET_FIELD(ctxt, gdt_ents); j++ )
+        for ( j = 0; (512*j) < GET_FIELD(ctxt, gdt.pv.num_ents); j++ )
         {
-            pfn = GET_FIELD(ctxt, gdt_frames[j]);
+            pfn = GET_FIELD(ctxt, gdt.pv.frames[j]);
             if ( (pfn >= dinfo->p2m_size) ||
                  (pfn_type[pfn] != XEN_DOMCTL_PFINFO_NOTAB) )
             {
@@ -2036,7 +2036,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       j, (unsigned long)pfn);
                 goto out;
             }
-            SET_FIELD(ctxt, gdt_frames[j], ctx->p2m[pfn]);
+            SET_FIELD(ctxt, gdt.pv.frames[j], ctx->p2m[pfn]);
         }
         /* Uncanonicalise the page table base pointer. */
         pfn = UNFOLD_CR3(GET_FIELD(ctxt, ctrlreg[3]));
diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
index ff76626..97cf64a 100644
--- a/tools/libxc/xc_domain_save.c
+++ b/tools/libxc/xc_domain_save.c
@@ -1900,15 +1900,15 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
         }
 
         /* Canonicalise each GDT frame number. */
-        for ( j = 0; (512*j) < GET_FIELD(&ctxt, gdt_ents); j++ )
+        for ( j = 0; (512*j) < GET_FIELD(&ctxt, gdt.pv.num_ents); j++ )
         {
-            mfn = GET_FIELD(&ctxt, gdt_frames[j]);
+            mfn = GET_FIELD(&ctxt, gdt.pv.frames[j]);
             if ( !MFN_IS_IN_PSEUDOPHYS_MAP(mfn) )
             {
                 ERROR("GDT frame is not in range of pseudophys map");
                 goto out;
             }
-            SET_FIELD(&ctxt, gdt_frames[j], mfn_to_pfn(mfn));
+            SET_FIELD(&ctxt, gdt.pv.frames[j], mfn_to_pfn(mfn));
         }
 
         /* Canonicalise the page table base pointer. */
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index db1e65d..65e14da 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -782,8 +782,8 @@ int arch_set_info_guest(
         }
 
         for ( i = 0; i < ARRAY_SIZE(v->arch.pv_vcpu.gdt_frames); ++i )
-            fail |= v->arch.pv_vcpu.gdt_frames[i] != c(gdt_frames[i]);
-        fail |= v->arch.pv_vcpu.gdt_ents != c(gdt_ents);
+            fail |= v->arch.pv_vcpu.gdt_frames[i] != c(gdt.pv.frames[i]);
+        fail |= v->arch.pv_vcpu.gdt_ents != c(gdt.pv.num_ents);
 
         fail |= v->arch.pv_vcpu.ldt_base != c(ldt_base);
         fail |= v->arch.pv_vcpu.ldt_ents != c(ldt_ents);
@@ -832,17 +832,17 @@ int arch_set_info_guest(
         d->vm_assist = c(vm_assist);
 
     if ( !compat )
-        rc = (int)set_gdt(v, c.nat->gdt_frames, c.nat->gdt_ents);
+        rc = (int)set_gdt(v, c.nat->gdt.pv.frames, c.nat->gdt.pv.num_ents);
     else
     {
         unsigned long gdt_frames[ARRAY_SIZE(v->arch.pv_vcpu.gdt_frames)];
-        unsigned int n = (c.cmp->gdt_ents + 511) / 512;
+        unsigned int n = (c.cmp->gdt.pv.num_ents + 511) / 512;
 
         if ( n > ARRAY_SIZE(v->arch.pv_vcpu.gdt_frames) )
             return -EINVAL;
         for ( i = 0; i < n; ++i )
-            gdt_frames[i] = c.cmp->gdt_frames[i];
-        rc = (int)set_gdt(v, gdt_frames, c.cmp->gdt_ents);
+            gdt_frames[i] = c.cmp->gdt.pv.frames[i];
+        rc = (int)set_gdt(v, gdt_frames, c.cmp->gdt.pv.num_ents);
     }
     if ( rc != 0 )
         return rc;
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 1f16ad2..dd5bc10 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1300,12 +1300,12 @@ void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c)
         c(ldt_base = v->arch.pv_vcpu.ldt_base);
         c(ldt_ents = v->arch.pv_vcpu.ldt_ents);
         for ( i = 0; i < ARRAY_SIZE(v->arch.pv_vcpu.gdt_frames); ++i )
-            c(gdt_frames[i] = v->arch.pv_vcpu.gdt_frames[i]);
-        BUILD_BUG_ON(ARRAY_SIZE(c.nat->gdt_frames) !=
-                     ARRAY_SIZE(c.cmp->gdt_frames));
-        for ( ; i < ARRAY_SIZE(c.nat->gdt_frames); ++i )
-            c(gdt_frames[i] = 0);
-        c(gdt_ents = v->arch.pv_vcpu.gdt_ents);
+            c(gdt.pv.frames[i] = v->arch.pv_vcpu.gdt_frames[i]);
+        BUILD_BUG_ON(ARRAY_SIZE(c.nat->gdt.pv.frames) !=
+                     ARRAY_SIZE(c.cmp->gdt.pv.frames));
+        for ( ; i < ARRAY_SIZE(c.nat->gdt.pv.frames); ++i )
+            c(gdt.pv.frames[i] = 0);
+        c(gdt.pv.num_ents = v->arch.pv_vcpu.gdt_ents);
         c(kernel_ss = v->arch.pv_vcpu.kernel_ss);
         c(kernel_sp = v->arch.pv_vcpu.kernel_sp);
         for ( i = 0; i < ARRAY_SIZE(v->arch.pv_vcpu.ctrlreg); ++i )
diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h
index b7f6a51..25c8519 100644
--- a/xen/include/public/arch-x86/xen.h
+++ b/xen/include/public/arch-x86/xen.h
@@ -170,7 +170,21 @@ struct vcpu_guest_context {
     struct cpu_user_regs user_regs;         /* User-level CPU registers     */
     struct trap_info trap_ctxt[256];        /* Virtual IDT                  */
     unsigned long ldt_base, ldt_ents;       /* LDT (linear address, # ents) */
+#if __XEN_INTERFACE_VERSION__ < 0x00040400
     unsigned long gdt_frames[16], gdt_ents; /* GDT (machine frames, # ents) */
+#else
+    union {
+        struct {
+            /* GDT (machine frames, # ents) */
+            unsigned long frames[16], num_ents;
+        } pv;
+        struct {
+            /* PVH: GDTR addr and size */
+            uint64_t addr;
+            uint16_t limit;
+        } pvh;
+    } gdt;
+#endif
     unsigned long kernel_ss, kernel_sp;     /* Virtual TSS (only SS1/SP1)   */
     /* NB. User pagetable on x86/64 is placed in ctrlreg[1]. */
     unsigned long ctrlreg[8];               /* CR0-CR7 (control registers)  */
-- 
1.7.2.3
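
For illustration, here is how a caller might fill the two arms of the new
union depending on guest type. This is a minimal standalone sketch with
simplified stand-in types; the struct and helper names are hypothetical and
not part of the patch:

    #include <stdint.h>
    #include <string.h>

    /* Simplified stand-in for the gdt part of vcpu_guest_context,
     * assuming __XEN_INTERFACE_VERSION__ >= 0x00040400. */
    struct gdt_ctxt {
        union {
            struct { unsigned long frames[16], num_ents; } pv; /* machine frames */
            struct { uint64_t addr; uint16_t limit; } pvh;     /* GDTR base/limit */
        } gdt;
    };

    /* PV: the GDT is described by a list of machine frames plus entry count. */
    static void fill_gdt_pv(struct gdt_ctxt *c, const unsigned long *frames,
                            unsigned long nframes, unsigned long nents)
    {
        memcpy(c->gdt.pv.frames, frames, nframes * sizeof(frames[0]));
        c->gdt.pv.num_ents = nents;
    }

    /* PVH: only the virtual GDTR (base address and limit) is needed. */
    static void fill_gdt_pvh(struct gdt_ctxt *c, uint64_t base, uint16_t limit)
    {
        c->gdt.pvh.addr  = base;
        c->gdt.pvh.limit = limit;
    }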


* [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 01/20] PVH xen: turn gdt_frames/gdt_ents into union Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15  9:58   ` Jan Beulich
  2013-05-15  0:52 ` [PATCH 03/20] PVH xen: create domctl_memory_mapping() function Mukesh Rathor
                   ` (17 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

In this patch we add a new function, xenmem_add_to_physmap_range(), and
change the xenmem_add_to_physmap_once() parameters so it can be called from
xenmem_add_to_physmap_range(). There is no PVH-specific change here.

Changes in V2:
  - Do not break up the parameters to xenmem_add_to_physmap_once(), but pass
    in struct xen_add_to_physmap.

Changes in V3:
  - Add an XSM hook.
  - Redo xenmem_add_to_physmap_range() a bit, as struct
    xen_add_to_physmap_range got enhanced.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/mm.c |   80 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 5123860..43eeddc 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4519,7 +4519,8 @@ static int handle_iomem_range(unsigned long s, unsigned long e, void *p)
 
 static int xenmem_add_to_physmap_once(
     struct domain *d,
-    const struct xen_add_to_physmap *xatp)
+    const struct xen_add_to_physmap *xatp,
+    domid_t foreign_domid)
 {
     struct page_info *page = NULL;
     unsigned long gfn = 0; /* gcc ... */
@@ -4646,7 +4647,7 @@ static int xenmem_add_to_physmap(struct domain *d,
         start_xatp = *xatp;
         while ( xatp->size > 0 )
         {
-            rc = xenmem_add_to_physmap_once(d, xatp);
+            rc = xenmem_add_to_physmap_once(d, xatp, DOMID_INVALID);
             if ( rc < 0 )
                 return rc;
 
@@ -4672,7 +4673,43 @@ static int xenmem_add_to_physmap(struct domain *d,
         return rc;
     }
 
-    return xenmem_add_to_physmap_once(d, xatp);
+    return xenmem_add_to_physmap_once(d, xatp, DOMID_INVALID);
+}
+
+static int xenmem_add_to_physmap_range(struct domain *d,
+                                       struct xen_add_to_physmap_range *xatpr)
+{
+    int rc;
+
+    /* Process entries in reverse order to allow continuations */
+    while ( xatpr->size > 0 )
+    {
+        xen_ulong_t idx;
+        xen_pfn_t gpfn;
+        struct xen_add_to_physmap xatp;
+
+        if ( copy_from_guest_offset(&idx, xatpr->idxs, xatpr->size-1, 1)  ||
+             copy_from_guest_offset(&gpfn, xatpr->gpfns, xatpr->size-1, 1) )
+        {
+            return -EFAULT;
+        }
+
+        xatp.space = xatpr->space;
+        xatp.idx = idx;
+        xatp.gpfn = gpfn;
+        rc = xenmem_add_to_physmap_once(d, &xatp, xatpr->foreign_domid);
+
+        if ( copy_to_guest_offset(xatpr->errs, xatpr->size-1, &rc, 1) )
+            return -EFAULT;
+
+        xatpr->size--;
+
+        /* Check for continuation if it's not the last iteration */
+        if ( xatpr->size > 0 && hypercall_preempt_check() )
+            return -EAGAIN;
+    }
+
+    return 0;
 }
 
 long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
@@ -4689,6 +4726,10 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( copy_from_guest(&xatp, arg, 1) )
             return -EFAULT;
 
+        /* This one is only supported for add_to_physmap_range */
+        if ( xatp.space == XENMAPSPACE_gmfn_foreign )
+            return -EINVAL;
+
         d = rcu_lock_domain_by_any_id(xatp.domid);
         if ( d == NULL )
             return -ESRCH;
@@ -4716,6 +4757,39 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
         return rc;
     }
 
+    case XENMEM_add_to_physmap_range:
+    {
+        struct xen_add_to_physmap_range xatpr;
+        struct domain *d;
+
+        if ( copy_from_guest(&xatpr, arg, 1) )
+            return -EFAULT;
+
+        /* This mapspace is redundant for this hypercall */
+        if ( xatpr.space == XENMAPSPACE_gmfn_range )
+            return -EINVAL;
+
+        rc = rcu_lock_target_domain_by_id(xatpr.domid, &d);
+        if ( rc != 0 )
+            return rc;
+
+        if ( xsm_add_to_physmap(XSM_TARGET, current->domain, d) )
+        {
+            rcu_unlock_domain(d);
+            return -EPERM;
+        }
+
+        rc = xenmem_add_to_physmap_range(d, &xatpr);
+
+        rcu_unlock_domain(d);
+
+        if ( rc == -EAGAIN )
+            rc = hypercall_create_continuation(
+                __HYPERVISOR_memory_op, "ih", op, arg);
+
+        return rc;
+    }
+
     case XENMEM_set_memory_map:
     {
         struct xen_foreign_memory_map fmap;
-- 
1.7.2.3
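
The reverse-order walk in xenmem_add_to_physmap_range() is what makes the
continuation cheap: progress is recorded by shrinking xatpr->size, so
re-entering the hypercall with the same argument simply resumes at the new
tail. A minimal standalone sketch of the idiom, with simplified types and
hypothetical names:

    #include <errno.h>

    struct range_args { unsigned int size; /* entries still to process */ };

    static int process_one(struct range_args *a, unsigned int idx)
    {
        (void)a; (void)idx;
        return 0;                        /* per-entry work elided */
    }

    /* Stand-in for hypercall_preempt_check(). */
    static int preempt_check(void) { return 0; }

    static int process_range(struct range_args *a)
    {
        while (a->size > 0) {
            int rc = process_one(a, a->size - 1);  /* take the tail entry */
            if (rc)
                return rc;
            a->size--;                   /* progress lives in the argument */
            if (a->size > 0 && preempt_check())
                return -EAGAIN;          /* caller re-issues as a continuation */
        }
        return 0;
    }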


* [PATCH 03/20] PVH xen: create domctl_memory_mapping() function
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 01/20] PVH xen: turn gdt_frames/gdt_ents into union Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15 10:07   ` Jan Beulich
  2013-05-15  0:52 ` [PATCH 04/20] PVH xen: add params to read_segment_register Mukesh Rathor
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

In this patch, the XEN_DOMCTL_memory_mapping code is put into a function so
it can be shared later for PVH. There is no change in its functionality.

Changes in V2:
  - Remove the PHYSDEVOP_map_iomem sub-hypercall, and the code supporting it,
    as the IO region is mapped transparently now.

Changes in V3:
  - change loop control variable to ulong from int.
  - move priv checks to the function.

Changes in V5:
  - Move the iomem_access_permitted check from the function to the case
    statement, as current doesn't point to dom0 during construct_dom0.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/domctl.c    |  121 +++++++++++++++++++++++++---------------------
 xen/include/xen/domain.h |    2 +
 2 files changed, 67 insertions(+), 56 deletions(-)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index dd5bc10..c5a6f6f 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -46,6 +46,70 @@ static int gdbsx_guest_mem_io(
     return (iop->remain ? -EFAULT : 0);
 }
 
+long domctl_memory_mapping(struct domain *d, unsigned long gfn,
+                           unsigned long mfn, unsigned long nr_mfns,
+                           bool_t add_map)
+{
+    unsigned long i;
+    long ret;
+
+    if ( (mfn + nr_mfns - 1) < mfn || /* wrap? */
+         ((mfn | (mfn + nr_mfns - 1)) >> (paddr_bits - PAGE_SHIFT)) ||
+         (gfn + nr_mfns - 1) < gfn ) /* wrap? */
+        return -EINVAL;
+
+    ret = xsm_iomem_permission(XSM_HOOK, d, mfn, mfn + nr_mfns - 1, add_map);
+    if ( ret )
+        return ret;
+
+    if ( add_map )
+    {
+        printk(XENLOG_G_INFO
+               "memory_map:add: dom%d gfn=%lx mfn=%lx nr=%lx\n",
+               d->domain_id, gfn, mfn, nr_mfns);
+
+        ret = iomem_permit_access(d, mfn, mfn + nr_mfns - 1);
+        if ( !ret && paging_mode_translate(d) )
+        {
+            for ( i = 0; !ret && i < nr_mfns; i++ )
+                if ( !set_mmio_p2m_entry(d, gfn + i, _mfn(mfn + i)) )
+                    ret = -EIO;
+            if ( ret )
+            {
+                printk(XENLOG_G_WARNING
+                       "memory_map:fail: dom%d gfn=%lx mfn=%lx\n",
+                       d->domain_id, gfn + i, mfn + i);
+                while ( i-- )
+                    clear_mmio_p2m_entry(d, gfn + i);
+                if ( iomem_deny_access(d, mfn, mfn + nr_mfns - 1) &&
+                     IS_PRIV(current->domain) )
+                    printk(XENLOG_ERR
+                           "memory_map: failed to deny dom%d access to [%lx,%lx]\n",
+                           d->domain_id, mfn, mfn + nr_mfns - 1);
+            }
+        }
+    }
+    else
+    {
+        printk(XENLOG_G_INFO
+               "memory_map:remove: dom%d gfn=%lx mfn=%lx nr=%lx\n",
+               d->domain_id, gfn, mfn, nr_mfns);
+
+        if ( paging_mode_translate(d) )
+            for ( i = 0; i < nr_mfns; i++ )
+                add_map |= !clear_mmio_p2m_entry(d, gfn + i);
+        ret = iomem_deny_access(d, mfn, mfn + nr_mfns - 1);
+        if ( !ret && add_map )
+            ret = -EIO;
+        if ( ret && IS_PRIV(current->domain) )
+            printk(XENLOG_ERR
+                   "memory_map: error %ld %s dom%d access to [%lx,%lx]\n",
+                   ret, add_map ? "removing" : "denying", d->domain_id,
+                   mfn, mfn + nr_mfns - 1);
+    }
+    return ret;
+}
+
 long arch_do_domctl(
     struct xen_domctl *domctl, struct domain *d,
     XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
@@ -625,67 +689,12 @@ long arch_do_domctl(
         unsigned long mfn = domctl->u.memory_mapping.first_mfn;
         unsigned long nr_mfns = domctl->u.memory_mapping.nr_mfns;
         int add = domctl->u.memory_mapping.add_mapping;
-        unsigned long i;
-
-        ret = -EINVAL;
-        if ( (mfn + nr_mfns - 1) < mfn || /* wrap? */
-             ((mfn | (mfn + nr_mfns - 1)) >> (paddr_bits - PAGE_SHIFT)) ||
-             (gfn + nr_mfns - 1) < gfn ) /* wrap? */
-            break;
 
         ret = -EPERM;
         if ( !iomem_access_permitted(current->domain, mfn, mfn + nr_mfns - 1) )
             break;
 
-        ret = xsm_iomem_mapping(XSM_HOOK, d, mfn, mfn + nr_mfns - 1, add);
-        if ( ret )
-            break;
-
-        if ( add )
-        {
-            printk(XENLOG_G_INFO
-                   "memory_map:add: dom%d gfn=%lx mfn=%lx nr=%lx\n",
-                   d->domain_id, gfn, mfn, nr_mfns);
-
-            ret = iomem_permit_access(d, mfn, mfn + nr_mfns - 1);
-            if ( !ret && paging_mode_translate(d) )
-            {
-                for ( i = 0; !ret && i < nr_mfns; i++ )
-                    if ( !set_mmio_p2m_entry(d, gfn + i, _mfn(mfn + i)) )
-                        ret = -EIO;
-                if ( ret )
-                {
-                    printk(XENLOG_G_WARNING
-                           "memory_map:fail: dom%d gfn=%lx mfn=%lx\n",
-                           d->domain_id, gfn + i, mfn + i);
-                    while ( i-- )
-                        clear_mmio_p2m_entry(d, gfn + i);
-                    if ( iomem_deny_access(d, mfn, mfn + nr_mfns - 1) &&
-                         IS_PRIV(current->domain) )
-                        printk(XENLOG_ERR
-                               "memory_map: failed to deny dom%d access to [%lx,%lx]\n",
-                               d->domain_id, mfn, mfn + nr_mfns - 1);
-                }
-            }
-        }
-        else
-        {
-            printk(XENLOG_G_INFO
-                   "memory_map:remove: dom%d gfn=%lx mfn=%lx nr=%lx\n",
-                   d->domain_id, gfn, mfn, nr_mfns);
-
-            if ( paging_mode_translate(d) )
-                for ( i = 0; i < nr_mfns; i++ )
-                    add |= !clear_mmio_p2m_entry(d, gfn + i);
-            ret = iomem_deny_access(d, mfn, mfn + nr_mfns - 1);
-            if ( !ret && add )
-                ret = -EIO;
-            if ( ret && IS_PRIV(current->domain) )
-                printk(XENLOG_ERR
-                       "memory_map: error %ld %s dom%d access to [%lx,%lx]\n",
-                       ret, add ? "removing" : "denying", d->domain_id,
-                       mfn, mfn + nr_mfns - 1);
-        }
+        ret = domctl_memory_mapping(d, gfn, mfn, nr_mfns, add);
     }
     break;
 
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 504a70f..4213e98 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -86,4 +86,6 @@ extern unsigned int xen_processor_pmbits;
 
 extern bool_t opt_dom0_vcpus_pin;
 
+extern long domctl_memory_mapping(struct domain *d, unsigned long gfn,
+                    unsigned long mfn, unsigned long nr_mfns, bool_t add_map);
 #endif /* __XEN_DOMAIN_H__ */
-- 
1.7.2.3
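
The sanity checks at the top of domctl_memory_mapping() deserve a note: the
MFN and GFN ranges must not wrap, and the MFN range must fit within the
machine's physical address width. A standalone sketch of the same checks
(the paddr_bits value here is an assumption for illustration):

    #include <stdbool.h>

    #define PAGE_SHIFT 12
    static unsigned int paddr_bits = 46;   /* assumed CPUID-reported width */

    static bool mmio_range_valid(unsigned long gfn, unsigned long mfn,
                                 unsigned long nr_mfns)
    {
        unsigned long mfn_end = mfn + nr_mfns - 1;
        unsigned long gfn_end = gfn + nr_mfns - 1;

        if (mfn_end < mfn || gfn_end < gfn)               /* wrap? */
            return false;
        /* Any bit at or above the highest valid machine frame number? */
        if ((mfn | mfn_end) >> (paddr_bits - PAGE_SHIFT))
            return false;
        return true;
    }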


* [PATCH 04/20] PVH xen: add params to read_segment_register
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (2 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 03/20] PVH xen: create domctl_memory_mapping() function Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 05/20] PVH xen: vmx related preparatory changes for PVH Mukesh Rathor
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

In this patch, the read_segment_register macro is changed to take vcpu and
regs parameters so it can check whether it's a PVH guest (the change comes
in upcoming patches). No functionality change. Also, make
emulate_privileged_op() public for later use while changing this file.

Changes in V2:  None
Changes in V3:
   - Replace read_sreg with read_segment_register

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/domain.c        |    8 ++++----
 xen/arch/x86/traps.c         |   28 +++++++++++++---------------
 xen/arch/x86/x86_64/traps.c  |   16 ++++++++--------
 xen/include/asm-x86/system.h |    2 +-
 xen/include/asm-x86/traps.h  |    1 +
 5 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 65e14da..b9711d2 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1347,10 +1347,10 @@ static void save_segments(struct vcpu *v)
     struct cpu_user_regs *regs = &v->arch.user_regs;
     unsigned int dirty_segment_mask = 0;
 
-    regs->ds = read_segment_register(ds);
-    regs->es = read_segment_register(es);
-    regs->fs = read_segment_register(fs);
-    regs->gs = read_segment_register(gs);
+    regs->ds = read_segment_register(v, regs, ds);
+    regs->es = read_segment_register(v, regs, es);
+    regs->fs = read_segment_register(v, regs, fs);
+    regs->gs = read_segment_register(v, regs, gs);
 
     if ( regs->ds )
         dirty_segment_mask |= DIRTY_DS;
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index fbbe31d..1b280dc 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1823,8 +1823,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
     }                                                                       \
     (eip) += sizeof(_x); _x; })
 
-#define read_sreg(regs, sr) read_segment_register(sr)
-
 static int is_cpufreq_controller(struct domain *d)
 {
     return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
@@ -1833,7 +1831,7 @@ static int is_cpufreq_controller(struct domain *d)
 
 #include "x86_64/mmconfig.h"
 
-static int emulate_privileged_op(struct cpu_user_regs *regs)
+int emulate_privileged_op(struct cpu_user_regs *regs)
 {
     struct vcpu *v = current;
     unsigned long *reg, eip = regs->eip;
@@ -1869,7 +1867,7 @@ static int emulate_privileged_op(struct cpu_user_regs *regs)
         goto fail;
 
     /* emulating only opcodes not allowing SS to be default */
-    data_sel = read_sreg(regs, ds);
+    data_sel = read_segment_register(v, regs, ds);
 
     /* Legacy prefixes. */
     for ( i = 0; i < 8; i++, rex == opcode || (rex = 0) )
@@ -1887,17 +1885,17 @@ static int emulate_privileged_op(struct cpu_user_regs *regs)
             data_sel = regs->cs;
             continue;
         case 0x3e: /* DS override */
-            data_sel = read_sreg(regs, ds);
+            data_sel = read_segment_register(v, regs, ds);
             continue;
         case 0x26: /* ES override */
-            data_sel = read_sreg(regs, es);
+            data_sel = read_segment_register(v, regs, es);
             continue;
         case 0x64: /* FS override */
-            data_sel = read_sreg(regs, fs);
+            data_sel = read_segment_register(v, regs, fs);
             lm_ovr = lm_seg_fs;
             continue;
         case 0x65: /* GS override */
-            data_sel = read_sreg(regs, gs);
+            data_sel = read_segment_register(v, regs, gs);
             lm_ovr = lm_seg_gs;
             continue;
         case 0x36: /* SS override */
@@ -1944,7 +1942,7 @@ static int emulate_privileged_op(struct cpu_user_regs *regs)
 
         if ( !(opcode & 2) )
         {
-            data_sel = read_sreg(regs, es);
+            data_sel = read_segment_register(v, regs, es);
             lm_ovr = lm_seg_none;
         }
 
@@ -2688,22 +2686,22 @@ static void emulate_gate_op(struct cpu_user_regs *regs)
             ASSERT(opnd_sel);
             continue;
         case 0x3e: /* DS override */
-            opnd_sel = read_sreg(regs, ds);
+            opnd_sel = read_segment_register(v, regs, ds);
             if ( !opnd_sel )
                 opnd_sel = dpl;
             continue;
         case 0x26: /* ES override */
-            opnd_sel = read_sreg(regs, es);
+            opnd_sel = read_segment_register(v, regs, es);
             if ( !opnd_sel )
                 opnd_sel = dpl;
             continue;
         case 0x64: /* FS override */
-            opnd_sel = read_sreg(regs, fs);
+            opnd_sel = read_segment_register(v, regs, fs);
             if ( !opnd_sel )
                 opnd_sel = dpl;
             continue;
         case 0x65: /* GS override */
-            opnd_sel = read_sreg(regs, gs);
+            opnd_sel = read_segment_register(v, regs, gs);
             if ( !opnd_sel )
                 opnd_sel = dpl;
             continue;
@@ -2756,7 +2754,7 @@ static void emulate_gate_op(struct cpu_user_regs *regs)
                             switch ( modrm & 7 )
                             {
                             default:
-                                opnd_sel = read_sreg(regs, ds);
+                                opnd_sel = read_segment_register(v, regs, ds);
                                 break;
                             case 4: case 5:
                                 opnd_sel = regs->ss;
@@ -2784,7 +2782,7 @@ static void emulate_gate_op(struct cpu_user_regs *regs)
                             break;
                         }
                         if ( !opnd_sel )
-                            opnd_sel = read_sreg(regs, ds);
+                            opnd_sel = read_segment_register(v, regs, ds);
                         switch ( modrm & 7 )
                         {
                         case 0: case 2: case 4:
diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
index eec919a..d2f7209 100644
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -122,10 +122,10 @@ void show_registers(struct cpu_user_regs *regs)
         fault_crs[0] = read_cr0();
         fault_crs[3] = read_cr3();
         fault_crs[4] = read_cr4();
-        fault_regs.ds = read_segment_register(ds);
-        fault_regs.es = read_segment_register(es);
-        fault_regs.fs = read_segment_register(fs);
-        fault_regs.gs = read_segment_register(gs);
+        fault_regs.ds = read_segment_register(v, regs, ds);
+        fault_regs.es = read_segment_register(v, regs, es);
+        fault_regs.fs = read_segment_register(v, regs, fs);
+        fault_regs.gs = read_segment_register(v, regs, gs);
     }
 
     print_xen_info();
@@ -240,10 +240,10 @@ void do_double_fault(struct cpu_user_regs *regs)
     crs[2] = read_cr2();
     crs[3] = read_cr3();
     crs[4] = read_cr4();
-    regs->ds = read_segment_register(ds);
-    regs->es = read_segment_register(es);
-    regs->fs = read_segment_register(fs);
-    regs->gs = read_segment_register(gs);
+    regs->ds = read_segment_register(current, regs, ds);
+    regs->es = read_segment_register(current, regs, es);
+    regs->fs = read_segment_register(current, regs, fs);
+    regs->gs = read_segment_register(current, regs, gs);
 
     printk("CPU:    %d\n", cpu);
     _show_registers(regs, crs, CTXT_hypervisor, NULL);
diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
index b0876d6..d8dc6f2 100644
--- a/xen/include/asm-x86/system.h
+++ b/xen/include/asm-x86/system.h
@@ -4,7 +4,7 @@
 #include <xen/lib.h>
 #include <asm/bitops.h>
 
-#define read_segment_register(name)                             \
+#define read_segment_register(vcpu, regs, name)                 \
 ({  u16 __sel;                                                  \
     asm volatile ( "movw %%" STR(name) ",%0" : "=r" (__sel) );  \
     __sel;                                                      \
diff --git a/xen/include/asm-x86/traps.h b/xen/include/asm-x86/traps.h
index 82cbcee..202e3be 100644
--- a/xen/include/asm-x86/traps.h
+++ b/xen/include/asm-x86/traps.h
@@ -49,4 +49,5 @@ extern int guest_has_trap_callback(struct domain *d, uint16_t vcpuid,
 extern int send_guest_trap(struct domain *d, uint16_t vcpuid,
 				unsigned int trap_nr);
 
+int emulate_privileged_op(struct cpu_user_regs *regs);
 #endif /* ASM_TRAP_H */
-- 
1.7.2.3
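
The macro gains parameters it does not use yet. Presumably a later patch in
the series makes it return the selector from the saved regs for a PVH vcpu,
since PVH selectors are not live in the physical segment registers on these
paths. A sketch of that assumed final shape (not part of this patch):

    #define read_segment_register(vcpu, regs, name)                   \
    ({  u16 __sel;                                                    \
        if ( is_pvh_vcpu(vcpu) )                                      \
            __sel = (regs)->name;   /* saved by the VM-exit path */   \
        else                                                          \
            asm volatile ( "movw %%" STR(name) ",%0" : "=r" (__sel) );\
        __sel;                                                        \
    })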


* [PATCH 05/20] PVH xen: vmx related preparatory changes for PVH
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (3 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 04/20] PVH xen: add params to read_segment_register Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 06/20] PVH xen: Move e820 fields out of pv_domain struct Mukesh Rathor
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

This is another preparatory patch for PVH. In this patch, the following
functions are made non-static:
    vmx_fpu_enter(), get_instruction_length(), update_guest_eip(),
    vmx_dr_access(), and pv_cpuid().

There is no functionality change.

Changes in V2:
  - prepend vmx_ to get_instruction_length and update_guest_eip.
  - Do not export/use vmr().

Changes in V3:
  - Do not change emulate_forced_invalid_op() in this patch.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/hvm/vmx/vmx.c         |   72 +++++++++++++++---------------------
 xen/arch/x86/hvm/vmx/vvmx.c        |    2 +-
 xen/arch/x86/traps.c               |    2 +-
 xen/include/asm-x86/hvm/vmx/vmcs.h |    1 +
 xen/include/asm-x86/hvm/vmx/vmx.h  |   15 +++++++-
 xen/include/asm-x86/processor.h    |    1 +
 6 files changed, 48 insertions(+), 45 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 51187a9..7e5dba8 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -577,7 +577,7 @@ static int vmx_load_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt)
     return 0;
 }
 
-static void vmx_fpu_enter(struct vcpu *v)
+void vmx_fpu_enter(struct vcpu *v)
 {
     vcpu_restore_fpu_lazy(v);
     v->arch.hvm_vmx.exception_bitmap &= ~(1u << TRAP_no_device);
@@ -1594,24 +1594,12 @@ const struct hvm_function_table * __init start_vmx(void)
     return &vmx_function_table;
 }
 
-/*
- * Not all cases receive valid value in the VM-exit instruction length field.
- * Callers must know what they're doing!
- */
-static int get_instruction_length(void)
-{
-    int len;
-    len = __vmread(VM_EXIT_INSTRUCTION_LEN); /* Safe: callers audited */
-    BUG_ON((len < 1) || (len > 15));
-    return len;
-}
-
-void update_guest_eip(void)
+void vmx_update_guest_eip(void)
 {
     struct cpu_user_regs *regs = guest_cpu_user_regs();
     unsigned long x;
 
-    regs->eip += get_instruction_length(); /* Safe: callers audited */
+    regs->eip += vmx_get_instruction_length(); /* Safe: callers audited */
     regs->eflags &= ~X86_EFLAGS_RF;
 
     x = __vmread(GUEST_INTERRUPTIBILITY_INFO);
@@ -1684,8 +1672,8 @@ static void vmx_do_cpuid(struct cpu_user_regs *regs)
     regs->edx = edx;
 }
 
-static void vmx_dr_access(unsigned long exit_qualification,
-                          struct cpu_user_regs *regs)
+void vmx_dr_access(unsigned long exit_qualification,
+                   struct cpu_user_regs *regs)
 {
     struct vcpu *v = current;
 
@@ -2289,7 +2277,7 @@ static int vmx_handle_eoi_write(void)
     if ( (((exit_qualification >> 12) & 0xf) == 1) &&
          ((exit_qualification & 0xfff) == APIC_EOI) )
     {
-        update_guest_eip(); /* Safe: APIC data write */
+        vmx_update_guest_eip(); /* Safe: APIC data write */
         vlapic_EOI_set(vcpu_vlapic(current));
         HVMTRACE_0D(VLAPIC);
         return 1;
@@ -2502,7 +2490,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             HVMTRACE_1D(TRAP, vector);
             if ( v->domain->debugger_attached )
             {
-                update_guest_eip(); /* Safe: INT3 */            
+                vmx_update_guest_eip(); /* Safe: INT3 */
                 current->arch.gdbsx_vcpu_event = TRAP_int3;
                 domain_pause_for_debugger();
                 break;
@@ -2610,7 +2598,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
          */
         inst_len = ((source != 3) ||        /* CALL, IRET, or JMP? */
                     (idtv_info & (1u<<10))) /* IntrType > 3? */
-            ? get_instruction_length() /* Safe: SDM 3B 23.2.4 */ : 0;
+            ? vmx_get_instruction_length() /* Safe: SDM 3B 23.2.4 */ : 0;
         if ( (source == 3) && (idtv_info & INTR_INFO_DELIVER_CODE_MASK) )
             ecode = __vmread(IDT_VECTORING_ERROR_CODE);
         regs->eip += inst_len;
@@ -2618,15 +2606,15 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         break;
     }
     case EXIT_REASON_CPUID:
-        update_guest_eip(); /* Safe: CPUID */
+        vmx_update_guest_eip(); /* Safe: CPUID */
         vmx_do_cpuid(regs);
         break;
     case EXIT_REASON_HLT:
-        update_guest_eip(); /* Safe: HLT */
+        vmx_update_guest_eip(); /* Safe: HLT */
         hvm_hlt(regs->eflags);
         break;
     case EXIT_REASON_INVLPG:
-        update_guest_eip(); /* Safe: INVLPG */
+        vmx_update_guest_eip(); /* Safe: INVLPG */
         exit_qualification = __vmread(EXIT_QUALIFICATION);
         vmx_invlpg_intercept(exit_qualification);
         break;
@@ -2634,7 +2622,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         regs->ecx = hvm_msr_tsc_aux(v);
         /* fall through */
     case EXIT_REASON_RDTSC:
-        update_guest_eip(); /* Safe: RDTSC, RDTSCP */
+        vmx_update_guest_eip(); /* Safe: RDTSC, RDTSCP */
         hvm_rdtsc_intercept(regs);
         break;
     case EXIT_REASON_VMCALL:
@@ -2644,7 +2632,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         rc = hvm_do_hypercall(regs);
         if ( rc != HVM_HCALL_preempted )
         {
-            update_guest_eip(); /* Safe: VMCALL */
+            vmx_update_guest_eip(); /* Safe: VMCALL */
             if ( rc == HVM_HCALL_invalidate )
                 send_invalidate_req();
         }
@@ -2654,7 +2642,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
     {
         exit_qualification = __vmread(EXIT_QUALIFICATION);
         if ( vmx_cr_access(exit_qualification) == X86EMUL_OKAY )
-            update_guest_eip(); /* Safe: MOV Cn, LMSW, CLTS */
+            vmx_update_guest_eip(); /* Safe: MOV Cn, LMSW, CLTS */
         break;
     }
     case EXIT_REASON_DR_ACCESS:
@@ -2668,7 +2656,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         {
             regs->eax = (uint32_t)msr_content;
             regs->edx = (uint32_t)(msr_content >> 32);
-            update_guest_eip(); /* Safe: RDMSR */
+            vmx_update_guest_eip(); /* Safe: RDMSR */
         }
         break;
     }
@@ -2677,63 +2665,63 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         uint64_t msr_content;
         msr_content = ((uint64_t)regs->edx << 32) | (uint32_t)regs->eax;
         if ( hvm_msr_write_intercept(regs->ecx, msr_content) == X86EMUL_OKAY )
-            update_guest_eip(); /* Safe: WRMSR */
+            vmx_update_guest_eip(); /* Safe: WRMSR */
         break;
     }
 
     case EXIT_REASON_VMXOFF:
         if ( nvmx_handle_vmxoff(regs) == X86EMUL_OKAY )
-            update_guest_eip();
+            vmx_update_guest_eip();
         break;
 
     case EXIT_REASON_VMXON:
         if ( nvmx_handle_vmxon(regs) == X86EMUL_OKAY )
-            update_guest_eip();
+            vmx_update_guest_eip();
         break;
 
     case EXIT_REASON_VMCLEAR:
         if ( nvmx_handle_vmclear(regs) == X86EMUL_OKAY )
-            update_guest_eip();
+            vmx_update_guest_eip();
         break;
  
     case EXIT_REASON_VMPTRLD:
         if ( nvmx_handle_vmptrld(regs) == X86EMUL_OKAY )
-            update_guest_eip();
+            vmx_update_guest_eip();
         break;
 
     case EXIT_REASON_VMPTRST:
         if ( nvmx_handle_vmptrst(regs) == X86EMUL_OKAY )
-            update_guest_eip();
+            vmx_update_guest_eip();
         break;
 
     case EXIT_REASON_VMREAD:
         if ( nvmx_handle_vmread(regs) == X86EMUL_OKAY )
-            update_guest_eip();
+            vmx_update_guest_eip();
         break;
  
     case EXIT_REASON_VMWRITE:
         if ( nvmx_handle_vmwrite(regs) == X86EMUL_OKAY )
-            update_guest_eip();
+            vmx_update_guest_eip();
         break;
 
     case EXIT_REASON_VMLAUNCH:
         if ( nvmx_handle_vmlaunch(regs) == X86EMUL_OKAY )
-            update_guest_eip();
+            vmx_update_guest_eip();
         break;
 
     case EXIT_REASON_VMRESUME:
         if ( nvmx_handle_vmresume(regs) == X86EMUL_OKAY )
-            update_guest_eip();
+            vmx_update_guest_eip();
         break;
 
     case EXIT_REASON_INVEPT:
         if ( nvmx_handle_invept(regs) == X86EMUL_OKAY )
-            update_guest_eip();
+            vmx_update_guest_eip();
         break;
 
     case EXIT_REASON_INVVPID:
         if ( nvmx_handle_invvpid(regs) == X86EMUL_OKAY )
-            update_guest_eip();
+            vmx_update_guest_eip();
         break;
 
     case EXIT_REASON_MWAIT_INSTRUCTION:
@@ -2781,14 +2769,14 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             int bytes = (exit_qualification & 0x07) + 1;
             int dir = (exit_qualification & 0x08) ? IOREQ_READ : IOREQ_WRITE;
             if ( handle_pio(port, bytes, dir) )
-                update_guest_eip(); /* Safe: IN, OUT */
+                vmx_update_guest_eip(); /* Safe: IN, OUT */
         }
         break;
 
     case EXIT_REASON_INVD:
     case EXIT_REASON_WBINVD:
     {
-        update_guest_eip(); /* Safe: INVD, WBINVD */
+        vmx_update_guest_eip(); /* Safe: INVD, WBINVD */
         vmx_wbinvd_intercept();
         break;
     }
@@ -2821,7 +2809,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
     {
         u64 new_bv = (((u64)regs->edx) << 32) | regs->eax;
         if ( hvm_handle_xsetbv(new_bv) == 0 )
-            update_guest_eip(); /* Safe: XSETBV */
+            vmx_update_guest_eip(); /* Safe: XSETBV */
         break;
     }
 
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index bb7688f..225de9f 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -2136,7 +2136,7 @@ int nvmx_n2_vmexit_handler(struct cpu_user_regs *regs,
             tsc += __get_vvmcs(nvcpu->nv_vvmcx, TSC_OFFSET);
             regs->eax = (uint32_t)tsc;
             regs->edx = (uint32_t)(tsc >> 32);
-            update_guest_eip();
+            vmx_update_guest_eip();
 
             return 1;
         }
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 1b280dc..f68c526 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -728,7 +728,7 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx,
     return 1;
 }
 
-static void pv_cpuid(struct cpu_user_regs *regs)
+void pv_cpuid(struct cpu_user_regs *regs)
 {
     uint32_t a, b, c, d;
 
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index f30e5ac..c9d7118 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -475,6 +475,7 @@ void vmx_vmcs_switch(struct vmcs_struct *from, struct vmcs_struct *to);
 void vmx_set_eoi_exit_bitmap(struct vcpu *v, u8 vector);
 void vmx_clear_eoi_exit_bitmap(struct vcpu *v, u8 vector);
 int vmx_check_msr_bitmap(unsigned long *msr_bitmap, u32 msr, int access_type);
+void vmx_fpu_enter(struct vcpu *v);
 void virtual_vmcs_enter(void *vvmcs);
 void virtual_vmcs_exit(void *vvmcs);
 u64 virtual_vmcs_vmread(void *vvmcs, u32 vmcs_encoding);
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index c33b9f9..6fc0965 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -446,6 +446,18 @@ static inline int __vmxon(u64 addr)
     return rc;
 }
 
+/*
+ * Not all cases receive valid value in the VM-exit instruction length field.
+ * Callers must know what they're doing!
+ */
+static inline int vmx_get_instruction_length(void)
+{
+    int len;
+    len = __vmread(VM_EXIT_INSTRUCTION_LEN); /* Safe: callers audited */
+    BUG_ON((len < 1) || (len > 15));
+    return len;
+}
+
 void vmx_get_segment_register(struct vcpu *, enum x86_segment,
                               struct segment_register *);
 void vmx_inject_extint(int trap);
@@ -457,7 +469,8 @@ void ept_p2m_uninit(struct p2m_domain *p2m);
 void ept_walk_table(struct domain *d, unsigned long gfn);
 void setup_ept_dump(void);
 
-void update_guest_eip(void);
+void vmx_update_guest_eip(void);
+void vmx_dr_access(unsigned long exit_qualification,struct cpu_user_regs *regs);
 
 int alloc_p2m_hap_data(struct p2m_domain *p2m);
 void free_p2m_hap_data(struct p2m_domain *p2m);
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 5cdacc7..8c70324 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -566,6 +566,7 @@ void microcode_set_module(unsigned int);
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void), unsigned long len);
 int microcode_resume_cpu(int cpu);
 
+void pv_cpuid(struct cpu_user_regs *regs);
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_X86_PROCESSOR_H */
-- 
1.7.2.3
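
Exporting pv_cpuid() and vmx_update_guest_eip() only pays off with a
consumer; a hypothetical fragment of the kind of PVH exit-handler code the
later patches add (names and structure assumed here, not taken from this
patch):

    /* Hypothetical PVH handler: reuse the PV CPUID logic, then advance the
     * guest RIP with the now-public helper (safe: CPUID exits report a
     * valid VM-exit instruction length). */
    static void pvh_cpuid_exit(struct cpu_user_regs *regs)
    {
        pv_cpuid(regs);
        vmx_update_guest_eip();
    }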


* [PATCH 06/20] PVH xen: Move e820 fields out of pv_domain struct
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (4 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 05/20] PVH xen: vmx related preparatory changes for PVH Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15 10:27   ` Jan Beulich
  2013-05-15  0:52 ` [PATCH 07/20] PVH xen: Introduce PVH guest type Mukesh Rathor
                   ` (13 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

This patch moves the e820 fields out of the pv_domain struct, as they are
also used by PVH.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/domain.c        |    4 ++--
 xen/arch/x86/mm.c            |   26 +++++++++++++-------------
 xen/include/asm-x86/domain.h |   10 +++++-----
 3 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index b9711d2..8b04339 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -569,7 +569,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
         /* 64-bit PV guest by default. */
         d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
 
-        spin_lock_init(&d->arch.pv_domain.e820_lock);
+        spin_lock_init(&d->arch.e820_lock);
     }
 
     /* initialize default tsc behavior in case tools don't */
@@ -595,7 +595,7 @@ void arch_domain_destroy(struct domain *d)
     if ( is_hvm_domain(d) )
         hvm_domain_destroy(d);
     else
-        xfree(d->arch.pv_domain.e820);
+        xfree(d->arch.e820);
 
     free_domain_pirqs(d);
     if ( !is_idle_domain(d) )
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 43eeddc..60f1a4f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4833,11 +4833,11 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
             return -EFAULT;
         }
 
-        spin_lock(&d->arch.pv_domain.e820_lock);
-        xfree(d->arch.pv_domain.e820);
-        d->arch.pv_domain.e820 = e820;
-        d->arch.pv_domain.nr_e820 = fmap.map.nr_entries;
-        spin_unlock(&d->arch.pv_domain.e820_lock);
+        spin_lock(&d->arch.e820_lock);
+        xfree(d->arch.e820);
+        d->arch.e820 = e820;
+        d->arch.nr_e820 = fmap.map.nr_entries;
+        spin_unlock(&d->arch.e820_lock);
 
         rcu_unlock_domain(d);
         return rc;
@@ -4851,26 +4851,26 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( copy_from_guest(&map, arg, 1) )
             return -EFAULT;
 
-        spin_lock(&d->arch.pv_domain.e820_lock);
+        spin_lock(&d->arch.e820_lock);
 
         /* Backwards compatibility. */
-        if ( (d->arch.pv_domain.nr_e820 == 0) ||
-             (d->arch.pv_domain.e820 == NULL) )
+        if ( (d->arch.nr_e820 == 0) ||
+             (d->arch.e820 == NULL) )
         {
-            spin_unlock(&d->arch.pv_domain.e820_lock);
+            spin_unlock(&d->arch.e820_lock);
             return -ENOSYS;
         }
 
-        map.nr_entries = min(map.nr_entries, d->arch.pv_domain.nr_e820);
-        if ( copy_to_guest(map.buffer, d->arch.pv_domain.e820,
+        map.nr_entries = min(map.nr_entries, d->arch.nr_e820);
+        if ( copy_to_guest(map.buffer, d->arch.e820,
                            map.nr_entries) ||
              __copy_to_guest(arg, &map, 1) )
         {
-            spin_unlock(&d->arch.pv_domain.e820_lock);
+            spin_unlock(&d->arch.e820_lock);
             return -EFAULT;
         }
 
-        spin_unlock(&d->arch.pv_domain.e820_lock);
+        spin_unlock(&d->arch.e820_lock);
         return 0;
     }
 
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 83fbe58..1d5783f 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -234,11 +234,6 @@ struct pv_domain
 
     /* map_domain_page() mapping cache. */
     struct mapcache_domain mapcache;
-
-    /* Pseudophysical e820 map (XENMEM_memory_map).  */
-    spinlock_t e820_lock;
-    struct e820entry *e820;
-    unsigned int nr_e820;
 };
 
 struct arch_domain
@@ -313,6 +308,11 @@ struct arch_domain
                                 (possibly other cases in the future */
     uint64_t vtsc_kerncount; /* for hvm, counts all vtsc */
     uint64_t vtsc_usercount; /* not used for hvm */
+
+    /* Pseudophysical e820 map (XENMEM_memory_map).  */
+    spinlock_t e820_lock;
+    struct e820entry *e820;
+    unsigned int nr_e820;
 } __cacheline_aligned;
 
 #define has_arch_pdevs(d)    (!list_empty(&(d)->arch.pdev_list))
-- 
1.7.2.3


* [PATCH 07/20] PVH xen: Introduce PVH guest type
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (5 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 06/20] PVH xen: Move e820 fields out of pv_domain struct Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 08/20] PVH xen: tools changes to create PVH domain Mukesh Rathor
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

This patch introduces the concept of a PVH guest. There are also other basic
changes, like creating macros to check for a PVH vcpu/domain, and new macros
to see whether a domain/vcpu is PV/PVH/HVM. Also, the copy macros are
modified to include PVH. Lastly, PVH uses HVM-style event delivery.

Changes in V2:
  - make is_pvh/is_hvm an enum instead of adding is_pvh as a new flag.
  - fix indentation and spacing in the guest_kernel_mode macro.
  - add a debug-only BUG() in the GUEST_KERNEL_RPL macro, as it should no
    longer be called in any PVH path.

Changes in V3:
  - Rename the enum fields, and add is_pv to it.
  - Get rid of the is_hvm_or_pvh_* macros.

Changes in V4:
  - Move e820 fields out of pv_domain struct.

Changes in V5:
  - Move the e820 changes from V4 above to a separate patch.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/debug.c               |    2 +-
 xen/arch/x86/domain.c              |    7 +++++++
 xen/common/domain.c                |    2 +-
 xen/include/asm-x86/desc.h         |    5 +++++
 xen/include/asm-x86/domain.h       |    2 +-
 xen/include/asm-x86/event.h        |    2 +-
 xen/include/asm-x86/guest_access.h |   12 ++++++------
 xen/include/asm-x86/x86_64/regs.h  |    9 +++++----
 xen/include/public/domctl.h        |    3 +++
 xen/include/xen/sched.h            |   21 ++++++++++++++++++---
 10 files changed, 48 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/debug.c b/xen/arch/x86/debug.c
index e67473e..167421d 100644
--- a/xen/arch/x86/debug.c
+++ b/xen/arch/x86/debug.c
@@ -158,7 +158,7 @@ dbg_rw_guest_mem(dbgva_t addr, dbgbyte_t *buf, int len, struct domain *dp,
 
         pagecnt = min_t(long, PAGE_SIZE - (addr & ~PAGE_MASK), len);
 
-        mfn = (dp->is_hvm
+        mfn = (!is_pv_domain(dp)
                ? dbg_hvm_va2mfn(addr, dp, toaddr, &gfn)
                : dbg_pv_va2mfn(addr, dp, pgd3));
         if ( mfn == INVALID_MFN ) 
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 8b04339..80ff4a3 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -648,6 +648,13 @@ int arch_set_info_guest(
     unsigned int i;
     int rc = 0, compat;
 
+    /* To be removed when all patches are checked in and PVH is done. */
+    if ( is_pvh_vcpu(v) )
+    {
+        printk("PVH: You don't have the correct xen version for PVH\n");
+        return -EINVAL;
+    }
+
     /* The context is a compat-mode one if the target domain is compat-mode;
      * we expect the tools to DTRT even in compat-mode callers. */
     compat = is_pv_32on64_domain(d);
diff --git a/xen/common/domain.c b/xen/common/domain.c
index b5d44d4..10e1cab 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -234,7 +234,7 @@ struct domain *domain_create(
         goto fail;
 
     if ( domcr_flags & DOMCRF_hvm )
-        d->is_hvm = 1;
+        d->guest_type = is_hvm;
 
     if ( domid == 0 )
     {
diff --git a/xen/include/asm-x86/desc.h b/xen/include/asm-x86/desc.h
index 354b889..4dca0a3 100644
--- a/xen/include/asm-x86/desc.h
+++ b/xen/include/asm-x86/desc.h
@@ -38,7 +38,12 @@
 
 #ifndef __ASSEMBLY__
 
+#ifndef NDEBUG
+#define GUEST_KERNEL_RPL(d) (is_pvh_domain(d) ? ({ BUG(); 0; }) :    \
+                                                is_pv_32bit_domain(d) ? 1 : 3)
+#else
 #define GUEST_KERNEL_RPL(d) (is_pv_32bit_domain(d) ? 1 : 3)
+#endif
 
 /* Fix up the RPL of a guest segment selector. */
 #define __fixup_guest_selector(d, sel)                             \
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 1d5783f..a4e6dee 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -16,7 +16,7 @@
 #define is_pv_32on64_domain(d) (is_pv_32bit_domain(d))
 #define is_pv_32on64_vcpu(v)   (is_pv_32on64_domain((v)->domain))
 
-#define is_hvm_pv_evtchn_domain(d) (is_hvm_domain(d) && \
+#define is_hvm_pv_evtchn_domain(d) (!is_pv_domain(d) && \
         d->arch.hvm_domain.irq.callback_via_type == HVMIRQ_callback_vector)
 #define is_hvm_pv_evtchn_vcpu(v) (is_hvm_pv_evtchn_domain(v->domain))
 
diff --git a/xen/include/asm-x86/event.h b/xen/include/asm-x86/event.h
index 06057c7..7ed5812 100644
--- a/xen/include/asm-x86/event.h
+++ b/xen/include/asm-x86/event.h
@@ -18,7 +18,7 @@ int hvm_local_events_need_delivery(struct vcpu *v);
 static inline int local_events_need_delivery(void)
 {
     struct vcpu *v = current;
-    return (is_hvm_vcpu(v) ? hvm_local_events_need_delivery(v) :
+    return (!is_pv_vcpu(v) ? hvm_local_events_need_delivery(v) :
             (vcpu_info(v, evtchn_upcall_pending) &&
              !vcpu_info(v, evtchn_upcall_mask)));
 }
diff --git a/xen/include/asm-x86/guest_access.h b/xen/include/asm-x86/guest_access.h
index ca700c9..675dda1 100644
--- a/xen/include/asm-x86/guest_access.h
+++ b/xen/include/asm-x86/guest_access.h
@@ -14,27 +14,27 @@
 
 /* Raw access functions: no type checking. */
 #define raw_copy_to_guest(dst, src, len)        \
-    (is_hvm_vcpu(current) ?                     \
+    (!is_pv_vcpu(current) ?                     \
      copy_to_user_hvm((dst), (src), (len)) :    \
      copy_to_user((dst), (src), (len)))
 #define raw_copy_from_guest(dst, src, len)      \
-    (is_hvm_vcpu(current) ?                     \
+    (!is_pv_vcpu(current) ?                     \
      copy_from_user_hvm((dst), (src), (len)) :  \
      copy_from_user((dst), (src), (len)))
 #define raw_clear_guest(dst,  len)              \
-    (is_hvm_vcpu(current) ?                     \
+    (!is_pv_vcpu(current) ?                     \
      clear_user_hvm((dst), (len)) :             \
      clear_user((dst), (len)))
 #define __raw_copy_to_guest(dst, src, len)      \
-    (is_hvm_vcpu(current) ?                     \
+    (!is_pv_vcpu(current) ?                     \
      copy_to_user_hvm((dst), (src), (len)) :    \
      __copy_to_user((dst), (src), (len)))
 #define __raw_copy_from_guest(dst, src, len)    \
-    (is_hvm_vcpu(current) ?                     \
+    (!is_pv_vcpu(current) ?                     \
      copy_from_user_hvm((dst), (src), (len)) :  \
      __copy_from_user((dst), (src), (len)))
 #define __raw_clear_guest(dst,  len)            \
-    (is_hvm_vcpu(current) ?                     \
+    (!is_pv_vcpu(current) ?                     \
      clear_user_hvm((dst), (len)) :             \
      clear_user((dst), (len)))
 
diff --git a/xen/include/asm-x86/x86_64/regs.h b/xen/include/asm-x86/x86_64/regs.h
index 3cdc702..bb475cf 100644
--- a/xen/include/asm-x86/x86_64/regs.h
+++ b/xen/include/asm-x86/x86_64/regs.h
@@ -10,10 +10,11 @@
 #define ring_2(r)    (((r)->cs & 3) == 2)
 #define ring_3(r)    (((r)->cs & 3) == 3)
 
-#define guest_kernel_mode(v, r)                                 \
-    (!is_pv_32bit_vcpu(v) ?                                     \
-     (ring_3(r) && ((v)->arch.flags & TF_kernel_mode)) :        \
-     (ring_1(r)))
+#define guest_kernel_mode(v, r)                                   \
+    (is_pvh_vcpu(v) ? ({ ASSERT(v == current); ring_0(r); }) :    \
+     (!is_pv_32bit_vcpu(v) ?                                      \
+      (ring_3(r) && ((v)->arch.flags & TF_kernel_mode)) :         \
+      (ring_1(r))))
 
 #define permit_softint(dpl, v, r) \
     ((dpl) >= (guest_kernel_mode(v, r) ? 1 : 3))
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 4c5b2bb..6b1aa11 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -89,6 +89,9 @@ struct xen_domctl_getdomaininfo {
  /* Being debugged.  */
 #define _XEN_DOMINF_debugged  6
 #define XEN_DOMINF_debugged   (1U<<_XEN_DOMINF_debugged)
+/* domain is PVH */
+#define _XEN_DOMINF_pvh_guest 7
+#define XEN_DOMINF_pvh_guest   (1U<<_XEN_DOMINF_pvh_guest)
  /* XEN_DOMINF_shutdown guest-supplied code.  */
 #define XEN_DOMINF_shutdownmask 255
 #define XEN_DOMINF_shutdownshift 16
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 41f749e..516704e 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -235,6 +235,14 @@ struct mem_event_per_domain
     struct mem_event_domain access;
 };
 
+/*
+ * PVH is a PV guest running in an HVM container. While is_hvm is false
+ * for it, it uses many of the HVM data structs.
+ */
+enum guest_type {
+    is_pv, is_pvh, is_hvm
+};
+
 struct domain
 {
     domid_t          domain_id;
@@ -282,8 +290,8 @@ struct domain
     struct rangeset *iomem_caps;
     struct rangeset *irq_caps;
 
-    /* Is this an HVM guest? */
-    bool_t           is_hvm;
+    enum guest_type guest_type;
+
 #ifdef HAS_PASSTHROUGH
     /* Does this guest need iommu mappings? */
     bool_t           need_iommu;
@@ -461,6 +469,9 @@ struct domain *domain_create(
  /* DOMCRF_oos_off: dont use out-of-sync optimization for shadow page tables */
 #define _DOMCRF_oos_off         4
 #define DOMCRF_oos_off          (1U<<_DOMCRF_oos_off)
+ /* DOMCRF_pvh: Create PV domain in HVM container */
+#define _DOMCRF_pvh            5
+#define DOMCRF_pvh             (1U<<_DOMCRF_pvh)
 
 /*
  * rcu_lock_domain_by_id() is more efficient than get_domain_by_id().
@@ -735,8 +746,12 @@ void watchdog_domain_destroy(struct domain *d);
 
 #define VM_ASSIST(_d,_t) (test_bit((_t), &(_d)->vm_assist))
 
-#define is_hvm_domain(d) ((d)->is_hvm)
+#define is_pv_domain(d) ((d)->guest_type == is_pv)
+#define is_pv_vcpu(v)   (is_pv_domain(v->domain))
+#define is_hvm_domain(d) ((d)->guest_type == is_hvm)
 #define is_hvm_vcpu(v)   (is_hvm_domain(v->domain))
+#define is_pvh_domain(d) ((d)->guest_type == is_pvh)
+#define is_pvh_vcpu(v)   (is_pvh_domain(v->domain))
 #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \
                            cpumask_weight((v)->cpu_affinity) == 1)
 #ifdef HAS_PASSTHROUGH
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 08/20] PVH xen: tools changes to create PVH domain
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (6 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 07/20] PVH xen: Introduce PVH guest type Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 09/20] PVH xen: domain creation code changes Mukesh Rathor
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

This patch contains tools changes for PVH. For now, only one mode is
supported/tested:
    dom0> losetup /dev/loop1 guest.img
    dom0> In vm.cfg file: disk = ['phy:/dev/loop1,xvda,w']
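
As an aside, a complete vm.cfg for a PVH guest under this series might look
like the sketch below (kernel path, memory and vcpu values are made up for
illustration; pvh requires hap, as enforced in xl_cmdimpl.c in this patch):

    kernel = "/boot/vmlinuz-pvh"   # hypothetical PVH-capable kernel
    memory = 1024
    vcpus  = 2
    pvh    = 1
    hap    = 1
    disk   = ['phy:/dev/loop1,xvda,w']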

Changes in V2: None
Changes in V3:
  - Document pvh boolean flag in xl.cfg.pod.5
  - Rename ci_pvh and bi_pvh to pvh, and domcr_is_pvh to pvh_enabled.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 docs/man/xl.cfg.pod.5             |    3 +++
 tools/debugger/gdbsx/xg/xg_main.c |    4 +++-
 tools/libxc/xc_dom.h              |    1 +
 tools/libxc/xc_dom_x86.c          |    7 ++++---
 tools/libxl/libxl_create.c        |    2 ++
 tools/libxl/libxl_dom.c           |   18 +++++++++++++++++-
 tools/libxl/libxl_types.idl       |    2 ++
 tools/libxl/libxl_x86.c           |    4 +++-
 tools/libxl/xl_cmdimpl.c          |   11 +++++++++++
 tools/xenstore/xenstored_domain.c |   12 +++++++-----
 10 files changed, 53 insertions(+), 11 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index f8b4576..17c5679 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -620,6 +620,9 @@ if your particular guest kernel does not require this behaviour then
 it is safe to allow this to be enabled but you may wish to disable it
 anyway.
 
+=item B<pvh=BOOLEAN>
+Selects whether to run this guest in an HVM container. Default is 0.
+
 =back
 
 =head2 Fully-virtualised (HVM) Guest Specific Options
diff --git a/tools/debugger/gdbsx/xg/xg_main.c b/tools/debugger/gdbsx/xg/xg_main.c
index 64c7484..5736b86 100644
--- a/tools/debugger/gdbsx/xg/xg_main.c
+++ b/tools/debugger/gdbsx/xg/xg_main.c
@@ -81,6 +81,7 @@ int xgtrc_on = 0;
 struct xen_domctl domctl;         /* just use a global domctl */
 
 static int     _hvm_guest;        /* hvm guest? 32bit HVMs have 64bit context */
+static int     _pvh_guest;        /* PV guest in HVM container */
 static domid_t _dom_id;           /* guest domid */
 static int     _max_vcpu_id;      /* thus max_vcpu_id+1 VCPUs */
 static int     _dom0_fd;          /* fd of /dev/privcmd */
@@ -309,6 +310,7 @@ xg_attach(int domid, int guest_bitness)
 
     _max_vcpu_id = domctl.u.getdomaininfo.max_vcpu_id;
     _hvm_guest = (domctl.u.getdomaininfo.flags & XEN_DOMINF_hvm_guest);
+    _pvh_guest = (domctl.u.getdomaininfo.flags & XEN_DOMINF_pvh_guest);
     return _max_vcpu_id;
 }
 
@@ -369,7 +371,7 @@ _change_TF(vcpuid_t which_vcpu, int guest_bitness, int setit)
     int sz = sizeof(anyc);
 
     /* first try the MTF for hvm guest. otherwise do manually */
-    if (_hvm_guest) {
+    if (_hvm_guest || _pvh_guest) {
         domctl.u.debug_op.vcpu = which_vcpu;
         domctl.u.debug_op.op = setit ? XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_ON :
                                        XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_OFF;
diff --git a/tools/libxc/xc_dom.h b/tools/libxc/xc_dom.h
index ac36600..8b43d2b 100644
--- a/tools/libxc/xc_dom.h
+++ b/tools/libxc/xc_dom.h
@@ -130,6 +130,7 @@ struct xc_dom_image {
     domid_t console_domid;
     domid_t xenstore_domid;
     xen_pfn_t shared_info_mfn;
+    int pvh_enabled;
 
     xc_interface *xch;
     domid_t guest_domid;
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index f1be43b..24f6759 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -389,7 +389,8 @@ static int setup_pgtables_x86_64(struct xc_dom_image *dom)
         pgpfn = (addr - dom->parms.virt_base) >> PAGE_SHIFT_X86;
         l1tab[l1off] =
             pfn_to_paddr(xc_dom_p2m_guest(dom, pgpfn)) | L1_PROT;
-        if ( (addr >= dom->pgtables_seg.vstart) && 
+        if ( (!dom->pvh_enabled)                &&
+             (addr >= dom->pgtables_seg.vstart) &&
              (addr < dom->pgtables_seg.vend) )
             l1tab[l1off] &= ~_PAGE_RW; /* page tables are r/o */
         if ( l1off == (L1_PAGETABLE_ENTRIES_X86_64 - 1) )
@@ -706,7 +707,7 @@ int arch_setup_meminit(struct xc_dom_image *dom)
     rc = x86_compat(dom->xch, dom->guest_domid, dom->guest_type);
     if ( rc )
         return rc;
-    if ( xc_dom_feature_translated(dom) )
+    if ( xc_dom_feature_translated(dom) && !dom->pvh_enabled )
     {
         dom->shadow_enabled = 1;
         rc = x86_shadow(dom->xch, dom->guest_domid);
@@ -832,7 +833,7 @@ int arch_setup_bootlate(struct xc_dom_image *dom)
         }
 
         /* Map grant table frames into guest physmap. */
-        for ( i = 0; ; i++ )
+        for ( i = 0; !dom->pvh_enabled; i++ )
         {
             rc = xc_domain_add_to_physmap(dom->xch, dom->guest_domid,
                                           XENMAPSPACE_grant_table,
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index cb9c822..83e2d5b 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -421,6 +421,8 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_create_info *info,
         flags |= XEN_DOMCTL_CDF_hvm_guest;
         flags |= libxl_defbool_val(info->hap) ? XEN_DOMCTL_CDF_hap : 0;
         flags |= libxl_defbool_val(info->oos) ? 0 : XEN_DOMCTL_CDF_oos_off;
+    } else if ( libxl_defbool_val(info->pvh) ) {
+        flags |= XEN_DOMCTL_CDF_hap;
     }
     *domid = -1;
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index b38d0a7..cefbf76 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -329,9 +329,23 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
     struct xc_dom_image *dom;
     int ret;
     int flags = 0;
+    int is_pvh = libxl_defbool_val(info->pvh);
 
     xc_dom_loginit(ctx->xch);
 
+    if (is_pvh) {
+        char *pv_feats = "writable_descriptor_tables|auto_translated_physmap"
+                         "|supervisor_mode_kernel|hvm_callback_vector";
+
+        if (info->u.pv.features && info->u.pv.features[0] != '\0')
+        {
+            LOG(ERROR, "Didn't expect info->u.pv.features to contain string\n");
+            LOG(ERROR, "String: %s\n", info->u.pv.features);
+            return ERROR_FAIL;
+        }
+        info->u.pv.features = strdup(pv_feats);
+    }
+
     dom = xc_dom_allocate(ctx->xch, state->pv_cmdline, info->u.pv.features);
     if (!dom) {
         LOGE(ERROR, "xc_dom_allocate failed");
@@ -370,6 +384,7 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
     }
 
     dom->flags = flags;
+    dom->pvh_enabled = is_pvh;
     dom->console_evtchn = state->console_port;
     dom->console_domid = state->console_domid;
     dom->xenstore_evtchn = state->store_port;
@@ -400,7 +415,8 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
         LOGE(ERROR, "xc_dom_boot_image failed");
         goto out;
     }
-    if ( (ret = xc_dom_gnttab_init(dom)) != 0 ) {
+    /* PVH sets up its own grant tables during boot via HVM mechanisms */
+    if ( !is_pvh && (ret = xc_dom_gnttab_init(dom)) != 0 ) {
         LOGE(ERROR, "xc_dom_gnttab_init failed");
         goto out;
     }
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index ecf1f0b..2599e01 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -245,6 +245,7 @@ libxl_domain_create_info = Struct("domain_create_info",[
     ("platformdata", libxl_key_value_list),
     ("poolid",       uint32),
     ("run_hotplug_scripts",libxl_defbool),
+    ("pvh",          libxl_defbool),
     ], dir=DIR_IN)
 
 MemKB = UInt(64, init_val = "LIBXL_MEMKB_DEFAULT")
@@ -346,6 +347,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                       ])),
                  ("invalid", Struct(None, [])),
                  ], keyvar_init_val = "LIBXL_DOMAIN_TYPE_INVALID")),
+    ("pvh",       libxl_defbool),
     ], dir=DIR_IN
 )
 
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index a17f6ae..424bc68 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -290,7 +290,9 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
     if (rtc_timeoffset)
         xc_domain_set_time_offset(ctx->xch, domid, rtc_timeoffset);
 
-    if (d_config->b_info.type == LIBXL_DOMAIN_TYPE_HVM) {
+    if (d_config->b_info.type == LIBXL_DOMAIN_TYPE_HVM ||
+        libxl_defbool_val(d_config->b_info.pvh)) {
+
         unsigned long shadow;
         shadow = (d_config->b_info.shadow_memkb + 1023) / 1024;
         xc_shadow_control(ctx->xch, domid, XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION, NULL, 0, &shadow, 0, NULL);
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index c1a969b..3ee7593 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -610,8 +610,18 @@ static void parse_config_data(const char *config_source,
         !strncmp(buf, "hvm", strlen(buf)))
         c_info->type = LIBXL_DOMAIN_TYPE_HVM;
 
+    libxl_defbool_setdefault(&c_info->pvh, false);
+    libxl_defbool_setdefault(&c_info->hap, false);
+    xlu_cfg_get_defbool(config, "pvh", &c_info->pvh, 0);
     xlu_cfg_get_defbool(config, "hap", &c_info->hap, 0);
 
+    if (libxl_defbool_val(c_info->pvh) &&
+        !libxl_defbool_val(c_info->hap)) {
+
+        fprintf(stderr, "hap is required for PVH domain\n");
+        exit(1);
+    }
+
     if (xlu_cfg_replace_string (config, "name", &c_info->name, 0)) {
         fprintf(stderr, "Domain name must be specified.\n");
         exit(1);
@@ -918,6 +928,7 @@ static void parse_config_data(const char *config_source,
 
         b_info->u.pv.cmdline = cmdline;
         xlu_cfg_replace_string (config, "ramdisk", &b_info->u.pv.ramdisk, 0);
+        libxl_defbool_set(&b_info->pvh, libxl_defbool_val(c_info->pvh));
         break;
     }
     default:
diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index bf83d58..10c23a1 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -168,13 +168,15 @@ static int readchn(struct connection *conn, void *data, unsigned int len)
 static void *map_interface(domid_t domid, unsigned long mfn)
 {
 	if (*xcg_handle != NULL) {
-		/* this is the preferred method */
-		return xc_gnttab_map_grant_ref(*xcg_handle, domid,
+                void *addr;
+                /* this is the preferred method */
+                addr = xc_gnttab_map_grant_ref(*xcg_handle, domid,
 			GNTTAB_RESERVED_XENSTORE, PROT_READ|PROT_WRITE);
-	} else {
-		return xc_map_foreign_range(*xc_handle, domid,
-			getpagesize(), PROT_READ|PROT_WRITE, mfn);
+                if (addr)
+                        return addr;
 	}
+	return xc_map_foreign_range(*xc_handle, domid,
+		        getpagesize(), PROT_READ|PROT_WRITE, mfn);
 }
 
 static void unmap_interface(void *interface)
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 09/20] PVH xen: domain creation code changes
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (7 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 08/20] PVH xen: tools changes to create PVH domain Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 10/20] PVH xen: create PVH vmcs, and also initialization Mukesh Rathor
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

This patch contains changes to arch/x86/domain.c to allow for a PVH
domain.
Changes in V2:
  - changes to read_segment_register() moved to this patch.
  - The other comment was to create NULL functions for pvh_set_vcpu_info
    and pvh_read_descriptor, which are implemented in a later patch. Since
    PVH creation is disabled until all patches are checked in, this is not
    strictly needed, but it helps break the patches down.

Changes in V3:
  - Fix read_segment_register() macro to make sure args are evaluated once,
    and use # instead of STR for name in the macro.

Changes in V4:
  - Remove pvh substruct in the hvm substruct, as the vcpu_info_mfn has been
    moved out of pv_vcpu struct.
  - rename hvm_pvh_* functions to hvm_*.

Changes in V5:
  - remove pvh_read_descriptor().
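
To illustrate the read_segment_register() change (a sketch, not part of the
patch): for a PV vcpu the selector comes from the live register, while for
PVH it comes from the saved frame.

    /* sketch: reading the guest %ds selector in a context-switch path */
    u16 sel = read_segment_register(v, regs, ds);
    /* PV:  executes "movw %%ds, %0" on the live register */
    /* PVH: returns regs->ds, which Xen saved on the last vmexit */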

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/domain.c         |   69 ++++++++++++++++++++++++++--------------
 xen/arch/x86/mm.c             |    3 ++
 xen/arch/x86/mm/hap/hap.c     |    4 ++-
 xen/include/asm-x86/hvm/hvm.h |    8 +++++
 xen/include/asm-x86/system.h  |   18 ++++++++--
 5 files changed, 73 insertions(+), 29 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 80ff4a3..4883fd1 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -387,7 +387,7 @@ int vcpu_initialise(struct vcpu *v)
 
     v->arch.vcpu_info_mfn = INVALID_MFN;
 
-    if ( is_hvm_domain(d) )
+    if ( !is_pv_domain(d) )
     {
         rc = hvm_vcpu_initialise(v);
         goto done;
@@ -454,7 +454,7 @@ void vcpu_destroy(struct vcpu *v)
 
     vcpu_destroy_fpu(v);
 
-    if ( is_hvm_vcpu(v) )
+    if ( !is_pv_vcpu(v) )
         hvm_vcpu_destroy(v);
     else
         xfree(v->arch.pv_vcpu.trap_ctxt);
@@ -466,7 +466,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
     int rc = -ENOMEM;
 
     d->arch.hvm_domain.hap_enabled =
-        is_hvm_domain(d) &&
+        !is_pv_domain(d) &&
         hvm_funcs.hap_supported &&
         (domcr_flags & DOMCRF_hap);
     d->arch.hvm_domain.mem_sharing_enabled = 0;
@@ -514,7 +514,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
     mapcache_domain_init(d);
 
     HYPERVISOR_COMPAT_VIRT_START(d) =
-        is_hvm_domain(d) ? ~0u : __HYPERVISOR_COMPAT_VIRT_START;
+        is_pv_domain(d) ? __HYPERVISOR_COMPAT_VIRT_START : ~0u;
 
     if ( (rc = paging_domain_init(d, domcr_flags)) != 0 )
         goto fail;
@@ -556,7 +556,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
             goto fail;
     }
 
-    if ( is_hvm_domain(d) )
+    if ( !is_pv_domain(d) )
     {
         if ( (rc = hvm_domain_initialise(d)) != 0 )
         {
@@ -565,12 +565,11 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
         }
     }
     else
-    {
         /* 64-bit PV guest by default. */
         d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
 
+    if ( !is_hvm_domain(d) )
         spin_lock_init(&d->arch.e820_lock);
-    }
 
     /* initialize default tsc behavior in case tools don't */
     tsc_set_info(d, TSC_MODE_DEFAULT, 0UL, 0, 0);
@@ -592,9 +591,10 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
 
 void arch_domain_destroy(struct domain *d)
 {
-    if ( is_hvm_domain(d) )
+    if ( !is_pv_domain(d) )
         hvm_domain_destroy(d);
-    else
+
+    if ( !is_hvm_domain(d) )
         xfree(d->arch.e820);
 
     free_domain_pirqs(d);
@@ -662,7 +662,7 @@ int arch_set_info_guest(
 #define c(fld) (compat ? (c.cmp->fld) : (c.nat->fld))
     flags = c(flags);
 
-    if ( !is_hvm_vcpu(v) )
+    if ( is_pv_vcpu(v) )
     {
         if ( !compat )
         {
@@ -715,7 +715,7 @@ int arch_set_info_guest(
     v->fpu_initialised = !!(flags & VGCF_I387_VALID);
 
     v->arch.flags &= ~TF_kernel_mode;
-    if ( (flags & VGCF_in_kernel) || is_hvm_vcpu(v)/*???*/ )
+    if ( (flags & VGCF_in_kernel) || !is_pv_vcpu(v)/*???*/ )
         v->arch.flags |= TF_kernel_mode;
 
     v->arch.vgc_flags = flags;
@@ -726,7 +726,7 @@ int arch_set_info_guest(
     if ( !compat )
     {
         memcpy(&v->arch.user_regs, &c.nat->user_regs, sizeof(c.nat->user_regs));
-        if ( !is_hvm_vcpu(v) )
+        if ( is_pv_vcpu(v) )
             memcpy(v->arch.pv_vcpu.trap_ctxt, c.nat->trap_ctxt,
                    sizeof(c.nat->trap_ctxt));
     }
@@ -742,10 +742,13 @@ int arch_set_info_guest(
 
     v->arch.user_regs.eflags |= 2;
 
-    if ( is_hvm_vcpu(v) )
+    if ( !is_pv_vcpu(v) )
     {
         hvm_set_info_guest(v);
-        goto out;
+        if ( is_hvm_vcpu(v) || v->is_initialised )
+            goto out;
+        else 
+            goto pvh_skip_pv_stuff;
     }
 
     init_int80_direct_trap(v);
@@ -754,7 +757,10 @@ int arch_set_info_guest(
     v->arch.pv_vcpu.iopl = (v->arch.user_regs.eflags >> 12) & 3;
     v->arch.user_regs.eflags &= ~X86_EFLAGS_IOPL;
 
-    /* Ensure real hardware interrupts are enabled. */
+    /*
+     * Ensure real hardware interrupts are enabled. Note: PVH may not have
+     * an IDT set up on all vcpus, so we don't enable IF for it yet.
+     */
     v->arch.user_regs.eflags |= X86_EFLAGS_IF;
 
     if ( !v->is_initialised )
@@ -856,6 +862,7 @@ int arch_set_info_guest(
 
     set_bit(_VPF_in_reset, &v->pause_flags);
 
+pvh_skip_pv_stuff:
     if ( !compat )
         cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[3]);
     else
@@ -864,7 +871,7 @@ int arch_set_info_guest(
 
     if ( !cr3_page )
         rc = -EINVAL;
-    else if ( paging_mode_refcounts(d) )
+    else if ( paging_mode_refcounts(d) || is_pvh_vcpu(v) )
         /* nothing */;
     else if ( cr3_page == v->arch.old_guest_table )
     {
@@ -890,8 +897,15 @@ int arch_set_info_guest(
         /* handled below */;
     else if ( !compat )
     {
+        /* PVH 32bitfixme */
+        if ( is_pvh_vcpu(v) )
+        {
+            v->arch.cr3 = page_to_mfn(cr3_page);
+            v->arch.hvm_vcpu.guest_cr[3] = c.nat->ctrlreg[3];
+        }
+
         v->arch.guest_table = pagetable_from_page(cr3_page);
-        if ( c.nat->ctrlreg[1] )
+        if ( c.nat->ctrlreg[1] && !is_pvh_vcpu(v) )
         {
             cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[1]);
             cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
@@ -946,6 +960,13 @@ int arch_set_info_guest(
 
     update_cr3(v);
 
+    if ( is_pvh_vcpu(v) )
+    {
+        /* guest is bringing up non-boot SMP vcpu */
+        if ( (rc=hvm_set_vcpu_info(v, c.nat)) != 0 )
+            return rc;
+    }
+
  out:
     if ( flags & VGCF_online )
         clear_bit(_VPF_down, &v->pause_flags);
@@ -1450,7 +1471,7 @@ static void update_runstate_area(struct vcpu *v)
 
 static inline int need_full_gdt(struct vcpu *v)
 {
-    return (!is_hvm_vcpu(v) && !is_idle_vcpu(v));
+    return (is_pv_vcpu(v) && !is_idle_vcpu(v));
 }
 
 static void __context_switch(void)
@@ -1584,7 +1605,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
         /* Re-enable interrupts before restoring state which may fault. */
         local_irq_enable();
 
-        if ( !is_hvm_vcpu(next) )
+        if ( is_pv_vcpu(next) )
         {
             load_LDT(next);
             load_segments(next);
@@ -1707,12 +1728,12 @@ unsigned long hypercall_create_continuation(
         regs->eax  = op;
 
         /* Ensure the hypercall trap instruction is re-executed. */
-        if ( !is_hvm_vcpu(current) )
+        if ( is_pv_vcpu(current) )
             regs->eip -= 2;  /* re-execute 'syscall' / 'int $xx' */
         else
             current->arch.hvm_vcpu.hcall_preempted = 1;
 
-        if ( !is_hvm_vcpu(current) ?
+        if ( is_pv_vcpu(current) ?
              !is_pv_32on64_vcpu(current) :
              (hvm_guest_x86_mode(current) == 8) )
         {
@@ -1982,7 +2003,7 @@ int domain_relinquish_resources(struct domain *d)
             unmap_vcpu_info(v);
         }
 
-        if ( !is_hvm_domain(d) )
+        if ( is_pv_domain(d) )
         {
             for_each_vcpu ( d, v )
             {
@@ -2055,7 +2076,7 @@ int domain_relinquish_resources(struct domain *d)
         BUG();
     }
 
-    if ( is_hvm_domain(d) )
+    if ( !is_pv_domain(d) )
         hvm_domain_relinquish_resources(d);
 
     return 0;
@@ -2139,7 +2160,7 @@ void vcpu_mark_events_pending(struct vcpu *v)
     if ( already_pending )
         return;
 
-    if ( is_hvm_vcpu(v) )
+    if ( !is_pv_vcpu(v) )
         hvm_assert_evtchn_irq(v);
     else
         vcpu_kick(v);
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 60f1a4f..ef37053 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4330,6 +4330,9 @@ void destroy_gdt(struct vcpu *v)
     int i;
     unsigned long pfn;
 
+    if ( is_pvh_vcpu(v) )
+        return;
+
     v->arch.pv_vcpu.gdt_ents = 0;
     pl1e = gdt_ldt_ptes(v->domain, v);
     for ( i = 0; i < FIRST_RESERVED_GDT_PAGE; i++ )
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index bff05d9..5aa0852 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -639,7 +639,9 @@ static void hap_update_cr3(struct vcpu *v, int do_locking)
 const struct paging_mode *
 hap_paging_get_mode(struct vcpu *v)
 {
-    return !hvm_paging_enabled(v)   ? &hap_paging_real_mode :
+    /* PVH 32bitfixme */
+    return is_pvh_vcpu(v) ? &hap_paging_long_mode :
+        !hvm_paging_enabled(v)   ? &hap_paging_real_mode :
         hvm_long_mode_enabled(v) ? &hap_paging_long_mode :
         hvm_pae_enabled(v)       ? &hap_paging_pae_mode  :
                                    &hap_paging_protected_mode;
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 8408420..9b5fa5b 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -192,6 +192,8 @@ struct hvm_function_table {
                                 paddr_t *L1_gpa, unsigned int *page_order,
                                 uint8_t *p2m_acc, bool_t access_r,
                                 bool_t access_w, bool_t access_x);
+    /* PVH functions */
+    int (*pvh_set_vcpu_info)(struct vcpu *v, struct vcpu_guest_context *ctxtp);
 };
 
 extern struct hvm_function_table hvm_funcs;
@@ -325,6 +327,12 @@ static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
     return hvm_funcs.get_shadow_gs_base(v);
 }
 
+static inline int hvm_set_vcpu_info(struct vcpu *v,
+                                    struct vcpu_guest_context *ctxtp)
+{
+    return hvm_funcs.pvh_set_vcpu_info(v, ctxtp);
+}
+
 #define is_viridian_domain(_d)                                             \
  (is_hvm_domain(_d) && ((_d)->arch.hvm_domain.params[HVM_PARAM_VIRIDIAN]))
 
diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
index d8dc6f2..7780c16 100644
--- a/xen/include/asm-x86/system.h
+++ b/xen/include/asm-x86/system.h
@@ -4,10 +4,20 @@
 #include <xen/lib.h>
 #include <asm/bitops.h>
 
-#define read_segment_register(vcpu, regs, name)                 \
-({  u16 __sel;                                                  \
-    asm volatile ( "movw %%" STR(name) ",%0" : "=r" (__sel) );  \
-    __sel;                                                      \
+/*
+ * We need the vcpu because, during a context switch from pure PV to PVH,
+ * save_segments() runs after current has been updated to next and thus no
+ * longer points to the pure PV vcpu. PVH updates regs->selectors on vmexit.
+ */
+#define read_segment_register(vcpu, regs, name)                   \
+({  u16 __sel;                                                    \
+    struct cpu_user_regs *_regs = (regs);                         \
+                                                                  \
+    if ( is_pvh_vcpu(vcpu) )                                      \
+        __sel = _regs->name;                                      \
+    else                                                          \
+        asm volatile ( "movw %%" #name ",%0" : "=r" (__sel) );    \
+    __sel;                                                        \
 })
 
 #define wbinvd() \
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 10/20] PVH xen: create PVH vmcs, and also initialization
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (8 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 09/20] PVH xen: domain creation code changes Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 11/20] PVH xen: introduce pvh.c Mukesh Rathor
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

This patch mainly contains code to create a VMCS for a PVH guest, plus
HVM-specific vcpu/domain creation code.

Changes in V2:
  - Avoid the call to hvm_do_resume() at the call site rather than
    returning early inside it.
  - Return for PVH in vmx_do_resume() prior to the Intel debugger code.

Changes in V3:
  - Cleanup pvh_construct_vmcs().
  - Fix formatting in few places, adding XENLOG_G_ERR to printing.
  - Do not load the CS selector for PVH here, but try to do that in Linux.

Changes in V4:
  - Remove VM_ENTRY_LOAD_DEBUG_CTLS clearing.
  - Add 32bit kernel changes mark.
  - Verify pit_init call for PVH.

Changes in V5:
  - Formatting; remove the unnecessary variable guest_pat.
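
For reference, the PVH-relevant execution-control choices made in
pvh_construct_vmcs() boil down to the following (a condensed sketch of code
already in this patch, not additional logic):

    /* CR3 accesses and INVLPG do not cause vmexits: the guest owns its
     * page tables under HAP/EPT */
    v->arch.hvm_vmx.exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
                                      CPU_BASED_CR3_STORE_EXITING |
                                      CPU_BASED_INVLPG_EXITING);
    /* all MSRs are intercepted by default via the all-ones MSR bitmap */
    v->arch.hvm_vmx.exec_control |= CPU_BASED_ACTIVATE_MSR_BITMAP;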

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/hvm/hvm.c      |   94 ++++++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c |  312 ++++++++++++++++++++++++++++++++++++++----
 xen/arch/x86/hvm/vmx/vmx.c  |   40 ++++++
 3 files changed, 410 insertions(+), 36 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7c3cb15..e103c70 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -510,6 +510,30 @@ static int hvm_print_line(
     return X86EMUL_OKAY;
 }
 
+static int pvh_dom_initialise(struct domain *d)
+{
+    int rc;
+
+    if ( !d->arch.hvm_domain.hap_enabled )
+        return -EINVAL;
+
+    spin_lock_init(&d->arch.hvm_domain.irq_lock);
+
+    hvm_init_cacheattr_region_list(d);
+
+    if ( (rc = paging_enable(d, PG_refcounts|PG_translate|PG_external)) != 0 )
+        goto fail1;
+
+    if ( (rc = hvm_funcs.domain_initialise(d)) != 0 )
+        goto fail1;
+
+    return 0;
+
+fail1:
+    hvm_destroy_cacheattr_region_list(d);
+    return rc;
+}
+
 int hvm_domain_initialise(struct domain *d)
 {
     int rc;
@@ -520,6 +544,8 @@ int hvm_domain_initialise(struct domain *d)
                  "on a non-VT/AMDV platform.\n");
         return -EINVAL;
     }
+    if ( is_pvh_domain(d) )
+        return pvh_dom_initialise(d);
 
     spin_lock_init(&d->arch.hvm_domain.pbuf_lock);
     spin_lock_init(&d->arch.hvm_domain.irq_lock);
@@ -584,6 +610,11 @@ int hvm_domain_initialise(struct domain *d)
 
 void hvm_domain_relinquish_resources(struct domain *d)
 {
+    if ( is_pvh_domain(d) )
+    {
+        pit_deinit(d);
+        return;
+    }
     if ( hvm_funcs.nhvm_domain_relinquish_resources )
         hvm_funcs.nhvm_domain_relinquish_resources(d);
 
@@ -609,10 +640,14 @@ void hvm_domain_relinquish_resources(struct domain *d)
 void hvm_domain_destroy(struct domain *d)
 {
     hvm_funcs.domain_destroy(d);
+    hvm_destroy_cacheattr_region_list(d);
+
+    if ( is_pvh_domain(d) )
+        return;
+
     rtc_deinit(d);
     stdvga_deinit(d);
     vioapic_deinit(d);
-    hvm_destroy_cacheattr_region_list(d);
 }
 
 static int hvm_save_tsc_adjust(struct domain *d, hvm_domain_context_t *h)
@@ -1066,14 +1101,46 @@ static int __init __hvm_register_CPU_XSAVE_save_and_restore(void)
 }
 __initcall(__hvm_register_CPU_XSAVE_save_and_restore);
 
+static int pvh_vcpu_initialise(struct vcpu *v)
+{
+    int rc;
+
+    if ( (rc = hvm_funcs.vcpu_initialise(v)) != 0 )
+        return rc;
+
+    softirq_tasklet_init(&v->arch.hvm_vcpu.assert_evtchn_irq_tasklet,
+                         (void(*)(unsigned long))hvm_assert_evtchn_irq,
+                         (unsigned long)v);
+
+    v->arch.hvm_vcpu.hcall_64bit = 1;    /* PVH 32bitfixme */
+    v->arch.user_regs.eflags = 2;
+    v->arch.hvm_vcpu.inject_trap.vector = -1;
+
+    if ( (rc = hvm_vcpu_cacheattr_init(v)) != 0 )
+    {
+        hvm_funcs.vcpu_destroy(v);
+        return rc;
+    }
+    if ( v->vcpu_id == 0 )
+        pit_init(v, cpu_khz);
+
+    return 0;
+}
+
 int hvm_vcpu_initialise(struct vcpu *v)
 {
     int rc;
     struct domain *d = v->domain;
-    domid_t dm_domid = d->arch.hvm_domain.params[HVM_PARAM_DM_DOMAIN];
+    domid_t dm_domid;
 
     hvm_asid_flush_vcpu(v);
 
+    spin_lock_init(&v->arch.hvm_vcpu.tm_lock);
+    INIT_LIST_HEAD(&v->arch.hvm_vcpu.tm_list);
+
+    if ( is_pvh_vcpu(v) )
+        return pvh_vcpu_initialise(v);
+
     if ( (rc = vlapic_init(v)) != 0 )
         goto fail1;
 
@@ -1084,6 +1151,8 @@ int hvm_vcpu_initialise(struct vcpu *v)
          && (rc = nestedhvm_vcpu_initialise(v)) < 0 ) 
         goto fail3;
 
+    dm_domid = d->arch.hvm_domain.params[HVM_PARAM_DM_DOMAIN];
+
     /* Create ioreq event channel. */
     rc = alloc_unbound_xen_event_channel(v, dm_domid, NULL);
     if ( rc < 0 )
@@ -1106,9 +1175,6 @@ int hvm_vcpu_initialise(struct vcpu *v)
         get_ioreq(v)->vp_eport = v->arch.hvm_vcpu.xen_port;
     spin_unlock(&d->arch.hvm_domain.ioreq.lock);
 
-    spin_lock_init(&v->arch.hvm_vcpu.tm_lock);
-    INIT_LIST_HEAD(&v->arch.hvm_vcpu.tm_list);
-
     v->arch.hvm_vcpu.inject_trap.vector = -1;
 
     rc = setup_compat_arg_xlat(v);
@@ -1163,7 +1229,10 @@ void hvm_vcpu_destroy(struct vcpu *v)
 
     tasklet_kill(&v->arch.hvm_vcpu.assert_evtchn_irq_tasklet);
     hvm_vcpu_cacheattr_destroy(v);
-    vlapic_destroy(v);
+
+    if ( !is_pvh_vcpu(v) )
+        vlapic_destroy(v);
+
     hvm_funcs.vcpu_destroy(v);
 
     /* Event channel is already freed by evtchn_destroy(). */
@@ -4511,8 +4580,11 @@ static int hvm_memory_event_traps(long p, uint32_t reason,
     return 1;
 }
 
+/* PVH fixme: add support for monitoring guest behaviour in below functions */
 void hvm_memory_event_cr0(unsigned long value, unsigned long old) 
 {
+    if ( is_pvh_vcpu(current) )
+        return;
     hvm_memory_event_traps(current->domain->arch.hvm_domain
                              .params[HVM_PARAM_MEMORY_EVENT_CR0],
                            MEM_EVENT_REASON_CR0,
@@ -4521,6 +4593,8 @@ void hvm_memory_event_cr0(unsigned long value, unsigned long old)
 
 void hvm_memory_event_cr3(unsigned long value, unsigned long old) 
 {
+    if ( is_pvh_vcpu(current) )
+        return;
     hvm_memory_event_traps(current->domain->arch.hvm_domain
                              .params[HVM_PARAM_MEMORY_EVENT_CR3],
                            MEM_EVENT_REASON_CR3,
@@ -4529,6 +4603,8 @@ void hvm_memory_event_cr3(unsigned long value, unsigned long old)
 
 void hvm_memory_event_cr4(unsigned long value, unsigned long old) 
 {
+    if ( is_pvh_vcpu(current) )
+        return;
     hvm_memory_event_traps(current->domain->arch.hvm_domain
                              .params[HVM_PARAM_MEMORY_EVENT_CR4],
                            MEM_EVENT_REASON_CR4,
@@ -4537,6 +4613,8 @@ void hvm_memory_event_cr4(unsigned long value, unsigned long old)
 
 void hvm_memory_event_msr(unsigned long msr, unsigned long value)
 {
+    if ( is_pvh_vcpu(current) )
+        return;
     hvm_memory_event_traps(current->domain->arch.hvm_domain
                              .params[HVM_PARAM_MEMORY_EVENT_MSR],
                            MEM_EVENT_REASON_MSR,
@@ -4549,6 +4627,8 @@ int hvm_memory_event_int3(unsigned long gla)
     unsigned long gfn;
     gfn = paging_gva_to_gfn(current, gla, &pfec);
 
+    if ( is_pvh_vcpu(current) )
+        return 0;
     return hvm_memory_event_traps(current->domain->arch.hvm_domain
                                     .params[HVM_PARAM_MEMORY_EVENT_INT3],
                                   MEM_EVENT_REASON_INT3,
@@ -4561,6 +4641,8 @@ int hvm_memory_event_single_step(unsigned long gla)
     unsigned long gfn;
     gfn = paging_gva_to_gfn(current, gla, &pfec);
 
+    if ( is_pvh_vcpu(current) )
+        return 0;
     return hvm_memory_event_traps(current->domain->arch.hvm_domain
             .params[HVM_PARAM_MEMORY_EVENT_SINGLE_STEP],
             MEM_EVENT_REASON_SINGLESTEP,
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index ef0ee7f..2ad07fd 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -634,7 +634,7 @@ void vmx_vmcs_exit(struct vcpu *v)
     {
         /* Don't confuse vmx_do_resume (for @v or @current!) */
         vmx_clear_vmcs(v);
-        if ( is_hvm_vcpu(current) )
+        if ( !is_pv_vcpu(current) )
             vmx_load_vmcs(current);
 
         spin_unlock(&v->arch.hvm_vmx.vmcs_lock);
@@ -825,16 +825,285 @@ void virtual_vmcs_vmwrite(void *vvmcs, u32 vmcs_encoding, u64 val)
     virtual_vmcs_exit(vvmcs);
 }
 
-static int construct_vmcs(struct vcpu *v)
+static void vmx_set_common_host_vmcs_fields(struct vcpu *v)
 {
-    struct domain *d = v->domain;
     uint16_t sysenter_cs;
     unsigned long sysenter_eip;
+
+    /* Host data selectors. */
+    __vmwrite(HOST_SS_SELECTOR, __HYPERVISOR_DS);
+    __vmwrite(HOST_DS_SELECTOR, __HYPERVISOR_DS);
+    __vmwrite(HOST_ES_SELECTOR, __HYPERVISOR_DS);
+    __vmwrite(HOST_FS_SELECTOR, 0);
+    __vmwrite(HOST_GS_SELECTOR, 0);
+    __vmwrite(HOST_FS_BASE, 0);
+    __vmwrite(HOST_GS_BASE, 0);
+
+    /* Host control registers. */
+    v->arch.hvm_vmx.host_cr0 = read_cr0() | X86_CR0_TS;
+    __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0);
+    __vmwrite(HOST_CR4,
+              mmu_cr4_features | (xsave_enabled(v) ? X86_CR4_OSXSAVE : 0));
+
+    /* Host CS:RIP. */
+    __vmwrite(HOST_CS_SELECTOR, __HYPERVISOR_CS);
+    __vmwrite(HOST_RIP, (unsigned long)vmx_asm_vmexit_handler);
+
+    /* Host SYSENTER CS:RIP. */
+    rdmsrl(MSR_IA32_SYSENTER_CS, sysenter_cs);
+    __vmwrite(HOST_SYSENTER_CS, sysenter_cs);
+    rdmsrl(MSR_IA32_SYSENTER_EIP, sysenter_eip);
+    __vmwrite(HOST_SYSENTER_EIP, sysenter_eip);
+}
+
+static int pvh_check_requirements(struct vcpu *v)
+{
+    u64 required, tmpval = real_cr4_to_pv_guest_cr4(mmu_cr4_features);
+
+    if ( !paging_mode_hap(v->domain) )
+    {
+        printk(XENLOG_G_INFO "HAP is required for PVH guest.\n");
+        return -EINVAL;
+    }
+    if ( !cpu_has_vmx_pat )
+    {
+        printk(XENLOG_G_INFO "PVH: CPU does not have PAT support\n");
+        return -ENOSYS;
+    }
+    if ( !cpu_has_vmx_msr_bitmap )
+    {
+        printk(XENLOG_G_INFO "PVH: CPU does not have msr bitmap\n");
+        return -ENOSYS;
+    }
+    if ( !cpu_has_vmx_vpid )
+    {
+        printk(XENLOG_G_INFO "PVH: CPU doesn't have VPID support\n");
+        return -ENOSYS;
+    }
+    if ( !cpu_has_vmx_secondary_exec_control )
+    {
+        printk(XENLOG_G_INFO "CPU Secondary exec is required to run PVH\n");
+        return -ENOSYS;
+    }
+
+    if ( v->domain->arch.vtsc )
+    {
+        printk(XENLOG_G_INFO
+                "At present PVH only supports the default timer mode\n");
+        return -ENOSYS;
+    }
+
+    required = X86_CR4_PAE | X86_CR4_VMXE | X86_CR4_OSFXSR;
+    if ( (tmpval & required) != required )
+    {
+        printk(XENLOG_G_INFO "PVH: required CR4 features not available:%lx\n",
+                required);
+        return -ENOSYS;
+    }
+
+    return 0;
+}
+
+static int pvh_construct_vmcs(struct vcpu *v)
+{
+    int rc, msr_type;
+    unsigned long *msr_bitmap;
+    struct domain *d = v->domain;
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    struct ept_data *ept = &p2m->ept;
+    u32 vmexit_ctl = vmx_vmexit_control;
+    u32 vmentry_ctl = vmx_vmentry_control;
+    u64 host_pat, tmpval = -1;
+
+    if ( (rc = pvh_check_requirements(v)) )
+        return rc;
+
+    msr_bitmap = alloc_xenheap_page();
+    if ( msr_bitmap == NULL )
+        return -ENOMEM;
+
+    /* 1. Pin-Based Controls */
+    __vmwrite(PIN_BASED_VM_EXEC_CONTROL, vmx_pin_based_exec_control);
+
+    v->arch.hvm_vmx.exec_control = vmx_cpu_based_exec_control;
+
+    /* 2. Primary Processor-based controls */
+    /*
+     * If rdtsc exiting is turned on and goes through emulate_privileged_op,
+     * then pv_vcpu.ctrlreg must be added to the pvh struct.
+     */
+    v->arch.hvm_vmx.exec_control &= ~CPU_BASED_RDTSC_EXITING;
+    v->arch.hvm_vmx.exec_control &= ~CPU_BASED_USE_TSC_OFFSETING;
+
+    v->arch.hvm_vmx.exec_control &= ~(CPU_BASED_INVLPG_EXITING |
+                                      CPU_BASED_CR3_LOAD_EXITING |
+                                      CPU_BASED_CR3_STORE_EXITING);
+    v->arch.hvm_vmx.exec_control |= CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
+    v->arch.hvm_vmx.exec_control &= ~CPU_BASED_MONITOR_TRAP_FLAG;
+    v->arch.hvm_vmx.exec_control |= CPU_BASED_ACTIVATE_MSR_BITMAP;
+    v->arch.hvm_vmx.exec_control &= ~CPU_BASED_TPR_SHADOW;
+    v->arch.hvm_vmx.exec_control &= ~CPU_BASED_VIRTUAL_NMI_PENDING;
+
+    __vmwrite(CPU_BASED_VM_EXEC_CONTROL, v->arch.hvm_vmx.exec_control);
+
+    /* 3. Secondary Processor-based controls. Intel SDM: all resvd bits are 0 */
+    v->arch.hvm_vmx.secondary_exec_control = SECONDARY_EXEC_ENABLE_EPT;
+    v->arch.hvm_vmx.secondary_exec_control |= SECONDARY_EXEC_ENABLE_VPID;
+    v->arch.hvm_vmx.secondary_exec_control |= SECONDARY_EXEC_PAUSE_LOOP_EXITING;
+
+    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
+              v->arch.hvm_vmx.secondary_exec_control);
+
+    __vmwrite(IO_BITMAP_A, virt_to_maddr((char *)hvm_io_bitmap + 0));
+    __vmwrite(IO_BITMAP_B, virt_to_maddr((char *)hvm_io_bitmap + PAGE_SIZE));
+
+    /* MSR bitmap for intercepts */
+    memset(msr_bitmap, ~0, PAGE_SIZE);
+    v->arch.hvm_vmx.msr_bitmap = msr_bitmap;
+    __vmwrite(MSR_BITMAP, virt_to_maddr(msr_bitmap));
+
+    msr_type = MSR_TYPE_R | MSR_TYPE_W;
+    /* Disable intercepts for MSRs that have corresponding VMCS fields */
+    vmx_disable_intercept_for_msr(v, MSR_FS_BASE, msr_type);
+    vmx_disable_intercept_for_msr(v, MSR_GS_BASE, msr_type);
+    vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_CS, msr_type);
+    vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_ESP, msr_type);
+    vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_EIP, msr_type);
+    vmx_disable_intercept_for_msr(v, MSR_SHADOW_GS_BASE, msr_type);
+    vmx_disable_intercept_for_msr(v, MSR_IA32_CR_PAT, msr_type);
+
+    /* 
+     * We don't disable intercepts for MSRs: MSR_STAR, MSR_LSTAR, MSR_CSTAR,
+     * and MSR_SYSCALL_MASK because we need to specify save/restore area to 
+     * save/restore at every VM exit and entry. Instead, let the intercept 
+     * functions save them into vmx_msr_state fields. See comment in 
+     * vmx_restore_host_msrs(). See also vmx_restore_guest_msrs().
+     */
+    __vmwrite(VM_ENTRY_MSR_LOAD_COUNT, 0);
+    __vmwrite(VM_EXIT_MSR_LOAD_COUNT, 0);
+    __vmwrite(VM_EXIT_MSR_STORE_COUNT, 0);
+
+    __vmwrite(VM_EXIT_CONTROLS, vmexit_ctl);
+
+    /*
+     * Note: we run with the default VM_ENTRY_LOAD_DEBUG_CTLS of 1, meaning
+     * that on vmentry the CPU loads VMCS.DR7 and VMCS.DEBUGCTLS rather than
+     * keeping the host values; 0 would make it ignore the VMCS values.
+     */
+    vmentry_ctl &= ~VM_ENTRY_LOAD_GUEST_EFER;
+    vmentry_ctl &= ~VM_ENTRY_SMM;
+    vmentry_ctl &= ~VM_ENTRY_DEACT_DUAL_MONITOR;
+    /* PVH 32bitfixme */
+    vmentry_ctl |= VM_ENTRY_IA32E_MODE;       /* GUEST_EFER.LME/LMA ignored */
+
+    __vmwrite(VM_ENTRY_CONTROLS, vmentry_ctl);
+
+    vmx_set_common_host_vmcs_fields(v);
+
+    __vmwrite(VM_ENTRY_INTR_INFO, 0);
+    __vmwrite(CR3_TARGET_COUNT, 0);
+    __vmwrite(GUEST_ACTIVITY_STATE, 0);
+
+    /* These are mostly irrelevant, as we load the descriptors directly. */
+    __vmwrite(GUEST_CS_SELECTOR, 0);
+    __vmwrite(GUEST_DS_SELECTOR, 0);
+    __vmwrite(GUEST_SS_SELECTOR, 0);
+    __vmwrite(GUEST_ES_SELECTOR, 0);
+    __vmwrite(GUEST_FS_SELECTOR, 0);
+    __vmwrite(GUEST_GS_SELECTOR, 0);
+
+    __vmwrite(GUEST_CS_BASE, 0);
+    __vmwrite(GUEST_CS_LIMIT, ~0u);
+    /* CS.L == 1, exec, read/write, accessed. PVH 32bitfixme */
+    __vmwrite(GUEST_CS_AR_BYTES, 0xa09b);
+
+    __vmwrite(GUEST_DS_BASE, 0);
+    __vmwrite(GUEST_DS_LIMIT, ~0u);
+    __vmwrite(GUEST_DS_AR_BYTES, 0xc093); /* read/write, accessed */
+
+    __vmwrite(GUEST_SS_BASE, 0);
+    __vmwrite(GUEST_SS_LIMIT, ~0u);
+    __vmwrite(GUEST_SS_AR_BYTES, 0xc093); /* read/write, accessed */
+
+    __vmwrite(GUEST_ES_BASE, 0);
+    __vmwrite(GUEST_ES_LIMIT, ~0u);
+    __vmwrite(GUEST_ES_AR_BYTES, 0xc093); /* read/write, accessed */
+
+    __vmwrite(GUEST_FS_BASE, 0);
+    __vmwrite(GUEST_FS_LIMIT, ~0u);
+    __vmwrite(GUEST_FS_AR_BYTES, 0xc093); /* read/write, accessed */
+
+    __vmwrite(GUEST_GS_BASE, 0);
+    __vmwrite(GUEST_GS_LIMIT, ~0u);
+    __vmwrite(GUEST_GS_AR_BYTES, 0xc093); /* read/write, accessed */
+
+    __vmwrite(GUEST_GDTR_BASE, 0);
+    __vmwrite(GUEST_GDTR_LIMIT, 0);
+
+    __vmwrite(GUEST_LDTR_BASE, 0);
+    __vmwrite(GUEST_LDTR_LIMIT, 0);
+    __vmwrite(GUEST_LDTR_AR_BYTES, 0x82); /* LDT */
+    __vmwrite(GUEST_LDTR_SELECTOR, 0);
+
+    /* Guest TSS. */
+    __vmwrite(GUEST_TR_BASE, 0);
+    __vmwrite(GUEST_TR_LIMIT, 0xff);
+    __vmwrite(GUEST_TR_AR_BYTES, 0x8b); /* 32-bit TSS (busy) */
+
+    __vmwrite(GUEST_INTERRUPTIBILITY_INFO, 0);
+    __vmwrite(GUEST_DR7, 0);
+    __vmwrite(VMCS_LINK_POINTER, ~0UL);
+
+    __vmwrite(PAGE_FAULT_ERROR_CODE_MASK, 0);
+    __vmwrite(PAGE_FAULT_ERROR_CODE_MATCH, 0);
+
+    v->arch.hvm_vmx.exception_bitmap = HVM_TRAP_MASK | (1U << TRAP_debug) |
+                                   (1U << TRAP_int3) | (1U << TRAP_no_device);
+    __vmwrite(EXCEPTION_BITMAP, v->arch.hvm_vmx.exception_bitmap);
+
+    /* Set the WP bit so read-only pages are not written from CPL 0 */
+    tmpval = X86_CR0_PG | X86_CR0_NE | X86_CR0_PE | X86_CR0_WP;
+    __vmwrite(GUEST_CR0, tmpval);
+    __vmwrite(CR0_READ_SHADOW, tmpval);
+    v->arch.hvm_vcpu.hw_cr[0] = v->arch.hvm_vcpu.guest_cr[0] = tmpval;
+
+    tmpval = real_cr4_to_pv_guest_cr4(mmu_cr4_features);
+    __vmwrite(GUEST_CR4, tmpval);
+    __vmwrite(CR4_READ_SHADOW, tmpval);
+    v->arch.hvm_vcpu.guest_cr[4] = tmpval;
+
+    __vmwrite(CR0_GUEST_HOST_MASK, ~0UL);
+    __vmwrite(CR4_GUEST_HOST_MASK, ~0UL);
+
+    v->arch.hvm_vmx.vmx_realmode = 0;
+
+    ept->asr  = pagetable_get_pfn(p2m_get_pagetable(p2m));
+    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
+
+    rdmsrl(MSR_IA32_CR_PAT, host_pat);
+    __vmwrite(HOST_PAT, host_pat);
+    __vmwrite(GUEST_PAT, MSR_IA32_CR_PAT_RESET);
+
+    /* the paging mode is updated for PVH by arch_set_info_guest() */
+
+    return 0;
+}
+
+static int construct_vmcs(struct vcpu *v)
+{
+    struct domain *d = v->domain;
     u32 vmexit_ctl = vmx_vmexit_control;
     u32 vmentry_ctl = vmx_vmentry_control;
 
     vmx_vmcs_enter(v);
 
+    if ( is_pvh_vcpu(v) )
+    {
+        int rc = pvh_construct_vmcs(v);
+        vmx_vmcs_exit(v);
+        return rc;
+    }
+
     /* VMCS controls. */
     __vmwrite(PIN_BASED_VM_EXEC_CONTROL, vmx_pin_based_exec_control);
 
@@ -932,30 +1201,7 @@ static int construct_vmcs(struct vcpu *v)
         __vmwrite(POSTED_INTR_NOTIFICATION_VECTOR, posted_intr_vector);
     }
 
-    /* Host data selectors. */
-    __vmwrite(HOST_SS_SELECTOR, __HYPERVISOR_DS);
-    __vmwrite(HOST_DS_SELECTOR, __HYPERVISOR_DS);
-    __vmwrite(HOST_ES_SELECTOR, __HYPERVISOR_DS);
-    __vmwrite(HOST_FS_SELECTOR, 0);
-    __vmwrite(HOST_GS_SELECTOR, 0);
-    __vmwrite(HOST_FS_BASE, 0);
-    __vmwrite(HOST_GS_BASE, 0);
-
-    /* Host control registers. */
-    v->arch.hvm_vmx.host_cr0 = read_cr0() | X86_CR0_TS;
-    __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0);
-    __vmwrite(HOST_CR4,
-              mmu_cr4_features | (xsave_enabled(v) ? X86_CR4_OSXSAVE : 0));
-
-    /* Host CS:RIP. */
-    __vmwrite(HOST_CS_SELECTOR, __HYPERVISOR_CS);
-    __vmwrite(HOST_RIP, (unsigned long)vmx_asm_vmexit_handler);
-
-    /* Host SYSENTER CS:RIP. */
-    rdmsrl(MSR_IA32_SYSENTER_CS, sysenter_cs);
-    __vmwrite(HOST_SYSENTER_CS, sysenter_cs);
-    rdmsrl(MSR_IA32_SYSENTER_EIP, sysenter_eip);
-    __vmwrite(HOST_SYSENTER_EIP, sysenter_eip);
+    vmx_set_common_host_vmcs_fields(v);
 
     /* MSR intercepts. */
     __vmwrite(VM_EXIT_MSR_LOAD_COUNT, 0);
@@ -1275,8 +1521,11 @@ void vmx_do_resume(struct vcpu *v)
 
         vmx_clear_vmcs(v);
         vmx_load_vmcs(v);
-        hvm_migrate_timers(v);
-        hvm_migrate_pirqs(v);
+        if ( !is_pvh_vcpu(v) )
+        {
+            hvm_migrate_timers(v);
+            hvm_migrate_pirqs(v);
+        }
         vmx_set_host_env(v);
         /*
          * Both n1 VMCS and n2 VMCS need to update the host environment after 
@@ -1288,6 +1537,9 @@ void vmx_do_resume(struct vcpu *v)
         hvm_asid_flush_vcpu(v);
     }
 
+    if ( is_pvh_vcpu(v) )
+        reset_stack_and_jump(vmx_asm_do_vmentry);
+
     debug_state = v->domain->debugger_attached
                   || v->domain->arch.hvm_domain.params[HVM_PARAM_MEMORY_EVENT_INT3]
                   || v->domain->arch.hvm_domain.params[HVM_PARAM_MEMORY_EVENT_SINGLE_STEP];
@@ -1471,7 +1723,7 @@ static void vmcs_dump(unsigned char ch)
 
     for_each_domain ( d )
     {
-        if ( !is_hvm_domain(d) )
+        if ( is_pv_domain(d) )
             continue;
         printk("\n>>> Domain %d <<<\n", d->domain_id);
         for_each_vcpu ( d, v )
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 7e5dba8..bd4c8bd 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -82,6 +82,9 @@ static int vmx_domain_initialise(struct domain *d)
 {
     int rc;
 
+    if ( is_pvh_domain(d) )
+        return 0;
+
     if ( (rc = vmx_alloc_vlapic_mapping(d)) != 0 )
         return rc;
 
@@ -90,6 +93,9 @@ static int vmx_domain_initialise(struct domain *d)
 
 static void vmx_domain_destroy(struct domain *d)
 {
+    if ( is_pvh_domain(d) )
+        return;
+
     vmx_free_vlapic_mapping(d);
 }
 
@@ -113,6 +119,12 @@ static int vmx_vcpu_initialise(struct vcpu *v)
 
     vpmu_initialise(v);
 
+    if ( is_pvh_vcpu(v) ) 
+    {
+        /* this is for hvm_long_mode_enabled(v) */
+        v->arch.hvm_vcpu.guest_efer = EFER_SCE | EFER_LMA | EFER_LME;
+        return 0;
+    }
     vmx_install_vlapic_mapping(v);
 
     /* %eax == 1 signals full real-mode support to the guest loader. */
@@ -1034,6 +1046,28 @@ static void vmx_update_host_cr3(struct vcpu *v)
     vmx_vmcs_exit(v);
 }
 
+/*
+ * A PVH guest never causes a CR3-write vmexit; this is called during
+ * guest setup.
+ */
+static void vmx_update_pvh_cr(struct vcpu *v, unsigned int cr)
+{
+    vmx_vmcs_enter(v);
+    switch ( cr )
+    {
+    case 3:
+        __vmwrite(GUEST_CR3, v->arch.hvm_vcpu.guest_cr[3]);
+        hvm_asid_flush_vcpu(v);
+        break;
+
+    default:
+        printk(XENLOG_ERR
+               "PVH: d%d v%d unexpected cr%d update at rip:%lx\n",
+               v->domain->domain_id, v->vcpu_id, cr, __vmread(GUEST_RIP));
+    }
+    vmx_vmcs_exit(v);
+}
+
 void vmx_update_debug_state(struct vcpu *v)
 {
     unsigned long mask;
@@ -1053,6 +1087,12 @@ void vmx_update_debug_state(struct vcpu *v)
 
 static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
 {
+    if ( is_pvh_vcpu(v) )
+    {
+        vmx_update_pvh_cr(v, cr);
+        return;
+    }
+
     vmx_vmcs_enter(v);
 
     switch ( cr )
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 11/20] PVH xen: introduce pvh.c
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (9 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 10/20] PVH xen: create PVH vmcs, and also initialization Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15 10:42   ` Jan Beulich
  2013-05-15  0:52 ` [PATCH 12/20] PVH xen: create read_descriptor_sel() Mukesh Rathor
                   ` (8 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

This patch introduces pvh.c for generic HVM code specific to PVH guests.
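
For context, PVH hypercalls are dispatched from pvh_do_hypercall() below
using the 64-bit PV argument registers (rdi, rsi, rdx, r10, r8, r9). A
guest-side stub on VMX hardware might look like this (illustrative only,
not part of this patch):

    /* sketch: two-argument hypercall using the 64-bit ABI */
    static inline long hypercall2(unsigned int nr, unsigned long a1,
                                  unsigned long a2)
    {
        long ret;
        asm volatile ( "vmcall"
                       : "=a" (ret)
                       : "a" (nr), "D" (a1), "S" (a2)
                       : "memory" );
        return ret;
    }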

Changes in V5:
  - Separate this into a new patch.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/hvm/Makefile     |    3 +-
 xen/arch/x86/hvm/hvm.c        |    4 -
 xen/arch/x86/hvm/pvh.c        |  202 +++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/hvm.h |    6 +
 xen/include/asm-x86/pvh.h     |    6 +
 5 files changed, 216 insertions(+), 5 deletions(-)
 create mode 100644 xen/arch/x86/hvm/pvh.c
 create mode 100644 xen/include/asm-x86/pvh.h

diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index eea5555..65ff9f3 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -22,4 +22,5 @@ obj-y += vlapic.o
 obj-y += vmsi.o
 obj-y += vpic.o
 obj-y += vpt.o
-obj-y += vpmu.o
\ No newline at end of file
+obj-y += vpmu.o
+obj-y += pvh.o
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e103c70..20e6e15 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3253,10 +3253,6 @@ static long hvm_vcpu_op(
     return rc;
 }
 
-typedef unsigned long hvm_hypercall_t(
-    unsigned long, unsigned long, unsigned long, unsigned long, unsigned long,
-    unsigned long);
-
 #define HYPERCALL(x)                                        \
     [ __HYPERVISOR_ ## x ] = (hvm_hypercall_t *) do_ ## x
 
diff --git a/xen/arch/x86/hvm/pvh.c b/xen/arch/x86/hvm/pvh.c
new file mode 100644
index 0000000..480fe1c
--- /dev/null
+++ b/xen/arch/x86/hvm/pvh.c
@@ -0,0 +1,202 @@
+/*
+ * Copyright (C) 2013, Mukesh Rathor, Oracle Corp.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ */
+#include <xen/hypercall.h>
+#include <xen/guest_access.h>
+#include <asm/p2m.h>
+#include <asm/traps.h>
+#include <asm/hvm/vmx/vmx.h>
+#include <public/sched.h>
+
+
+static int pvh_grant_table_op(unsigned int cmd, XEN_GUEST_HANDLE(void) uop,
+                              unsigned int count)
+{
+#ifndef NDEBUG
+    switch ( cmd )
+    {
+    /* the following grant ops have been tested for PVH guest. */
+    case GNTTABOP_map_grant_ref:
+    case GNTTABOP_unmap_grant_ref:
+    case GNTTABOP_setup_table:
+    case GNTTABOP_copy:
+    case GNTTABOP_query_size:
+    case GNTTABOP_set_version:
+        return do_grant_table_op(cmd, uop, count);
+    }
+    return -ENOSYS;
+#else
+    return do_grant_table_op(cmd, uop, count);
+#endif
+}
+
+static long pvh_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE(void) arg)
+{
+    long rc = -ENOSYS;
+
+#ifndef NDEBUG
+    int valid = 0;
+
+    switch ( cmd )
+    {
+    case VCPUOP_register_runstate_memory_area:
+    case VCPUOP_get_runstate_info:
+    case VCPUOP_set_periodic_timer:
+    case VCPUOP_stop_periodic_timer:
+    case VCPUOP_set_singleshot_timer:
+    case VCPUOP_stop_singleshot_timer:
+    case VCPUOP_is_up:
+    case VCPUOP_up:
+    case VCPUOP_initialise:
+        valid = 1;
+    }
+    if ( !valid )
+        return rc;
+#endif
+
+    rc = do_vcpu_op(cmd, vcpuid, arg);
+
+    /* the PVH boot vcpu set context above to bring up an SMP vcpu */
+    if ( cmd == VCPUOP_initialise )
+        vmx_vmcs_enter(current);
+    return rc;
+}
+
+static long pvh_physdev_op(int cmd, XEN_GUEST_HANDLE(void) arg)
+{
+#ifndef NDEBUG
+    int valid = 0;
+    switch ( cmd )
+    {
+    case PHYSDEVOP_map_pirq:
+    case PHYSDEVOP_unmap_pirq:
+    case PHYSDEVOP_eoi:
+    case PHYSDEVOP_irq_status_query:
+    case PHYSDEVOP_get_free_pirq:
+        valid = 1;
+    }
+    if ( !valid && !IS_PRIV(current->domain) )
+        return -ENOSYS;
+#endif
+    return do_physdev_op(cmd, arg);
+}
+
+static long pvh_hvm_op(unsigned long op, XEN_GUEST_HANDLE(void) arg)
+{
+    long rc = -EINVAL;
+    struct xen_hvm_param harg;
+    struct domain *d;
+
+    if ( copy_from_guest(&harg, arg, 1) )
+        return -EFAULT;
+
+    rc = rcu_lock_target_domain_by_id(harg.domid, &d);
+    if ( rc != 0 )
+        return rc;
+
+    if ( is_hvm_domain(d) )
+    {
+        /* PVH dom0 is building an HVM guest */
+        rcu_unlock_domain(d);
+        return do_hvm_op(op, arg);
+    }
+
+    rc = -ENOSYS;
+    if ( op == HVMOP_set_param )
+    {
+        if ( harg.index == HVM_PARAM_CALLBACK_IRQ )
+        {
+            struct hvm_irq *hvm_irq = &d->arch.hvm_domain.irq;
+            uint64_t via = harg.value;
+            uint8_t via_type = (uint8_t)(via >> 56) + 1;
+
+            if ( via_type == HVMIRQ_callback_vector )
+            {
+                hvm_irq->callback_via_type = HVMIRQ_callback_vector;
+                hvm_irq->callback_via.vector = (uint8_t)via;
+                rc = 0;
+            }
+        }
+    }
+    rcu_unlock_domain(d);
+    if ( rc )
+        gdprintk(XENLOG_DEBUG, "op:%ld -ENOSYS\n", op);
+
+    return rc;
+}
+
+#ifndef NDEBUG
+/* PVH 32bitfixme */
+static hvm_hypercall_t *pvh_hypercall64_table[NR_hypercalls] = {
+    [__HYPERVISOR_platform_op]     = (hvm_hypercall_t *)do_platform_op,
+    [__HYPERVISOR_memory_op]       = (hvm_hypercall_t *)do_memory_op,
+    [__HYPERVISOR_xen_version]     = (hvm_hypercall_t *)do_xen_version,
+    [__HYPERVISOR_console_io]      = (hvm_hypercall_t *)do_console_io,
+    [__HYPERVISOR_grant_table_op]  = (hvm_hypercall_t *)pvh_grant_table_op,
+    [__HYPERVISOR_vcpu_op]         = (hvm_hypercall_t *)pvh_vcpu_op,
+    [__HYPERVISOR_mmuext_op]       = (hvm_hypercall_t *)do_mmuext_op,
+    [__HYPERVISOR_xsm_op]          = (hvm_hypercall_t *)do_xsm_op,
+    [__HYPERVISOR_sched_op]        = (hvm_hypercall_t *)do_sched_op,
+    [__HYPERVISOR_event_channel_op]= (hvm_hypercall_t *)do_event_channel_op,
+    [__HYPERVISOR_physdev_op]      = (hvm_hypercall_t *)pvh_physdev_op,
+    [__HYPERVISOR_hvm_op]          = (hvm_hypercall_t *)pvh_hvm_op,
+    [__HYPERVISOR_sysctl]          = (hvm_hypercall_t *)do_sysctl,
+    [__HYPERVISOR_domctl]          = (hvm_hypercall_t *)do_domctl
+};
+#endif
+
+/*
+ * Check whether the hypercall is valid. Returns 0 if it is not, with eax
+ * set to the errno to return to the guest.
+ */
+static bool_t hcall_valid(struct cpu_user_regs *regs)
+{
+    struct segment_register sreg;
+
+    hvm_get_segment_register(current, x86_seg_ss, &sreg);
+    if ( unlikely(sreg.attr.fields.dpl != 0) )
+    {
+        regs->eax = -EPERM;
+        return 0;
+    }
+
+    return 1;
+}
+
+/* PVH 32bitfixme */
+int pvh_do_hypercall(struct cpu_user_regs *regs)
+{
+    uint32_t hnum = regs->eax;
+
+    if ( hnum >= NR_hypercalls || pvh_hypercall64_table[hnum] == NULL )
+    {
+        gdprintk(XENLOG_WARNING, "PVH: Unimplemented HCALL:%d. Returning "
+                 "-ENOSYS. domid:%d IP:%lx SP:%lx\n",
+                 hnum, current->domain->domain_id, regs->rip, regs->rsp);
+        regs->eax = -ENOSYS;
+        vmx_update_guest_eip();
+        return HVM_HCALL_completed;
+    }
+
+    if ( !hcall_valid(regs) )
+        return HVM_HCALL_completed;
+
+    current->arch.hvm_vcpu.hcall_preempted = 0;
+    regs->rax = pvh_hypercall64_table[hnum](regs->rdi, regs->rsi, regs->rdx,
+                                            regs->r10, regs->r8, regs->r9);
+
+    if ( current->arch.hvm_vcpu.hcall_preempted )
+        return HVM_HCALL_preempted;
+
+    return HVM_HCALL_completed;
+}
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 9b5fa5b..43b9ce1 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -506,4 +506,10 @@ bool_t nhvm_vmcx_hap_enabled(struct vcpu *v);
 /* interrupt */
 enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v);
 
+
+/* hypercall table typedef for HVM */
+typedef unsigned long hvm_hypercall_t(
+    unsigned long, unsigned long, unsigned long, unsigned long, unsigned long,
+    unsigned long);
+
 #endif /* __ASM_X86_HVM_HVM_H__ */
diff --git a/xen/include/asm-x86/pvh.h b/xen/include/asm-x86/pvh.h
new file mode 100644
index 0000000..73e59d3
--- /dev/null
+++ b/xen/include/asm-x86/pvh.h
@@ -0,0 +1,6 @@
+#ifndef __ASM_X86_PVH_H__
+#define __ASM_X86_PVH_H__
+
+int pvh_do_hypercall(struct cpu_user_regs *regs);
+
+#endif  /* __ASM_X86_PVH_H__ */
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 12/20] PVH xen: create read_descriptor_sel()
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (10 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 11/20] PVH xen: introduce pvh.c Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 13/20] PVH xen: introduce vmx_pvh.c Mukesh Rathor
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

This patch changes the read descriptor functionality to support PVH by
introducing read_descriptor_sel(). Also, we make emulate_forced_invalid_op()
public and suitable for PVH use in the next patch.
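
For reference, the attribute fix-up in read_descriptor_sel() unpacks the
VMCS-style attributes into the descriptor-style _SEGMENT_* bit layout the
callers expect. A minimal sketch of the transform (the helper name is ours,
not part of the patch):

    /* segment_attributes_t packs type/s/dpl/p in bits 0-7 and avl/l/db/g
     * in bits 8-11; the _SEGMENT_* flags want the first group in bits
     * 8-15 and the second in bits 20-23 (_SEGMENT_P == 1<<15,
     * _SEGMENT_L == 1<<21, ...). */
    static unsigned int attr_to_ar(unsigned int attr)
    {
        unsigned int ar = (attr & 0xff) | ((attr & 0xf00) << 4);
        return ar << 8;
    }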

Changes in V5: None. New patch (separating this from prev patch 10 which was
               getting large).

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/traps.c            |   87 +++++++++++++++++++++++++++++++++------
 xen/include/asm-x86/processor.h |    1 +
 2 files changed, 75 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index f68c526..663e351 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -905,7 +905,7 @@ static int emulate_invalid_rdtscp(struct cpu_user_regs *regs)
     return EXCRET_fault_fixed;
 }
 
-static int emulate_forced_invalid_op(struct cpu_user_regs *regs)
+int emulate_forced_invalid_op(struct cpu_user_regs *regs)
 {
     char sig[5], instr[2];
     unsigned long eip, rc;
@@ -913,7 +913,7 @@ static int emulate_forced_invalid_op(struct cpu_user_regs *regs)
     eip = regs->eip;
 
     /* Check for forced emulation signature: ud2 ; .ascii "xen". */
-    if ( (rc = copy_from_user(sig, (char *)eip, sizeof(sig))) != 0 )
+    if ( (rc = raw_copy_from_guest(sig, (char *)eip, sizeof(sig))) != 0 )
     {
         propagate_page_fault(eip + sizeof(sig) - rc, 0);
         return EXCRET_fault_fixed;
@@ -923,7 +923,7 @@ static int emulate_forced_invalid_op(struct cpu_user_regs *regs)
     eip += sizeof(sig);
 
     /* We only emulate CPUID. */
-    if ( ( rc = copy_from_user(instr, (char *)eip, sizeof(instr))) != 0 )
+    if ( ( rc = raw_copy_from_guest(instr, (char *)eip, sizeof(instr))) != 0 )
     {
         propagate_page_fault(eip + sizeof(instr) - rc, 0);
         return EXCRET_fault_fixed;
@@ -1068,6 +1068,12 @@ void propagate_page_fault(unsigned long addr, u16 error_code)
     struct vcpu *v = current;
     struct trap_bounce *tb = &v->arch.pv_vcpu.trap_bounce;
 
+    if ( is_pvh_vcpu(v) ) 
+    {
+        hvm_inject_page_fault(error_code, addr);
+        return;
+    }
+
     v->arch.pv_vcpu.ctrlreg[2] = addr;
     arch_set_cr2(v, addr);
 
@@ -1506,6 +1512,49 @@ static int read_descriptor(unsigned int sel,
     return 1;
 }
 
+static int read_descriptor_sel(unsigned int sel,
+                               enum x86_segment which_sel,
+                               struct vcpu *v,
+                               const struct cpu_user_regs *regs,
+                               unsigned long *base,
+                               unsigned long *limit,
+                               unsigned int *ar,
+                               unsigned int vm86attr)
+{
+    struct segment_register seg;
+    unsigned int long_mode = 0;
+
+    if ( !is_pvh_vcpu(v) )
+        return read_descriptor(sel, v, regs, base, limit, ar, vm86attr);
+
+    hvm_get_segment_register(v, x86_seg_cs, &seg);
+    long_mode = seg.attr.fields.l;
+
+    if ( which_sel != x86_seg_cs )
+        hvm_get_segment_register(v, which_sel, &seg);
+
+    /* ar is returned packed as in segment_attributes_t. Fix it up */
+    *ar = (unsigned int)seg.attr.bytes;
+    *ar = (*ar & 0xff) | ((*ar & 0xf00) << 4);
+    *ar = *ar << 8;
+
+    if ( long_mode )
+    {
+        *limit = ~0UL;
+
+        if ( which_sel < x86_seg_fs )
+        {
+            *base = 0UL;
+            return 1;
+        }
+    }
+    else
+        *limit = (unsigned long)seg.limit;
+
+    *base = seg.base;
+    return 1;
+}
+
 static int read_gate_descriptor(unsigned int gate_sel,
                                 const struct vcpu *v,
                                 unsigned int *sel,
@@ -1833,6 +1882,7 @@ static int is_cpufreq_controller(struct domain *d)
 
 int emulate_privileged_op(struct cpu_user_regs *regs)
 {
+    enum x86_segment which_sel;
     struct vcpu *v = current;
     unsigned long *reg, eip = regs->eip;
     u8 opcode, modrm_reg = 0, modrm_rm = 0, rep_prefix = 0, lock = 0, rex = 0;
@@ -1855,9 +1905,10 @@ int emulate_privileged_op(struct cpu_user_regs *regs)
     void (*io_emul)(struct cpu_user_regs *) __attribute__((__regparm__(1)));
     uint64_t val, msr_content;
 
-    if ( !read_descriptor(regs->cs, v, regs,
-                          &code_base, &code_limit, &ar,
-                          _SEGMENT_CODE|_SEGMENT_S|_SEGMENT_DPL|_SEGMENT_P) )
+    if ( !read_descriptor_sel(regs->cs, x86_seg_cs, v, regs,
+                              &code_base, &code_limit, &ar,
+                              _SEGMENT_CODE|_SEGMENT_S|
+                              _SEGMENT_DPL|_SEGMENT_P) )
         goto fail;
     op_default = op_bytes = (ar & (_SEGMENT_L|_SEGMENT_DB)) ? 4 : 2;
     ad_default = ad_bytes = (ar & _SEGMENT_L) ? 8 : op_default;
@@ -1868,6 +1919,7 @@ int emulate_privileged_op(struct cpu_user_regs *regs)
 
     /* emulating only opcodes not allowing SS to be default */
     data_sel = read_segment_register(v, regs, ds);
+    which_sel = x86_seg_ds;
 
     /* Legacy prefixes. */
     for ( i = 0; i < 8; i++, rex == opcode || (rex = 0) )
@@ -1883,23 +1935,29 @@ int emulate_privileged_op(struct cpu_user_regs *regs)
             continue;
         case 0x2e: /* CS override */
             data_sel = regs->cs;
+            which_sel = x86_seg_cs;
             continue;
         case 0x3e: /* DS override */
             data_sel = read_segment_register(v, regs, ds);
+            which_sel = x86_seg_ds;
             continue;
         case 0x26: /* ES override */
             data_sel = read_segment_register(v, regs, es);
+            which_sel = x86_seg_es;
             continue;
         case 0x64: /* FS override */
             data_sel = read_segment_register(v, regs, fs);
+            which_sel = x86_seg_fs;
             lm_ovr = lm_seg_fs;
             continue;
         case 0x65: /* GS override */
             data_sel = read_segment_register(v, regs, gs);
+            which_sel = x86_seg_gs;
             lm_ovr = lm_seg_gs;
             continue;
         case 0x36: /* SS override */
             data_sel = regs->ss;
+            which_sel = x86_seg_ss;
             continue;
         case 0xf0: /* LOCK */
             lock = 1;
@@ -1943,15 +2001,16 @@ int emulate_privileged_op(struct cpu_user_regs *regs)
         if ( !(opcode & 2) )
         {
             data_sel = read_segment_register(v, regs, es);
+            which_sel = x86_seg_es;
             lm_ovr = lm_seg_none;
         }
 
         if ( !(ar & _SEGMENT_L) )
         {
-            if ( !read_descriptor(data_sel, v, regs,
-                                  &data_base, &data_limit, &ar,
-                                  _SEGMENT_WR|_SEGMENT_S|_SEGMENT_DPL|
-                                  _SEGMENT_P) )
+            if ( !read_descriptor_sel(data_sel, which_sel, v, regs,
+                                      &data_base, &data_limit, &ar,
+                                      _SEGMENT_WR|_SEGMENT_S|_SEGMENT_DPL|
+                                      _SEGMENT_P) )
                 goto fail;
             if ( !(ar & _SEGMENT_S) ||
                  !(ar & _SEGMENT_P) ||
@@ -1981,9 +2040,9 @@ int emulate_privileged_op(struct cpu_user_regs *regs)
                 }
             }
             else
-                read_descriptor(data_sel, v, regs,
-                                &data_base, &data_limit, &ar,
-                                0);
+                read_descriptor_sel(data_sel, which_sel, v, regs,
+                                    &data_base, &data_limit, &ar,
+                                    0);
             data_limit = ~0UL;
             ar = _SEGMENT_WR|_SEGMENT_S|_SEGMENT_DPL|_SEGMENT_P;
         }
@@ -2638,6 +2697,8 @@ static void emulate_gate_op(struct cpu_user_regs *regs)
     unsigned long off, eip, opnd_off, base, limit;
     int jump;
 
+    ASSERT(!is_pvh_vcpu(v));
+
     /* Check whether this fault is due to the use of a call gate. */
     if ( !read_gate_descriptor(regs->error_code, v, &sel, &off, &ar) ||
          (((ar >> 13) & 3) < (regs->cs & 3)) ||
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 8c70324..ab15ff0 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -567,6 +567,7 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void), unsigned long len);
 int microcode_resume_cpu(int cpu);
 
 void pv_cpuid(struct cpu_user_regs *regs);
+int emulate_forced_invalid_op(struct cpu_user_regs *regs);
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_X86_PROCESSOR_H */
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 13/20] PVH xen: introduce vmx_pvh.c
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (11 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 12/20] PVH xen: create read_descriptor_sel() Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15 11:46   ` Jan Beulich
  2013-05-15  0:52 ` [PATCH 14/20] PVH xen: some misc changes like mtrr, intr, msi Mukesh Rathor
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

The heart of this patch is the VMX exit handler for PVH guests. It is nicely
isolated in a separate module, as preferred by most of us. A call to it is
added to vmx_vmexit_handler().
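
For context, the VMCALL exit below feeds pvh_do_hypercall() (patch 11),
which uses the usual 64-bit hypercall ABI: number in rax, arguments in
rdi/rsi/rdx/r10/r8/r9. A guest-side sketch of a two-argument hypercall,
bypassing the hypercall page for brevity (the helper name and the direct
use of vmcall are illustrative only):

    static inline long pvh_hypercall2(unsigned int num,
                                      unsigned long a1, unsigned long a2)
    {
        long ret;

        /* number in %rax, args in %rdi/%rsi; vmcall traps to Xen */
        asm volatile ( "vmcall"
                       : "=a" (ret)
                       : "a" (num), "D" (a1), "S" (a2)
                       : "memory" );
        return ret;
    }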

Changes in V2:
  - Move non VMX generic code to arch/x86/hvm/pvh.c
  - Remove get_gpr_ptr() and use existing decode_register() instead.
  - Defer call to pvh vmx exit handler until interrupts are enabled. So the
    caller vmx_vmexit_handler() handles the NMI/EXT-INT/TRIPLE_FAULT now.
  - Fix the CPUID (wrongly) clearing bit 24. No need to do this now, set
    the correct feature bits in CR4 during vmcs creation.
  - Fix few hard tabs.

Changes in V3:
  - Lot of cleanup and rework in PVH vm exit handler.
  - add parameter to emulate_forced_invalid_op().

Changes in V5:
  - Move pvh.c and emulate_forced_invalid_op related changes to another patch.
  - Formatting.
  - Remove vmx_pvh_read_descriptor().
  - Use SS DPL instead of CS.RPL for CPL.
  - Remove pvh_user_cpuid() and call pv_cpuid for user mode also.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/hvm/vmx/Makefile     |    1 +
 xen/arch/x86/hvm/vmx/vmx.c        |    7 +
 xen/arch/x86/hvm/vmx/vmx_pvh.c    |  502 +++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/vmx/vmx.h |    2 +
 4 files changed, 512 insertions(+), 0 deletions(-)
 create mode 100644 xen/arch/x86/hvm/vmx/vmx_pvh.c

diff --git a/xen/arch/x86/hvm/vmx/Makefile b/xen/arch/x86/hvm/vmx/Makefile
index 373b3d9..8b71dae 100644
--- a/xen/arch/x86/hvm/vmx/Makefile
+++ b/xen/arch/x86/hvm/vmx/Makefile
@@ -5,3 +5,4 @@ obj-y += vmcs.o
 obj-y += vmx.o
 obj-y += vpmu_core2.o
 obj-y += vvmx.o
+obj-y += vmx_pvh.o
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index bd4c8bd..338272c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1595,6 +1595,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .deliver_posted_intr  = vmx_deliver_posted_intr,
     .sync_pir_to_irr      = vmx_sync_pir_to_irr,
     .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
+    .pvh_set_vcpu_info    = vmx_pvh_set_vcpu_info,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2438,6 +2439,12 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
     if ( unlikely(exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) )
         return vmx_failed_vmentry(exit_reason, regs);
 
+    if ( is_pvh_vcpu(v) )
+    {
+        vmx_pvh_vmexit_handler(regs);
+        return;
+    }
+
     if ( v->arch.hvm_vmx.vmx_realmode )
     {
         /* Put RFLAGS back the way the guest wants it */
diff --git a/xen/arch/x86/hvm/vmx/vmx_pvh.c b/xen/arch/x86/hvm/vmx/vmx_pvh.c
new file mode 100644
index 0000000..13a2813
--- /dev/null
+++ b/xen/arch/x86/hvm/vmx/vmx_pvh.c
@@ -0,0 +1,502 @@
+/*
+ * Copyright (C) 2013, Mukesh Rathor, Oracle Corp.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#include <xen/hypercall.h>
+#include <xen/guest_access.h>
+#include <asm/p2m.h>
+#include <asm/traps.h>
+#include <asm/hvm/vmx/vmx.h>
+#include <public/sched.h>
+#include <asm/pvh.h>
+
+#ifndef NDEBUG
+int pvhdbg = 0;
+#define dbgp1(...) do { (pvhdbg == 1) ? printk(__VA_ARGS__) : 0; } while ( 0 )
+#else
+#define dbgp1(...) ((void)0)
+#endif
+
+
+static void read_vmcs_selectors(struct cpu_user_regs *regs)
+{
+    regs->cs = __vmread(GUEST_CS_SELECTOR);
+    regs->ss = __vmread(GUEST_SS_SELECTOR);
+    regs->ds = __vmread(GUEST_DS_SELECTOR);
+    regs->es = __vmread(GUEST_ES_SELECTOR);
+    regs->gs = __vmread(GUEST_GS_SELECTOR);
+    regs->fs = __vmread(GUEST_FS_SELECTOR);
+}
+
+/* Returns: 0 == MSR read successfully */
+static int vmxit_msr_read(struct cpu_user_regs *regs)
+{
+    u64 msr_content = 0;
+
+    switch ( regs->ecx )
+    {
+    case MSR_IA32_MISC_ENABLE:
+        rdmsrl(MSR_IA32_MISC_ENABLE, msr_content);
+        msr_content |= MSR_IA32_MISC_ENABLE_BTS_UNAVAIL |
+                       MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL;
+        break;
+
+    default:
+        /* pvh fixme: see hvm_msr_read_intercept() */
+        rdmsrl(regs->ecx, msr_content);
+        break;
+    }
+    regs->eax = (uint32_t)msr_content;
+    regs->edx = (uint32_t)(msr_content >> 32);
+    vmx_update_guest_eip();
+
+    dbgp1("msr read c:%lx a:%lx d:%lx RIP:%lx RSP:%lx\n", regs->ecx, regs->eax,
+          regs->edx, regs->rip, regs->rsp);
+
+    return 0;
+}
+
+/* Returns: 0 == MSR written successfully */
+static int vmxit_msr_write(struct cpu_user_regs *regs)
+{
+    uint64_t msr_content = (uint32_t)regs->eax | ((uint64_t)regs->edx << 32);
+
+    dbgp1("PVH: msr write:0x%lx. eax:0x%lx edx:0x%lx\n", regs->ecx,
+          regs->eax, regs->edx);
+
+    if ( hvm_msr_write_intercept(regs->ecx, msr_content) == X86EMUL_OKAY )
+    {
+        vmx_update_guest_eip();
+        return 0;
+    }
+    return 1;
+}
+
+static int vmxit_debug(struct cpu_user_regs *regs)
+{
+    struct vcpu *vp = current;
+    unsigned long exit_qualification = __vmread(EXIT_QUALIFICATION);
+
+    write_debugreg(6, exit_qualification | 0xffff0ff0);
+
+    /* gdbsx or another debugger */
+    if ( vp->domain->domain_id != 0 &&    /* never pause dom0 */
+         guest_kernel_mode(vp, regs) &&
+         vp->domain->debugger_attached )
+        domain_pause_for_debugger();
+    else
+        hvm_inject_hw_exception(TRAP_debug, HVM_DELIVER_NO_ERROR_CODE);
+
+    return 0;
+}
+
+/* Returns: rc == 0: handled the MTF vmexit */
+static int vmxit_mtf(struct cpu_user_regs *regs)
+{
+    struct vcpu *vp = current;
+    int rc = -EINVAL, ss = vp->arch.hvm_vcpu.single_step;
+
+    vp->arch.hvm_vmx.exec_control &= ~CPU_BASED_MONITOR_TRAP_FLAG;
+    __vmwrite(CPU_BASED_VM_EXEC_CONTROL, vp->arch.hvm_vmx.exec_control);
+    vp->arch.hvm_vcpu.single_step = 0;
+
+    if ( vp->domain->debugger_attached && ss )
+    {
+        domain_pause_for_debugger();
+        rc = 0;
+    }
+    return rc;
+}
+
+static int vmxit_int3(struct cpu_user_regs *regs)
+{
+    int ilen = vmx_get_instruction_length();
+    struct vcpu *vp = current;
+    struct hvm_trap trap_info = {
+        .vector = TRAP_int3,
+        .type = X86_EVENTTYPE_SW_EXCEPTION,
+        .error_code = HVM_DELIVER_NO_ERROR_CODE,
+        .insn_len = ilen
+    };
+
+    /* gdbsx or another debugger. Never pause dom0 */
+    if ( vp->domain->domain_id != 0 && guest_kernel_mode(vp, regs) )
+    {
+        regs->eip += ilen;
+        dbgp1("[%d]PVH: domain pause for debugger\n", smp_processor_id());
+        current->arch.gdbsx_vcpu_event = TRAP_int3;
+        domain_pause_for_debugger();
+        return 0;
+    }
+    hvm_inject_trap(&trap_info);
+
+    return 0;
+}
+
+static int vmxit_invalid_op(struct cpu_user_regs *regs)
+{
+    if ( guest_kernel_mode(current, regs) || !emulate_forced_invalid_op(regs) )
+        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+
+    return 0;
+}
+
+/* Returns: rc == 0: handled the exception/NMI */
+static int vmxit_exception(struct cpu_user_regs *regs)
+{
+    unsigned int vector = (__vmread(VM_EXIT_INTR_INFO)) & INTR_INFO_VECTOR_MASK;
+    int rc = -ENOSYS;
+
+    dbgp1(" EXCPT: vec:%d cs:%lx r.IP:%lx\n", vector,
+          __vmread(GUEST_CS_SELECTOR), regs->eip);
+
+    switch ( vector )
+    {
+    case TRAP_debug:
+        rc = vmxit_debug(regs);
+        break;
+
+    case TRAP_int3:
+        rc = vmxit_int3(regs);
+        break;
+
+    case TRAP_invalid_op:
+        rc = vmxit_invalid_op(regs);
+        break;
+
+    case TRAP_no_device:
+        hvm_funcs.fpu_dirty_intercept();
+        rc = 0;
+        break;
+
+    default:
+        gdprintk(XENLOG_G_WARNING,
+                 "PVH: Unhandled trap:%d. IP:%lx\n", vector, regs->eip);
+    }
+    return rc;
+}
+
+static int vmxit_vmcall(struct cpu_user_regs *regs)
+{
+    if ( pvh_do_hypercall(regs) != HVM_HCALL_preempted )
+        vmx_update_guest_eip();
+    return 0;
+}
+
+/* Returns: rc == 0: success */
+static int access_cr0(struct cpu_user_regs *regs, uint acc_typ, uint64_t *regp)
+{
+    struct vcpu *vp = current;
+
+    if ( acc_typ == VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR )
+    {
+        unsigned long new_cr0 = *regp;
+        unsigned long old_cr0 = __vmread(GUEST_CR0);
+
+        dbgp1("PVH:writing to CR0. RIP:%lx val:0x%lx\n", regs->rip, *regp);
+        if ( (u32)new_cr0 != new_cr0 )
+        {
+            gdprintk(XENLOG_G_WARNING,
+                     "Guest setting upper 32 bits in CR0: %lx", new_cr0);
+            return -EPERM;
+        }
+
+        new_cr0 &= ~HVM_CR0_GUEST_RESERVED_BITS;
+        /* ET is reserved and should always be 1. */
+        new_cr0 |= X86_CR0_ET;
+
+        /* pvh not expected to change to real mode */
+        if ( (new_cr0 & (X86_CR0_PE | X86_CR0_PG)) !=
+             (X86_CR0_PG | X86_CR0_PE) )
+        {
+            gdprintk(XENLOG_G_WARNING,
+                     "PVH attempting to turn off PE/PG. CR0:%lx\n", new_cr0);
+            return -EPERM;
+        }
+        /* TS going from 1 to 0 */
+        if ( (old_cr0 & X86_CR0_TS) && ((new_cr0 & X86_CR0_TS) == 0) )
+            vmx_fpu_enter(vp);
+
+        vp->arch.hvm_vcpu.hw_cr[0] = vp->arch.hvm_vcpu.guest_cr[0] = new_cr0;
+        __vmwrite(GUEST_CR0, new_cr0);
+        __vmwrite(CR0_READ_SHADOW, new_cr0);
+    }
+    else
+    {
+        *regp = __vmread(GUEST_CR0);
+    }
+    return 0;
+}
+
+/* Returns: rc == 0: success */
+static int access_cr4(struct cpu_user_regs *regs, uint acc_typ, uint64_t *regp)
+{
+    if ( acc_typ == VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR )
+    {
+        u64 old_cr4 = __vmread(GUEST_CR4);
+        u64 new = *regp;
+
+        if ( (old_cr4 ^ new) & (X86_CR4_PSE | X86_CR4_PGE | X86_CR4_PAE) )
+            vpid_sync_all();
+
+        __vmwrite(CR4_READ_SHADOW, new);
+
+        new &= ~X86_CR4_PAE;     /* PVH always runs with hap enabled */
+        new |= X86_CR4_VMXE | X86_CR4_MCE;
+        __vmwrite(GUEST_CR4, new);
+    }
+    else
+        *regp = __vmread(CR4_READ_SHADOW);
+
+    return 0;
+}
+
+/* Returns: rc == 0: success, else -errno */
+static int vmxit_cr_access(struct cpu_user_regs *regs)
+{
+    unsigned long exit_qualification = __vmread(EXIT_QUALIFICATION);
+    uint acc_typ = VMX_CONTROL_REG_ACCESS_TYPE(exit_qualification);
+    int cr, rc = -EINVAL;
+
+    switch ( acc_typ )
+    {
+    case VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR:
+    case VMX_CONTROL_REG_ACCESS_TYPE_MOV_FROM_CR:
+    {
+        uint gpr = VMX_CONTROL_REG_ACCESS_GPR(exit_qualification);
+        uint64_t *regp = decode_register(gpr, regs, 0);
+        cr = VMX_CONTROL_REG_ACCESS_NUM(exit_qualification);
+
+        if ( regp == NULL )
+            break;
+
+        switch ( cr )
+        {
+        case 0:
+            rc = access_cr0(regs, acc_typ, regp);
+            break;
+
+        case 3:
+            gdprintk(XENLOG_G_ERR, "PVH: unexpected cr3 vmexit. rip:%lx\n",
+                     regs->rip);
+            domain_crash_synchronous();
+            break;
+
+        case 4:
+            rc = access_cr4(regs, acc_typ, regp);
+            break;
+        }
+        if ( rc == 0 )
+            vmx_update_guest_eip();
+        break;
+    }
+
+    case VMX_CONTROL_REG_ACCESS_TYPE_CLTS:
+    {
+        struct vcpu *vp = current;
+        unsigned long cr0 = vp->arch.hvm_vcpu.guest_cr[0] & ~X86_CR0_TS;
+
+        vp->arch.hvm_vcpu.hw_cr[0] = vp->arch.hvm_vcpu.guest_cr[0] = cr0;
+        vmx_fpu_enter(vp);
+        __vmwrite(GUEST_CR0, cr0);
+        __vmwrite(CR0_READ_SHADOW, cr0);
+        vmx_update_guest_eip();
+        rc = 0;
+    }
+    }
+    return rc;
+}
+
+/*
+ * NOTE: a PVH guest sets IOPL natively by setting bits in eflags, not via
+ *       the PHYSDEVOP_set_iopl hypercall a PV guest uses.
+ */
+static int vmxit_io_instr(struct cpu_user_regs *regs)
+{
+    struct segment_register seg;
+    int requested = (regs->rflags & X86_EFLAGS_IOPL) >> 12;
+    int curr_lvl = (regs->rflags & X86_EFLAGS_VM) ? 3 : 0;
+
+    if ( curr_lvl == 0 )
+    {
+        hvm_get_segment_register(current, x86_seg_ss, &seg);
+        curr_lvl = seg.attr.fields.dpl;
+    }
+    if ( requested >= curr_lvl && emulate_privileged_op(regs) )
+        return 0;
+
+    hvm_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+    return 0;
+}
+
+static int pvh_ept_handle_violation(unsigned long qualification,
+                                    paddr_t gpa, struct cpu_user_regs *regs)
+{
+    unsigned long gla, gfn = gpa >> PAGE_SHIFT;
+    p2m_type_t p2mt;
+    mfn_t mfn = get_gfn_query_unlocked(current->domain, gfn, &p2mt);
+
+    gdprintk(XENLOG_G_ERR, "EPT violation %#lx (%c%c%c/%c%c%c), "
+             "gpa %#"PRIpaddr", mfn %#lx, type %i. IP:0x%lx RSP:0x%lx\n",
+             qualification,
+             (qualification & EPT_READ_VIOLATION) ? 'r' : '-',
+             (qualification & EPT_WRITE_VIOLATION) ? 'w' : '-',
+             (qualification & EPT_EXEC_VIOLATION) ? 'x' : '-',
+             (qualification & EPT_EFFECTIVE_READ) ? 'r' : '-',
+             (qualification & EPT_EFFECTIVE_WRITE) ? 'w' : '-',
+             (qualification & EPT_EFFECTIVE_EXEC) ? 'x' : '-',
+             gpa, mfn_x(mfn), p2mt, regs->rip, regs->rsp);
+
+    ept_walk_table(current->domain, gfn);
+
+    if ( qualification & EPT_GLA_VALID )
+    {
+        gla = __vmread(GUEST_LINEAR_ADDRESS);
+        gdprintk(XENLOG_G_ERR, " --- GLA %#lx\n", gla);
+    }
+    hvm_inject_hw_exception(TRAP_gp_fault, 0);
+    return 0;
+}
+
+/*
+ * Main vm exit handler for PVH. Called from vmx_vmexit_handler().
+ * Note: vmx_asm_vmexit_handler updates rip/rsp/eflags in regs{} struct.
+ */
+void vmx_pvh_vmexit_handler(struct cpu_user_regs *regs)
+{
+    unsigned long exit_qualification;
+    unsigned int exit_reason = __vmread(VM_EXIT_REASON);
+    int rc = 0, ccpu = smp_processor_id();
+    struct vcpu *v = current;
+
+    dbgp1("PVH:[%d]left VMCS exitreas:%d RIP:%lx RSP:%lx EFLAGS:%lx CR0:%lx\n",
+          ccpu, exit_reason, regs->rip, regs->rsp, regs->rflags,
+          __vmread(GUEST_CR0));
+
+    /* guest_kernel_mode() needs cs; read_segment_register needs the others */
+    read_vmcs_selectors(regs);
+
+    switch ( (uint16_t)exit_reason )
+    {
+    /* NMI and machine_check are handled by the caller; we handle the rest */
+    case EXIT_REASON_EXCEPTION_NMI:      /* 0 */
+        rc = vmxit_exception(regs);
+        break;
+
+    case EXIT_REASON_EXTERNAL_INTERRUPT: /* 1 */
+        break;              /* handled in vmx_vmexit_handler() */
+
+    case EXIT_REASON_PENDING_VIRT_INTR:  /* 7 */
+        /* Disable the interrupt window. */
+        v->arch.hvm_vmx.exec_control &= ~CPU_BASED_VIRTUAL_INTR_PENDING;
+        __vmwrite(CPU_BASED_VM_EXEC_CONTROL, v->arch.hvm_vmx.exec_control);
+        break;
+
+    case EXIT_REASON_CPUID:              /* 10 */
+        pv_cpuid(regs);
+        vmx_update_guest_eip();
+        break;
+
+    case EXIT_REASON_HLT:                /* 12 */
+        vmx_update_guest_eip();
+        hvm_hlt(regs->eflags);
+        break;
+
+    case EXIT_REASON_VMCALL:             /* 18 */
+        rc = vmxit_vmcall(regs);
+        break;
+
+    case EXIT_REASON_CR_ACCESS:          /* 28 */
+        rc = vmxit_cr_access(regs);
+        break;
+
+    case EXIT_REASON_DR_ACCESS:          /* 29 */
+        exit_qualification = __vmread(EXIT_QUALIFICATION);
+        vmx_dr_access(exit_qualification, regs);
+        break;
+
+    case EXIT_REASON_IO_INSTRUCTION:     /* 30 */
+        vmxit_io_instr(regs);
+        break;
+
+    case EXIT_REASON_MSR_READ:           /* 31 */
+        rc = vmxit_msr_read(regs);
+        break;
+
+    case EXIT_REASON_MSR_WRITE:          /* 32 */
+        rc = vmxit_msr_write(regs);
+        break;
+
+    case EXIT_REASON_MONITOR_TRAP_FLAG:  /* 37 */
+        rc = vmxit_mtf(regs);
+        break;
+
+    case EXIT_REASON_MCE_DURING_VMENTRY: /* 41 */
+        break;              /* handled in vmx_vmexit_handler() */
+
+    case EXIT_REASON_EPT_VIOLATION:      /* 48 */
+    {
+        paddr_t gpa = __vmread(GUEST_PHYSICAL_ADDRESS);
+        exit_qualification = __vmread(EXIT_QUALIFICATION);
+        rc = pvh_ept_handle_violation(exit_qualification, gpa, regs);
+        break;
+    }
+
+    default:
+        rc = 1;
+        gdprintk(XENLOG_G_ERR,
+                 "PVH: Unexpected exit reason:0x%x\n", exit_reason);
+    }
+
+    if ( rc )
+    {
+        exit_qualification = __vmread(EXIT_QUALIFICATION);
+        gdprintk(XENLOG_G_WARNING,
+                 "PVH: [%d] exit_reas:%d 0x%x qual:%ld 0x%lx cr0:0x%016lx\n",
+                 ccpu, exit_reason, exit_reason, exit_qualification,
+                 exit_qualification, __vmread(GUEST_CR0));
+        gdprintk(XENLOG_G_WARNING, "PVH: RIP:%lx RSP:%lx EFLAGS:%lx CR3:%lx\n",
+                 regs->rip, regs->rsp, regs->rflags, __vmread(GUEST_CR3));
+        domain_crash_synchronous();
+    }
+}
+
+/*
+ * Sets info for a non-boot SMP vcpu; vcpu 0's context is set by the library.
+ * In the case of Linux, the call comes from cpu_initialize_context().
+ */
+int vmx_pvh_set_vcpu_info(struct vcpu *v, struct vcpu_guest_context *ctxtp)
+{
+    if ( v->vcpu_id == 0 )
+        return 0;
+
+    vmx_vmcs_enter(v);
+    __vmwrite(GUEST_GDTR_BASE, ctxtp->gdt.pvh.addr);
+    __vmwrite(GUEST_GDTR_LIMIT, ctxtp->gdt.pvh.limit);
+    __vmwrite(GUEST_GS_BASE, ctxtp->gs_base_user);
+
+    __vmwrite(GUEST_CS_SELECTOR, ctxtp->user_regs.cs);
+    __vmwrite(GUEST_DS_SELECTOR, ctxtp->user_regs.ds);
+    __vmwrite(GUEST_ES_SELECTOR, ctxtp->user_regs.es);
+    __vmwrite(GUEST_SS_SELECTOR, ctxtp->user_regs.ss);
+    __vmwrite(GUEST_GS_SELECTOR, ctxtp->user_regs.gs);
+
+    if ( vmx_add_guest_msr(MSR_SHADOW_GS_BASE) )
+    {
+        vmx_vmcs_exit(v);
+        return -EINVAL;
+    }
+    vmx_write_guest_msr(MSR_SHADOW_GS_BASE, ctxtp->gs_base_kernel);
+
+    vmx_vmcs_exit(v);
+    return 0;
+}
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 6fc0965..c9b5e23 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -471,6 +471,8 @@ void setup_ept_dump(void);
 
 void vmx_update_guest_eip(void);
 void vmx_dr_access(unsigned long exit_qualification,struct cpu_user_regs *regs);
+void vmx_pvh_vmexit_handler(struct cpu_user_regs *regs);
+int  vmx_pvh_set_vcpu_info(struct vcpu *v, struct vcpu_guest_context *ctxtp);
 
 int alloc_p2m_hap_data(struct p2m_domain *p2m);
 void free_p2m_hap_data(struct p2m_domain *p2m);
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 14/20] PVH xen: some misc changes like mtrr, intr, msi...
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (12 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 13/20] PVH xen: introduce vmx_pvh.c Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 15/20] PVH xen: hcall page initialize, create PVH guest type, etc Mukesh Rathor
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

Changes in irq.c because PVH doesn't use vlapic emulation. In mtrr.c we add
an assert and set the MTRR types for PVH.
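
Since only TSC_MODE_NEVER_EMULATE is honored for now, a PVH guest config
should request the native tsc explicitly; any other mode is overridden with
the warning added below. Assuming xl's usual naming for that mode:

    # vm.cfg: rdtsc/rdtscp run natively, no emulation (TSC_MODE_NEVER_EMULATE)
    tsc_mode = "native"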

Changes in V2:
   - Some cleanup of redundant code.
   - time.c: Honor no rdtsc exiting for PVH by setting vtsc to 0 in time.c

Changes in V3:
   - Dont check for pvh in making mmio rangesets readonly.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/hvm/irq.c      |    3 +++
 xen/arch/x86/hvm/mtrr.c     |   11 +++++++++++
 xen/arch/x86/hvm/vmx/intr.c |    7 ++++---
 xen/arch/x86/time.c         |    9 +++++++++
 4 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
index 9eae5de..92fb245 100644
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -405,6 +405,9 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
          && vcpu_info(v, evtchn_upcall_pending) )
         return hvm_intack_vector(plat->irq.callback_via.vector);
 
+    if ( is_pvh_vcpu(v) )
+        return hvm_intack_none;
+
     if ( vlapic_accept_pic_intr(v) && plat->vpic[0].int_output )
         return hvm_intack_pic(0);
 
diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index ef51a8d..f088ce0 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -578,6 +578,9 @@ int32_t hvm_set_mem_pinned_cacheattr(
 {
     struct hvm_mem_pinned_cacheattr_range *range;
 
+    /* PVH note: The guest writes to MSR_IA32_CR_PAT natively */
+    ASSERT( !is_pvh_domain(d) );
+
     if ( !((type == PAT_TYPE_UNCACHABLE) ||
            (type == PAT_TYPE_WRCOMB) ||
            (type == PAT_TYPE_WRTHROUGH) ||
@@ -693,6 +696,14 @@ uint8_t epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn,
          ((d->vcpu == NULL) || ((v = d->vcpu[0]) == NULL)) )
         return MTRR_TYPE_WRBACK;
 
+    /* PVH fixme: Add support for more memory types */
+    if ( is_pvh_domain(d) )
+    {
+        if ( direct_mmio )
+            return MTRR_TYPE_UNCACHABLE;
+        return MTRR_TYPE_WRBACK;
+    }
+
     if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] )
         return MTRR_TYPE_WRBACK;
 
diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
index e376f3c..b94f9d5 100644
--- a/xen/arch/x86/hvm/vmx/intr.c
+++ b/xen/arch/x86/hvm/vmx/intr.c
@@ -219,15 +219,16 @@ void vmx_intr_assist(void)
         return;
     }
 
-    /* Crank the handle on interrupt state. */
-    pt_vector = pt_update_irq(v);
+    if ( !is_pvh_vcpu(v) )
+        /* Crank the handle on interrupt state. */
+        pt_vector = pt_update_irq(v);
 
     do {
         intack = hvm_vcpu_has_pending_irq(v);
         if ( likely(intack.source == hvm_intsrc_none) )
             goto out;
 
-        if ( unlikely(nvmx_intr_intercept(v, intack)) )
+        if ( !is_pvh_vcpu(v) && unlikely(nvmx_intr_intercept(v, intack)) )
             goto out;
 
         intblk = hvm_interrupt_blocked(v, intack);
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 6e94847..484eb07 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1933,6 +1933,15 @@ void tsc_set_info(struct domain *d,
         d->arch.vtsc = 0;
         return;
     }
+    if ( is_pvh_domain(d) && tsc_mode != TSC_MODE_NEVER_EMULATE )
+    {
+        /* PVH fixme: support more tsc modes */
+        dprintk(XENLOG_WARNING,
+                "PVH currently does not support tsc emulation. Setting it "
+                "to no emulation\n");
+        d->arch.vtsc = 0;
+        return;
+    }
 
     switch ( d->arch.tsc_mode = tsc_mode )
     {
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 15/20] PVH xen: hcall page initialize, create PVH guest type, etc...
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (13 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 14/20] PVH xen: some misc changes like mtrr, intr, msi Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 16/20] PVH xen: Miscellaneous changes Mukesh Rathor
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

Create the hypercall page for PVH the same way as for HVM. Set the PVH guest
type when a PV guest is created with HAP, and make some other changes in
traps.c to support PVH.
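
For reference, the HVM-style hypercall page that PVH now also gets holds one
32-byte stub per hypercall, roughly as below (a sketch of what the existing
VMX init_hypercall_page hook emits for slot i, not code added by this patch):

    char *p = (char *)hypercall_page + (i * 32);

    *(u8  *)(p + 0) = 0xb8;    /* mov $i, %eax */
    *(u32 *)(p + 1) = i;
    *(u8  *)(p + 5) = 0x0f;    /* vmcall == 0f 01 c1 (AMD's vmmcall differs) */
    *(u8  *)(p + 6) = 0x01;
    *(u8  *)(p + 7) = 0xc1;
    *(u8  *)(p + 8) = 0xc3;    /* ret */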

Changes in V2:
  - Fix emulate_forced_invalid_op() to use proper copy function, and inject PF
    in case it fails.
  - remove extraneous PVH check in STI/CLI ops in emulate_privileged_op().
  - Make assert a debug ASSERT in show_registers().
  - debug.c: keep get_gfn() locked and move put_gfn closer to it.

Changes in V3:
  - Mostly formatting.

Changes in V5:
  - emulation of forced invalid op redone, and in a separate patch.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/traps.c        |   19 ++++++++++++++++++-
 xen/arch/x86/x86_64/traps.c |    8 +++++---
 xen/common/domain.c         |    9 +++++++++
 xen/common/domctl.c         |    5 +++++
 xen/common/kernel.c         |    6 +++++-
 5 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 663e351..6bd4659 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -459,6 +459,10 @@ static void instruction_done(
     struct cpu_user_regs *regs, unsigned long eip, unsigned int bpmatch)
 {
     regs->eip = eip;
+
+    if ( is_pvh_vcpu(current) )
+        return;
+
     regs->eflags &= ~X86_EFLAGS_RF;
     if ( bpmatch || (regs->eflags & X86_EFLAGS_TF) )
     {
@@ -475,6 +479,9 @@ static unsigned int check_guest_io_breakpoint(struct vcpu *v,
     unsigned int width, i, match = 0;
     unsigned long start;
 
+    if ( is_pvh_vcpu(v) )
+        return 0;          /* PVH fixme: support io breakpoint */
+
     if ( !(v->arch.debugreg[5]) ||
          !(v->arch.pv_vcpu.ctrlreg[4] & X86_CR4_DE) )
         return 0;
@@ -1620,6 +1627,13 @@ static int guest_io_okay(
     int user_mode = !(v->arch.flags & TF_kernel_mode);
 #define TOGGLE_MODE() if ( user_mode ) toggle_guest_mode(v)
 
+    /*
+     * For PVH we check this in vmexit for EXIT_REASON_IO_INSTRUCTION
+     * and so don't need to check again here.
+     */
+    if ( is_pvh_vcpu(v) )
+        return 1;
+
     if ( !vm86_mode(regs) &&
          (v->arch.pv_vcpu.iopl >= (guest_kernel_mode(v, regs) ? 1 : 3)) )
         return 1;
@@ -1865,7 +1879,7 @@ static inline uint64_t guest_misc_enable(uint64_t val)
         _ptr = (unsigned int)_ptr;                                          \
     if ( (limit) < sizeof(_x) - 1 || (eip) > (limit) - (sizeof(_x) - 1) )   \
         goto fail;                                                          \
-    if ( (_rc = copy_from_user(&_x, (type *)_ptr, sizeof(_x))) != 0 )       \
+    if ( (_rc = raw_copy_from_guest(&_x, (type *)_ptr, sizeof(_x))) != 0 )  \
     {                                                                       \
         propagate_page_fault(_ptr + sizeof(_x) - _rc, 0);                   \
         goto skip;                                                          \
@@ -3315,6 +3329,9 @@ void do_device_not_available(struct cpu_user_regs *regs)
 
     BUG_ON(!guest_mode(regs));
 
+    /* PVH should not get here (ctrlreg is not implemented for it). */
+    ASSERT(!is_pvh_vcpu(curr));
+
     vcpu_restore_fpu_lazy(curr);
 
     if ( curr->arch.pv_vcpu.ctrlreg[0] & X86_CR0_TS )
diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
index d2f7209..0df1e1c 100644
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -146,8 +146,8 @@ void vcpu_show_registers(const struct vcpu *v)
     const struct cpu_user_regs *regs = &v->arch.user_regs;
     unsigned long crs[8];
 
-    /* No need to handle HVM for now. */
-    if ( is_hvm_vcpu(v) )
+    /* No need to handle HVM and PVH for now. */
+    if ( !is_pv_vcpu(v) )
         return;
 
     crs[0] = v->arch.pv_vcpu.ctrlreg[0];
@@ -440,6 +440,8 @@ static long register_guest_callback(struct callback_register *reg)
     long ret = 0;
     struct vcpu *v = current;
 
+    ASSERT(!is_pvh_vcpu(v));
+
     if ( !is_canonical_address(reg->address) )
         return -EINVAL;
 
@@ -620,7 +622,7 @@ static void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
 void hypercall_page_initialise(struct domain *d, void *hypercall_page)
 {
     memset(hypercall_page, 0xCC, PAGE_SIZE);
-    if ( is_hvm_domain(d) )
+    if ( !is_pv_domain(d) )
         hvm_hypercall_page_initialise(d, hypercall_page);
     else if ( !is_pv_32bit_domain(d) )
         hypercall_page_initialise_ring3_kernel(hypercall_page);
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 10e1cab..a734755 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -235,6 +235,15 @@ struct domain *domain_create(
 
     if ( domcr_flags & DOMCRF_hvm )
         d->guest_type = is_hvm;
+    else if ( domcr_flags & DOMCRF_pvh )
+    {
+        if ( !(domcr_flags & DOMCRF_hap) )
+        {
+            printk(XENLOG_INFO "PVH guest must have HAP on\n");
+            goto fail;
+        }
+        d->guest_type = is_pvh;
+    }
 
     if ( domid == 0 )
     {
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 9bd8f80..f9c361d 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -187,6 +187,8 @@ void getdomaininfo(struct domain *d, struct xen_domctl_getdomaininfo *info)
 
     if ( is_hvm_domain(d) )
         info->flags |= XEN_DOMINF_hvm_guest;
+    else if ( is_pvh_domain(d) )
+        info->flags |= XEN_DOMINF_pvh_guest;
 
     xsm_security_domaininfo(d, info);
 
@@ -443,6 +445,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
         domcr_flags = 0;
         if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_hvm_guest )
             domcr_flags |= DOMCRF_hvm;
+        else if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_hap )
+            domcr_flags |= DOMCRF_pvh;     /* PV with HAP is a PVH guest */
+
         if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_hap )
             domcr_flags |= DOMCRF_hap;
         if ( op->u.createdomain.flags & XEN_DOMCTL_CDF_s3_integrity )
diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index 72fb905..3bba758 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -289,7 +289,11 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
             if ( current->domain == dom0 )
                 fi.submap |= 1U << XENFEAT_dom0;
 #ifdef CONFIG_X86
-            if ( !is_hvm_vcpu(current) )
+            if ( is_pvh_vcpu(current) )
+                fi.submap |= (1U << XENFEAT_hvm_safe_pvclock) |
+                             (1U << XENFEAT_supervisor_mode_kernel) |
+                             (1U << XENFEAT_hvm_callback_vector);
+            else if ( !is_hvm_vcpu(current) )
                 fi.submap |= (1U << XENFEAT_mmu_pt_update_preserve_ad) |
                              (1U << XENFEAT_highmem_assist) |
                              (1U << XENFEAT_gnttab_map_avail_bits);
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 16/20] PVH xen: Miscellaneous changes
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (14 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 15/20] PVH xen: hcall page initialize, create PVH guest type, etc Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15 11:53   ` Jan Beulich
  2013-05-15  0:52 ` [PATCH 17/20] PVH xen: Introduce p2m_map_foreign Mukesh Rathor
                   ` (3 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

This patch contains misc changes like restricting iobitmap calls for PVH,
restricting 32bit PVH guests, etc.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/domain.c      |    7 +++++++
 xen/arch/x86/domain_page.c |   10 +++++-----
 xen/arch/x86/domctl.c      |   19 +++++++++++++------
 xen/arch/x86/mm.c          |    2 +-
 xen/arch/x86/physdev.c     |   13 +++++++++++++
 xen/common/grant_table.c   |    4 ++--
 6 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 4883fd1..21382eb 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -339,6 +339,13 @@ int switch_compat(struct domain *d)
 
     if ( d == NULL )
         return -EINVAL;
+
+    if ( is_pvh_domain(d) )
+    {
+        dprintk(XENLOG_G_ERR,
+                "Xen does not currently support 32bit PVH guests\n");
+        return -EINVAL;
+    }
     if ( !may_switch_mode(d) )
         return -EACCES;
     if ( is_pv_32on64_domain(d) )
diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
index efda6af..7685416 100644
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -34,7 +34,7 @@ static inline struct vcpu *mapcache_current_vcpu(void)
      * then it means we are running on the idle domain's page table and must
      * therefore use its mapcache.
      */
-    if ( unlikely(pagetable_is_null(v->arch.guest_table)) && !is_hvm_vcpu(v) )
+    if ( unlikely(pagetable_is_null(v->arch.guest_table)) && is_pv_vcpu(v) )
     {
         /* If we really are idling, perform lazy context switch now. */
         if ( (v = idle_vcpu[smp_processor_id()]) == current )
@@ -71,7 +71,7 @@ void *map_domain_page(unsigned long mfn)
 #endif
 
     v = mapcache_current_vcpu();
-    if ( !v || is_hvm_vcpu(v) )
+    if ( !v || !is_pv_vcpu(v) )
         return mfn_to_virt(mfn);
 
     dcache = &v->domain->arch.pv_domain.mapcache;
@@ -175,7 +175,7 @@ void unmap_domain_page(const void *ptr)
     ASSERT(va >= MAPCACHE_VIRT_START && va < MAPCACHE_VIRT_END);
 
     v = mapcache_current_vcpu();
-    ASSERT(v && !is_hvm_vcpu(v));
+    ASSERT(v && is_pv_vcpu(v));
 
     dcache = &v->domain->arch.pv_domain.mapcache;
     ASSERT(dcache->inuse);
@@ -242,7 +242,7 @@ int mapcache_domain_init(struct domain *d)
     struct mapcache_domain *dcache = &d->arch.pv_domain.mapcache;
     unsigned int bitmap_pages;
 
-    if ( is_hvm_domain(d) || is_idle_domain(d) )
+    if ( !is_pv_domain(d) || is_idle_domain(d) )
         return 0;
 
 #ifdef NDEBUG
@@ -273,7 +273,7 @@ int mapcache_vcpu_init(struct vcpu *v)
     unsigned int ents = d->max_vcpus * MAPCACHE_VCPU_ENTRIES;
     unsigned int nr = PFN_UP(BITS_TO_LONGS(ents) * sizeof(long));
 
-    if ( is_hvm_vcpu(v) || !dcache->inuse )
+    if ( !is_pv_vcpu(v) || !dcache->inuse )
         return 0;
 
     if ( ents > dcache->entries )
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index c5a6f6f..3604816 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -64,9 +64,10 @@ long domctl_memory_mapping(struct domain *d, unsigned long gfn,
 
     if ( add_map )
     {
-        printk(XENLOG_G_INFO
-               "memory_map:add: dom%d gfn=%lx mfn=%lx nr=%lx\n",
-               d->domain_id, gfn, mfn, nr_mfns);
+        if ( !is_pvh_domain(d) )     /* PVH maps lots and lots */
+            printk(XENLOG_G_INFO
+                   "memory_map:add: dom%d gfn=%lx mfn=%lx nr=%lx\n",
+                   d->domain_id, gfn, mfn, nr_mfns);
 
         ret = iomem_permit_access(d, mfn, mfn + nr_mfns - 1);
         if ( !ret && paging_mode_translate(d) )
@@ -91,9 +92,10 @@ long domctl_memory_mapping(struct domain *d, unsigned long gfn,
     }
     else
     {
-        printk(XENLOG_G_INFO
-               "memory_map:remove: dom%d gfn=%lx mfn=%lx nr=%lx\n",
-               d->domain_id, gfn, mfn, nr_mfns);
+        if ( !is_pvh_domain(d) )     /* PVH unmaps lots and lots */
+            printk(XENLOG_G_INFO
+                   "memory_map:remove: dom%d gfn=%lx mfn=%lx nr=%lx\n",
+                   d->domain_id, gfn, mfn, nr_mfns);
 
         if ( paging_mode_translate(d) )
             for ( i = 0; i < nr_mfns; i++ )
@@ -1304,6 +1306,11 @@ void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c)
             c.nat->gs_base_kernel = hvm_get_shadow_gs_base(v);
         }
     }
+    else if ( is_pvh_vcpu(v) )
+    {
+        /* pvh fixme: punt it to phase II */
+        printk(XENLOG_WARNING "PVH: fixme: arch_get_info_guest()\n");
+    }
     else
     {
         c(ldt_base = v->arch.pv_vcpu.ldt_base);
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index ef37053..88c6f0c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2805,7 +2805,7 @@ static struct domain *get_pg_owner(domid_t domid)
         goto out;
     }
 
-    if ( unlikely(paging_mode_translate(curr)) )
+    if ( !is_pvh_domain(curr) && unlikely(paging_mode_translate(curr)) )
     {
         MEM_LOG("Cannot mix foreign mappings with translated domains");
         goto out;
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index eb8a407..520824b 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -475,6 +475,13 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     case PHYSDEVOP_set_iopl: {
         struct physdev_set_iopl set_iopl;
+
+        if ( is_pvh_vcpu(current) )
+        {
+            ret = -EINVAL;
+            break;
+        }
+
         ret = -EFAULT;
         if ( copy_from_guest(&set_iopl, arg, 1) != 0 )
             break;
@@ -488,6 +495,12 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     case PHYSDEVOP_set_iobitmap: {
         struct physdev_set_iobitmap set_iobitmap;
+
+        if ( is_pvh_vcpu(current) )
+        {
+            ret = -EINVAL;
+            break;
+        }
         ret = -EFAULT;
         if ( copy_from_guest(&set_iobitmap, arg, 1) != 0 )
             break;
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 3f97328..a2073d2 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -721,7 +721,7 @@ __gnttab_map_grant_ref(
 
     double_gt_lock(lgt, rgt);
 
-    if ( !is_hvm_domain(ld) && need_iommu(ld) )
+    if ( is_pv_domain(ld) && need_iommu(ld) )
     {
         unsigned int wrc, rdc;
         int err = 0;
@@ -932,7 +932,7 @@ __gnttab_unmap_common(
             act->pin -= GNTPIN_hstw_inc;
     }
 
-    if ( !is_hvm_domain(ld) && need_iommu(ld) )
+    if ( is_pv_domain(ld) && need_iommu(ld) )
     {
         unsigned int wrc, rdc;
         int err = 0;
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 17/20] PVH xen: Introduce p2m_map_foreign
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (15 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 16/20] PVH xen: Miscellaneous changes Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15 11:55   ` Jan Beulich
  2013-05-15  0:52 ` [PATCH 18/20] PVH xen: Add and remove foreign pages Mukesh Rathor
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

In this patch, a new type, p2m_map_foreign, is introduced for pages that
dom0 maps from the foreign domains it is creating.
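
The reason foreign pages need their own type shows up in p2m_remove_page()
below: for ordinary RAM the mfn's m2p entry belongs to this domain, but for
a p2m_map_foreign entry it belongs to the page's owner, so it must be left
alone on removal:

    /* foreign pages join grant/shared pages in skipping the m2p update */
    if ( !p2m_is_grant(t) && !p2m_is_shared(t) && !p2m_is_foreign(t) )
        set_gpfn_from_mfn(mfn + i, INVALID_M2P_ENTRY);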

Changes in V2:
   - Make guest_physmap_add_entry() same for PVH in terms of overwriting old
     entry.
   - In set_foreign_p2m_entry() do locked get_gfn and not unlocked.
   - Replace ASSERT with return -EINVAL in do_physdev_op.
   - Remove unnecessary check for PVH in do_physdev_op().

Changes in V3:
   - remove changes unrelated to this patch.

Changes in V5:
   - remove mmio check for highest gfn tracking.
   - set_foreign_p2m_entry looks same as set_mmio_p2m_entry, so make a common
     function for both.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/mm/p2m-ept.c |    1 +
 xen/arch/x86/mm/p2m-pt.c  |    1 +
 xen/arch/x86/mm/p2m.c     |   28 ++++++++++++++++++++--------
 xen/include/asm-x86/p2m.h |    4 ++++
 4 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 595c6e7..67c200c 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -75,6 +75,7 @@ static void ept_p2m_type_to_flags(ept_entry_t *entry, p2m_type_t type, p2m_acces
             entry->w = 0;
             break;
         case p2m_grant_map_rw:
+        case p2m_map_foreign:
             entry->r = entry->w = 1;
             entry->x = 0;
             break;
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 302b621..021a6af 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -89,6 +89,7 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn)
     case p2m_ram_rw:
         return flags | P2M_BASE_FLAGS | _PAGE_RW;
     case p2m_grant_map_rw:
+    case p2m_map_foreign:
         return flags | P2M_BASE_FLAGS | _PAGE_RW | _PAGE_NX_BIT;
     case p2m_mmio_direct:
         if ( !rangeset_contains_singleton(mmio_ro_ranges, mfn_x(mfn)) )
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index f5ddd20..6f545ff 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -523,7 +523,7 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn, unsigned long mfn,
         for ( i = 0; i < (1UL << page_order); i++ )
         {
             mfn_return = p2m->get_entry(p2m, gfn + i, &t, &a, 0, NULL);
-            if ( !p2m_is_grant(t) && !p2m_is_shared(t) )
+            if ( !p2m_is_grant(t) && !p2m_is_shared(t) && !p2m_is_foreign(t) )
                 set_gpfn_from_mfn(mfn+i, INVALID_M2P_ENTRY);
             ASSERT( !p2m_is_valid(t) || mfn + i == mfn_x(mfn_return) );
         }
@@ -754,10 +754,9 @@ void p2m_change_type_range(struct domain *d,
     p2m_unlock(p2m);
 }
 
-
-
-int
-set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
+static int
+set_typed_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
+                    p2m_type_t gfn_p2mt)
 {
     int rc = 0;
     p2m_access_t a;
@@ -782,16 +781,29 @@ set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
         set_gpfn_from_mfn(mfn_x(omfn), INVALID_M2P_ENTRY);
     }
 
-    P2M_DEBUG("set mmio %lx %lx\n", gfn, mfn_x(mfn));
-    rc = set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_mmio_direct, p2m->default_access);
+    P2M_DEBUG("set %d %lx %lx\n", gfn_p2mt, gfn, mfn_x(mfn));
+    rc = set_p2m_entry(p2m, gfn, mfn, PAGE_ORDER_4K, gfn_p2mt, 
+                       p2m->default_access);
     gfn_unlock(p2m, gfn, 0);
     if ( 0 == rc )
         gdprintk(XENLOG_ERR,
-            "set_mmio_p2m_entry: set_p2m_entry failed! mfn=%08lx\n",
+            "%s: set_p2m_entry failed! mfn=%08lx\n", __func__,
             mfn_x(get_gfn_query_unlocked(p2m->domain, gfn, &ot)));
     return rc;
 }
 
+/* Returns: nonzero for success, 0 for failure. */
+int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
+{
+    return set_typed_p2m_entry(d, gfn, mfn, p2m_map_foreign);
+}
+
+int
+set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
+{
+    return set_typed_p2m_entry(d, gfn, mfn, p2m_mmio_direct);
+}
+
 int
 clear_mmio_p2m_entry(struct domain *d, unsigned long gfn)
 {
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 43583b2..6fc71a1 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -70,6 +70,7 @@ typedef enum {
     p2m_ram_paging_in = 11,       /* Memory that is being paged in */
     p2m_ram_shared = 12,          /* Shared or sharable memory */
     p2m_ram_broken = 13,          /* Broken page, access cause domain crash */
+    p2m_map_foreign = 14,         /* ram pages from foreign domain */
 } p2m_type_t;
 
 /*
@@ -180,6 +181,7 @@ typedef unsigned int p2m_query_t;
 #define p2m_is_sharable(_t) (p2m_to_mask(_t) & P2M_SHARABLE_TYPES)
 #define p2m_is_shared(_t)   (p2m_to_mask(_t) & P2M_SHARED_TYPES)
 #define p2m_is_broken(_t)   (p2m_to_mask(_t) & P2M_BROKEN_TYPES)
+#define p2m_is_foreign(_t)  (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign))
 
 /* Per-p2m-table state */
 struct p2m_domain {
@@ -510,6 +512,8 @@ p2m_type_t p2m_change_type(struct domain *d, unsigned long gfn,
 int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
 int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn);
 
+/* Set foreign mfn in the current guest's p2m table. */
+int set_foreign_p2m_entry(struct domain *domp, unsigned long gfn, mfn_t mfn);
 
 /* 
  * Populate-on-demand
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 18/20] PVH xen: Add and remove foreign pages
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (16 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 17/20] PVH xen: Introduce p2m_map_foreign Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15 12:05   ` Jan Beulich
  2013-05-15  0:52 ` [PATCH 19/20] PVH xen: elf and iommu related changes to prep for dom0 PVH Mukesh Rathor
  2013-05-15  0:52 ` [PATCH 20/20] PVH xen: PVH dom0 creation Mukesh Rathor
  19 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

In this patch, a new function, xenmem_add_foreign_to_pmap(), is added
to map pages from a foreign guest into the current dom0 for domU creation.
Also, allow XENMEM_remove_from_physmap to remove p2m_map_foreign
pages. Note, in this path, we must release the refcount that was taken
during the map phase.
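
On the consumer side, privcmd/libxl on a PVH dom0 reach this through the
XENMEM_add_to_physmap_range interface added in patch 02 of this series. A
rough sketch of the call (field names follow that interface; treat it as
illustrative, not as the final toolstack code):

    xen_ulong_t idx  = fgfn;      /* gfn in the foreign domU */
    xen_pfn_t   gpfn = slot;      /* free slot in dom0's own physmap */
    int         err  = 0;
    struct xen_add_to_physmap_range xatpr = {
        .domid         = DOMID_SELF,             /* edit dom0's physmap */
        .space         = XENMAPSPACE_gmfn_foreign,
        .size          = 1,
        .foreign_domid = domu_domid,             /* owner of the page */
    };

    set_xen_guest_handle(xatpr.idxs, &idx);
    set_xen_guest_handle(xatpr.gpfns, &gpfn);
    set_xen_guest_handle(xatpr.errs, &err);
    rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap_range, &xatpr);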

Changes in V2:
  - Move the XENMEM_remove_from_physmap changes here instead of prev patch
  - Move grant changes from this to one of the next patches.
  - In xenmem_add_foreign_to_pmap(), do locked get_gfn
  - Fail the mappings when qemu maps pages for memory that is not there.

Changes in V3:
  - remove mmio pages.
  - remove unrelated changes.
  - cleanup both add and remove.

Changes in V5:
  - add a comment in public/memory.h

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/mm.c           |   80 +++++++++++++++++++++++++++++++++++++++++++
 xen/common/memory.c         |   38 ++++++++++++++++++---
 xen/include/public/memory.h |    2 +-
 3 files changed, 114 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 88c6f0c..501b510 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4520,6 +4520,78 @@ static int handle_iomem_range(unsigned long s, unsigned long e, void *p)
     return 0;
 }
 
+/*
+ * Add frames from a foreign domain to the current domain's physmap.
+ * Similar to XENMAPSPACE_gmfn, but the frame is foreign, is mapped into
+ * the current domain, and is not removed from the foreign domain.
+ * Usage: libxl on pvh dom0 creating a guest and doing privcmd_ioctl_mmap.
+ * Side Effect: the mfn for fgfn will be refcounted so it is not lost
+ *              while mapped here. The refcnt is released in do_memory_op()
+ *              via XENMEM_remove_from_physmap.
+ * Returns: 0 ==> success
+ */
+static int xenmem_add_foreign_to_pmap(domid_t foreign_domid,
+                                      unsigned long fgfn, unsigned long gpfn)
+{
+    p2m_type_t p2mt, p2mt_prev;
+    int rc = 0;
+    unsigned long prev_mfn, mfn = 0;
+    struct domain *fdom, *currd = current->domain;
+    struct page_info *page = NULL;
+
+    if ( currd->domain_id == foreign_domid || foreign_domid == DOMID_SELF ||
+         !is_pvh_domain(currd) )
+        return -EINVAL;
+
+    if ( !IS_PRIV(currd) || (fdom = get_pg_owner(foreign_domid)) == NULL )
+        return -EPERM;
+
+    /* The following will take a refcnt on the mfn. */
+    page = get_page_from_gfn(fdom, fgfn, &p2mt, P2M_ALLOC);
+    if ( !page || !p2m_is_valid(p2mt) )
+    {
+        if ( page )
+            put_page(page);
+        put_pg_owner(fdom);
+        return -EINVAL;
+    }
+    mfn = page_to_mfn(page);
+
+    /* Remove previously mapped page if it is present. */
+    prev_mfn = mfn_x(get_gfn(currd, gpfn, &p2mt_prev));
+    if ( mfn_valid(prev_mfn) )
+    {
+        if ( is_xen_heap_mfn(prev_mfn) )
+            /* Xen heap frames are simply unhooked from this phys slot */
+            guest_physmap_remove_page(currd, gpfn, prev_mfn, 0);
+        else
+            /* Normal domain memory is freed, to avoid leaking memory. */
+            guest_remove_page(currd, gpfn);
+    }
+    /*
+     * Create the new mapping. Can't use guest_physmap_add_page() because it
+     * will update the m2p table, which would result in mfn -> gpfn of dom0
+     * and not fgfn of domU.
+     */
+    if ( set_foreign_p2m_entry(currd, gpfn, _mfn(mfn)) == 0 )
+    {
+        dprintk(XENLOG_WARNING,
+                "guest_physmap_add_page failed. gpfn:%lx mfn:%lx fgfn:%lx\n",
+                gpfn, mfn, fgfn);
+        put_page(page);
+        rc = -EINVAL;
+    }
+
+    /*
+     * We must do this put_gfn after set_foreign_p2m_entry so another cpu
+     * doesn't populate the gpfn before us.
+     */
+    put_gfn(currd, gpfn);
+
+    put_pg_owner(fdom);
+    return rc;
+}
+
 static int xenmem_add_to_physmap_once(
     struct domain *d,
     const struct xen_add_to_physmap *xatp,
@@ -4582,6 +4654,14 @@ static int xenmem_add_to_physmap_once(
             page = mfn_to_page(mfn);
             break;
         }
+
+        case XENMAPSPACE_gmfn_foreign:
+        {
+            rc = xenmem_add_foreign_to_pmap(foreign_domid, xatp->idx,
+                                            xatp->gpfn);
+            return rc;
+        }
+
         default:
             break;
     }
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 68501d1..a321d33 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -675,9 +675,11 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     case XENMEM_remove_from_physmap:
     {
+        unsigned long mfn;
         struct xen_remove_from_physmap xrfp;
         struct page_info *page;
-        struct domain *d;
+        struct domain *d, *foreign_dom = NULL;
+        p2m_type_t p2mt, tp;
 
         if ( copy_from_guest(&xrfp, arg, 1) )
             return -EFAULT;
@@ -695,11 +697,37 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
         domain_lock(d);
 
-        page = get_page_from_gfn(d, xrfp.gpfn, NULL, P2M_ALLOC);
-        if ( page )
+        /*
+         * If PVH, the gfn could be mapped to an mfn from a foreign domain
+         * by the user space tool during domain creation. We need to check
+         * for that, free it up from the p2m, and release the refcnt on it.
+         * In such a case, page would be NULL and the following call would
+         * not have refcnt'd the page. See also xenmem_add_foreign_to_pmap().
+         */
+        page = get_page_from_gfn(d, xrfp.gpfn, &p2mt, P2M_ALLOC);
+
+        if ( page || p2m_is_foreign(p2mt) )
         {
-            guest_physmap_remove_page(d, xrfp.gpfn, page_to_mfn(page), 0);
-            put_page(page);
+            if ( page )
+                mfn = page_to_mfn(page);
+            else
+            {
+                mfn = mfn_x(get_gfn_query(d, xrfp.gpfn, &tp));
+                foreign_dom = page_get_owner(mfn_to_page(mfn));
+                ASSERT(is_pvh_domain(d));
+                ASSERT(d != foreign_dom);
+                ASSERT(p2m_is_foreign(tp));
+            }
+
+            guest_physmap_remove_page(d, xrfp.gpfn, mfn, 0);
+            if (page)
+                put_page(page);
+
+            if ( p2m_is_foreign(p2mt) )
+            {
+                put_page(mfn_to_page(mfn));
+                put_gfn(d, xrfp.gpfn);
+            }
         }
         else
             rc = -ENOENT;
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 51d5254..b496551 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -208,7 +208,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_machphys_mapping_t);
 #define XENMAPSPACE_gmfn_range   3 /* GMFN range, XENMEM_add_to_physmap only. */
 #define XENMAPSPACE_gmfn_foreign 4 /* GMFN from another dom,
                                     * XENMEM_add_to_physmap_range only.
-                                    */
+                                    * (PVH x86 only) */
 /* ` } */
 
 /*
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 19/20] PVH xen: elf and iommu related changes to prep for dom0 PVH
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (17 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 18/20] PVH xen: Add and remove foreign pages Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  2013-05-15 12:12   ` Jan Beulich
  2013-05-15  0:52 ` [PATCH 20/20] PVH xen: PVH dom0 creation Mukesh Rathor
  19 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

This patch prepares for dom0 PVH by making some changes in the elf
code, such as using a different copy function for PVH. Also, add a
check in iommu.c to ensure the iommu is enabled for PVH dom0.

Changes in V2: None

Changes in V3:
   - introduce early_pvh_copy_or_zero() to replace dbg_rw_mem().

Changes in V5:
   - make v_start static and not pass it around.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/domain_build.c       |   40 ++++++++++++++++++++++++++++++++++++-
 xen/arch/x86/setup.c              |    4 +++
 xen/common/libelf/libelf-loader.c |   18 ++++++++++++++++
 xen/drivers/passthrough/iommu.c   |   24 ++++++++++++++++++++-
 xen/include/asm-x86/pvh.h         |    6 +++++
 5 files changed, 89 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 9980ea2..c5a0e0c 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -35,12 +35,14 @@
 #include <asm/setup.h>
 #include <asm/bzimage.h> /* for bzimage_parse */
 #include <asm/io_apic.h>
+#include <asm/pvh.h>
 
 #include <public/version.h>
 
 static long __initdata dom0_nrpages;
 static long __initdata dom0_min_nrpages;
 static long __initdata dom0_max_nrpages = LONG_MAX;
+static unsigned long __initdata v_start;
 
 /*
  * dom0_mem=[min:<min_amt>,][max:<max_amt>,][<amt>]
@@ -307,6 +309,43 @@ static void __init process_dom0_ioports_disable(void)
     }
 }
 
+ /*
+  * Copy or zero function for dom0 only during boot. This is because
+  * raw_copy_to_guest -> copy_to_user_hvm -> __hvm_copy needs current to
+  * point to the hvm/pvh vcpu, which is not fully set up yet.
+  *
+  * If src is NULL, then len bytes are zeroed.
+  */
+void __init early_pvh_copy_or_zero(unsigned long dest, const void *src, 
+                                   int len)
+{
+    while ( len > 0 )
+    {
+        char *va;
+        p2m_type_t gfntype;
+        unsigned long mfn, gfn, pagecnt;
+        struct domain *d = get_domain_by_id(0);
+
+        pagecnt = min_t(unsigned long, PAGE_SIZE - (dest & ~PAGE_MASK), len);
+
+        gfn = (dest - v_start) >> PAGE_SHIFT;
+        if ( (mfn = mfn_x(get_gfn_query(d, gfn, &gfntype))) == INVALID_MFN )
+            panic("Unable to get mfn for gfn:%lx\n", gfn);
+        put_gfn(d, gfn);
+
+        va = map_domain_page(mfn) + (dest & (PAGE_SIZE-1));
+        if ( src )
+            memcpy(va, src, pagecnt);
+        else
+            memset(va, 0, pagecnt);
+        unmap_domain_page(va);
+
+        dest += pagecnt;
+        src = src ? src + pagecnt : 0;
+        len -= pagecnt;
+    }
+}
+
 int __init construct_dom0(
     struct domain *d,
     const module_t *image, unsigned long image_headroom,
@@ -355,7 +394,6 @@ int __init construct_dom0(
     unsigned long vstack_end;
     unsigned long vpt_start;
     unsigned long vpt_end;
-    unsigned long v_start;
     unsigned long v_end;
 
     /* Machine address of next candidate page-table page. */
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index be541cb..6d35d1d 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -60,6 +60,10 @@ integer_param("maxcpus", max_cpus);
 static bool_t __initdata disable_smep;
 invbool_param("smep", disable_smep);
 
+/* Boot dom0 in PVH mode */
+bool_t __initdata opt_dom0pvh;
+boolean_param("dom0pvh", opt_dom0pvh);
+
 /* **** Linux config option: propagated to domain0. */
 /* "acpi=off":    Sisables both ACPI table parsing and interpreter. */
 /* "acpi=force":  Override the disable blacklist.                   */
diff --git a/xen/common/libelf/libelf-loader.c b/xen/common/libelf/libelf-loader.c
index 3cf9c59..8387b2e 100644
--- a/xen/common/libelf/libelf-loader.c
+++ b/xen/common/libelf/libelf-loader.c
@@ -16,6 +16,10 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
  */
 
+#ifdef __XEN__
+#include <asm/pvh.h>
+#endif
+
 #include "libelf-private.h"
 
 /* ------------------------------------------------------------------------ */
@@ -127,6 +131,16 @@ static int elf_load_image(void *dst, const void *src, uint64_t filesz, uint64_t
     int rc;
     if ( filesz > ULONG_MAX || memsz > ULONG_MAX )
         return -1;
+
+    if ( opt_dom0pvh )
+    {
+        unsigned long addr = (unsigned long)dst;
+        early_pvh_copy_or_zero(addr, src, filesz);
+        early_pvh_copy_or_zero(addr + filesz, NULL, memsz - filesz);
+
+        return 0;
+    }
+
     rc = raw_copy_to_guest(dst, src, filesz);
     if ( rc != 0 )
         return -1;
@@ -260,6 +274,10 @@ void elf_parse_binary(struct elf_binary *elf)
             __FUNCTION__, elf->pstart, elf->pend);
 }
 
+/*
+ * This function is called from the libraries when building guests, and
+ * also for dom0 from construct_dom0().
+ */
 int elf_load_binary(struct elf_binary *elf)
 {
     const elf_phdr *phdr;
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 93ad122..0dfbd3c 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -125,15 +125,27 @@ int iommu_domain_init(struct domain *d)
     return hd->platform_ops->init(d);
 }
 
+static inline void check_dom0_pvh_reqs(struct domain *d)
+{
+    if ( !iommu_enabled )
+        panic("For pvh dom0, iommu must be enabled\n");
+
+    if ( iommu_passthrough )
+        panic("For pvh dom0, dom0-passthrough must not be enabled\n");
+}
+
 void __init iommu_dom0_init(struct domain *d)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
 
+    if ( is_pvh_domain(d) )
+        check_dom0_pvh_reqs(d);
+
     if ( !iommu_enabled )
         return;
 
     register_keyhandler('o', &iommu_p2m_table);
-    d->need_iommu = !!iommu_dom0_strict;
+    d->need_iommu = is_pvh_domain(d) || !!iommu_dom0_strict;
     if ( need_iommu(d) )
     {
         struct page_info *page;
@@ -146,7 +158,15 @@ void __init iommu_dom0_init(struct domain *d)
                  ((page->u.inuse.type_info & PGT_type_mask)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
-            hd->platform_ops->map_page(d, mfn, mfn, mapping);
+
+            if ( is_pvh_domain(d) )
+            {
+                unsigned long gfn = mfn_to_gfn(d, _mfn(mfn));
+                hd->platform_ops->map_page(d, gfn, mfn, mapping);
+            }
+            else
+                hd->platform_ops->map_page(d, mfn, mfn, mapping);
+
             if ( !(i++ & 0xfffff) )
                 process_pending_softirqs();
         }
diff --git a/xen/include/asm-x86/pvh.h b/xen/include/asm-x86/pvh.h
index 73e59d3..849c7d7 100644
--- a/xen/include/asm-x86/pvh.h
+++ b/xen/include/asm-x86/pvh.h
@@ -1,6 +1,12 @@
 #ifndef __ASM_X86_PVH_H__
 #define __ASM_X86_PVH_H__
 
+#include <asm/types.h>
+
+struct cpu_user_regs;   
 int pvh_do_hypercall(struct cpu_user_regs *regs);
 
+extern bool_t opt_dom0pvh;
+void early_pvh_copy_or_zero(unsigned long dest, const void *src, int len);
+
 #endif  /* __ASM_X86_PVH_H__ */
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 20/20] PVH xen: PVH dom0 creation....
  2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
                   ` (18 preceding siblings ...)
  2013-05-15  0:52 ` [PATCH 19/20] PVH xen: elf and iommu related changes to prep for dom0 PVH Mukesh Rathor
@ 2013-05-15  0:52 ` Mukesh Rathor
  19 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15  0:52 UTC (permalink / raw)
  To: Xen-devel

Finally, the hardest part. Mostly, modify construct_dom0() so that dom0
can be booted in PVH mode. opt_dom0pvh, when specified on the command
line, causes dom0 to boot in PVH mode.
Note that the call to elf_load_binary() is moved down after the required
PVH setup so that we can use the same code path for both PV and PVH.

Change in V2:
  - Map the entire IO region upfront in the P2M for PVH dom0.

Change in V3:
   - Fix up pvh_map_all_iomem() to make sure we map up to 4GB of io space.
  - remove use of dbg_* functions.

Change in V5:
  - no need to pass around v_start, and update comment in public/xen.h.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
---
 xen/arch/x86/domain_build.c |  269 ++++++++++++++++++++++++++++++++++---------
 xen/arch/x86/mm/hap/hap.c   |   14 +++
 xen/arch/x86/setup.c        |    6 +-
 xen/include/asm-x86/hap.h   |    1 +
 xen/include/public/xen.h    |    2 +
 5 files changed, 236 insertions(+), 56 deletions(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index c5a0e0c..9083a3b 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -36,6 +36,7 @@
 #include <asm/bzimage.h> /* for bzimage_parse */
 #include <asm/io_apic.h>
 #include <asm/pvh.h>
+#include <asm/hap.h>
 
 #include <public/version.h>
 
@@ -309,6 +310,72 @@ static void __init process_dom0_ioports_disable(void)
     }
 }
 
+/*
+ * Set the 1:1 map for all non-RAM regions for dom0. Thus, dom0 will have
+ * the entire io region mapped in the EPT/NPT.
+ *
+ * pvh fixme: The following doesn't map MMIO ranges when they sit above the
+ *            highest E820 covered address.
+ */
+static __init void pvh_map_all_iomem(struct domain *d)
+{
+    unsigned long start_pfn, end_pfn, end = 0, start = 0;
+    const struct e820entry *entry;
+    unsigned int i, nump;
+    int rc;
+
+    for ( i = 0, entry = e820.map; i < e820.nr_map; i++, entry++ )
+    {
+        end = entry->addr + entry->size;
+
+        if ( entry->type == E820_RAM || entry->type == E820_UNUSABLE ||
+             i == e820.nr_map - 1 )
+        {
+            start_pfn = PFN_DOWN(start);
+            end_pfn = PFN_UP(end);
+
+            if ( entry->type == E820_RAM || entry->type == E820_UNUSABLE )
+                end_pfn = PFN_UP(entry->addr);
+
+            if ( start_pfn < end_pfn )
+            {
+                nump = end_pfn - start_pfn;
+                /* Add pages to the mapping */
+                rc = domctl_memory_mapping(d, start_pfn, start_pfn, nump, 1);
+                BUG_ON(rc);
+            }
+            start = end;
+        }
+    }
+
+    /* If the e820 ends below 4GB, we must map the remaining space up to 4GB. */
+    if ( end < GB(4) )
+    {
+        start_pfn = PFN_UP(end);
+        end_pfn = (GB(4)) >> PAGE_SHIFT;
+        nump = end_pfn - start_pfn;
+        rc = domctl_memory_mapping(d, start_pfn, start_pfn, nump, 1);
+        BUG_ON(rc);
+    }
+}
+
+static __init void dom0_update_physmap(struct domain *d, unsigned long pfn,
+                                   unsigned long mfn, unsigned long vphysmap_s)
+{
+    if ( is_pvh_domain(d) )
+    {
+        int rc = guest_physmap_add_page(d, pfn, mfn, 0);
+        BUG_ON(rc);
+        return;
+    }
+    if ( !is_pv_32on64_domain(d) )
+        ((unsigned long *)vphysmap_s)[pfn] = mfn;
+    else
+        ((unsigned int *)vphysmap_s)[pfn] = mfn;
+
+    set_gpfn_from_mfn(mfn, pfn);
+}
+
  /*
   * Copy or zero function for dom0 only during boot. This is because
   * raw_copy_to_guest -> copy_to_user_hvm -> __hvm_copy needs current to
@@ -353,6 +420,7 @@ int __init construct_dom0(
     void *(*bootstrap_map)(const module_t *),
     char *cmdline)
 {
+    char *si_buf=NULL;
     int i, cpu, rc, compatible, compat32, order, machine;
     struct cpu_user_regs *regs;
     unsigned long pfn, mfn;
@@ -361,7 +429,7 @@ int __init construct_dom0(
     unsigned long alloc_spfn;
     unsigned long alloc_epfn;
     unsigned long initrd_pfn = -1, initrd_mfn = 0;
-    unsigned long count;
+    unsigned long count, shared_info_paddr = 0;
     struct page_info *page = NULL;
     start_info_t *si;
     struct vcpu *v = d->vcpu[0];
@@ -449,11 +517,19 @@ int __init construct_dom0(
         return -EINVAL;
     }
 
-    if ( parms.elf_notes[XEN_ELFNOTE_SUPPORTED_FEATURES].type != XEN_ENT_NONE &&
-         !test_bit(XENFEAT_dom0, parms.f_supported) )
+    if ( parms.elf_notes[XEN_ELFNOTE_SUPPORTED_FEATURES].type != XEN_ENT_NONE )
     {
-        printk("Kernel does not support Dom0 operation\n");
-        return -EINVAL;
+        if ( !test_bit(XENFEAT_dom0, parms.f_supported) )
+        {
+            printk("Kernel does not support Dom0 operation\n");
+            return -EINVAL;
+        }
+        if ( is_pvh_domain(d) &&
+             !test_bit(XENFEAT_hvm_callback_vector, parms.f_supported) )
+        {
+            printk("Kernel does not support PVH mode\n");
+            return -EINVAL;
+        }
     }
 
     if ( compat32 )
@@ -518,6 +594,14 @@ int __init construct_dom0(
     vstartinfo_end   = (vstartinfo_start +
                         sizeof(struct start_info) +
                         sizeof(struct dom0_vga_console_info));
+
+    if ( is_pvh_domain(d) )
+    {
+        /* Note: the following is a paddr, as opposed to an maddr. */
+        shared_info_paddr = round_pgup(vstartinfo_end) - v_start;
+        vstartinfo_end   += PAGE_SIZE;
+    }
+
     vpt_start        = round_pgup(vstartinfo_end);
     for ( nr_pt_pages = 2; ; nr_pt_pages++ )
     {
@@ -659,16 +743,34 @@ int __init construct_dom0(
         maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l3_page_table;
         l3start = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
     }
-    clear_page(l4tab);
-    init_guest_l4_table(l4tab, d);
-    v->arch.guest_table = pagetable_from_paddr(__pa(l4start));
-    if ( is_pv_32on64_domain(d) )
-        v->arch.guest_table_user = v->arch.guest_table;
+    if ( is_pvh_domain(d) )
+    {
+        v->arch.cr3 = v->arch.hvm_vcpu.guest_cr[3] = (vpt_start - v_start);
+
+        /* HAP is required for PVH and pfns are sequentially mapped there */
+        pfn = 0;
+    }
+    else
+    {
+        clear_page(l4tab);
+        init_guest_l4_table(l4tab, d);
+        v->arch.guest_table = pagetable_from_paddr(__pa(l4start));
+        if ( is_pv_32on64_domain(d) )
+            v->arch.guest_table_user = v->arch.guest_table;
+        pfn = alloc_spfn;
+    }
 
     l4tab += l4_table_offset(v_start);
-    pfn = alloc_spfn;
     for ( count = 0; count < ((v_end-v_start)>>PAGE_SHIFT); count++ )
     {
+        /*
+         * The initrd's mfns are allocated from a separate mfn chunk, hence
+         * we need to adjust for them.
+         */
+        signed long pvh_adj = is_pvh_domain(d) ?
+                              (PFN_UP(initrd_len) - alloc_spfn) << PAGE_SHIFT
+                              : 0;
+
         if ( !((unsigned long)l1tab & (PAGE_SIZE-1)) )
         {
             maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l1_page_table;
@@ -695,16 +797,17 @@ int __init construct_dom0(
                     clear_page(l3tab);
                     if ( count == 0 )
                         l3tab += l3_table_offset(v_start);
-                    *l4tab = l4e_from_paddr(__pa(l3start), L4_PROT);
+                    *l4tab = l4e_from_paddr(__pa(l3start) + pvh_adj, L4_PROT);
                     l4tab++;
                 }
-                *l3tab = l3e_from_paddr(__pa(l2start), L3_PROT);
+                *l3tab = l3e_from_paddr(__pa(l2start) + pvh_adj, L3_PROT);
                 l3tab++;
             }
-            *l2tab = l2e_from_paddr(__pa(l1start), L2_PROT);
+            *l2tab = l2e_from_paddr(__pa(l1start) + pvh_adj, L2_PROT);
             l2tab++;
         }
-        if ( count < initrd_pfn || count >= initrd_pfn + PFN_UP(initrd_len) )
+        if ( is_pvh_domain(d) ||
+             count < initrd_pfn || count >= initrd_pfn + PFN_UP(initrd_len) )
             mfn = pfn++;
         else
             mfn = initrd_mfn++;
@@ -712,6 +815,9 @@ int __init construct_dom0(
                                     L1_PROT : COMPAT_L1_PROT));
         l1tab++;
 
+        if ( is_pvh_domain(d) )
+            continue;
+
         page = mfn_to_page(mfn);
         if ( (page->u.inuse.type_info == 0) &&
              !get_page_and_type(page, d, PGT_writable_page) )
@@ -740,6 +846,9 @@ int __init construct_dom0(
                COMPAT_L2_PAGETABLE_XEN_SLOTS(d) * sizeof(*l2tab));
     }
 
+    if  ( is_pvh_domain(d) )
+        goto pvh_skip_pt_rdonly;
+
     /* Pages that are part of page tables must be read only. */
     l4tab = l4start + l4_table_offset(vpt_start);
     l3start = l3tab = l4e_to_l3e(*l4tab);
@@ -779,6 +888,8 @@ int __init construct_dom0(
         }
     }
 
+pvh_skip_pt_rdonly:
+
     /* Mask all upcalls... */
     for ( i = 0; i < XEN_LEGACY_MAX_VCPUS; i++ )
         shared_info(d, vcpu_info[i].evtchn_upcall_mask) = 1;
@@ -802,35 +913,20 @@ int __init construct_dom0(
     write_ptbase(v);
     mapcache_override_current(v);
 
-    /* Copy the OS image and free temporary buffer. */
-    elf.dest = (void*)vkern_start;
-    rc = elf_load_binary(&elf);
-    if ( rc < 0 )
-    {
-        printk("Failed to load the kernel binary\n");
-        return rc;
-    }
-    bootstrap_map(NULL);
-
-    if ( UNSET_ADDR != parms.virt_hypercall )
+    /* Set up start info area. */
+    if ( is_pvh_domain(d) )
     {
-        if ( (parms.virt_hypercall < v_start) ||
-             (parms.virt_hypercall >= v_end) )
+        /* Avoid calling the copy function for every write to vstartinfo_start. */
+        if ( (si_buf = xmalloc_bytes(PAGE_SIZE)) == NULL )
         {
-            mapcache_override_current(NULL);
-            write_ptbase(current);
-            printk("Invalid HYPERCALL_PAGE field in ELF notes.\n");
-            return -1;
+            printk("PVH: xmalloc failed to alloc %ld bytes.\n", PAGE_SIZE);
+            return -ENOMEM;
         }
-        hypercall_page_initialise(
-            d, (void *)(unsigned long)parms.virt_hypercall);
+        si = (start_info_t *)si_buf;
     }
+    else
+        si = (start_info_t *)vstartinfo_start;
 
-    /* Free temporary buffers. */
-    discard_initial_images();
-
-    /* Set up start info area. */
-    si = (start_info_t *)vstartinfo_start;
     clear_page(si);
     si->nr_pages = nr_pages;
 
@@ -847,6 +943,10 @@ int __init construct_dom0(
              elf_64bit(&elf) ? 64 : 32, parms.pae ? "p" : "");
 
     count = d->tot_pages;
+
+    if ( is_pvh_domain(d) )
+        goto pvh_skip_guest_p2m_table;
+
     l4start = map_domain_page(pagetable_get_pfn(v->arch.guest_table));
     l3tab = NULL;
     l2tab = NULL;
@@ -973,6 +1073,11 @@ int __init construct_dom0(
         unmap_domain_page(l3tab);
     unmap_domain_page(l4start);
 
+pvh_skip_guest_p2m_table:
+
+    if ( is_pvh_domain(d) )
+        hap_set_pvh_alloc_for_dom0(d, nr_pages);
+
     /* Write the phys->machine and machine->phys table entries. */
     for ( pfn = 0; pfn < count; pfn++ )
     {
@@ -989,11 +1094,8 @@ int __init construct_dom0(
         if ( pfn > REVERSE_START && (vinitrd_start || pfn < initrd_pfn) )
             mfn = alloc_epfn - (pfn - REVERSE_START);
 #endif
-        if ( !is_pv_32on64_domain(d) )
-            ((unsigned long *)vphysmap_start)[pfn] = mfn;
-        else
-            ((unsigned int *)vphysmap_start)[pfn] = mfn;
-        set_gpfn_from_mfn(mfn, pfn);
+        dom0_update_physmap(d, pfn, mfn, vphysmap_start);
+
         if (!(pfn & 0xfffff))
             process_pending_softirqs();
     }
@@ -1009,8 +1111,8 @@ int __init construct_dom0(
             if ( !page->u.inuse.type_info &&
                  !get_page_and_type(page, d, PGT_writable_page) )
                 BUG();
-            ((unsigned long *)vphysmap_start)[pfn] = mfn;
-            set_gpfn_from_mfn(mfn, pfn);
+
+            dom0_update_physmap(d, pfn, mfn, vphysmap_start);
             ++pfn;
             if (!(pfn & 0xfffff))
                 process_pending_softirqs();
@@ -1030,11 +1132,7 @@ int __init construct_dom0(
 #ifndef NDEBUG
 #define pfn (nr_pages - 1 - (pfn - (alloc_epfn - alloc_spfn)))
 #endif
-            if ( !is_pv_32on64_domain(d) )
-                ((unsigned long *)vphysmap_start)[pfn] = mfn;
-            else
-                ((unsigned int *)vphysmap_start)[pfn] = mfn;
-            set_gpfn_from_mfn(mfn, pfn);
+            dom0_update_physmap(d, pfn, mfn, vphysmap_start);
 #undef pfn
             page++; pfn++;
             if (!(pfn & 0xfffff))
@@ -1042,6 +1140,50 @@ int __init construct_dom0(
         }
     }
 
+    /* Copy the OS image and free temporary buffer. */
+    elf.dest = (void*)vkern_start;
+    rc = elf_load_binary(&elf);
+    if ( rc < 0 )
+    {
+        printk("Failed to load the kernel binary\n");
+        return rc;
+    }
+    bootstrap_map(NULL);
+
+    if ( UNSET_ADDR != parms.virt_hypercall )
+    {
+        void *addr;
+        if ( is_pvh_domain(d) )
+        {
+            if ( (addr = xzalloc_bytes(PAGE_SIZE)) == NULL )
+            {
+                printk("pvh: xzalloc failed for %ld bytes.\n", PAGE_SIZE);
+                return -ENOMEM;
+            }
+        } 
+        else
+            addr = (void *)parms.virt_hypercall;
+
+        if ( (parms.virt_hypercall < v_start) ||
+             (parms.virt_hypercall >= v_end) )
+        {
+            mapcache_override_current(NULL);
+            write_ptbase(current);
+            printk("Invalid HYPERCALL_PAGE field in ELF notes.\n");
+            return -1;
+        }
+        hypercall_page_initialise(d, addr);
+
+        if ( is_pvh_domain(d) )
+        {
+            early_pvh_copy_or_zero(parms.virt_hypercall, addr, PAGE_SIZE);
+            xfree(addr);
+        }
+    }
+
+    /* Free temporary buffers. */
+    discard_initial_images();
+
     if ( initrd_len != 0 )
     {
         si->mod_start = vinitrd_start ?: initrd_pfn;
@@ -1057,6 +1199,16 @@ int __init construct_dom0(
         si->console.dom0.info_off  = sizeof(struct start_info);
         si->console.dom0.info_size = sizeof(struct dom0_vga_console_info);
     }
+    if ( is_pvh_domain(d) )
+    {
+        unsigned long mfn = virt_to_mfn(d->shared_info);
+        unsigned long pfn = shared_info_paddr >> PAGE_SHIFT;
+        si->shared_info = shared_info_paddr;
+        dom0_update_physmap(d, pfn, mfn, 0);
+
+        early_pvh_copy_or_zero(vstartinfo_start, si_buf, PAGE_SIZE);
+        xfree(si_buf);
+    }
 
     if ( is_pv_32on64_domain(d) )
         xlat_start_info(si, XLAT_start_info_console_dom0);
@@ -1088,12 +1240,18 @@ int __init construct_dom0(
     regs->eip = parms.virt_entry;
     regs->esp = vstack_end;
     regs->esi = vstartinfo_start;
-    regs->eflags = X86_EFLAGS_IF;
+    regs->eflags = X86_EFLAGS_IF | 0x2;
 
     if ( opt_dom0_shadow )
-        if ( paging_enable(d, PG_SH_enable) == 0 ) 
+    {
+        if ( is_pvh_domain(d) )
+        {
+            printk("Invalid option dom0_shadow for PVH\n");
+            return -EINVAL;
+        }
+        if ( paging_enable(d, PG_SH_enable) == 0 )
             paging_update_paging_modes(v);
-
+    }
     if ( supervisor_mode_kernel )
     {
         v->arch.pv_vcpu.kernel_ss &= ~3;
@@ -1170,6 +1328,9 @@ int __init construct_dom0(
 
     BUG_ON(rc != 0);
 
+    if ( is_pvh_domain(d) )
+        pvh_map_all_iomem(d);
+
     iommu_dom0_init(dom0);
 
     return 0;
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 5aa0852..674c324 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -580,6 +580,20 @@ int hap_domctl(struct domain *d, xen_domctl_shadow_op_t *sc,
     }
 }
 
+/* Resize hap table. Copied from: libxl_get_required_shadow_memory() */
+void hap_set_pvh_alloc_for_dom0(struct domain *d, unsigned long num_pages)
+{
+    int rc;
+    unsigned long memkb = num_pages * (PAGE_SIZE / 1024);
+
+    memkb = 4 * (256 * d->max_vcpus + 2 * (memkb / 1024));
+    num_pages = ((memkb+1023)/1024) << (20 - PAGE_SHIFT);
+    paging_lock(d);
+    rc = hap_set_allocation(d, num_pages, NULL);
+    paging_unlock(d);
+    BUG_ON(rc);
+}
+
 static const struct paging_mode hap_paging_real_mode;
 static const struct paging_mode hap_paging_protected_mode;
 static const struct paging_mode hap_paging_pae_mode;
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 6d35d1d..60f4dd8 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -549,7 +549,7 @@ void __init __start_xen(unsigned long mbi_p)
 {
     char *memmap_type = NULL;
     char *cmdline, *kextra, *loader;
-    unsigned int initrdidx;
+    unsigned int initrdidx, domcr_flags = 0;
     multiboot_info_t *mbi = __va(mbi_p);
     module_t *mod = (module_t *)__va(mbi->mods_addr);
     unsigned long nr_pages, modules_headroom, *module_map;
@@ -1321,7 +1321,9 @@ void __init __start_xen(unsigned long mbi_p)
         panic("Could not protect TXT memory regions\n");
 
     /* Create initial domain 0. */
-    dom0 = domain_create(0, DOMCRF_s3_integrity, 0);
+    domcr_flags = (opt_dom0pvh ? DOMCRF_pvh | DOMCRF_hap : 0);
+    domcr_flags |= DOMCRF_s3_integrity;
+    dom0 = domain_create(0, domcr_flags, 0);
     if ( IS_ERR(dom0) || (alloc_dom0_vcpu0() == NULL) )
         panic("Error creating domain 0\n");
 
diff --git a/xen/include/asm-x86/hap.h b/xen/include/asm-x86/hap.h
index e03f983..aab8558 100644
--- a/xen/include/asm-x86/hap.h
+++ b/xen/include/asm-x86/hap.h
@@ -63,6 +63,7 @@ int   hap_track_dirty_vram(struct domain *d,
                            XEN_GUEST_HANDLE_64(uint8) dirty_bitmap);
 
 extern const struct paging_mode *hap_paging_get_mode(struct vcpu *);
+void hap_set_pvh_alloc_for_dom0(struct domain *d, unsigned long num_pages);
 
 #endif /* XEN_HAP_H */
 
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 3cab74f..28d1e13 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -693,6 +693,8 @@ typedef struct shared_info shared_info_t;
  *      c. list of allocated page frames [mfn_list, nr_pages]
  *         (unless relocated due to XEN_ELFNOTE_INIT_P2M)
  *      d. start_info_t structure        [register ESI (x86)]
+ *      d1. struct shared_info_t                [shared_info]
+ *                   (for auto-translated guests only)
  *      e. bootstrap page tables         [pt_base and CR3 (x86)]
  *      f. bootstrap stack               [register ESP (x86)]
  *  4. Bootstrap elements are packed together, but each is 4kB-aligned.
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range
  2013-05-15  0:52 ` [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range Mukesh Rathor
@ 2013-05-15  9:58   ` Jan Beulich
  2013-05-15 23:05     ` Mukesh Rathor
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2013-05-15  9:58 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -4519,7 +4519,8 @@ static int handle_iomem_range(unsigned long s, unsigned long e, void *p)
>  
>  static int xenmem_add_to_physmap_once(
>      struct domain *d,
> -    const struct xen_add_to_physmap *xatp)
> +    const struct xen_add_to_physmap *xatp,
> +    domid_t foreign_domid)
>  {
>      struct page_info *page = NULL;
>      unsigned long gfn = 0; /* gcc ... */
> @@ -4646,7 +4647,7 @@ static int xenmem_add_to_physmap(struct domain *d,

I know I said this before: This patch can't be complete, or else the
new function parameter would actually get used. With the way
things are, if this patch gets applied, a user of the new XENMEM_
sub-op would not get the expected behavior.

And I know I said this before too: It should be possible to apply
any contiguous initial sub-portion of a patch series without
introducing breakage to the tree - you shouldn't assume the
whole series gets applied in one go (or, if found necessary later,
gets reverted altogether).

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 03/20] PVH xen: create domctl_memory_mapping() function
  2013-05-15  0:52 ` [PATCH 03/20] PVH xen: create domctl_memory_mapping() function Mukesh Rathor
@ 2013-05-15 10:07   ` Jan Beulich
  0 siblings, 0 replies; 51+ messages in thread
From: Jan Beulich @ 2013-05-15 10:07 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> @@ -625,67 +689,12 @@ long arch_do_domctl(
>          unsigned long mfn = domctl->u.memory_mapping.first_mfn;
>          unsigned long nr_mfns = domctl->u.memory_mapping.nr_mfns;
>          int add = domctl->u.memory_mapping.add_mapping;
> -        unsigned long i;
> -
> -        ret = -EINVAL;
> -        if ( (mfn + nr_mfns - 1) < mfn || /* wrap? */
> -             ((mfn | (mfn + nr_mfns - 1)) >> (paddr_bits - PAGE_SHIFT)) ||
> -             (gfn + nr_mfns - 1) < gfn ) /* wrap? */
> -            break;
>  
>          ret = -EPERM;
>          if ( !iomem_access_permitted(current->domain, mfn, mfn + nr_mfns - 1) )
>              break;

Removing the checks above will allow the assertion in
rangeset_contains_range() to trigger upon bad input.
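
For illustration, keeping the removed checks ahead of the access test in
the new domctl_memory_mapping() would look like this (a sketch mirroring
the deleted hunk above):

    if ( (mfn + nr_mfns - 1) < mfn || /* wrap? */
         ((mfn | (mfn + nr_mfns - 1)) >> (paddr_bits - PAGE_SHIFT)) ||
         (gfn + nr_mfns - 1) < gfn )  /* wrap? */
        return -EINVAL;

    if ( !iomem_access_permitted(current->domain, mfn, mfn + nr_mfns - 1) )
        return -EPERM;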

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/20] PVH xen: Move e820 fields out of pv_domain struct
  2013-05-15  0:52 ` [PATCH 06/20] PVH xen: Move e820 fields out of pv_domain struct Mukesh Rathor
@ 2013-05-15 10:27   ` Jan Beulich
  0 siblings, 0 replies; 51+ messages in thread
From: Jan Beulich @ 2013-05-15 10:27 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -569,7 +569,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
>          /* 64-bit PV guest by default. */
>          d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
>  
> -        spin_lock_init(&d->arch.pv_domain.e820_lock);
> +        spin_lock_init(&d->arch.e820_lock);
>      }
>  
>      /* initialize default tsc behavior in case tools don't */
> @@ -595,7 +595,7 @@ void arch_domain_destroy(struct domain *d)
>      if ( is_hvm_domain(d) )
>          hvm_domain_destroy(d);
>      else
> -        xfree(d->arch.pv_domain.e820);
> +        xfree(d->arch.e820);
>  
>      free_domain_pirqs(d);
>      if ( !is_idle_domain(d) )

Such initialization and cleanup shouldn't remain conditional upon
!is_hvm_domain() if the fields are now universal - while not an
immediate bug, it's nevertheless a latent one.
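
That is, something like the following sketch, with both statements pulled
out of their is_hvm_domain() conditionals:

    /* arch_domain_create(), for every domain type: */
    spin_lock_init(&d->arch.e820_lock);

    /* arch_domain_destroy(), likewise unconditional: */
    xfree(d->arch.e820);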

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 11/20] PVH xen: introduce pvh.c
  2013-05-15  0:52 ` [PATCH 11/20] PVH xen: introduce pvh.c Mukesh Rathor
@ 2013-05-15 10:42   ` Jan Beulich
  2013-05-16  1:42     ` Mukesh Rathor
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2013-05-15 10:42 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> --- /dev/null
> +++ b/xen/arch/x86/hvm/pvh.c
> @@ -0,0 +1,202 @@
> +/*
> + * Copyright (C) 2013, Mukesh Rathor, Oracle Corp.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public
> + * License v2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + */
> +#include <xen/hypercall.h>
> +#include <xen/guest_access.h>
> +#include <asm/p2m.h>
> +#include <asm/traps.h>
> +#include <asm/hvm/vmx/vmx.h>
> +#include <public/sched.h>
> +
> +
> +static int pvh_grant_table_op(unsigned int cmd, XEN_GUEST_HANDLE(void) uop,
> +                              unsigned int count)
> +{
> +#ifndef NDEBUG

The whole function should be in such a conditional (if it's really
needed, which I previously said I doubt). Without doing so, and
with the way you have pvh_hypercall64_table[], build will fail
for debug=n.

> +    switch ( cmd )
> +    {
> +    /* the following grant ops have been tested for PVH guest. */
> +    case GNTTABOP_map_grant_ref:
> +    case GNTTABOP_unmap_grant_ref:
> +    case GNTTABOP_setup_table:
> +    case GNTTABOP_copy:
> +    case GNTTABOP_query_size:
> +    case GNTTABOP_set_version:
> +        return do_grant_table_op(cmd, uop, count);
> +    }
> +    return -ENOSYS;
> +#else
> +    return do_grant_table_op(cmd, uop, count);
> +#endif
> +}
> +
> +static long pvh_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE(void) arg)
> +{
> +    long rc = -ENOSYS;
> +
> +#ifndef NDEBUG
> +    int valid = 0;
> +
> +    switch ( cmd )
> +    {
> +    case VCPUOP_register_runstate_memory_area:
> +    case VCPUOP_get_runstate_info:
> +    case VCPUOP_set_periodic_timer:
> +    case VCPUOP_stop_periodic_timer:
> +    case VCPUOP_set_singleshot_timer:
> +    case VCPUOP_stop_singleshot_timer:
> +    case VCPUOP_is_up:
> +    case VCPUOP_up:
> +    case VCPUOP_initialise:
> +        valid = 1;
> +    }
> +    if ( !valid )
> +        return rc;
> +#endif
> +
> +    rc = do_vcpu_op(cmd, vcpuid, arg);
> +
> +    /* pvh boot vcpu setting context for bringing up smp vcpu */
> +    if ( cmd == VCPUOP_initialise )
> +        vmx_vmcs_enter(current);

This is wrong in three ways - for one, you can't call a vmx function
from here, then the operation also doesn't appear to belong here,
and with pvh_hypercall64_table[] not even existing in non-debug
builds this won't happen there at all (which makes it very clear that
these function overrides are plain wrong, as I had tried to tell you
from the beginning).

> +    return rc;
> +}
> +
> +static long pvh_physdev_op(int cmd, XEN_GUEST_HANDLE(void) arg)
> +{
> +#ifndef NDEBUG

Same as for grant table op above.

> +    int valid = 0;
> +    switch ( cmd )
> +    {
> +     case PHYSDEVOP_map_pirq:
> +     case PHYSDEVOP_unmap_pirq:
> +     case PHYSDEVOP_eoi:
> +     case PHYSDEVOP_irq_status_query:
> +     case PHYSDEVOP_get_free_pirq:
> +         valid = 1;
> +     }
> +     if ( !valid && !IS_PRIV(current->domain) )
> +        return -ENOSYS;
> +#endif
> +    return do_physdev_op(cmd, arg);
> +}
> +
> +static long pvh_hvm_op(unsigned long op, XEN_GUEST_HANDLE(void) arg)
> +{
> +    long rc = -EINVAL;
> +    struct xen_hvm_param harg;
> +    struct domain *d;
> +
> +    if ( copy_from_guest(&harg, arg, 1) )
> +        return -EFAULT;
> +
> +    rc = rcu_lock_target_domain_by_id(harg.domid, &d);
> +    if ( rc != 0 )
> +        return rc;
> +
> +    if ( is_hvm_domain(d) )
> +    {
> +        /* pvh dom0 is building an hvm guest */
> +        rcu_unlock_domain(d);
> +        return do_hvm_op(op, arg);
> +    }
> +
> +    rc = -ENOSYS;
> +    if ( op == HVMOP_set_param )
> +    {
> +        if ( harg.index == HVM_PARAM_CALLBACK_IRQ )
> +        {
> +            struct hvm_irq *hvm_irq = &d->arch.hvm_domain.irq;
> +            uint64_t via = harg.value;
> +            uint8_t via_type = (uint8_t)(via >> 56) + 1;
> +
> +            if ( via_type == HVMIRQ_callback_vector )
> +            {
> +                hvm_irq->callback_via_type = HVMIRQ_callback_vector;
> +                hvm_irq->callback_via.vector = (uint8_t)via;
> +                rc = 0;
> +            }
> +        }
> +    }
> +    rcu_unlock_domain(d);
> +    if ( rc )
> +        gdprintk(XENLOG_DEBUG, "op:%ld -ENOSYS\n", op);

This should be dropped.

> +
> +    return rc;
> +}
> +
> +#ifndef NDEBUG
> +/* PVH 32bitfixme */
> +static hvm_hypercall_t *pvh_hypercall64_table[NR_hypercalls] = {
> +    [__HYPERVISOR_platform_op]     = (hvm_hypercall_t *)do_platform_op,
> +    [__HYPERVISOR_memory_op]       = (hvm_hypercall_t *)do_memory_op,
> +    [__HYPERVISOR_xen_version]     = (hvm_hypercall_t *)do_xen_version,
> +    [__HYPERVISOR_console_io]      = (hvm_hypercall_t *)do_console_io,
> +    [__HYPERVISOR_grant_table_op]  = (hvm_hypercall_t *)pvh_grant_table_op,
> +    [__HYPERVISOR_vcpu_op]         = (hvm_hypercall_t *)pvh_vcpu_op,
> +    [__HYPERVISOR_mmuext_op]       = (hvm_hypercall_t *)do_mmuext_op,
> +    [__HYPERVISOR_xsm_op]          = (hvm_hypercall_t *)do_xsm_op,
> +    [__HYPERVISOR_sched_op]        = (hvm_hypercall_t *)do_sched_op,
> +    [__HYPERVISOR_event_channel_op]= (hvm_hypercall_t *)do_event_channel_op,
> +    [__HYPERVISOR_physdev_op]      = (hvm_hypercall_t *)pvh_physdev_op,
> +    [__HYPERVISOR_hvm_op]          = (hvm_hypercall_t *)pvh_hvm_op,
> +    [__HYPERVISOR_sysctl]          = (hvm_hypercall_t *)do_sysctl,
> +    [__HYPERVISOR_domctl]          = (hvm_hypercall_t *)do_domctl
> +};
> +#endif
> +
> +/*
> + * Check if hypercall is valid
> + * Returns: 0 if hcall is not valid with eax set to the errno to ret to guest
> + */
> +static bool_t hcall_valid(struct cpu_user_regs *regs)
> +{
> +    struct segment_register sreg;
> +
> +    hvm_get_segment_register(current, x86_seg_ss, &sreg);
> +    if ( unlikely(sreg.attr.fields.dpl != 0) )
> +    {
> +        regs->eax = -EPERM;
> +        return 0;
> +    }
> +
> +    return 1;
> +}
> +
> +/* PVH 32bitfixme */
> +int pvh_do_hypercall(struct cpu_user_regs *regs)
> +{
> +    uint32_t hnum = regs->eax;
> +
> +    if ( hnum >= NR_hypercalls || pvh_hypercall64_table[hnum] == NULL )
> +    {
> +        gdprintk(XENLOG_WARNING, "PVH: Unimplemented HCALL:%d. Returning "
> +                 "-ENOSYS. domid:%d IP:%lx SP:%lx\n",
> +                 hnum, current->domain->domain_id, regs->rip, regs->rsp);
> +        regs->eax = -ENOSYS;
> +        vmx_update_guest_eip();
> +        return HVM_HCALL_completed;
> +    }
> +
> +    if ( !hcall_valid(regs) )
> +        return HVM_HCALL_completed;
> +
> +    current->arch.hvm_vcpu.hcall_preempted = 0;
> +    regs->rax = pvh_hypercall64_table[hnum](regs->rdi, regs->rsi, regs->rdx,
> +                                            regs->r10, regs->r8, regs->r9);

Another build error with debug=n?

> +
> +    if ( current->arch.hvm_vcpu.hcall_preempted )
> +        return HVM_HCALL_preempted;
> +
> +    return HVM_HCALL_completed;
> +}

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 13/20] PVH xen: introduce vmx_pvh.c
  2013-05-15  0:52 ` [PATCH 13/20] PVH xen: introduce vmx_pvh.c Mukesh Rathor
@ 2013-05-15 11:46   ` Jan Beulich
  0 siblings, 0 replies; 51+ messages in thread
From: Jan Beulich @ 2013-05-15 11:46 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> +        case 3:
> +            gdprintk(XENLOG_G_ERR, "PVH: unexpected cr3 vmexit. rip:%lx\n",
> +                     regs->rip);
> +            domain_crash_synchronous();
> +            break;

If at all possible, please avoid domain_crash_synchronous() in
favor of domain_crash().
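
I.e. something along these lines (a sketch; the exact unwind back to the
exit handler is left to the patch):

    case 3:
        gdprintk(XENLOG_G_ERR, "PVH: unexpected cr3 vmexit. rip:%lx\n",
                 regs->rip);
        domain_crash(current->domain);  /* asynchronous teardown */
        break;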

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 16/20] PVH xen: Miscellaneous changes
  2013-05-15  0:52 ` [PATCH 16/20] PVH xen: Miscellaneous changes Mukesh Rathor
@ 2013-05-15 11:53   ` Jan Beulich
  2013-05-16  1:51     ` Mukesh Rathor
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2013-05-15 11:53 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -64,9 +64,10 @@ long domctl_memory_mapping(struct domain *d, unsigned long gfn,
>  
>      if ( add_map )
>      {
> -        printk(XENLOG_G_INFO
> -               "memory_map:add: dom%d gfn=%lx mfn=%lx nr=%lx\n",
> -               d->domain_id, gfn, mfn, nr_mfns);
> +        if ( !is_pvh_domain(d) )     /* PVH maps lots and lots */
> +            printk(XENLOG_G_INFO
> +                   "memory_map:add: dom%d gfn=%lx mfn=%lx nr=%lx\n",
> +                   d->domain_id, gfn, mfn, nr_mfns);
>  
>          ret = iomem_permit_access(d, mfn, mfn + nr_mfns - 1);
>          if ( !ret && paging_mode_translate(d) )
> @@ -91,9 +92,10 @@ long domctl_memory_mapping(struct domain *d, unsigned long gfn,
>      }
>      else
>      {
> -        printk(XENLOG_G_INFO
> -               "memory_map:remove: dom%d gfn=%lx mfn=%lx nr=%lx\n",
> -               d->domain_id, gfn, mfn, nr_mfns);
> +        if ( !is_pvh_domain(d) )     /* PVH unmaps lots and lots */
> +            printk(XENLOG_G_INFO
> +                   "memory_map:remove: dom%d gfn=%lx mfn=%lx nr=%lx\n",
> +                   d->domain_id, gfn, mfn, nr_mfns);
>  
>          if ( paging_mode_translate(d) )
>              for ( i = 0; i < nr_mfns; i++ )

Are these changes still necessary? IOW why would a PVH guest be
mapping so much more MMIO memory than a PV one?

> @@ -1304,6 +1306,11 @@ void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c)
>              c.nat->gs_base_kernel = hvm_get_shadow_gs_base(v);
>          }
>      }
> +    else if ( is_pvh_vcpu(v) )
> +    {
> +        /* pvh fixme: punt it to phase II */
> +        printk(XENLOG_WARNING "PVH: fixme: arch_get_info_guest()\n");
> +    }

Please at least clear out all state that doesn't get properly obtained
(short of being able to return an error).
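
For example (a sketch; whether the compat layout needs separate handling
is left open here):

    else if ( is_pvh_vcpu(v) )
    {
        /* pvh fixme: phase II; meanwhile don't hand back stale state */
        memset(c.nat, 0, sizeof(*c.nat));
        printk(XENLOG_WARNING "PVH: fixme: arch_get_info_guest()\n");
    }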

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 17/20] PVH xen: Introduce p2m_map_foreign
  2013-05-15  0:52 ` [PATCH 17/20] PVH xen: Introduce p2m_map_foreign Mukesh Rathor
@ 2013-05-15 11:55   ` Jan Beulich
  0 siblings, 0 replies; 51+ messages in thread
From: Jan Beulich @ 2013-05-15 11:55 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> +/* Returns: True for success, 0 for failure. */
> +int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
> +{
> +    return set_typed_p2m_entry(d, gfn, mfn, p2m_map_foreign);
> +}
> +
> +int
> +set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
> +{
> +    return set_typed_p2m_entry(d, gfn, mfn, p2m_mmio_direct);
> +}

Consistent coding style please (placement of return type).

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 18/20] PVH xen: Add and remove foreign pages
  2013-05-15  0:52 ` [PATCH 18/20] PVH xen: Add and remove foreign pages Mukesh Rathor
@ 2013-05-15 12:05   ` Jan Beulich
  0 siblings, 0 replies; 51+ messages in thread
From: Jan Beulich @ 2013-05-15 12:05 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> @@ -695,11 +697,37 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>          domain_lock(d);
>  
> -        page = get_page_from_gfn(d, xrfp.gpfn, NULL, P2M_ALLOC);
> -        if ( page )
> +        /*
> +         * If PVH, the gfn could be mapped to an mfn from a foreign domain
> +         * by the user space tool during domain creation. We need to check
> +         * for that, free it up from the p2m, and release the refcnt on it.
> +         * In such a case, page would be NULL and the following call would
> +         * not have refcnt'd the page. See also xenmem_add_foreign_to_pmap().
> +         */
> +        page = get_page_from_gfn(d, xrfp.gpfn, &p2mt, P2M_ALLOC);
> +
> +        if ( page || p2m_is_foreign(p2mt) )
>          {
> -            guest_physmap_remove_page(d, xrfp.gpfn, page_to_mfn(page), 0);
> -            put_page(page);
> +            if ( page )
> +                mfn = page_to_mfn(page);
> +            else
> +            {
> +                mfn = mfn_x(get_gfn_query(d, xrfp.gpfn, &tp));
> +                foreign_dom = page_get_owner(mfn_to_page(mfn));
> +                ASSERT(is_pvh_domain(d));
> +                ASSERT(d != foreign_dom);
> +                ASSERT(p2m_is_foreign(tp));

A guest can perform this operation on itself, so I'm afraid none of
the assertions are really valid (they likely will all need to become
error returns).
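
For instance, a sketch with error returns instead (the errno choice and
the unlock label are assumptions):

    if ( !is_pvh_domain(d) || d == foreign_dom || !p2m_is_foreign(tp) )
    {
        put_gfn(d, xrfp.gpfn);
        rc = -EINVAL;
        goto out;   /* hypothetical label dropping the domain lock */
    }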

> +            }
> +
> +            guest_physmap_remove_page(d, xrfp.gpfn, mfn, 0);
> +            if (page)

Coding style flaws have greatly reduced in the series up to here,
but there are still a few cases left. I'm not going to point them out
individually, I really expect you to go through your patches and
fix them.

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 19/20] PVH xen: elf and iommu related changes to prep for dom0 PVH
  2013-05-15  0:52 ` [PATCH 19/20] PVH xen: elf and iommu related changes to prep for dom0 PVH Mukesh Rathor
@ 2013-05-15 12:12   ` Jan Beulich
  2013-05-16  1:58     ` Mukesh Rathor
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2013-05-15 12:12 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -60,6 +60,10 @@ integer_param("maxcpus", max_cpus);
>  static bool_t __initdata disable_smep;
>  invbool_param("smep", disable_smep);
>  
> +/* Boot dom0 in PVH mode */
> +bool_t __initdata opt_dom0pvh;
> +boolean_param("dom0pvh", opt_dom0pvh);

Does this really belong here (instead of domain_build.c)?

> --- a/xen/common/libelf/libelf-loader.c
> +++ b/xen/common/libelf/libelf-loader.c
> @@ -127,6 +131,16 @@ static int elf_load_image(void *dst, const void *src, uint64_t filesz, uint64_t
>      int rc;
>      if ( filesz > ULONG_MAX || memsz > ULONG_MAX )
>          return -1;
> +
> +    if ( opt_dom0pvh )

So you define (above) and declare (below) the variable in x86-
specific files, but use it in common code? That's going to break the
ARM build.

> +    {
> +        unsigned long addr = (unsigned long)dst;
> +        early_pvh_copy_or_zero(addr, src, filesz);
> +        early_pvh_copy_or_zero(addr + filesz, NULL, memsz - filesz);

And anyway - repeating my earlier complaint - I don't see why this
is necessary. In fact I don't see why most of the PV Dom0 building
code can't be used unchanged for PVH: There's no real need for
lifting the few restrictions that apply, and hence there needn't be
any fear of colliding address spaces.

> @@ -146,7 +158,15 @@ void __init iommu_dom0_init(struct domain *d)
>                   ((page->u.inuse.type_info & PGT_type_mask)
>                    == PGT_writable_page) )
>                  mapping |= IOMMUF_writable;
> -            hd->platform_ops->map_page(d, mfn, mfn, mapping);
> +
> +            if ( is_pvh_domain(d) )
> +            {
> +                unsigned long gfn = mfn_to_gfn(d, _mfn(mfn));
> +                hd->platform_ops->map_page(d, gfn, mfn, mapping);
> +            }
> +            else
> +                hd->platform_ops->map_page(d, mfn, mfn, mapping);

With mfn_to_gfn(mfn) == mfn, is there really a need for two code
paths here?
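
I.e. a single call would do (a sketch, valid as long as
mfn_to_gfn(d, mfn) == mfn also holds for PV dom0):

    hd->platform_ops->map_page(d, mfn_to_gfn(d, _mfn(mfn)), mfn, mapping);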

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range
  2013-05-15  9:58   ` Jan Beulich
@ 2013-05-15 23:05     ` Mukesh Rathor
  2013-05-16  7:21       ` Jan Beulich
  0 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-15 23:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Wed, 15 May 2013 10:58:43 +0100
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com>
> >>> wrote:
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -4519,7 +4519,8 @@ static int handle_iomem_range(unsigned long
> > s, unsigned long e, void *p) 
> >  static int xenmem_add_to_physmap_once(
> >      struct domain *d,
> > -    const struct xen_add_to_physmap *xatp)
> > +    const struct xen_add_to_physmap *xatp,
> > +    domid_t foreign_domid)
> >  {
> >      struct page_info *page = NULL;
> >      unsigned long gfn = 0; /* gcc ... */
> > @@ -4646,7 +4647,7 @@ static int xenmem_add_to_physmap(struct
> > domain *d,
> 
> I know I said this before: This patch can't be complete, or else the
> new function parameter would actually get used. With the way
> things are, if this patch gets applied, a user of the new XENMEM_
> sub-op would not get the expected behavior.
> 

No, the new foreign_domid parameter is meaningful only for the
XENMAPSPACE_gmfn_foreign op, which is defined in patch 0018. So we
should be OK here.

thanks
Mukesh

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 11/20] PVH xen: introduce pvh.c
  2013-05-15 10:42   ` Jan Beulich
@ 2013-05-16  1:42     ` Mukesh Rathor
  2013-05-16  8:00       ` Jan Beulich
  0 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-16  1:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Wed, 15 May 2013 11:42:34 +0100
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com>
> >>> wrote:
> > --- /dev/null
> > +
> > +static int pvh_grant_table_op(unsigned int cmd,
> > XEN_GUEST_HANDLE(void) uop,
> > +                              unsigned int count)
> > +{
> > +#ifndef NDEBUG
> 
> The whole function should be in such a conditional (if it's really
> needed, which I previously said I doubt). Without doing so, and

This is temporary code; can we leave it in for the initial stages while
PVH is being tested and tried? It will help a lot with debugging.
Alternatively, would you be OK with something like the patch below? I
could do the same for grant ops and hypercalls too.

> with the way you have pvh_hypercall64_table[], build will fail
> for debug=n.

I know. I left it for you to implement the way you wanted, with a
memcpy of the table, which I completely disagree with.

> > +    rc = do_vcpu_op(cmd, vcpuid, arg);
> > +
> > +    /* pvh boot vcpu setting context for bringing up smp vcpu */
> > +    if ( cmd == VCPUOP_initialise )
> > +        vmx_vmcs_enter(current);  
> 
> This is wrong in three ways - for one, you can't call a vmx function
> from here, then the operation also doesn't appear to belong here,

Oops, sorry! This was in the vmx_pvh.c file originally and got moved.
Yeah, I struggled with this; I can't remember why I needed it. Let me
go back and investigate.

> and with pvh_hypercall64_table[] not even existing in non-debug
> builds this won't happen there at all (which is making very clear that
> these function overrides are plain wrong, as I had tried to tell you
> from the beginning).

I don't think they are *wrong*; they work just fine for HVM! They may
be a different approach from the one you'd prefer, and I appreciate
that. But I am thinking of being able to easily catch and debug things
while the feature goes through its infancy.

thanks
Mukesh

--------------------------------------------------------------------------
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 21382eb..b1a33b5 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1129,6 +1129,10 @@ arch_do_vcpu_op(
         struct domain *d = v->domain;
         struct vcpu_register_vcpu_info info;
 
+        rc = -ENOSYS;
+        if ( is_pvh_vcpu(current) )
+            break;
+
         rc = -EFAULT;
         if ( copy_from_guest(&info, arg, 1) )
             break;
@@ -1169,6 +1173,10 @@ arch_do_vcpu_op(
     {
         struct vcpu_get_physid cpu_id;
 
+        rc = -ENOSYS;
+        if ( is_pvh_vcpu(current) )
+            break;
+
         rc = -EINVAL;
         if ( !is_pinned_vcpu(v) )
             break;
diff --git a/xen/common/domain.c b/xen/common/domain.c
index a734755..68de4bb 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -968,6 +968,12 @@ long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
     }
 
     case VCPUOP_down:
+        if ( is_pvh_vcpu(current) )
+        {
+            rc = -ENOSYS;
+            break;
+        }
+
         if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
             vcpu_sleep_nosync(v);
         break;
@@ -1039,6 +1045,11 @@ long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
 
 #ifdef VCPU_TRAP_NMI
     case VCPUOP_send_nmi:
+        if ( is_pvh_vcpu(current) )
+        {
+            rc = -ENOSYS;
+            break;
+        }
         if ( !guest_handle_is_null(arg) )
             return -EINVAL;

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 16/20] PVH xen: Miscellaneous changes
  2013-05-15 11:53   ` Jan Beulich
@ 2013-05-16  1:51     ` Mukesh Rathor
  0 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-16  1:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Wed, 15 May 2013 12:53:00 +0100
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com>
> >>> wrote:
> > --- a/xen/arch/x86/domctl.c
> > +++ b/xen/arch/x86/domctl.c
> > @@ -64,9 +64,10 @@ long domctl_memory_mapping(struct domain *d, unsigned long gfn,
> >      if ( add_map )
> >      {
> > -        printk(XENLOG_G_INFO
> > -               "memory_map:add: dom%d gfn=%lx mfn=%lx nr=%lx\n",
> > -               d->domain_id, gfn, mfn, nr_mfns);
> > +        if ( !is_pvh_domain(d) )     /* PVH maps lots and lots */
> > +            printk(XENLOG_G_INFO
> > +                   "memory_map:add: dom%d gfn=%lx mfn=%lx nr=%lx\n",
> > +                   d->domain_id, gfn, mfn, nr_mfns);
> >  
> >          ret = iomem_permit_access(d, mfn, mfn + nr_mfns - 1);
> >          if ( !ret && paging_mode_translate(d) )
> > @@ -91,9 +92,10 @@ long domctl_memory_mapping(struct domain *d, unsigned long gfn,
> >      }
> >      else
> >      {
> > -        printk(XENLOG_G_INFO
> > -               "memory_map:remove: dom%d gfn=%lx mfn=%lx nr=%lx\n",
> > -               d->domain_id, gfn, mfn, nr_mfns);
> > +        if ( !is_pvh_domain(d) )     /* PVH unmaps lots and lots */
> > +            printk(XENLOG_G_INFO
> > +                   "memory_map:remove: dom%d gfn=%lx mfn=%lx nr=%lx\n",
> > +                   d->domain_id, gfn, mfn, nr_mfns);
> >  
> >          if ( paging_mode_translate(d) )
> >              for ( i = 0; i < nr_mfns; i++ )
> 
> Are these changes still necessary? IOW why would a PVH guest be
> mapping so much more MMIO memory than a PV one?

Right, they're not needed anymore since I now map the whole IO space
upfront in Xen. In earlier patches Linux was doing it.
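
Roughly, the idea is (hypothetical sketch, not the literal patch; helper
names approximate):

    /* Map every non-RAM e820 range 1:1 into the PVH dom0 p2m up front. */
    for ( i = 0; i < e820.nr_map; i++ )
    {
        unsigned long pfn, start_pfn, end_pfn;

        if ( e820.map[i].type == E820_RAM )
            continue;

        start_pfn = PFN_DOWN(e820.map[i].addr);
        end_pfn = PFN_UP(e820.map[i].addr + e820.map[i].size);
        for ( pfn = start_pfn; pfn < end_pfn; pfn++ )
            set_mmio_p2m_entry(d, pfn, _mfn(pfn));
    }

so the per-range memory_map domctls (and their printks) aren't hit anymore.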

thanks
Mukesh

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 19/20] PVH xen: elf and iommu related changes to prep for dom0 PVH
  2013-05-15 12:12   ` Jan Beulich
@ 2013-05-16  1:58     ` Mukesh Rathor
  2013-05-16  8:03       ` Jan Beulich
  0 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-16  1:58 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Wed, 15 May 2013 13:12:56 +0100
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com>
> >>> wrote:
> > --- a/xen/arch/x86/setup.c
> > +++ b/xen/arch/x86/setup.c
> > @@ -60,6 +60,10 @@ integer_param("maxcpus", max_cpus);
> >  static bool_t __initdata disable_smep;
> >  invbool_param("smep", disable_smep);
> >  
> > +/* Boot dom0 in PVH mode */
> > +bool_t __initdata opt_dom0pvh;
> > +boolean_param("dom0pvh", opt_dom0pvh);
> 
> Does this really belong here (instead of domain_build.c)?

Yes, so we can pass the flag to domain_create(), which sets the guest type:

    domcr_flags = (opt_dom0pvh ? DOMCRF_pvh | DOMCRF_hap : 0);
    domcr_flags |= DOMCRF_s3_integrity;
    dom0 = domain_create(0, domcr_flags, 0);

> > --- a/xen/common/libelf/libelf-loader.c
> > +++ b/xen/common/libelf/libelf-loader.c
> > @@ -127,6 +131,16 @@ static int elf_load_image(void *dst, const
> > void *src, uint64_t filesz, uint64_t int rc;
> >      if ( filesz > ULONG_MAX || memsz > ULONG_MAX )
> >          return -1;
> > +
> > +    if ( opt_dom0pvh )
> 
> So you define (above) and declare (below) the variable in x86-
> specific files, but use it in common code? That's going to break the
> ARM build.

Shoot, forgot about ARM. thanks.

> > +    {
> > +        unsigned long addr = (unsigned long)dst;
> > +        early_pvh_copy_or_zero(addr, src, filesz);
> > +        early_pvh_copy_or_zero(addr + filesz, NULL, memsz -
> > filesz);
> 
> And anyway - repeating my earlier complaint - I don't see why this
> is necessary. In fact I don't see why most of the PV Dom0 building
> code can't be used unchanged for PVH: There's no real need for
> lifting the few restrictions that apply, and hence there needn't be
> any fear of colliding address spaces.

Hmm... that's the best way I could come up with. If you want to prototype
something and replace what I've got, it's perfectly OK by me.


thanks
Mukesh

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range
  2013-05-15 23:05     ` Mukesh Rathor
@ 2013-05-16  7:21       ` Jan Beulich
  2013-05-16 11:03         ` Stefano Stabellini
  2013-05-16 23:56         ` Mukesh Rathor
  0 siblings, 2 replies; 51+ messages in thread
From: Jan Beulich @ 2013-05-16  7:21 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

>>> On 16.05.13 at 01:05, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> On Wed, 15 May 2013 10:58:43 +0100
> "Jan Beulich" <JBeulich@suse.com> wrote:
> 
>> >>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com>
>> >>> wrote:
>> > --- a/xen/arch/x86/mm.c
>> > +++ b/xen/arch/x86/mm.c
>> > @@ -4519,7 +4519,8 @@ static int handle_iomem_range(unsigned long
>> > s, unsigned long e, void *p) 
>> >  static int xenmem_add_to_physmap_once(
>> >      struct domain *d,
>> > -    const struct xen_add_to_physmap *xatp)
>> > +    const struct xen_add_to_physmap *xatp,
>> > +    domid_t foreign_domid)
>> >  {
>> >      struct page_info *page = NULL;
>> >      unsigned long gfn = 0; /* gcc ... */
>> > @@ -4646,7 +4647,7 @@ static int xenmem_add_to_physmap(struct
>> > domain *d,
>> 
>> I know I said this before: This patch can't be complete, or else the
>> new function parameter would actually get used. With the way
>> things are, if this patch gets applied, a user of the new XENMEM_
>> sub-op would not get the expected behavior.
>> 
> 
> No, the new foreign_domid parameter is meaningful for only the 
> XENMAPSPACE_gmfn_foreign OP which is defined in patch 0018. So we 
> should be OK here.

Mukesh, please. Go look at your own patch again: It adds handling
of XENMEM_add_to_physmap_range to arch_memory_op(), calling
xenmem_add_to_physmap_range(), which in turn calls
xenmem_add_to_physmap_once() passing xatpr->foreign_domid
as the last argument. I don't see anywhere in the patch prevention
of that execution flow for an arbitrary guest. If I'm overlooking
something, please point me to it.

Furthermore, now that you forced me to look at that code yet
another time,

>+        xen_ulong_t idx;
>+        xen_pfn_t gpfn;

Pointless variables, ...

>+        struct xen_add_to_physmap xatp;
>+
>+        if ( copy_from_guest_offset(&idx, xatpr->idxs, xatpr->size-1, 1)  ||
>+             copy_from_guest_offset(&gpfn, xatpr->gpfns, xatpr->size-1, 1) )

... you can read directly into the respective xatp fields here.

>+        {
>+            return -EFAULT;
>+        }

Pointless (and inconsistent with code further down in this same
function) braces.

>+
>+        xatp.space = xatpr->space;
>+        xatp.idx = idx;
>+        xatp.gpfn = gpfn;
>+        rc = xenmem_add_to_physmap_once(d, &xatp, xatpr->foreign_domid);

xatp has a domid field - why don't you use that instead of adding a
new function parameter? I'm unclear anyway why two domain IDs
are useful here at all - Ian, Stefano, for one I still can't spot any use
of xen_add_to_physmap_range in tools and qemu (and hence can't
see a clear use case), and then I doubt there's real use for one
domain mapping GFNs from a second domain into a third one. If it's
really dead code that got added here, shouldn't we drop it now
rather than releasing 4.3 with it baked into the interface?

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 11/20] PVH xen: introduce pvh.c
  2013-05-16  1:42     ` Mukesh Rathor
@ 2013-05-16  8:00       ` Jan Beulich
  2013-05-17  0:27         ` Mukesh Rathor
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2013-05-16  8:00 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 16.05.13 at 03:42, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1129,6 +1129,10 @@ arch_do_vcpu_op(
>          struct domain *d = v->domain;
>          struct vcpu_register_vcpu_info info;
>  
> +        rc = -ENOSYS;
> +        if ( is_pvh_vcpu(current) )
> +            break;
> +

Assuming this is meant to be temporary - yes, this _might_ be
acceptable (if accompanied by a proper comment). But then again
registering vCPU info is a pretty basic interface (which recently
got even moved into common code iirc), so I'm having a hard time
seeing why you need to suppress it rather than make it work from
the beginning.

>          rc = -EFAULT;
>          if ( copy_from_guest(&info, arg, 1) )
>              break;
> @@ -1169,6 +1173,10 @@ arch_do_vcpu_op(
>      {
>          struct vcpu_get_physid cpu_id;
>  
> +        rc = -ENOSYS;
> +        if ( is_pvh_vcpu(current) )
> +            break;
> +

Similarly here - what's wrong with this for PVH?

>          rc = -EINVAL;
>          if ( !is_pinned_vcpu(v) )
>              break;
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -968,6 +968,12 @@ long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
>      }
>  
>      case VCPUOP_down:
> +        if ( is_pvh_vcpu(current) )
> +        {
> +            rc = -ENOSYS;
> +            break;
> +        }

I can see that this may indeed require some special cases to be
taken care of. But adding a comment is then the minimum
requirement. And the increasing number of "PVH fixme"s is
worrying in its own right - once again, I went through a similar
exercise with 32-on-64 support, and didn't have a need to post
patches for public review (with the underlying implication of them
being in a mergeable state) with this many issues left open.

> +
>          if ( !test_and_set_bit(_VPF_down, &v->pause_flags) )
>              vcpu_sleep_nosync(v);
>          break;
> @@ -1039,6 +1045,11 @@ long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>  #ifdef VCPU_TRAP_NMI
>      case VCPUOP_send_nmi:
> +        if ( is_pvh_vcpu(current) )
> +        {
> +            rc = -ENOSYS;
> +            break;
> +        }

This one may even have to remain, assuming PVH would send NMIs
via the LAPIC ICR? But then using !is_pv_domain() would seem the right
thing here. OTOH, allowing as many PV operations as possible along
with the respective HVM counterparts may be desirable for easing
the kernel-side transition?
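
Concretely, the guard I have in mind would be something like (illustrative
only, not a tested change):

    case VCPUOP_send_nmi:
        /* Reject for anything that is not plain PV (i.e. PVH and HVM). */
        if ( !is_pv_domain(current->domain) )
        {
            rc = -ENOSYS;
            break;
        }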

Jan

>          if ( !guest_handle_is_null(arg) )
>              return -EINVAL;
>  

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 19/20] PVH xen: elf and iommu related changes to prep for dom0 PVH
  2013-05-16  1:58     ` Mukesh Rathor
@ 2013-05-16  8:03       ` Jan Beulich
  2013-05-17  1:14         ` Mukesh Rathor
  2013-05-18  2:01         ` Mukesh Rathor
  0 siblings, 2 replies; 51+ messages in thread
From: Jan Beulich @ 2013-05-16  8:03 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 16.05.13 at 03:58, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> On Wed, 15 May 2013 13:12:56 +0100
> "Jan Beulich" <JBeulich@suse.com> wrote:
> 
>> >>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com>
>> >>> wrote:
>> > --- a/xen/arch/x86/setup.c
>> > +++ b/xen/arch/x86/setup.c
>> > @@ -60,6 +60,10 @@ integer_param("maxcpus", max_cpus);
>> >  static bool_t __initdata disable_smep;
>> >  invbool_param("smep", disable_smep);
>> >  
>> > +/* Boot dom0 in PVH mode */
>> > +bool_t __initdata opt_dom0pvh;
>> > +boolean_param("dom0pvh", opt_dom0pvh);
>> 
>> Does this really belong here (instead of domain_build.c)?
> 
> Yes, so we can pass the flag to domain_create() which sets the guest type:
> 
>     domcr_flags = (opt_dom0pvh ? DOMCRF_pvh | DOMCRF_hap : 0);
>     domcr_flags |= DOMCRF_s3_integrity;
>     dom0 = domain_create(0, domcr_flags, 0);

The symbol is global, so use sites don't matter. And I continue to
think that logically it belongs in domain_build.c.

>> > +    {
>> > +        unsigned long addr = (unsigned long)dst;
>> > +        early_pvh_copy_or_zero(addr, src, filesz);
>> > +        early_pvh_copy_or_zero(addr + filesz, NULL, memsz -
>> > filesz);
>> 
>> And anyway - repeating my earlier complaint - I don't see why this
>> is necessary. In fact I don't see why most of the PV Dom0 building
>> code can't be used unchanged for PVH: There's no real need for
>> lifting the few restrictions that apply, and hence there needn't be
>> any fear of colliding address spaces.
> 
> Hmm... thats the best way I could come up with. If you want to prototype
> something and replace what I've got, it's perfectly ok by me.

There's nothing to prototype - just use the code that's there for
PV Dom0.

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range
  2013-05-16  7:21       ` Jan Beulich
@ 2013-05-16 11:03         ` Stefano Stabellini
  2013-05-16 12:01           ` Jan Beulich
  2013-05-16 23:56         ` Mukesh Rathor
  1 sibling, 1 reply; 51+ messages in thread
From: Stefano Stabellini @ 2013-05-16 11:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

On Thu, 16 May 2013, Jan Beulich wrote:
> >>> On 16.05.13 at 01:05, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> > On Wed, 15 May 2013 10:58:43 +0100
> > "Jan Beulich" <JBeulich@suse.com> wrote:
> > 
> >> >>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com>
> >> >>> wrote:
> >> > --- a/xen/arch/x86/mm.c
> >> > +++ b/xen/arch/x86/mm.c
> >> > @@ -4519,7 +4519,8 @@ static int handle_iomem_range(unsigned long
> >> > s, unsigned long e, void *p) 
> >> >  static int xenmem_add_to_physmap_once(
> >> >      struct domain *d,
> >> > -    const struct xen_add_to_physmap *xatp)
> >> > +    const struct xen_add_to_physmap *xatp,
> >> > +    domid_t foreign_domid)
> >> >  {
> >> >      struct page_info *page = NULL;
> >> >      unsigned long gfn = 0; /* gcc ... */
> >> > @@ -4646,7 +4647,7 @@ static int xenmem_add_to_physmap(struct
> >> > domain *d,
> >> 
> >> I know I said this before: This patch can't be complete, or else the
> >> new function parameter would actually get used. With the way
> >> things are, if this patch gets applied, a user of the new XENMEM_
> >> sub-op would not get the expected behavior.
> >> 
> > 
> > No, the new foreign_domid parameter is meaningful for only the 
> > XENMAPSPACE_gmfn_foreign OP which is defined in patch 0018. So we 
> > should be OK here.
> 
> Mukesh, please. Go look at your own patch again: It adds handling
> of XENMEM_add_to_physmap_range to arch_memory_op(), calling
> xenmem_add_to_physmap_range(), which in turn calls
> xenmem_add_to_physmap_once() passing xatpr->foreign_domid
> as the last argument. I don't see anywhere in the patch prevention
> of that execution flow for an arbitrary guest. If I'm overlooking
> something, please point me to it.
> 
> Furthermore, now that you forced me to look at that code yet
> another time,
> 
> >+        xen_ulong_t idx;
> >+        xen_pfn_t gpfn;
> 
> Pointless variables, ...
> 
> >+        struct xen_add_to_physmap xatp;
> >+
> >+        if ( copy_from_guest_offset(&idx, xatpr->idxs, xatpr->size-1, 1)  ||
> >+             copy_from_guest_offset(&gpfn, xatpr->gpfns, xatpr->size-1, 1) )
> 
> ... you can read directly into the respective xatp fields here.
> 
> >+        {
> >+            return -EFAULT;
> >+        }
> 
> Pointless (and inconsistent with code further down in this same
> function) braces.
> 
> >+
> >+        xatp.space = xatpr->space;
> >+        xatp.idx = idx;
> >+        xatp.gpfn = gpfn;
> >+        rc = xenmem_add_to_physmap_once(d, &xatp, xatpr->foreign_domid);
> 
> xatp has a domid field - why don't you use that instead of adding a
> new function parameter? I'm unclear anyway why two domain IDs
> are useful here at all - Ian, Stefano, for one I still can't spot any use
> of xen_add_to_physmap_range in tools and qemu (and hence can't
> see a clear use case), and then I doubt there's real use for one
> domain mapping GFNs from a second domain into a third one. If it's
> really dead code that got added here, shouldn't we drop it now
> rather than releasing 4.3 with it baked into the interface?

We use XENMEM_add_to_physmap_range to map foreign mfns in dom0 during
domain creation.

> Jan
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range
  2013-05-16 11:03         ` Stefano Stabellini
@ 2013-05-16 12:01           ` Jan Beulich
  2013-05-16 15:04             ` Stefano Stabellini
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2013-05-16 12:01 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Ian Campbell, xen-devel

>>> On 16.05.13 at 13:03, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> On Thu, 16 May 2013, Jan Beulich wrote:
>> >>> On 16.05.13 at 01:05, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
>> >+        xatp.space = xatpr->space;
>> >+        xatp.idx = idx;
>> >+        xatp.gpfn = gpfn;
>> >+        rc = xenmem_add_to_physmap_once(d, &xatp, xatpr->foreign_domid);
>> 
>> xatp has a domid field - why don't you use that instead of adding a
>> new function parameter? I'm unclear anyway why two domain IDs
>> are useful here at all - Ian, Stefano, for one I still can't spot any use
>> of xen_add_to_physmap_range in tools and qemu (and hence can't
>> see a clear use case), and then I doubt there's real use for one
>> domain mapping GFNs from a second domain into a third one. If it's
>> really dead code that got added here, shouldn't we drop it now
>> rather than releasing 4.3 with it baked into the interface?
> 
> We use XENMEM_add_to_physmap_range to map foreign mfns in dom0 during
> domain creation.

Hmm - for one, where is that code? And then - this involves only
two domains, but the interface explicitly permits three, and
that aspect was what my query was about.

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range
  2013-05-16 12:01           ` Jan Beulich
@ 2013-05-16 15:04             ` Stefano Stabellini
  0 siblings, 0 replies; 51+ messages in thread
From: Stefano Stabellini @ 2013-05-16 15:04 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

On Thu, 16 May 2013, Jan Beulich wrote:
> >>> On 16.05.13 at 13:03, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> > On Thu, 16 May 2013, Jan Beulich wrote:
> >> >>> On 16.05.13 at 01:05, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> >> >+        xatp.space = xatpr->space;
> >> >+        xatp.idx = idx;
> >> >+        xatp.gpfn = gpfn;
> >> >+        rc = xenmem_add_to_physmap_once(d, &xatp, xatpr->foreign_domid);
> >> 
> >> xatp has a domid field - why don't you use that instead of adding a
> >> new function parameter? I'm unclear anyway why two domain IDs
> >> are useful here at all - Ian, Stefano, for one I still can't spot any use
> >> of xen_add_to_physmap_range in tools and qemu (and hence can't
> >> see a clear use case), and then I doubt there's real use for one
> >> domain mapping GFNs from a second domain into a third one. If it's
> >> really dead code that got added here, shouldn't we drop it now
> >> rather than releasing 4.3 with it baked into the interface?
> > 
> > We use XENMEM_add_to_physmap_range to map foreign mfns in dom0 during
> > domain creation.
> 
> Hmm - for one, where is that code?

arch/arm/xen/enlighten.c:map_foreign_page
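
For reference, that function looks roughly like this (sketch from memory,
so treat names and details as approximate):

    static int map_foreign_page(unsigned long lpfn, unsigned long fgmfn,
                                unsigned int domid)
    {
        struct xen_add_to_physmap_range xatp = {
            .domid = DOMID_SELF,        /* mapping domain (dom0)        */
            .foreign_domid = domid,     /* domain whose page we map     */
            .size = 1,
            .space = XENMAPSPACE_gmfn_foreign,
        };
        xen_ulong_t idx = fgmfn;        /* source gmfn                  */
        xen_pfn_t gpfn = lpfn;          /* destination pfn in dom0      */

        set_xen_guest_handle(xatp.idxs, &idx);
        set_xen_guest_handle(xatp.gpfns, &gpfn);

        return HYPERVISOR_memory_op(XENMEM_add_to_physmap_range, &xatp);
    }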


> And then - this involves only
> two domains, but the interface explicitly permits for three, and
> that aspect was what my query was about.
 
We are using two domains: DOMID_SELF and the foreign_domid. I take it
your point is that, given we are always using domid = DOMID_SELF, that
field is not actually useful? I guess that is correct; however, I
wouldn't go as far as changing the interface again just for that.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range
  2013-05-16  7:21       ` Jan Beulich
  2013-05-16 11:03         ` Stefano Stabellini
@ 2013-05-16 23:56         ` Mukesh Rathor
  2013-05-17  6:37           ` Jan Beulich
  1 sibling, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-16 23:56 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Thu, 16 May 2013 08:21:16 +0100
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 16.05.13 at 01:05, Mukesh Rathor <mukesh.rathor@oracle.com>
> >>> wrote:
> > On Wed, 15 May 2013 10:58:43 +0100
> > "Jan Beulich" <JBeulich@suse.com> wrote:
> > 
> >> >>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com>
> >> >>> wrote:
> >> > --- a/xen/arch/x86/mm.c
> >> > +++ b/xen/arch/x86/mm.c
> >> > @@ -4519,7 +4519,8 @@ static int handle_iomem_range(unsigned long
> >> > s, unsigned long e, void *p) 
> >> >  static int xenmem_add_to_physmap_once(
> >> >      struct domain *d,
> >> > -    const struct xen_add_to_physmap *xatp)
> >> > +    const struct xen_add_to_physmap *xatp,
> >> > +    domid_t foreign_domid)
> >> >  {
> >> >      struct page_info *page = NULL;
> >> >      unsigned long gfn = 0; /* gcc ... */
> >> > @@ -4646,7 +4647,7 @@ static int xenmem_add_to_physmap(struct
> >> > domain *d,
> >> 
> >> I know I said this before: This patch can't be complete, or else
> >> the new function parameter would actually get used. With the way
> >> things are, if this patch gets applied, a user of the new XENMEM_
> >> sub-op would not get the expected behavior.
> >> 
> > 
> > No, the new foreign_domid parameter is meaningful for only the 
> > XENMAPSPACE_gmfn_foreign OP which is defined in patch 0018. So we 
> > should be OK here.
> 
> Mukesh, please. Go look at your own patch again: It adds handling
> of XENMEM_add_to_physmap_range to arch_memory_op(), calling
> xenmem_add_to_physmap_range(), which in turn calls
> xenmem_add_to_physmap_once() passing xatpr->foreign_domid
> as the last argument. I don't see anywhere in the patch prevention
> of that execution flow for an arbitrary guest. If I'm overlooking
> something, please point me to it.

Hmm... looking at the code again, xenmem_add_to_physmap_once() will
start by setting mfn = 0; then, in the case statement, it won't find
XENMAPSPACE_gmfn_foreign, so it will fall out of the switch to:

    if ( !paging_mode_translate(d) || (mfn == 0) )

Since mfn is 0, it will return -EINVAL. The error would be copied
to xatpr->errs and the guest will see it. What am I missing? I can
add full support in just one patch by folding patch 18 in here, but
in the past I was asked to break up big patches.
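
To spell the flow out (simplified sketch, not the literal mm.c code):

    static int xenmem_add_to_physmap_once(struct domain *d,
                                          const struct xen_add_to_physmap *xatp,
                                          domid_t foreign_domid)
    {
        unsigned long mfn = 0;

        switch ( xatp->space )
        {
        case XENMAPSPACE_shared_info:
        case XENMAPSPACE_grant_table:
        case XENMAPSPACE_gmfn:
            /* ... mfn gets set for the supported spaces ... */
            break;
        /* No XENMAPSPACE_gmfn_foreign case until patch 18, so mfn stays 0. */
        }

        if ( !paging_mode_translate(d) || (mfn == 0) )
            return -EINVAL;    /* what a gmfn_foreign caller sees today */
        /* ... */
    }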

> Furthermore, now that you forced me to look at that code yet
> another time,
> 
> >+        xen_ulong_t idx;
> >+        xen_pfn_t gpfn;
> 
> Pointless variables, ...
> 
> >+        struct xen_add_to_physmap xatp;
> >+
> >+        if ( copy_from_guest_offset(&idx, xatpr->idxs,
> >xatpr->size-1, 1)  ||
> >+             copy_from_guest_offset(&gpfn, xatpr->gpfns,
> >xatpr->size-1, 1) )
> 
> ... you can read directly into the respective xatp fields here.

I could, but it makes the lines in the if statement long and wrapped,
making the code harder to read IMO. The compiler should generate exactly
the same code in both cases. If it really bothers you, I can change it.

thanks
Mukesh

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 11/20] PVH xen: introduce pvh.c
  2013-05-16  8:00       ` Jan Beulich
@ 2013-05-17  0:27         ` Mukesh Rathor
  2013-05-17  6:43           ` Jan Beulich
  0 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-17  0:27 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Thu, 16 May 2013 09:00:44 +0100
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 16.05.13 at 03:42, Mukesh Rathor <mukesh.rathor@oracle.com>
> >>> wrote:
> > --- a/xen/arch/x86/domain.c
> > +++ b/xen/arch/x86/domain.c
> > @@ -1129,6 +1129,10 @@ arch_do_vcpu_op(
> >          struct domain *d = v->domain;
> >          struct vcpu_register_vcpu_info info;
> >  
> > +        rc = -ENOSYS;
> > +        if ( is_pvh_vcpu(current) )
> > +            break;
> > +
> 
> Assuming this is meant to be temporary - yes, this _might_ be
> acceptable (if accompanied by a proper comment). But then again
> registering vCPU info is a pretty basic interface (which recently
> got even moved into common code iirc), so I'm having a hard time
> seeing why you need to suppress it rather than make it work from
> the beginning.

Because I am not as smart as you, and my brain gets full sooner :). I
only know how to do big features in pieces. As it is, each refresh costs
me days to merge: resolving conflicts, looking at code to see what
changed, etc. Then almost always something breaks and takes a while
to debug. I have the Linux-side patch to watch out for too.

My understanding with previous maintainers was: establish a baseline with
something working, then keep adding bells and whistles to it. This is an
effort to that end. I am also OK with dropping the ball on PVH and letting
someone else who can finish it all in one shot do it.

Thanks again for all your time and input; you are thorough :).

Mukesh

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 19/20] PVH xen: elf and iommu related changes to prep for dom0 PVH
  2013-05-16  8:03       ` Jan Beulich
@ 2013-05-17  1:14         ` Mukesh Rathor
  2013-05-17  6:45           ` Jan Beulich
  2013-05-18  2:01         ` Mukesh Rathor
  1 sibling, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-17  1:14 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Thu, 16 May 2013 09:03:16 +0100
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 16.05.13 at 03:58, Mukesh Rathor <mukesh.rathor@oracle.com>
> >>> wrote:
> > On Wed, 15 May 2013 13:12:56 +0100
> > "Jan Beulich" <JBeulich@suse.com> wrote:
> > 
> >> >>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com>
> >> >>> wrote:
> >> > --- a/xen/arch/x86/setup.c
> >> > +++ b/xen/arch/x86/setup.c
> >> > @@ -60,6 +60,10 @@ integer_param("maxcpus", max_cpus);
> >> >  static bool_t __initdata disable_smep;
> >> >  invbool_param("smep", disable_smep);
> >> >  
> >> > +/* Boot dom0 in PVH mode */
> >> > +bool_t __initdata opt_dom0pvh;
> >> > +boolean_param("dom0pvh", opt_dom0pvh);
> >> 
> >> Does this really belong here (instead of domain_build.c)?
> > 
> > Yes, so we can pass the flag to domain_create() which sets the
> > guest type:
> > 
> >     domcr_flags = (opt_dom0pvh ? DOMCRF_pvh | DOMCRF_hap : 0);
> >     domcr_flags |= DOMCRF_s3_integrity;
> >     dom0 = domain_create(0, domcr_flags, 0);
> 
> The symbol is global, so use sites don't matter. And I continue to
> think that logically it belongs into domain_build.c.

Oh, you mean just the declarations; sure, I can do that.

thanks
Mukesh

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range
  2013-05-16 23:56         ` Mukesh Rathor
@ 2013-05-17  6:37           ` Jan Beulich
  2013-05-17 22:24             ` Mukesh Rathor
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2013-05-17  6:37 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 17.05.13 at 01:56, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> On Thu, 16 May 2013 08:21:16 +0100
> "Jan Beulich" <JBeulich@suse.com> wrote:
> 
>> >>> On 16.05.13 at 01:05, Mukesh Rathor <mukesh.rathor@oracle.com>
>> >>> wrote:
>> > On Wed, 15 May 2013 10:58:43 +0100
>> > "Jan Beulich" <JBeulich@suse.com> wrote:
>> > 
>> >> >>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com>
>> >> >>> wrote:
>> >> > --- a/xen/arch/x86/mm.c
>> >> > +++ b/xen/arch/x86/mm.c
>> >> > @@ -4519,7 +4519,8 @@ static int handle_iomem_range(unsigned long
>> >> > s, unsigned long e, void *p) 
>> >> >  static int xenmem_add_to_physmap_once(
>> >> >      struct domain *d,
>> >> > -    const struct xen_add_to_physmap *xatp)
>> >> > +    const struct xen_add_to_physmap *xatp,
>> >> > +    domid_t foreign_domid)
>> >> >  {
>> >> >      struct page_info *page = NULL;
>> >> >      unsigned long gfn = 0; /* gcc ... */
>> >> > @@ -4646,7 +4647,7 @@ static int xenmem_add_to_physmap(struct
>> >> > domain *d,
>> >> 
>> >> I know I said this before: This patch can't be complete, or else
>> >> the new function parameter would actually get used. With the way
>> >> things are, if this patch gets applied, a user of the new XENMEM_
>> >> sub-op would not get the expected behavior.
>> >> 
>> > 
>> > No, the new foreign_domid parameter is meaningful for only the 
>> > XENMAPSPACE_gmfn_foreign OP which is defined in patch 0018. So we 
>> > should be OK here.
>> 
>> Mukesh, please. Go look at your own patch again: It adds handling
>> of XENMEM_add_to_physmap_range to arch_memory_op(), calling
>> xenmem_add_to_physmap_range(), which in turn calls
>> xenmem_add_to_physmap_once() passing xatpr->foreign_domid
>> as the last argument. I don't see anywhere in the patch prevention
>> of that execution flow for an arbitrary guest. If I'm overlooking
>> something, please point me to it.
> 
> Hmm..., looking at the code again, xenmem_add_to_physmap_once() will
> start with setting mfn=0, then in the case statement, it won't find 
> XENMAPSPACE_gmfn_foreign, so will get out of switch to :
> 
>     if ( !paging_mode_translate(d) || (mfn == 0) )
> 
> Since mfn is 0, it will return -EINVAL. The error would be copied
> to the xatpr->errs and guest will see that.  What am I missing?

Okay, I see now, but that was entirely unrecognizable from
patch context and patch description.

> I can
> add full support in just one patch, adding patch 18 here, but then in
> the past I was asked to break big patches. 

With the above briefly stated in the patch description, I'm fine
with you keeping it separate.

>> Furthermore, now that you forced me to look at that code yet
>> another time,
>> 
>> >+        xen_ulong_t idx;
>> >+        xen_pfn_t gpfn;
>> 
>> Pointless variables, ...
>> 
>> >+        struct xen_add_to_physmap xatp;
>> >+
>> >+        if ( copy_from_guest_offset(&idx, xatpr->idxs,
>> >xatpr->size-1, 1)  ||
>> >+             copy_from_guest_offset(&gpfn, xatpr->gpfns,
>> >xatpr->size-1, 1) )
>> 
>> ... you can read directly into the respective xatp fields here.
> 
> I could, but it makes the lines long/wrap in the if statement making the 
> code
> harder to read IMO. The compiler should do exact same thing in both cases.

I'm afraid it's not permitted to do so because of the addresses of
the variables being taken and, through some macro levels, passed
to global functions.

> If it really bothers you, I can change it.

It's certainly a matter of taste to some degree, but to me it's
inefficient code (unless you could prove at least modern gcc indeed
doing said optimization despite the use of the & operator) and
would sooner or later (once stumbling across that code again)
prompt me to submit a cleanup patch...

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 11/20] PVH xen: introduce pvh.c
  2013-05-17  0:27         ` Mukesh Rathor
@ 2013-05-17  6:43           ` Jan Beulich
  2013-05-21  0:08             ` Mukesh Rathor
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2013-05-17  6:43 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 17.05.13 at 02:27, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> On Thu, 16 May 2013 09:00:44 +0100
> "Jan Beulich" <JBeulich@suse.com> wrote:
> 
>> >>> On 16.05.13 at 03:42, Mukesh Rathor <mukesh.rathor@oracle.com>
>> >>> wrote:
>> > --- a/xen/arch/x86/domain.c
>> > +++ b/xen/arch/x86/domain.c
>> > @@ -1129,6 +1129,10 @@ arch_do_vcpu_op(
>> >          struct domain *d = v->domain;
>> >          struct vcpu_register_vcpu_info info;
>> >  
>> > +        rc = -ENOSYS;
>> > +        if ( is_pvh_vcpu(current) )
>> > +            break;
>> > +
>> 
>> Assuming this is meant to be temporary - yes, this _might_ be
>> acceptable (if accompanied by a proper comment). But then again
>> registering vCPU info is a pretty basic interface (which recently
>> got even moved into common code iirc), so I'm having a hard time
>> seeing why you need to suppress it rather than make it work from
>> the beginning.
> 
> Because I am not as smart as you, and my brain gets full sooner :). I
> only know to do big features in pieces. As it is already, each refresh
> costs me days to merge, resolving conflicts, looking at code to see what
> changed, etc.. then almost always something breaks, and takes a while
> to debug. I got linux side patch to watch out too.

I understand it's hard, all the more so considering how long you've
been at it already.

> My understanding with previous maintainers was, establish a baseline with
> something working, then keep adding bells and whistles to it. This is an
> effort to that end. I am also OK dropping the ball on pvh and let
> someones else who can finish it all in one shot do it.

And I already indicated various places where I'm fine with such
compromises. But this one - with the code already working for PV
and HVM guests - I simply expect would work without you doing
_anything_ (or if not, it must be something really simple that needs
adjustment).

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 19/20] PVH xen: elf and iommu related changes to prep for dom0 PVH
  2013-05-17  1:14         ` Mukesh Rathor
@ 2013-05-17  6:45           ` Jan Beulich
  0 siblings, 0 replies; 51+ messages in thread
From: Jan Beulich @ 2013-05-17  6:45 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 17.05.13 at 03:14, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> On Thu, 16 May 2013 09:03:16 +0100
> "Jan Beulich" <JBeulich@suse.com> wrote:
> 
>> >>> On 16.05.13 at 03:58, Mukesh Rathor <mukesh.rathor@oracle.com>
>> >>> wrote:
>> > On Wed, 15 May 2013 13:12:56 +0100
>> > "Jan Beulich" <JBeulich@suse.com> wrote:
>> > 
>> >> >>> On 15.05.13 at 02:52, Mukesh Rathor <mukesh.rathor@oracle.com>
>> >> >>> wrote:
>> >> > --- a/xen/arch/x86/setup.c
>> >> > +++ b/xen/arch/x86/setup.c
>> >> > @@ -60,6 +60,10 @@ integer_param("maxcpus", max_cpus);
>> >> >  static bool_t __initdata disable_smep;
>> >> >  invbool_param("smep", disable_smep);
>> >> >  
>> >> > +/* Boot dom0 in PVH mode */
>> >> > +bool_t __initdata opt_dom0pvh;
>> >> > +boolean_param("dom0pvh", opt_dom0pvh);
>> >> 
>> >> Does this really belong here (instead of domain_build.c)?
>> > 
>> > Yes, so we can pass the flag to domain_create() which sets the
>> > guest type:
>> > 
>> >     domcr_flags = (opt_dom0pvh ? DOMCRF_pvh | DOMCRF_hap : 0);
>> >     domcr_flags |= DOMCRF_s3_integrity;
>> >     dom0 = domain_create(0, domcr_flags, 0);
>> 
>> The symbol is global, so use sites don't matter. And I continue to
>> think that logically it belongs into domain_build.c.
> 
> Oh, you mean just the declarations, sure I can do that.

Just to avoid confusion - I mean the _definition_ (of both the variable
and the command line argument); which header the declaration is in I
don't really care about, as long as it's halfway sensible.

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range
  2013-05-17  6:37           ` Jan Beulich
@ 2013-05-17 22:24             ` Mukesh Rathor
  0 siblings, 0 replies; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-17 22:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Fri, 17 May 2013 07:37:16 +0100
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 17.05.13 at 01:56, Mukesh Rathor <mukesh.rathor@oracle.com>
> >>> wrote:
> > On Thu, 16 May 2013 08:21:16 +0100
> > "Jan Beulich" <JBeulich@suse.com> wrote:
> > 
> >> >>> On 16.05.13 at 01:05, Mukesh Rathor <mukesh.rathor@oracle.com>
> >> >>> wrote:
> >> > On Wed, 15 May 2013 10:58:43 +0100
> >> > "Jan Beulich" <JBeulich@suse.com> wrote:
> >> > 
> >> >xatpr->size-1, 1) )
> >> 
> >> ... you can read directly into the respective xatp fields here.
> > 
> > I could, but it makes the lines long/wrap in the if statement
> > making the code
> > harder to read IMO. The compiler should do exact same thing in both
> > cases.
> 
> I'm afraid it's not permitted to do so because of the addresses of
> the variables being taken and, through some macro levels, passed
> to global functions.
> 
> > If it really bothers you, I can change it.
> 
> It's certainly a matter of taste to some degree, but to me it's
> inefficient code (unless you could prove at least modern gcc indeed
> doing said optimization despite the use of the & operator) and
> would sooner or later (once stumbling across that code again)
> prompt me to submit a cleanup patch...

Modern compilers are amazing: gcc generates pretty much the same code,
with slightly different ordering, and only two instructions are different:

NEW (meaning not using local variables):

        if ( copy_from_guest_offset(&xatp.idx, xatpr->idxs, xatpr->size-1, 1)
             || copy_from_guest_offset(&xatp.gpfn, xatpr->gpfns, xatpr->size-1,
                                       1) )
            return -EFAULT;


0xffff82c4c017198b <xenmem_add_to_physmap_range+31>:    mov    %rax,%r12
0xffff82c4c017198e <xenmem_add_to_physmap_range+34>:    and    %rsp,%r12
0xffff82c4c0171991 <xenmem_add_to_physmap_range+37>:    lea    -0x50(%rbp),%r15
0xffff82c4c0171995 <xenmem_add_to_physmap_range+41>:    lea    0x8(%r15),%rdx
0xffff82c4c0171999 <xenmem_add_to_physmap_range+45>:    mov    %rdx,-0x58(%rbp)
0xffff82c4c017199d <xenmem_add_to_physmap_range+49>:    mov    %r12,%r13

0xffff82c4c01719a0 <xenmem_add_to_physmap_range+52>:    lea    0x10(%r15),%rdx
0xffff82c4c01719a0 <xenmem_add_to_physmap_range+52>:    lea    -0x60(%rbp),%rdx
                      ^^^^^^^^^ Above is OLD with local variables.

0xffff82c4c01719a4 <xenmem_add_to_physmap_range+56>:    mov    %rdx,-0x60(%rbp)
0xffff82c4c01719a8 <xenmem_add_to_physmap_range+60>:    mov    %r12,%r14
0xffff82c4c01719ab <xenmem_add_to_physmap_range+63>:    lea    -0x34(%rbp),%rdx
0xffff82c4c01719af <xenmem_add_to_physmap_range+67>:    mov    %rdx,-0x70(%rbp)
0xffff82c4c01719b3 <xenmem_add_to_physmap_range+71>:    mov    %r12,-0x78(%rbp)
0xffff82c4c01719b7 <xenmem_add_to_physmap_range+75>:    jmpq   0xffff82c4c0171b6a <xenmem_add_to_physmap_range+510 at mm.c:4685>
0xffff82c4c01719bc <xenmem_add_to_physmap_range+80>:    mov    0x8(%rbx),%rdx
0xffff82c4c01719c0 <xenmem_add_to_physmap_range+84>:    mov    0x7fe8(%r12),%rcx
0xffff82c4c01719c8 <xenmem_add_to_physmap_range+92>:    mov    0x10(%rcx),%rcx
0xffff82c4c01719cc <xenmem_add_to_physmap_range+96>:    cmpb   $0x0,0x1e8(%rcx)
0xffff82c4c01719d3 <xenmem_add_to_physmap_range+103>:   je     0xffff82c4c01719ed <xenmem_add_to_physmap_range+129 at mm.c:4689>
0xffff82c4c01719d5 <xenmem_add_to_physmap_range+105>:   movzwl %ax,%eax
0xffff82c4c01719d8 <xenmem_add_to_physmap_range+108>:   lea    -0x8(%rdx,%rax,8),%rsi
0xffff82c4c01719dd <xenmem_add_to_physmap_range+113>:   mov    $0x8,%edx

0xffff82c4c01719e2 <xenmem_add_to_physmap_range+118>:   mov    -0x58(%rbp),%rdi
0xffff82c4c01719fc <xenmem_add_to_physmap_range+144>:   mov    %r15,%rdi
                      ^^^^^^^^^ Above is OLD with local variables.

0xffff82c4c01719e6 <xenmem_add_to_physmap_range+122>:   callq  0xffff82c4c01b5861 <copy_from_user_hvm at hvm.c:2739>
0xffff82c4c01719eb <xenmem_add_to_physmap_range+127>:   jmp    0xffff82c4c0171a03 <xenmem_add_to_physmap_range+151 at mm.c:4689>
0xffff82c4c01719ed <xenmem_add_to_physmap_range+129>:   movzwl %ax,%eax
0xffff82c4c01719f0 <xenmem_add_to_physmap_range+132>:   lea    -0x8(%rdx,%rax,8),%rsi
0xffff82c4c01719f5 <xenmem_add_to_physmap_range+137>:   mov    $0x8,%edx
0xffff82c4c01719fa <xenmem_add_to_physmap_range+142>:   mov    -0x58(%rbp),%rdi
0xffff82c4c01719fe <xenmem_add_to_physmap_range+146>:   callq  0xffff82c4c018c59b <copy_from_user at usercopy.c:167>

Since I already changed it, I'll leave it changed, but now I'm going crazy
over whether the '||' belongs on a line by itself, and CODING_STYLE doesn't
say anything, so hopefully I've got it right and won't have to crank out
another version just for that :).

thanks,
Mukesh

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 19/20] PVH xen: elf and iommu related changes to prep for dom0 PVH
  2013-05-16  8:03       ` Jan Beulich
  2013-05-17  1:14         ` Mukesh Rathor
@ 2013-05-18  2:01         ` Mukesh Rathor
  2013-05-21  7:14           ` Jan Beulich
  1 sibling, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-18  2:01 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Thu, 16 May 2013 09:03:16 +0100
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 16.05.13 at 03:58, Mukesh Rathor <mukesh.rathor@oracle.com>
> >>> wrote:
> > On Wed, 15 May 2013 13:12:56 +0100
> > "Jan Beulich" <JBeulich@suse.com> wrote:
> > 
> >> > +    {
> >> > +        unsigned long addr = (unsigned long)dst;
> >> > +        early_pvh_copy_or_zero(addr, src, filesz);
> >> > +        early_pvh_copy_or_zero(addr + filesz, NULL, memsz -
> >> > filesz);
> >> 
> >> And anyway - repeating my earlier complaint - I don't see why this
> >> is necessary. In fact I don't see why most of the PV Dom0 building
> >> code can't be used unchanged for PVH: There's no real need for
> >> lifting the few restrictions that apply, and hence there needn't be
> >> any fear of colliding address spaces.
> > 
> > Hmm... thats the best way I could come up with. If you want to
> > prototype something and replace what I've got, it's perfectly ok by
> > me.
> 
> There's nothing to prototype - just use the code that's there for
> PV Dom0.

Not sure if you are referring just to the changes in elf_load_image():

+    if ( opt_dom0pvh )
+    {
+        unsigned long addr = (unsigned long)dst;
+        early_pvh_copy_or_zero(addr, src, filesz);
+        early_pvh_copy_or_zero(addr + filesz, NULL, memsz - filesz);
+
+        return 0;
+    }
+
     rc = raw_copy_to_guest(dst, src, filesz);

or to all the changes, including those to construct_dom0()?

As the comment says, for elf_load_image() we need early_pvh_copy_or_zero()
because it's too early in boot: construct_dom0() is running on the idle
vcpu, which is what curr points to.

If that doesn't address your concern, please elaborate.

thanks
Mukesh

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 11/20] PVH xen: introduce pvh.c
  2013-05-17  6:43           ` Jan Beulich
@ 2013-05-21  0:08             ` Mukesh Rathor
  2013-05-21  7:07               ` Jan Beulich
  0 siblings, 1 reply; 51+ messages in thread
From: Mukesh Rathor @ 2013-05-21  0:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Fri, 17 May 2013 07:43:19 +0100
"Jan Beulich" <JBeulich@suse.com> wrote:

> >>> On 17.05.13 at 02:27, Mukesh Rathor <mukesh.rathor@oracle.com>
> >>> wrote:
> > On Thu, 16 May 2013 09:00:44 +0100
> > "Jan Beulich" <JBeulich@suse.com> wrote:
> > 
> >> >>> On 16.05.13 at 03:42, Mukesh Rathor <mukesh.rathor@oracle.com>
> >> >>> wrote:
.... 
> > My understanding with previous maintainers was, establish a
> > baseline with something working, then keep adding bells and
> > whistles to it. This is an effort to that end. I am also OK
> > dropping the ball on pvh and let someones else who can finish it
> > all in one shot do it.
> 
> And I already indicated various places where I'm fine with such
> compromises. But this one - with the code already working for PV
> and HVM guests - I simply expect would work without you doing
> _anything_ (or if not, it must be something really simple that needs
> adjustment).

Actually, now that pvh.c has shrunk a lot and the code is very similar
to hvm.c, are you OK with me just adding some code to hvm.c and getting
rid of pvh.c?

thanks,
Mukesh

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 11/20] PVH xen: introduce pvh.c
  2013-05-21  0:08             ` Mukesh Rathor
@ 2013-05-21  7:07               ` Jan Beulich
  0 siblings, 0 replies; 51+ messages in thread
From: Jan Beulich @ 2013-05-21  7:07 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 21.05.13 at 02:08, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> On Fri, 17 May 2013 07:43:19 +0100
> "Jan Beulich" <JBeulich@suse.com> wrote:
> 
>> >>> On 17.05.13 at 02:27, Mukesh Rathor <mukesh.rathor@oracle.com>
>> >>> wrote:
>> > On Thu, 16 May 2013 09:00:44 +0100
>> > "Jan Beulich" <JBeulich@suse.com> wrote:
>> > 
>> >> >>> On 16.05.13 at 03:42, Mukesh Rathor <mukesh.rathor@oracle.com>
>> >> >>> wrote:
> .... 
>> > My understanding with previous maintainers was, establish a
>> > baseline with something working, then keep adding bells and
>> > whistles to it. This is an effort to that end. I am also OK
>> > dropping the ball on pvh and let someones else who can finish it
>> > all in one shot do it.
>> 
>> And I already indicated various places where I'm fine with such
>> compromises. But this one - with the code already working for PV
>> and HVM guests - I simply expect would work without you doing
>> _anything_ (or if not, it must be something really simple that needs
>> adjustment).
> 
> Actually, now that pvh.c has shrunk a lot, and code is very similar
> to hvm.c, are you OK if I just add some code to hvm.c and get rid of
> pvh.c?

Sure.

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 19/20] PVH xen: elf and iommu related changes to prep for dom0 PVH
  2013-05-18  2:01         ` Mukesh Rathor
@ 2013-05-21  7:14           ` Jan Beulich
  0 siblings, 0 replies; 51+ messages in thread
From: Jan Beulich @ 2013-05-21  7:14 UTC (permalink / raw)
  To: Mukesh Rathor; +Cc: xen-devel

>>> On 18.05.13 at 04:01, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
> On Thu, 16 May 2013 09:03:16 +0100
> "Jan Beulich" <JBeulich@suse.com> wrote:
> 
>> >>> On 16.05.13 at 03:58, Mukesh Rathor <mukesh.rathor@oracle.com>
>> >>> wrote:
>> > On Wed, 15 May 2013 13:12:56 +0100
>> > "Jan Beulich" <JBeulich@suse.com> wrote:
>> > 
>> >> > +    {
>> >> > +        unsigned long addr = (unsigned long)dst;
>> >> > +        early_pvh_copy_or_zero(addr, src, filesz);
>> >> > +        early_pvh_copy_or_zero(addr + filesz, NULL, memsz -
>> >> > filesz);
>> >> 
>> >> And anyway - repeating my earlier complaint - I don't see why this
>> >> is necessary. In fact I don't see why most of the PV Dom0 building
>> >> code can't be used unchanged for PVH: There's no real need for
>> >> lifting the few restrictions that apply, and hence there needn't be
>> >> any fear of colliding address spaces.
>> > 
>> > Hmm... thats the best way I could come up with. If you want to
>> > prototype something and replace what I've got, it's perfectly ok by
>> > me.
>> 
>> There's nothing to prototype - just use the code that's there for
>> PV Dom0.
> 
> Not sure if you are referring to just changes in elf_load_image():
> 
> +    if ( opt_dom0pvh )
> +    {
> +        unsigned long addr = (unsigned long)dst;
> +        early_pvh_copy_or_zero(addr, src, filesz);
> +        early_pvh_copy_or_zero(addr + filesz, NULL, memsz - filesz);
> +
> +        return 0;
> +    }
> +
>      rc = raw_copy_to_guest(dst, src, filesz);
> 
> or all changes including construct_dom0() also?

The part above is particularly ugly, but indeed I think you can only
get all or nothing here.

> As the comment says, for elf_load_image() we need early_pvh_copy_or_zero
> because it's too early in boot and construct_dom0() is running on idle
> vcpu where curr points to.
> 
> If that doesn't address your concern, please elaborate.

The fact that we're running on the idle vCPU here isn't any different
for PV Dom0. Just as PV Dom0 setup temporarily switches to
Dom0's page tables, I would imagine PVH Dom0 setup could do so
too, provided you don't lift the address space restrictions (which in
theory you could do, but in practice I don't see any need for). That
should allow the PVH changes to Dom0 setup to be a single code
fragment (adjusting Dom0's page tables from using machine addresses
to physical ones _after_ all other setup is done).
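
For reference, the PV path already does roughly this in construct_dom0()
(simplified, from memory; see domain_build.c for the real thing):

    /* Run on Dom0's page tables for the final part of the build. */
    write_ptbase(v);

    /* Copy the OS image; a plain copy works now that Dom0's
     * address space is the current one. */
    elf.dest = (void *)vkern_start;
    rc = elf_load_binary(&elf);

    /* ... remaining setup on Dom0's page tables ... */

    /* Switch back to the idle vCPU's page tables. */
    write_ptbase(current);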

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread

Thread overview: 51+ messages
2013-05-15  0:52 [PATCH 00/20][V5]: PVH xen: version 5 patches Mukesh Rathor
2013-05-15  0:52 ` [PATCH 01/20] PVH xen: turn gdb_frames/gdt_ents into union Mukesh Rathor
2013-05-15  0:52 ` [PATCH 02/20] PVH xen: add XENMEM_add_to_physmap_range Mukesh Rathor
2013-05-15  9:58   ` Jan Beulich
2013-05-15 23:05     ` Mukesh Rathor
2013-05-16  7:21       ` Jan Beulich
2013-05-16 11:03         ` Stefano Stabellini
2013-05-16 12:01           ` Jan Beulich
2013-05-16 15:04             ` Stefano Stabellini
2013-05-16 23:56         ` Mukesh Rathor
2013-05-17  6:37           ` Jan Beulich
2013-05-17 22:24             ` Mukesh Rathor
2013-05-15  0:52 ` [PATCH 03/20] PVH xen: create domctl_memory_mapping() function Mukesh Rathor
2013-05-15 10:07   ` Jan Beulich
2013-05-15  0:52 ` [PATCH 04/20] PVH xen: add params to read_segment_register Mukesh Rathor
2013-05-15  0:52 ` [PATCH 05/20] PVH xen: vmx realted preparatory changes for PVH Mukesh Rathor
2013-05-15  0:52 ` [PATCH 06/20] PVH xen: Move e820 fields out of pv_domain struct Mukesh Rathor
2013-05-15 10:27   ` Jan Beulich
2013-05-15  0:52 ` [PATCH 07/20] PVH xen: Introduce PVH guest type Mukesh Rathor
2013-05-15  0:52 ` [PATCH 08/20] PVH xen: tools changes to create PVH domain Mukesh Rathor
2013-05-15  0:52 ` [PATCH 09/20] PVH xen: domain creation code changes Mukesh Rathor
2013-05-15  0:52 ` [PATCH 10/20] PVH xen: create PVH vmcs, and also initialization Mukesh Rathor
2013-05-15  0:52 ` [PATCH 11/20] PVH xen: introduce pvh.c Mukesh Rathor
2013-05-15 10:42   ` Jan Beulich
2013-05-16  1:42     ` Mukesh Rathor
2013-05-16  8:00       ` Jan Beulich
2013-05-17  0:27         ` Mukesh Rathor
2013-05-17  6:43           ` Jan Beulich
2013-05-21  0:08             ` Mukesh Rathor
2013-05-21  7:07               ` Jan Beulich
2013-05-15  0:52 ` [PATCH 12/20] PVH xen: create read_descriptor_sel() Mukesh Rathor
2013-05-15  0:52 ` [PATCH 13/20] PVH xen: introduce vmx_pvh.c Mukesh Rathor
2013-05-15 11:46   ` Jan Beulich
2013-05-15  0:52 ` [PATCH 14/20] PVH xen: some misc changes like mtrr, intr, msi Mukesh Rathor
2013-05-15  0:52 ` [PATCH 15/20] PVH xen: hcall page initialize, create PVH guest type, etc Mukesh Rathor
2013-05-15  0:52 ` [PATCH 16/20] PVH xen: Miscellaneous changes Mukesh Rathor
2013-05-15 11:53   ` Jan Beulich
2013-05-16  1:51     ` Mukesh Rathor
2013-05-15  0:52 ` [PATCH 17/20] PVH xen: Introduce p2m_map_foreign Mukesh Rathor
2013-05-15 11:55   ` Jan Beulich
2013-05-15  0:52 ` [PATCH 18/20] PVH xen: Add and remove foreign pages Mukesh Rathor
2013-05-15 12:05   ` Jan Beulich
2013-05-15  0:52 ` [PATCH 19/20] PVH xen: elf and iommu related changes to prep for dom0 PVH Mukesh Rathor
2013-05-15 12:12   ` Jan Beulich
2013-05-16  1:58     ` Mukesh Rathor
2013-05-16  8:03       ` Jan Beulich
2013-05-17  1:14         ` Mukesh Rathor
2013-05-17  6:45           ` Jan Beulich
2013-05-18  2:01         ` Mukesh Rathor
2013-05-21  7:14           ` Jan Beulich
2013-05-15  0:52 ` [PATCH 20/20] PVH xen: PVH dom0 creation Mukesh Rathor
