* [PATCH v5 00/15] Alternate p2m: support multiple copies of host p2m
From: Ed White @ 2015-07-14  0:14 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

This set of patches adds support to HVM domains for EPTP switching by creating
multiple copies of the host p2m (currently limited to 10 copies).

The primary use of this capability is expected to be in scenarios where access
to memory needs to be monitored and/or restricted below the level at which the
guest OS page tables operate. Two examples that were discussed at the 2014 Xen
developer summit are:

    VM introspection: 
        http://www.slideshare.net/xen_com_mgr/
        zero-footprint-guest-memory-introspection-from-xen

    Secure inter-VM communication:
        http://www.slideshare.net/xen_com_mgr/nakajima-nvf

A more detailed design specification can be found at:
    http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01319.html

Each p2m copy is populated lazily on EPT violations. Permissions for pages
in alternate p2ms can be changed in a similar way to the existing memory
access interface, and gfn->mfn mappings can be changed.

All this is done through extra HVMOP types.
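
As a rough illustration, a monitoring tool might drive these HVMOPs via the
libxc wrappers added by the tools/libxc patch later in this series. The
names and signatures below are assumptions based on that patch, not a
stable API:

    /* Sketch only: enable altp2m, create a view with one page made
     * non-executable, and switch the whole domain to that view. */
    #include <xenctrl.h>

    static int restrict_page(xc_interface *xch, domid_t dom, xen_pfn_t gfn)
    {
        uint16_t view_id;

        if ( xc_altp2m_set_domain_state(xch, dom, 1) )   /* enable altp2m */
            return -1;
        if ( xc_altp2m_create_view(xch, dom, XENMEM_access_rwx, &view_id) )
            return -1;
        /* Drop execute permission for this gfn, in the new view only. */
        if ( xc_altp2m_set_mem_access(xch, dom, view_id, gfn,
                                      XENMEM_access_rw) )
            return -1;
        return xc_altp2m_switch_to_view(xch, dom, view_id);
    }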

The cross-domain HVMOP code has been compile-tested only. It is also
hypervisor-only; the toolstack has not been modified to use it.

The intra-domain code has been tested. Violation notifications can only be
received for pages that have been modified (access permissions and/or
gfn->mfn mapping) intra-domain, and only on VCPUs that have enabled
notification.

VMFUNC and #VE will both be emulated on hardware without native support.
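
(For reference: on hardware with native support, a suitably enabled guest
switches views itself with VMFUNC leaf 0, which takes the leaf number in
EAX and the EPTP-list index in ECX. A minimal guest-side helper, shown
purely for illustration:)

    static inline void vmfunc_switch_view(unsigned int idx)
    {
        /* VMFUNC (opcode 0F 01 D4): EAX = 0 selects EPTP switching,
         * ECX = index into the EPTP list configured by the hypervisor. */
        asm volatile ( ".byte 0x0f, 0x01, 0xd4"
                       :: "a" (0), "c" (idx) : "memory" );
    }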

This code is not compatible with nested HVM functionality and will refuse to
work with nested HVM active. It is also not compatible with migration. It
should be considered experimental.

Changes since v4:

    Patch 3:  don't set bit 63 of top-level entries.

    Patch 5:  extra locking order description in mm-locks.h
              don't initialise altp2m data unless altp2m is enabled globally
               and hardware supports it
              removed some hardware-specific wrappers that were not being used
              renamed ap2m... interfaces to altp2m...
              fixed error path in p2m_init_altp2m

    Patch 7:  addressed remaining feedback

    Patch 8:  made suppress_ve preservation consistent

    Patch 9:  changed flag bit to avoid collision with recently applied series

    Patch 10: check pad fields for zero
              minor formatting changes

    Patch 11: renamed HVM parameter

    Patch 15: removed v3 workaround


Changes since v3:

Major changes are:

    Replaced patch 8.

    Refactored patch 11 to use a single HVMOP with subcodes.

    Addressed feedback in patch 7, and some other patches.

    Added two tools/test patches from Tamas. Both are optional.

    Added various Acked-by and Reviewed-by tags.

    Rebased.

Ravi Sahita will now be the point of contact for this series.


Changes since v2:

Addressed all v2 feedback *except*:

    In patch 5, the per-domain EPTP list page is still allocated from the
    Xen heap. If allocated from the domain heap, Xen panics - IIRC on Haswell
    hardware when walking the EPTP list during exit processing in patch 6.

    HVM_ops are not merged. Tamas suggested merging the memory access ops,
    but in practice they are not as similar as they appear on the surface.
    Razvan suggested merging the implementation code in p2m.c, but that is
    also not as common as it appears on the surface.
    Andrew suggested merging all altp2m ops into one with a subop code in
    the input structure. His point that only 255 ops can be defined is well
    taken, but altp2m uses only 2 more ops than the recently introduced
    ioreq ops, and <15% of the available ops have been defined. Since we
    don't know how to implement XSM hooks and policy with the subop model,
    we have not adopted this suggestion.

    The p2m set/get interface is not modified. The altp2m code needs to
    write suppress_ve in 2 places and read it in 1 place. The original
    patch series managed this by coupling the state of suppress_ve to the
    p2m memory type, which Tim disliked. In v2 of the series, special
    set/get interfaces were added to access suppress_ve only when required.
    Jan has suggested changing the existing interfaces, but we feel this
    is inappropriate for this experimental patch series. Changing the
    existing interfaces would require a design agreement to be reached
    and would impact a large amount of existing code.

    Andrew kindly added some reviewed-by's to v2. I have not carried
    his reviewed-by of the memory event patch forward because Tamas
    requested significant changes to the patch.


Changes since v1:

Many changes since v1 in response to maintainer feedback, including:

    Suppress_ve state is now decoupled from memory type
    VMFUNC emulation handled in x86 emulator
    Lazy-copy algorithm copies any page where mfn != INVALID_MFN
    All nested page fault handling except lazy-copy is now in
        top-level (hvm.c) nested page fault handler
    Split p2m lock type (as suggested by Tim) to avoid lock order violations
    XSM hooks
    Xen parameter to globally enable altp2m (default disabled) and HVM parameter
    Altp2m reference counting no longer uses dirty_cpu bitmap
    Remapped page tracking to invalidate altp2m's where needed to protect Xen
    Many other minor changes

The altp2m invalidation is implemented to a level that I believe satisfies
the requirements of protecting Xen. Invalidation notification is not yet
implemented, and there may be other cases where invalidation is warranted to
protect the integrity of the restrictions placed through altp2m. We may add
further patches in this area.

Testability is still a potential issue. We have offered to make our internal
Windows test binaries available for intra-domain testing. Tamas has
been working on toolstack support for cross-domain testing with a slightly
earlier patch series, and we hope he will submit that support.

Not all of the patches will be of interest to everyone copied here. I've
copied everyone on this initial mailing to give context.

Andrew Cooper (1):
  common/domain: Helpers to pause a domain while in context

Ed White (9):
  VMX: VMFUNC and #VE definitions and detection.
  VMX: implement suppress #VE.
  x86/HVM: Hardware alternate p2m support detection.
  x86/altp2m: basic data structures and support routines.
  VMX/altp2m: add code to support EPTP switching and #VE.
  x86/altp2m: alternate p2m memory events.
  x86/altp2m: add remaining support routines.
  x86/altp2m: define and implement alternate p2m HVMOP types.
  x86/altp2m: Add altp2mhvm HVM domain parameter.

George Dunlap (1):
  x86/altp2m: add control of suppress_ve.

Ravi Sahita (2):
  VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  x86/altp2m: XSM hooks for altp2m HVM ops

Tamas K Lengyel (2):
  tools/libxc: add support to altp2m hvmops
  tools/xen-access: altp2m testcases

 docs/man/xl.cfg.pod.5                        |  12 +
 docs/misc/xen-command-line.markdown          |   7 +
 tools/flask/policy/policy/modules/xen/xen.if |   4 +-
 tools/libxc/Makefile                         |   1 +
 tools/libxc/include/xenctrl.h                |  21 +
 tools/libxc/xc_altp2m.c                      | 237 ++++++++++++
 tools/libxl/libxl.h                          |   6 +
 tools/libxl/libxl_create.c                   |   1 +
 tools/libxl/libxl_dom.c                      |   2 +
 tools/libxl/libxl_types.idl                  |   1 +
 tools/libxl/xl_cmdimpl.c                     |  10 +
 tools/tests/xen-access/xen-access.c          | 173 +++++++--
 xen/arch/x86/hvm/Makefile                    |   1 +
 xen/arch/x86/hvm/altp2m.c                    |  77 ++++
 xen/arch/x86/hvm/emulate.c                   |  19 +-
 xen/arch/x86/hvm/hvm.c                       | 253 +++++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c                  |  42 +-
 xen/arch/x86/hvm/vmx/vmx.c                   | 177 +++++++++
 xen/arch/x86/mm/hap/Makefile                 |   1 +
 xen/arch/x86/mm/hap/altp2m_hap.c             |  98 +++++
 xen/arch/x86/mm/hap/hap.c                    |  38 +-
 xen/arch/x86/mm/mem_sharing.c                |   4 +-
 xen/arch/x86/mm/mm-locks.h                   |  46 ++-
 xen/arch/x86/mm/p2m-ept.c                    |  37 +-
 xen/arch/x86/mm/p2m-pod.c                    |  12 +-
 xen/arch/x86/mm/p2m-pt.c                     |  10 +-
 xen/arch/x86/mm/p2m.c                        | 554 +++++++++++++++++++++++++--
 xen/arch/x86/x86_emulate/x86_emulate.c       |  19 +-
 xen/arch/x86/x86_emulate/x86_emulate.h       |   4 +
 xen/common/domain.c                          |  28 ++
 xen/common/vm_event.c                        |   4 +
 xen/include/asm-arm/p2m.h                    |   6 +
 xen/include/asm-x86/domain.h                 |  10 +
 xen/include/asm-x86/hvm/altp2m.h             |  42 ++
 xen/include/asm-x86/hvm/hvm.h                |  25 ++
 xen/include/asm-x86/hvm/vcpu.h               |   9 +
 xen/include/asm-x86/hvm/vmx/vmcs.h           |  14 +-
 xen/include/asm-x86/hvm/vmx/vmx.h            |  13 +-
 xen/include/asm-x86/msr-index.h              |   1 +
 xen/include/asm-x86/p2m.h                    |  92 ++++-
 xen/include/public/hvm/hvm_op.h              |  82 ++++
 xen/include/public/hvm/params.h              |   5 +-
 xen/include/public/vm_event.h                |  11 +
 xen/include/xen/sched.h                      |   5 +
 xen/include/xsm/dummy.h                      |  12 +
 xen/include/xsm/xsm.h                        |  12 +
 xen/xsm/dummy.c                              |   2 +
 xen/xsm/flask/hooks.c                        |  12 +
 xen/xsm/flask/policy/access_vectors          |   7 +
 49 files changed, 2156 insertions(+), 103 deletions(-)
 create mode 100644 tools/libxc/xc_altp2m.c
 create mode 100644 xen/arch/x86/hvm/altp2m.c
 create mode 100644 xen/arch/x86/mm/hap/altp2m_hap.c
 create mode 100644 xen/include/asm-x86/hvm/altp2m.h

-- 
1.9.1


* [PATCH v5 01/15] common/domain: Helpers to pause a domain while in context
From: Ed White @ 2015-07-14  0:14 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Andrew Cooper <andrew.cooper3@citrix.com>

For use on codepaths which would need to use domain_pause() but might be in
the target domain's context.  If the target domain is in context, all of its
other VCPUs are paused instead.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/common/domain.c     | 28 ++++++++++++++++++++++++++++
 xen/include/xen/sched.h |  5 +++++
 2 files changed, 33 insertions(+)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 3bc52e6..1bb24ae 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1010,6 +1010,34 @@ int domain_unpause_by_systemcontroller(struct domain *d)
     return 0;
 }
 
+void domain_pause_except_self(struct domain *d)
+{
+    struct vcpu *v, *curr = current;
+
+    if ( curr->domain == d )
+    {
+        for_each_vcpu( d, v )
+            if ( likely(v != curr) )
+                vcpu_pause(v);
+    }
+    else
+        domain_pause(d);
+}
+
+void domain_unpause_except_self(struct domain *d)
+{
+    struct vcpu *v, *curr = current;
+
+    if ( curr->domain == d )
+    {
+        for_each_vcpu( d, v )
+            if ( likely(v != curr) )
+                vcpu_unpause(v);
+    }
+    else
+        domain_unpause(d);
+}
+
 int vcpu_reset(struct vcpu *v)
 {
     struct domain *d = v->domain;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b29d9e7..73d3bc8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -804,6 +804,11 @@ static inline int domain_pause_by_systemcontroller_nosync(struct domain *d)
 {
     return __domain_pause_by_systemcontroller(d, domain_pause_nosync);
 }
+
+/* domain_pause() but safe against trying to pause current. */
+void domain_pause_except_self(struct domain *d);
+void domain_unpause_except_self(struct domain *d);
+
 void cpu_init(void);
 
 struct scheduler;
-- 
1.9.1


* [PATCH v5 02/15] VMX: VMFUNC and #VE definitions and detection.
From: Ed White @ 2015-07-14  0:14 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Currently, neither VMFUNC nor #VE is enabled globally; both may be enabled
on a per-VCPU basis by the altp2m code.

Remove the check for EPTE bit 63 == zero in ept_split_super_page(), as
that bit is now hardware-defined.

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
---
 xen/arch/x86/hvm/vmx/vmcs.c        | 42 +++++++++++++++++++++++++++++++++++---
 xen/arch/x86/mm/p2m-ept.c          |  1 -
 xen/include/asm-x86/hvm/vmx/vmcs.h | 14 +++++++++++--
 xen/include/asm-x86/hvm/vmx/vmx.h  | 13 +++++++++++-
 xen/include/asm-x86/msr-index.h    |  1 +
 5 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 4c5ceb5..bc1cabd 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -101,6 +101,8 @@ u32 vmx_secondary_exec_control __read_mostly;
 u32 vmx_vmexit_control __read_mostly;
 u32 vmx_vmentry_control __read_mostly;
 u64 vmx_ept_vpid_cap __read_mostly;
+u64 vmx_vmfunc __read_mostly;
+bool_t vmx_virt_exception __read_mostly;
 
 const u32 vmx_introspection_force_enabled_msrs[] = {
     MSR_IA32_SYSENTER_EIP,
@@ -140,6 +142,8 @@ static void __init vmx_display_features(void)
     P(cpu_has_vmx_virtual_intr_delivery, "Virtual Interrupt Delivery");
     P(cpu_has_vmx_posted_intr_processing, "Posted Interrupt Processing");
     P(cpu_has_vmx_vmcs_shadowing, "VMCS shadowing");
+    P(cpu_has_vmx_vmfunc, "VM Functions");
+    P(cpu_has_vmx_virt_exceptions, "Virtualisation Exceptions");
     P(cpu_has_vmx_pml, "Page Modification Logging");
 #undef P
 
@@ -185,6 +189,7 @@ static int vmx_init_vmcs_config(void)
     u64 _vmx_misc_cap = 0;
     u32 _vmx_vmexit_control;
     u32 _vmx_vmentry_control;
+    u64 _vmx_vmfunc = 0;
     bool_t mismatch = 0;
 
     rdmsr(MSR_IA32_VMX_BASIC, vmx_basic_msr_low, vmx_basic_msr_high);
@@ -230,7 +235,9 @@ static int vmx_init_vmcs_config(void)
                SECONDARY_EXEC_ENABLE_EPT |
                SECONDARY_EXEC_ENABLE_RDTSCP |
                SECONDARY_EXEC_PAUSE_LOOP_EXITING |
-               SECONDARY_EXEC_ENABLE_INVPCID);
+               SECONDARY_EXEC_ENABLE_INVPCID |
+               SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
+               SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
         rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
         if ( _vmx_misc_cap & VMX_MISC_VMWRITE_ALL )
             opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
@@ -341,6 +348,24 @@ static int vmx_init_vmcs_config(void)
           || !(_vmx_vmexit_control & VM_EXIT_ACK_INTR_ON_EXIT) )
         _vmx_pin_based_exec_control  &= ~ PIN_BASED_POSTED_INTERRUPT;
 
+    /* The IA32_VMX_VMFUNC MSR exists only when VMFUNC is available */
+    if ( _vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS )
+    {
+        rdmsrl(MSR_IA32_VMX_VMFUNC, _vmx_vmfunc);
+
+        /*
+         * VMFUNC leaf 0 (EPTP switching) must be supported.
+         *
+         * Or we just don't use VMFUNC.
+         */
+        if ( !(_vmx_vmfunc & VMX_VMFUNC_EPTP_SWITCHING) )
+            _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
+    }
+
+    /* Virtualization exceptions are only enabled if VMFUNC is enabled */
+    if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
+        _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+
     min = 0;
     opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS;
     _vmx_vmentry_control = adjust_vmx_controls(
@@ -361,6 +386,9 @@ static int vmx_init_vmcs_config(void)
         vmx_vmentry_control        = _vmx_vmentry_control;
         vmx_basic_msr              = ((u64)vmx_basic_msr_high << 32) |
                                      vmx_basic_msr_low;
+        vmx_vmfunc                 = _vmx_vmfunc;
+        vmx_virt_exception         = !!(_vmx_secondary_exec_control &
+                                       SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
         vmx_display_features();
 
         /* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -397,6 +425,9 @@ static int vmx_init_vmcs_config(void)
         mismatch |= cap_check(
             "EPT and VPID Capability",
             vmx_ept_vpid_cap, _vmx_ept_vpid_cap);
+        mismatch |= cap_check(
+            "VMFUNC Capability",
+            vmx_vmfunc, _vmx_vmfunc);
         if ( cpu_has_vmx_ins_outs_instr_info !=
              !!(vmx_basic_msr_high & (VMX_BASIC_INS_OUT_INFO >> 32)) )
         {
@@ -967,6 +998,11 @@ static int construct_vmcs(struct vcpu *v)
     /* Do not enable Monitor Trap Flag unless start single step debug */
     v->arch.hvm_vmx.exec_control &= ~CPU_BASED_MONITOR_TRAP_FLAG;
 
+    /* Disable VMFUNC and #VE for now: they may be enabled later by altp2m. */
+    v->arch.hvm_vmx.secondary_exec_control &=
+        ~(SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
+          SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
+
     if ( is_pvh_domain(d) )
     {
         /* Disable virtual apics, TPR */
@@ -1790,9 +1826,9 @@ void vmcs_dump_vcpu(struct vcpu *v)
         printk("PLE Gap=%08x Window=%08x\n",
                vmr32(PLE_GAP), vmr32(PLE_WINDOW));
     if ( v->arch.hvm_vmx.secondary_exec_control &
-         (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VMFUNC) )
+         (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
         printk("Virtual processor ID = 0x%04x VMfunc controls = %016lx\n",
-               vmr16(VIRTUAL_PROCESSOR_ID), vmr(VMFUNC_CONTROL));
+               vmr16(VIRTUAL_PROCESSOR_ID), vmr(VM_FUNCTION_CONTROL));
 
     vmx_vmcs_exit(v);
 }
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 5133eb6..a6c9adf 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -281,7 +281,6 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
         epte->sp = (level > 1);
         epte->mfn += i * trunk;
         epte->snp = (iommu_enabled && iommu_snoop);
-        ASSERT(!epte->avail3);
 
         ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
 
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 1104bda..cb0ee6c 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -222,9 +222,10 @@ extern u32 vmx_vmentry_control;
 #define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY    0x00000200
 #define SECONDARY_EXEC_PAUSE_LOOP_EXITING       0x00000400
 #define SECONDARY_EXEC_ENABLE_INVPCID           0x00001000
-#define SECONDARY_EXEC_ENABLE_VMFUNC            0x00002000
+#define SECONDARY_EXEC_ENABLE_VM_FUNCTIONS      0x00002000
 #define SECONDARY_EXEC_ENABLE_VMCS_SHADOWING    0x00004000
 #define SECONDARY_EXEC_ENABLE_PML               0x00020000
+#define SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS   0x00040000
 extern u32 vmx_secondary_exec_control;
 
 #define VMX_EPT_EXEC_ONLY_SUPPORTED             0x00000001
@@ -285,6 +286,10 @@ extern u32 vmx_secondary_exec_control;
     (vmx_pin_based_exec_control & PIN_BASED_POSTED_INTERRUPT)
 #define cpu_has_vmx_vmcs_shadowing \
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VMCS_SHADOWING)
+#define cpu_has_vmx_vmfunc \
+    (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS)
+#define cpu_has_vmx_virt_exceptions \
+    (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS)
 #define cpu_has_vmx_pml \
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_PML)
 
@@ -316,6 +321,9 @@ extern u64 vmx_basic_msr;
 #define VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK  0x0FF
 #define VMX_GUEST_INTR_STATUS_SVI_OFFSET        8
 
+/* VMFUNC leaf definitions */
+#define VMX_VMFUNC_EPTP_SWITCHING   (1ULL << 0)
+
 /* VMCS field encodings. */
 #define VMCS_HIGH(x) ((x) | 1)
 enum vmcs_field {
@@ -350,12 +358,14 @@ enum vmcs_field {
     VIRTUAL_APIC_PAGE_ADDR          = 0x00002012,
     APIC_ACCESS_ADDR                = 0x00002014,
     PI_DESC_ADDR                    = 0x00002016,
-    VMFUNC_CONTROL                  = 0x00002018,
+    VM_FUNCTION_CONTROL             = 0x00002018,
     EPT_POINTER                     = 0x0000201a,
     EOI_EXIT_BITMAP0                = 0x0000201c,
 #define EOI_EXIT_BITMAP(n) (EOI_EXIT_BITMAP0 + (n) * 2) /* n = 0...3 */
+    EPTP_LIST_ADDR                  = 0x00002024,
     VMREAD_BITMAP                   = 0x00002026,
     VMWRITE_BITMAP                  = 0x00002028,
+    VIRT_EXCEPTION_INFO             = 0x0000202a,
     GUEST_PHYSICAL_ADDRESS          = 0x00002400,
     VMCS_LINK_POINTER               = 0x00002800,
     GUEST_IA32_DEBUGCTL             = 0x00002802,
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 35f804a..5b59d3c 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -47,7 +47,7 @@ typedef union {
         access      :   4,  /* bits 61:58 - p2m_access_t */
         tm          :   1,  /* bit 62 - VT-d transient-mapping hint in
                                shared EPT/VT-d usage */
-        avail3      :   1;  /* bit 63 - Software available 3 */
+        suppress_ve :   1;  /* bit 63 - suppress #VE */
     };
     u64 epte;
 } ept_entry_t;
@@ -186,6 +186,7 @@ static inline unsigned long pi_get_pir(struct pi_desc *pi_desc, int group)
 #define EXIT_REASON_XSETBV              55
 #define EXIT_REASON_APIC_WRITE          56
 #define EXIT_REASON_INVPCID             58
+#define EXIT_REASON_VMFUNC              59
 #define EXIT_REASON_PML_FULL            62
 
 /*
@@ -554,4 +555,14 @@ void p2m_init_hap_data(struct p2m_domain *p2m);
 #define EPT_L4_PAGETABLE_SHIFT      39
 #define EPT_PAGETABLE_ENTRIES       512
 
+/* #VE information page (layout defined by the Intel SDM) */
+typedef struct {
+    u32 exit_reason;         /* for EPT violations: EXIT_REASON_EPT_VIOLATION */
+    u32 semaphore;           /* must be 0 for #VE to be delivered; set to ~0 */
+    u64 exit_qualification;
+    u64 gla;                 /* guest linear address of the access */
+    u64 gpa;                 /* guest physical address of the access */
+    u16 eptp_index;          /* index into the EPTP list */
+} ve_info_t;
+
 #endif /* __ASM_X86_HVM_VMX_VMX_H__ */
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 83f2f70..8069d60 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -130,6 +130,7 @@
 #define MSR_IA32_VMX_TRUE_PROCBASED_CTLS        0x48e
 #define MSR_IA32_VMX_TRUE_EXIT_CTLS             0x48f
 #define MSR_IA32_VMX_TRUE_ENTRY_CTLS            0x490
+#define MSR_IA32_VMX_VMFUNC                     0x491
 #define IA32_FEATURE_CONTROL_MSR                0x3a
 #define IA32_FEATURE_CONTROL_MSR_LOCK                     0x0001
 #define IA32_FEATURE_CONTROL_MSR_ENABLE_VMXON_INSIDE_SMX  0x0002
-- 
1.9.1


* [PATCH v5 03/15] VMX: implement suppress #VE.
From: Ed White @ 2015-07-14  0:14 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

In preparation for selectively enabling #VE in a later patch, set the
suppress-#VE bit on all EPTEs.

Suppress #VE should always be the default condition for two reasons: it is
generally not safe to deliver #VE into a guest unless that guest has been
modified to receive it; and even then, for most EPT violations, only the
hypervisor is able to handle the violation.

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
---
 xen/arch/x86/mm/p2m-ept.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index a6c9adf..59410ea 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -41,7 +41,8 @@
 #define is_epte_superpage(ept_entry)    ((ept_entry)->sp)
 static inline bool_t is_epte_valid(ept_entry_t *e)
 {
-    return (e->epte != 0 && e->sa_p2mt != p2m_invalid);
+    /* suppress_ve alone is not considered valid, so mask it off */
+    return ((e->epte & ~(1ul << 63)) != 0 && e->sa_p2mt != p2m_invalid);
 }
 
 /* returns : 0 for success, -errno otherwise */
@@ -219,6 +220,8 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
 static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
 {
     struct page_info *pg;
+    ept_entry_t *table;
+    unsigned int i;
 
     pg = p2m_alloc_ptp(p2m, 0);
     if ( pg == NULL )
@@ -232,6 +235,15 @@ static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
     /* Manually set A bit to avoid overhead of MMU having to write it later. */
     ept_entry->a = 1;
 
+    ept_entry->suppress_ve = 1;
+
+    table = __map_domain_page(pg);
+
+    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
+        table[i].suppress_ve = 1;
+
+    unmap_domain_page(table);
+
     return 1;
 }
 
@@ -281,6 +293,7 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
         epte->sp = (level > 1);
         epte->mfn += i * trunk;
         epte->snp = (iommu_enabled && iommu_snoop);
+        epte->suppress_ve = 1;
 
         ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
 
@@ -790,6 +803,8 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
     }
 
+    new_entry.suppress_ve = 1;
+
     rc = atomic_write_ept_entry(ept_entry, new_entry, target);
     if ( unlikely(rc) )
         old_entry.epte = 0;
-- 
1.9.1


* [PATCH v5 04/15] x86/HVM: Hardware alternate p2m support detection.
From: Ed White @ 2015-07-14  0:14 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

As implemented here, only supported on platforms with VMX HAP.

By default this functionality is force-disabled; it can be enabled
by specifying altp2m=1 on the Xen command line.
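
For example, an illustrative GRUB entry enabling it (paths and other
options are placeholders):

    multiboot /boot/xen.gz altp2m=1 loglvl=all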

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 docs/misc/xen-command-line.markdown | 7 +++++++
 xen/arch/x86/hvm/hvm.c              | 7 +++++++
 xen/arch/x86/hvm/vmx/vmx.c          | 1 +
 xen/include/asm-x86/hvm/hvm.h       | 9 +++++++++
 4 files changed, 24 insertions(+)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index aa684c0..3391c66 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -139,6 +139,13 @@ mode during S3 resume.
 > Default: `true`
 
 Permit Xen to use superpages when performing memory management.
+
+### altp2m (Intel)
+> `= <boolean>`
+
+> Default: `false`
+
+Permit multiple copies of host p2m.
 
 ### apic
 > `= bigsmp | default`
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 535d622..4019658 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -94,6 +94,10 @@ bool_t opt_hvm_fep;
 boolean_param("hvm_fep", opt_hvm_fep);
 #endif
 
+/* Xen command-line option to enable altp2m */
+static bool_t __initdata opt_altp2m_enabled = 0;
+boolean_param("altp2m", opt_altp2m_enabled);
+
 static int cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -160,6 +164,9 @@ static int __init hvm_enable(void)
     if ( !fns->pvh_supported )
         printk(XENLOG_INFO "HVM: PVH mode not supported on this platform\n");
 
+    if ( !opt_altp2m_enabled )
+        hvm_funcs.altp2m_supported = 0;
+
     /*
      * Allow direct access to the PC debug ports 0x80 and 0xed (they are
      * often used for I/O delays, but the vmexits simply slow things down).
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index fc29b89..07527dd 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1841,6 +1841,7 @@ const struct hvm_function_table * __init start_vmx(void)
     if ( cpu_has_vmx_ept && (cpu_has_vmx_pat || opt_force_ept) )
     {
         vmx_function_table.hap_supported = 1;
+        vmx_function_table.altp2m_supported = 1;
 
         vmx_function_table.hap_capabilities = 0;
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 57f9605..c61cfe7 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -94,6 +94,9 @@ struct hvm_function_table {
     /* Necessary hardware support for PVH mode? */
     int pvh_supported;
 
+    /* Necessary hardware support for alternate p2m's? */
+    bool_t altp2m_supported;
+
     /* Indicate HAP capabilities. */
     int hap_capabilities;
 
@@ -509,6 +512,12 @@ bool_t nhvm_vmcx_hap_enabled(struct vcpu *v);
 /* interrupt */
 enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v);
 
+/* returns true if hardware supports alternate p2m's */
+static inline bool_t hvm_altp2m_supported(void)
+{
+    return hvm_funcs.altp2m_supported;
+}
+
 #ifndef NDEBUG
 /* Permit use of the Forced Emulation Prefix in HVM guests */
 extern bool_t opt_hvm_fep;
-- 
1.9.1


* [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.
From: Ed White @ 2015-07-14  0:14 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Add the basic data structures needed to support alternate p2m's and
the functions to initialise them and tear them down.

Although Intel hardware can handle 512 EPTPs per hardware thread
concurrently, only 10 per domain are supported in this patch for
performance reasons.

The iterator in hap_enable() does need to handle 512, so that is now
uint16_t.

This change also splits the p2m lock into one lock type for altp2ms
and another type for all other p2ms. The purpose of this is to place
the altp2m list lock between the types, so the list lock can be
acquired whilst holding the host p2m lock (sketched below).
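
Sketched as code (illustrative only, not a verbatim excerpt from the
series; it shows the ordering the split permits when a host p2m change is
propagated to the alternate p2ms):

    static void propagate_change(struct domain *d, struct p2m_domain *hostp2m)
    {
        unsigned int i;

        p2m_lock(hostp2m);                /* "p2m" lock class             */
        altp2m_list_lock(d);              /* list lock sits in the middle */
        for ( i = 0; i < MAX_ALTP2M; i++ )
            if ( d->arch.altp2m_eptp[i] != INVALID_MFN )
            {
                p2m_lock(d->arch.altp2m_p2m[i]);  /* "altp2m" lock class  */
                /* ... mirror the host p2m change into this view ... */
                p2m_unlock(d->arch.altp2m_p2m[i]);
            }
        altp2m_list_unlock(d);
        p2m_unlock(hostp2m);
    }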

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/Makefile        |   1 +
 xen/arch/x86/hvm/altp2m.c        |  77 +++++++++++++++++++++++++++++
 xen/arch/x86/hvm/hvm.c           |  21 ++++++++
 xen/arch/x86/mm/hap/hap.c        |  38 ++++++++++++++-
 xen/arch/x86/mm/mm-locks.h       |  46 +++++++++++++++++-
 xen/arch/x86/mm/p2m.c            | 102 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/domain.h     |  10 ++++
 xen/include/asm-x86/hvm/altp2m.h |  38 +++++++++++++++
 xen/include/asm-x86/hvm/hvm.h    |  14 ++++++
 xen/include/asm-x86/hvm/vcpu.h   |   9 ++++
 xen/include/asm-x86/p2m.h        |  32 +++++++++++-
 11 files changed, 384 insertions(+), 4 deletions(-)
 create mode 100644 xen/arch/x86/hvm/altp2m.c
 create mode 100644 xen/include/asm-x86/hvm/altp2m.h

diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index 69af47f..eb1a37b 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -1,6 +1,7 @@
 subdir-y += svm
 subdir-y += vmx
 
+obj-y += altp2m.o
 obj-y += asid.o
 obj-y += emulate.o
 obj-y += event.o
diff --git a/xen/arch/x86/hvm/altp2m.c b/xen/arch/x86/hvm/altp2m.c
new file mode 100644
index 0000000..a10f347
--- /dev/null
+++ b/xen/arch/x86/hvm/altp2m.c
@@ -0,0 +1,77 @@
+/*
+ * Alternate p2m HVM
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#include <asm/hvm/support.h>
+#include <asm/hvm/hvm.h>
+#include <asm/p2m.h>
+#include <asm/hvm/altp2m.h>
+
+void
+altp2m_vcpu_reset(struct vcpu *v)
+{
+    struct altp2mvcpu *av = &vcpu_altp2m(v);
+
+    av->p2midx = INVALID_ALTP2M;
+    av->veinfo_gfn = _gfn(INVALID_GFN);
+}
+
+void
+altp2m_vcpu_initialise(struct vcpu *v)
+{
+    if ( v != current )
+        vcpu_pause(v);
+
+    altp2m_vcpu_reset(v);
+    vcpu_altp2m(v).p2midx = 0;
+    atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+
+    altp2m_vcpu_update_eptp(v);
+
+    if ( v != current )
+        vcpu_unpause(v);
+}
+
+void
+altp2m_vcpu_destroy(struct vcpu *v)
+{
+    struct p2m_domain *p2m;
+
+    if ( v != current )
+        vcpu_pause(v);
+
+    if ( (p2m = p2m_get_altp2m(v)) )
+        atomic_dec(&p2m->active_vcpus);
+
+    altp2m_vcpu_reset(v);
+
+    altp2m_vcpu_update_eptp(v);
+    altp2m_vcpu_update_vmfunc_ve(v);
+
+    if ( v != current )
+        vcpu_unpause(v);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4019658..38deedc 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -58,6 +58,7 @@
 #include <asm/hvm/cacheattr.h>
 #include <asm/hvm/trace.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/altp2m.h>
 #include <asm/hvm/event.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
@@ -2380,6 +2381,7 @@ void hvm_vcpu_destroy(struct vcpu *v)
 {
     hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
 
+    altp2m_vcpu_destroy(v);
     nestedhvm_vcpu_destroy(v);
 
     free_compat_arg_xlat(v);
@@ -6498,6 +6500,25 @@ enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v)
     return hvm_funcs.nhvm_intr_blocked(v);
 }
 
+void altp2m_vcpu_update_eptp(struct vcpu *v)
+{
+    if ( hvm_funcs.altp2m_vcpu_update_eptp )
+        hvm_funcs.altp2m_vcpu_update_eptp(v);
+}
+
+void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v)
+{
+    if ( hvm_funcs.altp2m_vcpu_update_vmfunc_ve )
+        hvm_funcs.altp2m_vcpu_update_vmfunc_ve(v);
+}
+
+bool_t altp2m_vcpu_emulate_ve(struct vcpu *v)
+{
+    if ( hvm_funcs.altp2m_vcpu_emulate_ve )
+        return hvm_funcs.altp2m_vcpu_emulate_ve(v);
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index d0d3f1e..8dd3c20 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)
 int hap_enable(struct domain *d, u32 mode)
 {
     unsigned int old_pages;
-    uint8_t i;
+    uint16_t i;
     int rv = 0;
 
     domain_pause(d);
@@ -498,6 +498,28 @@ int hap_enable(struct domain *d, u32 mode)
            goto out;
     }
 
+    if ( hvm_altp2m_supported() )
+    {
+        /* Init alternate p2m data */
+        if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
+        {
+            rv = -ENOMEM;
+            goto out;
+        }
+
+        for ( i = 0; i < MAX_EPTP; i++ )
+            d->arch.altp2m_eptp[i] = INVALID_MFN;
+
+        for ( i = 0; i < MAX_ALTP2M; i++ )
+        {
+            rv = p2m_alloc_table(d->arch.altp2m_p2m[i]);
+            if ( rv != 0 )
+               goto out;
+        }
+
+        d->arch.altp2m_active = 0;
+    }
+
     /* Now let other users see the new mode */
     d->arch.paging.mode = mode | PG_HAP_enable;
 
@@ -510,6 +532,20 @@ void hap_final_teardown(struct domain *d)
 {
     uint8_t i;
 
+    if ( hvm_altp2m_supported() )
+    {
+        d->arch.altp2m_active = 0;
+
+        if ( d->arch.altp2m_eptp )
+        {
+            free_xenheap_page(d->arch.altp2m_eptp);
+            d->arch.altp2m_eptp = NULL;
+        }
+
+        for ( i = 0; i < MAX_ALTP2M; i++ )
+            p2m_teardown(d->arch.altp2m_p2m[i]);
+    }
+
     /* Destroy nestedp2m's first */
     for (i = 0; i < MAX_NESTEDP2M; i++) {
         p2m_teardown(d->arch.nested_p2m[i]);
diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index b4f035e..c66f105 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -217,7 +217,7 @@ declare_mm_lock(nestedp2m)
 #define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)->arch.nested_p2m_lock)
 #define nestedp2m_unlock(d) mm_unlock(&(d)->arch.nested_p2m_lock)
 
-/* P2M lock (per-p2m-table)
+/* P2M lock (per-non-alt-p2m-table)
  *
  * This protects all queries and updates to the p2m table.
  * Queries may be made under the read lock but all modifications
@@ -225,10 +225,52 @@ declare_mm_lock(nestedp2m)
  *
  * The write lock is recursive as it is common for a code path to look
  * up a gfn and later mutate it.
+ *
+ * Note that this lock shares its implementation with the altp2m
+ * lock (not the altp2m list lock), so the implementation
+ * is found there.
+ *
+ * Changes made to the host p2m when in altp2m mode are propagated to the
+ * altp2ms synchronously in ept_set_entry().  At that point, we will hold
+ * the host p2m lock; propagating this change involves grabbing the
+ * altp2m_list lock, and the locks of the individual alternate p2ms.  In
+ * order to allow us to maintain locking order discipline, we split the p2m
+ * lock into p2m (for host p2ms) and altp2m (for alternate p2ms), putting
+ * the altp2mlist lock in the middle.
  */
 
 declare_mm_rwlock(p2m);
-#define p2m_lock(p)           mm_write_lock(p2m, &(p)->lock);
+
+/* Alternate P2M list lock (per-domain)
+ *
+ * A per-domain lock that protects the list of alternate p2m's.
+ * Any operation that walks the list needs to acquire this lock.
+ * Additionally, before destroying an alternate p2m all VCPU's
+ * in the target domain must be paused.
+ */
+
+declare_mm_lock(altp2mlist)
+#define altp2m_list_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_list_lock)
+#define altp2m_list_unlock(d) mm_unlock(&(d)->arch.altp2m_list_lock)
+
+/* P2M lock (per-altp2m-table)
+ *
+ * This protects all queries and updates to the p2m table.
+ * Queries may be made under the read lock but all modifications
+ * need the main (write) lock.
+ *
+ * The write lock is recursive as it is common for a code path to look
+ * up a gfn and later mutate it.
+ */
+
+declare_mm_rwlock(altp2m);
+#define p2m_lock(p)                             \
+    do {                                        \
+        if ( p2m_is_altp2m(p) )                 \
+            mm_write_lock(altp2m, &(p)->lock);  \
+        else                                    \
+            mm_write_lock(p2m, &(p)->lock);     \
+    } while ( 0 )
 #define p2m_unlock(p)         mm_write_unlock(&(p)->lock);
 #define gfn_lock(p,g,o)       p2m_lock(p)
 #define gfn_unlock(p,g,o)     p2m_unlock(p)
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6b39733..d378444 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -35,6 +35,7 @@
 #include <asm/hvm/vmx/vmx.h> /* ept_p2m_init() */
 #include <asm/mem_sharing.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/altp2m.h>
 #include <asm/hvm/svm/amd-iommu-proto.h>
 #include <xsm/xsm.h>
 
@@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct domain *d)
     }
 }
 
+static void p2m_teardown_altp2m(struct domain *d)
+{
+    unsigned int i;
+    struct p2m_domain *p2m;
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( !d->arch.altp2m_p2m[i] )
+            continue;
+        p2m = d->arch.altp2m_p2m[i];
+        p2m_free_one(p2m);
+        d->arch.altp2m_p2m[i] = NULL;
+    }
+}
+
+static int p2m_init_altp2m(struct domain *d)
+{
+    uint8_t i;
+    struct p2m_domain *p2m;
+
+    mm_lock_init(&d->arch.altp2m_list_lock);
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        d->arch.altp2m_p2m[i] = p2m = p2m_init_one(d);
+        if ( p2m == NULL )
+        {
+            p2m_teardown_altp2m(d);
+            return -ENOMEM;
+        }
+        p2m->p2m_class = p2m_alternate;
+        p2m->access_required = 1;
+        _atomic_set(&p2m->active_vcpus, 0);
+    }
+
+    return 0;
+}
+
 int p2m_init(struct domain *d)
 {
     int rc;
@@ -196,7 +234,18 @@ int p2m_init(struct domain *d)
      * (p2m_init runs too early for HVM_PARAM_* options) */
     rc = p2m_init_nestedp2m(d);
     if ( rc )
+    {
         p2m_teardown_hostp2m(d);
+        return rc;
+    }
+
+    rc = p2m_init_altp2m(d);
+    if ( rc )
+    {
+        p2m_teardown_hostp2m(d);
+        p2m_teardown_nestedp2m(d);
+        p2m_teardown_altp2m(d);
+    }
 
     return rc;
 }
@@ -1918,6 +1967,59 @@ int unmap_mmio_regions(struct domain *d,
     return err;
 }
 
+uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
+{
+    struct p2m_domain *p2m;
+    struct ept_data *ept;
+    uint16_t i;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
+            continue;
+
+        p2m = d->arch.altp2m_p2m[i];
+        ept = &p2m->ept;
+
+        if ( eptp == ept_get_eptp(ept) )
+            goto out;
+    }
+
+    i = INVALID_ALTP2M;
+
+out:
+    altp2m_list_unlock(d);
+    return i;
+}
+
+bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx)
+{
+    struct domain *d = v->domain;
+    bool_t rc = 0;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        if ( idx != vcpu_altp2m(v).p2midx )
+        {
+            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+            vcpu_altp2m(v).p2midx = idx;
+            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+            altp2m_vcpu_update_eptp(v);
+        }
+        rc = 1;
+    }
+
+    altp2m_list_unlock(d);
+    return rc;
+}
+
 /*** Audit ***/
 
 #if P2M_AUDIT
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 96bde65..294e089 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -234,6 +234,10 @@ struct paging_vcpu {
 typedef xen_domctl_cpuid_t cpuid_input_t;
 
 #define MAX_NESTEDP2M 10
+
+#define MAX_ALTP2M      ((uint16_t)10)
+#define INVALID_ALTP2M  ((uint16_t)~0)
+#define MAX_EPTP        (PAGE_SIZE / sizeof(uint64_t))
 struct p2m_domain;
 struct time_scale {
     int shift;
@@ -293,6 +297,12 @@ struct arch_domain
     struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
     mm_lock_t nested_p2m_lock;
 
+    /* altp2m: allow multiple copies of host p2m */
+    bool_t altp2m_active;
+    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
+    mm_lock_t altp2m_list_lock;
+    uint64_t *altp2m_eptp;
+
     /* NB. protected by d->event_lock and by irq_desc[irq].lock */
     struct radix_tree_root irq_pirq;
 
diff --git a/xen/include/asm-x86/hvm/altp2m.h b/xen/include/asm-x86/hvm/altp2m.h
new file mode 100644
index 0000000..38de494
--- /dev/null
+++ b/xen/include/asm-x86/hvm/altp2m.h
@@ -0,0 +1,38 @@
+/*
+ * Alternate p2m HVM
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#ifndef _ALTP2M_H
+#define _ALTP2M_H
+
+#include <xen/types.h>
+#include <xen/sched.h>         /* for struct vcpu, struct domain */
+#include <asm/hvm/vcpu.h>      /* for vcpu_altp2m */
+
+/* Alternate p2m HVM on/off per domain */
+static inline bool_t altp2m_active(const struct domain *d)
+{
+    return d->arch.altp2m_active;
+}
+
+/* Alternate p2m VCPU */
+void altp2m_vcpu_initialise(struct vcpu *v);
+void altp2m_vcpu_destroy(struct vcpu *v);
+void altp2m_vcpu_reset(struct vcpu *v);
+
+#endif /* _ALTP2M_H */
+
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index c61cfe7..473ac30 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -210,6 +210,11 @@ struct hvm_function_table {
                                   uint32_t *ecx, uint32_t *edx);
 
     void (*enable_msr_exit_interception)(struct domain *d);
+
+    /* Alternate p2m */
+    void (*altp2m_vcpu_update_eptp)(struct vcpu *v);
+    void (*altp2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
+    bool_t (*altp2m_vcpu_emulate_ve)(struct vcpu *v);
 };
 
 extern struct hvm_function_table hvm_funcs;
@@ -525,6 +530,15 @@ extern bool_t opt_hvm_fep;
 #define opt_hvm_fep 0
 #endif
 
+/* updates the current EPTP in VMCS */
+void altp2m_vcpu_update_eptp(struct vcpu *v);
+
+/* updates VMCS fields related to VMFUNC and #VE */
+void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v);
+
+/* emulates #VE */
+bool_t altp2m_vcpu_emulate_ve(struct vcpu *v);
+
 #endif /* __ASM_X86_HVM_HVM_H__ */
 
 /*
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 3d8f4dc..09f25c1 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -118,6 +118,13 @@ struct nestedvcpu {
 
 #define vcpu_nestedhvm(v) ((v)->arch.hvm_vcpu.nvcpu)
 
+struct altp2mvcpu {
+    uint16_t    p2midx;         /* alternate p2m index */
+    gfn_t       veinfo_gfn;     /* #VE information page gfn */
+};
+
+#define vcpu_altp2m(v) ((v)->arch.hvm_vcpu.avcpu)
+
 struct hvm_vcpu {
     /* Guest control-register and EFER values, just as the guest sees them. */
     unsigned long       guest_cr[5];
@@ -163,6 +170,8 @@ struct hvm_vcpu {
 
     struct nestedvcpu   nvcpu;
 
+    struct altp2mvcpu   avcpu;
+
     struct mtrr_state   mtrr;
     u64                 pat_cr;
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index b49c09b..f38b452 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -175,6 +175,7 @@ typedef unsigned int p2m_query_t;
 typedef enum {
     p2m_host,
     p2m_nested,
+    p2m_alternate,
 } p2m_class_t;
 
 /* Per-p2m-table state */
@@ -193,7 +194,7 @@ struct p2m_domain {
 
     struct domain     *domain;   /* back pointer to domain */
 
-    p2m_class_t       p2m_class; /* host/nested/? */
+    p2m_class_t       p2m_class; /* host/nested/alternate */
 
     /* Nested p2ms only: nested p2m base value that this p2m shadows.
      * This can be cleared to P2M_BASE_EADDR under the per-p2m lock but
@@ -219,6 +220,9 @@ struct p2m_domain {
      * host p2m's lock. */
     int                defer_nested_flush;
 
+    /* Alternate p2m: count of vcpu's currently using this p2m. */
+    atomic_t           active_vcpus;
+
     /* Pages used to construct the p2m */
     struct page_list_head pages;
 
@@ -317,6 +321,11 @@ static inline bool_t p2m_is_nestedp2m(const struct p2m_domain *p2m)
     return p2m->p2m_class == p2m_nested;
 }
 
+static inline bool_t p2m_is_altp2m(const struct p2m_domain *p2m)
+{
+    return p2m->p2m_class == p2m_alternate;
+}
+
 #define p2m_get_pagetable(p2m)  ((p2m)->phys_table)
 
 /**** p2m query accessors. They lock p2m_lock, and thus serialize
@@ -722,6 +731,27 @@ void nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
     l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
 
 /*
+ * Alternate p2m: shadow p2m tables used for alternate memory views
+ */
+
+/* get current alternate p2m table */
+static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    uint16_t index = vcpu_altp2m(v).p2midx;
+
+    ASSERT(index < MAX_ALTP2M);
+
+    return (index == INVALID_ALTP2M) ? NULL : d->arch.altp2m_p2m[index];
+}
+
+/* Locate an alternate p2m by its EPTP */
+uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp);
+
+/* Switch alternate p2m for a single vcpu */
+bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx);
+
+/*
  * p2m type to IOMMU flags
  */
 static inline unsigned int p2m_get_iommu_flags(p2m_type_t p2mt)
-- 
1.9.1


* [PATCH v5 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
From: Ed White @ 2015-07-14  0:14 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Implement and hook up the code to enable VMX support of VMFUNC and #VE.

VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.
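
For context, a guest that has opted in to #VE would handle it roughly as
below (guest code, not part of this series; the information page is the
ve_info_t from patch 02, registered via the enable-notify HVMOP added
later in the series):

    void guest_handle_ve(volatile ve_info_t *ve)
    {
        uint64_t gpa = ve->gpa;         /* faulting guest-physical address  */
        uint16_t idx = ve->eptp_index;  /* view in which the fault occurred */

        /* ... fix up the access, switch views with VMFUNC, etc. ... */

        ve->semaphore = 0;              /* re-arm #VE delivery */
    }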

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 138 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 138 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 07527dd..38dba6b 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -56,6 +56,7 @@
 #include <asm/debugger.h>
 #include <asm/apic.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/altp2m.h>
 #include <asm/event.h>
 #include <asm/monitor.h>
 #include <public/arch-x86/cpuid.h>
@@ -1763,6 +1764,104 @@ static void vmx_enable_msr_exit_interception(struct domain *d)
                                          MSR_TYPE_W);
 }
 
+static void vmx_vcpu_update_eptp(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    struct p2m_domain *p2m = NULL;
+    struct ept_data *ept;
+
+    if ( altp2m_active(d) )
+        p2m = p2m_get_altp2m(v);
+    if ( !p2m )
+        p2m = p2m_get_hostp2m(d);
+
+    ept = &p2m->ept;
+    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
+
+    vmx_vmcs_enter(v);
+
+    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
+
+    if ( v->arch.hvm_vmx.secondary_exec_control &
+        SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
+        __vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);
+
+    vmx_vmcs_exit(v);
+}
+
+static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
+
+    if ( !cpu_has_vmx_vmfunc )
+        return;
+
+    if ( cpu_has_vmx_virt_exceptions )
+        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+
+    vmx_vmcs_enter(v);
+
+    if ( !d->is_dying && altp2m_active(d) )
+    {
+        v->arch.hvm_vmx.secondary_exec_control |= mask;
+        __vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
+        __vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
+
+        if ( cpu_has_vmx_virt_exceptions )
+        {
+            p2m_type_t t;
+            mfn_t mfn;
+
+            mfn = get_gfn_query_unlocked(d, gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
+
+            if ( mfn_x(mfn) != INVALID_MFN )
+                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
+            else
+                mask &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+        }
+    }
+    else
+        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
+
+    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
+        v->arch.hvm_vmx.secondary_exec_control);
+
+    vmx_vmcs_exit(v);
+}
+
+static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
+{
+    bool_t rc = 0;
+    ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) != INVALID_GFN ?
+        hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn), 0) : NULL;
+
+    if ( !veinfo )
+        return 0;
+
+    if ( veinfo->semaphore != 0 )
+        goto out;
+
+    rc = 1;
+
+    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
+    veinfo->semaphore = ~0l;
+    veinfo->eptp_index = vcpu_altp2m(v).p2midx;
+
+    vmx_vmcs_enter(v);
+    __vmread(EXIT_QUALIFICATION, &veinfo->exit_qualification);
+    __vmread(GUEST_LINEAR_ADDRESS, &veinfo->gla);
+    __vmread(GUEST_PHYSICAL_ADDRESS, &veinfo->gpa);
+    vmx_vmcs_exit(v);
+
+    hvm_inject_hw_exception(TRAP_virtualisation,
+                            HVM_DELIVER_NO_ERROR_CODE);
+
+out:
+    hvm_unmap_guest_frame(veinfo, 0);
+    return rc;
+}
+
 static struct hvm_function_table __initdata vmx_function_table = {
     .name                 = "VMX",
     .cpu_up_prepare       = vmx_cpu_up_prepare,
@@ -1822,6 +1921,9 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
     .hypervisor_cpuid_leaf = vmx_hypervisor_cpuid_leaf,
     .enable_msr_exit_interception = vmx_enable_msr_exit_interception,
+    .altp2m_vcpu_update_eptp = vmx_vcpu_update_eptp,
+    .altp2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
+    .altp2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2744,6 +2846,42 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
     /* Now enable interrupts so it's safe to take locks. */
     local_irq_enable();
 
+    /*
+     * If the guest has the ability to switch EPTP without an exit,
+     * figure out whether it has done so and update the altp2m data.
+     */
+    if ( altp2m_active(v->domain) &&
+        (v->arch.hvm_vmx.secondary_exec_control &
+        SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
+    {
+        unsigned long idx;
+
+        if ( v->arch.hvm_vmx.secondary_exec_control &
+            SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
+            __vmread(EPTP_INDEX, &idx);
+        else
+        {
+            unsigned long eptp;
+
+            __vmread(EPT_POINTER, &eptp);
+
+            if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
+                 INVALID_ALTP2M )
+            {
+                gdprintk(XENLOG_ERR, "EPTP not found in alternate p2m list\n");
+                domain_crash(v->domain);
+
+                /* Keep the current index so the BUG_ON below can't fire. */
+                idx = vcpu_altp2m(v).p2midx;
+            }
+        }
+
+        if ( (uint16_t)idx != vcpu_altp2m(v).p2midx )
+        {
+            BUG_ON(idx >= MAX_ALTP2M);
+            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+            vcpu_altp2m(v).p2midx = (uint16_t)idx;
+            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+        }
+    }
+
     /* XXX: This looks ugly, but we need a mechanism to ensure
      * any pending vmresume has really happened
      */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v5 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-14  0:14 [PATCH v5 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (5 preceding siblings ...)
  2015-07-14  0:14 ` [PATCH v5 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
@ 2015-07-14  0:14 ` Ed White
  2015-07-14 14:04   ` Jan Beulich
  2015-07-14  0:14 ` [PATCH v5 08/15] x86/altp2m: add control of suppress_ve Ed White
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 56+ messages in thread
From: Ed White @ 2015-07-14  0:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Ravi Sahita <ravi.sahita@intel.com>
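
For context: VMFUNC leaf 0 is the EPTP-switching VM function. The guest
puts the leaf number in EAX (0) and the index into the EPTP list in ECX.
A guest-side use of the instruction being emulated here might look
roughly like the sketch below (illustrative only, not part of the patch):

    static inline void vmfunc_switch_eptp(uint32_t idx)
    {
        /* VMFUNC is encoded as 0f 01 d4; EAX = 0 selects EPTP switching. */
        asm volatile ( ".byte 0x0f,0x01,0xd4"
                       : : "a" (0), "c" (idx) : "memory" );
    }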

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
---
 xen/arch/x86/hvm/emulate.c             | 19 +++++++++++++++--
 xen/arch/x86/hvm/vmx/vmx.c             | 38 ++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_emulate/x86_emulate.c | 19 +++++++++++------
 xen/arch/x86/x86_emulate/x86_emulate.h |  4 ++++
 xen/include/asm-x86/hvm/hvm.h          |  2 ++
 5 files changed, 74 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index fe5661d..1aa8af4 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1436,6 +1436,19 @@ static int hvmemul_invlpg(
     return rc;
 }
 
+static int hvmemul_vmfunc(
+    struct x86_emulate_ctxt *ctxt)
+{
+    int rc;
+
+    rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
+    if ( rc != X86EMUL_OKAY )
+    {
+        hvmemul_inject_hw_exception(TRAP_invalid_op, 0, ctxt);
+    }
+    return rc;
+}
+
 static const struct x86_emulate_ops hvm_emulate_ops = {
     .read          = hvmemul_read,
     .insn_fetch    = hvmemul_insn_fetch,
@@ -1459,7 +1472,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
     .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
-    .invlpg        = hvmemul_invlpg
+    .invlpg        = hvmemul_invlpg,
+    .vmfunc        = hvmemul_vmfunc,
 };
 
 static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
@@ -1485,7 +1499,8 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
     .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
-    .invlpg        = hvmemul_invlpg
+    .invlpg        = hvmemul_invlpg,
+    .vmfunc        = hvmemul_vmfunc,
 };
 
 static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 38dba6b..845cdbc 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -82,6 +82,7 @@ static void vmx_fpu_dirty_intercept(void);
 static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content);
 static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
 static void vmx_invlpg_intercept(unsigned long vaddr);
+static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
 
 uint8_t __read_mostly posted_intr_vector;
 
@@ -1830,6 +1831,19 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
     vmx_vmcs_exit(v);
 }
 
+static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
+{
+    int rc = X86EMUL_EXCEPTION;
+    struct vcpu *curr = current;
+
+    if ( !cpu_has_vmx_vmfunc && altp2m_active(curr->domain) &&
+         regs->eax == 0 &&
+         p2m_switch_vcpu_altp2m_by_id(curr, (uint16_t)regs->ecx) )
+        rc = X86EMUL_OKAY;
+
+    return rc;
+}
+
 static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
 {
     bool_t rc = 0;
@@ -1898,6 +1912,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .msr_read_intercept   = vmx_msr_read_intercept,
     .msr_write_intercept  = vmx_msr_write_intercept,
     .invlpg_intercept     = vmx_invlpg_intercept,
+    .vmfunc_intercept     = vmx_vmfunc_intercept,
     .handle_cd            = vmx_handle_cd,
     .set_info_guest       = vmx_set_info_guest,
     .set_rdtsc_exiting    = vmx_set_rdtsc_exiting,
@@ -1924,6 +1939,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .altp2m_vcpu_update_eptp = vmx_vcpu_update_eptp,
     .altp2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
     .altp2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
+    .altp2m_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2095,6 +2111,19 @@ static void vmx_invlpg_intercept(unsigned long vaddr)
         vpid_sync_vcpu_gva(curr, vaddr);
 }
 
+static int vmx_vmfunc_intercept(struct cpu_user_regs *regs)
+{
+    /*
+     * This handler is a placeholder for the future, when Xen may
+     * want to handle VMFUNC exits and resume a domain normally without
+     * injecting a #UD into the guest - for example, in a VT-nested
+     * scenario where Xen may want to lazily shadow the alternate
+     * EPTP list.
+     */
+    gdprintk(XENLOG_ERR, "Failed guest VMFUNC execution\n");
+    return X86EMUL_EXCEPTION;
+}
+
 static int vmx_cr_access(unsigned long exit_qualification)
 {
     struct vcpu *curr = current;
@@ -3234,6 +3263,15 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             update_guest_eip();
         break;
 
+    case EXIT_REASON_VMFUNC:
+        /* Call the intercept handler just once; repeating the call would
+         * repeat its side effects. Anything but success becomes #UD. */
+        if ( vmx_vmfunc_intercept(regs) != X86EMUL_OKAY )
+            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        else
+            update_guest_eip();
+        break;
+
     case EXIT_REASON_MWAIT_INSTRUCTION:
     case EXIT_REASON_MONITOR_INSTRUCTION:
     case EXIT_REASON_GETSEC:
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index c017c69..e596131 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -3786,6 +3786,7 @@ x86_emulate(
         break;
     }
 
+ no_writeback:
     /* Inject #DB if single-step tracing was enabled at instruction start. */
     if ( (ctxt->regs->eflags & EFLG_TF) && (rc == X86EMUL_OKAY) &&
          (ops->inject_hw_exception != NULL) )
@@ -3816,19 +3817,17 @@ x86_emulate(
         struct segment_register reg;
         unsigned long base, limit, cr0, cr0w;
 
-        if ( modrm == 0xdf ) /* invlpga */
+        switch ( modrm )
         {
+        case 0xdf: /* invlpga */
             generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
             generate_exception_if(!mode_ring0(), EXC_GP, 0);
             fail_if(ops->invlpg == NULL);
             if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.eax),
                                    ctxt)) )
                 goto done;
-            break;
-        }
-
-        if ( modrm == 0xf9 ) /* rdtscp */
-        {
+            goto no_writeback;
+        case 0xf9: /* rdtscp */ {
             uint64_t tsc_aux;
             fail_if(ops->read_msr == NULL);
             if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
@@ -3836,6 +3835,14 @@ x86_emulate(
             _regs.ecx = (uint32_t)tsc_aux;
             goto rdtsc;
         }
+        case 0xd4: /* vmfunc */
+            generate_exception_if(lock_prefix | rep_prefix() | (vex.pfx == vex_66),
+                                  EXC_UD, -1);
+            fail_if(ops->vmfunc == NULL);
+            if ( (rc = ops->vmfunc(ctxt)) != X86EMUL_OKAY )
+                goto done;
+            goto no_writeback;
+        }
 
         switch ( modrm_reg & 7 )
         {
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 064b8f4..a4d4ec8 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -397,6 +397,10 @@ struct x86_emulate_ops
         enum x86_segment seg,
         unsigned long offset,
         struct x86_emulate_ctxt *ctxt);
+
+    /* vmfunc: Emulate VMFUNC via the given EAX/ECX inputs. */
+    int (*vmfunc)(
+        struct x86_emulate_ctxt *ctxt);
 };
 
 struct cpu_user_regs;
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 473ac30..335b7a6 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -167,6 +167,7 @@ struct hvm_function_table {
     int (*msr_read_intercept)(unsigned int msr, uint64_t *msr_content);
     int (*msr_write_intercept)(unsigned int msr, uint64_t msr_content);
     void (*invlpg_intercept)(unsigned long vaddr);
+    int (*vmfunc_intercept)(struct cpu_user_regs *regs);
     void (*handle_cd)(struct vcpu *v, unsigned long value);
     void (*set_info_guest)(struct vcpu *v);
     void (*set_rdtsc_exiting)(struct vcpu *v, bool_t);
@@ -215,6 +216,7 @@ struct hvm_function_table {
     void (*altp2m_vcpu_update_eptp)(struct vcpu *v);
     void (*altp2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
     bool_t (*altp2m_vcpu_emulate_ve)(struct vcpu *v);
+    int (*altp2m_vcpu_emulate_vmfunc)(struct cpu_user_regs *regs);
 };
 
 extern struct hvm_function_table hvm_funcs;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v5 08/15] x86/altp2m: add control of suppress_ve.
  2015-07-14  0:14 [PATCH v5 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (6 preceding siblings ...)
  2015-07-14  0:14 ` [PATCH v5 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
@ 2015-07-14  0:14 ` Ed White
  2015-07-14 17:03   ` George Dunlap
  2015-07-14  0:14 ` [PATCH v5 09/15] x86/altp2m: alternate p2m memory events Ed White
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 56+ messages in thread
From: Ed White @ 2015-07-14  0:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: George Dunlap <george.dunlap@eu.citrix.com>

The existing ept_set_entry() and ept_get_entry() routines are extended
to optionally set/get suppress_ve.  Passing -1 sets suppress_ve on new
p2m entries, or retains the existing suppress_ve flag when an entry is
updated.
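
As a sketch of the calling convention (gfn/mfn values are illustrative):

    /* -1: keep the current suppress_ve bit; new entries default to 1. */
    rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2ma, -1);

    /* 0: clear suppress_ve, so an EPT violation on this gfn can raise #VE. */
    rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2ma, 0);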

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/mm/mem_sharing.c |  4 ++--
 xen/arch/x86/mm/p2m-ept.c     | 18 ++++++++++++----
 xen/arch/x86/mm/p2m-pod.c     | 12 +++++------
 xen/arch/x86/mm/p2m-pt.c      | 10 +++++++--
 xen/arch/x86/mm/p2m.c         | 48 +++++++++++++++++++++----------------------
 xen/include/asm-x86/p2m.h     | 24 ++++++++++++----------
 6 files changed, 67 insertions(+), 49 deletions(-)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 16e329e..1d1918f 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1260,7 +1260,7 @@ int relinquish_shared_pages(struct domain *d)
 
         if ( atomic_read(&d->shr_pages) == 0 )
             break;
-        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
         if ( mfn_valid(mfn) && (t == p2m_ram_shared) )
         {
             /* Does not fail with ENOMEM given the DESTROY flag */
@@ -1270,7 +1270,7 @@ int relinquish_shared_pages(struct domain *d)
              * unshare.  Must succeed: we just read the old entry and
              * we hold the p2m lock. */
             set_rc = p2m->set_entry(p2m, gfn, _mfn(0), PAGE_ORDER_4K,
-                                    p2m_invalid, p2m_access_rwx);
+                                    p2m_invalid, p2m_access_rwx, -1);
             ASSERT(set_rc == 0);
             count += 0x10;
         }
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 59410ea..9167833 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -657,7 +657,8 @@ bool_t ept_handle_misconfig(uint64_t gpa)
  */
 static int
 ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn, 
-              unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma)
+              unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma,
+              int sve)
 {
     ept_entry_t *table, *ept_entry = NULL;
     unsigned long gfn_remainder = gfn;
@@ -803,7 +804,11 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
     }
 
-    new_entry.suppress_ve = 1;
+    if ( sve != -1 )
+        new_entry.suppress_ve = !!sve;
+    else
+        new_entry.suppress_ve = is_epte_valid(&old_entry) ?
+                                    old_entry.suppress_ve : 1;
 
     rc = atomic_write_ept_entry(ept_entry, new_entry, target);
     if ( unlikely(rc) )
@@ -850,8 +855,9 @@ out:
 
 /* Read ept p2m entries */
 static mfn_t ept_get_entry(struct p2m_domain *p2m,
-                           unsigned long gfn, p2m_type_t *t, p2m_access_t* a,
-                           p2m_query_t q, unsigned int *page_order)
+                            unsigned long gfn, p2m_type_t *t, p2m_access_t* a,
+                            p2m_query_t q, unsigned int *page_order,
+                            bool_t *sve)
 {
     ept_entry_t *table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
     unsigned long gfn_remainder = gfn;
@@ -865,6 +871,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
 
     *t = p2m_mmio_dm;
     *a = p2m_access_n;
+    if ( sve )
+        *sve = 1;
 
     /* This pfn is higher than the highest the p2m map currently holds */
     if ( gfn > p2m->max_mapped_pfn )
@@ -930,6 +938,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
         else
             *t = ept_entry->sa_p2mt;
         *a = ept_entry->access;
+        if ( sve )
+            *sve = ept_entry->suppress_ve;
 
         mfn = _mfn(ept_entry->mfn);
         if ( i )
diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 0679f00..a2f6d02 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -536,7 +536,7 @@ recount:
         p2m_access_t a;
         p2m_type_t t;
 
-        (void)p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL);
+        (void)p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL, NULL);
 
         if ( t == p2m_populate_on_demand )
             pod++;
@@ -587,7 +587,7 @@ recount:
         p2m_type_t t;
         p2m_access_t a;
 
-        mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL, NULL);
         if ( t == p2m_populate_on_demand )
         {
             p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0, p2m_invalid,
@@ -676,7 +676,7 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     for ( i=0; i<SUPERPAGE_PAGES; i++ )
     {
         p2m_access_t a; 
-        mfn = p2m->get_entry(p2m, gfn + i, &type, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gfn + i, &type, &a, 0, NULL, NULL);
 
         if ( i == 0 )
         {
@@ -808,7 +808,7 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
     for ( i=0; i<count; i++ )
     {
         p2m_access_t a;
-        mfns[i] = p2m->get_entry(p2m, gfns[i], types + i, &a, 0, NULL);
+        mfns[i] = p2m->get_entry(p2m, gfns[i], types + i, &a, 0, NULL, NULL);
         /* If this is ram, and not a pagetable or from the xen heap, and probably not mapped
            elsewhere, map it; otherwise, skip. */
         if ( p2m_is_ram(types[i])
@@ -947,7 +947,7 @@ p2m_pod_emergency_sweep(struct p2m_domain *p2m)
     for ( i=p2m->pod.reclaim_single; i > 0 ; i-- )
     {
         p2m_access_t a;
-        (void)p2m->get_entry(p2m, i, &t, &a, 0, NULL);
+        (void)p2m->get_entry(p2m, i, &t, &a, 0, NULL, NULL);
         if ( p2m_is_ram(t) )
         {
             gfns[j] = i;
@@ -1135,7 +1135,7 @@ guest_physmap_mark_populate_on_demand(struct domain *d, unsigned long gfn,
     for ( i = 0; i < (1UL << order); i++ )
     {
         p2m_access_t a;
-        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL);
+        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL, NULL);
         if ( p2m_is_ram(ot) )
         {
             P2M_DEBUG("gfn_to_mfn returned type %d!\n", ot);
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index e50b6fa..f6944cd 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -482,7 +482,8 @@ int p2m_pt_handle_deferred_changes(uint64_t gpa)
 /* Returns: 0 for success, -errno for failure */
 static int
 p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
-                 unsigned int page_order, p2m_type_t p2mt, p2m_access_t p2ma)
+                 unsigned int page_order, p2m_type_t p2mt, p2m_access_t p2ma,
+                 int sve)
 {
     /* XXX -- this might be able to be faster iff current->domain == d */
     void *table;
@@ -495,6 +496,8 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
     unsigned int iommu_pte_flags = p2m_get_iommu_flags(p2mt);
     unsigned long old_mfn = 0;
 
+    ASSERT(sve != 0);
+
     if ( tb_init_done )
     {
         struct {
@@ -689,7 +692,7 @@ static inline p2m_type_t recalc_type(bool_t recalc, p2m_type_t t,
 static mfn_t
 p2m_pt_get_entry(struct p2m_domain *p2m, unsigned long gfn,
                  p2m_type_t *t, p2m_access_t *a, p2m_query_t q,
-                 unsigned int *page_order)
+                 unsigned int *page_order, bool_t *sve)
 {
     mfn_t mfn;
     paddr_t addr = ((paddr_t)gfn) << PAGE_SHIFT;
@@ -701,6 +704,9 @@ p2m_pt_get_entry(struct p2m_domain *p2m, unsigned long gfn,
 
     ASSERT(paging_mode_translate(p2m->domain));
 
+    if ( sve )
+        *sve = 1;
+
     /* XXX This is for compatibility with the old model, where anything not 
      * XXX marked as RAM was considered to be emulated MMIO space.
      * XXX Once we start explicitly registering MMIO regions in the p2m 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index d378444..a3636ab 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -346,7 +346,7 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
         /* Grab the lock here, don't release until put_gfn */
         gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order);
+    mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL);
 
     if ( (q & P2M_UNSHARE) && p2m_is_shared(*t) )
     {
@@ -355,7 +355,7 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
          * sleeping. */
         if ( mem_sharing_unshare_page(p2m->domain, gfn, 0) < 0 )
             (void)mem_sharing_notify_enomem(p2m->domain, gfn, 0);
-        mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order);
+        mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL);
     }
 
     if (unlikely((p2m_is_broken(*t))))
@@ -459,7 +459,7 @@ int p2m_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         else
             order = 0;
 
-        set_rc = p2m->set_entry(p2m, gfn, mfn, order, p2mt, p2ma);
+        set_rc = p2m->set_entry(p2m, gfn, mfn, order, p2mt, p2ma, -1);
         if ( set_rc )
             rc = set_rc;
 
@@ -623,7 +623,7 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn, unsigned long mfn,
     {
         for ( i = 0; i < (1UL << page_order); i++ )
         {
-            mfn_return = p2m->get_entry(p2m, gfn + i, &t, &a, 0, NULL);
+            mfn_return = p2m->get_entry(p2m, gfn + i, &t, &a, 0, NULL, NULL);
             if ( !p2m_is_grant(t) && !p2m_is_shared(t) && !p2m_is_foreign(t) )
                 set_gpfn_from_mfn(mfn+i, INVALID_M2P_ENTRY);
             ASSERT( !p2m_is_valid(t) || mfn + i == mfn_x(mfn_return) );
@@ -686,7 +686,7 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
     /* First, remove m->p mappings for existing p->m mappings */
     for ( i = 0; i < (1UL << page_order); i++ )
     {
-        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL);
+        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL, NULL);
         if ( p2m_is_shared(ot) )
         {
             /* Do an unshare to cleanly take care of all corner 
@@ -710,7 +710,7 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
                 (void)mem_sharing_notify_enomem(p2m->domain, gfn + i, 0);
                 return rc;
             }
-            omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL);
+            omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL, NULL);
             ASSERT(!p2m_is_shared(ot));
         }
         if ( p2m_is_grant(ot) || p2m_is_foreign(ot) )
@@ -758,7 +758,7 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
              * address */
             P2M_DEBUG("aliased! mfn=%#lx, old gfn=%#lx, new gfn=%#lx\n",
                       mfn + i, ogfn, gfn + i);
-            omfn = p2m->get_entry(p2m, ogfn, &ot, &a, 0, NULL);
+            omfn = p2m->get_entry(p2m, ogfn, &ot, &a, 0, NULL, NULL);
             if ( p2m_is_ram(ot) && !p2m_is_paged(ot) )
             {
                 ASSERT(mfn_valid(omfn));
@@ -825,7 +825,7 @@ int p2m_change_type_one(struct domain *d, unsigned long gfn,
 
     gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, &pt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &pt, &a, 0, NULL, NULL);
     rc = likely(pt == ot)
          ? p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, nt,
                          p2m->default_access)
@@ -909,7 +909,7 @@ static int set_typed_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
         return -EIO;
 
     gfn_lock(p2m, gfn, 0);
-    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL);
+    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL, NULL);
     if ( p2m_is_grant(ot) || p2m_is_foreign(ot) )
     {
         p2m_unlock(p2m);
@@ -960,7 +960,7 @@ int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
         return -EIO;
 
     gfn_lock(p2m, gfn, 0);
-    actual_mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL);
+    actual_mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
 
     /* Do not use mfn_valid() here as it will usually fail for MMIO pages. */
     if ( (INVALID_MFN == mfn_x(actual_mfn)) || (t != p2m_mmio_direct) )
@@ -996,7 +996,7 @@ int set_shared_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
         return -EIO;
 
     gfn_lock(p2m, gfn, 0);
-    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL);
+    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL, NULL);
     /* At the moment we only allow p2m change if gfn has already been made
      * sharable first */
     ASSERT(p2m_is_shared(ot));
@@ -1048,7 +1048,7 @@ int p2m_mem_paging_nominate(struct domain *d, unsigned long gfn)
 
     gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
 
     /* Check if mfn is valid */
     if ( !mfn_valid(mfn) )
@@ -1110,7 +1110,7 @@ int p2m_mem_paging_evict(struct domain *d, unsigned long gfn)
     gfn_lock(p2m, gfn, 0);
 
     /* Get mfn */
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
     if ( unlikely(!mfn_valid(mfn)) )
         goto out;
 
@@ -1242,7 +1242,7 @@ void p2m_mem_paging_populate(struct domain *d, unsigned long gfn)
 
     /* Fix p2m mapping */
     gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
     /* Allow only nominated or evicted pages to enter page-in path */
     if ( p2mt == p2m_ram_paging_out || p2mt == p2m_ram_paged )
     {
@@ -1304,7 +1304,7 @@ int p2m_mem_paging_prep(struct domain *d, unsigned long gfn, uint64_t buffer)
 
     gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
 
     ret = -ENOENT;
     /* Allow missing pages */
@@ -1392,7 +1392,7 @@ void p2m_mem_paging_resume(struct domain *d, vm_event_response_t *rsp)
         unsigned long gfn = rsp->u.mem_access.gfn;
 
         gfn_lock(p2m, gfn, 0);
-        mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
         /*
          * Allow only pages which were prepared properly, or pages which
          * were nominated but not evicted.
@@ -1537,11 +1537,11 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
      * These calls to p2m->set_entry() must succeed: we have the gfn
      * locked and just did a successful get_entry(). */
     gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, NULL);
 
     if ( npfec.write_access && p2ma == p2m_access_rx2rw ) 
     {
-        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw);
+        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw, -1);
         ASSERT(rc == 0);
         gfn_unlock(p2m, gfn, 0);
         return 1;
@@ -1550,7 +1550,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     {
         ASSERT(npfec.write_access || npfec.read_access || npfec.insn_fetch);
         rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
-                            p2mt, p2m_access_rwx);
+                            p2mt, p2m_access_rwx, -1);
         ASSERT(rc == 0);
     }
     gfn_unlock(p2m, gfn, 0);
@@ -1570,14 +1570,14 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
         else
         {
             gfn_lock(p2m, gfn, 0);
-            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, NULL);
             if ( p2ma != p2m_access_n2rwx )
             {
                 /* A listener is not required, so clear the access
                  * restrictions.  This set must succeed: we have the
                  * gfn locked and just did a successful get_entry(). */
                 rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
-                                    p2mt, p2m_access_rwx);
+                                    p2mt, p2m_access_rwx, -1);
                 ASSERT(rc == 0);
             }
             gfn_unlock(p2m, gfn, 0);
@@ -1697,8 +1697,8 @@ long p2m_set_mem_access(struct domain *d, unsigned long pfn, uint32_t nr,
     p2m_lock(p2m);
     for ( pfn += start; nr > start; ++pfn )
     {
-        mfn = p2m->get_entry(p2m, pfn, &t, &_a, 0, NULL);
-        rc = p2m->set_entry(p2m, pfn, mfn, PAGE_ORDER_4K, t, a);
+        mfn = p2m->get_entry(p2m, pfn, &t, &_a, 0, NULL, NULL);
+        rc = p2m->set_entry(p2m, pfn, mfn, PAGE_ORDER_4K, t, a, -1);
         if ( rc )
             break;
 
@@ -1746,7 +1746,7 @@ int p2m_get_mem_access(struct domain *d, unsigned long pfn,
     }
 
     gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, pfn, &t, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, pfn, &t, &a, 0, NULL, NULL);
     gfn_unlock(p2m, gfn, 0);
 
     if ( mfn_x(mfn) == INVALID_MFN )
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index f38b452..36b180b 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -226,17 +226,19 @@ struct p2m_domain {
     /* Pages used to construct the p2m */
     struct page_list_head pages;
 
-    int                (*set_entry   )(struct p2m_domain *p2m,
-                                       unsigned long gfn,
-                                       mfn_t mfn, unsigned int page_order,
-                                       p2m_type_t p2mt,
-                                       p2m_access_t p2ma);
-    mfn_t              (*get_entry   )(struct p2m_domain *p2m,
-                                       unsigned long gfn,
-                                       p2m_type_t *p2mt,
-                                       p2m_access_t *p2ma,
-                                       p2m_query_t q,
-                                       unsigned int *page_order);
+    int                (*set_entry)(struct p2m_domain *p2m,
+                                         unsigned long gfn,
+                                         mfn_t mfn, unsigned int page_order,
+                                         p2m_type_t p2mt,
+                                         p2m_access_t p2ma,
+                                         int sve);
+    mfn_t              (*get_entry)(struct p2m_domain *p2m,
+                                         unsigned long gfn,
+                                         p2m_type_t *p2mt,
+                                         p2m_access_t *p2ma,
+                                         p2m_query_t q,
+                                         unsigned int *page_order,
+                                         bool_t *sve);
     void               (*enable_hardware_log_dirty)(struct p2m_domain *p2m);
     void               (*disable_hardware_log_dirty)(struct p2m_domain *p2m);
     void               (*flush_hardware_cached_dirty)(struct p2m_domain *p2m);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v5 09/15] x86/altp2m: alternate p2m memory events.
  2015-07-14  0:14 [PATCH v5 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (7 preceding siblings ...)
  2015-07-14  0:14 ` [PATCH v5 08/15] x86/altp2m: add control of suppress_ve Ed White
@ 2015-07-14  0:14 ` Ed White
  2015-07-14 14:08   ` Jan Beulich
  2015-07-14  0:14 ` [PATCH v5 10/15] x86/altp2m: add remaining support routines Ed White
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 56+ messages in thread
From: Ed White @ 2015-07-14  0:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Add a flag to indicate that a memory event occurred in an alternate p2m
and a field containing the p2m index. Allow any event response to switch
to a different alternate p2m using the same flag and field.

Modify p2m_mem_access_check() to handle alternate p2m's.
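
A listener that wants the faulting VCPU to resume in a different view can
set the new flag and index in its response, along these lines (sketch;
"view_idx" is a hypothetical variable holding the target index):

    vm_event_response_t rsp;

    /* ... fill in the usual response fields first ... */
    rsp.flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
    rsp.altp2m_idx = view_idx;  /* alternate p2m to switch the VCPU into */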

Signed-off-by: Ed White <edmund.h.white@intel.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> for the x86 bits.
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Tamas K Lengyel <tlengyel@novetta.com>
---
 xen/arch/x86/mm/p2m.c         | 19 ++++++++++++++++++-
 xen/common/vm_event.c         |  4 ++++
 xen/include/asm-arm/p2m.h     |  6 ++++++
 xen/include/asm-x86/p2m.h     |  3 +++
 xen/include/public/vm_event.h | 11 +++++++++++
 5 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index a3636ab..ecec53e 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1518,6 +1518,12 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
     }
 }
 
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
+{
+    if ( altp2m_active(v->domain) )
+        p2m_switch_vcpu_altp2m_by_id(v, idx);
+}
+
 bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
                             struct npfec npfec,
                             vm_event_request_t **req_ptr)
@@ -1525,7 +1531,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     struct vcpu *v = current;
     unsigned long gfn = gpa >> PAGE_SHIFT;
     struct domain *d = v->domain;    
-    struct p2m_domain* p2m = p2m_get_hostp2m(d);
+    struct p2m_domain *p2m = NULL;
     mfn_t mfn;
     p2m_type_t p2mt;
     p2m_access_t p2ma;
@@ -1533,6 +1539,11 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     int rc;
     unsigned long eip = guest_cpu_user_regs()->eip;
 
+    if ( altp2m_active(d) )
+        p2m = p2m_get_altp2m(v);
+    if ( !p2m )
+        p2m = p2m_get_hostp2m(d);
+
     /* First, handle rx2rw conversion automatically.
      * These calls to p2m->set_entry() must succeed: we have the gfn
      * locked and just did a successful get_entry(). */
@@ -1639,6 +1650,12 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
         req->vcpu_id = v->vcpu_id;
 
         p2m_vm_event_fill_regs(req);
+
+        if ( altp2m_active(v->domain) )
+        {
+            req->flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
+            req->altp2m_idx = vcpu_altp2m(v).p2midx;
+        }
     }
 
     /* Pause the current VCPU */
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 120a78a..13224e2 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -399,6 +399,10 @@ void vm_event_resume(struct domain *d, struct vm_event_domain *ved)
 
         };
 
+        /* Check for altp2m switch */
+        if ( rsp.flags & VM_EVENT_FLAG_ALTERNATE_P2M )
+            p2m_altp2m_check(v, rsp.altp2m_idx);
+
         /* Unpause domain. */
         if ( rsp.flags & VM_EVENT_FLAG_VCPU_PAUSED )
             vm_event_vcpu_unpause(v);
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 63748ef..08bdce3 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -109,6 +109,12 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
     /* Not supported on ARM. */
 }
 
+static inline
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
+{
+    /* Not supported on ARM. */
+}
+
 #define p2m_is_foreign(_t)  ((_t) == p2m_map_foreign)
 #define p2m_is_ram(_t)      ((_t) == p2m_ram_rw || (_t) == p2m_ram_ro)
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 36b180b..5c5f4e8 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -753,6 +753,9 @@ uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp);
 /* Switch alternate p2m for a single vcpu */
 bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx);
 
+/* Check to see if vcpu should be switched to a different p2m. */
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
+
 /*
  * p2m type to IOMMU flags
  */
diff --git a/xen/include/public/vm_event.h b/xen/include/public/vm_event.h
index 577e971..3556e92 100644
--- a/xen/include/public/vm_event.h
+++ b/xen/include/public/vm_event.h
@@ -47,6 +47,16 @@
 #define VM_EVENT_FLAG_VCPU_PAUSED     (1 << 0)
 /* Flags to aid debugging mem_event */
 #define VM_EVENT_FLAG_FOREIGN         (1 << 1)
+/*
+ * This flag can be set in a request or a response.
+ *
+ * On a request, indicates that the event occurred in the alternate p2m
+ * specified by the altp2m_idx request field.
+ *
+ * On a response, indicates that the VCPU should resume in the alternate
+ * p2m specified by the altp2m_idx response field if possible.
+ */
+#define VM_EVENT_FLAG_ALTERNATE_P2M   (1 << 5)
 
 /*
  * Reasons for the vm event request
@@ -194,6 +204,7 @@ typedef struct vm_event_st {
     uint32_t flags;     /* VM_EVENT_FLAG_* */
     uint32_t reason;    /* VM_EVENT_REASON_* */
     uint32_t vcpu_id;
+    uint16_t altp2m_idx; /* may be used during request and response */
 
     union {
         struct vm_event_paging                mem_paging;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v5 10/15] x86/altp2m: add remaining support routines.
  2015-07-14  0:14 [PATCH v5 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (8 preceding siblings ...)
  2015-07-14  0:14 ` [PATCH v5 09/15] x86/altp2m: alternate p2m memory events Ed White
@ 2015-07-14  0:14 ` Ed White
  2015-07-14 14:31   ` Jan Beulich
  2015-07-16 14:44   ` George Dunlap
  2015-07-14  0:14 ` [PATCH v5 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
                   ` (4 subsequent siblings)
  14 siblings, 2 replies; 56+ messages in thread
From: Ed White @ 2015-07-14  0:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Add the remaining routines required to support enabling the alternate
p2m functionality.
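
One detail worth noting is the superpage rounding used when an entry is
lazily copied from the host p2m: both frame numbers are masked down to
the start of the superpage before the copy. A worked example (numbers
are illustrative, not taken from the patch):

    /* A 2M superpage covers 512 4K pages, so page_order == 9. */
    mask = ~((1UL << 9) - 1);    /* == ~0x1ff */
    gfn  = 0x12345 & mask;       /* -> 0x12200, the superpage boundary */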

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c           |  58 +++++-
 xen/arch/x86/mm/hap/Makefile     |   1 +
 xen/arch/x86/mm/hap/altp2m_hap.c |  98 ++++++++++
 xen/arch/x86/mm/p2m-ept.c        |   3 +
 xen/arch/x86/mm/p2m.c            | 385 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/altp2m.h |   4 +
 xen/include/asm-x86/p2m.h        |  33 ++++
 7 files changed, 576 insertions(+), 6 deletions(-)
 create mode 100644 xen/arch/x86/mm/hap/altp2m_hap.c

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 38deedc..a9f4b1b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2802,10 +2802,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     mfn_t mfn;
     struct vcpu *curr = current;
     struct domain *currd = curr->domain;
-    struct p2m_domain *p2m;
+    struct p2m_domain *p2m, *hostp2m;
     int rc, fall_through = 0, paged = 0;
     int sharing_enomem = 0;
     vm_event_request_t *req_ptr = NULL;
+    bool_t ap2m_active = 0;
 
     /* On Nested Virtualization, walk the guest page table.
      * If this succeeds, all is fine.
@@ -2865,11 +2866,31 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
         goto out;
     }
 
-    p2m = p2m_get_hostp2m(currd);
-    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 
+    ap2m_active = altp2m_active(currd);
+
+    /* Take a lock on the host p2m speculatively, to avoid potential
+     * locking order problems later and to handle unshare etc.
+     */
+    hostp2m = p2m_get_hostp2m(currd);
+    mfn = get_gfn_type_access(hostp2m, gfn, &p2mt, &p2ma,
                               P2M_ALLOC | (npfec.write_access ? P2M_UNSHARE : 0),
                               NULL);
 
+    if ( ap2m_active )
+    {
+        if ( altp2m_hap_nested_page_fault(curr, gpa, gla, npfec, &p2m) == 1 )
+        {
+            /* entry was lazily copied from host -- retry */
+            __put_gfn(hostp2m, gfn);
+            rc = 1;
+            goto out;
+        }
+
+        mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+    }
+    else
+        p2m = hostp2m;
+
     /* Check access permissions first, then handle faults */
     if ( mfn_x(mfn) != INVALID_MFN )
     {
@@ -2909,6 +2930,20 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
 
         if ( violation )
         {
+            /* Should #VE be emulated for this fault? */
+            if ( p2m_is_altp2m(p2m) && !cpu_has_vmx_virt_exceptions )
+            {
+                bool_t sve;
+
+                p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
+
+                if ( !sve && altp2m_vcpu_emulate_ve(curr) )
+                {
+                    rc = 1;
+                    goto out_put_gfn;
+                }
+            }
+
             if ( p2m_mem_access_check(gpa, gla, npfec, &req_ptr) )
             {
                 fall_through = 1;
@@ -2928,7 +2963,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
          (npfec.write_access &&
           (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
     {
-        put_gfn(currd, gfn);
+        __put_gfn(p2m, gfn);
+        if ( ap2m_active )
+            __put_gfn(hostp2m, gfn);
 
         rc = 0;
         if ( unlikely(is_pvh_domain(currd)) )
@@ -2957,6 +2994,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     /* Spurious fault? PoD and log-dirty also take this path. */
     if ( p2m_is_ram(p2mt) )
     {
+        rc = 1;
         /*
          * Page log dirty is always done with order 0. If this mfn resides in
          * a large page, we do not change other pages type within that large
@@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
         if ( npfec.write_access )
         {
             paging_mark_dirty(currd, mfn_x(mfn));
+            /* If p2m is really an altp2m, unlock here to avoid a lock ordering
+             * violation when the change below is propagated from the host p2m. */
+            if ( ap2m_active )
+                __put_gfn(p2m, gfn);
             p2m_change_type_one(currd, gfn, p2m_ram_logdirty, p2m_ram_rw);
+            __put_gfn(ap2m_active ? hostp2m : p2m, gfn);
+
+            goto out;
         }
-        rc = 1;
         goto out_put_gfn;
     }
 
@@ -2977,7 +3021,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     rc = fall_through;
 
 out_put_gfn:
-    put_gfn(currd, gfn);
+    __put_gfn(p2m, gfn);
+    if ( ap2m_active )
+        __put_gfn(hostp2m, gfn);
 out:
     /* All of these are delayed until we exit, since we might 
      * sleep on event ring wait queues, and we must not hold
diff --git a/xen/arch/x86/mm/hap/Makefile b/xen/arch/x86/mm/hap/Makefile
index 68f2bb5..216cd90 100644
--- a/xen/arch/x86/mm/hap/Makefile
+++ b/xen/arch/x86/mm/hap/Makefile
@@ -4,6 +4,7 @@ obj-y += guest_walk_3level.o
 obj-$(x86_64) += guest_walk_4level.o
 obj-y += nested_hap.o
 obj-y += nested_ept.o
+obj-y += altp2m_hap.o
 
 guest_walk_%level.o: guest_walk.c Makefile
 	$(CC) $(CFLAGS) -DGUEST_PAGING_LEVELS=$* -c $< -o $@
diff --git a/xen/arch/x86/mm/hap/altp2m_hap.c b/xen/arch/x86/mm/hap/altp2m_hap.c
new file mode 100644
index 0000000..52c7877
--- /dev/null
+++ b/xen/arch/x86/mm/hap/altp2m_hap.c
@@ -0,0 +1,98 @@
+/******************************************************************************
+ * arch/x86/mm/hap/altp2m_hap.c
+ *
+ * Copyright (c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <asm/domain.h>
+#include <asm/page.h>
+#include <asm/paging.h>
+#include <asm/p2m.h>
+#include <asm/hap.h>
+#include <asm/hvm/altp2m.h>
+
+#include "private.h"
+
+/*
+ * If the fault is for a not present entry:
+ *     if the entry in the host p2m has a valid mfn, copy it and retry
+ *     else indicate that outer handler should handle fault
+ *
+ * If the fault is for a present entry:
+ *     indicate that outer handler should handle fault
+ */
+
+int
+altp2m_hap_nested_page_fault(struct vcpu *v, paddr_t gpa,
+                                unsigned long gla, struct npfec npfec,
+                                struct p2m_domain **ap2m)
+{
+    struct p2m_domain *hp2m = p2m_get_hostp2m(v->domain);
+    p2m_type_t p2mt;
+    p2m_access_t p2ma;
+    unsigned int page_order;
+    gfn_t gfn = _gfn(paddr_to_pfn(gpa));
+    unsigned long mask;
+    mfn_t mfn;
+    int rv;
+
+    *ap2m = p2m_get_altp2m(v);
+
+    mfn = get_gfn_type_access(*ap2m, gfn_x(gfn), &p2mt, &p2ma,
+                              0, &page_order);
+    __put_gfn(*ap2m, gfn_x(gfn));
+
+    if ( mfn_x(mfn) != INVALID_MFN )
+        return 0;
+
+    mfn = get_gfn_type_access(hp2m, gfn_x(gfn), &p2mt, &p2ma,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order);
+    put_gfn(hp2m->domain, gfn_x(gfn));
+
+    if ( mfn_x(mfn) == INVALID_MFN )
+        return 0;
+
+    p2m_lock(*ap2m);
+
+    /* If this is a superpage mapping, round down both frame numbers
+     * to the start of the superpage. */
+    mask = ~((1UL << page_order) - 1);
+    mfn = _mfn(mfn_x(mfn) & mask);
+
+    rv = p2m_set_entry(*ap2m, gfn_x(gfn) & mask, mfn, page_order, p2mt, p2ma);
+    p2m_unlock(*ap2m);
+
+    if ( rv )
+    {
+        gdprintk(XENLOG_ERR,
+                 "failed to set entry for %#lx -> %#lx in p2m %p\n",
+                 gfn_x(gfn), mfn_x(mfn), *ap2m);
+        domain_crash(hp2m->domain);
+    }
+
+    return 1;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 9167833..7ca0598 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -850,6 +850,9 @@ out:
     if ( is_epte_present(&old_entry) )
         ept_free_entry(p2m, &old_entry, target);
 
+    if ( rc == 0 && p2m_is_hostp2m(p2m) )
+        p2m_altp2m_propagate_change(d, _gfn(gfn), mfn, order, p2mt, p2ma);
+
     return rc;
 }
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index ecec53e..0c37bbf 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2037,6 +2037,391 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx)
     return rc;
 }
 
+void p2m_flush_altp2m(struct domain *d)
+{
+    uint16_t i;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        p2m_flush_table(d->arch.altp2m_p2m[i]);
+        /* Uninit and reinit ept to force TLB shootdown */
+        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
+        ept_p2m_init(d->arch.altp2m_p2m[i]);
+        d->arch.altp2m_eptp[i] = INVALID_MFN;
+    }
+
+    altp2m_list_unlock(d);
+}
+
+static void p2m_init_altp2m_helper(struct domain *d, uint16_t i)
+{
+    struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
+    struct ept_data *ept;
+
+    p2m->min_remapped_gfn = INVALID_GFN;
+    p2m->max_remapped_gfn = INVALID_GFN;
+    ept = &p2m->ept;
+    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
+    d->arch.altp2m_eptp[i] = ept_get_eptp(ept);
+}
+
+long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx)
+{
+    long rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] == INVALID_MFN )
+    {
+        p2m_init_altp2m_helper(d, idx);
+        rc = 0;
+    }
+
+    altp2m_list_unlock(d);
+    return rc;
+}
+
+long p2m_init_next_altp2m(struct domain *d, uint16_t *idx)
+{
+    long rc = -EINVAL;
+    uint16_t i;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] != INVALID_MFN )
+            continue;
+
+        p2m_init_altp2m_helper(d, i);
+        *idx = i;
+        rc = 0;
+
+        break;
+    }
+
+    altp2m_list_unlock(d);
+    return rc;
+}
+
+long p2m_destroy_altp2m_by_id(struct domain *d, uint16_t idx)
+{
+    struct p2m_domain *p2m;
+    long rc = -EINVAL;
+
+    if ( !idx || idx >= MAX_ALTP2M )
+        return rc;
+
+    domain_pause_except_self(d);
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        p2m = d->arch.altp2m_p2m[idx];
+
+        if ( !_atomic_read(p2m->active_vcpus) )
+        {
+            p2m_flush_table(d->arch.altp2m_p2m[idx]);
+            /* Uninit and reinit ept to force TLB shootdown */
+            ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
+            ept_p2m_init(d->arch.altp2m_p2m[idx]);
+            d->arch.altp2m_eptp[idx] = INVALID_MFN;
+            rc = 0;
+        }
+    }
+
+    altp2m_list_unlock(d);
+
+    domain_unpause_except_self(d);
+
+    return rc;
+}
+
+long p2m_switch_domain_altp2m_by_id(struct domain *d, uint16_t idx)
+{
+    struct vcpu *v;
+    long rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    domain_pause_except_self(d);
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        for_each_vcpu( d, v )
+            if ( idx != vcpu_altp2m(v).p2midx )
+            {
+                atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+                vcpu_altp2m(v).p2midx = idx;
+                atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+                altp2m_vcpu_update_eptp(v);
+            }
+
+        rc = 0;
+    }
+
+    altp2m_list_unlock(d);
+
+    domain_unpause_except_self(d);
+
+    return rc;
+}
+
+long p2m_set_altp2m_mem_access(struct domain *d, uint16_t idx,
+                                 gfn_t gfn, xenmem_access_t access)
+{
+    struct p2m_domain *hp2m, *ap2m;
+    p2m_access_t req_a, old_a;
+    p2m_type_t t;
+    mfn_t mfn;
+    unsigned int page_order;
+    long rc = -EINVAL;
+
+    static const p2m_access_t memaccess[] = {
+#define ACCESS(ac) [XENMEM_access_##ac] = p2m_access_##ac
+        ACCESS(n),
+        ACCESS(r),
+        ACCESS(w),
+        ACCESS(rw),
+        ACCESS(x),
+        ACCESS(rx),
+        ACCESS(wx),
+        ACCESS(rwx),
+#undef ACCESS
+    };
+
+    if ( idx >= MAX_ALTP2M || d->arch.altp2m_eptp[idx] == INVALID_MFN )
+        return rc;
+
+    ap2m = d->arch.altp2m_p2m[idx];
+
+    switch ( access )
+    {
+    case 0 ... ARRAY_SIZE(memaccess) - 1:
+        req_a = memaccess[access];
+        break;
+    case XENMEM_access_default:
+        req_a = ap2m->default_access;
+        break;
+    default:
+        return rc;
+    }
+
+    /* If request to set default access */
+    if ( gfn_x(gfn) == INVALID_GFN )
+    {
+        ap2m->default_access = req_a;
+        return 0;
+    }
+
+    hp2m = p2m_get_hostp2m(d);
+
+    p2m_lock(ap2m);
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(gfn), &t, &old_a, 0, NULL, NULL);
+
+    /* Check host p2m if no valid entry in alternate */
+    if ( !mfn_valid(mfn) )
+    {
+        mfn = hp2m->get_entry(hp2m, gfn_x(gfn), &t, &old_a,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order, NULL);
+
+        if ( !mfn_valid(mfn) || t != p2m_ram_rw )
+            goto out;
+
+        /* If this is a superpage, copy that first */
+        if ( page_order != PAGE_ORDER_4K )
+        {
+            gfn_t gfn2;
+            unsigned long mask;
+            mfn_t mfn2;
+
+            mask = ~((1UL << page_order) - 1);
+            gfn2 = _gfn(gfn_x(gfn) & mask);
+            mfn2 = _mfn(mfn_x(mfn) & mask);
+
+            if ( ap2m->set_entry(ap2m, gfn_x(gfn2), mfn2, page_order, t, old_a, 1) )
+                goto out;
+        }
+    }
+
+    if ( !ap2m->set_entry(ap2m, gfn_x(gfn), mfn, PAGE_ORDER_4K, t, req_a,
+                          (current->domain != d)) )
+        rc = 0;
+
+out:
+    p2m_unlock(ap2m);
+    return rc;
+}
+
+long p2m_change_altp2m_gfn(struct domain *d, uint16_t idx,
+                             gfn_t old_gfn, gfn_t new_gfn)
+{
+    struct p2m_domain *hp2m, *ap2m;
+    p2m_access_t a;
+    p2m_type_t t;
+    mfn_t mfn;
+    unsigned int page_order;
+    long rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M || d->arch.altp2m_eptp[idx] == INVALID_MFN )
+        return rc;
+
+    hp2m = p2m_get_hostp2m(d);
+    ap2m = d->arch.altp2m_p2m[idx];
+
+    p2m_lock(ap2m);
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(old_gfn), &t, &a, 0, NULL, NULL);
+
+    if ( gfn_x(new_gfn) == INVALID_GFN )
+    {
+        if ( mfn_valid(mfn) )
+            p2m_remove_page(ap2m, gfn_x(old_gfn), mfn_x(mfn), PAGE_ORDER_4K);
+        rc = 0;
+        goto out;
+    }
+
+    /* Check host p2m if no valid entry in alternate */
+    if ( !mfn_valid(mfn) )
+    {
+        mfn = hp2m->get_entry(hp2m, gfn_x(old_gfn), &t, &a,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order, NULL);
+
+        if ( !mfn_valid(mfn) || t != p2m_ram_rw )
+            goto out;
+
+        /* If this is a superpage, copy that first */
+        if ( page_order != PAGE_ORDER_4K )
+        {
+            gfn_t gfn;
+            unsigned long mask;
+
+            mask = ~((1UL << page_order) - 1);
+            gfn = _gfn(gfn_x(old_gfn) & mask);
+            mfn = _mfn(mfn_x(mfn) & mask);
+
+            if ( ap2m->set_entry(ap2m, gfn_x(gfn), mfn, page_order, t, a, 1) )
+                goto out;
+        }
+    }
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(new_gfn), &t, &a, 0, NULL, NULL);
+
+    if ( !mfn_valid(mfn) )
+        mfn = hp2m->get_entry(hp2m, gfn_x(new_gfn), &t, &a, 0, NULL, NULL);
+
+    if ( !mfn_valid(mfn) || (t != p2m_ram_rw) )
+        goto out;
+
+    if ( !ap2m->set_entry(ap2m, gfn_x(old_gfn), mfn, PAGE_ORDER_4K, t, a,
+                          (current->domain != d)) )
+    {
+        rc = 0;
+
+        if ( ap2m->min_remapped_gfn == INVALID_GFN ||
+             gfn_x(new_gfn) < ap2m->min_remapped_gfn )
+            ap2m->min_remapped_gfn = gfn_x(new_gfn);
+        if ( ap2m->max_remapped_gfn == INVALID_GFN ||
+             gfn_x(new_gfn) > ap2m->max_remapped_gfn )
+            ap2m->max_remapped_gfn = gfn_x(new_gfn);
+    }
+
+out:
+    p2m_unlock(ap2m);
+    return rc;
+}
+
+static void p2m_reset_altp2m(struct p2m_domain *p2m)
+{
+    p2m_flush_table(p2m);
+    /* Uninit and reinit ept to force TLB shootdown */
+    ept_p2m_uninit(p2m);
+    ept_p2m_init(p2m);
+    p2m->min_remapped_gfn = INVALID_GFN;
+    p2m->max_remapped_gfn = INVALID_GFN;
+}
+
+void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
+                                 mfn_t mfn, unsigned int page_order,
+                                 p2m_type_t p2mt, p2m_access_t p2ma)
+{
+    struct p2m_domain *p2m;
+    p2m_access_t a;
+    p2m_type_t t;
+    mfn_t m;
+    uint16_t i;
+    bool_t reset_p2m;
+    unsigned int reset_count = 0;
+    uint16_t last_reset_idx = ~0;
+
+    if ( !altp2m_active(d) )
+        return;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
+            continue;
+
+        p2m = d->arch.altp2m_p2m[i];
+        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
+
+        reset_p2m = 0;
+
+        /* Check for a dropped page that may impact this altp2m */
+        if ( mfn_x(mfn) == INVALID_MFN &&
+             gfn_x(gfn) >= p2m->min_remapped_gfn &&
+             gfn_x(gfn) <= p2m->max_remapped_gfn )
+            reset_p2m = 1;
+
+        if ( reset_p2m )
+        {
+            if ( !reset_count++ )
+            {
+                p2m_reset_altp2m(p2m);
+                last_reset_idx = i;
+            }
+            else
+            {
+                /* At least 2 altp2m's impacted, so reset everything */
+                __put_gfn(p2m, gfn_x(gfn));
+
+                for ( i = 0; i < MAX_ALTP2M; i++ )
+                {
+                    if ( i == last_reset_idx ||
+                         d->arch.altp2m_eptp[i] == INVALID_MFN )
+                        continue;
+
+                    p2m = d->arch.altp2m_p2m[i];
+                    p2m_lock(p2m);
+                    p2m_reset_altp2m(p2m);
+                    p2m_unlock(p2m);
+                }
+
+                goto out;
+            }
+        }
+        else if ( mfn_x(m) != INVALID_MFN )
+           p2m_set_entry(p2m, gfn_x(gfn), mfn, page_order, p2mt, p2ma);
+
+        __put_gfn(p2m, gfn_x(gfn));
+    }
+
+out:
+    altp2m_list_unlock(d);
+}
+
 /*** Audit ***/
 
 #if P2M_AUDIT
diff --git a/xen/include/asm-x86/hvm/altp2m.h b/xen/include/asm-x86/hvm/altp2m.h
index 38de494..6aa8c6b 100644
--- a/xen/include/asm-x86/hvm/altp2m.h
+++ b/xen/include/asm-x86/hvm/altp2m.h
@@ -34,5 +34,9 @@ void altp2m_vcpu_initialise(struct vcpu *v);
 void altp2m_vcpu_destroy(struct vcpu *v);
 void altp2m_vcpu_reset(struct vcpu *v);
 
+/* Alternate p2m paging */
+int altp2m_hap_nested_page_fault(struct vcpu *v, paddr_t gpa,
+    unsigned long gla, struct npfec npfec, struct p2m_domain **ap2m);
+
 #endif /* _ALTP2M_H */
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 5c5f4e8..49d1829 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -268,6 +268,11 @@ struct p2m_domain {
     /* Highest guest frame that's ever been mapped in the p2m */
     unsigned long max_mapped_pfn;
 
+    /* Alternate p2m's only: range of gfn's for which underlying
+     * mfn may have duplicate mappings */
+    unsigned long min_remapped_gfn;
+    unsigned long max_remapped_gfn;
+
     /* When releasing shared gfn's in a preemptible manner, recall where
      * to resume the search */
     unsigned long next_shared_gfn_to_relinquish;
@@ -756,6 +761,34 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx);
 /* Check to see if vcpu should be switched to a different p2m. */
 void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
 
+/* Flush all the alternate p2m's for a domain */
+void p2m_flush_altp2m(struct domain *d);
+
+/* Make a specific alternate p2m valid */
+long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx);
+
+/* Find an available alternate p2m and make it valid */
+long p2m_init_next_altp2m(struct domain *d, uint16_t *idx);
+
+/* Make a specific alternate p2m invalid */
+long p2m_destroy_altp2m_by_id(struct domain *d, uint16_t idx);
+
+/* Switch alternate p2m for entire domain */
+long p2m_switch_domain_altp2m_by_id(struct domain *d, uint16_t idx);
+
+/* Set access type for a gfn */
+long p2m_set_altp2m_mem_access(struct domain *d, uint16_t idx,
+                                 gfn_t gfn, xenmem_access_t access);
+
+/* Change a gfn->mfn mapping */
+long p2m_change_altp2m_gfn(struct domain *d, uint16_t idx,
+                             gfn_t old_gfn, gfn_t new_gfn);
+
+/* Propagate a host p2m change to all alternate p2m's */
+void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
+                                 mfn_t mfn, unsigned int page_order,
+                                 p2m_type_t p2mt, p2m_access_t p2ma);
+
 /*
  * p2m type to IOMMU flags
  */
-- 
1.9.1
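
The invalidation rule the routines above implement is deliberately coarse:
each view tracks only a [min_remapped_gfn, max_remapped_gfn] window of gfn's
whose mappings were changed, so when the host p2m drops a page inside that
window the whole view is reset rather than searching for the exact stale
entry (and two or more impacted views escalate to resetting every view,
keeping the common single-view case cheap). A self-contained sketch of the
per-view test — the field names and INVALID_GFN sentinel come from the
patch, the helper itself is illustrative only:

    #include <stdbool.h>

    #define INVALID_GFN (~0UL)  /* sentinel: no gfn remapped in this view yet */

    /* Illustrative mirror of the reset test in p2m_altp2m_propagate_change(). */
    static bool view_needs_reset(unsigned long dropped_gfn,
                                 unsigned long min_remapped_gfn,
                                 unsigned long max_remapped_gfn)
    {
        /* Nothing has ever been remapped in this view: nothing can be stale. */
        if ( min_remapped_gfn == INVALID_GFN )
            return false;

        /* The dropped gfn may alias a duplicated mapping inside the window. */
        return dropped_gfn >= min_remapped_gfn &&
               dropped_gfn <= max_remapped_gfn;
    }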

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v5 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-14  0:14 [PATCH v5 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (9 preceding siblings ...)
  2015-07-14  0:14 ` [PATCH v5 10/15] x86/altp2m: add remaining support routines Ed White
@ 2015-07-14  0:14 ` Ed White
  2015-07-14 14:36   ` Jan Beulich
  2015-07-14  0:15 ` [PATCH v5 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 56+ messages in thread
From: Ed White @ 2015-07-14  0:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf
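
Define the HVMOP_altp2m hypercall and its sub-operations: get/set the
per-domain altp2m state, register a #VE notification gfn for the calling
vCPU, create/destroy/switch views, set per-view memory access types, and
change a gfn->mfn mapping within a view.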

Signed-off-by: Ed White <edmund.h.white@intel.com>
---
 xen/arch/x86/hvm/hvm.c          | 142 ++++++++++++++++++++++++++++++++++++++++
 xen/include/public/hvm/hvm_op.h |  82 +++++++++++++++++++++++
 2 files changed, 224 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index a9f4b1b..df6c6b6 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6443,6 +6443,148 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case HVMOP_altp2m:
+    {
+        struct xen_hvm_altp2m_op a;
+        struct domain *d = NULL;
+
+        if ( copy_from_guest(&a, arg, 1) )
+            return -EFAULT;
+
+        if ( a.pad[0] || a.pad[1] )
+            return -EINVAL;
+
+        switch ( a.cmd )
+        {
+        case HVMOP_altp2m_get_domain_state:
+        case HVMOP_altp2m_set_domain_state:
+        case HVMOP_altp2m_create_p2m:
+        case HVMOP_altp2m_destroy_p2m:
+        case HVMOP_altp2m_switch_p2m:
+        case HVMOP_altp2m_set_mem_access:
+        case HVMOP_altp2m_change_gfn:
+            d = rcu_lock_domain_by_any_id(a.domain);
+            if ( d == NULL )
+                return -ESRCH;
+
+            if ( !is_hvm_domain(d) || !hvm_altp2m_supported() )
+                rc = -EINVAL;
+            break;
+
+        case HVMOP_altp2m_vcpu_enable_notify:
+           break;
+
+        default:
+            return -ENOSYS;
+        }
+
+        if ( !rc )
+        {
+            switch ( a.cmd )
+            {
+            case HVMOP_altp2m_get_domain_state:
+                a.u.domain_state.state = altp2m_active(d);
+                rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+                break;
+
+            case HVMOP_altp2m_set_domain_state:
+            {
+                struct vcpu *v;
+                bool_t ostate;
+                
+                if ( nestedhvm_enabled(d) )
+                {
+                    rc = -EINVAL;
+                    break;
+                }
+
+                ostate = d->arch.altp2m_active;
+                d->arch.altp2m_active = !!a.u.domain_state.state;
+
+                /* If the alternate p2m state has changed, handle appropriately */
+                if ( d->arch.altp2m_active != ostate &&
+                     (ostate || !(rc = p2m_init_altp2m_by_id(d, 0))) )
+                {
+                    for_each_vcpu( d, v )
+                    {
+                        if ( !ostate )
+                            altp2m_vcpu_initialise(v);
+                        else
+                            altp2m_vcpu_destroy(v);
+                    }
+
+                    if ( ostate )
+                        p2m_flush_altp2m(d);
+                }
+                break;
+            }
+            default:
+                if ( !(d ? d : current->domain)->arch.altp2m_active )
+                {
+                    rc = -EINVAL;
+                    break;
+                }
+
+                switch ( a.cmd )
+                {
+                case HVMOP_altp2m_vcpu_enable_notify:
+                {
+                    struct vcpu *curr = current;
+                    p2m_type_t p2mt;
+
+                    if ( (gfn_x(vcpu_altp2m(curr).veinfo_gfn) != INVALID_GFN) ||
+                         (mfn_x(get_gfn_query_unlocked(curr->domain,
+                                a.u.enable_notify.gfn, &p2mt)) == INVALID_MFN) )
+                        return -EINVAL;
+
+                    vcpu_altp2m(curr).veinfo_gfn = _gfn(a.u.enable_notify.gfn);
+                    altp2m_vcpu_update_vmfunc_ve(curr);
+                    break;
+                }
+                case HVMOP_altp2m_create_p2m:
+                    if ( !(rc = p2m_init_next_altp2m(d, &a.u.view.view)) )
+                        rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+                    break;
+
+                case HVMOP_altp2m_destroy_p2m:
+                    rc = p2m_destroy_altp2m_by_id(d, a.u.view.view);
+                    break;
+
+                case HVMOP_altp2m_switch_p2m:
+                    rc = p2m_switch_domain_altp2m_by_id(d, a.u.view.view);
+                    break;
+
+                case HVMOP_altp2m_set_mem_access:
+                    if ( a.u.set_mem_access.pad[0] || a.u.set_mem_access.pad[1] ||
+                         a.u.set_mem_access.pad[2] || a.u.set_mem_access.pad[3] )
+                        rc = -EINVAL;
+                    else
+                        rc = p2m_set_altp2m_mem_access(d, a.u.set_mem_access.view,
+                                _gfn(a.u.set_mem_access.gfn),
+                                a.u.set_mem_access.hvmmem_access);
+                    break;
+
+                case HVMOP_altp2m_change_gfn:
+                    if ( a.u.change_gfn.pad[0] || a.u.change_gfn.pad[1] ||
+                         a.u.change_gfn.pad[2] || a.u.change_gfn.pad[3] ||
+                         a.u.change_gfn.pad[4] || a.u.change_gfn.pad[5] )
+                        rc = -EINVAL;
+                    else
+                        rc = p2m_change_altp2m_gfn(d, a.u.change_gfn.view,
+                                _gfn(a.u.change_gfn.old_gfn),
+                                _gfn(a.u.change_gfn.new_gfn));
+                    break;
+                }
+                break;
+            }
+        }
+
+        if ( d )
+            rcu_unlock_domain(d);
+
+        break;
+    }
+
     default:
     {
         gdprintk(XENLOG_DEBUG, "Bad HVM op %ld.\n", op);
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index 9b84e84..05d42c4 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -396,6 +396,88 @@ DEFINE_XEN_GUEST_HANDLE(xen_hvm_evtchn_upcall_vector_t);
 
 #endif /* defined(__i386__) || defined(__x86_64__) */
 
+/* HVMOP_altp2m: perform altp2m state operations */
+#define HVMOP_altp2m 24
+
+struct xen_hvm_altp2m_domain_state {
+    /* IN or OUT variable on/off */
+    uint8_t state;
+};
+typedef struct xen_hvm_altp2m_domain_state xen_hvm_altp2m_domain_state_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_domain_state_t);
+
+struct xen_hvm_altp2m_vcpu_enable_notify {
+    /* #VE info area gfn */
+    uint64_t gfn;
+};
+typedef struct xen_hvm_altp2m_vcpu_enable_notify xen_hvm_altp2m_vcpu_enable_notify_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_vcpu_enable_notify_t);
+
+struct xen_hvm_altp2m_view {
+    /* IN/OUT variable */
+    uint16_t view;
+    /* Create view only: default access type
+     * NOTE: currently ignored */
+    uint16_t hvmmem_default_access; /* xenmem_access_t */
+};
+typedef struct xen_hvm_altp2m_view xen_hvm_altp2m_view_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_view_t);
+
+struct xen_hvm_altp2m_set_mem_access {
+    /* view */
+    uint16_t view;
+    /* Memory type */
+    uint16_t hvmmem_access; /* xenmem_access_t */
+    uint8_t pad[4];
+    /* gfn */
+    uint64_t gfn;
+};
+typedef struct xen_hvm_altp2m_set_mem_access xen_hvm_altp2m_set_mem_access_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_set_mem_access_t);
+
+struct xen_hvm_altp2m_change_gfn {
+    /* view */
+    uint16_t view;
+    uint8_t pad[6];
+    /* old gfn */
+    uint64_t old_gfn;
+    /* new gfn, INVALID_GFN (~0UL) means revert */
+    uint64_t new_gfn;
+};
+typedef struct xen_hvm_altp2m_change_gfn xen_hvm_altp2m_change_gfn_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_change_gfn_t);
+
+struct xen_hvm_altp2m_op {
+    uint32_t cmd;
+/* Get/set the altp2m state for a domain */
+#define HVMOP_altp2m_get_domain_state     1
+#define HVMOP_altp2m_set_domain_state     2
+/* Set the current VCPU to receive altp2m event notifications */
+#define HVMOP_altp2m_vcpu_enable_notify   3
+/* Create a new view */
+#define HVMOP_altp2m_create_p2m           4
+/* Destroy a view */
+#define HVMOP_altp2m_destroy_p2m          5
+/* Switch view for an entire domain */
+#define HVMOP_altp2m_switch_p2m           6
+/* Set the access permissions of a page of memory in a view */
+#define HVMOP_altp2m_set_mem_access       7
+/* Change a p2m entry to have a different gfn->mfn mapping */
+#define HVMOP_altp2m_change_gfn           8
+    domid_t domain;
+    uint8_t pad[2];
+    union {
+        struct xen_hvm_altp2m_domain_state       domain_state;
+        struct xen_hvm_altp2m_vcpu_enable_notify enable_notify;
+        struct xen_hvm_altp2m_view               view;
+        struct xen_hvm_altp2m_set_mem_access     set_mem_access;
+        struct xen_hvm_altp2m_change_gfn         change_gfn;
+        uint8_t pad[64];
+    } u;
+};
+typedef struct xen_hvm_altp2m_op xen_hvm_altp2m_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_op_t);
+
 #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
 
 /*
-- 
1.9.1
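
All sub-ops travel through the single xen_hvm_altp2m_op structure above,
and the handler insists the pad bytes are zero. As a sketch only (the
supported toolstack wrappers arrive in patch 14), switching an entire
domain to view 1 looks like:

    struct xen_hvm_altp2m_op op = {
        .cmd    = HVMOP_altp2m_switch_p2m,
        .domain = domid,  /* target domain; DOMID_SELF works for intra-domain use */
        .u.view = { .view = 1 },
    };
    /* Issued via the hvm_op hypercall (e.g. HYPERVISOR_hvm_op(HVMOP_altp2m, &op)
     * from a guest kernel).  Returns -EINVAL unless altp2m is active for the
     * domain, -ESRCH for a bad domain id.  The designated initializer leaves
     * op.pad[] zeroed, satisfying the check at the top of the handler. */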

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v5 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-14  0:14 [PATCH v5 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (10 preceding siblings ...)
  2015-07-14  0:14 ` [PATCH v5 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
@ 2015-07-14  0:15 ` Ed White
  2015-07-14  0:15 ` [PATCH v5 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 56+ messages in thread
From: Ed White @ 2015-07-14  0:15 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

The altp2mhvm and nestedhvm parameters are mutually
exclusive: enabling either one fails if the other is
already enabled.

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 docs/man/xl.cfg.pod.5           | 12 ++++++++++++
 tools/libxl/libxl.h             |  6 ++++++
 tools/libxl/libxl_create.c      |  1 +
 tools/libxl/libxl_dom.c         |  2 ++
 tools/libxl/libxl_types.idl     |  1 +
 tools/libxl/xl_cmdimpl.c        | 10 ++++++++++
 xen/arch/x86/hvm/hvm.c          | 23 +++++++++++++++++++++--
 xen/include/public/hvm/params.h |  5 ++++-
 8 files changed, 57 insertions(+), 3 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index a3e0e2e..18afd46 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1035,6 +1035,18 @@ enabled by default and you should usually omit it. It may be necessary
 to disable the HPET in order to improve compatibility with guest
 Operating Systems (X86 only)
 
+=item B<altp2mhvm=BOOLEAN>
+
+Enables or disables hvm guest access to the alternate-p2m capability.
+Alternate-p2m allows a guest to manage multiple p2m guest physical
+"memory views" (as opposed to a single p2m). This option is
+disabled by default and is available only to hvm domains.
+It is useful when access to specific guest physical memory pages
+needs to be controlled or isolated on a per-view basis, e.g.
+for HVM domain memory introspection, or for isolation and
+access control of memory between components within a single
+guest hvm domain.
+
 =item B<nestedhvm=BOOLEAN>
 
 Enables or disables guest access to hardware virtualisation features,
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index a1c5d15..17222e7 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -745,6 +745,12 @@ typedef struct libxl__ctx libxl_ctx;
 #define LIBXL_HAVE_BUILDINFO_SERIAL_LIST 1
 
 /*
+ * LIBXL_HAVE_ALTP2M
+ * If this is defined, then libxl supports alternate p2m functionality.
+ */
+#define LIBXL_HAVE_ALTP2M 1
+
+/*
  * LIBXL_HAVE_REMUS
  * If this is defined, then libxl supports remus.
  */
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index f366a09..418deee 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -329,6 +329,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
         libxl_defbool_setdefault(&b_info->u.hvm.hpet,               true);
         libxl_defbool_setdefault(&b_info->u.hvm.vpt_align,          true);
         libxl_defbool_setdefault(&b_info->u.hvm.nested_hvm,         false);
+        libxl_defbool_setdefault(&b_info->u.hvm.altp2m,             false);
         libxl_defbool_setdefault(&b_info->u.hvm.usb,                false);
         libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci,   true);
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index bdc0465..1008a16 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -300,6 +300,8 @@ static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
                     libxl_defbool_val(info->u.hvm.vpt_align));
     xc_hvm_param_set(handle, domid, HVM_PARAM_NESTEDHVM,
                     libxl_defbool_val(info->u.hvm.nested_hvm));
+    xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
+                    libxl_defbool_val(info->u.hvm.altp2m));
 }
 
 int libxl__build_pre(libxl__gc *gc, uint32_t domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index e1632fa..fb641fe 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -440,6 +440,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                        ("mmio_hole_memkb",  MemKB),
                                        ("timer_mode",       libxl_timer_mode),
                                        ("nested_hvm",       libxl_defbool),
+                                       ("altp2m",           libxl_defbool),
                                        ("smbios_firmware",  string),
                                        ("acpi_firmware",    string),
                                        ("nographic",        libxl_defbool),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index c858068..43cf6bf 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1500,6 +1500,16 @@ static void parse_config_data(const char *config_source,
 
         xlu_cfg_get_defbool(config, "nestedhvm", &b_info->u.hvm.nested_hvm, 0);
 
+        xlu_cfg_get_defbool(config, "altp2mhvm", &b_info->u.hvm.altp2m, 0);
+
+        if (!libxl_defbool_is_default(b_info->u.hvm.nested_hvm) &&
+            libxl_defbool_val(b_info->u.hvm.nested_hvm) &&
+            !libxl_defbool_is_default(b_info->u.hvm.altp2m) &&
+            libxl_defbool_val(b_info->u.hvm.altp2m)) {
+            fprintf(stderr, "ERROR: nestedhvm and altp2mhvm cannot be used together\n");
+            exit(1);
+        }
+
         xlu_cfg_replace_string(config, "smbios_firmware",
                                &b_info->u.hvm.smbios_firmware, 0);
         xlu_cfg_replace_string(config, "acpi_firmware",
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index df6c6b6..df69c1d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5750,6 +5750,7 @@ static int hvm_allow_set_param(struct domain *d,
     case HVM_PARAM_VIRIDIAN:
     case HVM_PARAM_IOREQ_SERVER_PFN:
     case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
+    case HVM_PARAM_ALTP2M:
         if ( value != 0 && a->value != value )
             rc = -EEXIST;
         break;
@@ -5872,6 +5873,9 @@ static int hvmop_set_param(
          */
         if ( cpu_has_svm && !paging_mode_hap(d) && a.value )
             rc = -EINVAL;
+        if ( a.value &&
+             d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] )
+            rc = -EINVAL;
         /* Set up NHVM state for any vcpus that are already up. */
         if ( a.value &&
              !d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM] )
@@ -5882,6 +5886,13 @@ static int hvmop_set_param(
             for_each_vcpu(d, v)
                 nestedhvm_vcpu_destroy(v);
         break;
+    case HVM_PARAM_ALTP2M:
+        if ( a.value > 1 )
+            rc = -EINVAL;
+        if ( a.value &&
+             d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM] )
+            rc = -EINVAL;
+        break;
     case HVM_PARAM_BUFIOREQ_EVTCHN:
         rc = -EINVAL;
         break;
@@ -5942,6 +5953,7 @@ static int hvm_allow_get_param(struct domain *d,
     case HVM_PARAM_STORE_EVTCHN:
     case HVM_PARAM_CONSOLE_PFN:
     case HVM_PARAM_CONSOLE_EVTCHN:
+    case HVM_PARAM_ALTP2M:
         break;
     /*
      * The following parameters must not be read by the guest
@@ -6483,6 +6495,12 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             switch ( a.cmd )
             {
             case HVMOP_altp2m_get_domain_state:
+                if ( !d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] )
+                {
+                    rc = -EINVAL;
+                    break;
+                }
+
                 a.u.domain_state.state = altp2m_active(d);
                 rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
                 break;
@@ -6491,8 +6509,9 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             {
                 struct vcpu *v;
                 bool_t ostate;
-                
-                if ( nestedhvm_enabled(d) )
+
+                if ( !d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] ||
+                     nestedhvm_enabled(d) )
                 {
                     rc = -EINVAL;
                     break;
diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
index 7c73089..147d9b8 100644
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -187,6 +187,9 @@
 /* Location of the VM Generation ID in guest physical address space. */
 #define HVM_PARAM_VM_GENERATION_ID_ADDR 34
 
-#define HVM_NR_PARAMS          35
+/* Boolean: Enable altp2m */
+#define HVM_PARAM_ALTP2M       35
+
+#define HVM_NR_PARAMS          36
 
 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
1.9.1
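
For callers not going through xl, the parameter is set exactly as the
libxl_dom.c hunk above does — a one-line sketch, assuming an open
xc_interface handle xch and an HVM domain domid:

    /* Enable altp2m for the domain; the hypercall fails if
     * HVM_PARAM_NESTEDHVM is already set, per the new check in
     * hvmop_set_param(). */
    rc = xc_hvm_param_set(xch, domid, HVM_PARAM_ALTP2M, 1);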

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v5 13/15] x86/altp2m: XSM hooks for altp2m HVM ops
  2015-07-14  0:14 [PATCH v5 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (11 preceding siblings ...)
  2015-07-14  0:15 ` [PATCH v5 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
@ 2015-07-14  0:15 ` Ed White
  2015-07-14  0:15 ` [PATCH v5 14/15] tools/libxc: add support to altp2m hvmops Ed White
  2015-07-14  0:15 ` [PATCH v5 15/15] tools/xen-access: altp2m testcases Ed White
  14 siblings, 0 replies; 56+ messages in thread
From: Ed White @ 2015-07-14  0:15 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Ravi Sahita <ravi.sahita@intel.com>

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
 tools/flask/policy/policy/modules/xen/xen.if |  4 ++--
 xen/arch/x86/hvm/hvm.c                       |  6 ++++++
 xen/include/xsm/dummy.h                      | 12 ++++++++++++
 xen/include/xsm/xsm.h                        | 12 ++++++++++++
 xen/xsm/dummy.c                              |  2 ++
 xen/xsm/flask/hooks.c                        | 12 ++++++++++++
 xen/xsm/flask/policy/access_vectors          |  7 +++++++
 7 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index f4cde11..6177fe9 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -8,7 +8,7 @@
 define(`declare_domain_common', `
 	allow $1 $2:grant { query setup };
 	allow $1 $2:mmu { adjust physmap map_read map_write stat pinpage updatemp mmuext_op };
-	allow $1 $2:hvm { getparam setparam };
+	allow $1 $2:hvm { getparam setparam altp2mhvm_op };
 	allow $1 $2:domain2 get_vnumainfo;
 ')
 
@@ -58,7 +58,7 @@ define(`create_domain_common', `
 	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
 	allow $1 $2:grant setup;
 	allow $1 $2:hvm { cacheattr getparam hvmctl irqlevel pciroute sethvmc
-			setparam pcilevel trackdirtyvram nested };
+			setparam pcilevel trackdirtyvram nested altp2mhvm altp2mhvm_op };
 ')
 
 # create_domain(priv, target)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index df69c1d..ea1c784 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5887,6 +5887,9 @@ static int hvmop_set_param(
                 nestedhvm_vcpu_destroy(v);
         break;
     case HVM_PARAM_ALTP2M:
+        rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
+        if ( rc )
+            break;
         if ( a.value > 1 )
             rc = -EINVAL;
         if ( a.value &&
@@ -6491,6 +6494,9 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         }
 
         if ( !rc )
+            rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d ? d : current->domain);
+
+        if ( !rc )
         {
             switch ( a.cmd )
             {
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index f044c0f..e0b561d 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -548,6 +548,18 @@ static XSM_INLINE int xsm_hvm_param_nested(XSM_DEFAULT_ARG struct domain *d)
     return xsm_default_action(action, current->domain, d);
 }
 
+static XSM_INLINE int xsm_hvm_param_altp2mhvm(XSM_DEFAULT_ARG struct domain *d)
+{
+    XSM_ASSERT_ACTION(XSM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
+
+static XSM_INLINE int xsm_hvm_altp2mhvm_op(XSM_DEFAULT_ARG struct domain *d)
+{
+    XSM_ASSERT_ACTION(XSM_TARGET);
+    return xsm_default_action(action, current->domain, d);
+}
+
 static XSM_INLINE int xsm_vm_event_control(XSM_DEFAULT_ARG struct domain *d, int mode, int op)
 {
     XSM_ASSERT_ACTION(XSM_PRIV);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index c872d44..dc48d23 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -147,6 +147,8 @@ struct xsm_operations {
     int (*hvm_param) (struct domain *d, unsigned long op);
     int (*hvm_control) (struct domain *d, unsigned long op);
     int (*hvm_param_nested) (struct domain *d);
+    int (*hvm_param_altp2mhvm) (struct domain *d);
+    int (*hvm_altp2mhvm_op) (struct domain *d);
     int (*get_vnumainfo) (struct domain *d);
 
     int (*vm_event_control) (struct domain *d, int mode, int op);
@@ -586,6 +588,16 @@ static inline int xsm_hvm_param_nested (xsm_default_t def, struct domain *d)
     return xsm_ops->hvm_param_nested(d);
 }
 
+static inline int xsm_hvm_param_altp2mhvm (xsm_default_t def, struct domain *d)
+{
+    return xsm_ops->hvm_param_altp2mhvm(d);
+}
+
+static inline int xsm_hvm_altp2mhvm_op (xsm_default_t def, struct domain *d)
+{
+    return xsm_ops->hvm_altp2mhvm_op(d);
+}
+
 static inline int xsm_get_vnumainfo (xsm_default_t def, struct domain *d)
 {
     return xsm_ops->get_vnumainfo(d);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index e84b0e4..3461d4f 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -116,6 +116,8 @@ void xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, hvm_param);
     set_to_dummy_if_null(ops, hvm_control);
     set_to_dummy_if_null(ops, hvm_param_nested);
+    set_to_dummy_if_null(ops, hvm_param_altp2mhvm);
+    set_to_dummy_if_null(ops, hvm_altp2mhvm_op);
 
     set_to_dummy_if_null(ops, do_xsm_op);
 #ifdef CONFIG_COMPAT
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 6e37d29..2b998c9 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1170,6 +1170,16 @@ static int flask_hvm_param_nested(struct domain *d)
     return current_has_perm(d, SECCLASS_HVM, HVM__NESTED);
 }
 
+static int flask_hvm_param_altp2mhvm(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_HVM, HVM__ALTP2MHVM);
+}
+
+static int flask_hvm_altp2mhvm_op(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_HVM, HVM__ALTP2MHVM_OP);
+}
+
 static int flask_vm_event_control(struct domain *d, int mode, int op)
 {
     return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__VM_EVENT);
@@ -1654,6 +1664,8 @@ static struct xsm_operations flask_ops = {
     .hvm_param = flask_hvm_param,
     .hvm_control = flask_hvm_param,
     .hvm_param_nested = flask_hvm_param_nested,
+    .hvm_param_altp2mhvm = flask_hvm_param_altp2mhvm,
+    .hvm_altp2mhvm_op = flask_hvm_altp2mhvm_op,
 
     .do_xsm_op = do_flask_op,
     .get_vnumainfo = flask_get_vnumainfo,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 68284d5..d168de2 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -272,6 +272,13 @@ class hvm
     share_mem
 # HVMOP_set_param setting HVM_PARAM_NESTEDHVM
     nested
+# HVMOP_set_param setting HVM_PARAM_ALTP2M
+    altp2mhvm
+# HVMOP_altp2m_set_domain_state HVMOP_altp2m_get_domain_state
+# HVMOP_altp2m_vcpu_enable_notify HVMOP_altp2m_create_p2m
+# HVMOP_altp2m_destroy_p2m HVMOP_altp2m_switch_p2m
+# HVMOP_altp2m_set_mem_access HVMOP_altp2m_change_gfn
+    altp2mhvm_op
 }
 
 # Class event describes event channels.  Interdomain event channels have their
-- 
1.9.1
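
Note the split the default policy encodes: setting HVM_PARAM_ALTP2M is
XSM_PRIV, so out of the box only a privileged domain (i.e. the toolstack)
may flip the parameter, while the HVMOP_altp2m operations themselves are
XSM_TARGET, which also lets a domain issue them on itself — matching the
intra-domain use case where a guest agent switches its own views.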

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v5 14/15] tools/libxc: add support to altp2m hvmops
  2015-07-14  0:14 [PATCH v5 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (12 preceding siblings ...)
  2015-07-14  0:15 ` [PATCH v5 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
@ 2015-07-14  0:15 ` Ed White
  2015-07-14  0:15 ` [PATCH v5 15/15] tools/xen-access: altp2m testcases Ed White
  14 siblings, 0 replies; 56+ messages in thread
From: Ed White @ 2015-07-14  0:15 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Tamas K Lengyel <tlengyel@novetta.com>

Wrappers to issue altp2m hvmops.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxc/Makefile          |   1 +
 tools/libxc/include/xenctrl.h |  21 ++++
 tools/libxc/xc_altp2m.c       | 237 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 259 insertions(+)
 create mode 100644 tools/libxc/xc_altp2m.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 153b79e..c2c2b1c 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -10,6 +10,7 @@ override CONFIG_MIGRATE := n
 endif
 
 CTRL_SRCS-y       :=
+CTRL_SRCS-y       += xc_altp2m.c
 CTRL_SRCS-y       += xc_core.c
 CTRL_SRCS-$(CONFIG_X86) += xc_core_x86.c
 CTRL_SRCS-$(CONFIG_ARM) += xc_core_arm.c
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index d1d2ab3..ecddf28 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2316,6 +2316,27 @@ void xc_tmem_save_done(xc_interface *xch, int dom);
 int xc_tmem_restore(xc_interface *xch, int dom, int fd);
 int xc_tmem_restore_extra(xc_interface *xch, int dom, int fd);
 
+/**
+ * altp2m operations
+ */
+
+int xc_altp2m_get_domain_state(xc_interface *handle, domid_t dom, bool *state);
+int xc_altp2m_set_domain_state(xc_interface *handle, domid_t dom, bool state);
+int xc_altp2m_set_vcpu_enable_notify(xc_interface *handle, xen_pfn_t gfn);
+int xc_altp2m_create_view(xc_interface *handle, domid_t domid,
+                          xenmem_access_t default_access, uint16_t *view_id);
+int xc_altp2m_destroy_view(xc_interface *handle, domid_t domid,
+                           uint16_t view_id);
+/* Switch all vCPUs of the domain to the specified altp2m view */
+int xc_altp2m_switch_to_view(xc_interface *handle, domid_t domid,
+                             uint16_t view_id);
+int xc_altp2m_set_mem_access(xc_interface *handle, domid_t domid,
+                             uint16_t view_id, xen_pfn_t gfn,
+                             xenmem_access_t access);
+int xc_altp2m_change_gfn(xc_interface *handle, domid_t domid,
+                         uint16_t view_id, xen_pfn_t old_gfn,
+                         xen_pfn_t new_gfn);
+
 /** 
  * Mem paging operations.
  * Paging is supported only on the x86 architecture in 64 bit mode, with
diff --git a/tools/libxc/xc_altp2m.c b/tools/libxc/xc_altp2m.c
new file mode 100644
index 0000000..a4be36b
--- /dev/null
+++ b/tools/libxc/xc_altp2m.c
@@ -0,0 +1,237 @@
+/******************************************************************************
+ *
+ * xc_altp2m.c
+ *
+ * Interface to altp2m related HVMOPs
+ *
+ * Copyright (c) 2015 Tamas K Lengyel (tamas@tklengyel.com)
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+#include "xc_private.h"
+#include <stdbool.h>
+#include <xen/hvm/hvm_op.h>
+
+int xc_altp2m_get_domain_state(xc_interface *handle, domid_t dom, bool *state)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_get_domain_state;
+    arg->domain = dom;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    if ( !rc )
+        *state = arg->u.domain_state.state;
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_set_domain_state(xc_interface *handle, domid_t dom, bool state)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_set_domain_state;
+    arg->domain = dom;
+    arg->u.domain_state.state = state;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+/* Note: unlike the other wrappers, this op acts on the current vCPU. */
+int xc_altp2m_set_vcpu_enable_notify(xc_interface *handle, xen_pfn_t gfn)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_vcpu_enable_notify;
+    arg->u.enable_notify.gfn = gfn;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_create_view(xc_interface *handle, domid_t domid,
+                          xenmem_access_t default_access, uint16_t *view_id)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_create_p2m;
+    arg->domain = domid;
+    arg->u.view.view = -1;
+    arg->u.view.hvmmem_default_access = default_access;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    if ( !rc )
+        *view_id = arg->u.view.view;
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_destroy_view(xc_interface *handle, domid_t domid,
+                           uint16_t view_id)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_destroy_p2m;
+    arg->domain = domid;
+    arg->u.view.view = view_id;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+/* Switch all vCPUs of the domain to the specified altp2m view */
+int xc_altp2m_switch_to_view(xc_interface *handle, domid_t domid,
+                             uint16_t view_id)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_switch_p2m;
+    arg->domain = domid;
+    arg->u.view.view = view_id;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_set_mem_access(xc_interface *handle, domid_t domid,
+                             uint16_t view_id, xen_pfn_t gfn,
+                             xenmem_access_t access)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_set_mem_access;
+    arg->domain = domid;
+    arg->u.set_mem_access.view = view_id;
+    arg->u.set_mem_access.hvmmem_access = access;
+    arg->u.set_mem_access.gfn = gfn;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_change_gfn(xc_interface *handle, domid_t domid,
+                         uint16_t view_id, xen_pfn_t old_gfn,
+                         xen_pfn_t new_gfn)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_change_gfn;
+    arg->domain = domid;
+    arg->u.change_gfn.view = view_id;
+    arg->u.change_gfn.old_gfn = old_gfn;
+    arg->u.change_gfn.new_gfn = new_gfn;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
-- 
1.9.1
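
Taken together, the wrappers compose into the monitoring flow the series
targets. A sketch of a dom0 user (assumptions noted in the comment):

    #include <stdbool.h>
    #include <xenctrl.h>

    /* Sketch only: error handling elided; domid 1 and gfn 0x1234 are
     * placeholders, not values from the series. */
    static void altp2m_monitor_example(void)
    {
        xc_interface *xch = xc_interface_open(NULL, NULL, 0);
        domid_t domid = 1;
        uint16_t view;

        xc_altp2m_set_domain_state(xch, domid, true);         /* enable altp2m */
        xc_altp2m_create_view(xch, domid, XENMEM_access_rwx, &view);
        xc_altp2m_set_mem_access(xch, domid, view, 0x1234,    /* restrict one gfn */
                                 XENMEM_access_r);
        xc_altp2m_switch_to_view(xch, domid, view);           /* activate the view */

        /* ... consume vm_events here, as xen-access does in patch 15 ... */

        xc_altp2m_switch_to_view(xch, domid, 0);              /* back to view 0 */
        xc_altp2m_destroy_view(xch, domid, view);
        xc_altp2m_set_domain_state(xch, domid, false);
        xc_interface_close(xch);
    }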

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v5 15/15] tools/xen-access: altp2m testcases
  2015-07-14  0:14 [PATCH v5 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (13 preceding siblings ...)
  2015-07-14  0:15 ` [PATCH v5 14/15] tools/libxc: add support to altp2m hvmops Ed White
@ 2015-07-14  0:15 ` Ed White
  2015-07-14  9:56   ` Wei Liu
  14 siblings, 1 reply; 56+ messages in thread
From: Ed White @ 2015-07-14  0:15 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Tamas K Lengyel <tlengyel@novetta.com>

Working altp2m test-case. Extended the test tool to support singlestepping
to better highlight the core feature of altp2m view switching.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
---
 tools/tests/xen-access/xen-access.c | 173 ++++++++++++++++++++++++++++++------
 1 file changed, 148 insertions(+), 25 deletions(-)

diff --git a/tools/tests/xen-access/xen-access.c b/tools/tests/xen-access/xen-access.c
index 12ab921..6b69c26 100644
--- a/tools/tests/xen-access/xen-access.c
+++ b/tools/tests/xen-access/xen-access.c
@@ -275,6 +275,19 @@ xenaccess_t *xenaccess_init(xc_interface **xch_r, domid_t domain_id)
     return NULL;
 }
 
+static inline
+int control_singlestep(
+    xc_interface *xch,
+    domid_t domain_id,
+    unsigned long vcpu,
+    bool enable)
+{
+    uint32_t op = enable ?
+        XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_ON : XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_OFF;
+
+    return xc_domain_debug_control(xch, domain_id, op, vcpu);
+}
+
 /*
  * Note that this function is not thread safe.
  */
@@ -317,13 +330,15 @@ static void put_response(vm_event_t *vm_event, vm_event_response_t *rsp)
 
 void usage(char* progname)
 {
-    fprintf(stderr,
-            "Usage: %s [-m] <domain_id> write|exec|breakpoint\n"
+    fprintf(stderr, "Usage: %s [-m] <domain_id> write|exec", progname);
+#if defined(__i386__) || defined(__x86_64__)
+            fprintf(stderr, "|breakpoint|altp2m_write|altp2m_exec");
+#endif
+            fprintf(stderr,
             "\n"
             "Logs first page writes, execs, or breakpoint traps that occur on the domain.\n"
             "\n"
-            "-m requires this program to run, or else the domain may pause\n",
-            progname);
+            "-m requires this program to run, or else the domain may pause\n");
 }
 
 int main(int argc, char *argv[])
@@ -341,6 +356,8 @@ int main(int argc, char *argv[])
     int required = 0;
     int breakpoint = 0;
     int shutting_down = 0;
+    int altp2m = 0;
+    uint16_t altp2m_view_id = 0;
 
     char* progname = argv[0];
     argv++;
@@ -379,10 +396,22 @@ int main(int argc, char *argv[])
         default_access = XENMEM_access_rw;
         after_first_access = XENMEM_access_rwx;
     }
+#if defined(__i386__) || defined(__x86_64__)
     else if ( !strcmp(argv[0], "breakpoint") )
     {
         breakpoint = 1;
     }
+    else if ( !strcmp(argv[0], "altp2m_write") )
+    {
+        default_access = XENMEM_access_rx;
+        altp2m = 1;
+    }
+    else if ( !strcmp(argv[0], "altp2m_exec") )
+    {
+        default_access = XENMEM_access_rw;
+        altp2m = 1;
+    }
+#endif
     else
     {
         usage(argv[0]);
@@ -415,22 +444,73 @@ int main(int argc, char *argv[])
         goto exit;
     }
 
-    /* Set the default access type and convert all pages to it */
-    rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
-    if ( rc < 0 )
+    /* With altp2m we just create a new, restricted view of the memory */
+    if ( altp2m )
     {
-        ERROR("Error %d setting default mem access type\n", rc);
-        goto exit;
-    }
+        xen_pfn_t gfn = 0;
+        unsigned long perm_set = 0;
+
+        rc = xc_altp2m_set_domain_state( xch, domain_id, 1 );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d enabling altp2m on domain!\n", rc);
+            goto exit;
+        }
+
+        rc = xc_altp2m_create_view( xch, domain_id, default_access, &altp2m_view_id );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d creating altp2m view!\n", rc);
+            goto exit;
+        }
 
-    rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
-                           (xenaccess->max_gpfn - START_PFN) );
+        DPRINTF("altp2m view created with id %u\n", altp2m_view_id);
+        DPRINTF("Setting altp2m mem_access permissions.. ");
 
-    if ( rc < 0 )
+        for(; gfn < xenaccess->max_gpfn; ++gfn)
+        {
+            rc = xc_altp2m_set_mem_access( xch, domain_id, altp2m_view_id, gfn,
+                                           default_access);
+            if ( !rc )
+                perm_set++;
+        }
+
+        DPRINTF("done! Permissions set on %lu pages.\n", perm_set);
+
+        rc = xc_altp2m_switch_to_view( xch, domain_id, altp2m_view_id );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d switching to altp2m view!\n", rc);
+            goto exit;
+        }
+
+        rc = xc_monitor_singlestep( xch, domain_id, 1 );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d failed to enable singlestep monitoring!\n", rc);
+            goto exit;
+        }
+    }
+
+    if ( !altp2m )
     {
-        ERROR("Error %d setting all memory to access type %d\n", rc,
-              default_access);
-        goto exit;
+        /* Set the default access type and convert all pages to it */
+        rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
+        if ( rc < 0 )
+        {
+            ERROR("Error %d setting default mem access type\n", rc);
+            goto exit;
+        }
+
+        rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
+                               (xenaccess->max_gpfn - START_PFN) );
+
+        if ( rc < 0 )
+        {
+            ERROR("Error %d setting all memory to access type %d\n", rc,
+                  default_access);
+            goto exit;
+        }
     }
 
     if ( breakpoint )
@@ -448,13 +528,29 @@ int main(int argc, char *argv[])
     {
         if ( interrupted )
         {
+            /* Unregister for every event */
             DPRINTF("xenaccess shutting down on signal %d\n", interrupted);
 
-            /* Unregister for every event */
-            rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, ~0ull, 0);
-            rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, START_PFN,
-                                   (xenaccess->max_gpfn - START_PFN) );
-            rc = xc_monitor_software_breakpoint(xch, domain_id, 0);
+            if ( breakpoint )
+                rc = xc_monitor_software_breakpoint(xch, domain_id, 0);
+
+            if ( altp2m )
+            {
+                uint32_t vcpu_id;
+
+                rc = xc_altp2m_switch_to_view( xch, domain_id, 0 );
+                rc = xc_altp2m_destroy_view(xch, domain_id, altp2m_view_id);
+                rc = xc_altp2m_set_domain_state(xch, domain_id, 0);
+                rc = xc_monitor_singlestep(xch, domain_id, 0);
+
+                for ( vcpu_id = 0; vcpu_id<XEN_LEGACY_MAX_VCPUS; vcpu_id++)
+                    rc = control_singlestep(xch, domain_id, vcpu_id, 0);
+
+            } else {
+                rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, ~0ull, 0);
+                rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, START_PFN,
+                                       (xenaccess->max_gpfn - START_PFN) );
+            }
 
             shutting_down = 1;
         }
@@ -500,7 +596,7 @@ int main(int argc, char *argv[])
                 }
 
                 printf("PAGE ACCESS: %c%c%c for GFN %"PRIx64" (offset %06"
-                       PRIx64") gla %016"PRIx64" (valid: %c; fault in gpt: %c; fault with gla: %c) (vcpu %u)\n",
+                       PRIx64") gla %016"PRIx64" (valid: %c; fault in gpt: %c; fault with gla: %c) (vcpu %u, altp2m view %u)\n",
                        (req.u.mem_access.flags & MEM_ACCESS_R) ? 'r' : '-',
                        (req.u.mem_access.flags & MEM_ACCESS_W) ? 'w' : '-',
                        (req.u.mem_access.flags & MEM_ACCESS_X) ? 'x' : '-',
@@ -510,9 +606,20 @@ int main(int argc, char *argv[])
                        (req.u.mem_access.flags & MEM_ACCESS_GLA_VALID) ? 'y' : 'n',
                        (req.u.mem_access.flags & MEM_ACCESS_FAULT_IN_GPT) ? 'y' : 'n',
                        (req.u.mem_access.flags & MEM_ACCESS_FAULT_WITH_GLA) ? 'y': 'n',
-                       req.vcpu_id);
+                       req.vcpu_id,
+                       req.altp2m_idx);
 
-                if ( default_access != after_first_access )
+                if ( altp2m && req.flags & VM_EVENT_FLAG_ALTERNATE_P2M)
+                {
+                    DPRINTF("\tSwitching back to default view!\n");
+
+                    rsp.reason = req.reason;
+                    rsp.flags = req.flags;
+                    rsp.altp2m_idx = 0;
+
+                    control_singlestep(xch, domain_id, rsp.vcpu_id, 1);
+                }
+                else if ( default_access != after_first_access )
                 {
                     rc = xc_set_mem_access(xch, domain_id, after_first_access,
                                            req.u.mem_access.gfn, 1);
@@ -525,7 +632,6 @@ int main(int argc, char *argv[])
                     }
                 }
 
-
                 rsp.u.mem_access.gfn = req.u.mem_access.gfn;
                 break;
             case VM_EVENT_REASON_SOFTWARE_BREAKPOINT:
@@ -546,6 +652,23 @@ int main(int argc, char *argv[])
                 }
 
                 break;
+            case VM_EVENT_REASON_SINGLESTEP:
+                printf("Singlestep: rip=%016"PRIx64", vcpu %d\n",
+                       req.regs.x86.rip,
+                       req.vcpu_id);
+
+                if ( altp2m )
+                {
+                    printf("\tSwitching altp2m to view %u!\n", altp2m_view_id);
+
+                    rsp.reason = req.reason;
+                    rsp.flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
+                    rsp.altp2m_idx = altp2m_view_id;
+                }
+
+                control_singlestep(xch, domain_id, req.vcpu_id, 0);
+
+                break;
             default:
                 fprintf(stderr, "UNKNOWN REASON CODE %d\n", req.reason);
             }
-- 
1.9.1
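
With this applied, the new modes are driven like the existing ones, e.g.
(the domain id is a placeholder):

    xen-access -m 1 altp2m_write

which enables altp2m on domain 1, builds a read-execute view covering guest
memory, and then alternates between the restricted view and view 0 around
each single-stepped write, as the event loop above shows.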

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 15/15] tools/xen-access: altp2m testcases
  2015-07-14  0:15 ` [PATCH v5 15/15] tools/xen-access: altp2m testcases Ed White
@ 2015-07-14  9:56   ` Wei Liu
  2015-07-14 11:52     ` Lengyel, Tamas
  0 siblings, 1 reply; 56+ messages in thread
From: Wei Liu @ 2015-07-14  9:56 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	xen-devel, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Mon, Jul 13, 2015 at 05:15:03PM -0700, Ed White wrote:
> From: Tamas K Lengyel <tlengyel@novetta.com>
> 
> Working altp2m test-case. Extended the test tool to support singlestepping
> to better highlight the core feature of altp2m view switching.
> 
> Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

Some nits below.

I do notice there is inconsistency in coding style, so I won't ask you
to resubmit just for fixing style. But a follow-up patch to fix up all
style problem is welcome.

> ---
>  tools/tests/xen-access/xen-access.c | 173 ++++++++++++++++++++++++++++++------
>  1 file changed, 148 insertions(+), 25 deletions(-)
> 
> diff --git a/tools/tests/xen-access/xen-access.c b/tools/tests/xen-access/xen-access.c
> index 12ab921..6b69c26 100644
> --- a/tools/tests/xen-access/xen-access.c
> +++ b/tools/tests/xen-access/xen-access.c
> @@ -275,6 +275,19 @@ xenaccess_t *xenaccess_init(xc_interface **xch_r, domid_t domain_id)
>      return NULL;
>  }
>  
> +static inline
> +int control_singlestep(
> +    xc_interface *xch,
> +    domid_t domain_id,
> +    unsigned long vcpu,
> +    bool enable)
> +{
> +    uint32_t op = enable ?
> +        XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_ON : XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_OFF;
> +
> +    return xc_domain_debug_control(xch, domain_id, op, vcpu);
> +}
> +
>  /*
>   * Note that this function is not thread safe.
>   */
> @@ -317,13 +330,15 @@ static void put_response(vm_event_t *vm_event, vm_event_response_t *rsp)
>  
>  void usage(char* progname)
>  {
> -    fprintf(stderr,
> -            "Usage: %s [-m] <domain_id> write|exec|breakpoint\n"
> +    fprintf(stderr, "Usage: %s [-m] <domain_id> write|exec", progname);
> +#if defined(__i386__) || defined(__x86_64__)
> +            fprintf(stderr, "|breakpoint|altp2m_write|altp2m_exec");
> +#endif
> +            fprintf(stderr,
>              "\n"
>              "Logs first page writes, execs, or breakpoint traps that occur on the domain.\n"
>              "\n"
> -            "-m requires this program to run, or else the domain may pause\n",
> -            progname);
> +            "-m requires this program to run, or else the domain may pause\n");

Indentation looks wrong, but this is not your fault so don't worry
about this.

>  }
>  
>  int main(int argc, char *argv[])
> @@ -341,6 +356,8 @@ int main(int argc, char *argv[])
>      int required = 0;
>      int breakpoint = 0;
>      int shutting_down = 0;
> +    int altp2m = 0;
> +    uint16_t altp2m_view_id = 0;
>  
>      char* progname = argv[0];
>      argv++;
> @@ -379,10 +396,22 @@ int main(int argc, char *argv[])
>          default_access = XENMEM_access_rw;
>          after_first_access = XENMEM_access_rwx;
>      }
> +#if defined(__i386__) || defined(__x86_64__)
>      else if ( !strcmp(argv[0], "breakpoint") )
>      {
>          breakpoint = 1;
>      }
> +    else if ( !strcmp(argv[0], "altp2m_write") )
> +    {
> +        default_access = XENMEM_access_rx;
> +        altp2m = 1;
> +    }
> +    else if ( !strcmp(argv[0], "altp2m_exec") )
> +    {
> +        default_access = XENMEM_access_rw;
> +        altp2m = 1;
> +    }
> +#endif
>      else
>      {
>          usage(argv[0]);
> @@ -415,22 +444,73 @@ int main(int argc, char *argv[])
>          goto exit;
>      }
>  
> -    /* Set the default access type and convert all pages to it */
> -    rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
> -    if ( rc < 0 )
> +    /* With altp2m we just create a new, restricted view of the memory */
> +    if ( altp2m )
>      {
> -        ERROR("Error %d setting default mem access type\n", rc);
> -        goto exit;
> -    }
> +        xen_pfn_t gfn = 0;
> +        unsigned long perm_set = 0;
> +
> +        rc = xc_altp2m_set_domain_state( xch, domain_id, 1 );

Extraneous spaces in brackets.

> +        if ( rc < 0 )
> +        {
> +            ERROR("Error %d enabling altp2m on domain!\n", rc);
> +            goto exit;
> +        }
> +
> +        rc = xc_altp2m_create_view( xch, domain_id, default_access, &altp2m_view_id );

Extraneous spaces and line too long.

> +        if ( rc < 0 )
> +        {
> +            ERROR("Error %d creating altp2m view!\n", rc);
> +            goto exit;
> +        }
>  
> -    rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
> -                           (xenaccess->max_gpfn - START_PFN) );
> +        DPRINTF("altp2m view created with id %u\n", altp2m_view_id);
> +        DPRINTF("Setting altp2m mem_access permissions.. ");
>  
> -    if ( rc < 0 )
> +        for(; gfn < xenaccess->max_gpfn; ++gfn)
> +        {
> +            rc = xc_altp2m_set_mem_access( xch, domain_id, altp2m_view_id, gfn,
> +                                           default_access);

Space.

> +            if ( !rc )
> +                perm_set++;
> +        }
> +
> +        DPRINTF("done! Permissions set on %lu pages.\n", perm_set);
> +
> +        rc = xc_altp2m_switch_to_view( xch, domain_id, altp2m_view_id );

Ditto.

The rest of the code has similar issues too. I won't point them out one by
one. Again, don't worry about this now.

Wei.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 15/15] tools/xen-access: altp2m testcases
  2015-07-14  9:56   ` Wei Liu
@ 2015-07-14 11:52     ` Lengyel, Tamas
  0 siblings, 0 replies; 56+ messages in thread
From: Lengyel, Tamas @ 2015-07-14 11:52 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ravi Sahita, George Dunlap, Ian Jackson, Tim Deegan, Ed White,
	Xen-devel, Jan Beulich, Andrew Cooper, Daniel De Graaf


On Tue, Jul 14, 2015 at 5:56 AM, Wei Liu <wei.liu2@citrix.com> wrote:

> On Mon, Jul 13, 2015 at 05:15:03PM -0700, Ed White wrote:
> > From: Tamas K Lengyel <tlengyel@novetta.com>
> >
> > Working altp2m test-case. Extended the test tool to support
> singlestepping
> > to better highlight the core feature of altp2m view switching.
> >
> > Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
> > Signed-off-by: Ed White <edmund.h.white@intel.com>
> >
> > Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
>
> Acked-by: Wei Liu <wei.liu2@citrix.com>
>
> Some nits below.
>
> I do notice there is inconsistency in coding style, so I won't ask you
> to resubmit just for fixing style. But a follow-up patch to fix up all
> style problems is welcome.
>
> > ---
> >  tools/tests/xen-access/xen-access.c | 173
> ++++++++++++++++++++++++++++++------
> >  1 file changed, 148 insertions(+), 25 deletions(-)
> >
> > diff --git a/tools/tests/xen-access/xen-access.c
> b/tools/tests/xen-access/xen-access.c
> > index 12ab921..6b69c26 100644
> > --- a/tools/tests/xen-access/xen-access.c
> > +++ b/tools/tests/xen-access/xen-access.c
> > @@ -275,6 +275,19 @@ xenaccess_t *xenaccess_init(xc_interface **xch_r,
> domid_t domain_id)
> >      return NULL;
> >  }
> >
> > +static inline
> > +int control_singlestep(
> > +    xc_interface *xch,
> > +    domid_t domain_id,
> > +    unsigned long vcpu,
> > +    bool enable)
> > +{
> > +    uint32_t op = enable ?
> > +        XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_ON :
> XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_OFF;
> > +
> > +    return xc_domain_debug_control(xch, domain_id, op, vcpu);
> > +}
> > +
> >  /*
> >   * Note that this function is not thread safe.
> >   */
> > @@ -317,13 +330,15 @@ static void put_response(vm_event_t *vm_event,
> vm_event_response_t *rsp)
> >
> >  void usage(char* progname)
> >  {
> > -    fprintf(stderr,
> > -            "Usage: %s [-m] <domain_id> write|exec|breakpoint\n"
> > +    fprintf(stderr, "Usage: %s [-m] <domain_id> write|exec", progname);
> > +#if defined(__i386__) || defined(__x86_64__)
> > +            fprintf(stderr, "|breakpoint|altp2m_write|altp2m_exec");
> > +#endif
> > +            fprintf(stderr,
> >              "\n"
> >              "Logs first page writes, execs, or breakpoint traps that
> occur on the domain.\n"
> >              "\n"
> > -            "-m requires this program to run, or else the domain may
> pause\n",
> > -            progname);
> > +            "-m requires this program to run, or else the domain may
> pause\n");
>
> Indentation looks wrong, but this is not your fault so don't worry
> about this.
>
> >  }
> >
> >  int main(int argc, char *argv[])
> > @@ -341,6 +356,8 @@ int main(int argc, char *argv[])
> >      int required = 0;
> >      int breakpoint = 0;
> >      int shutting_down = 0;
> > +    int altp2m = 0;
> > +    uint16_t altp2m_view_id = 0;
> >
> >      char* progname = argv[0];
> >      argv++;
> > @@ -379,10 +396,22 @@ int main(int argc, char *argv[])
> >          default_access = XENMEM_access_rw;
> >          after_first_access = XENMEM_access_rwx;
> >      }
> > +#if defined(__i386__) || defined(__x86_64__)
> >      else if ( !strcmp(argv[0], "breakpoint") )
> >      {
> >          breakpoint = 1;
> >      }
> > +    else if ( !strcmp(argv[0], "altp2m_write") )
> > +    {
> > +        default_access = XENMEM_access_rx;
> > +        altp2m = 1;
> > +    }
> > +    else if ( !strcmp(argv[0], "altp2m_exec") )
> > +    {
> > +        default_access = XENMEM_access_rw;
> > +        altp2m = 1;
> > +    }
> > +#endif
> >      else
> >      {
> >          usage(argv[0]);
> > @@ -415,22 +444,73 @@ int main(int argc, char *argv[])
> >          goto exit;
> >      }
> >
> > -    /* Set the default access type and convert all pages to it */
> > -    rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
> > -    if ( rc < 0 )
> > +    /* With altp2m we just create a new, restricted view of the memory */
> > +    if ( altp2m )
> >      {
> > -        ERROR("Error %d setting default mem access type\n", rc);
> > -        goto exit;
> > -    }
> > +        xen_pfn_t gfn = 0;
> > +        unsigned long perm_set = 0;
> > +
> > +        rc = xc_altp2m_set_domain_state( xch, domain_id, 1 );
>
> Extraneous spaces in brackets.
>
> > +        if ( rc < 0 )
> > +        {
> > +            ERROR("Error %d enabling altp2m on domain!\n", rc);
> > +            goto exit;
> > +        }
> > +
> > +        rc = xc_altp2m_create_view( xch, domain_id, default_access, &altp2m_view_id );
>
> Extraneous spaces and line too long.
>
> > +        if ( rc < 0 )
> > +        {
> > +            ERROR("Error %d creating altp2m view!\n", rc);
> > +            goto exit;
> > +        }
> >
> > -    rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
> > -                           (xenaccess->max_gpfn - START_PFN) );
> > +        DPRINTF("altp2m view created with id %u\n", altp2m_view_id);
> > +        DPRINTF("Setting altp2m mem_access permissions.. ");
> >
> > -    if ( rc < 0 )
> > +        for(; gfn < xenaccess->max_gpfn; ++gfn)
> > +        {
> > +            rc = xc_altp2m_set_mem_access( xch, domain_id, altp2m_view_id, gfn,
> > +                                           default_access);
>
> Space.
>
> > +            if ( !rc )
> > +                perm_set++;
> > +        }
> > +
> > +        DPRINTF("done! Permissions set on %lu pages.\n", perm_set);
> > +
> > +        rc = xc_altp2m_switch_to_view( xch, domain_id, altp2m_view_id );
>
> Ditto.
>
> The rest of code has similar issues too. I won't point them out one by
> one. Again, don't worry about this now.
>
> Wei.
>

Thanks Wei! I'll submit a patch to fix the style issues after (and if) this
gets merged!

Tamas


* Re: [PATCH v5 03/15] VMX: implement suppress #VE.
  2015-07-14  0:14 ` [PATCH v5 03/15] VMX: implement suppress #VE Ed White
@ 2015-07-14 12:46   ` Jan Beulich
  2015-07-14 13:47   ` George Dunlap
  1 sibling, 0 replies; 56+ messages in thread
From: Jan Beulich @ 2015-07-14 12:46 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
> In preparation for selectively enabling #VE in a later patch, set
> suppress #VE on all EPTE's.
> 
> Suppress #VE should always be the default condition for two reasons:
> it is generally not safe to deliver #VE into a guest unless that guest
> has been modified to receive it; and even then for most EPT violations only
> the hypervisor is able to handle the violation.
> 
> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
> ---

Missing revision log here. And I think the dropping of the adjustment
to ept_p2m_init() invalidates reviews as well as Jun's ack - just
consider if my request to do so was wrong for some reason, and
neither of them saw me asking for this.

Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-14  0:14 ` [PATCH v5 05/15] x86/altp2m: basic data structures and support routines Ed White
@ 2015-07-14 13:13   ` Jan Beulich
  2015-07-14 14:45     ` George Dunlap
  2015-07-16  8:57     ` Sahita, Ravi
  2015-07-14 15:57   ` George Dunlap
  1 sibling, 2 replies; 56+ messages in thread
From: Jan Beulich @ 2015-07-14 13:13 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
> +void
> +altp2m_vcpu_initialise(struct vcpu *v)
> +{
> +    if ( v != current )
> +        vcpu_pause(v);
> +
> +    altp2m_vcpu_reset(v);
> +    vcpu_altp2m(v).p2midx = 0;
> +    atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
> +
> +    altp2m_vcpu_update_eptp(v);
> +
> +    if ( v != current )
> +        vcpu_unpause(v);
> +}
> +
> +void
> +altp2m_vcpu_destroy(struct vcpu *v)
> +{
> +    struct p2m_domain *p2m;
> +
> +    if ( v != current )
> +        vcpu_pause(v);
> +
> +    if ( (p2m = p2m_get_altp2m(v)) )
> +        atomic_dec(&p2m->active_vcpus);
> +
> +    altp2m_vcpu_reset(v);
> +
> +    altp2m_vcpu_update_eptp(v);
> +    altp2m_vcpu_update_vmfunc_ve(v);
> +
> +    if ( v != current )
> +        vcpu_unpause(v);
> +}

There not being any caller of altp2m_vcpu_initialise() I can't judge
about its pausing requirements, but for the destroy case I can't
see what the pausing is good for. Considering its sole user it's also
not really clear why the two update operations need to be done
while destroying.

> @@ -6498,6 +6500,25 @@ enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v)
>      return hvm_funcs.nhvm_intr_blocked(v);
>  }
>  
> +void altp2m_vcpu_update_eptp(struct vcpu *v)
> +{
> +    if ( hvm_funcs.altp2m_vcpu_update_eptp )
> +        hvm_funcs.altp2m_vcpu_update_eptp(v);
> +}
> +
> +void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v)
> +{
> +    if ( hvm_funcs.altp2m_vcpu_update_vmfunc_ve )
> +        hvm_funcs.altp2m_vcpu_update_vmfunc_ve(v);
> +}
> +
> +bool_t altp2m_vcpu_emulate_ve(struct vcpu *v)
> +{
> +    if ( hvm_funcs.altp2m_vcpu_emulate_ve )
> +        return hvm_funcs.altp2m_vcpu_emulate_ve(v);
> +    return 0;
> +}

The patch context here suggests that you're using a pretty outdated
tree - nhvm_interrupt_blocked() went away a week ago. In line
with the commit doing so, all of the above should become inline
functions.

> +uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
> +{
> +    struct p2m_domain *p2m;
> +    struct ept_data *ept;
> +    uint16_t i;
> +
> +    altp2m_list_lock(d);
> +
> +    for ( i = 0; i < MAX_ALTP2M; i++ )
> +    {
> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
> +            continue;
> +
> +        p2m = d->arch.altp2m_p2m[i];
> +        ept = &p2m->ept;
> +
> +        if ( eptp == ept_get_eptp(ept) )
> +            goto out;
> +    }
> +
> +    i = INVALID_ALTP2M;
> +
> +out:

Labels should be indented by at least one space.
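
I.e. presumably (purely to illustrate the expected indentation):

 out:
    altp2m_list_unlock(d);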

Pending the rename, the function may also live in the wrong file (the
use of ept_get_eptp() suggests so even if the function itself got
renamed).

> +bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx)
> +{
> +    struct domain *d = v->domain;
> +    bool_t rc = 0;
> +
> +    if ( idx > MAX_ALTP2M )
> +        return rc;
> +
> +    altp2m_list_lock(d);
> +
> +    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
> +    {
> +        if ( idx != vcpu_altp2m(v).p2midx )
> +        {
> +            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
> +            vcpu_altp2m(v).p2midx = idx;
> +            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);

Are the two results of p2m_get_altp2m(v) here guaranteed to be
distinct? If they aren't, is it safe to decrement first (potentially
dropping the count to zero)?

> @@ -722,6 +731,27 @@ void nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
>      l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
>  
>  /*
> + * Alternate p2m: shadow p2m tables used for alternate memory views
> + */
> +
> +/* get current alternate p2m table */
> +static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    uint16_t index = vcpu_altp2m(v).p2midx;
> +
> +    ASSERT(index < MAX_ALTP2M);
> +
> +    return (index == INVALID_ALTP2M) ? NULL : d->arch.altp2m_p2m[index];
> +}

Looking at this again, I'm afraid I'd still prefer index < MAX_ALTP2M
in the return statement (and the ASSERT() dropped): The ASSERT()
does nothing in a debug=n build, and hence wouldn't shield us from
possibly having to issue an XSA if somehow an access outside the
array's bounds turned out possible.

I've also avoided repeating any of the un-addressed points that I
raised against earlier versions.

Jan


* Re: [PATCH v5 03/15] VMX: implement suppress #VE.
  2015-07-14  0:14 ` [PATCH v5 03/15] VMX: implement suppress #VE Ed White
  2015-07-14 12:46   ` Jan Beulich
@ 2015-07-14 13:47   ` George Dunlap
  1 sibling, 0 replies; 56+ messages in thread
From: George Dunlap @ 2015-07-14 13:47 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Tim Deegan, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Tue, Jul 14, 2015 at 1:14 AM, Ed White <edmund.h.white@intel.com> wrote:
> In preparation for selectively enabling #VE in a later patch, set
> suppress #VE on all EPTE's.
>
> Suppress #VE should always be the default condition for two reasons:
> it is generally not safe to deliver #VE into a guest unless that guest
> has been modified to receive it; and even then for most EPT violations only
> the hypervisor is able to handle the violation.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>

As Jan says, you should have dropped the reviewed-by's here.

Acked-by: George Dunlap <george.dunlap@eu.citrix.com>


* Re: [PATCH v5 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-14  0:14 ` [PATCH v5 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
@ 2015-07-14 13:57   ` Jan Beulich
  2015-07-16  9:20     ` Sahita, Ravi
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2015-07-14 13:57 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
> Implement and hook up the code to enable VMX support of VMFUNC and #VE.
> 
> VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.
> 
> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
> ---
>  xen/arch/x86/hvm/vmx/vmx.c | 138 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 138 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 07527dd..38dba6b 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -56,6 +56,7 @@
>  #include <asm/debugger.h>
>  #include <asm/apic.h>
>  #include <asm/hvm/nestedhvm.h>
> +#include <asm/hvm/altp2m.h>
>  #include <asm/event.h>
>  #include <asm/monitor.h>
>  #include <public/arch-x86/cpuid.h>
> @@ -1763,6 +1764,104 @@ static void vmx_enable_msr_exit_interception(struct domain *d)
>                                           MSR_TYPE_W);
>  }
>  
> +static void vmx_vcpu_update_eptp(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    struct p2m_domain *p2m = NULL;
> +    struct ept_data *ept;
> +
> +    if ( altp2m_active(d) )
> +        p2m = p2m_get_altp2m(v);
> +    if ( !p2m )
> +        p2m = p2m_get_hostp2m(d);
> +
> +    ept = &p2m->ept;
> +    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
> +
> +    vmx_vmcs_enter(v);
> +
> +    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
> +
> +    if ( v->arch.hvm_vmx.secondary_exec_control &
> +        SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
> +        __vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);
> +
> +    vmx_vmcs_exit(v);
> +}
> +
> +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
> +
> +    if ( !cpu_has_vmx_vmfunc )
> +        return;
> +
> +    if ( cpu_has_vmx_virt_exceptions )
> +        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
> +
> +    vmx_vmcs_enter(v);
> +
> +    if ( !d->is_dying && altp2m_active(d) )
> +    {
> +        v->arch.hvm_vmx.secondary_exec_control |= mask;
> +        __vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
> +        __vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
> +
> +        if ( cpu_has_vmx_virt_exceptions )
> +        {
> +            p2m_type_t t;
> +            mfn_t mfn;
> +
> +            mfn = get_gfn_query_unlocked(d, gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
> +
> +            if ( mfn_x(mfn) != INVALID_MFN )
> +                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
> +            else
> +                mask &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;

Considering the rest of the function, this is dead code.

> +        }
> +    }
> +    else
> +        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
> +
> +    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> +        v->arch.hvm_vmx.secondary_exec_control);
> +
> +    vmx_vmcs_exit(v);
> +}
> +
> +static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
> +{
> +    bool_t rc = 0;
> +    ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) != INVALID_GFN ?
> +        hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn), 0) : NULL;
> +
> +    if ( !veinfo )
> +        return 0;
> +
> +    if ( veinfo->semaphore != 0 )
> +        goto out;
> +
> +    rc = 1;
> +
> +    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
> +    veinfo->semaphore = ~0l;

Isn't semaphore a 32-bit quantity?
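
(If it is, presumably - a sketch - the store would want to be

    veinfo->semaphore = ~0u;

to match the field's width.)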

> +    veinfo->eptp_index = vcpu_altp2m(v).p2midx;
> +
> +    vmx_vmcs_enter(v);
> +    __vmread(EXIT_QUALIFICATION, &veinfo->exit_qualification);
> +    __vmread(GUEST_LINEAR_ADDRESS, &veinfo->gla);
> +    __vmread(GUEST_PHYSICAL_ADDRESS, &veinfo->gpa);
> +    vmx_vmcs_exit(v);
> +
> +    hvm_inject_hw_exception(TRAP_virtualisation,
> +                            HVM_DELIVER_NO_ERROR_CODE);
> +
> +out:

Un-indented label again.

> @@ -2744,6 +2846,42 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>      /* Now enable interrupts so it's safe to take locks. */
>      local_irq_enable();
>  
> +    /*
> +     * If the guest has the ability to switch EPTP without an exit,
> +     * figure out whether it has done so and update the altp2m data.
> +     */
> +    if ( altp2m_active(v->domain) &&
> +        (v->arch.hvm_vmx.secondary_exec_control &
> +        SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )

Indentation.

> +    {
> +        unsigned long idx;
> +
> +        if ( v->arch.hvm_vmx.secondary_exec_control &
> +            SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
> +            __vmread(EPTP_INDEX, &idx);
> +        else
> +        {
> +            unsigned long eptp;
> +
> +            __vmread(EPT_POINTER, &eptp);
> +
> +            if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
> +                 INVALID_ALTP2M )
> +            {
> +                gdprintk(XENLOG_ERR, "EPTP not found in alternate p2m list\n");
> +                domain_crash(v->domain);
> +            }
> +        }
> +
> +        if ( (uint16_t)idx != vcpu_altp2m(v).p2midx )

Is this cast really necessary?

> +        {
> +            BUG_ON(idx >= MAX_ALTP2M);
> +            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
> +            vcpu_altp2m(v).p2midx = (uint16_t)idx;

This one surely isn't (or else the field type is wrong).

Jan


* Re: [PATCH v5 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-14  0:14 ` [PATCH v5 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
@ 2015-07-14 14:04   ` Jan Beulich
  2015-07-14 17:56     ` Sahita, Ravi
  2015-07-17 22:41     ` Sahita, Ravi
  0 siblings, 2 replies; 56+ messages in thread
From: Jan Beulich @ 2015-07-14 14:04 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1436,6 +1436,19 @@ static int hvmemul_invlpg(
>      return rc;
>  }
>  
> +static int hvmemul_vmfunc(
> +    struct x86_emulate_ctxt *ctxt)
> +{
> +    int rc;
> +
> +    rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
> +    if ( rc != X86EMUL_OKAY )
> +    {
> +        hvmemul_inject_hw_exception(TRAP_invalid_op, 0, ctxt);
> +    }
> +    return rc;

Pointless braces and missing blank line before final return.
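
I.e. (sketch):

    rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
    if ( rc != X86EMUL_OKAY )
        hvmemul_inject_hw_exception(TRAP_invalid_op, 0, ctxt);

    return rc;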

> @@ -1830,6 +1831,19 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
>      vmx_vmcs_exit(v);
>  }
>  
> +static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
> +{
> +    int rc = X86EMUL_EXCEPTION;
> +    struct vcpu *curr = current;
> +
> +    if ( !cpu_has_vmx_vmfunc && altp2m_active(curr->domain) &&
> +         regs->eax == 0 &&
> +         p2m_switch_vcpu_altp2m_by_id(curr, (uint16_t)regs->ecx) )

Documentation suggests that the upper 32 bits of RAX are being
ignored, and that all 32 bits of ECX are being used.
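
I.e. the index shouldn't be truncated to 16 bits - a sketch (with
p2m_switch_vcpu_altp2m_by_id() then checking the full 32-bit value):

    if ( !cpu_has_vmx_vmfunc && altp2m_active(curr->domain) &&
         regs->eax == 0 &&
         p2m_switch_vcpu_altp2m_by_id(curr, regs->ecx) )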

> @@ -3234,6 +3263,15 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>              update_guest_eip();
>          break;
>  
> +    case EXIT_REASON_VMFUNC:
> +        if ( (vmx_vmfunc_intercept(regs) == X86EMUL_EXCEPTION) ||
> +             (vmx_vmfunc_intercept(regs) == X86EMUL_UNHANDLEABLE) ||
> +             (vmx_vmfunc_intercept(regs) == X86EMUL_RETRY) )

Why would you want to invoke the function 3 times? How about
simply != X86EMUL_OKAY?
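
I.e. (sketch; keeping whatever failure handling your current body has):

    case EXIT_REASON_VMFUNC:
        /* Invoke the intercept once and test its single result. */
        if ( vmx_vmfunc_intercept(regs) != X86EMUL_OKAY )
            /* ... existing failure handling ... */;
        else
            update_guest_eip();
        break;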

Jan


* Re: [PATCH v5 09/15] x86/altp2m: alternate p2m memory events.
  2015-07-14  0:14 ` [PATCH v5 09/15] x86/altp2m: alternate p2m memory events Ed White
@ 2015-07-14 14:08   ` Jan Beulich
  2015-07-16  9:22     ` Sahita, Ravi
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2015-07-14 14:08 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
> --- a/xen/include/public/vm_event.h
> +++ b/xen/include/public/vm_event.h
> @@ -47,6 +47,16 @@
>  #define VM_EVENT_FLAG_VCPU_PAUSED     (1 << 0)
>  /* Flags to aid debugging mem_event */
>  #define VM_EVENT_FLAG_FOREIGN         (1 << 1)
> +/*
> + * This flag can be set in a request or a response
> + *
> + * On a request, indicates that the event occurred in the alternate p2m specified by
> + * the altp2m_idx request field.
> + *
> + * On a response, indicates that the VCPU should resume in the alternate p2m specified
> + * by the altp2m_idx response field if possible.
> + */
> +#define VM_EVENT_FLAG_ALTERNATE_P2M   (1 << 5)

So I suppose you use 5 here because of what went into staging
recently. But the patch context doesn't reflect this, i.e. the patch
is inconsistent at this point (and won't apply).

Jan


* Re: [PATCH v5 10/15] x86/altp2m: add remaining support routines.
  2015-07-14  0:14 ` [PATCH v5 10/15] x86/altp2m: add remaining support routines Ed White
@ 2015-07-14 14:31   ` Jan Beulich
  2015-07-16  9:16     ` Sahita, Ravi
  2015-07-16 14:44   ` George Dunlap
  1 sibling, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2015-07-14 14:31 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2802,10 +2802,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>      mfn_t mfn;
>      struct vcpu *curr = current;
>      struct domain *currd = curr->domain;
> -    struct p2m_domain *p2m;
> +    struct p2m_domain *p2m, *hostp2m;
>      int rc, fall_through = 0, paged = 0;
>      int sharing_enomem = 0;
>      vm_event_request_t *req_ptr = NULL;
> +    bool_t ap2m_active = 0;

Pointless initializer afaict.

> @@ -2865,11 +2866,31 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>          goto out;
>      }
>  
> -    p2m = p2m_get_hostp2m(currd);
> -    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 
> +    ap2m_active = altp2m_active(currd);
> +
> +    /* Take a lock on the host p2m speculatively, to avoid potential
> +     * locking order problems later and to handle unshare etc.
> +     */

Comment style.
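
For reference, the expected form is:

    /*
     * Take a lock on the host p2m speculatively, to avoid potential
     * locking order problems later and to handle unshare etc.
     */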

> @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>          if ( npfec.write_access )
>          {
>              paging_mark_dirty(currd, mfn_x(mfn));
> +            /* If p2m is really an altp2m, unlock here to avoid lock ordering
> +             * violation when the change below is propagated from host p2m */
> +            if ( ap2m_active )
> +                __put_gfn(p2m, gfn);
>              p2m_change_type_one(currd, gfn, p2m_ram_logdirty, p2m_ram_rw);

And this won't result in any races?

Also - comment style again (and more elsewhere).

> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -2037,6 +2037,391 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx)
>      return rc;
>  }
>  
> +void p2m_flush_altp2m(struct domain *d)
> +{
> +    uint16_t i;
> +
> +    altp2m_list_lock(d);
> +
> +    for ( i = 0; i < MAX_ALTP2M; i++ )
> +    {
> +        p2m_flush_table(d->arch.altp2m_p2m[i]);
> +        /* Uninit and reinit ept to force TLB shootdown */
> +        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
> +        ept_p2m_init(d->arch.altp2m_p2m[i]);

ept_... in non-EPT code again.

> +        d->arch.altp2m_eptp[i] = INVALID_MFN;
> +    }
> +
> +    altp2m_list_unlock(d);
> +}
> +
> +static void p2m_init_altp2m_helper(struct domain *d, uint16_t i)
> +{
> +    struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
> +    struct ept_data *ept;
> +
> +    p2m->min_remapped_gfn = INVALID_GFN;
> +    p2m->max_remapped_gfn = INVALID_GFN;
> +    ept = &p2m->ept;
> +    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
> +    d->arch.altp2m_eptp[i] = ept_get_eptp(ept);

Same here.

> +long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx)
> +{
> +    long rc = -EINVAL;

Why long (for both variable and function return type)? (More of
these in functions below.)

> +long p2m_init_next_altp2m(struct domain *d, uint16_t *idx)
> +{
> +    long rc = -EINVAL;
> +    uint16_t i;

As in the earlier patch(es) - unsigned int.

> +long p2m_change_altp2m_gfn(struct domain *d, uint16_t idx,
> +                             gfn_t old_gfn, gfn_t new_gfn)
> +{
> +    struct p2m_domain *hp2m, *ap2m;
> +    p2m_access_t a;
> +    p2m_type_t t;
> +    mfn_t mfn;
> +    unsigned int page_order;
> +    long rc = -EINVAL;
> +
> +    if ( idx > MAX_ALTP2M || d->arch.altp2m_eptp[idx] == INVALID_MFN )
> +        return rc;
> +
> +    hp2m = p2m_get_hostp2m(d);
> +    ap2m = d->arch.altp2m_p2m[idx];
> +
> +    p2m_lock(ap2m);
> +
> +    mfn = ap2m->get_entry(ap2m, gfn_x(old_gfn), &t, &a, 0, NULL, NULL);
> +
> +    if ( gfn_x(new_gfn) == INVALID_GFN )
> +    {
> +        if ( mfn_valid(mfn) )
> +            p2m_remove_page(ap2m, gfn_x(old_gfn), mfn_x(mfn), PAGE_ORDER_4K);
> +        rc = 0;
> +        goto out;
> +    }
> +
> +    /* Check host p2m if no valid entry in alternate */
> +    if ( !mfn_valid(mfn) )
> +    {
> +        mfn = hp2m->get_entry(hp2m, gfn_x(old_gfn), &t, &a,
> +                              P2M_ALLOC | P2M_UNSHARE, &page_order, NULL);
> +
> +        if ( !mfn_valid(mfn) || t != p2m_ram_rw )
> +            goto out;
> +
> +        /* If this is a superpage, copy that first */
> +        if ( page_order != PAGE_ORDER_4K )
> +        {
> +            gfn_t gfn;
> +            unsigned long mask;
> +
> +            mask = ~((1UL << page_order) - 1);
> +            gfn = _gfn(gfn_x(old_gfn) & mask);
> +            mfn = _mfn(mfn_x(mfn) & mask);
> +
> +            if ( ap2m->set_entry(ap2m, gfn_x(gfn), mfn, page_order, t, a, 1) )
> +                goto out;
> +        }
> +    }
> +
> +    mfn = ap2m->get_entry(ap2m, gfn_x(new_gfn), &t, &a, 0, NULL, NULL);
> +
> +    if ( !mfn_valid(mfn) )
> +        mfn = hp2m->get_entry(hp2m, gfn_x(new_gfn), &t, &a, 0, NULL, NULL);
> +
> +    if ( !mfn_valid(mfn) || (t != p2m_ram_rw) )
> +        goto out;
> +
> +    if ( !ap2m->set_entry(ap2m, gfn_x(old_gfn), mfn, PAGE_ORDER_4K, t, a,
> +                          (current->domain != d)) )
> +    {
> +        rc = 0;
> +
> +        if ( ap2m->min_remapped_gfn == INVALID_GFN ||
> +             gfn_x(new_gfn) < ap2m->min_remapped_gfn )
> +            ap2m->min_remapped_gfn = gfn_x(new_gfn);
> +        if ( ap2m->max_remapped_gfn == INVALID_GFN ||
> +             gfn_x(new_gfn) > ap2m->max_remapped_gfn )
> +            ap2m->max_remapped_gfn = gfn_x(new_gfn);

For the purpose here (and without conflict with the consumer side)
it would seem to be better to initialize max_remapped_gfn to zero,
as then both if() can get away with just one comparison.

> +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
> +                                 mfn_t mfn, unsigned int page_order,
> +                                 p2m_type_t p2mt, p2m_access_t p2ma)
> +{
> +    struct p2m_domain *p2m;
> +    p2m_access_t a;
> +    p2m_type_t t;
> +    mfn_t m;
> +    uint16_t i;
> +    bool_t reset_p2m;
> +    unsigned int reset_count = 0;
> +    uint16_t last_reset_idx = ~0;
> +
> +    if ( !altp2m_active(d) )
> +        return;
> +
> +    altp2m_list_lock(d);
> +
> +    for ( i = 0; i < MAX_ALTP2M; i++ )
> +    {
> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
> +            continue;
> +
> +        p2m = d->arch.altp2m_p2m[i];
> +        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
> +
> +        reset_p2m = 0;
> +
> +        /* Check for a dropped page that may impact this altp2m */
> +        if ( mfn_x(mfn) == INVALID_MFN &&
> +             gfn_x(gfn) >= p2m->min_remapped_gfn &&
> +             gfn_x(gfn) <= p2m->max_remapped_gfn )
> +            reset_p2m = 1;

Considering that this looks like an optimization, what's the downside
of possibly having min=0 and max=<end-of-address-space>? I.e. can this
result in a long latency operation which a guest is able to effect?

Jan


* Re: [PATCH v5 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-14  0:14 ` [PATCH v5 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
@ 2015-07-14 14:36   ` Jan Beulich
  2015-07-16  9:02     ` Sahita, Ravi
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2015-07-14 14:36 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
> Signed-off-by: Ed White <edmund.h.white@intel.com>
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -6443,6 +6443,148 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
>          break;
>      }
>  
> +    case HVMOP_altp2m:
> +    {
> +        struct xen_hvm_altp2m_op a;
> +        struct domain *d = NULL;
> +
> +        if ( copy_from_guest(&a, arg, 1) )
> +            return -EFAULT;
> +
> +        if ( a.pad[0] || a.pad[1] )
> +            return -EINVAL;

Why can't the field be uint16_t, making this a single check?
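
I.e. (sketch; assuming the two pad bytes get folded into one field):

    uint16_t pad;
    ...
    if ( a.pad )
        return -EINVAL;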

Jan


* Re: [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-14 13:13   ` Jan Beulich
@ 2015-07-14 14:45     ` George Dunlap
  2015-07-14 14:58       ` Jan Beulich
  2015-07-16  8:57     ` Sahita, Ravi
  1 sibling, 1 reply; 56+ messages in thread
From: George Dunlap @ 2015-07-14 14:45 UTC (permalink / raw)
  To: Jan Beulich, Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, Andrew Cooper, Ian Jackson,
	xen-devel, tlengyel, Daniel De Graaf

On 07/14/2015 02:13 PM, Jan Beulich wrote:
>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>> +void
>> +altp2m_vcpu_initialise(struct vcpu *v)
>> +{
>> +    if ( v != current )
>> +        vcpu_pause(v);
>> +
>> +    altp2m_vcpu_reset(v);
>> +    vcpu_altp2m(v).p2midx = 0;
>> +    atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
>> +
>> +    altp2m_vcpu_update_eptp(v);
>> +
>> +    if ( v != current )
>> +        vcpu_unpause(v);
>> +}
>> +
>> +void
>> +altp2m_vcpu_destroy(struct vcpu *v)
>> +{
>> +    struct p2m_domain *p2m;
>> +
>> +    if ( v != current )
>> +        vcpu_pause(v);
>> +
>> +    if ( (p2m = p2m_get_altp2m(v)) )
>> +        atomic_dec(&p2m->active_vcpus);
>> +
>> +    altp2m_vcpu_reset(v);
>> +
>> +    altp2m_vcpu_update_eptp(v);
>> +    altp2m_vcpu_update_vmfunc_ve(v);
>> +
>> +    if ( v != current )
>> +        vcpu_unpause(v);
>> +}
> 
> There not being any caller of altp2m_vcpu_initialise() I can't judge
> about its pausing requirements, but for the destroy case I can't
> see what the pausing is good for. Considering its sole user it's also
> not really clear why the two update operations need to be done
> while destroying.

So looking at this after all the patches have been applied, it looks
like initialise() and destroy() are called from
altp2m_set_domain_state(), which the guest uses to enable or disable the
altp2m functionality.  In that case, it seems like pausing is probably
appropriate.

 -George


* Re: [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-14 14:45     ` George Dunlap
@ 2015-07-14 14:58       ` Jan Beulich
  0 siblings, 0 replies; 56+ messages in thread
From: Jan Beulich @ 2015-07-14 14:58 UTC (permalink / raw)
  To: George Dunlap, Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, Andrew Cooper, Ian Jackson,
	xen-devel, tlengyel, Daniel De Graaf

>>> On 14.07.15 at 16:45, <george.dunlap@eu.citrix.com> wrote:
> On 07/14/2015 02:13 PM, Jan Beulich wrote:
>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>> +void
>>> +altp2m_vcpu_initialise(struct vcpu *v)
>>> +{
>>> +    if ( v != current )
>>> +        vcpu_pause(v);
>>> +
>>> +    altp2m_vcpu_reset(v);
>>> +    vcpu_altp2m(v).p2midx = 0;
>>> +    atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
>>> +
>>> +    altp2m_vcpu_update_eptp(v);
>>> +
>>> +    if ( v != current )
>>> +        vcpu_unpause(v);
>>> +}
>>> +
>>> +void
>>> +altp2m_vcpu_destroy(struct vcpu *v)
>>> +{
>>> +    struct p2m_domain *p2m;
>>> +
>>> +    if ( v != current )
>>> +        vcpu_pause(v);
>>> +
>>> +    if ( (p2m = p2m_get_altp2m(v)) )
>>> +        atomic_dec(&p2m->active_vcpus);
>>> +
>>> +    altp2m_vcpu_reset(v);
>>> +
>>> +    altp2m_vcpu_update_eptp(v);
>>> +    altp2m_vcpu_update_vmfunc_ve(v);
>>> +
>>> +    if ( v != current )
>>> +        vcpu_unpause(v);
>>> +}
>> 
>> There not being any caller of altp2m_vcpu_initialise() I can't judge
>> about its pausing requirements, but for the destroy case I can't
>> see what the pausing is good for. Considering its sole user it's also
>> not really clear why the two update operations need to be done
>> while destroying.
> 
> So looking at this after all the patches have been applied, it looks
> like initialise() and destroy() are called from the
> altp2m_set_domain_state(), which the guest uses to enable or disable the
> altp2m functionality.  In that case, it seems like pausing is probably
> appropriate.

Right. Albeit I'd then question whether it wouldn't be better for the
caller to do the pausing (via domain_pause_except_self()), considering
that the functions are being called in a for_each_vcpu() loop.
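
I.e. a sketch (assuming matching pause/unpause helpers are available):

    domain_pause_except_self(d);

    for_each_vcpu ( d, v )
        altp2m_vcpu_destroy(v);

    domain_unpause_except_self(d);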

Jan


* Re: [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-14  0:14 ` [PATCH v5 05/15] x86/altp2m: basic data structures and support routines Ed White
  2015-07-14 13:13   ` Jan Beulich
@ 2015-07-14 15:57   ` George Dunlap
  2015-07-21 17:44     ` Sahita, Ravi
  1 sibling, 1 reply; 56+ messages in thread
From: George Dunlap @ 2015-07-14 15:57 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: Ravi Sahita, Wei Liu, Ian Jackson, Tim Deegan, Jan Beulich,
	Andrew Cooper, tlengyel, Daniel De Graaf

On 07/14/2015 01:14 AM, Ed White wrote:
> Add the basic data structures needed to support alternate p2m's and
> the functions to initialise them and tear them down.
> 
> Although Intel hardware can handle 512 EPTP's per hardware thread
> concurrently, only 10 per domain are supported in this patch for
> performance reasons.
> 
> The iterator in hap_enable() does need to handle 512, so that is now
> uint16_t.
> 
> This change also splits the p2m lock into one lock type for altp2m's
> and another type for all other p2m's. The purpose of this is to place
> the altp2m list lock between the types, so the list lock can be
> acquired whilst holding the host p2m lock.
> 
> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

With the number of major changes you made here, you definitely should
have dropped this reviewed-by.

> ---
>  xen/arch/x86/hvm/Makefile        |   1 +
>  xen/arch/x86/hvm/altp2m.c        |  77 +++++++++++++++++++++++++++++
>  xen/arch/x86/hvm/hvm.c           |  21 ++++++++
>  xen/arch/x86/mm/hap/hap.c        |  38 ++++++++++++++-
>  xen/arch/x86/mm/mm-locks.h       |  46 +++++++++++++++++-
>  xen/arch/x86/mm/p2m.c            | 102 +++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/domain.h     |  10 ++++
>  xen/include/asm-x86/hvm/altp2m.h |  38 +++++++++++++++
>  xen/include/asm-x86/hvm/hvm.h    |  14 ++++++
>  xen/include/asm-x86/hvm/vcpu.h   |   9 ++++
>  xen/include/asm-x86/p2m.h        |  32 +++++++++++-
>  11 files changed, 384 insertions(+), 4 deletions(-)
>  create mode 100644 xen/arch/x86/hvm/altp2m.c
>  create mode 100644 xen/include/asm-x86/hvm/altp2m.h
> 
> diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
> index 69af47f..eb1a37b 100644
> --- a/xen/arch/x86/hvm/Makefile
> +++ b/xen/arch/x86/hvm/Makefile
> @@ -1,6 +1,7 @@
>  subdir-y += svm
>  subdir-y += vmx
>  
> +obj-y += altp2m.o
>  obj-y += asid.o
>  obj-y += emulate.o
>  obj-y += event.o
> diff --git a/xen/arch/x86/hvm/altp2m.c b/xen/arch/x86/hvm/altp2m.c
> new file mode 100644
> index 0000000..a10f347
> --- /dev/null
> +++ b/xen/arch/x86/hvm/altp2m.c
> @@ -0,0 +1,77 @@
> +/*
> + * Alternate p2m HVM
> + * Copyright (c) 2014, Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + */
> +
> +#include <asm/hvm/support.h>
> +#include <asm/hvm/hvm.h>
> +#include <asm/p2m.h>
> +#include <asm/hvm/altp2m.h>
> +
> +void
> +altp2m_vcpu_reset(struct vcpu *v)

OK, so it looks like at the end of this patch series:
* altp2m_vcpu_reset() isn't called outside this file
* altp2m_vcpu_initialise() is only called from hvm.c when the guest
enables the altp2m functionality
* altp2m_vcpu_destroy() is called when the guest disables altp2m
funcitonality, or when the vcpu is destroyed

Looking at the "vcpu_destroy" case, it's hard to tell exactly how much
on that path is actually useful; but it looks like the only thing that's
critical is decreasing the active_vcpu count of the p2m that's being used.

Also, it looks like these functions don't do anything specifically with
the HVM side of things.

So on the whole, it seems like these would better go along with the
other altp2m functions inside p2m.c.

Thoughts?

 -George


* Re: [PATCH v5 08/15] x86/altp2m: add control of suppress_ve.
  2015-07-14  0:14 ` [PATCH v5 08/15] x86/altp2m: add control of suppress_ve Ed White
@ 2015-07-14 17:03   ` George Dunlap
  0 siblings, 0 replies; 56+ messages in thread
From: George Dunlap @ 2015-07-14 17:03 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Tim Deegan, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Tue, Jul 14, 2015 at 1:14 AM, Ed White <edmund.h.white@intel.com> wrote:
> From: George Dunlap <george.dunlap@eu.citrix.com>
>
> The existing ept_set_entry() and ept_get_entry() routines are extended
> to optionally set/get suppress_ve.  Passing -1 will set suppress_ve on
> new p2m entries, or retain suppress_ve flag on existing entries.
>
> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
> Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks!

Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>


* Re: [PATCH v5 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-14 14:04   ` Jan Beulich
@ 2015-07-14 17:56     ` Sahita, Ravi
  2015-07-17 22:41     ` Sahita, Ravi
  1 sibling, 0 replies; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-14 17:56 UTC (permalink / raw)
  To: Jan Beulich, White, Edmund H
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, tlengyel, Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Tuesday, July 14, 2015 7:04 AM
>
>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -1436,6 +1436,19 @@ static int hvmemul_invlpg(
>>      return rc;
>>  }
>>
>> +static int hvmemul_vmfunc(
>> +    struct x86_emulate_ctxt *ctxt)
>> +{
>> +    int rc;
>> +
>> +    rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
>> +    if ( rc != X86EMUL_OKAY )
>> +    {
>> +        hvmemul_inject_hw_exception(TRAP_invalid_op, 0, ctxt);
>> +    }
>> +    return rc;
>
>Pointless braces and missing blank line before final return.

ok

>
>> @@ -1830,6 +1831,19 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
>>      vmx_vmcs_exit(v);
>>  }
>>
>> +static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs) {
>> +    int rc = X86EMUL_EXCEPTION;
>> +    struct vcpu *curr = current;
>> +
>> +    if ( !cpu_has_vmx_vmfunc && altp2m_active(curr->domain) &&
>> +         regs->eax == 0 &&
>> +         p2m_switch_vcpu_altp2m_by_id(curr, (uint16_t)regs->ecx) )
>
>Documentation suggests that the upper 32 bits of RAX are being ignored, and
>that all 32 bits of ECX are being used.
>
>> @@ -3234,6 +3263,15 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>              update_guest_eip();
>>          break;
>>
>> +    case EXIT_REASON_VMFUNC:
>> +        if ( (vmx_vmfunc_intercept(regs) == X86EMUL_EXCEPTION) ||
>> +             (vmx_vmfunc_intercept(regs) == X86EMUL_UNHANDLEABLE) ||
>> +             (vmx_vmfunc_intercept(regs) == X86EMUL_RETRY) )
>
>Why would you want to invoke the function 3 times? How about simply !=
>X86EMUL_OKAY?

My bad - I meant to invoke vmx_vmfunc_intercept() once and just check the error conditions listed.
Will change to != X86EMUL_OKAY.

Ravi

>
>Jan


* Re: [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-14 13:13   ` Jan Beulich
  2015-07-14 14:45     ` George Dunlap
@ 2015-07-16  8:57     ` Sahita, Ravi
  2015-07-16  9:07       ` Jan Beulich
  1 sibling, 1 reply; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-16  8:57 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Tuesday, July 14, 2015 6:13 AM
>
>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>> +void
>> +altp2m_vcpu_initialise(struct vcpu *v) {
>> +    if ( v != current )
>> +        vcpu_pause(v);
>> +
>> +    altp2m_vcpu_reset(v);
>> +    vcpu_altp2m(v).p2midx = 0;
>> +    atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
>> +
>> +    altp2m_vcpu_update_eptp(v);
>> +
>> +    if ( v != current )
>> +        vcpu_unpause(v);
>> +}
>> +
>> +void
>> +altp2m_vcpu_destroy(struct vcpu *v)
>> +{
>> +    struct p2m_domain *p2m;
>> +
>> +    if ( v != current )
>> +        vcpu_pause(v);
>> +
>> +    if ( (p2m = p2m_get_altp2m(v)) )
>> +        atomic_dec(&p2m->active_vcpus);
>> +
>> +    altp2m_vcpu_reset(v);
>> +
>> +    altp2m_vcpu_update_eptp(v);
>> +    altp2m_vcpu_update_vmfunc_ve(v);
>> +
>> +    if ( v != current )
>> +        vcpu_unpause(v);
>> +}
>
>There not being any caller of altp2m_vcpu_initialise() I can't judge about its
>pausing requirements, but for the destroy case I can't see what the pausing is
>good for. Considering its sole user it's also not really clear why the two update
>operations need to be done while destroying.
>

altp2m_vcpu_initialise() doesn't get called until the HVMOP implementation in a later patch.
Both initialise and destroy operate on a running domain and modify VMCS state, so the vcpu has to be paused.

>> @@ -6498,6 +6500,25 @@ enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v)
>>      return hvm_funcs.nhvm_intr_blocked(v);
>> }
>>
>> +void altp2m_vcpu_update_eptp(struct vcpu *v) {
>> +    if ( hvm_funcs.altp2m_vcpu_update_eptp )
>> +        hvm_funcs.altp2m_vcpu_update_eptp(v);
>> +}
>> +
>> +void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v) {
>> +    if ( hvm_funcs.altp2m_vcpu_update_vmfunc_ve )
>> +        hvm_funcs.altp2m_vcpu_update_vmfunc_ve(v);
>> +}
>> +
>> +bool_t altp2m_vcpu_emulate_ve(struct vcpu *v) {
>> +    if ( hvm_funcs.altp2m_vcpu_emulate_ve )
>> +        return hvm_funcs.altp2m_vcpu_emulate_ve(v);
>> +    return 0;
>> +}
>
>The patch context here suggests that you're using a pretty outdated tree -
>nhvm_interrupt_blocked() went away a week ago. In line with the commit
>doing so, all of the above should become inline functions.
>

We are rebasing on staging and testing now

>> +uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp) {
>> +    struct p2m_domain *p2m;
>> +    struct ept_data *ept;
>> +    uint16_t i;
>> +
>> +    altp2m_list_lock(d);
>> +
>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>> +    {
>> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>> +            continue;
>> +
>> +        p2m = d->arch.altp2m_p2m[i];
>> +        ept = &p2m->ept;
>> +
>> +        if ( eptp == ept_get_eptp(ept) )
>> +            goto out;
>> +    }
>> +
>> +    i = INVALID_ALTP2M;
>> +
>> +out:
>
>Labels should be indented by at least one space.

Ok

>
>Pending the rename, the function may also live in the wrong file (the use of
>ept_get_eptp() suggests so even if the function itself got renamed).
>

We will look at this one to see if it's just a rename.

>> +bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx) {
>> +    struct domain *d = v->domain;
>> +    bool_t rc = 0;
>> +
>> +    if ( idx > MAX_ALTP2M )
>> +        return rc;
>> +
>> +    altp2m_list_lock(d);
>> +
>> +    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
>> +    {
>> +        if ( idx != vcpu_altp2m(v).p2midx )
>> +        {
>> +            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
>> +            vcpu_altp2m(v).p2midx = idx;
>> +            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
>
>Are the two results of p2m_get_altp2m(v) here guaranteed to be distinct? If
>they aren't, is it safe to decrement first (potentially dropping the count to
>zero)?
>

Yes on both.

>> @@ -722,6 +731,27 @@ void nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
>>      l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
>>
>>  /*
>> + * Alternate p2m: shadow p2m tables used for alternate memory views
>> + */
>> +
>> +/* get current alternate p2m table */
>> +static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
>> +{
>> +    struct domain *d = v->domain;
>> +    uint16_t index = vcpu_altp2m(v).p2midx;
>> +
>> +    ASSERT(index < MAX_ALTP2M);
>> +
>> +    return (index == INVALID_ALTP2M) ? NULL : d->arch.altp2m_p2m[index];
>> +}
>
>Looking at this again, I'm afraid I'd still prefer index < MAX_ALTP2M in the
>return statement (and the ASSERT() dropped): The ASSERT() does nothing in a
>debug=n build, and hence wouldn't shield us from possibly having to issue an
>XSA if somehow an access outside the array's bounds turned out possible.
>

The assert was breaking v5 anyway. BUG_ON (with the right check) is probably the right thing to do, as we do in the exit handling that checks for a VMFUNC having changed the index.
So we will make that change.

>I've also avoided repeating any of the un-addressed points that I raised
>against earlier versions.

I went back and looked at the earlier versions of the comments on this patch, and afaict we have either addressed (accepted) those points or described, with reasoning, why they shouldn't cause changes; so if I missed something please let me know.

Ravi

>Jan


* Re: [PATCH v5 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-14 14:36   ` Jan Beulich
@ 2015-07-16  9:02     ` Sahita, Ravi
  2015-07-16  9:09       ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-16  9:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Tuesday, July 14, 2015 7:37 AM
>
>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -6443,6 +6443,148 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
>>          break;
>>      }
>>
>> +    case HVMOP_altp2m:
>> +    {
>> +        struct xen_hvm_altp2m_op a;
>> +        struct domain *d = NULL;
>> +
>> +        if ( copy_from_guest(&a, arg, 1) )
>> +            return -EFAULT;
>> +
>> +        if ( a.pad[0] || a.pad[1] )
>> +            return -EINVAL;
>
>Why can't the field be uint16_t, making this a single check?
>

Could be, of course; we had asked for an example and found domctl, which pads with uint8_t[], and followed the same approach. Change required?

Ravi

>Jan


* Re: [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-16  8:57     ` Sahita, Ravi
@ 2015-07-16  9:07       ` Jan Beulich
  2015-07-17 22:36         ` Sahita, Ravi
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2015-07-16  9:07 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 16.07.15 at 10:57, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Tuesday, July 14, 2015 6:13 AM
>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>> @@ -722,6 +731,27 @@ void nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
>>>      l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
>>>
>>>  /*
>>> + * Alternate p2m: shadow p2m tables used for alternate memory views
>>> + */
>>> +
>>> +/* get current alternate p2m table */
>>> +static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
>>> +{
>>> +    struct domain *d = v->domain;
>>> +    uint16_t index = vcpu_altp2m(v).p2midx;
>>> +
>>> +    ASSERT(index < MAX_ALTP2M);
>>> +
>>> +    return (index == INVALID_ALTP2M) ? NULL : d->arch.altp2m_p2m[index];
>>> +}
>>
>>Looking at this again, I'm afraid I'd still prefer index < MAX_ALTP2M in the
>>return statement (and the ASSERT() dropped): The ASSERT() does nothing in a
>>debug=n build, and hence wouldn't shield us from possibly having to issue an
>>XSA if somehow an access outside the array's bounds turned out possible.
>>
> 
> the assert was breaking v5 anyway. BUG_ON (with the right check) is probably 
> the right thing to do, as we do in the exit handling that checks for a VMFUNC 
> having changed the index.
> So will make that change.

But why use a BUG_ON() when you can deal with this more
gracefully? Please try to avoid crashing the hypervisor when
there are other ways to recover.

>>I've also avoided repeating any of the un-addressed points that I
>>raised against earlier versions.
>>
> 
> I went back and looked at the earlier versions of the comments on this patch 
> and afaict we have either addressed (accepted) those points or described why 
> they shouldn't cause changes with reasoning . so if I missed something please 
> let me know.

Just one example of what wasn't done is the conversion of local
variable, function return, and function parameter types from
(bogus) uint8_t / uint16_t to unsigned int (iirc also in other
patches).

Jan


* Re: [PATCH v5 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-16  9:02     ` Sahita, Ravi
@ 2015-07-16  9:09       ` Jan Beulich
  0 siblings, 0 replies; 56+ messages in thread
From: Jan Beulich @ 2015-07-16  9:09 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 16.07.15 at 11:02, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Tuesday, July 14, 2015 7:37 AM
>>
>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -6443,6 +6443,148 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>          break;
>>>      }
>>>
>>> +    case HVMOP_altp2m:
>>> +    {
>>> +        struct xen_hvm_altp2m_op a;
>>> +        struct domain *d = NULL;
>>> +
>>> +        if ( copy_from_guest(&a, arg, 1) )
>>> +            return -EFAULT;
>>> +
>>> +        if ( a.pad[0] || a.pad[1] )
>>> +            return -EINVAL;
>>
>>Why can't the field be uint16_t, making this a single check?
>>
> 
> Could be of course;  we had asked for an example and we found domctl, which 
> pads with uint8_t[] and followed the same approach.

domctl is a particularly bad example, as checking padding space to be
zero is pointless there, since there is no ABI guarantee anyway (same
for sysctl). I.e. I'd be surprised if you found such a check in any
domctl handling...

> Change required?

Yes please - let's avoid ugly code like the above.

Jan


* Re: [PATCH v5 10/15] x86/altp2m: add remaining support routines.
  2015-07-14 14:31   ` Jan Beulich
@ 2015-07-16  9:16     ` Sahita, Ravi
  2015-07-16  9:34       ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-16  9:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Tuesday, July 14, 2015 7:32 AM
>
>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -2802,10 +2802,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>>      mfn_t mfn;
>>      struct vcpu *curr = current;
>>      struct domain *currd = curr->domain;
>> -    struct p2m_domain *p2m;
>> +    struct p2m_domain *p2m, *hostp2m;
>>      int rc, fall_through = 0, paged = 0;
>>      int sharing_enomem = 0;
>>      vm_event_request_t *req_ptr = NULL;
>> +    bool_t ap2m_active = 0;
>
>Pointless initializer afaict.
>

True - didn't use to be - will fix.

>> @@ -2865,11 +2866,31 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>>          goto out;
>>      }
>>
>> -    p2m = p2m_get_hostp2m(currd);
>> -    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma,
>> +    ap2m_active = altp2m_active(currd);
>> +
>> +    /* Take a lock on the host p2m speculatively, to avoid potential
>> +     * locking order problems later and to handle unshare etc.
>> +     */
>
>Comment style.
>

Got it

>> @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>>          if ( npfec.write_access )
>>          {
>>              paging_mark_dirty(currd, mfn_x(mfn));
>> +            /* If p2m is really an altp2m, unlock here to avoid lock ordering
>> +             * violation when the change below is propagated from host p2m */
>> +            if ( ap2m_active )
>> +                __put_gfn(p2m, gfn);
>>              p2m_change_type_one(currd, gfn, p2m_ram_logdirty,
>> p2m_ram_rw);
>
>And this won't result in any races?
>

No

>Also - comment style again (and more elsewhere).
>

Got it

>> --- a/xen/arch/x86/mm/p2m.c
>> +++ b/xen/arch/x86/mm/p2m.c
>> @@ -2037,6 +2037,391 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx)
>>      return rc;
>>  }
>>
>> +void p2m_flush_altp2m(struct domain *d) {
>> +    uint16_t i;
>> +
>> +    altp2m_list_lock(d);
>> +
>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>> +    {
>> +        p2m_flush_table(d->arch.altp2m_p2m[i]);
>> +        /* Uninit and reinit ept to force TLB shootdown */
>> +        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
>> +        ept_p2m_init(d->arch.altp2m_p2m[i]);
>
>ept_... in non-EPT code again.
>

There is no non-EPT altp2m implementation, and this file already includes ept_*() callouts for p2m's implemented using EPTs.


>> +        d->arch.altp2m_eptp[i] = INVALID_MFN;
>> +    }
>> +
>> +    altp2m_list_unlock(d);
>> +}
>> +
>> +static void p2m_init_altp2m_helper(struct domain *d, uint16_t i) {
>> +    struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
>> +    struct ept_data *ept;
>> +
>> +    p2m->min_remapped_gfn = INVALID_GFN;
>> +    p2m->max_remapped_gfn = INVALID_GFN;
>> +    ept = &p2m->ept;
>> +    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
>> +    d->arch.altp2m_eptp[i] = ept_get_eptp(ept);
>
>Same here.
>

Same as above... this file already includes ept_*() callouts for p2m's implemented using EPTs.

>> +long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx) {
>> +    long rc = -EINVAL;
>
>Why long (for both variable and function return type)? (More of these in
>functions below.)
>

Because the error variable in the code that calls these (in hvm.c) is a long, and you had given feedback earlier to propagate the returns from these functions through that calling code.


>> +long p2m_init_next_altp2m(struct domain *d, uint16_t *idx) {
>> +    long rc = -EINVAL;
>> +    uint16_t i;
>
>As in the earlier patch(es) - unsigned int.
>

Ok, but why does it matter? uint16_t will always suffice.


>> +long p2m_change_altp2m_gfn(struct domain *d, uint16_t idx,
>> +                             gfn_t old_gfn, gfn_t new_gfn) {
>> +    struct p2m_domain *hp2m, *ap2m;
>> +    p2m_access_t a;
>> +    p2m_type_t t;
>> +    mfn_t mfn;
>> +    unsigned int page_order;
>> +    long rc = -EINVAL;
>> +
>> +    if ( idx > MAX_ALTP2M || d->arch.altp2m_eptp[idx] == INVALID_MFN )
>> +        return rc;
>> +
>> +    hp2m = p2m_get_hostp2m(d);
>> +    ap2m = d->arch.altp2m_p2m[idx];
>> +
>> +    p2m_lock(ap2m);
>> +
>> +    mfn = ap2m->get_entry(ap2m, gfn_x(old_gfn), &t, &a, 0, NULL,
>> + NULL);
>> +
>> +    if ( gfn_x(new_gfn) == INVALID_GFN )
>> +    {
>> +        if ( mfn_valid(mfn) )
>> +            p2m_remove_page(ap2m, gfn_x(old_gfn), mfn_x(mfn),
>PAGE_ORDER_4K);
>> +        rc = 0;
>> +        goto out;
>> +    }
>> +
>> +    /* Check host p2m if no valid entry in alternate */
>> +    if ( !mfn_valid(mfn) )
>> +    {
>> +        mfn = hp2m->get_entry(hp2m, gfn_x(old_gfn), &t, &a,
>> +                              P2M_ALLOC | P2M_UNSHARE, &page_order,
>> + NULL);
>> +
>> +        if ( !mfn_valid(mfn) || t != p2m_ram_rw )
>> +            goto out;
>> +
>> +        /* If this is a superpage, copy that first */
>> +        if ( page_order != PAGE_ORDER_4K )
>> +        {
>> +            gfn_t gfn;
>> +            unsigned long mask;
>> +
>> +            mask = ~((1UL << page_order) - 1);
>> +            gfn = _gfn(gfn_x(old_gfn) & mask);
>> +            mfn = _mfn(mfn_x(mfn) & mask);
>> +
>> +            if ( ap2m->set_entry(ap2m, gfn_x(gfn), mfn, page_order,
>> + t, a, 1)
>> )
>> +                goto out;
>> +        }
>> +    }
>> +
>> +    mfn = ap2m->get_entry(ap2m, gfn_x(new_gfn), &t, &a, 0, NULL,
>> + NULL);
>> +
>> +    if ( !mfn_valid(mfn) )
>> +        mfn = hp2m->get_entry(hp2m, gfn_x(new_gfn), &t, &a, 0, NULL,
>> + NULL);
>> +
>> +    if ( !mfn_valid(mfn) || (t != p2m_ram_rw) )
>> +        goto out;
>> +
>> +    if ( !ap2m->set_entry(ap2m, gfn_x(old_gfn), mfn, PAGE_ORDER_4K, t, a,
>> +                          (current->domain != d)) )
>> +    {
>> +        rc = 0;
>> +
>> +        if ( ap2m->min_remapped_gfn == INVALID_GFN ||
>> +             gfn_x(new_gfn) < ap2m->min_remapped_gfn )
>> +            ap2m->min_remapped_gfn = gfn_x(new_gfn);
>> +        if ( ap2m->max_remapped_gfn == INVALID_GFN ||
>> +             gfn_x(new_gfn) > ap2m->max_remapped_gfn )
>> +            ap2m->max_remapped_gfn = gfn_x(new_gfn);
>
>For the purpose here (and without conflict with the consumer side) it would
>seem to be better to initialize max_remapped_gfn to zero, as then both if()
>can get away with just one comparison.
>

See below for a full explanation of this portion and why max_remapped_gfn cannot be initialized to 0 (zero is a valid gfn)...

>> +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
>> +                                 mfn_t mfn, unsigned int page_order,
>> +                                 p2m_type_t p2mt, p2m_access_t p2ma)
>> +{
>> +    struct p2m_domain *p2m;
>> +    p2m_access_t a;
>> +    p2m_type_t t;
>> +    mfn_t m;
>> +    uint16_t i;
>> +    bool_t reset_p2m;
>> +    unsigned int reset_count = 0;
>> +    uint16_t last_reset_idx = ~0;
>> +
>> +    if ( !altp2m_active(d) )
>> +        return;
>> +
>> +    altp2m_list_lock(d);
>> +
>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>> +    {
>> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>> +            continue;
>> +
>> +        p2m = d->arch.altp2m_p2m[i];
>> +        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
>> +
>> +        reset_p2m = 0;
>> +
>> +        /* Check for a dropped page that may impact this altp2m */
>> +        if ( mfn_x(mfn) == INVALID_MFN &&
>> +             gfn_x(gfn) >= p2m->min_remapped_gfn &&
>> +             gfn_x(gfn) <= p2m->max_remapped_gfn )
>> +            reset_p2m = 1;
>
>Considering that this looks like an optimization, what's the downside of
>possibly having min=0 and max=<end-of-address-space>? I.e. can a guest
>effect a long-latency operation this way?
>

... A p2m is a gfn->mfn map, amongst other things. There is a reverse mfn->gfn map, but it is only valid for the host p2m. Unless the remap altp2m hypercall is used, the gfn->mfn map in every altp2m mirrors the gfn->mfn map in the host p2m (or a subset thereof, due to lazy-copy), so handling removal of an mfn from a guest is simple: do a reverse lookup in the host p2m and mark the relevant gfn as invalid, then do the same for every altp2m where that gfn is currently valid.

Remap changes things: it says take gfn1 and replace its ->mfn with the ->mfn of gfn2. This is where the optimization comes in, and the invalidation logic is: record the lowest and highest gfn2's that have been used in remap ops; if an mfn is dropped from the host p2m then, for the purposes of altp2m invalidation, see whether the gfn derived from the host p2m reverse lookup falls within the range of used gfn2's. If it does, an invalidation is required. That is why min and max are initialized the way they are - hope the explanation clarifies this optimization.
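
Condensed from the patch hunks quoted above (a sketch, not the literal
code), the two halves of the optimization are:

    /* On remap: grow [min, max] to the smallest range covering the
     * gfn2's used as remap targets */
    if ( ap2m->min_remapped_gfn == INVALID_GFN ||
         gfn_x(new_gfn) < ap2m->min_remapped_gfn )
        ap2m->min_remapped_gfn = gfn_x(new_gfn);
    if ( ap2m->max_remapped_gfn == INVALID_GFN ||
         gfn_x(new_gfn) > ap2m->max_remapped_gfn )
        ap2m->max_remapped_gfn = gfn_x(new_gfn);

    /* On an mfn being dropped from the host p2m: an altp2m only needs
     * invalidating if the gfn falls inside the recorded range */
    if ( mfn_x(mfn) == INVALID_MFN &&
         gfn_x(gfn) >= p2m->min_remapped_gfn &&
         gfn_x(gfn) <= p2m->max_remapped_gfn )
        reset_p2m = 1;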

Ravi

>Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-14 13:57   ` Jan Beulich
@ 2015-07-16  9:20     ` Sahita, Ravi
  2015-07-16  9:38       ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-16  9:20 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Tuesday, July 14, 2015 6:57 AM
>
>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>> Implement and hook up the code to enable VMX support of VMFUNC and
>#VE.
>>
>> VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.
>>
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
>> ---
>>  xen/arch/x86/hvm/vmx/vmx.c | 138
>> +++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 138 insertions(+)
>>
>> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
>> index 07527dd..38dba6b 100644
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -56,6 +56,7 @@
>>  #include <asm/debugger.h>
>>  #include <asm/apic.h>
>>  #include <asm/hvm/nestedhvm.h>
>> +#include <asm/hvm/altp2m.h>
>>  #include <asm/event.h>
>>  #include <asm/monitor.h>
>>  #include <public/arch-x86/cpuid.h>
>> @@ -1763,6 +1764,104 @@ static void
>> vmx_enable_msr_exit_interception(struct
>> domain *d)
>>                                           MSR_TYPE_W);  }
>>
>> +static void vmx_vcpu_update_eptp(struct vcpu *v) {
>> +    struct domain *d = v->domain;
>> +    struct p2m_domain *p2m = NULL;
>> +    struct ept_data *ept;
>> +
>> +    if ( altp2m_active(d) )
>> +        p2m = p2m_get_altp2m(v);
>> +    if ( !p2m )
>> +        p2m = p2m_get_hostp2m(d);
>> +
>> +    ept = &p2m->ept;
>> +    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
>> +
>> +    vmx_vmcs_enter(v);
>> +
>> +    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
>> +
>> +    if ( v->arch.hvm_vmx.secondary_exec_control &
>> +        SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
>> +        __vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);
>> +
>> +    vmx_vmcs_exit(v);
>> +}
>> +
>> +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v) {
>> +    struct domain *d = v->domain;
>> +    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
>> +
>> +    if ( !cpu_has_vmx_vmfunc )
>> +        return;
>> +
>> +    if ( cpu_has_vmx_virt_exceptions )
>> +        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
>> +
>> +    vmx_vmcs_enter(v);
>> +
>> +    if ( !d->is_dying && altp2m_active(d) )
>> +    {
>> +        v->arch.hvm_vmx.secondary_exec_control |= mask;
>> +        __vmwrite(VM_FUNCTION_CONTROL,
>VMX_VMFUNC_EPTP_SWITCHING);
>> +        __vmwrite(EPTP_LIST_ADDR,
>> + virt_to_maddr(d->arch.altp2m_eptp));
>> +
>> +        if ( cpu_has_vmx_virt_exceptions )
>> +        {
>> +            p2m_type_t t;
>> +            mfn_t mfn;
>> +
>> +            mfn = get_gfn_query_unlocked(d,
>> + gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
>> +
>> +            if ( mfn_x(mfn) != INVALID_MFN )
>> +                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
>> +            else
>> +                mask &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
>
>Considering the rest of the function, this is dead code.
>

Right, should be v->arch.hvm_vmx.secondary_exec_control &=...
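
I.e. the else branch would become something like this (a sketch of the
intended fix, clearing the bit in the live control word rather than in
the dead local mask):

    if ( mfn_x(mfn) != INVALID_MFN )
        __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
    else
        v->arch.hvm_vmx.secondary_exec_control &=
            ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;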


>> +        }
>> +    }
>> +    else
>> +        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
>> +
>> +    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
>> +        v->arch.hvm_vmx.secondary_exec_control);
>> +
>> +    vmx_vmcs_exit(v);
>> +}
>> +
>> +static bool_t vmx_vcpu_emulate_ve(struct vcpu *v) {
>> +    bool_t rc = 0;
>> +    ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) != INVALID_GFN
>?
>> +        hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn), 0) :
>> +NULL;
>> +
>> +    if ( !veinfo )
>> +        return 0;
>> +
>> +    if ( veinfo->semaphore != 0 )
>> +        goto out;
>> +
>> +    rc = 1;
>> +
>> +    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
>> +    veinfo->semaphore = ~0l;
>
>Isn't semaphore a 32-bit quantity?
>

Yes.

>> +    veinfo->eptp_index = vcpu_altp2m(v).p2midx;
>> +
>> +    vmx_vmcs_enter(v);
>> +    __vmread(EXIT_QUALIFICATION, &veinfo->exit_qualification);
>> +    __vmread(GUEST_LINEAR_ADDRESS, &veinfo->gla);
>> +    __vmread(GUEST_PHYSICAL_ADDRESS, &veinfo->gpa);
>> +    vmx_vmcs_exit(v);
>> +
>> +    hvm_inject_hw_exception(TRAP_virtualisation,
>> +                            HVM_DELIVER_NO_ERROR_CODE);
>> +
>> +out:
>
>Un-indented label again.
>

Got it.

>> @@ -2744,6 +2846,42 @@ void vmx_vmexit_handler(struct cpu_user_regs
>*regs)
>>      /* Now enable interrupts so it's safe to take locks. */
>>      local_irq_enable();
>>
>> +    /*
>> +     * If the guest has the ability to switch EPTP without an exit,
>> +     * figure out whether it has done so and update the altp2m data.
>> +     */
>> +    if ( altp2m_active(v->domain) &&
>> +        (v->arch.hvm_vmx.secondary_exec_control &
>> +        SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
>
>Indentation.
>

Yep.

>> +    {
>> +        unsigned long idx;
>> +
>> +        if ( v->arch.hvm_vmx.secondary_exec_control &
>> +            SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
>> +            __vmread(EPTP_INDEX, &idx);
>> +        else
>> +        {
>> +            unsigned long eptp;
>> +
>> +            __vmread(EPT_POINTER, &eptp);
>> +
>> +            if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
>> +                 INVALID_ALTP2M )
>> +            {
>> +                gdprintk(XENLOG_ERR, "EPTP not found in alternate p2m list\n");
>> +                domain_crash(v->domain);
>> +            }
>> +        }
>> +
>> +        if ( (uint16_t)idx != vcpu_altp2m(v).p2midx )
>
>Is this cast really necessary?
>

Yes - the index is 16 bits; this reflects how the field is specified in the VMCS as well.

>> +        {
>> +            BUG_ON(idx >= MAX_ALTP2M);
>> +            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
>> +            vcpu_altp2m(v).p2midx = (uint16_t)idx;
>
>This one surely isn't (or else the field type is wrong).
>

Again required. idx can't be uint16_t because __vmread() requires unsigned long*, but the index is 16 bits.
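
For illustration (a sketch of the relevant lines, not the full patch):

    unsigned long idx;           /* __vmread() takes an unsigned long * */

    __vmread(EPTP_INDEX, &idx);  /* EPTP_INDEX is a 16-bit VMCS field */

    if ( (uint16_t)idx != vcpu_altp2m(v).p2midx )  /* compare as 16 bits */
        vcpu_altp2m(v).p2midx = (uint16_t)idx;     /* p2midx is uint16_t */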

Thanks,
Ravi

>Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 09/15] x86/altp2m: alternate p2m memory events.
  2015-07-14 14:08   ` Jan Beulich
@ 2015-07-16  9:22     ` Sahita, Ravi
  0 siblings, 0 replies; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-16  9:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Tuesday, July 14, 2015 7:08 AM
>
>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>> --- a/xen/include/public/vm_event.h
>> +++ b/xen/include/public/vm_event.h
>> @@ -47,6 +47,16 @@
>>  #define VM_EVENT_FLAG_VCPU_PAUSED     (1 << 0)
>>  /* Flags to aid debugging mem_event */
>>  #define VM_EVENT_FLAG_FOREIGN         (1 << 1)
>> +/*
>> + * This flag can be set in a request or a response
>> + *
>> + * On a request, indicates that the event occurred in the alternate
>> +p2m specified by
>> + * the altp2m_idx request field.
>> + *
>> + * On a response, indicates that the VCPU should resume in the
>> +alternate p2m specified
>> + * by the altp2m_idx response field if possible.
>> + */
>> +#define VM_EVENT_FLAG_ALTERNATE_P2M   (1 << 5)
>
>So I suppose you use 5 here because of what went into staging recently. But
>the patch context doesn't reflect this, i.e. the patch is inconsistent at this point
>(and won't apply).
>
>Jan

Yes - we didn't have time to rebase on staging - we are working on that now

Ravi

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 10/15] x86/altp2m: add remaining support routines.
  2015-07-16  9:16     ` Sahita, Ravi
@ 2015-07-16  9:34       ` Jan Beulich
  2015-07-17 22:32         ` Sahita, Ravi
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2015-07-16  9:34 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 16.07.15 at 11:16, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Tuesday, July 14, 2015 7:32 AM
>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>> @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>>unsigned long gla,
>>>          if ( npfec.write_access )
>>>          {
>>>              paging_mark_dirty(currd, mfn_x(mfn));
>>> +            /* If p2m is really an altp2m, unlock here to avoid lock 
> ordering
>>> +             * violation when the change below is propagated from host p2m 
> */
>>> +            if ( ap2m_active )
>>> +                __put_gfn(p2m, gfn);
>>>              p2m_change_type_one(currd, gfn, p2m_ram_logdirty,
>>> p2m_ram_rw);
>>
>>And this won't result in any races?
> 
> No

To be honest I expected a little more than just "no" here. Now
I have to ask - why?

>>> --- a/xen/arch/x86/mm/p2m.c
>>> +++ b/xen/arch/x86/mm/p2m.c
>>> @@ -2037,6 +2037,391 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct
>>vcpu *v, uint16_t idx)
>>>      return rc;
>>>  }
>>>
>>> +void p2m_flush_altp2m(struct domain *d) {
>>> +    uint16_t i;
>>> +
>>> +    altp2m_list_lock(d);
>>> +
>>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>>> +    {
>>> +        p2m_flush_table(d->arch.altp2m_p2m[i]);
>>> +        /* Uninit and reinit ept to force TLB shootdown */
>>> +        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
>>> +        ept_p2m_init(d->arch.altp2m_p2m[i]);
>>
>>ept_... in non-EPT code again.
>>
> 
> There is no non-EPT altp2m implementation, and this file already includes 
> ept.. callouts for p2m's implemented using EPT's.

The only two calls currently there are ept_p2m_{,un}init(), which
need to be there with the current code structuring. Everything
else that's EPT-specific should be abstracted through hooks set
by ept_p2m_init().

>>> +long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx) {
>>> +    long rc = -EINVAL;
>>
>>Why long (for both variable and function return type)? (More of these in
>>functions below.)
> 
> Because the error variable in the code that calls these (in hvm.c) is a 
> long, and you had given feedback earlier to propagate the returns from these 
> functions through that calling code.

I don't see the connection. The function only returns zero or -E...
values, so why would its return type be "long"?

>>> +long p2m_init_next_altp2m(struct domain *d, uint16_t *idx) {
>>> +    long rc = -EINVAL;
>>> +    uint16_t i;
>>
>>As in the earlier patch(es) - unsigned int.
> 
> Ok, but why does it matter? uint16_t will always suffice.

And will always produce worse code.

>>> +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
>>> +                                 mfn_t mfn, unsigned int page_order,
>>> +                                 p2m_type_t p2mt, p2m_access_t p2ma)
>>> +{
>>> +    struct p2m_domain *p2m;
>>> +    p2m_access_t a;
>>> +    p2m_type_t t;
>>> +    mfn_t m;
>>> +    uint16_t i;
>>> +    bool_t reset_p2m;
>>> +    unsigned int reset_count = 0;
>>> +    uint16_t last_reset_idx = ~0;
>>> +
>>> +    if ( !altp2m_active(d) )
>>> +        return;
>>> +
>>> +    altp2m_list_lock(d);
>>> +
>>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>>> +    {
>>> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>>> +            continue;
>>> +
>>> +        p2m = d->arch.altp2m_p2m[i];
>>> +        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
>>> +
>>> +        reset_p2m = 0;
>>> +
>>> +        /* Check for a dropped page that may impact this altp2m */
>>> +        if ( mfn_x(mfn) == INVALID_MFN &&
>>> +             gfn_x(gfn) >= p2m->min_remapped_gfn &&
>>> +             gfn_x(gfn) <= p2m->max_remapped_gfn )
>>> +            reset_p2m = 1;
>>
>>Considering that this looks like an optimization, what's the downside of
>>possibly having min=0 and max=<end-of-address-space>? I.e. can a guest
>>effect a long-latency operation this way?
>>
> 
> ... A p2m is a gfn->mfn map, amongst other things. There is a reverse mfn->gfn 
> map, but that is only valid for the host p2m. Unless the remap altp2m 
> hypercall is used, the gfn->mfn map in every altp2m mirrors the gfn->mfn map in 
> the host p2m (or a subset thereof, due to lazy-copy), so handling removal of 
> an mfn from a guest is simple: do a reverse look up for the host p2m and mark 
> the relevant gfn as invalid, then do the same for every altp2m where that gfn 
> is currently valid.
> 
> Remap changes things: it says take gfn1 and replace ->mfn with the ->mfn of 
> gfn2. Here is where the optimization is used and the  invalidate logic is: 
> record the lowest and highest gfn2's that have been used in remap ops; if an 
> mfn is dropped from the hostp2m, for the purposes of altp2m invalidation, see 
> if the gfn derived from the host p2m reverse lookup falls within the range of 
> used gfn2's. If it does, an invalidation is required. Which is why min and 
> max are inited the way they are - hope the explanation clarifies this 
> optimization.

Sadly it doesn't, it just re-states what I already understood and doesn't
answer the question: What happens if min=0 and
max=<end-of-address-space>? I.e. can the guest nullify the
optimization by carefully fiddling with some of the new hypercalls, and if
so will this have any negative impact on the hypervisor? I'm asking this
from a security standpoint ...

Nor do I find my question answered why max can't be initialized to zero:
You don't care whether max is a valid GFN when a certain GFN doesn't
fall in the (then empty) [min, max] range. What am I missing?

Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-16  9:20     ` Sahita, Ravi
@ 2015-07-16  9:38       ` Jan Beulich
  2015-07-17 21:08         ` Sahita, Ravi
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2015-07-16  9:38 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 16.07.15 at 11:20, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Tuesday, July 14, 2015 6:57 AM
>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>> +static bool_t vmx_vcpu_emulate_ve(struct vcpu *v) {
>>> +    bool_t rc = 0;
>>> +    ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) != INVALID_GFN
>>?
>>> +        hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn), 0) :
>>> +NULL;
>>> +
>>> +    if ( !veinfo )
>>> +        return 0;
>>> +
>>> +    if ( veinfo->semaphore != 0 )
>>> +        goto out;
>>> +
>>> +    rc = 1;
>>> +
>>> +    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
>>> +    veinfo->semaphore = ~0l;
>>
>>Isn't semaphore a 32-bit quantity?
> 
> Yes.

I.e. the l suffix can and should be dropped.

>>> +    {
>>> +        unsigned long idx;
>>> +
>>> +        if ( v->arch.hvm_vmx.secondary_exec_control &
>>> +            SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
>>> +            __vmread(EPTP_INDEX, &idx);
>>> +        else
>>> +        {
>>> +            unsigned long eptp;
>>> +
>>> +            __vmread(EPT_POINTER, &eptp);
>>> +
>>> +            if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
>>> +                 INVALID_ALTP2M )
>>> +            {
>>> +                gdprintk(XENLOG_ERR, "EPTP not found in alternate p2m list\n");
>>> +                domain_crash(v->domain);
>>> +            }
>>> +        }
>>> +
>>> +        if ( (uint16_t)idx != vcpu_altp2m(v).p2midx )
>>
>>Is this cast really necessary?
> 
> Yes - The index is 16-bits, this reflects how the field is specified in the 
> vmcs also.

While "yes" answers the question, the explanation you give suggests
that the answer may be wrong: Can idx indeed have bits set beyond
bit 15? Because if it can't, the cast is pointless.

>>> +        {
>>> +            BUG_ON(idx >= MAX_ALTP2M);
>>> +            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
>>> +            vcpu_altp2m(v).p2midx = (uint16_t)idx;
>>
>>This one surely isn't (or else the field type is wrong).
> 
> Again required. idx can't be uint16_t because __vmread() requires unsigned 
> long*, but the index is 16 bits.

But it's a 16-bit VMCS field that you read it from, and hence the
upper 48 bits are necessarily zero.

Just to re-iterate: Casts are necessary in certain places, yes, but
I see them used pointlessly or even wrongly more often than not.

Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 10/15] x86/altp2m: add remaining support routines.
  2015-07-14  0:14 ` [PATCH v5 10/15] x86/altp2m: add remaining support routines Ed White
  2015-07-14 14:31   ` Jan Beulich
@ 2015-07-16 14:44   ` George Dunlap
  2015-07-17 21:01     ` Sahita, Ravi
  1 sibling, 1 reply; 56+ messages in thread
From: George Dunlap @ 2015-07-16 14:44 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Tim Deegan, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Tue, Jul 14, 2015 at 1:14 AM, Ed White <edmund.h.white@intel.com> wrote:
> Add the remaining routines required to support enabling the alternate
> p2m functionality.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  xen/arch/x86/hvm/hvm.c           |  58 +++++-
>  xen/arch/x86/mm/hap/Makefile     |   1 +
>  xen/arch/x86/mm/hap/altp2m_hap.c |  98 ++++++++++
>  xen/arch/x86/mm/p2m-ept.c        |   3 +
>  xen/arch/x86/mm/p2m.c            | 385 +++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/hvm/altp2m.h |   4 +
>  xen/include/asm-x86/p2m.h        |  33 ++++
>  7 files changed, 576 insertions(+), 6 deletions(-)
>  create mode 100644 xen/arch/x86/mm/hap/altp2m_hap.c
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 38deedc..a9f4b1b 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2802,10 +2802,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>      mfn_t mfn;
>      struct vcpu *curr = current;
>      struct domain *currd = curr->domain;
> -    struct p2m_domain *p2m;
> +    struct p2m_domain *p2m, *hostp2m;
>      int rc, fall_through = 0, paged = 0;
>      int sharing_enomem = 0;
>      vm_event_request_t *req_ptr = NULL;
> +    bool_t ap2m_active = 0;
>
>      /* On Nested Virtualization, walk the guest page table.
>       * If this succeeds, all is fine.
> @@ -2865,11 +2866,31 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>          goto out;
>      }
>
> -    p2m = p2m_get_hostp2m(currd);
> -    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma,
> +    ap2m_active = altp2m_active(currd);
> +
> +    /* Take a lock on the host p2m speculatively, to avoid potential
> +     * locking order problems later and to handle unshare etc.
> +     */
> +    hostp2m = p2m_get_hostp2m(currd);
> +    mfn = get_gfn_type_access(hostp2m, gfn, &p2mt, &p2ma,
>                                P2M_ALLOC | (npfec.write_access ? P2M_UNSHARE : 0),
>                                NULL);
>
> +    if ( ap2m_active )
> +    {
> +        if ( altp2m_hap_nested_page_fault(curr, gpa, gla, npfec, &p2m) == 1 )
> +        {
> +            /* entry was lazily copied from host -- retry */
> +            __put_gfn(hostp2m, gfn);
> +            rc = 1;
> +            goto out;
> +        }
> +
> +        mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 0, NULL);
> +    }
> +    else
> +        p2m = hostp2m;
> +
>      /* Check access permissions first, then handle faults */
>      if ( mfn_x(mfn) != INVALID_MFN )
>      {
> @@ -2909,6 +2930,20 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>
>          if ( violation )
>          {
> +            /* Should #VE be emulated for this fault? */
> +            if ( p2m_is_altp2m(p2m) && !cpu_has_vmx_virt_exceptions )
> +            {
> +                bool_t sve;
> +
> +                p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
> +
> +                if ( !sve && altp2m_vcpu_emulate_ve(curr) )
> +                {
> +                    rc = 1;
> +                    goto out_put_gfn;
> +                }
> +            }
> +
>              if ( p2m_mem_access_check(gpa, gla, npfec, &req_ptr) )
>              {
>                  fall_through = 1;
> @@ -2928,7 +2963,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>           (npfec.write_access &&
>            (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
>      {
> -        put_gfn(currd, gfn);
> +        __put_gfn(p2m, gfn);
> +        if ( ap2m_active )
> +            __put_gfn(hostp2m, gfn);
>
>          rc = 0;
>          if ( unlikely(is_pvh_domain(currd)) )
> @@ -2957,6 +2994,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>      /* Spurious fault? PoD and log-dirty also take this path. */
>      if ( p2m_is_ram(p2mt) )
>      {
> +        rc = 1;
>          /*
>           * Page log dirty is always done with order 0. If this mfn resides in
>           * a large page, we do not change other pages type within that large
> @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>          if ( npfec.write_access )
>          {
>              paging_mark_dirty(currd, mfn_x(mfn));
> +            /* If p2m is really an altp2m, unlock here to avoid lock ordering
> +             * violation when the change below is propagated from host p2m */
> +            if ( ap2m_active )
> +                __put_gfn(p2m, gfn);
>              p2m_change_type_one(currd, gfn, p2m_ram_logdirty, p2m_ram_rw);
> +            __put_gfn(ap2m_active ? hostp2m : p2m, gfn);
> +
> +            goto out;
>          }
> -        rc = 1;
>          goto out_put_gfn;
>      }
>
> @@ -2977,7 +3021,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>      rc = fall_through;
>
>  out_put_gfn:
> -    put_gfn(currd, gfn);
> +    __put_gfn(p2m, gfn);
> +    if ( ap2m_active )
> +        __put_gfn(hostp2m, gfn);
>  out:
>      /* All of these are delayed until we exit, since we might
>       * sleep on event ring wait queues, and we must not hold
> diff --git a/xen/arch/x86/mm/hap/Makefile b/xen/arch/x86/mm/hap/Makefile
> index 68f2bb5..216cd90 100644
> --- a/xen/arch/x86/mm/hap/Makefile
> +++ b/xen/arch/x86/mm/hap/Makefile
> @@ -4,6 +4,7 @@ obj-y += guest_walk_3level.o
>  obj-$(x86_64) += guest_walk_4level.o
>  obj-y += nested_hap.o
>  obj-y += nested_ept.o
> +obj-y += altp2m_hap.o
>
>  guest_walk_%level.o: guest_walk.c Makefile
>         $(CC) $(CFLAGS) -DGUEST_PAGING_LEVELS=$* -c $< -o $@
> diff --git a/xen/arch/x86/mm/hap/altp2m_hap.c b/xen/arch/x86/mm/hap/altp2m_hap.c
> new file mode 100644
> index 0000000..52c7877
> --- /dev/null
> +++ b/xen/arch/x86/mm/hap/altp2m_hap.c
> @@ -0,0 +1,98 @@
> +/******************************************************************************
> + * arch/x86/mm/hap/altp2m_hap.c
> + *
> + * Copyright (c) 2014 Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
> + */
> +
> +#include <asm/domain.h>
> +#include <asm/page.h>
> +#include <asm/paging.h>
> +#include <asm/p2m.h>
> +#include <asm/hap.h>
> +#include <asm/hvm/altp2m.h>
> +
> +#include "private.h"
> +
> +/*
> + * If the fault is for a not present entry:
> + *     if the entry in the host p2m has a valid mfn, copy it and retry
> + *     else indicate that outer handler should handle fault
> + *
> + * If the fault is for a present entry:
> + *     indicate that outer handler should handle fault
> + */
> +
> +int
> +altp2m_hap_nested_page_fault(struct vcpu *v, paddr_t gpa,
> +                                unsigned long gla, struct npfec npfec,
> +                                struct p2m_domain **ap2m)
> +{
> +    struct p2m_domain *hp2m = p2m_get_hostp2m(v->domain);
> +    p2m_type_t p2mt;
> +    p2m_access_t p2ma;
> +    unsigned int page_order;
> +    gfn_t gfn = _gfn(paddr_to_pfn(gpa));
> +    unsigned long mask;
> +    mfn_t mfn;
> +    int rv;
> +
> +    *ap2m = p2m_get_altp2m(v);
> +
> +    mfn = get_gfn_type_access(*ap2m, gfn_x(gfn), &p2mt, &p2ma,
> +                              0, &page_order);
> +    __put_gfn(*ap2m, gfn_x(gfn));
> +
> +    if ( mfn_x(mfn) != INVALID_MFN )
> +        return 0;
> +
> +    mfn = get_gfn_type_access(hp2m, gfn_x(gfn), &p2mt, &p2ma,
> +                              P2M_ALLOC | P2M_UNSHARE, &page_order);
> +    put_gfn(hp2m->domain, gfn_x(gfn));

So the reason I said my comments weren't blockers for v3 was so that
it could be checked in before the code freeze last Friday if that
turned out to be possible.

Please do at least give this function a name that reflects what it
does (i.e., try to propagate changes from the host p2m to the altp2m),
and change this put_gfn() to match the __put_gfn() above.

I'd prefer it if you moved this into mm/p2m.c as well.

 -George

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 10/15] x86/altp2m: add remaining support routines.
  2015-07-16 14:44   ` George Dunlap
@ 2015-07-17 21:01     ` Sahita, Ravi
  0 siblings, 0 replies; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-17 21:01 UTC (permalink / raw)
  To: George Dunlap, White, Edmund H
  Cc: Wei Liu, Tim Deegan, Ian Jackson, xen-devel, Jan Beulich,
	Andrew Cooper, tlengyel, Daniel De Graaf

>From: dunlapg@gmail.com [mailto:dunlapg@gmail.com] On Behalf Of George
>Dunlap
>Sent: Thursday, July 16, 2015 7:45 AM
>
>On Tue, Jul 14, 2015 at 1:14 AM, Ed White <edmund.h.white@intel.com>
>wrote:
>> Add the remaining routines required to support enabling the alternate
>> p2m functionality.
>>
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>>  xen/arch/x86/hvm/hvm.c           |  58 +++++-
>>  xen/arch/x86/mm/hap/Makefile     |   1 +
>>  xen/arch/x86/mm/hap/altp2m_hap.c |  98 ++++++++++
>>  xen/arch/x86/mm/p2m-ept.c        |   3 +
>>  xen/arch/x86/mm/p2m.c            | 385
>+++++++++++++++++++++++++++++++++++++++
>>  xen/include/asm-x86/hvm/altp2m.h |   4 +
>>  xen/include/asm-x86/p2m.h        |  33 ++++
>>  7 files changed, 576 insertions(+), 6 deletions(-)  create mode
>> 100644 xen/arch/x86/mm/hap/altp2m_hap.c
>>
>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index
>> 38deedc..a9f4b1b 100644
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -2802,10 +2802,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>unsigned long gla,
>>      mfn_t mfn;
>>      struct vcpu *curr = current;
>>      struct domain *currd = curr->domain;
>> -    struct p2m_domain *p2m;
>> +    struct p2m_domain *p2m, *hostp2m;
>>      int rc, fall_through = 0, paged = 0;
>>      int sharing_enomem = 0;
>>      vm_event_request_t *req_ptr = NULL;
>> +    bool_t ap2m_active = 0;
>>
>>      /* On Nested Virtualization, walk the guest page table.
>>       * If this succeeds, all is fine.
>> @@ -2865,11 +2866,31 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>unsigned long gla,
>>          goto out;
>>      }
>>
>> -    p2m = p2m_get_hostp2m(currd);
>> -    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma,
>> +    ap2m_active = altp2m_active(currd);
>> +
>> +    /* Take a lock on the host p2m speculatively, to avoid potential
>> +     * locking order problems later and to handle unshare etc.
>> +     */
>> +    hostp2m = p2m_get_hostp2m(currd);
>> +    mfn = get_gfn_type_access(hostp2m, gfn, &p2mt, &p2ma,
>>                                P2M_ALLOC | (npfec.write_access ? P2M_UNSHARE : 0),
>>                                NULL);
>>
>> +    if ( ap2m_active )
>> +    {
>> +        if ( altp2m_hap_nested_page_fault(curr, gpa, gla, npfec, &p2m) == 1 )
>> +        {
>> +            /* entry was lazily copied from host -- retry */
>> +            __put_gfn(hostp2m, gfn);
>> +            rc = 1;
>> +            goto out;
>> +        }
>> +
>> +        mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 0, NULL);
>> +    }
>> +    else
>> +        p2m = hostp2m;
>> +
>>      /* Check access permissions first, then handle faults */
>>      if ( mfn_x(mfn) != INVALID_MFN )
>>      {
>> @@ -2909,6 +2930,20 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>> unsigned long gla,
>>
>>          if ( violation )
>>          {
>> +            /* Should #VE be emulated for this fault? */
>> +            if ( p2m_is_altp2m(p2m) && !cpu_has_vmx_virt_exceptions )
>> +            {
>> +                bool_t sve;
>> +
>> +                p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL,
>> + &sve);
>> +
>> +                if ( !sve && altp2m_vcpu_emulate_ve(curr) )
>> +                {
>> +                    rc = 1;
>> +                    goto out_put_gfn;
>> +                }
>> +            }
>> +
>>              if ( p2m_mem_access_check(gpa, gla, npfec, &req_ptr) )
>>              {
>>                  fall_through = 1;
>> @@ -2928,7 +2963,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>unsigned long gla,
>>           (npfec.write_access &&
>>            (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
>>      {
>> -        put_gfn(currd, gfn);
>> +        __put_gfn(p2m, gfn);
>> +        if ( ap2m_active )
>> +            __put_gfn(hostp2m, gfn);
>>
>>          rc = 0;
>>          if ( unlikely(is_pvh_domain(currd)) ) @@ -2957,6 +2994,7 @@
>> int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>>      /* Spurious fault? PoD and log-dirty also take this path. */
>>      if ( p2m_is_ram(p2mt) )
>>      {
>> +        rc = 1;
>>          /*
>>           * Page log dirty is always done with order 0. If this mfn resides in
>>           * a large page, we do not change other pages type within
>> that large @@ -2965,9 +3003,15 @@ int
>hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>>          if ( npfec.write_access )
>>          {
>>              paging_mark_dirty(currd, mfn_x(mfn));
>> +            /* If p2m is really an altp2m, unlock here to avoid lock ordering
>> +             * violation when the change below is propagated from host p2m */
>> +            if ( ap2m_active )
>> +                __put_gfn(p2m, gfn);
>>              p2m_change_type_one(currd, gfn, p2m_ram_logdirty,
>> p2m_ram_rw);
>> +            __put_gfn(ap2m_active ? hostp2m : p2m, gfn);
>> +
>> +            goto out;
>>          }
>> -        rc = 1;
>>          goto out_put_gfn;
>>      }
>>
>> @@ -2977,7 +3021,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>unsigned long gla,
>>      rc = fall_through;
>>
>>  out_put_gfn:
>> -    put_gfn(currd, gfn);
>> +    __put_gfn(p2m, gfn);
>> +    if ( ap2m_active )
>> +        __put_gfn(hostp2m, gfn);
>>  out:
>>      /* All of these are delayed until we exit, since we might
>>       * sleep on event ring wait queues, and we must not hold diff
>> --git a/xen/arch/x86/mm/hap/Makefile b/xen/arch/x86/mm/hap/Makefile
>> index 68f2bb5..216cd90 100644
>> --- a/xen/arch/x86/mm/hap/Makefile
>> +++ b/xen/arch/x86/mm/hap/Makefile
>> @@ -4,6 +4,7 @@ obj-y += guest_walk_3level.o
>>  obj-$(x86_64) += guest_walk_4level.o
>>  obj-y += nested_hap.o
>>  obj-y += nested_ept.o
>> +obj-y += altp2m_hap.o
>>
>>  guest_walk_%level.o: guest_walk.c Makefile
>>         $(CC) $(CFLAGS) -DGUEST_PAGING_LEVELS=$* -c $< -o $@ diff
>> --git a/xen/arch/x86/mm/hap/altp2m_hap.c
>> b/xen/arch/x86/mm/hap/altp2m_hap.c
>> new file mode 100644
>> index 0000000..52c7877
>> --- /dev/null
>> +++ b/xen/arch/x86/mm/hap/altp2m_hap.c
>> @@ -0,0 +1,98 @@
>>
>+/*********************************************************
>***********
>> +**********
>> + * arch/x86/mm/hap/altp2m_hap.c
>> + *
>> + * Copyright (c) 2014 Intel Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> +modify
>> + * it under the terms of the GNU General Public License as published
>> +by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
>> +02111-1307  USA  */
>> +
>> +#include <asm/domain.h>
>> +#include <asm/page.h>
>> +#include <asm/paging.h>
>> +#include <asm/p2m.h>
>> +#include <asm/hap.h>
>> +#include <asm/hvm/altp2m.h>
>> +
>> +#include "private.h"
>> +
>> +/*
>> + * If the fault is for a not present entry:
>> + *     if the entry in the host p2m has a valid mfn, copy it and retry
>> + *     else indicate that outer handler should handle fault
>> + *
>> + * If the fault is for a present entry:
>> + *     indicate that outer handler should handle fault
>> + */
>> +
>> +int
>> +altp2m_hap_nested_page_fault(struct vcpu *v, paddr_t gpa,
>> +                                unsigned long gla, struct npfec npfec,
>> +                                struct p2m_domain **ap2m) {
>> +    struct p2m_domain *hp2m = p2m_get_hostp2m(v->domain);
>> +    p2m_type_t p2mt;
>> +    p2m_access_t p2ma;
>> +    unsigned int page_order;
>> +    gfn_t gfn = _gfn(paddr_to_pfn(gpa));
>> +    unsigned long mask;
>> +    mfn_t mfn;
>> +    int rv;
>> +
>> +    *ap2m = p2m_get_altp2m(v);
>> +
>> +    mfn = get_gfn_type_access(*ap2m, gfn_x(gfn), &p2mt, &p2ma,
>> +                              0, &page_order);
>> +    __put_gfn(*ap2m, gfn_x(gfn));
>> +
>> +    if ( mfn_x(mfn) != INVALID_MFN )
>> +        return 0;
>> +
>> +    mfn = get_gfn_type_access(hp2m, gfn_x(gfn), &p2mt, &p2ma,
>> +                              P2M_ALLOC | P2M_UNSHARE, &page_order);
>> +    put_gfn(hp2m->domain, gfn_x(gfn));
>
>So the reason I said my comments weren't blockers for v3 was so that it could
>be checked in before the code freeze last Friday if that turned out to be
>possible.
>
>Please do at least give this function a name that reflects what it does (i.e., try
>to propagate changes from the host p2m to the altp2m), and change this
>put_gfn() to match the __put_gfn() above.
>
>I'd prefer it if you moved this into mm/p2m.c as well.
>
> -George

Will do

Ravi

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-16  9:38       ` Jan Beulich
@ 2015-07-17 21:08         ` Sahita, Ravi
  2015-07-20  6:21           ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-17 21:08 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 16, 2015 2:39 AM
>
>>>> On 16.07.15 at 11:20, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Tuesday, July 14, 2015 6:57 AM
>>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>>> +static bool_t vmx_vcpu_emulate_ve(struct vcpu *v) {
>>>> +    bool_t rc = 0;
>>>> +    ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) !=
>>>> +INVALID_GFN
>>>?
>>>> +        hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn), 0) :
>>>> +NULL;
>>>> +
>>>> +    if ( !veinfo )
>>>> +        return 0;
>>>> +
>>>> +    if ( veinfo->semaphore != 0 )
>>>> +        goto out;
>>>> +
>>>> +    rc = 1;
>>>> +
>>>> +    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
>>>> +    veinfo->semaphore = ~0l;
>>>
>>>Isn't semaphore a 32-bit quantity?
>>
>> Yes.
>
>I.e. the l suffix can and should be dropped.
>

Ok.

>>>> +    {
>>>> +        unsigned long idx;
>>>> +
>>>> +        if ( v->arch.hvm_vmx.secondary_exec_control &
>>>> +            SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
>>>> +            __vmread(EPTP_INDEX, &idx);
>>>> +        else
>>>> +        {
>>>> +            unsigned long eptp;
>>>> +
>>>> +            __vmread(EPT_POINTER, &eptp);
>>>> +
>>>> +            if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
>>>> +                 INVALID_ALTP2M )
>>>> +            {
>>>> +                gdprintk(XENLOG_ERR, "EPTP not found in alternate p2m
>list\n");
>>>> +                domain_crash(v->domain);
>>>> +            }
>>>> +        }
>>>> +
>>>> +        if ( (uint16_t)idx != vcpu_altp2m(v).p2midx )
>>>
>>>Is this cast really necessary?
>>
>> Yes - The index is 16-bits, this reflects how the field is specified
>> in the vmcs also.
>
>While "yes" answers the question, the explanation you give suggests that the
>answer may be wrong: Can idx indeed have bits set beyond bit 15? Because if
>it can't, the cast is pointless.
>

We were just trying to ensure we matched the hardware behavior (I think there was a message George had posted earlier for SVE that asked for that).
Since hardware considers only a 16-bit field, we were doing the same.

>>>> +        {
>>>> +            BUG_ON(idx >= MAX_ALTP2M);
>>>> +            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
>>>> +            vcpu_altp2m(v).p2midx = (uint16_t)idx;
>>>
>>>This one surely isn't (or else the field type is wrong).
>>
>> Again required. idx can't be uint16_t because __vmread() requires
>> unsigned long*, but the index is 16 bits.
>
>But it's a 16-bit VMCS field that you read it from, and hence the upper 48 bits
>are necessarily zero.
>
>Just to re-iterate: Casts are necessary in certain places, yes, but I see them
>used pointlessly or even wrongly more often than not.
>
>Jan

Same approach as above - emulating hardware exactly.
Should we add a comment?

Thanks,
Ravi

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 10/15] x86/altp2m: add remaining support routines.
  2015-07-16  9:34       ` Jan Beulich
@ 2015-07-17 22:32         ` Sahita, Ravi
  2015-07-20  6:53           ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-17 22:32 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 16, 2015 2:34 AM
>
>>>> On 16.07.15 at 11:16, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Tuesday, July 14, 2015 7:32 AM
>>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>>> @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>>>unsigned long gla,
>>>>          if ( npfec.write_access )
>>>>          {
>>>>              paging_mark_dirty(currd, mfn_x(mfn));
>>>> +            /* If p2m is really an altp2m, unlock here to avoid
>>>> + lock
>> ordering
>>>> +             * violation when the change below is propagated from
>>>> + host p2m
>> */
>>>> +            if ( ap2m_active )
>>>> +                __put_gfn(p2m, gfn);
>>>>              p2m_change_type_one(currd, gfn, p2m_ram_logdirty,
>>>> p2m_ram_rw);
>>>
>>>And this won't result in any races?
>>
>> No
>
>To be honest I expected a little more than just "no" here. Now I have to ask -
>why?
>

Yes, I should have described it more than that :-) This part of the code handles the log-dirty transition of the page, and that permission transition always happens on the host p2m. Given the way the locking order is set up (hostp2m -> altp2m-list-lock -> altp2m; there was a separate writeup and discussion with George on that), at this point in the sequence a p2m lock is held (whether it is a hostp2m or an altp2m lock depends on the mode of the domain). The lock has to be dropped here first because of what happens next: the permission change in the host p2m is serially propagated to the altp2ms, and not dropping the lock here would cause a locking order violation. Hope that clarifies.
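
In outline (simplified from the hunk quoted above):

    /* Lock order: hostp2m -> altp2m-list-lock -> altp2m.  p2m here may
     * really be an altp2m; p2m_change_type_one() operates on the host
     * p2m and then propagates the change to every altp2m, taking the
     * altp2m locks - hence the altp2m lock must be dropped first. */
    if ( ap2m_active )
        __put_gfn(p2m, gfn);                     /* drop the altp2m lock */
    p2m_change_type_one(currd, gfn, p2m_ram_logdirty, p2m_ram_rw);
    __put_gfn(ap2m_active ? hostp2m : p2m, gfn); /* drop the host p2m ref */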

>>>> --- a/xen/arch/x86/mm/p2m.c
>>>> +++ b/xen/arch/x86/mm/p2m.c
>>>> @@ -2037,6 +2037,391 @@ bool_t
>p2m_switch_vcpu_altp2m_by_id(struct
>>>vcpu *v, uint16_t idx)
>>>>      return rc;
>>>>  }
>>>>
>>>> +void p2m_flush_altp2m(struct domain *d) {
>>>> +    uint16_t i;
>>>> +
>>>> +    altp2m_list_lock(d);
>>>> +
>>>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>>>> +    {
>>>> +        p2m_flush_table(d->arch.altp2m_p2m[i]);
>>>> +        /* Uninit and reinit ept to force TLB shootdown */
>>>> +        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
>>>> +        ept_p2m_init(d->arch.altp2m_p2m[i]);
>>>
>>>ept_... in non-EPT code again.
>>>
>>
>> There is no non-EPT altp2m implementation, and this file already
>> includes ept.. callouts for p2m's implemented using EPT's.
>
>The only two calls currently there are ept_p2m_{,un}init(), which need to be
>there with the current code structuring. Everything else that's EPT-specific
>should be abstracted through hooks set by ept_p2m_init().
>

We hear your input on this one - we would like to treat this change as a Category 2 change (per the preface message I sent earlier). I hope that's acceptable. We will also prioritize this change above the altp2m data structure reorganization you suggested in patch 5/15.

>>>> +long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx) {
>>>> +    long rc = -EINVAL;
>>>
>>>Why long (for both variable and function return type)? (More of these
>>>in functions below.)
>>
>> Because the error variable in the code that calls these (in hvm.c) is
>> a long, and you had given feedback earlier to propagate the returns
>> from these functions through that calling code.
>
>I don't see the connection. The function only returns zero or -E...
>values, so why would its return type be "long"?
>

do_hvm_op() declares an rc of type "long", and hence this returns a "long".


>>>> +long p2m_init_next_altp2m(struct domain *d, uint16_t *idx) {
>>>> +    long rc = -EINVAL;
>>>> +    uint16_t i;
>>>
>>>As in the earlier patch(es) - unsigned int.
>>
>> Ok, but why does it matter? uint16_t will always suffice.
>
>And will always produce worse code.
>

Ok.

>>>> +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
>>>> +                                 mfn_t mfn, unsigned int page_order,
>>>> +                                 p2m_type_t p2mt, p2m_access_t
>>>> +p2ma) {
>>>> +    struct p2m_domain *p2m;
>>>> +    p2m_access_t a;
>>>> +    p2m_type_t t;
>>>> +    mfn_t m;
>>>> +    uint16_t i;
>>>> +    bool_t reset_p2m;
>>>> +    unsigned int reset_count = 0;
>>>> +    uint16_t last_reset_idx = ~0;
>>>> +
>>>> +    if ( !altp2m_active(d) )
>>>> +        return;
>>>> +
>>>> +    altp2m_list_lock(d);
>>>> +
>>>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>>>> +    {
>>>> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>>>> +            continue;
>>>> +
>>>> +        p2m = d->arch.altp2m_p2m[i];
>>>> +        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
>>>> +
>>>> +        reset_p2m = 0;
>>>> +
>>>> +        /* Check for a dropped page that may impact this altp2m */
>>>> +        if ( mfn_x(mfn) == INVALID_MFN &&
>>>> +             gfn_x(gfn) >= p2m->min_remapped_gfn &&
>>>> +             gfn_x(gfn) <= p2m->max_remapped_gfn )
>>>> +            reset_p2m = 1;
>>>
>>>Considering that this looks like an optimization, what's the downside
>>>of possibly having min=0 and max=<end-of-address-space>? I.e. can a
>>>guest effect a long-latency operation this way?
>>>
>>
>> ... A p2m is a gfn->mfn map, amongst other things. There is a reverse
>> mfn->gfn map, but that is only valid for the host p2m. Unless the
>> remap altp2m hypercall is used, the gfn->mfn map in every altp2m
>> mirrors the gfn->mfn map in the host p2m (or a subset thereof, due to
>> lazy-copy), so handling removal of an mfn from a guest is simple: do a
>> reverse look up for the host p2m and mark the relevant gfn as invalid,
>> then do the same for every altp2m where that gfn is currently valid.
>>
>> Remap changes things: it says take gfn1 and replace ->mfn with the
>> ->mfn of gfn2. Here is where the optimization is used and the  invalidate
>logic is:
>> record the lowest and highest gfn2's that have been used in remap ops;
>> if an mfn is dropped from the hostp2m, for the purposes of altp2m
>> invalidation, see if the gfn derived from the host p2m reverse lookup
>> falls within the range of used gfn2's. If it does, an invalidation is
>> required. Which is why min and max are inited the way they are - hope
>> the explanation clarifies this optimization.
>
>Sadly it doesn't, it just re-states what I already understood and doesn't
>answer the question: What happens if min=0 and max=<end-of-address-
>space>? I.e. can the guest nullify the optimization by careful fiddling issuing
>some of the new hypercalls, and if so will this have any negative impact on the
>hypervisor? I'm asking this from a security standpoint ...
>

To take that exact case: if min=0 and max=<end of address space>, then any host p2m change that drops an mfn will cause all altp2ms to be reset, even if the dropped mfn doesn't affect the altp2ms at all - so the range no longer serves as an optimization at all. Hope that clarifies.

>Nor do I find my question answered why max can't be initialized to zero:
>You don't care whether max is a valid GFN when a certain GFN doesn't fall in
>the (then empty) [min, max] range. What am I missing?

0 is a valid GFN, so we cannot initialize min or max to 0: matching the condition (gfn_x(gfn) >= p2m->min_remapped_gfn && gfn_x(gfn) <= p2m->max_remapped_gfn) causes the altp2m to be reset (thrown away) and rebuilt from the host p2m. Essentially, the range starts out as the empty set and, as remap operations occur, is grown to the smallest set around the affected gfns.
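
A sketch of the encoding, from p2m_init_altp2m_helper() in the patch:

    /* The empty range is expressed with sentinels - gfn 0 is valid, so 0
     * cannot mean "no remaps have happened yet": */
    p2m->min_remapped_gfn = INVALID_GFN;
    p2m->max_remapped_gfn = INVALID_GFN;

With INVALID_GFN being the all-ones sentinel, no real gfn compares >= it, so the check above can never match until the first remap grows the range.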

Ravi

>
>Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-16  9:07       ` Jan Beulich
@ 2015-07-17 22:36         ` Sahita, Ravi
  2015-07-20  6:20           ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-17 22:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 16, 2015 2:08 AM
>
>>>> On 16.07.15 at 10:57, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Tuesday, July 14, 2015 6:13 AM
>>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>>> @@ -722,6 +731,27 @@ void nestedp2m_write_p2m_entry(struct
>>>p2m_domain *p2m, unsigned long gfn,
>>>>      l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
>>>>
>>>>  /*
>>>> + * Alternate p2m: shadow p2m tables used for alternate memory views
>>>> + */
>>>> +
>>>> +/* get current alternate p2m table */ static inline struct
>>>> +p2m_domain *p2m_get_altp2m(struct vcpu *v) {
>>>> +    struct domain *d = v->domain;
>>>> +    uint16_t index = vcpu_altp2m(v).p2midx;
>>>> +
>>>> +    ASSERT(index < MAX_ALTP2M);
>>>> +
>>>> +    return (index == INVALID_ALTP2M) ? NULL :
>>>> +d->arch.altp2m_p2m[index]; }
>>>
>>>Looking at this again, I'm afraid I'd still prefer index < MAX_ALTP2M
>>>in the return statement (and the ASSERT() dropped): The ASSERT() does
>>>nothing in a debug=n build, and hence wouldn't shield us from possibly
>>>having to issue an XSA if somehow an access outside the array's bounds
>turned out possible.
>>>
>>
>> the assert was breaking v5 anyway. BUG_ON (with the right check) is
>> probably the right thing to do, as we do in the exit handling that
>> checks for a VMFUNC having changed the index.
>> So will make that change.
>
>But why use a BUG_ON() when you can deal with this more gracefully? Please
>try to avoid crashing the hypervisor when there are other ways to recover.
>

So in this case there isn't a graceful fallback; this case can happen only if there is a bug in the hypervisor - which should be reported via the BUG_ON.

>>>I've also avoided to repeat any of the un-addressed points that I
>>>raised
>> against
>>>earlier versions.
>>>
>>
>> I went back and looked at the earlier versions of the comments on this
>> patch and afaict we have either addressed (accepted) those points or
>> described why they shouldn't cause changes with reasoning . so if I
>> missed something please let me know.
>
>Just one example of what wasn't done is the conversion of local variable,
>function return, and function parameter types from
>(bogus) uint8_t / uint16_t to unsigned int (iirc also in other patches).

Working through these as best we can (treating it as Category 4).

Thanks,
Ravi


>
>Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-14 14:04   ` Jan Beulich
  2015-07-14 17:56     ` Sahita, Ravi
@ 2015-07-17 22:41     ` Sahita, Ravi
  1 sibling, 0 replies; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-17 22:41 UTC (permalink / raw)
  To: 'Jan Beulich', White, Edmund H
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf


>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Tuesday, July 14, 2015 7:04 AM
>
>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -1436,6 +1436,19 @@ static int hvmemul_invlpg(
>>      return rc;
>>  }
>>
>> +static int hvmemul_vmfunc(
>> +    struct x86_emulate_ctxt *ctxt)
>> +{
>> +    int rc;
>> +
>> +    rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
>> +    if ( rc != X86EMUL_OKAY )
>> +    {
>> +        hvmemul_inject_hw_exception(TRAP_invalid_op, 0, ctxt);
>> +    }
>> +    return rc;
>
>Pointless braces and missing blank line before final return.
>

Will be fixed.
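
I.e., a sketch of the cleaned-up helper (same logic as the quoted hunk):

    static int hvmemul_vmfunc(
        struct x86_emulate_ctxt *ctxt)
    {
        int rc;

        rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
        if ( rc != X86EMUL_OKAY )
            hvmemul_inject_hw_exception(TRAP_invalid_op, 0, ctxt);

        return rc;
    }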

>> @@ -1830,6 +1831,19 @@ static void vmx_vcpu_update_vmfunc_ve(struct
>vcpu *v)
>>      vmx_vmcs_exit(v);
>>  }
>>
>> +static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs) {
>> +    int rc = X86EMUL_EXCEPTION;
>> +    struct vcpu *curr = current;
>> +
>> +    if ( !cpu_has_vmx_vmfunc && altp2m_active(curr->domain) &&
>> +         regs->eax == 0 &&
>> +         p2m_switch_vcpu_altp2m_by_id(curr, (uint16_t)regs->ecx) )
>
>Documentation suggests that the upper 32 bits of RAX are being ignored, and
>that all 32 bits of ECX are being used.

Will fix to use _eax and _ecx; ECX is expected by hardware to be < 512, which the p2m_switch_... function checks. See the sketch below.
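
Roughly (a sketch, assuming Xen's _eax/_ecx 32-bit register accessors):

    static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
    {
        int rc = X86EMUL_EXCEPTION;
        struct vcpu *curr = current;

        /* The upper 32 bits of RAX are ignored and all 32 bits of ECX
         * are passed on; p2m_switch_vcpu_altp2m_by_id() rejects
         * out-of-range indices. */
        if ( !cpu_has_vmx_vmfunc && altp2m_active(curr->domain) &&
             regs->_eax == 0 &&
             p2m_switch_vcpu_altp2m_by_id(curr, regs->_ecx) )
            rc = X86EMUL_OKAY;

        return rc;
    }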

>
>> @@ -3234,6 +3263,15 @@ void vmx_vmexit_handler(struct cpu_user_regs
>*regs)
>>              update_guest_eip();
>>          break;
>>
>> +    case EXIT_REASON_VMFUNC:
>> +        if ( (vmx_vmfunc_intercept(regs) == X86EMUL_EXCEPTION) ||
>> +             (vmx_vmfunc_intercept(regs) == X86EMUL_UNHANDLEABLE) ||
>> +             (vmx_vmfunc_intercept(regs) == X86EMUL_RETRY) )
>
>Why would you want to invoke the function 3 times? How about simply !=
>X86EMUL_OKAY?

Yuck - sorry, introduced in error of course - will fix.
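
The corrected handler invokes the intercept once (a sketch; the exact success and failure actions shown are assumptions, not necessarily the series' final code):

    case EXIT_REASON_VMFUNC:
        /* Call the intercept once and test the single result. */
        if ( vmx_vmfunc_intercept(regs) != X86EMUL_OKAY )
            hvm_inject_hw_exception(TRAP_invalid_op,
                                    HVM_DELIVER_NO_ERROR_CODE);
        else
            update_guest_eip();
        break;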

Thanks,
Ravi

>
>Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-17 22:36         ` Sahita, Ravi
@ 2015-07-20  6:20           ` Jan Beulich
  2015-07-21  5:18             ` Sahita, Ravi
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2015-07-20  6:20 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 18.07.15 at 00:36, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 16, 2015 2:08 AM
>>
>>>>> On 16.07.15 at 10:57, <ravi.sahita@intel.com> wrote:
>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>Sent: Tuesday, July 14, 2015 6:13 AM
>>>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>>>> @@ -722,6 +731,27 @@ void nestedp2m_write_p2m_entry(struct
>>>>p2m_domain *p2m, unsigned long gfn,
>>>>>      l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
>>>>>
>>>>>  /*
>>>>> + * Alternate p2m: shadow p2m tables used for alternate memory views
>>>>> + */
>>>>> +
>>>>> +/* get current alternate p2m table */ static inline struct
>>>>> +p2m_domain *p2m_get_altp2m(struct vcpu *v) {
>>>>> +    struct domain *d = v->domain;
>>>>> +    uint16_t index = vcpu_altp2m(v).p2midx;
>>>>> +
>>>>> +    ASSERT(index < MAX_ALTP2M);
>>>>> +
>>>>> +    return (index == INVALID_ALTP2M) ? NULL :
>>>>> +d->arch.altp2m_p2m[index]; }
>>>>
>>>>Looking at this again, I'm afraid I'd still prefer index < MAX_ALTP2M
>>>>in the return statement (and the ASSERT() dropped): The ASSERT() does
>>>>nothing in a debug=n build, and hence wouldn't shield us from possibly
>>>>having to issue an XSA if somehow an access outside the array's bounds
>>turned out possible.
>>>>
>>>
>>> the assert was breaking v5 anyway. BUG_ON (with the right check) is
>>> probably the right thing to do, as we do in the exit handling that
>>> checks for a VMFUNC having changed the index.
>>> So will make that change.
>>
>>But why use a BUG_ON() when you can deal with this more gracefully? Please
>>try to avoid crashing the hypervisor when there are other ways to recover.
>>
> 
> So in this case there isn't a graceful fallback; this case can happen only if 
> there is a bug in the hypervisor - which should be reported via the BUG_ON.

Generally (as an example), if a hypervisor bug can be confined to a
guest, killing just the guest instead of the hypervisor would still be
preferred (allowing the admin to attempt to gracefully shut down
other guests before updating/restarting).
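
A minimal sketch of the kind of graceful variant meant here (using the names from the quoted hunk):

    /* The range check moves into the return statement, so an
     * out-of-range index yields NULL in debug and release builds
     * alike, instead of relying on a debug-only ASSERT() or crashing
     * via BUG_ON(). */
    static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
    {
        unsigned int index = vcpu_altp2m(v).p2midx;

        return (index < MAX_ALTP2M) ? v->domain->arch.altp2m_p2m[index]
                                    : NULL;
    }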

Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-17 21:08         ` Sahita, Ravi
@ 2015-07-20  6:21           ` Jan Beulich
  2015-07-21  5:49             ` Sahita, Ravi
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2015-07-20  6:21 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 17.07.15 at 23:08, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 16, 2015 2:39 AM
>>
>>>>> On 16.07.15 at 11:20, <ravi.sahita@intel.com> wrote:
>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>Sent: Tuesday, July 14, 2015 6:57 AM
>>>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>>>> +static bool_t vmx_vcpu_emulate_ve(struct vcpu *v) {
>>>>> +    bool_t rc = 0;
>>>>> +    ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) !=
>>>>> +INVALID_GFN
>>>>?
>>>>> +        hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn), 0) :
>>>>> +NULL;
>>>>> +
>>>>> +    if ( !veinfo )
>>>>> +        return 0;
>>>>> +
>>>>> +    if ( veinfo->semaphore != 0 )
>>>>> +        goto out;
>>>>> +
>>>>> +    rc = 1;
>>>>> +
>>>>> +    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
>>>>> +    veinfo->semaphore = ~0l;
>>>>
>>>>Isn't semaphore a 32-bit quantity?
>>>
>>> Yes.
>>
>>I.e. the l suffix can and should be dropped.
>>
> 
> Ok.
> 
>>>>> +    {
>>>>> +        unsigned long idx;
>>>>> +
>>>>> +        if ( v->arch.hvm_vmx.secondary_exec_control &
>>>>> +            SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
>>>>> +            __vmread(EPTP_INDEX, &idx);
>>>>> +        else
>>>>> +        {
>>>>> +            unsigned long eptp;
>>>>> +
>>>>> +            __vmread(EPT_POINTER, &eptp);
>>>>> +
>>>>> +            if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
>>>>> +                 INVALID_ALTP2M )
>>>>> +            {
>>>>> +                gdprintk(XENLOG_ERR, "EPTP not found in alternate p2m
>>list\n");
>>>>> +                domain_crash(v->domain);
>>>>> +            }
>>>>> +        }
>>>>> +
>>>>> +        if ( (uint16_t)idx != vcpu_altp2m(v).p2midx )
>>>>
>>>>Is this cast really necessary?
>>>
>>> Yes - The index is 16-bits, this reflects how the field is specified
>>> in the vmcs also.
>>
>>While "yes" answers the question, the explanation you give suggests that the
>>answer may be wrong: Can idx indeed have bits set beyond bit 15? Because if
>>it can't, the cast is pointless.
>>
> 
> We were just trying to ensure we matched the hardware behavior (I think 
> there was a message George had posted earlier for SVE that asked for that).
> Since hardware considers only a 16 bit field we were doing the same.
> 
>>>>> +        {
>>>>> +            BUG_ON(idx >= MAX_ALTP2M);
>>>>> +            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
>>>>> +            vcpu_altp2m(v).p2midx = (uint16_t)idx;
>>>>
>>>>This one surely isn't (or else the field type is wrong).
>>>
>>> Again required. idx can't be uint16_t because __vmread() requires
>>> unsigned long*, but the index is 16 bits.
>>
>>But it's a 16-bit VMCS field that you read it from, and hence the upper 48 
> bits
>>are necessarily zero.
>>
>>Just to re-iterate: Casts are necessary in certain places, yes, but I see them
>>used pointlessly or even wrongly more often than not.
> 
> Same approach as above - emulating hardware exactly.
> Should we add a comment?

No, drop the casts.
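
I.e., a sketch with the casts dropped (EPTP_INDEX is a 16-bit VMCS field, so the upper bits of idx are necessarily zero):

    unsigned long idx;

    __vmread(EPTP_INDEX, &idx);
    if ( idx != vcpu_altp2m(v).p2midx )
    {
        BUG_ON(idx >= MAX_ALTP2M);
        atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
        vcpu_altp2m(v).p2midx = idx;
    }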

Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 10/15] x86/altp2m: add remaining support routines.
  2015-07-17 22:32         ` Sahita, Ravi
@ 2015-07-20  6:53           ` Jan Beulich
  2015-07-21  5:46             ` Sahita, Ravi
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2015-07-20  6:53 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 18.07.15 at 00:32, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 16, 2015 2:34 AM
>>
>>>>> On 16.07.15 at 11:16, <ravi.sahita@intel.com> wrote:
>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>Sent: Tuesday, July 14, 2015 7:32 AM
>>>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>>>> @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>>>>unsigned long gla,
>>>>>          if ( npfec.write_access )
>>>>>          {
>>>>>              paging_mark_dirty(currd, mfn_x(mfn));
>>>>> +            /* If p2m is really an altp2m, unlock here to avoid
>>>>> + lock
>>> ordering
>>>>> +             * violation when the change below is propagated from
>>>>> + host p2m
>>> */
>>>>> +            if ( ap2m_active )
>>>>> +                __put_gfn(p2m, gfn);
>>>>>              p2m_change_type_one(currd, gfn, p2m_ram_logdirty,
>>>>> p2m_ram_rw);
>>>>
>>>>And this won't result in any races?
>>>
>>> No
>>
>>To be honest I expected a little more than just "no" here. Now I have to ask -
>>why?
>>
> 
> Yes, I should have described it more than that :-)  so this part of the code 
> is handling the log dirty transition of the page, and this page permission 
> transition always happens on the hostp2m. Given the way the locking order is 
> set up (hostp2m->altp2m-list-lock->altp2m; there was a separate writeup and 
> discussion with George on that), at this point in this sequence there is a 
> p2m lock (whether it's a hostp2m or altp2m lock depends on the mode of the 
> domain) - the reason we have to drop the lock here first is due to what 
> happens next; the permission changes in hostp2m will be serially propagated 
> to altp2ms and not dropping the lock here would cause a locking order 
> violation. Hope that clarifies. 

Sadly it doesn't at all: You re-explain why you need to drop the lock,
while you fail to say anything on why this won't cause a race.

>>>>> +long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx) {
>>>>> +    long rc = -EINVAL;
>>>>
>>>>Why long (for both variable and function return type)? (More of these
>>>>in functions below.)
>>>
>>> Because the error variable in the code that calls these (in hvm.c) is
>>> a long, and you had given feedback earlier to propagate the returns
>>> from these functions through that calling code.
>>
>>I don't see the connection. The function only returns zero or -E...
>>values, so why would its return type be "long"?
>>
> 
> do_hvm_op declares a rc that is of type "long" and hence this returns a 
> "long"

What type your caller(s) return is of no interest at all here: What
would you do if you had multiple callers with differing return types?
A function's return type should be chosen based on the range of
values it may return, and the result possibly widened to not yield
inefficient code (as would be necessary in some of the uint16_t cases
elsewhere in the series).

>>>>> +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
>>>>> +                                 mfn_t mfn, unsigned int page_order,
>>>>> +                                 p2m_type_t p2mt, p2m_access_t
>>>>> +p2ma) {
>>>>> +    struct p2m_domain *p2m;
>>>>> +    p2m_access_t a;
>>>>> +    p2m_type_t t;
>>>>> +    mfn_t m;
>>>>> +    uint16_t i;
>>>>> +    bool_t reset_p2m;
>>>>> +    unsigned int reset_count = 0;
>>>>> +    uint16_t last_reset_idx = ~0;
>>>>> +
>>>>> +    if ( !altp2m_active(d) )
>>>>> +        return;
>>>>> +
>>>>> +    altp2m_list_lock(d);
>>>>> +
>>>>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>>>>> +    {
>>>>> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>>>>> +            continue;
>>>>> +
>>>>> +        p2m = d->arch.altp2m_p2m[i];
>>>>> +        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
>>>>> +
>>>>> +        reset_p2m = 0;
>>>>> +
>>>>> +        /* Check for a dropped page that may impact this altp2m */
>>>>> +        if ( mfn_x(mfn) == INVALID_MFN &&
>>>>> +             gfn_x(gfn) >= p2m->min_remapped_gfn &&
>>>>> +             gfn_x(gfn) <= p2m->max_remapped_gfn )
>>>>> +            reset_p2m = 1;
>>>>
>>>>Considering that this looks like an optimization, what's the downside
>>>>of possibly having min=0 and max=<end-of-address-space>? I.e.
>>>>can there a long latency operation result that's this way a guest can effect?
>>>>
>>>
>>> ... A p2m is a gfn->mfn map, amongst other things. There is a reverse
>>> mfn->gfn map, but that is only valid for the host p2m. Unless the
>>> remap altp2m hypercall is used, the gfn->mfn map in every altp2m
>>> mirrors the gfn->mfn map in the host p2m (or a subset thereof, due to
>>> lazy-copy), so handling removal of an mfn from a guest is simple: do a
>>> reverse look up for the host p2m and mark the relevant gfn as invalid,
>>> then do the same for every altp2m where that gfn is currently valid.
>>>
>>> Remap changes things: it says take gfn1 and replace ->mfn with the
>>> ->mfn of gfn2. Here is where the optimization is used and the  invalidate
>>logic is:
>>> record the lowest and highest gfn2's that have been used in remap ops;
>>> if an mfn is dropped from the hostp2m, for the purposes of altp2m
>>> invalidation, see if the gfn derived from the host p2m reverse lookup
>>> falls within the range of used gfn2's. If it does, an invalidation is
>>> required. Which is why min and max are inited the way they are - hope
>>> the explanation clarifies this optimization.
>>
>>Sadly it doesn't, it just re-states what I already understood and doesn't
>>answer the question: What happens if min=0 and max=<end-of-address-
>>space>? I.e. can the guest nullify the optimization by careful fiddling 
> issuing
>>some of the new hypercalls, and if so will this have any negative impact on 
> the
>>hypervisor? I'm asking this from a security standpoint ...
>>
> 
> To take that exact case, if min=0 and max=<end of address space> then any 
> hostp2m change where the first mfn is dropped will cause all altp2ms to be 
> reset even if the mfn dropped doesn't affect altp2ms at all, which won't serve 
> as an optimization at all - hope that clarifies.

Again - no. I understand the optimization is gone then. But what's the
effect? I.e. will the guest, by extending this range to be arbitrarily
wide, be able to cause a long latency hypervisor operation (i.e. a DoS)?

>>Nor do I find my question answered why max can't be initialized to zero:
>>You don't care whether max is a valid GFN when a certain GFN doesn't fall in
>>the (then empty) [min, max] range. What am I missing?
> 
> Since 0 is a valid GFN, we cannot initialize min or max to 0 - matching this 
> condition (gfn_x(gfn) >= p2m->min_remapped_gfn && gfn_x(gfn) <= 
> p2m->max_remapped_gfn) causes a reset (throw-away) of the altp2m, rebuilding 
> it from the hostp2m. So essentially the range starts out as the empty set, 
> and only when some altp2m changes occur is it grown to be the smallest set 
> covering the gfns affected.

Again you just re-state what was already clear, yet you neglect
answering the actual question. Taking what you wrote above,
when max=0 (and min=INVALID_GFN), then

	gfn_x(gfn) >= p2m->min_remapped_gfn &&
	 gfn_x(gfn) <= p2m->max_remapped_gfn

will be false for any value of gfn; in fact the "max" part won't
even be looked at because the "min" part will already be false
for any valid gfn, i.e. only the INVALID_GFN case would make it
to the "max" part.

Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-20  6:20           ` Jan Beulich
@ 2015-07-21  5:18             ` Sahita, Ravi
  0 siblings, 0 replies; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-21  5:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Sunday, July 19, 2015 11:20 PM
>
>>>> On 18.07.15 at 00:36, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Thursday, July 16, 2015 2:08 AM
>>>
>>>>>> On 16.07.15 at 10:57, <ravi.sahita@intel.com> wrote:
>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>Sent: Tuesday, July 14, 2015 6:13 AM
>>>>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>>>>> @@ -722,6 +731,27 @@ void nestedp2m_write_p2m_entry(struct
>>>>>p2m_domain *p2m, unsigned long gfn,
>>>>>>      l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
>>>>>>
>>>>>>  /*
>>>>>> + * Alternate p2m: shadow p2m tables used for alternate memory
>>>>>> + views */
>>>>>> +
>>>>>> +/* get current alternate p2m table */ static inline struct
>>>>>> +p2m_domain *p2m_get_altp2m(struct vcpu *v) {
>>>>>> +    struct domain *d = v->domain;
>>>>>> +    uint16_t index = vcpu_altp2m(v).p2midx;
>>>>>> +
>>>>>> +    ASSERT(index < MAX_ALTP2M);
>>>>>> +
>>>>>> +    return (index == INVALID_ALTP2M) ? NULL :
>>>>>> +d->arch.altp2m_p2m[index]; }
>>>>>
>>>>>Looking at this again, I'm afraid I'd still prefer index <
>>>>>MAX_ALTP2M in the return statement (and the ASSERT() dropped): The
>>>>>ASSERT() does nothing in a debug=n build, and hence wouldn't shield
>>>>>us from possibly having to issue an XSA if somehow an access outside
>>>>>the array's bounds
>>>turned out possible.
>>>>>
>>>>
>>>> the assert was breaking v5 anyway. BUG_ON (with the right check) is
>>>> probably the right thing to do, as we do in the exit handling that
>>>> checks for a VMFUNC having changed the index.
>>>> So will make that change.
>>>
>>>But why use a BUG_ON() when you can deal with this more gracefully?
>>>Please try to avoid crashing the hypervisor when there are other ways to
>recover.
>>>
>>
>> So in this case there isn't a graceful fallback; this case can happen
>> only if there is a bug in the hypervisor - which should be reported via the
>BUG_ON.
>
>Generally (as an example), if a hypervisor bug can be confined to a guest,
>killing just the guest instead of the hypervisor would still be preferred
>(allowing the admin to attempt to gracefully shut down other guests before
>updating/restarting).
>
>Jan

I agree with that principle, and that's what I was looking to do in the last iteration, but in this sort of error condition there is no telling what else could have gone wrong on the hypervisor side to cause this, so the crash treatment seems suitable.

Ravi

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 10/15] x86/altp2m: add remaining support routines.
  2015-07-20  6:53           ` Jan Beulich
@ 2015-07-21  5:46             ` Sahita, Ravi
  2015-07-21  6:38               ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-21  5:46 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Sunday, July 19, 2015 11:53 PM
>
>>>> On 18.07.15 at 00:32, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Thursday, July 16, 2015 2:34 AM
>>>
>>>>>> On 16.07.15 at 11:16, <ravi.sahita@intel.com> wrote:
>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>Sent: Tuesday, July 14, 2015 7:32 AM
>>>>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>>>>> @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t
>gpa,
>>>>>unsigned long gla,
>>>>>>          if ( npfec.write_access )
>>>>>>          {
>>>>>>              paging_mark_dirty(currd, mfn_x(mfn));
>>>>>> +            /* If p2m is really an altp2m, unlock here to avoid
>>>>>> + lock
>>>> ordering
>>>>>> +             * violation when the change below is propagated from
>>>>>> + host p2m
>>>> */
>>>>>> +            if ( ap2m_active )
>>>>>> +                __put_gfn(p2m, gfn);
>>>>>>              p2m_change_type_one(currd, gfn, p2m_ram_logdirty,
>>>>>> p2m_ram_rw);
>>>>>
>>>>>And this won't result in any races?
>>>>
>>>> No
>>>
>>>To be honest I expected a little more than just "no" here. Now I have
>>>to ask - why?
>>>
>>
>> Yes, I should have described it more than that :-)  so this part of
>> the code is handling the log dirty transition of the page, and this
>> page permission transition happens always on the hostp2m. Given the
>> way the locking order is setup (hostp2m->altp2m-list-lock->altp2m and
>> there was a separate writeup and discussion with George on that), at
>> this point in this sequence there is a p2m lock (whether it's a
>> hostp2m or altp2m lock depends on the mode of the
>> domain) - the reason we have to drop the lock here first is due to
>> what happens next; the permission changes in hostp2m will be serially
>> propagated to altp2ms and not dropping the lock here would cause a
>> locking order violation. Hope that clarifies.
>
>Sadly it doesn't at all: You re-explain why you need to drop the lock, while you
>fail to say anything on why this won't cause a race.
>

It only drops the lock on the altp2m, which is no longer required in this function anyway. The important aspect is that there is still a lock held on the host p2m, and that is dropped after the log-dirty updates, as it would be in the non-altp2m case (maybe that was the part I should have explained clearly in the para above). Does that clarify, or do you see a particular race condition here? (We don't.)
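
For reference, an annotated sketch of the sequence in question (simplified from the quoted hunk):

    if ( npfec.write_access )
    {
        paging_mark_dirty(currd, mfn_x(mfn));
        /* Drop only the altp2m's gfn lock; it is not needed below. */
        if ( ap2m_active )
            __put_gfn(p2m, gfn);
        /* The hostp2m lock is still held across the type change and
         * is released afterwards, exactly as in the non-altp2m case. */
        p2m_change_type_one(currd, gfn, p2m_ram_logdirty, p2m_ram_rw);
    }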

>>>>>> +long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx) {
>>>>>> +    long rc = -EINVAL;
>>>>>
>>>>>Why long (for both variable and function return type)? (More of
>>>>>these in functions below.)
>>>>
>>>> Because the error variable in the code that calls these (in hvm.c)
>>>> is a long, and you had given feedback earlier to propagate the
>>>> returns from these functions through that calling code.
>>>
>>>I don't see the connection. The function only returns zero or -E...
>>>values, so why would its return type be "long"?
>>>
>>
>> do_hvm_op declares a rc that is of type "long" and hence this returns
>> a "long"
>
>What type your caller(s) return is of no interest at all here: What would you do
>if you had multiple callers with differing return types?
>A function's return type should be chosen based on the range of values it may
>return, and the result possibly widened to not yield inefficient code (like in
>some of the uint16_t cases elsewhere in the series would be necessary).
>

What do you suggest the return type be?

>>>>>> +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
>>>>>> +                                 mfn_t mfn, unsigned int page_order,
>>>>>> +                                 p2m_type_t p2mt, p2m_access_t
>>>>>> +p2ma) {
>>>>>> +    struct p2m_domain *p2m;
>>>>>> +    p2m_access_t a;
>>>>>> +    p2m_type_t t;
>>>>>> +    mfn_t m;
>>>>>> +    uint16_t i;
>>>>>> +    bool_t reset_p2m;
>>>>>> +    unsigned int reset_count = 0;
>>>>>> +    uint16_t last_reset_idx = ~0;
>>>>>> +
>>>>>> +    if ( !altp2m_active(d) )
>>>>>> +        return;
>>>>>> +
>>>>>> +    altp2m_list_lock(d);
>>>>>> +
>>>>>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>>>>>> +    {
>>>>>> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>>>>>> +            continue;
>>>>>> +
>>>>>> +        p2m = d->arch.altp2m_p2m[i];
>>>>>> +        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0,
>>>>>> + NULL);
>>>>>> +
>>>>>> +        reset_p2m = 0;
>>>>>> +
>>>>>> +        /* Check for a dropped page that may impact this altp2m */
>>>>>> +        if ( mfn_x(mfn) == INVALID_MFN &&
>>>>>> +             gfn_x(gfn) >= p2m->min_remapped_gfn &&
>>>>>> +             gfn_x(gfn) <= p2m->max_remapped_gfn )
>>>>>> +            reset_p2m = 1;
>>>>>
>>>>>Considering that this looks like an optimization, what's the
>>>>>downside of possibly having min=0 and max=<end-of-address-space>?
>I.e.
>>>>>can there a long latency operation result that's this way a guest can
>effect?
>>>>>
>>>>
>>>> ... A p2m is a gfn->mfn map, amongst other things. There is a
>>>> reverse
>>>> mfn->gfn map, but that is only valid for the host p2m. Unless the
>>>> remap altp2m hypercall is used, the gfn->mfn map in every altp2m
>>>> mirrors the gfn->mfn map in the host p2m (or a subset thereof, due
>>>> to lazy-copy), so handling removal of an mfn from a guest is simple:
>>>> do a reverse look up for the host p2m and mark the relevant gfn as
>>>> invalid, then do the same for every altp2m where that gfn is currently
>valid.
>>>>
>>>> Remap changes things: it says take gfn1 and replace ->mfn with the
>>>> ->mfn of gfn2. Here is where the optimization is used and the
>>>> ->invalidate
>>>logic is:
>>>> record the lowest and highest gfn2's that have been used in remap
>>>> ops; if an mfn is dropped from the hostp2m, for the purposes of
>>>> altp2m invalidation, see if the gfn derived from the host p2m
>>>> reverse lookup falls within the range of used gfn2's. If it does, an
>>>> invalidation is required. Which is why min and max are inited the
>>>> way they are - hope the explanation clarifies this optimization.
>>>
>>>Sadly it doesn't, it just re-states what I already understood and
>>>doesn't answer the question: What happens if min=0 and
>>>max=<end-of-address-
>>>space>? I.e. can the guest nullify the optimization by careful
>>>space>fiddling
>> issuing
>>>some of the new hypercalls, and if so will this have any negative
>>>impact on
>> the
>>>hypervisor? I'm asking this from a security standpoint ...
>>>
>>
>> To take that exact case, If min=0 and max=<end of address space> then
>> any hostp2m change where the first mfn is dropped, will cause all
>> altp2ms to be reset even if the mfn dropped doesn't affect altp2ms at
>> all, which wont serve as an optimization at all - Hope that clarifies.
>
>Again - no. I understand the optimization is gone then. But what's the effect?
>I.e. will the guest, by extending this range to be arbitrarily wide, be able to
>cause a long latency hypervisor operation (i.e. a DoS)?
>

The extent of the range affects the likelihood of an invalidation. It has no impact on the cost of an invalidation (so no, it's not a DoS issue).
I'm not sure whether you are suggesting a change here or just asking for clarification (if you think this optimization is confusing, perhaps some documentation of it would help?).


>>>Nor do I find my question answered why max can't be initialized to zero:
>>>You don't care whether max is a valid GFN when a certain GFN doesn't
>>>fall in the (then empty) [min, max] range. What am I missing?
>>
>> Since 0 is a valid GFN so we cannot initialize min or max to 0 - since
>> matching this condition (gfn_x(gfn) >= p2m->min_remapped_gfn &&
>> gfn_x(gfn) <=
>> p2m->max_remapped_gfn) will cause a reset (throw away) of the altp2m
>> p2m->to
>> rebuild it from the hostp2m. So essentially what is being done here is
>> the range is the non-existent set to start with, unless some altp2m
>> changes occur, and then it is grown to be the smallest set around the gfns
>affected.
>
>Again you just re-state what was already clear, yet you neglect answering the
>actual question. Taking what you wrote above, when max=0 (and
>min=INVALID_GFN), then
>
>	gfn_x(gfn) >= p2m->min_remapped_gfn &&
>	 gfn_x(gfn) <= p2m->max_remapped_gfn
>
>will be false for any value of gfn; in fact the "max" part won't even be looked
>at because the "min" part will already be false for any valid gfn, i.e. only the
>INVALID_GFN case would make it to the "max" part.
>

You are suggesting an alternative which will work, but what we have will also produce the same results, and the results are correct for both cases - so can we go with the approach we have taken currently?

Ravi



>Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-20  6:21           ` Jan Beulich
@ 2015-07-21  5:49             ` Sahita, Ravi
  0 siblings, 0 replies; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-21  5:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Sunday, July 19, 2015 11:22 PM
>
>>>> On 17.07.15 at 23:08, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Thursday, July 16, 2015 2:39 AM
>>>
>>>>>> On 16.07.15 at 11:20, <ravi.sahita@intel.com> wrote:
>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>Sent: Tuesday, July 14, 2015 6:57 AM
>>>>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>>>>> +static bool_t vmx_vcpu_emulate_ve(struct vcpu *v) {
>>>>>> +    bool_t rc = 0;
>>>>>> +    ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) !=
>>>>>> +INVALID_GFN
>>>>>?
>>>>>> +        hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn),
>0) :
>>>>>> +NULL;
>>>>>> +
>>>>>> +    if ( !veinfo )
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    if ( veinfo->semaphore != 0 )
>>>>>> +        goto out;
>>>>>> +
>>>>>> +    rc = 1;
>>>>>> +
>>>>>> +    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
>>>>>> +    veinfo->semaphore = ~0l;
>>>>>
>>>>>Isn't semaphore a 32-bit quantity?
>>>>
>>>> Yes.
>>>
>>>I.e. the l suffix can and should be dropped.
>>>
>>
>> Ok.
>>
>>>>>> +    {
>>>>>> +        unsigned long idx;
>>>>>> +
>>>>>> +        if ( v->arch.hvm_vmx.secondary_exec_control &
>>>>>> +            SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
>>>>>> +            __vmread(EPTP_INDEX, &idx);
>>>>>> +        else
>>>>>> +        {
>>>>>> +            unsigned long eptp;
>>>>>> +
>>>>>> +            __vmread(EPT_POINTER, &eptp);
>>>>>> +
>>>>>> +            if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
>>>>>> +                 INVALID_ALTP2M )
>>>>>> +            {
>>>>>> +                gdprintk(XENLOG_ERR, "EPTP not found in alternate
>>>>>> + p2m
>>>list\n");
>>>>>> +                domain_crash(v->domain);
>>>>>> +            }
>>>>>> +        }
>>>>>> +
>>>>>> +        if ( (uint16_t)idx != vcpu_altp2m(v).p2midx )
>>>>>
>>>>>Is this cast really necessary?
>>>>
>>>> Yes - The index is 16-bits, this reflects how the field is specified
>>>> in the vmcs also.
>>>
>>>While "yes" answers the question, the explanation you give suggests
>>>that the answer may be wrong: Can idx indeed have bits set beyond bit
>>>15? Because if it can't, the cast is pointless.
>>>
>>
>> We were just trying to ensure we matched the hardware behavior (I
>> think there was a message George had posted earlier for SVE that asked for
>that).
>> Since hardware considers only a 16 bit field we were doing the same.
>>
>>>>>> +        {
>>>>>> +            BUG_ON(idx >= MAX_ALTP2M);
>>>>>> +            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
>>>>>> +            vcpu_altp2m(v).p2midx = (uint16_t)idx;
>>>>>
>>>>>This one surely isn't (or else the field type is wrong).
>>>>
>>>> Again required. idx can't be uint16_t because __vmread() requires
>>>> unsigned long*, but the index is 16 bits.
>>>
>>>But it's a 16-bit VMCS field that you read it from, and hence the
>>>upper 48
>> bits
>>>are necessarily zero.
>>>
>>>Just to re-iterate: Casts are necessary in certain places, yes, but I
>>>see them used pointlessly or even wrongly more often than not.
>>
>> Same approach as above - emulating hardware exactly.
>> Should we add a comment?
>
>No, drop the casts.
>
>Jan

Ok

Ravi

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 10/15] x86/altp2m: add remaining support routines.
  2015-07-21  5:46             ` Sahita, Ravi
@ 2015-07-21  6:38               ` Jan Beulich
  2015-07-21 18:33                 ` Sahita, Ravi
  0 siblings, 1 reply; 56+ messages in thread
From: Jan Beulich @ 2015-07-21  6:38 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 21.07.15 at 07:46, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Sunday, July 19, 2015 11:53 PM
>>>>> On 18.07.15 at 00:32, <ravi.sahita@intel.com> wrote:
>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>Sent: Thursday, July 16, 2015 2:34 AM
>>>>
>>>>>>> On 16.07.15 at 11:16, <ravi.sahita@intel.com> wrote:
>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>Sent: Tuesday, July 14, 2015 7:32 AM
>>>>>>>>> On 14.07.15 at 02:14, <edmund.h.white@intel.com> wrote:
>>>>>>> @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>>>>>>unsigned long gla,
>>>>>>>          if ( npfec.write_access )
>>>>>>>          {
>>>>>>>              paging_mark_dirty(currd, mfn_x(mfn));
>>>>>>> +            /* If p2m is really an altp2m, unlock here to avoid
>>>>>>> + lock
>>>>> ordering
>>>>>>> +             * violation when the change below is propagated from
>>>>>>> + host p2m
>>>>> */
>>>>>>> +            if ( ap2m_active )
>>>>>>> +                __put_gfn(p2m, gfn);
>>>>>>>              p2m_change_type_one(currd, gfn, p2m_ram_logdirty,
>>>>>>> p2m_ram_rw);
>>>>>>
>>>>>>And this won't result in any races?
>>>>>
>>>>> No
>>>>
>>>>To be honest I expected a little more than just "no" here. Now I have
>>>>to ask - why?
>>>>
>>>
>>> Yes, I should have described it more than that :-)  so this part of
>>> the code is handling the log dirty transition of the page, and this
>>> page permission transition happens always on the hostp2m. Given the
>>> way the locking order is setup (hostp2m->altp2m-list-lock->altp2m and
>>> there was a separate writeup and discussion with George on that), at
>>> this point in this sequence there is a p2m lock (whether it's a
>>> hostp2m or altp2m lock depends on the mode of the
>>> domain) - the reason we have to drop the lock here first is due to
>>> what happens next; the permission changes in hostp2m will be serially
>>> propagated to altp2ms and not dropping the lock here would cause a
>>> locking order violation. Hope that clarifies.
>>
>>Sadly it doesn't at all: You re-explain why you need to drop the lock, while 
> you
>>fail to say anything on why this won't cause a race.
>>
> 
> It only drops the lock on the altp2m, which is no longer required in this 
> function anyway. The important aspect is that there is still a lock held on 
> the host p2m, and that is dropped after the log-dirty updates, as it would be 
> in the non-altp2m case (maybe that was the part I should have explained 
> clearly in the para above). Does that clarify or do you see a particular race 
> condition here? (We don't ).

Sounds okay then.

>>>>>>> +long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx) {
>>>>>>> +    long rc = -EINVAL;
>>>>>>
>>>>>>Why long (for both variable and function return type)? (More of
>>>>>>these in functions below.)
>>>>>
>>>>> Because the error variable in the code that calls these (in hvm.c)
>>>>> is a long, and you had given feedback earlier to propagate the
>>>>> returns from these functions through that calling code.
>>>>
>>>>I don't see the connection. The function only returns zero or -E...
>>>>values, so why would its return type be "long"?
>>>>
>>>
>>> do_hvm_op declares a rc that is of type "long" and hence this returns
>>> a "long"
>>
>>What type your caller(s) return is of no interest at all here: What would you 
> do
>>if you had multiple callers with differing return types?
>>A function's return type should be chosen based on the range of values it may
>>return, and the result possibly widened to not yield inefficient code (like 
> in
>>some of the uint16_t cases elsewhere in the series would be necessary).
>>
> 
> What do you suggest the return type be?

For the case here - int (quite obviously I would say).

For the uint16_t ones - unsigned int.
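
I.e., as a sketch:

    /* Return type matches the 0 / -Exx values actually produced; the
     * index parameter widens to unsigned int. */
    int p2m_init_altp2m_by_id(struct domain *d, unsigned int idx)
    {
        int rc = -EINVAL;

        /* ... body as in the series ... */

        return rc;
    }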

>>>>>>> +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
>>>>>>> +                                 mfn_t mfn, unsigned int page_order,
>>>>>>> +                                 p2m_type_t p2mt, p2m_access_t
>>>>>>> +p2ma) {
>>>>>>> +    struct p2m_domain *p2m;
>>>>>>> +    p2m_access_t a;
>>>>>>> +    p2m_type_t t;
>>>>>>> +    mfn_t m;
>>>>>>> +    uint16_t i;
>>>>>>> +    bool_t reset_p2m;
>>>>>>> +    unsigned int reset_count = 0;
>>>>>>> +    uint16_t last_reset_idx = ~0;
>>>>>>> +
>>>>>>> +    if ( !altp2m_active(d) )
>>>>>>> +        return;
>>>>>>> +
>>>>>>> +    altp2m_list_lock(d);
>>>>>>> +
>>>>>>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>>>>>>> +    {
>>>>>>> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>>>>>>> +            continue;
>>>>>>> +
>>>>>>> +        p2m = d->arch.altp2m_p2m[i];
>>>>>>> +        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0,
>>>>>>> + NULL);
>>>>>>> +
>>>>>>> +        reset_p2m = 0;
>>>>>>> +
>>>>>>> +        /* Check for a dropped page that may impact this altp2m */
>>>>>>> +        if ( mfn_x(mfn) == INVALID_MFN &&
>>>>>>> +             gfn_x(gfn) >= p2m->min_remapped_gfn &&
>>>>>>> +             gfn_x(gfn) <= p2m->max_remapped_gfn )
>>>>>>> +            reset_p2m = 1;
>>>>>>
>>>>>>Considering that this looks like an optimization, what's the
>>>>>>downside of possibly having min=0 and max=<end-of-address-space>?
>>I.e.
>>>>>>can there a long latency operation result that's this way a guest can
>>effect?
>>>>>>
>>>>>
>>>>> ... A p2m is a gfn->mfn map, amongst other things. There is a
>>>>> reverse
>>>>> mfn->gfn map, but that is only valid for the host p2m. Unless the
>>>>> remap altp2m hypercall is used, the gfn->mfn map in every altp2m
>>>>> mirrors the gfn->mfn map in the host p2m (or a subset thereof, due
>>>>> to lazy-copy), so handling removal of an mfn from a guest is simple:
>>>>> do a reverse look up for the host p2m and mark the relevant gfn as
>>>>> invalid, then do the same for every altp2m where that gfn is currently
>>valid.
>>>>>
>>>>> Remap changes things: it says take gfn1 and replace ->mfn with the
>>>>> ->mfn of gfn2. Here is where the optimization is used and the
>>>>> ->invalidate
>>>>logic is:
>>>>> record the lowest and highest gfn2's that have been used in remap
>>>>> ops; if an mfn is dropped from the hostp2m, for the purposes of
>>>>> altp2m invalidation, see if the gfn derived from the host p2m
>>>>> reverse lookup falls within the range of used gfn2's. If it does, an
>>>>> invalidation is required. Which is why min and max are inited the
>>>>> way they are - hope the explanation clarifies this optimization.
>>>>
>>>>Sadly it doesn't, it just re-states what I already understood and
>>>>doesn't answer the question: What happens if min=0 and
>>>>max=<end-of-address-
>>>>space>? I.e. can the guest nullify the optimization by careful
>>>>space>fiddling
>>> issuing
>>>>some of the new hypercalls, and if so will this have any negative
>>>>impact on
>>> the
>>>>hypervisor? I'm asking this from a security standpoint ...
>>>>
>>>
>>> To take that exact case, if min=0 and max=<end of address space> then
>>> any hostp2m change where the first mfn is dropped will cause all
>>> altp2ms to be reset even if the mfn dropped doesn't affect altp2ms at
>>> all, which won't serve as an optimization at all - hope that clarifies.
>>
>>Again - no. I understand the optimization is gone then. But what's the effect?
>>I.e. will the guest, by extending this range to be arbitrarily wide, be able 
> to
>>cause a long latency hypervisor operation (i.e. a DoS)?
>>
> 
> The extent of the range affects the likelihood of an invalidation. It has no 
> impact on the cost of an invalidation (so no, it's not a DoS issue).
> I'm not sure whether you are suggesting a change here or just asking for 
> clarification (if you think this optimization is confusing, perhaps some 
> documentation of it would help?).

Well, the optimization must be optimizing _something_. And hence
_something_ must go sub-optimal when the optimization is being
subverted. And the question is how much worse un-optimized is
compared to optimized.

It _looks like_ the overall effect really is just to avoid a one time
(for a given non-preemptible operation) reset, but whether that's
really the case depends on the calling contexts (which, as said a
couple of times before, is hard to see for a patch that introduces
functions without callers - hence the question).

>>>>Nor do I find my question answered why max can't be initialized to zero:
>>>>You don't care whether max is a valid GFN when a certain GFN doesn't
>>>>fall in the (then empty) [min, max] range. What am I missing?
>>>
>>> Since 0 is a valid GFN so we cannot initialize min or max to 0 - since
>>> matching this condition (gfn_x(gfn) >= p2m->min_remapped_gfn &&
>>> gfn_x(gfn) <=
>>> p2m->max_remapped_gfn) will cause a reset (throw away) of the altp2m
>>> p2m->to
>>> rebuild it from the hostp2m. So essentially what is being done here is
>>> the range is the non-existent set to start with, unless some altp2m
>>> changes occur, and then it is grown to be the smallest set around the gfns
>>affected.
>>
>>Again you just re-state what was already clear, yet you neglect answering the
>>actual question. Taking what you wrote above, when max=0 (and
>>min=INVALID_GFN), then
>>
>>	gfn_x(gfn) >= p2m->min_remapped_gfn &&
>>	 gfn_x(gfn) <= p2m->max_remapped_gfn
>>
>>will be false for any value of gfn; in fact the "max" part won't even be 
> looked
>>at because the "min" part will already be false for any valid gfn, i.e. only 
> the
>>INVALID_GFN case would make it to the "max" part.
>>
> 
> You are suggesting an alternative which will work, but what we have will 
> also produce the same results, and the results are correct for both cases - so 
> can we go with the approach we have taken currently?

Of course we can, but should we? I.e. why use sub-optimal code
when there clearly is a way to improve it? Counter question - why
are you insisting on using a model requiring (in the earlier pointed-out
place) four comparisons when two can do? I realize this is only a
small inefficiency here, but you should realize that they add up if
we don't look to avoid them where possible.

Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-14 15:57   ` George Dunlap
@ 2015-07-21 17:44     ` Sahita, Ravi
  0 siblings, 0 replies; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-21 17:44 UTC (permalink / raw)
  To: George Dunlap, White, Edmund H, xen-devel
  Cc: Sahita, Ravi, Wei Liu, Ian Jackson, Tim Deegan, Jan Beulich,
	Andrew Cooper, tlengyel, Daniel De Graaf

>From: George Dunlap [mailto:george.dunlap@eu.citrix.com]
>Sent: Tuesday, July 14, 2015 8:57 AM
>
>On 07/14/2015 01:14 AM, Ed White wrote:
>> Add the basic data structures needed to support alternate p2m's and
>> the functions to initialise them and tear them down.
>>
>> Although Intel hardware can handle 512 EPTP's per hardware thread
>> concurrently, only 10 per domain are supported in this patch for
>> performance reasons.
>>
>> The iterator in hap_enable() does need to handle 512, so that is now
>> uint16_t.
>>
>> This change also splits the p2m lock into one lock type for altp2m's
>> and another type for all other p2m's. The purpose of this is to place
>> the altp2m list lock between the types, so the list lock can be
>> acquired whilst holding the host p2m lock.
>>
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>
>With the number of major changes you made here, you definitely should
>have dropped this reviewed-by.
>
>> ---
>>  xen/arch/x86/hvm/Makefile        |   1 +
>>  xen/arch/x86/hvm/altp2m.c        |  77 +++++++++++++++++++++++++++++
>>  xen/arch/x86/hvm/hvm.c           |  21 ++++++++
>>  xen/arch/x86/mm/hap/hap.c        |  38 ++++++++++++++-
>>  xen/arch/x86/mm/mm-locks.h       |  46 +++++++++++++++++-
>>  xen/arch/x86/mm/p2m.c            | 102
>+++++++++++++++++++++++++++++++++++++++
>>  xen/include/asm-x86/domain.h     |  10 ++++
>>  xen/include/asm-x86/hvm/altp2m.h |  38 +++++++++++++++
>>  xen/include/asm-x86/hvm/hvm.h    |  14 ++++++
>>  xen/include/asm-x86/hvm/vcpu.h   |   9 ++++
>>  xen/include/asm-x86/p2m.h        |  32 +++++++++++-
>>  11 files changed, 384 insertions(+), 4 deletions(-)  create mode
>> 100644 xen/arch/x86/hvm/altp2m.c  create mode 100644
>> xen/include/asm-x86/hvm/altp2m.h
>>
>> diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
>> index 69af47f..eb1a37b 100644
>> --- a/xen/arch/x86/hvm/Makefile
>> +++ b/xen/arch/x86/hvm/Makefile
>> @@ -1,6 +1,7 @@
>>  subdir-y += svm
>>  subdir-y += vmx
>>
>> +obj-y += altp2m.o
>>  obj-y += asid.o
>>  obj-y += emulate.o
>>  obj-y += event.o
>> diff --git a/xen/arch/x86/hvm/altp2m.c b/xen/arch/x86/hvm/altp2m.c new
>> file mode 100644 index 0000000..a10f347
>> --- /dev/null
>> +++ b/xen/arch/x86/hvm/altp2m.c
>> @@ -0,0 +1,77 @@
>> +/*
>> + * Alternate p2m HVM
>> + * Copyright (c) 2014, Intel Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> +modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but
>> +WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of
>MERCHANTABILITY
>> +or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
>> +License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> +along with
>> + * this program; if not, write to the Free Software Foundation, Inc.,
>> +59 Temple
>> + * Place - Suite 330, Boston, MA 02111-1307 USA.
>> + */
>> +
>> +#include <asm/hvm/support.h>
>> +#include <asm/hvm/hvm.h>
>> +#include <asm/p2m.h>
>> +#include <asm/hvm/altp2m.h>
>> +
>> +void
>> +altp2m_vcpu_reset(struct vcpu *v)
>
>OK, so it looks like at the end of this patch series:
>* altp2m_vcpu_reset() isn't called outside this file
>* altp2m_vcpu_initialise() is only called from hvm.c when the guest enables
>the altp2m functionality
>* altp2m_vcpu_destroy() is called when the guest disables altp2m
>funcitonality, or when the vcpu is destroyed
>
>Looking at the "vcpu_destroy" case, it's hard to tell exactly how much on that
>path is actually useful; but it looks like the only thing that's critical is decreasing
>the active_vcpu count of the p2m that's being used.
>
>Also, it looks like these functions don't do anything specifically with the HVM
>side of things.
>
>So on the whole, it seems like these would better go along with the other
>altp2m functions inside p2m.c.
>
>Thoughts?

George, apologies on this one - I completely missed this email from you.

We could move these functions into p2m.c, except destroy, where the VMCS updates are critical.
We will try to get this into what will be our final rev (at least for the 4.6 candidate).
Again, sorry for the snafu on this email.

Ravi

>
> -George

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 10/15] x86/altp2m: add remaining support routines.
  2015-07-21  6:38               ` Jan Beulich
@ 2015-07-21 18:33                 ` Sahita, Ravi
  2015-07-22  7:33                   ` Jan Beulich
  0 siblings, 1 reply; 56+ messages in thread
From: Sahita, Ravi @ 2015-07-21 18:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Monday, July 20, 2015 11:38 PM
>
>>>>>>>> +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
>>>>>>>> +                                 mfn_t mfn, unsigned int page_order,
>>>>>>>> +                                 p2m_type_t p2mt, p2m_access_t
>>>>>>>> +p2ma) {
>>>>>>>> +    struct p2m_domain *p2m;
>>>>>>>> +    p2m_access_t a;
>>>>>>>> +    p2m_type_t t;
>>>>>>>> +    mfn_t m;
>>>>>>>> +    uint16_t i;
>>>>>>>> +    bool_t reset_p2m;
>>>>>>>> +    unsigned int reset_count = 0;
>>>>>>>> +    uint16_t last_reset_idx = ~0;
>>>>>>>> +
>>>>>>>> +    if ( !altp2m_active(d) )
>>>>>>>> +        return;
>>>>>>>> +
>>>>>>>> +    altp2m_list_lock(d);
>>>>>>>> +
>>>>>>>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>>>>>>>> +    {
>>>>>>>> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>>>>>>>> +            continue;
>>>>>>>> +
>>>>>>>> +        p2m = d->arch.altp2m_p2m[i];
>>>>>>>> +        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0,
>>>>>>>> + NULL);
>>>>>>>> +
>>>>>>>> +        reset_p2m = 0;
>>>>>>>> +
>>>>>>>> +        /* Check for a dropped page that may impact this altp2m */
>>>>>>>> +        if ( mfn_x(mfn) == INVALID_MFN &&
>>>>>>>> +             gfn_x(gfn) >= p2m->min_remapped_gfn &&
>>>>>>>> +             gfn_x(gfn) <= p2m->max_remapped_gfn )
>>>>>>>> +            reset_p2m = 1;
>>>>>>>
>>>>>>>Considering that this looks like an optimization, what's the
>>>>>>>downside of possibly having min=0 and max=<end-of-address-
>space>?
>>>I.e.
>>>>>>>can there a long latency operation result that's this way a guest
>>>>>>>can
>>>effect?
>>>>>>>
>>>>>>
>>>>>> ... A p2m is a gfn->mfn map, amongst other things. There is a
>>>>>> reverse
>>>>>> mfn->gfn map, but that is only valid for the host p2m. Unless the
>>>>>> remap altp2m hypercall is used, the gfn->mfn map in every altp2m
>>>>>> mirrors the gfn->mfn map in the host p2m (or a subset thereof, due
>>>>>> to lazy-copy), so handling removal of an mfn from a guest is simple:
>>>>>> do a reverse look up for the host p2m and mark the relevant gfn as
>>>>>> invalid, then do the same for every altp2m where that gfn is
>>>>>> currently
>>>valid.
>>>>>>
>>>>>> Remap changes things: it says take gfn1 and replace ->mfn with the
>>>>>> ->mfn of gfn2. Here is where the optimization is used and the
>>>>>> ->invalidate
>>>>>logic is:
>>>>>> record the lowest and highest gfn2's that have been used in remap
>>>>>> ops; if an mfn is dropped from the hostp2m, for the purposes of
>>>>>> altp2m invalidation, see if the gfn derived from the host p2m
>>>>>> reverse lookup falls within the range of used gfn2's. If it does,
>>>>>> an invalidation is required. Which is why min and max are inited
>>>>>> the way they are - hope the explanation clarifies this optimization.
>>>>>
>>>>>Sadly it doesn't, it just re-states what I already understood and
>>>>>doesn't answer the question: What happens if min=0 and
>>>>>max=<end-of-address-
>>>>>space>? I.e. can the guest nullify the optimization by careful
>>>>>space>fiddling
>>>> issuing
>>>>>some of the new hypercalls, and if so will this have any negative
>>>>>impact on
>>>> the
>>>>>hypervisor? I'm asking this from a security standpoint ...
>>>>>
>>>>
>>>> To take that exact case, if min=0 and max=<end of address space>
>>>> then any hostp2m change where the first mfn is dropped will cause
>>>> all altp2ms to be reset even if the mfn dropped doesn't affect
>>>> altp2ms at all, which won't serve as an optimization at all - hope
>>>> that clarifies.
>>>
>>>Again - no. I understand the optimization is gone then. But what's the
>effect?
>>>I.e. will the guest, by extending this range to be arbitrarily wide,
>>>be able
>> to
>>>cause a long latency hypervisor operation (i.e. a DoS)?
>>>
>>
>> The extent of the range affects the likelihood of an invalidation. It
>> has no impact on the cost of an invalidation (so no its not a DoS issue).
>> I'm not sure what change you are suggesting here or just clarification
>> (if you think this optimization is confusing perhaps some
>> documentation of this optimization will help?)
>
>Well, the optimization must be optimizing _something_. And hence
>_something_ must go sub-optimal when the optimization is being subverted.
>And the question is how much worse un-optimized is compared to optimized.
>
>It _looks like_ the overall effect really is just to avoid a one time (for a given
>non-preemptible operation) reset, but whether that's really the case depends
>on the calling contexts (which, as said a couple of times before, is hard to see
>for a patch that introduces functions without callers - hence the question).
>

As you now understand, invalidating an altp2m effectively resets it to be a (lazily-copied) exact duplicate of the host p2m again -- so losing any altp2m permission restrictions or remaps. This is a first cut at minimizing the likelihood of that happening unnecessarily. There's some discussion of this first cut between Tim and Ed going back to February or March. The intention continues to be that this might be further optimized if real-world use shows this first cut to be inefficient.
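
For concreteness, the bookkeeping in the remap path amounts to something like the following (a sketch; the exact placement within p2m_change_altp2m_gfn is an assumption):

    /* Grow the tracked [min, max] range of remap targets so a later
     * drop of an mfn can tell whether any altp2m remap might be
     * affected, and hence whether a reset is needed. */
    if ( gfn_x(new_gfn) < p2m->min_remapped_gfn )
        p2m->min_remapped_gfn = gfn_x(new_gfn);
    if ( gfn_x(new_gfn) > p2m->max_remapped_gfn )
        p2m->max_remapped_gfn = gfn_x(new_gfn);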

>>>>>Nor do I find my question answered why max can't be initialized to zero:
>>>>>You don't care whether max is a valid GFN when a certain GFN doesn't
>>>>>fall in the (then empty) [min, max] range. What am I missing?
>>>>
>>>> Since 0 is a valid GFN so we cannot initialize min or max to 0 -
>>>> since matching this condition (gfn_x(gfn) >= p2m->min_remapped_gfn
>>>> &&
>>>> gfn_x(gfn) <=
>>>> p2m->max_remapped_gfn) will cause a reset (throw away) of the altp2m
>>>> p2m->to
>>>> rebuild it from the hostp2m. So essentially what is being done here
>>>> is the range is the non-existent set to start with, unless some
>>>> altp2m changes occur, and then it is grown to be the smallest set
>>>> around the gfns
>>>affected.
>>>
>>>Again you just re-state what was already clear, yet you neglect
>>>answering the actual question. Taking what you wrote above, when max=0
>>>(and min=INVALID_GFN), then
>>>
>>>	gfn_x(gfn) >= p2m->min_remapped_gfn &&
>>>	 gfn_x(gfn) <= p2m->max_remapped_gfn
>>>
>>>will be false for any value of gfn; in fact the "max" part won't even
>>>be
>> looked
>>>at because the "min" part will already be false for any valid gfn,
>>>i.e. only
>> the
>>>INVALID_GFN case would make it to the "max" part.
>>>
>>
>> You are suggesting an alternative which will work, but what we have
>> will also produce the same results, and the results are correct for
>> both cases - so can we go with the approach we have taken currently?
>
>Of course we can, but should we? I.e. why use sub-optimal code when there
>clearly is a way to improve it? Counter question - why are you insisting to use a
>model requiring (in the earlier pointed out
>place) four comparisons when two can do? I realize this is only a small
>inefficiency here, but you should realize that they add up if we don't look to
>avoid them where possible.

We thought the code was easier to understand with min and max set to INVALID, but we can take your approach to avoid this inefficiency if you want.

Ravi

>
>Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v5 10/15] x86/altp2m: add remaining support routines.
  2015-07-21 18:33                 ` Sahita, Ravi
@ 2015-07-22  7:33                   ` Jan Beulich
  0 siblings, 0 replies; 56+ messages in thread
From: Jan Beulich @ 2015-07-22  7:33 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 21.07.15 at 20:33, <ravi.sahita@intel.com> wrote:
> We thought the code was easier to understand with min and max set to 
> INVALID.

But as you see it caused a lot of discussion instead.

> we can take your approach to avoid this inefficiency if you want.

Yes please.

Jan

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2015-07-22  7:33 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
2015-07-14  0:14 [PATCH v5 00/15] Alternate p2m: support multiple copies of host p2m Ed White
2015-07-14  0:14 ` [PATCH v5 01/15] common/domain: Helpers to pause a domain while in context Ed White
2015-07-14  0:14 ` [PATCH v5 02/15] VMX: VMFUNC and #VE definitions and detection Ed White
2015-07-14  0:14 ` [PATCH v5 03/15] VMX: implement suppress #VE Ed White
2015-07-14 12:46   ` Jan Beulich
2015-07-14 13:47   ` George Dunlap
2015-07-14  0:14 ` [PATCH v5 04/15] x86/HVM: Hardware alternate p2m support detection Ed White
2015-07-14  0:14 ` [PATCH v5 05/15] x86/altp2m: basic data structures and support routines Ed White
2015-07-14 13:13   ` Jan Beulich
2015-07-14 14:45     ` George Dunlap
2015-07-14 14:58       ` Jan Beulich
2015-07-16  8:57     ` Sahita, Ravi
2015-07-16  9:07       ` Jan Beulich
2015-07-17 22:36         ` Sahita, Ravi
2015-07-20  6:20           ` Jan Beulich
2015-07-21  5:18             ` Sahita, Ravi
2015-07-14 15:57   ` George Dunlap
2015-07-21 17:44     ` Sahita, Ravi
2015-07-14  0:14 ` [PATCH v5 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
2015-07-14 13:57   ` Jan Beulich
2015-07-16  9:20     ` Sahita, Ravi
2015-07-16  9:38       ` Jan Beulich
2015-07-17 21:08         ` Sahita, Ravi
2015-07-20  6:21           ` Jan Beulich
2015-07-21  5:49             ` Sahita, Ravi
2015-07-14  0:14 ` [PATCH v5 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
2015-07-14 14:04   ` Jan Beulich
2015-07-14 17:56     ` Sahita, Ravi
2015-07-17 22:41     ` Sahita, Ravi
2015-07-14  0:14 ` [PATCH v5 08/15] x86/altp2m: add control of suppress_ve Ed White
2015-07-14 17:03   ` George Dunlap
2015-07-14  0:14 ` [PATCH v5 09/15] x86/altp2m: alternate p2m memory events Ed White
2015-07-14 14:08   ` Jan Beulich
2015-07-16  9:22     ` Sahita, Ravi
2015-07-14  0:14 ` [PATCH v5 10/15] x86/altp2m: add remaining support routines Ed White
2015-07-14 14:31   ` Jan Beulich
2015-07-16  9:16     ` Sahita, Ravi
2015-07-16  9:34       ` Jan Beulich
2015-07-17 22:32         ` Sahita, Ravi
2015-07-20  6:53           ` Jan Beulich
2015-07-21  5:46             ` Sahita, Ravi
2015-07-21  6:38               ` Jan Beulich
2015-07-21 18:33                 ` Sahita, Ravi
2015-07-22  7:33                   ` Jan Beulich
2015-07-16 14:44   ` George Dunlap
2015-07-17 21:01     ` Sahita, Ravi
2015-07-14  0:14 ` [PATCH v5 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
2015-07-14 14:36   ` Jan Beulich
2015-07-16  9:02     ` Sahita, Ravi
2015-07-16  9:09       ` Jan Beulich
2015-07-14  0:15 ` [PATCH v5 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
2015-07-14  0:15 ` [PATCH v5 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
2015-07-14  0:15 ` [PATCH v5 14/15] tools/libxc: add support to altp2m hvmops Ed White
2015-07-14  0:15 ` [PATCH v5 15/15] tools/xen-access: altp2m testcases Ed White
2015-07-14  9:56   ` Wei Liu
2015-07-14 11:52     ` Lengyel, Tamas
