xen-devel.lists.xenproject.org archive mirror
* [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m
@ 2015-07-22 23:01 Ed White
  2015-07-22 23:01 ` [PATCH v7 01/15] common/domain: Helpers to pause a domain while in context Ed White
                   ` (16 more replies)
  0 siblings, 17 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

This set of patches adds support to hvm domains for EPTP switching by creating
multiple copies of the host p2m (currently limited to 10 copies).

The primary use of this capability is expected to be in scenarios where access
to memory needs to be monitored and/or restricted below the level at which the
guest OS page tables operate. Two examples that were discussed at the 2014 Xen
developer summit are:

    VM introspection: 
        http://www.slideshare.net/xen_com_mgr/zero-footprint-guest-memory-introspection-from-xen

    Secure inter-VM communication:
        http://www.slideshare.net/xen_com_mgr/nakajima-nvf

A more detailed design specification can be found at:
    http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01319.html

Each p2m copy is populated lazily on EPT violations.
Permissions for pages in alternate p2m's can be changed in a similar
way to the existing memory access interface, and gfn->mfn mappings can be
remapped.

All this is done through extra HVMOP types.
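
As an illustration, a privileged agent could drive these ops through the libxc
wrappers added later in this series (a sketch only; the wrapper names and
signatures are assumed from the tools/libxc patch in this set, and error
handling is elided):

    uint16_t view;

    xc_altp2m_set_domain_state(xch, domid, 1);           /* enable altp2m */
    xc_altp2m_create_view(xch, domid, XENMEM_access_rwx, /* default access */
                          &view);                        /* new p2m copy */
    xc_altp2m_set_mem_access(xch, domid, view, gfn,
                             XENMEM_access_rx);          /* restrict one gfn */
    xc_altp2m_switch_to_view(xch, domid, view);          /* make it active */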

The cross-domain HVMOP code has been compile-tested only. Also, the cross-domain
code is hypervisor-only; the toolstack has not been modified.

The intra-domain code has been tested. Violation notifications can only be received
for pages that have been modified (access permissions and/or gfn->mfn mapping) 
intra-domain, and only on VCPU's that have enabled notification.

VMFUNC and #VE will both be emulated on hardware without native support.

This code is not compatible with nested hvm functionality and will refuse to work
with nested hvm active. It is also not compatible with migration. It should be
considered experimental.

Changes since v6:

    Rebased again.

	The patch list below specifies the changes in v7 of the altp2m series. 
	We intend for this v7 to be the last version of this patch series that we 
	are submitting during this freeze exception week. This is taking into account 
	the time-zone differences and resource availability. We believe we have 
	addressed the majority of review comments in this series, with the exception 
	of a couple. We have noted the ones that remain open in the two patches 
	listed below, with the intent that they be addressed in follow-up patches 
	post-4.6. Hopefully, this series satisfies the criteria for inclusion in 
	4.6. We thank 
	the maintainers for their perseverance with this patch series and their 
	continuous feedback during this freeze exception week. 

    Patch 1:
        add George's R-b

    Patch 2:
        no changes

    Patch 3:
        add Jun's ack

    Patch 4:
        no changes

    Patch 5:
        move altp2m.c from hvm to mm
        move altp2m.h from x86/mm to x86
        change iterator types in hap_enable/hap_final_teardown
        fix altp2m init failure path in p2m_init
        rename altp2m_vcpu_update_eptp to altp2m_vcpu_update_p2m
        change uint16_t's/uint8_t's to unsigned int's
        rework p2m_get_altp2m
		
		not done - mechanical change of moving the domain struct's 
		    new altp2m members to a separately alloc'ed struct 
			(an initial stab at this patch has been written)

    Patch 6:
        remove casts around p2midx handling
        fix veinfo semaphore initialization
        mechanical changes due to patch 5 changes

    Patch 7:
        remove incorrect cast
        add Jan's ack

    Patch 8:
        formatting changes requested by Andrew

    Patch 9:
        no changes

    Patch 10:
        rename altp2m lazy copier, make bool_t, use __put_gfn throughout,
          and move to p2m.c, eliminating altp2m_hap.c
        change various p2m_altp2m... routines from long to int
        change uint16_t's/uint8_t's to unsigned int's
        optimize remapped gfn tracking and altp2m invalidation check
        mechanical changes due to patch 5 changes
		
		not done - abstracting some ept functionality from p2m

    Patch 11:
        fix cmd range check
        rework domain locking
        add Jan's ack

    Patch 12:
        no changes

    Patch 13:
        no changes

    Patch 14:
        no changes

    Patch 15:
        no changes


Changes since v5:

    Rebased on staging.

    We believe v6 addresses all ABI issues and actual bugs; it does
    not address all outstanding maintainer issues.

    Patch 1:
        no changes

    Patch 2:
        no changes

    Patch 3:
        no changes
        removed ack's etc

    Patch 4:
        fixed a markdown formatting error

    Patch 5:
        removed a buggy assert
        removed Andrew's R-b

    Patch 6:
        fixed a bug when disabling #VE due to bad veinfo gfn

    Patch 7:
        addressed Jan's most recent comments

    Patch 8:
        no changes

    Patch 9:
        Added padding to vm_event_t header (per Andrew)

    Patch 10:
        No changes

    Patch 11:
        Reworked structure padding
        Added altp2m_op interface version
        Reworked altp2m_op handling again

    Patch 12:
        Mechanical changes due to patch 11 changes

    Patch 13:
        Mechanical changes due to patch 11 changes

    Patch 14:
        Mechanical changes due to patch 11 changes

    Patch 15:
        Mechanical changes due to an upstream change


Changes since v4:

    Patch 3:  don't set bit 63 of top-level entries.

    Patch 5:  extra locking order description in mm-locks.h
              don't initialise altp2m data unless altp2m is enabled globally
               and hardware supports it
              removed some hardware-specific wrappers that were not being used
              renamed ap2m... interfaces to altp2m...
              fixed error path in p2m_init_altp2m

    Patch 7:  addressed remaining feedback

    Patch 8:  made suppress_ve preservation consistent

    Patch 9:  changed flag bit to avoid collision with recently applied series

    Patch 10: check pad fields for zero
              minor formatting changes

    Patch 11: renamed HVM parameter

    Patch 15: removed v3 workaround


Changes since v3:

Major changes are:

    Replaced patch 8.

    Refactored patch 11 to use a single HVMOP with subcodes.

    Addressed feedback in patch 7, and some other patches.

    Added two tools/test patches from Tamas. Both are optional.

    Added various ack's and reviewed-by's.

    Rebased.

Ravi Sahita will now be the point of contact for this series.


Changes since v2:

Addressed all v2 feedback *except*:

    In patch 5, the per-domain EPTP list page is still allocated from the
    Xen heap. If allocated from the domain heap Xen panics - IIRC on Haswell
    hardware when walking the EPTP list during exit processing in patch 6.

    HVM_ops are not merged. Tamas suggested merging the memory access ops,
    but in practice they are not as similar as they appear on the surface.
    Razvan suggested merging the implementation code in p2m.c, but that is
    also not as common as it appears on the surface.
    Andrew suggested merging all altp2m ops into one with a subop code in
    the input structure. His point that only 255 ops can be defined is well
    taken, but altp2m uses only 2 more ops than the recently introduced
    ioreq ops, and <15% of the available ops have been defined. Since we
    don't know how to implement XSM hooks and policy with the subop model,
    we have not adopted this suggestion.

    The p2m set/get interface is not modified. The altp2m code needs to
    write suppress_ve in 2 places and read it in 1 place. The original
    patch series managed this by coupling the state of suppress_ve to the
    p2m memory type, which Tim disliked. In v2 of the series, special
    set/get interfaces were added to access suppress_ve only when required.
    Jan has suggested changing the existing interfaces, but we feel this
    is inappropriate for this experimental patch series. Changing the
    existing interfaces would require a design agreement to be reached
    and would impact a large amount of existing code.

    Andrew kindly added some reviewed-by's to v2. I have not carried
    his reviewed-by of the memory event patch forward because Tamas
    requested significant changes to the patch.


Changes since v1:

Many changes since v1 in response to maintainer feedback, including:

    Suppress_ve state is now decoupled from memory type
    VMFUNC emulation handled in x86 emulator
    Lazy-copy algorithm copies any page where mfn != INVALID_MFN
    All nested page fault handling except lazy-copy is now in
        top-level (hvm.c) nested page fault handler
    Split p2m lock type (as suggested by Tim) to avoid lock order violations
    XSM hooks
    Xen parameter to globally enable altp2m (default disabled) and HVM parameter
    Altp2m reference counting no longer uses dirty_cpu bitmap
    Remapped page tracking to invalidate altp2m's where needed to protect Xen
    Many other minor changes

The altp2m invalidation is implemented to a level that I believe satisfies
the requirements of protecting Xen. Invalidation notification is not yet
implemented, and there may be other cases where invalidation is warranted to
protect the integrity of the restrictions placed through altp2m. We may add
further patches in this area.

Testability is still a potential issue. We have offered to make our internal
Windows test binaries available for intra-domain testing. Tamas has
been working on toolstack support for cross-domain testing with a slightly
earlier patch series, and we hope he will submit that support.

Not all of the patches will be of interest to everyone copied here. I've
copied everyone on this initial mailing to give context.

Andrew Cooper (1):
  common/domain: Helpers to pause a domain while in context

Ed White (9):
  VMX: VMFUNC and #VE definitions and detection.
  VMX: implement suppress #VE.
  x86/HVM: Hardware alternate p2m support detection.
  x86/altp2m: basic data structures and support routines.
  VMX/altp2m: add code to support EPTP switching and #VE.
  x86/altp2m: alternate p2m memory events.
  x86/altp2m: add remaining support routines.
  x86/altp2m: define and implement alternate p2m HVMOP types.
  x86/altp2m: Add altp2mhvm HVM domain parameter.

George Dunlap (1):
  x86/altp2m: add control of suppress_ve.

Ravi Sahita (2):
  VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  x86/altp2m: XSM hooks for altp2m HVM ops

Tamas K Lengyel (2):
  tools/libxc: add support to altp2m hvmops
  tools/xen-access: altp2m testcases

 docs/man/xl.cfg.pod.5                        |  12 +
 docs/misc/xen-command-line.markdown          |   7 +
 tools/flask/policy/policy/modules/xen/xen.if |   4 +-
 tools/libxc/Makefile                         |   1 +
 tools/libxc/include/xenctrl.h                |  22 +
 tools/libxc/xc_altp2m.c                      | 248 +++++++++++
 tools/libxl/libxl.h                          |   6 +
 tools/libxl/libxl_create.c                   |   1 +
 tools/libxl/libxl_dom.c                      |   2 +
 tools/libxl/libxl_types.idl                  |   1 +
 tools/libxl/xl_cmdimpl.c                     |  10 +
 tools/tests/xen-access/xen-access.c          | 173 ++++++--
 xen/arch/x86/hvm/emulate.c                   |  18 +-
 xen/arch/x86/hvm/hvm.c                       | 252 ++++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c                  |  42 +-
 xen/arch/x86/hvm/vmx/vmx.c                   | 176 ++++++++
 xen/arch/x86/mm/Makefile                     |   1 +
 xen/arch/x86/mm/altp2m.c                     |  77 ++++
 xen/arch/x86/mm/hap/hap.c                    |  40 +-
 xen/arch/x86/mm/mem_sharing.c                |   4 +-
 xen/arch/x86/mm/mm-locks.h                   |  46 +-
 xen/arch/x86/mm/p2m-ept.c                    |  35 +-
 xen/arch/x86/mm/p2m-pod.c                    |  12 +-
 xen/arch/x86/mm/p2m-pt.c                     |  10 +-
 xen/arch/x86/mm/p2m.c                        | 612 +++++++++++++++++++++++++--
 xen/arch/x86/x86_emulate/x86_emulate.c       |  19 +-
 xen/arch/x86/x86_emulate/x86_emulate.h       |   4 +
 xen/common/domain.c                          |  28 ++
 xen/common/vm_event.c                        |   4 +
 xen/include/asm-arm/p2m.h                    |   6 +
 xen/include/asm-x86/altp2m.h                 |  38 ++
 xen/include/asm-x86/domain.h                 |  10 +
 xen/include/asm-x86/hvm/hvm.h                |  25 ++
 xen/include/asm-x86/hvm/vcpu.h               |   9 +
 xen/include/asm-x86/hvm/vmx/vmcs.h           |  14 +-
 xen/include/asm-x86/hvm/vmx/vmx.h            |  13 +-
 xen/include/asm-x86/msr-index.h              |   1 +
 xen/include/asm-x86/p2m.h                    | 100 ++++-
 xen/include/public/hvm/hvm_op.h              |  89 ++++
 xen/include/public/hvm/params.h              |   5 +-
 xen/include/public/vm_event.h                |  12 +
 xen/include/xen/sched.h                      |   5 +
 xen/include/xsm/dummy.h                      |  12 +
 xen/include/xsm/xsm.h                        |  12 +
 xen/xsm/dummy.c                              |   2 +
 xen/xsm/flask/hooks.c                        |  12 +
 xen/xsm/flask/policy/access_vectors          |   7 +
 47 files changed, 2136 insertions(+), 103 deletions(-)
 create mode 100644 tools/libxc/xc_altp2m.c
 create mode 100644 xen/arch/x86/mm/altp2m.c
 create mode 100644 xen/include/asm-x86/altp2m.h

-- 
1.9.1


* [PATCH v7 01/15] common/domain: Helpers to pause a domain while in context
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 02/15] VMX: VMFUNC and #VE definitions and detection Ed White
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

From: Andrew Cooper <andrew.cooper3@citrix.com>

For use on codepaths which would need to use domain_pause() but might be in
the target domain's context.  In the case that the target domain is in
context, all other vcpus are paused.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
---
Changes since v6:
        add George's R-b

 xen/common/domain.c     | 28 ++++++++++++++++++++++++++++
 xen/include/xen/sched.h |  5 +++++
 2 files changed, 33 insertions(+)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 791166b..1b9fcfc 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1010,6 +1010,34 @@ int domain_unpause_by_systemcontroller(struct domain *d)
     return 0;
 }
 
+void domain_pause_except_self(struct domain *d)
+{
+    struct vcpu *v, *curr = current;
+
+    if ( curr->domain == d )
+    {
+        for_each_vcpu( d, v )
+            if ( likely(v != curr) )
+                vcpu_pause(v);
+    }
+    else
+        domain_pause(d);
+}
+
+void domain_unpause_except_self(struct domain *d)
+{
+    struct vcpu *v, *curr = current;
+
+    if ( curr->domain == d )
+    {
+        for_each_vcpu( d, v )
+            if ( likely(v != curr) )
+                vcpu_unpause(v);
+    }
+    else
+        domain_unpause(d);
+}
+
 int vcpu_reset(struct vcpu *v)
 {
     struct domain *d = v->domain;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b29d9e7..73d3bc8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -804,6 +804,11 @@ static inline int domain_pause_by_systemcontroller_nosync(struct domain *d)
 {
     return __domain_pause_by_systemcontroller(d, domain_pause_nosync);
 }
+
+/* domain_pause() but safe against trying to pause current. */
+void domain_pause_except_self(struct domain *d);
+void domain_unpause_except_self(struct domain *d);
+
 void cpu_init(void);
 
 struct scheduler;
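
A minimal usage sketch for the new helpers (hypothetical caller, not part of
this patch), bracketing work that must see every other vcpu of d quiesced even
if the caller is running on one of d's vcpus:

    void operate_on_domain(struct domain *d)
    {
        domain_pause_except_self(d);

        /* Every vcpu of d other than (possibly) the current one is
         * paused here, so their state can be inspected or modified. */

        domain_unpause_except_self(d);
    }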
-- 
1.9.1


* [PATCH v7 02/15] VMX: VMFUNC and #VE definitions and detection.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
  2015-07-22 23:01 ` [PATCH v7 01/15] common/domain: Helpers to pause a domain while in context Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 03/15] VMX: implement suppress #VE Ed White
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Currently, neither VMFUNC nor #VE is enabled globally; both may be enabled
on a per-VCPU basis by the altp2m code.

Remove the check for EPTE bit 63 == zero in ept_split_super_page(), as
that bit is now hardware-defined.
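
For reference, once these controls are enabled a guest agent invokes VMFUNC
leaf 0 directly; a hypothetical in-guest helper (illustrative, not part of
this patch) would look like:

    static inline void vmfunc_switch_eptp(unsigned int eptp_index)
    {
        /* VMFUNC is encoded as 0f 01 d4; %eax selects the leaf
         * (0 = EPTP switching), %ecx the index into the EPTP list. */
        asm volatile ( ".byte 0x0f,0x01,0xd4"
                       :: "a" (0), "c" (eptp_index)
                       : "memory" );
    }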

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
---
Changes since v6:
        no changes

 xen/arch/x86/hvm/vmx/vmcs.c        | 42 +++++++++++++++++++++++++++++++++++---
 xen/arch/x86/mm/p2m-ept.c          |  1 -
 xen/include/asm-x86/hvm/vmx/vmcs.h | 14 +++++++++++--
 xen/include/asm-x86/hvm/vmx/vmx.h  | 13 +++++++++++-
 xen/include/asm-x86/msr-index.h    |  1 +
 5 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 4c5ceb5..bc1cabd 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -101,6 +101,8 @@ u32 vmx_secondary_exec_control __read_mostly;
 u32 vmx_vmexit_control __read_mostly;
 u32 vmx_vmentry_control __read_mostly;
 u64 vmx_ept_vpid_cap __read_mostly;
+u64 vmx_vmfunc __read_mostly;
+bool_t vmx_virt_exception __read_mostly;
 
 const u32 vmx_introspection_force_enabled_msrs[] = {
     MSR_IA32_SYSENTER_EIP,
@@ -140,6 +142,8 @@ static void __init vmx_display_features(void)
     P(cpu_has_vmx_virtual_intr_delivery, "Virtual Interrupt Delivery");
     P(cpu_has_vmx_posted_intr_processing, "Posted Interrupt Processing");
     P(cpu_has_vmx_vmcs_shadowing, "VMCS shadowing");
+    P(cpu_has_vmx_vmfunc, "VM Functions");
+    P(cpu_has_vmx_virt_exceptions, "Virtualisation Exceptions");
     P(cpu_has_vmx_pml, "Page Modification Logging");
 #undef P
 
@@ -185,6 +189,7 @@ static int vmx_init_vmcs_config(void)
     u64 _vmx_misc_cap = 0;
     u32 _vmx_vmexit_control;
     u32 _vmx_vmentry_control;
+    u64 _vmx_vmfunc = 0;
     bool_t mismatch = 0;
 
     rdmsr(MSR_IA32_VMX_BASIC, vmx_basic_msr_low, vmx_basic_msr_high);
@@ -230,7 +235,9 @@ static int vmx_init_vmcs_config(void)
                SECONDARY_EXEC_ENABLE_EPT |
                SECONDARY_EXEC_ENABLE_RDTSCP |
                SECONDARY_EXEC_PAUSE_LOOP_EXITING |
-               SECONDARY_EXEC_ENABLE_INVPCID);
+               SECONDARY_EXEC_ENABLE_INVPCID |
+               SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
+               SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
         rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
         if ( _vmx_misc_cap & VMX_MISC_VMWRITE_ALL )
             opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
@@ -341,6 +348,24 @@ static int vmx_init_vmcs_config(void)
           || !(_vmx_vmexit_control & VM_EXIT_ACK_INTR_ON_EXIT) )
         _vmx_pin_based_exec_control  &= ~ PIN_BASED_POSTED_INTERRUPT;
 
+    /* The IA32_VMX_VMFUNC MSR exists only when VMFUNC is available */
+    if ( _vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS )
+    {
+        rdmsrl(MSR_IA32_VMX_VMFUNC, _vmx_vmfunc);
+
+        /*
+         * VMFUNC leaf 0 (EPTP switching) must be supported.
+         *
+         * Or we just don't use VMFUNC.
+         */
+        if ( !(_vmx_vmfunc & VMX_VMFUNC_EPTP_SWITCHING) )
+            _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
+    }
+
+    /* Virtualization exceptions are only enabled if VMFUNC is enabled */
+    if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
+        _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+
     min = 0;
     opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS;
     _vmx_vmentry_control = adjust_vmx_controls(
@@ -361,6 +386,9 @@ static int vmx_init_vmcs_config(void)
         vmx_vmentry_control        = _vmx_vmentry_control;
         vmx_basic_msr              = ((u64)vmx_basic_msr_high << 32) |
                                      vmx_basic_msr_low;
+        vmx_vmfunc                 = _vmx_vmfunc;
+        vmx_virt_exception         = !!(_vmx_secondary_exec_control &
+                                       SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
         vmx_display_features();
 
         /* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -397,6 +425,9 @@ static int vmx_init_vmcs_config(void)
         mismatch |= cap_check(
             "EPT and VPID Capability",
             vmx_ept_vpid_cap, _vmx_ept_vpid_cap);
+        mismatch |= cap_check(
+            "VMFUNC Capability",
+            vmx_vmfunc, _vmx_vmfunc);
         if ( cpu_has_vmx_ins_outs_instr_info !=
              !!(vmx_basic_msr_high & (VMX_BASIC_INS_OUT_INFO >> 32)) )
         {
@@ -967,6 +998,11 @@ static int construct_vmcs(struct vcpu *v)
     /* Do not enable Monitor Trap Flag unless start single step debug */
     v->arch.hvm_vmx.exec_control &= ~CPU_BASED_MONITOR_TRAP_FLAG;
 
+    /* Disable VMFUNC and #VE for now: they may be enabled later by altp2m. */
+    v->arch.hvm_vmx.secondary_exec_control &=
+        ~(SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
+          SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
+
     if ( is_pvh_domain(d) )
     {
         /* Disable virtual apics, TPR */
@@ -1790,9 +1826,9 @@ void vmcs_dump_vcpu(struct vcpu *v)
         printk("PLE Gap=%08x Window=%08x\n",
                vmr32(PLE_GAP), vmr32(PLE_WINDOW));
     if ( v->arch.hvm_vmx.secondary_exec_control &
-         (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VMFUNC) )
+         (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
         printk("Virtual processor ID = 0x%04x VMfunc controls = %016lx\n",
-               vmr16(VIRTUAL_PROCESSOR_ID), vmr(VMFUNC_CONTROL));
+               vmr16(VIRTUAL_PROCESSOR_ID), vmr(VM_FUNCTION_CONTROL));
 
     vmx_vmcs_exit(v);
 }
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index e7ff739..9a3b65a 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -282,7 +282,6 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
         epte->sp = (level > 1);
         epte->mfn += i * trunk;
         epte->snp = (iommu_enabled && iommu_snoop);
-        ASSERT(!epte->avail3);
 
         ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
 
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 3132644..25ac8f7 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -222,9 +222,10 @@ extern u32 vmx_vmentry_control;
 #define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY    0x00000200
 #define SECONDARY_EXEC_PAUSE_LOOP_EXITING       0x00000400
 #define SECONDARY_EXEC_ENABLE_INVPCID           0x00001000
-#define SECONDARY_EXEC_ENABLE_VMFUNC            0x00002000
+#define SECONDARY_EXEC_ENABLE_VM_FUNCTIONS      0x00002000
 #define SECONDARY_EXEC_ENABLE_VMCS_SHADOWING    0x00004000
 #define SECONDARY_EXEC_ENABLE_PML               0x00020000
+#define SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS   0x00040000
 extern u32 vmx_secondary_exec_control;
 
 #define VMX_EPT_EXEC_ONLY_SUPPORTED             0x00000001
@@ -285,6 +286,10 @@ extern u32 vmx_secondary_exec_control;
     (vmx_pin_based_exec_control & PIN_BASED_POSTED_INTERRUPT)
 #define cpu_has_vmx_vmcs_shadowing \
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VMCS_SHADOWING)
+#define cpu_has_vmx_vmfunc \
+    (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS)
+#define cpu_has_vmx_virt_exceptions \
+    (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS)
 #define cpu_has_vmx_pml \
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_PML)
 
@@ -316,6 +321,9 @@ extern u64 vmx_basic_msr;
 #define VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK  0x0FF
 #define VMX_GUEST_INTR_STATUS_SVI_OFFSET        8
 
+/* VMFUNC leaf definitions */
+#define VMX_VMFUNC_EPTP_SWITCHING   (1ULL << 0)
+
 /* VMCS field encodings. */
 #define VMCS_HIGH(x) ((x) | 1)
 enum vmcs_field {
@@ -350,12 +358,14 @@ enum vmcs_field {
     VIRTUAL_APIC_PAGE_ADDR          = 0x00002012,
     APIC_ACCESS_ADDR                = 0x00002014,
     PI_DESC_ADDR                    = 0x00002016,
-    VMFUNC_CONTROL                  = 0x00002018,
+    VM_FUNCTION_CONTROL             = 0x00002018,
     EPT_POINTER                     = 0x0000201a,
     EOI_EXIT_BITMAP0                = 0x0000201c,
 #define EOI_EXIT_BITMAP(n) (EOI_EXIT_BITMAP0 + (n) * 2) /* n = 0...3 */
+    EPTP_LIST_ADDR                  = 0x00002024,
     VMREAD_BITMAP                   = 0x00002026,
     VMWRITE_BITMAP                  = 0x00002028,
+    VIRT_EXCEPTION_INFO             = 0x0000202a,
     GUEST_PHYSICAL_ADDRESS          = 0x00002400,
     VMCS_LINK_POINTER               = 0x00002800,
     GUEST_IA32_DEBUGCTL             = 0x00002802,
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index c5f3d24..ee1cac7 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -47,7 +47,7 @@ typedef union {
         access      :   4,  /* bits 61:58 - p2m_access_t */
         tm          :   1,  /* bit 62 - VT-d transient-mapping hint in
                                shared EPT/VT-d usage */
-        avail3      :   1;  /* bit 63 - Software available 3 */
+        suppress_ve :   1;  /* bit 63 - suppress #VE */
     };
     u64 epte;
 } ept_entry_t;
@@ -187,6 +187,7 @@ static inline unsigned long pi_get_pir(struct pi_desc *pi_desc, int group)
 #define EXIT_REASON_XSETBV              55
 #define EXIT_REASON_APIC_WRITE          56
 #define EXIT_REASON_INVPCID             58
+#define EXIT_REASON_VMFUNC              59
 #define EXIT_REASON_PML_FULL            62
 
 /*
@@ -555,4 +556,14 @@ void p2m_init_hap_data(struct p2m_domain *p2m);
 #define EPT_L4_PAGETABLE_SHIFT      39
 #define EPT_PAGETABLE_ENTRIES       512
 
+/* #VE information page */
+typedef struct {
+    u32 exit_reason;
+    u32 semaphore;
+    u64 exit_qualification;
+    u64 gla;
+    u64 gpa;
+    u16 eptp_index;
+} ve_info_t;
+
 #endif /* __ASM_X86_HVM_VMX_VMX_H__ */
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 5425f77..e9c4723 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -130,6 +130,7 @@
 #define MSR_IA32_VMX_TRUE_PROCBASED_CTLS        0x48e
 #define MSR_IA32_VMX_TRUE_EXIT_CTLS             0x48f
 #define MSR_IA32_VMX_TRUE_ENTRY_CTLS            0x490
+#define MSR_IA32_VMX_VMFUNC                     0x491
 #define IA32_FEATURE_CONTROL_MSR                0x3a
 #define IA32_FEATURE_CONTROL_MSR_LOCK                     0x0001
 #define IA32_FEATURE_CONTROL_MSR_ENABLE_VMXON_INSIDE_SMX  0x0002
-- 
1.9.1


* [PATCH v7 03/15] VMX: implement suppress #VE.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
  2015-07-22 23:01 ` [PATCH v7 01/15] common/domain: Helpers to pause a domain while in context Ed White
  2015-07-22 23:01 ` [PATCH v7 02/15] VMX: VMFUNC and #VE definitions and detection Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 04/15] x86/HVM: Hardware alternate p2m support detection Ed White
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

In preparation for selectively enabling #VE in a later patch, set
suppress #VE on all EPTE's.

Suppress #VE should always be the default condition for two reasons:
it is generally not safe to deliver #VE into a guest unless that guest
has been modified to receive it; and even then, for most EPT violations, only
the hypervisor is able to handle the violation.

Signed-off-by: Ed White <edmund.h.white@intel.com>

Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
---
Changes since v6:
        add Jun's ack

 xen/arch/x86/mm/p2m-ept.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 9a3b65a..b532811 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -42,7 +42,8 @@
 #define is_epte_superpage(ept_entry)    ((ept_entry)->sp)
 static inline bool_t is_epte_valid(ept_entry_t *e)
 {
-    return (e->epte != 0 && e->sa_p2mt != p2m_invalid);
+    /* suppress_ve alone is not considered valid, so mask it off */
+    return ((e->epte & ~(1ul << 63)) != 0 && e->sa_p2mt != p2m_invalid);
 }
 
 /* returns : 0 for success, -errno otherwise */
@@ -220,6 +221,8 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
 static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
 {
     struct page_info *pg;
+    ept_entry_t *table;
+    unsigned int i;
 
     pg = p2m_alloc_ptp(p2m, 0);
     if ( pg == NULL )
@@ -233,6 +236,15 @@ static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
     /* Manually set A bit to avoid overhead of MMU having to write it later. */
     ept_entry->a = 1;
 
+    ept_entry->suppress_ve = 1;
+
+    table = __map_domain_page(pg);
+
+    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
+        table[i].suppress_ve = 1;
+
+    unmap_domain_page(table);
+
     return 1;
 }
 
@@ -282,6 +294,7 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
         epte->sp = (level > 1);
         epte->mfn += i * trunk;
         epte->snp = (iommu_enabled && iommu_snoop);
+        epte->suppress_ve = 1;
 
         ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
 
@@ -791,6 +804,8 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
     }
 
+    new_entry.suppress_ve = 1;
+
     rc = atomic_write_ept_entry(ept_entry, new_entry, target);
     if ( unlikely(rc) )
         old_entry.epte = 0;
-- 
1.9.1


* [PATCH v7 04/15] x86/HVM: Hardware alternate p2m support detection.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (2 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 03/15] VMX: implement suppress #VE Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 05/15] x86/altp2m: basic data structures and support routines Ed White
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

As implemented here, only supported on platforms with VMX HAP.

By default this functionality is force-disabled; it can be enabled
by specifying altp2m=1 on the Xen command line.
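
For example, a hypothetical GRUB2 boot entry enabling it (paths and other
options illustrative):

    multiboot2 /boot/xen.gz altp2m=1 ...
    module2 /boot/vmlinuz ...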

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v6:
        no changes

 docs/misc/xen-command-line.markdown | 7 +++++++
 xen/arch/x86/hvm/hvm.c              | 7 +++++++
 xen/arch/x86/hvm/vmx/vmx.c          | 1 +
 xen/include/asm-x86/hvm/hvm.h       | 9 +++++++++
 4 files changed, 24 insertions(+)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 70d7ab8..7bdcfff 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -140,6 +140,13 @@ mode during S3 resume.
 
 Permit Xen to use superpages when performing memory management.
 
+### altp2m (Intel)
+> `= <boolean>`
+
+> Default: `false`
+
+Permit multiple copies of host p2m.
+
 ### apic
 > `= bigsmp | default`
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c07e3ef..eafaf9d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -96,6 +96,10 @@ bool_t opt_hvm_fep;
 boolean_param("hvm_fep", opt_hvm_fep);
 #endif
 
+/* Xen command-line option to enable altp2m */
+static bool_t __initdata opt_altp2m_enabled = 0;
+boolean_param("altp2m", opt_altp2m_enabled);
+
 static int cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -162,6 +166,9 @@ static int __init hvm_enable(void)
     if ( !fns->pvh_supported )
         printk(XENLOG_INFO "HVM: PVH mode not supported on this platform\n");
 
+    if ( !opt_altp2m_enabled )
+        hvm_funcs.altp2m_supported = 0;
+
     /*
      * Allow direct access to the PC debug ports 0x80 and 0xed (they are
      * often used for I/O delays, but the vmexits simply slow things down).
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index d3183a8..4f8b0e0 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1847,6 +1847,7 @@ const struct hvm_function_table * __init start_vmx(void)
     if ( cpu_has_vmx_ept && (cpu_has_vmx_pat || opt_force_ept) )
     {
         vmx_function_table.hap_supported = 1;
+        vmx_function_table.altp2m_supported = 1;
 
         vmx_function_table.hap_capabilities = 0;
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 82f1b32..3a94f8c 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -94,6 +94,9 @@ struct hvm_function_table {
     /* Necessary hardware support for PVH mode? */
     int pvh_supported;
 
+    /* Necessary hardware support for alternate p2m's? */
+    bool_t altp2m_supported;
+
     /* Indicate HAP capabilities. */
     int hap_capabilities;
 
@@ -530,6 +533,12 @@ static inline bool_t hvm_is_singlestep_supported(void)
             hvm_funcs.is_singlestep_supported());
 }
 
+/* returns true if hardware supports alternate p2m's */
+static inline bool_t hvm_altp2m_supported(void)
+{
+    return hvm_funcs.altp2m_supported;
+}
+
 #ifndef NDEBUG
 /* Permit use of the Forced Emulation Prefix in HVM guests */
 extern bool_t opt_hvm_fep;
-- 
1.9.1


* [PATCH v7 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (3 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 04/15] x86/HVM: Hardware alternate p2m support detection Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-23  9:22   ` Jan Beulich
  2015-07-22 23:01 ` [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Add the basic data structures needed to support alternate p2m's and
the functions to initialise them and tear them down.

Although Intel hardware can handle 512 EPTP's per hardware thread
concurrently, only 10 per domain are supported in this patch for
performance reasons.

This change also splits the p2m lock into one lock type for altp2m's
and another type for all other p2m's. The purpose of this is to place
the altp2m list lock between the types, so the list lock can be
acquired whilst holding the host p2m lock.
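
Sketched as an illustrative (not literal) code path that propagates a host
p2m change into the alternate p2m's, the resulting nesting discipline is:

    /* Lock order: p2m (host) -> altp2m list -> altp2m (per-table). */
    p2m_lock(hostp2m);           /* "p2m" lock class */
    altp2m_list_lock(d);         /* per-domain altp2m list lock */
    p2m_lock(altp2m);            /* "altp2m" lock class */
    /* ... propagate the change into the alternate view ... */
    p2m_unlock(altp2m);
    altp2m_list_unlock(d);
    p2m_unlock(hostp2m);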

Signed-off-by: Ed White <edmund.h.white@intel.com>
---
Changes since v6:
        move altp2m.c from hvm to mm
        move altp2m.h from x86/mm to x86
        change iterator types in hap_enable/hap_final_teardown
        fix altp2m init failure path in p2m_init
        rename altp2m_vcpu_update_eptp to altp2m_vcpu_update_p2m
        change uint16_t's/uint8_t's to unsigned int's
        rework p2m_get_altp2m
		
		not done - mechanical change of moving the domain struct's 
		    new altp2m members to a separately alloc'ed struct 
			(an initial stab at this patch has been written)

 xen/arch/x86/hvm/hvm.c         |  21 +++++++++
 xen/arch/x86/mm/Makefile       |   1 +
 xen/arch/x86/mm/altp2m.c       |  77 +++++++++++++++++++++++++++++++
 xen/arch/x86/mm/hap/hap.c      |  40 +++++++++++++++-
 xen/arch/x86/mm/mm-locks.h     |  46 ++++++++++++++++++-
 xen/arch/x86/mm/p2m.c          | 101 +++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/altp2m.h   |  38 ++++++++++++++++
 xen/include/asm-x86/domain.h   |  10 ++++
 xen/include/asm-x86/hvm/hvm.h  |  14 ++++++
 xen/include/asm-x86/hvm/vcpu.h |   9 ++++
 xen/include/asm-x86/p2m.h      |  34 +++++++++++++-
 11 files changed, 386 insertions(+), 5 deletions(-)
 create mode 100644 xen/arch/x86/mm/altp2m.c
 create mode 100644 xen/include/asm-x86/altp2m.h

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index eafaf9d..e8d2ac3 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -61,6 +61,7 @@
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/event.h>
 #include <asm/hvm/vmx/vmx.h>
+#include <asm/altp2m.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
 #include <public/sched.h>
@@ -2462,6 +2463,7 @@ void hvm_vcpu_destroy(struct vcpu *v)
 {
     hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
 
+    altp2m_vcpu_destroy(v);
     nestedhvm_vcpu_destroy(v);
 
     free_compat_arg_xlat(v);
@@ -6569,6 +6571,25 @@ void hvm_toggle_singlestep(struct vcpu *v)
     v->arch.hvm_vcpu.single_step = !v->arch.hvm_vcpu.single_step;
 }
 
+void altp2m_vcpu_update_p2m(struct vcpu *v)
+{
+    if ( hvm_funcs.altp2m_vcpu_update_p2m )
+        hvm_funcs.altp2m_vcpu_update_p2m(v);
+}
+
+void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v)
+{
+    if ( hvm_funcs.altp2m_vcpu_update_vmfunc_ve )
+        hvm_funcs.altp2m_vcpu_update_vmfunc_ve(v);
+}
+
+bool_t altp2m_vcpu_emulate_ve(struct vcpu *v)
+{
+    if ( hvm_funcs.altp2m_vcpu_emulate_ve )
+        return hvm_funcs.altp2m_vcpu_emulate_ve(v);
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/mm/Makefile b/xen/arch/x86/mm/Makefile
index ed4b1f8..aeccdfc 100644
--- a/xen/arch/x86/mm/Makefile
+++ b/xen/arch/x86/mm/Makefile
@@ -3,6 +3,7 @@ subdir-y += hap
 
 obj-y += paging.o
 obj-y += p2m.o p2m-pt.o p2m-ept.o p2m-pod.o
+obj-y += altp2m.o
 obj-y += guest_walk_2.o
 obj-y += guest_walk_3.o
 obj-$(x86_64) += guest_walk_4.o
diff --git a/xen/arch/x86/mm/altp2m.c b/xen/arch/x86/mm/altp2m.c
new file mode 100644
index 0000000..17b227c
--- /dev/null
+++ b/xen/arch/x86/mm/altp2m.c
@@ -0,0 +1,77 @@
+/*
+ * Alternate p2m HVM
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#include <asm/hvm/support.h>
+#include <asm/hvm/hvm.h>
+#include <asm/p2m.h>
+#include <asm/altp2m.h>
+
+void
+altp2m_vcpu_reset(struct vcpu *v)
+{
+    struct altp2mvcpu *av = &vcpu_altp2m(v);
+
+    av->p2midx = INVALID_ALTP2M;
+    av->veinfo_gfn = _gfn(INVALID_GFN);
+}
+
+void
+altp2m_vcpu_initialise(struct vcpu *v)
+{
+    if ( v != current )
+        vcpu_pause(v);
+
+    altp2m_vcpu_reset(v);
+    vcpu_altp2m(v).p2midx = 0;
+    atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+
+    altp2m_vcpu_update_p2m(v);
+
+    if ( v != current )
+        vcpu_unpause(v);
+}
+
+void
+altp2m_vcpu_destroy(struct vcpu *v)
+{
+    struct p2m_domain *p2m;
+
+    if ( v != current )
+        vcpu_pause(v);
+
+    if ( (p2m = p2m_get_altp2m(v)) )
+        atomic_dec(&p2m->active_vcpus);
+
+    altp2m_vcpu_reset(v);
+
+    altp2m_vcpu_update_p2m(v);
+    altp2m_vcpu_update_vmfunc_ve(v);
+
+    if ( v != current )
+        vcpu_unpause(v);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 63980af..2d5f6b3 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)
 int hap_enable(struct domain *d, u32 mode)
 {
     unsigned int old_pages;
-    uint8_t i;
+    unsigned int i;
     int rv = 0;
 
     domain_pause(d);
@@ -498,6 +498,28 @@ int hap_enable(struct domain *d, u32 mode)
            goto out;
     }
 
+    if ( hvm_altp2m_supported() )
+    {
+        /* Init alternate p2m data */
+        if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
+        {
+            rv = -ENOMEM;
+            goto out;
+        }
+
+        for ( i = 0; i < MAX_EPTP; i++ )
+            d->arch.altp2m_eptp[i] = INVALID_MFN;
+
+        for ( i = 0; i < MAX_ALTP2M; i++ )
+        {
+            rv = p2m_alloc_table(d->arch.altp2m_p2m[i]);
+            if ( rv != 0 )
+               goto out;
+        }
+
+        d->arch.altp2m_active = 0;
+    }
+
     /* Now let other users see the new mode */
     d->arch.paging.mode = mode | PG_HAP_enable;
 
@@ -508,7 +530,21 @@ int hap_enable(struct domain *d, u32 mode)
 
 void hap_final_teardown(struct domain *d)
 {
-    uint8_t i;
+    unsigned int i;
+
+    if ( hvm_altp2m_supported() )
+    {
+        d->arch.altp2m_active = 0;
+
+        if ( d->arch.altp2m_eptp )
+        {
+            free_xenheap_page(d->arch.altp2m_eptp);
+            d->arch.altp2m_eptp = NULL;
+        }
+
+        for ( i = 0; i < MAX_ALTP2M; i++ )
+            p2m_teardown(d->arch.altp2m_p2m[i]);
+    }
 
     /* Destroy nestedp2m's first */
     for (i = 0; i < MAX_NESTEDP2M; i++) {
diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index b4f035e..c66f105 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -217,7 +217,7 @@ declare_mm_lock(nestedp2m)
 #define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)->arch.nested_p2m_lock)
 #define nestedp2m_unlock(d) mm_unlock(&(d)->arch.nested_p2m_lock)
 
-/* P2M lock (per-p2m-table)
+/* P2M lock (per-non-alt-p2m-table)
  *
  * This protects all queries and updates to the p2m table.
  * Queries may be made under the read lock but all modifications
@@ -225,10 +225,52 @@ declare_mm_lock(nestedp2m)
  *
  * The write lock is recursive as it is common for a code path to look
  * up a gfn and later mutate it.
+ *
+ * Note that this lock shares its implementation with the altp2m
+ * lock (not the altp2m list lock), so the implementation
+ * is found there.
+ *
+ * Changes made to the host p2m when in altp2m mode are propagated to the
+ * altp2ms synchronously in ept_set_entry().  At that point, we will hold
+ * the host p2m lock; propagating this change involves grabbing the
+ * altp2m_list lock, and the locks of the individual alternate p2ms.  In
+ * order to allow us to maintain locking order discipline, we split the p2m
+ * lock into p2m (for host p2ms) and altp2m (for alternate p2ms), putting
+ * the altp2mlist lock in the middle.
  */
 
 declare_mm_rwlock(p2m);
-#define p2m_lock(p)           mm_write_lock(p2m, &(p)->lock);
+
+/* Alternate P2M list lock (per-domain)
+ *
+ * A per-domain lock that protects the list of alternate p2m's.
+ * Any operation that walks the list needs to acquire this lock.
+ * Additionally, before destroying an alternate p2m all VCPU's
+ * in the target domain must be paused.
+ */
+
+declare_mm_lock(altp2mlist)
+#define altp2m_list_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_list_lock)
+#define altp2m_list_unlock(d) mm_unlock(&(d)->arch.altp2m_list_lock)
+
+/* P2M lock (per-altp2m-table)
+ *
+ * This protects all queries and updates to the p2m table.
+ * Queries may be made under the read lock but all modifications
+ * need the main (write) lock.
+ *
+ * The write lock is recursive as it is common for a code path to look
+ * up a gfn and later mutate it.
+ */
+
+declare_mm_rwlock(altp2m);
+#define p2m_lock(p)                         \
+{                                           \
+    if ( p2m_is_altp2m(p) )                 \
+        mm_write_lock(altp2m, &(p)->lock);  \
+    else                                    \
+        mm_write_lock(p2m, &(p)->lock);     \
+}
 #define p2m_unlock(p)         mm_write_unlock(&(p)->lock);
 #define gfn_lock(p,g,o)       p2m_lock(p)
 #define gfn_unlock(p,g,o)     p2m_unlock(p)
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6fe6387..536f69c 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -35,6 +35,7 @@
 #include <asm/hvm/vmx/vmx.h> /* ept_p2m_init() */
 #include <asm/mem_sharing.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/altp2m.h>
 #include <asm/hvm/svm/amd-iommu-proto.h>
 #include <xsm/xsm.h>
 
@@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct domain *d)
     }
 }
 
+static void p2m_teardown_altp2m(struct domain *d)
+{
+    unsigned int i;
+    struct p2m_domain *p2m;
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( !d->arch.altp2m_p2m[i] )
+            continue;
+        p2m = d->arch.altp2m_p2m[i];
+        p2m_free_one(p2m);
+        d->arch.altp2m_p2m[i] = NULL;
+    }
+}
+
+static int p2m_init_altp2m(struct domain *d)
+{
+    unsigned int i;
+    struct p2m_domain *p2m;
+
+    mm_lock_init(&d->arch.altp2m_list_lock);
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        d->arch.altp2m_p2m[i] = p2m = p2m_init_one(d);
+        if ( p2m == NULL )
+        {
+            p2m_teardown_altp2m(d);
+            return -ENOMEM;
+        }
+        p2m->p2m_class = p2m_alternate;
+        p2m->access_required = 1;
+        _atomic_set(&p2m->active_vcpus, 0);
+    }
+
+    return 0;
+}
+
 int p2m_init(struct domain *d)
 {
     int rc;
@@ -196,7 +234,17 @@ int p2m_init(struct domain *d)
      * (p2m_init runs too early for HVM_PARAM_* options) */
     rc = p2m_init_nestedp2m(d);
     if ( rc )
+    {
+        p2m_teardown_hostp2m(d);
+        return rc;
+    }
+
+    rc = p2m_init_altp2m(d);
+    if ( rc )
+    {
         p2m_teardown_hostp2m(d);
+        p2m_teardown_nestedp2m(d);
+    }
 
     return rc;
 }
@@ -1940,6 +1988,59 @@ int unmap_mmio_regions(struct domain *d,
     return err;
 }
 
+unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
+{
+    struct p2m_domain *p2m;
+    struct ept_data *ept;
+    unsigned int i;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
+            continue;
+
+        p2m = d->arch.altp2m_p2m[i];
+        ept = &p2m->ept;
+
+        if ( eptp == ept_get_eptp(ept) )
+            goto out;
+    }
+
+    i = INVALID_ALTP2M;
+
+out:
+    altp2m_list_unlock(d);
+    return i;
+}
+
+bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx)
+{
+    struct domain *d = v->domain;
+    bool_t rc = 0;
+
+    if ( idx > MAX_ALTP2M )
+        return rc;
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        if ( idx != vcpu_altp2m(v).p2midx )
+        {
+            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+            vcpu_altp2m(v).p2midx = idx;
+            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+            altp2m_vcpu_update_p2m(v);
+        }
+        rc = 1;
+    }
+
+    altp2m_list_unlock(d);
+    return rc;
+}
+
 /*** Audit ***/
 
 #if P2M_AUDIT
diff --git a/xen/include/asm-x86/altp2m.h b/xen/include/asm-x86/altp2m.h
new file mode 100644
index 0000000..38de494
--- /dev/null
+++ b/xen/include/asm-x86/altp2m.h
@@ -0,0 +1,38 @@
+/*
+ * Alternate p2m HVM
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#ifndef _ALTP2M_H
+#define _ALTP2M_H
+
+#include <xen/types.h>
+#include <xen/sched.h>         /* for struct vcpu, struct domain */
+#include <asm/hvm/vcpu.h>      /* for vcpu_altp2m */
+
+/* Alternate p2m HVM on/off per domain */
+static inline bool_t altp2m_active(const struct domain *d)
+{
+    return d->arch.altp2m_active;
+}
+
+/* Alternate p2m VCPU */
+void altp2m_vcpu_initialise(struct vcpu *v);
+void altp2m_vcpu_destroy(struct vcpu *v);
+void altp2m_vcpu_reset(struct vcpu *v);
+
+#endif /* _ALTP2M_H */
+
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 7a9e96f..752b284 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -233,6 +233,10 @@ struct paging_vcpu {
 typedef xen_domctl_cpuid_t cpuid_input_t;
 
 #define MAX_NESTEDP2M 10
+
+#define MAX_ALTP2M      ((uint16_t)10)
+#define INVALID_ALTP2M  ((uint16_t)~0)
+#define MAX_EPTP        (PAGE_SIZE / sizeof(uint64_t))
 struct p2m_domain;
 struct time_scale {
     int shift;
@@ -307,6 +311,12 @@ struct arch_domain
     struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
     mm_lock_t nested_p2m_lock;
 
+    /* altp2m: allow multiple copies of host p2m */
+    bool_t altp2m_active;
+    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
+    mm_lock_t altp2m_list_lock;
+    uint64_t *altp2m_eptp;
+
     /* NB. protected by d->event_lock and by irq_desc[irq].lock */
     struct radix_tree_root irq_pirq;
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 3a94f8c..0de061a 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -206,6 +206,11 @@ struct hvm_function_table {
 
     void (*enable_msr_exit_interception)(struct domain *d);
     bool_t (*is_singlestep_supported)(void);
+
+    /* Alternate p2m */
+    void (*altp2m_vcpu_update_p2m)(struct vcpu *v);
+    void (*altp2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
+    bool_t (*altp2m_vcpu_emulate_ve)(struct vcpu *v);
 };
 
 extern struct hvm_function_table hvm_funcs;
@@ -546,6 +551,15 @@ extern bool_t opt_hvm_fep;
 #define opt_hvm_fep 0
 #endif
 
+/* updates the current hardware p2m */
+void altp2m_vcpu_update_p2m(struct vcpu *v);
+
+/* updates VMCS fields related to VMFUNC and #VE */
+void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v);
+
+/* emulates #VE */
+bool_t altp2m_vcpu_emulate_ve(struct vcpu *v);
+
 #endif /* __ASM_X86_HVM_HVM_H__ */
 
 /*
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 0df4524..c033c8c 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -135,6 +135,13 @@ struct nestedvcpu {
 
 #define vcpu_nestedhvm(v) ((v)->arch.hvm_vcpu.nvcpu)
 
+struct altp2mvcpu {
+    uint16_t    p2midx;         /* alternate p2m index */
+    gfn_t       veinfo_gfn;     /* #VE information page gfn */
+};
+
+#define vcpu_altp2m(v) ((v)->arch.hvm_vcpu.avcpu)
+
 struct hvm_vcpu {
     /* Guest control-register and EFER values, just as the guest sees them. */
     unsigned long       guest_cr[5];
@@ -177,6 +184,8 @@ struct hvm_vcpu {
 
     struct nestedvcpu   nvcpu;
 
+    struct altp2mvcpu   avcpu;
+
     struct mtrr_state   mtrr;
     u64                 pat_cr;
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index b49c09b..2d56aad 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -175,6 +175,7 @@ typedef unsigned int p2m_query_t;
 typedef enum {
     p2m_host,
     p2m_nested,
+    p2m_alternate,
 } p2m_class_t;
 
 /* Per-p2m-table state */
@@ -193,7 +194,7 @@ struct p2m_domain {
 
     struct domain     *domain;   /* back pointer to domain */
 
-    p2m_class_t       p2m_class; /* host/nested/? */
+    p2m_class_t       p2m_class; /* host/nested/alternate */
 
     /* Nested p2ms only: nested p2m base value that this p2m shadows.
      * This can be cleared to P2M_BASE_EADDR under the per-p2m lock but
@@ -219,6 +220,9 @@ struct p2m_domain {
      * host p2m's lock. */
     int                defer_nested_flush;
 
+    /* Alternate p2m: count of vcpu's currently using this p2m. */
+    atomic_t           active_vcpus;
+
     /* Pages used to construct the p2m */
     struct page_list_head pages;
 
@@ -317,6 +321,11 @@ static inline bool_t p2m_is_nestedp2m(const struct p2m_domain *p2m)
     return p2m->p2m_class == p2m_nested;
 }
 
+static inline bool_t p2m_is_altp2m(const struct p2m_domain *p2m)
+{
+    return p2m->p2m_class == p2m_alternate;
+}
+
 #define p2m_get_pagetable(p2m)  ((p2m)->phys_table)
 
 /**** p2m query accessors. They lock p2m_lock, and thus serialize
@@ -722,6 +731,29 @@ void nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
     l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
 
 /*
+ * Alternate p2m: shadow p2m tables used for alternate memory views
+ */
+
+/* get current alternate p2m table */
+static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
+{
+    unsigned int index = vcpu_altp2m(v).p2midx;
+
+    if ( index == INVALID_ALTP2M )
+        return NULL;
+
+    BUG_ON(index >= MAX_ALTP2M);
+
+    return v->domain->arch.altp2m_p2m[index];
+}
+
+/* Locate an alternate p2m by its EPTP */
+unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp);
+
+/* Switch alternate p2m for a single vcpu */
+bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx);
+
+/*
  * p2m type to IOMMU flags
  */
 static inline unsigned int p2m_get_iommu_flags(p2m_type_t p2mt)
-- 
1.9.1


* [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (4 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 05/15] x86/altp2m: basic data structures and support routines Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-23  9:43   ` Jan Beulich
  2015-07-22 23:01 ` [PATCH v7 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Implement and hook up the code to enable VMX support of VMFUNC and #VE.

VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.
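
For context, a sketch of the guest-side consumer of the ve_info_t page defined
in patch 02 (hypothetical handler, not part of this patch): hardware delivers
#VE only while the semaphore word is zero, so the handler must re-arm it after
reading out the data.

    void guest_handle_ve(volatile ve_info_t *ve)
    {
        uint64_t gpa = ve->gpa;         /* faulting guest-physical address */
        uint16_t idx = ve->eptp_index;  /* altp2m view active at the fault */

        /* ... react, e.g. VMFUNC to a more permissive view ... */

        ve->semaphore = 0;              /* permit delivery of the next #VE */
    }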

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
---
Changes since v6:
        remove casts around p2midx handling
        fix veinfo semaphore initialization
        mechanical changes due to patch 5 changes

 xen/arch/x86/hvm/vmx/vmx.c | 139 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 139 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 4f8b0e0..269d160 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -56,6 +56,7 @@
 #include <asm/debugger.h>
 #include <asm/apic.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/altp2m.h>
 #include <asm/event.h>
 #include <asm/monitor.h>
 #include <public/arch-x86/cpuid.h>
@@ -1770,6 +1771,105 @@ static bool_t vmx_is_singlestep_supported(void)
     return cpu_has_monitor_trap_flag;
 }
 
+static void vmx_vcpu_update_eptp(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    struct p2m_domain *p2m = NULL;
+    struct ept_data *ept;
+
+    if ( altp2m_active(d) )
+        p2m = p2m_get_altp2m(v);
+    if ( !p2m )
+        p2m = p2m_get_hostp2m(d);
+
+    ept = &p2m->ept;
+    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
+
+    vmx_vmcs_enter(v);
+
+    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
+
+    if ( v->arch.hvm_vmx.secondary_exec_control &
+        SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
+        __vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);
+
+    vmx_vmcs_exit(v);
+}
+
+static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
+
+    if ( !cpu_has_vmx_vmfunc )
+        return;
+
+    if ( cpu_has_vmx_virt_exceptions )
+        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+
+    vmx_vmcs_enter(v);
+
+    if ( !d->is_dying && altp2m_active(d) )
+    {
+        v->arch.hvm_vmx.secondary_exec_control |= mask;
+        __vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
+        __vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
+
+        if ( cpu_has_vmx_virt_exceptions )
+        {
+            p2m_type_t t;
+            mfn_t mfn;
+
+            mfn = get_gfn_query_unlocked(d, gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
+
+            if ( mfn_x(mfn) != INVALID_MFN )
+                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
+            else
+                v->arch.hvm_vmx.secondary_exec_control &=
+                    ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+        }
+    }
+    else
+        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
+
+    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
+        v->arch.hvm_vmx.secondary_exec_control);
+
+    vmx_vmcs_exit(v);
+}
+
+static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
+{
+    bool_t rc = 0;
+    ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) != INVALID_GFN ?
+        hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn), 0) : NULL;
+
+    if ( !veinfo )
+        return 0;
+
+    if ( veinfo->semaphore != 0 )
+        goto out;
+
+    rc = 1;
+
+    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
+    veinfo->semaphore = ~0;
+    veinfo->eptp_index = vcpu_altp2m(v).p2midx;
+
+    vmx_vmcs_enter(v);
+    __vmread(EXIT_QUALIFICATION, &veinfo->exit_qualification);
+    __vmread(GUEST_LINEAR_ADDRESS, &veinfo->gla);
+    __vmread(GUEST_PHYSICAL_ADDRESS, &veinfo->gpa);
+    vmx_vmcs_exit(v);
+
+    hvm_inject_hw_exception(TRAP_virtualisation,
+                            HVM_DELIVER_NO_ERROR_CODE);
+
+out:
+    hvm_unmap_guest_frame(veinfo, 0);
+    return rc;
+}
+
 static struct hvm_function_table __initdata vmx_function_table = {
     .name                 = "VMX",
     .cpu_up_prepare       = vmx_cpu_up_prepare,
@@ -1828,6 +1928,9 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .hypervisor_cpuid_leaf = vmx_hypervisor_cpuid_leaf,
     .enable_msr_exit_interception = vmx_enable_msr_exit_interception,
     .is_singlestep_supported = vmx_is_singlestep_supported,
+    .altp2m_vcpu_update_p2m = vmx_vcpu_update_eptp,
+    .altp2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
+    .altp2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2769,6 +2872,42 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
     /* Now enable interrupts so it's safe to take locks. */
     local_irq_enable();
 
+    /*
+     * If the guest has the ability to switch EPTP without an exit,
+     * figure out whether it has done so and update the altp2m data.
+     */
+    if ( altp2m_active(v->domain) &&
+        (v->arch.hvm_vmx.secondary_exec_control &
+        SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
+    {
+        unsigned long idx;
+
+        if ( v->arch.hvm_vmx.secondary_exec_control &
+            SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
+            __vmread(EPTP_INDEX, &idx);
+        else
+        {
+            unsigned long eptp;
+
+            __vmread(EPT_POINTER, &eptp);
+
+            if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
+                 INVALID_ALTP2M )
+            {
+                gdprintk(XENLOG_ERR, "EPTP not found in alternate p2m list\n");
+                domain_crash(v->domain);
+            }
+        }
+
+        if ( idx != vcpu_altp2m(v).p2midx )
+        {
+            BUG_ON(idx >= MAX_ALTP2M);
+            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+            vcpu_altp2m(v).p2midx = idx;
+            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+        }
+    }
+
     /* XXX: This looks ugly, but we need a mechanism to ensure
      * any pending vmresume has really happened
      */
-- 
1.9.1


* [PATCH v7 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (5 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 08/15] x86/altp2m: add control of suppress_ve Ed White
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

From: Ravi Sahita <ravi.sahita@intel.com>

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v6:
        remove incorrect cast
        add Jan's ack

 xen/arch/x86/hvm/emulate.c             | 18 +++++++++++++++--
 xen/arch/x86/hvm/vmx/vmx.c             | 36 ++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_emulate/x86_emulate.c | 19 ++++++++++++------
 xen/arch/x86/x86_emulate/x86_emulate.h |  4 ++++
 xen/include/asm-x86/hvm/hvm.h          |  2 ++
 5 files changed, 71 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 15c2496..30acb78 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1593,6 +1593,18 @@ static int hvmemul_invlpg(
     return rc;
 }
 
+static int hvmemul_vmfunc(
+    struct x86_emulate_ctxt *ctxt)
+{
+    int rc;
+
+    rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
+    if ( rc != X86EMUL_OKAY )
+        hvmemul_inject_hw_exception(TRAP_invalid_op, 0, ctxt);
+
+    return rc;
+}
+
 static const struct x86_emulate_ops hvm_emulate_ops = {
     .read          = hvmemul_read,
     .insn_fetch    = hvmemul_insn_fetch,
@@ -1616,7 +1628,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
     .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
-    .invlpg        = hvmemul_invlpg
+    .invlpg        = hvmemul_invlpg,
+    .vmfunc        = hvmemul_vmfunc,
 };
 
 static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
@@ -1642,7 +1655,8 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
     .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
-    .invlpg        = hvmemul_invlpg
+    .invlpg        = hvmemul_invlpg,
+    .vmfunc        = hvmemul_vmfunc,
 };
 
 static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 269d160..4eaa97e 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -82,6 +82,7 @@ static void vmx_fpu_dirty_intercept(void);
 static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content);
 static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
 static void vmx_invlpg_intercept(unsigned long vaddr);
+static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
 
 uint8_t __read_mostly posted_intr_vector;
 
@@ -1838,6 +1839,19 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
     vmx_vmcs_exit(v);
 }
 
+static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
+{
+    int rc = X86EMUL_EXCEPTION;
+    struct vcpu *curr = current;
+
+    if ( !cpu_has_vmx_vmfunc && altp2m_active(curr->domain) &&
+         regs->_eax == 0 &&
+         p2m_switch_vcpu_altp2m_by_id(curr, regs->_ecx) )
+        rc = X86EMUL_OKAY;
+
+    return rc;
+}
+
 static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
 {
     bool_t rc = 0;
@@ -1906,6 +1920,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .msr_read_intercept   = vmx_msr_read_intercept,
     .msr_write_intercept  = vmx_msr_write_intercept,
     .invlpg_intercept     = vmx_invlpg_intercept,
+    .vmfunc_intercept     = vmx_vmfunc_intercept,
     .handle_cd            = vmx_handle_cd,
     .set_info_guest       = vmx_set_info_guest,
     .set_rdtsc_exiting    = vmx_set_rdtsc_exiting,
@@ -1931,6 +1946,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .altp2m_vcpu_update_p2m = vmx_vcpu_update_eptp,
     .altp2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
     .altp2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
+    .altp2m_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2102,6 +2118,19 @@ static void vmx_invlpg_intercept(unsigned long vaddr)
         vpid_sync_vcpu_gva(curr, vaddr);
 }
 
+static int vmx_vmfunc_intercept(struct cpu_user_regs *regs)
+{
+    /*
+     * This handler is a placeholder for a future where Xen may
+     * want to handle VMFUNC exits and resume a domain normally without
+     * injecting a #UD into the guest - for example, in a VT-nested
+     * scenario where Xen may want to lazily shadow the alternate
+     * EPTP list.
+     */
+    gdprintk(XENLOG_ERR, "Failed guest VMFUNC execution\n");
+    return X86EMUL_EXCEPTION;
+}
+
 static int vmx_cr_access(unsigned long exit_qualification)
 {
     struct vcpu *curr = current;
@@ -3260,6 +3289,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             update_guest_eip();
         break;
 
+    case EXIT_REASON_VMFUNC:
+        if ( vmx_vmfunc_intercept(regs) != X86EMUL_OKAY )
+            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        else
+            update_guest_eip();
+        break;
+
     case EXIT_REASON_MWAIT_INSTRUCTION:
     case EXIT_REASON_MONITOR_INSTRUCTION:
     case EXIT_REASON_GETSEC:
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index c017c69..e596131 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -3786,6 +3786,7 @@ x86_emulate(
         break;
     }
 
+ no_writeback:
     /* Inject #DB if single-step tracing was enabled at instruction start. */
     if ( (ctxt->regs->eflags & EFLG_TF) && (rc == X86EMUL_OKAY) &&
          (ops->inject_hw_exception != NULL) )
@@ -3816,19 +3817,17 @@ x86_emulate(
         struct segment_register reg;
         unsigned long base, limit, cr0, cr0w;
 
-        if ( modrm == 0xdf ) /* invlpga */
+        switch( modrm )
         {
+        case 0xdf: /* invlpga */
             generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
             generate_exception_if(!mode_ring0(), EXC_GP, 0);
             fail_if(ops->invlpg == NULL);
             if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.eax),
                                    ctxt)) )
                 goto done;
-            break;
-        }
-
-        if ( modrm == 0xf9 ) /* rdtscp */
-        {
+            goto no_writeback;
+        case 0xf9: /* rdtscp */ {
             uint64_t tsc_aux;
             fail_if(ops->read_msr == NULL);
             if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
@@ -3836,6 +3835,14 @@ x86_emulate(
             _regs.ecx = (uint32_t)tsc_aux;
             goto rdtsc;
         }
+        case 0xd4: /* vmfunc */
+            generate_exception_if(lock_prefix | rep_prefix() | (vex.pfx == vex_66),
+                                  EXC_UD, -1);
+            fail_if(ops->vmfunc == NULL);
+            if ( (rc = ops->vmfunc(ctxt)) != X86EMUL_OKAY )
+                goto done;
+            goto no_writeback;
+        }
 
         switch ( modrm_reg & 7 )
         {
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 064b8f4..a4d4ec8 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -397,6 +397,10 @@ struct x86_emulate_ops
         enum x86_segment seg,
         unsigned long offset,
         struct x86_emulate_ctxt *ctxt);
+
+    /* vmfunc: Emulate VMFUNC via the given EAX/ECX inputs */
+    int (*vmfunc)(
+        struct x86_emulate_ctxt *ctxt);
 };
 
 struct cpu_user_regs;
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 0de061a..425327a 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -167,6 +167,7 @@ struct hvm_function_table {
     int (*msr_read_intercept)(unsigned int msr, uint64_t *msr_content);
     int (*msr_write_intercept)(unsigned int msr, uint64_t msr_content);
     void (*invlpg_intercept)(unsigned long vaddr);
+    int (*vmfunc_intercept)(struct cpu_user_regs *regs);
     void (*handle_cd)(struct vcpu *v, unsigned long value);
     void (*set_info_guest)(struct vcpu *v);
     void (*set_rdtsc_exiting)(struct vcpu *v, bool_t);
@@ -211,6 +212,7 @@ struct hvm_function_table {
     void (*altp2m_vcpu_update_p2m)(struct vcpu *v);
     void (*altp2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
     bool_t (*altp2m_vcpu_emulate_ve)(struct vcpu *v);
+    int (*altp2m_vcpu_emulate_vmfunc)(struct cpu_user_regs *regs);
 };
 
 extern struct hvm_function_table hvm_funcs;
-- 
1.9.1


* [PATCH v7 08/15] x86/altp2m: add control of suppress_ve.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (6 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 09/15] x86/altp2m: alternate p2m memory events Ed White
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

From: George Dunlap <george.dunlap@eu.citrix.com>

The existing ept_set_entry() and ept_get_entry() routines are extended
to optionally set/get suppress_ve.  Passing -1 sets suppress_ve on new
p2m entries and retains the suppress_ve flag on existing entries.
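
For illustration, the new tristate resolves as in this stand-alone sketch
(not part of the patch; it merely models the logic added to ept_set_entry()
below):

    #include <assert.h>

    /* sve == -1: keep the old flag, defaulting to 1 for new entries;
     * sve == 0/1: force the flag (clearing it enables #VE delivery). */
    static int resolve_sve(int sve, int entry_valid, int old_sve)
    {
        if ( sve != -1 )
            return !!sve;
        return entry_valid ? old_sve : 1;
    }

    int main(void)
    {
        assert(resolve_sve(-1, 0, 0) == 1); /* new entry: default on */
        assert(resolve_sve(-1, 1, 0) == 0); /* existing entry: preserved */
        assert(resolve_sve(0, 1, 1) == 0);  /* explicit clear */
        return 0;
    }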

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
---
Changes since v6:
        formatting changes requested by Andrew

 xen/arch/x86/mm/mem_sharing.c |  4 ++--
 xen/arch/x86/mm/p2m-ept.c     | 16 ++++++++++++---
 xen/arch/x86/mm/p2m-pod.c     | 12 +++++------
 xen/arch/x86/mm/p2m-pt.c      | 10 +++++++--
 xen/arch/x86/mm/p2m.c         | 48 +++++++++++++++++++++----------------------
 xen/include/asm-x86/p2m.h     | 24 ++++++++++++----------
 6 files changed, 66 insertions(+), 48 deletions(-)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 1a01e45..d2e3786 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1260,7 +1260,7 @@ int relinquish_shared_pages(struct domain *d)
 
         if ( atomic_read(&d->shr_pages) == 0 )
             break;
-        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
         if ( mfn_valid(mfn) && (t == p2m_ram_shared) )
         {
             /* Does not fail with ENOMEM given the DESTROY flag */
@@ -1270,7 +1270,7 @@ int relinquish_shared_pages(struct domain *d)
              * unshare.  Must succeed: we just read the old entry and
              * we hold the p2m lock. */
             set_rc = p2m->set_entry(p2m, gfn, _mfn(0), PAGE_ORDER_4K,
-                                    p2m_invalid, p2m_access_rwx);
+                                    p2m_invalid, p2m_access_rwx, -1);
             ASSERT(set_rc == 0);
             count += 0x10;
         }
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index b532811..b4c65f9 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -658,7 +658,8 @@ bool_t ept_handle_misconfig(uint64_t gpa)
  */
 static int
 ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn, 
-              unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma)
+              unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma,
+              int sve)
 {
     ept_entry_t *table, *ept_entry = NULL;
     unsigned long gfn_remainder = gfn;
@@ -804,7 +805,11 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
     }
 
-    new_entry.suppress_ve = 1;
+    if ( sve != -1 )
+        new_entry.suppress_ve = !!sve;
+    else
+        new_entry.suppress_ve = is_epte_valid(&old_entry) ?
+                                    old_entry.suppress_ve : 1;
 
     rc = atomic_write_ept_entry(ept_entry, new_entry, target);
     if ( unlikely(rc) )
@@ -852,7 +857,8 @@ out:
 /* Read ept p2m entries */
 static mfn_t ept_get_entry(struct p2m_domain *p2m,
                            unsigned long gfn, p2m_type_t *t, p2m_access_t* a,
-                           p2m_query_t q, unsigned int *page_order)
+                           p2m_query_t q, unsigned int *page_order,
+                           bool_t *sve)
 {
     ept_entry_t *table = map_domain_page(_mfn(pagetable_get_pfn(p2m_get_pagetable(p2m))));
     unsigned long gfn_remainder = gfn;
@@ -866,6 +872,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
 
     *t = p2m_mmio_dm;
     *a = p2m_access_n;
+    if ( sve )
+        *sve = 1;
 
     /* This pfn is higher than the highest the p2m map currently holds */
     if ( gfn > p2m->max_mapped_pfn )
@@ -931,6 +939,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
         else
             *t = ept_entry->sa_p2mt;
         *a = ept_entry->access;
+        if ( sve )
+            *sve = ept_entry->suppress_ve;
 
         mfn = _mfn(ept_entry->mfn);
         if ( i )
diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 6e27bcd..6aee85a 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -536,7 +536,7 @@ recount:
         p2m_access_t a;
         p2m_type_t t;
 
-        (void)p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL);
+        (void)p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL, NULL);
 
         if ( t == p2m_populate_on_demand )
             pod++;
@@ -587,7 +587,7 @@ recount:
         p2m_type_t t;
         p2m_access_t a;
 
-        mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL, NULL);
         if ( t == p2m_populate_on_demand )
         {
             p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0, p2m_invalid,
@@ -676,7 +676,7 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     for ( i=0; i<SUPERPAGE_PAGES; i++ )
     {
         p2m_access_t a; 
-        mfn = p2m->get_entry(p2m, gfn + i, &type, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gfn + i, &type, &a, 0, NULL, NULL);
 
         if ( i == 0 )
         {
@@ -808,7 +808,7 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
     for ( i=0; i<count; i++ )
     {
         p2m_access_t a;
-        mfns[i] = p2m->get_entry(p2m, gfns[i], types + i, &a, 0, NULL);
+        mfns[i] = p2m->get_entry(p2m, gfns[i], types + i, &a, 0, NULL, NULL);
         /* If this is ram, and not a pagetable or from the xen heap, and probably not mapped
            elsewhere, map it; otherwise, skip. */
         if ( p2m_is_ram(types[i])
@@ -947,7 +947,7 @@ p2m_pod_emergency_sweep(struct p2m_domain *p2m)
     for ( i=p2m->pod.reclaim_single; i > 0 ; i-- )
     {
         p2m_access_t a;
-        (void)p2m->get_entry(p2m, i, &t, &a, 0, NULL);
+        (void)p2m->get_entry(p2m, i, &t, &a, 0, NULL, NULL);
         if ( p2m_is_ram(t) )
         {
             gfns[j] = i;
@@ -1135,7 +1135,7 @@ guest_physmap_mark_populate_on_demand(struct domain *d, unsigned long gfn,
     for ( i = 0; i < (1UL << order); i++ )
     {
         p2m_access_t a;
-        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL);
+        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL, NULL);
         if ( p2m_is_ram(ot) )
         {
             P2M_DEBUG("gfn_to_mfn returned type %d!\n", ot);
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index a6dd464..926de0b 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -482,7 +482,8 @@ int p2m_pt_handle_deferred_changes(uint64_t gpa)
 /* Returns: 0 for success, -errno for failure */
 static int
 p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
-                 unsigned int page_order, p2m_type_t p2mt, p2m_access_t p2ma)
+                 unsigned int page_order, p2m_type_t p2mt, p2m_access_t p2ma,
+                 int sve)
 {
     /* XXX -- this might be able to be faster iff current->domain == d */
     void *table;
@@ -495,6 +496,8 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
     unsigned int iommu_pte_flags = p2m_get_iommu_flags(p2mt);
     unsigned long old_mfn = 0;
 
+    ASSERT(sve != 0);
+
     if ( tb_init_done )
     {
         struct {
@@ -689,7 +692,7 @@ static inline p2m_type_t recalc_type(bool_t recalc, p2m_type_t t,
 static mfn_t
 p2m_pt_get_entry(struct p2m_domain *p2m, unsigned long gfn,
                  p2m_type_t *t, p2m_access_t *a, p2m_query_t q,
-                 unsigned int *page_order)
+                 unsigned int *page_order, bool_t *sve)
 {
     mfn_t mfn;
     paddr_t addr = ((paddr_t)gfn) << PAGE_SHIFT;
@@ -701,6 +704,9 @@ p2m_pt_get_entry(struct p2m_domain *p2m, unsigned long gfn,
 
     ASSERT(paging_mode_translate(p2m->domain));
 
+    if ( sve )
+        *sve = 1;
+
     /* XXX This is for compatibility with the old model, where anything not 
      * XXX marked as RAM was considered to be emulated MMIO space.
      * XXX Once we start explicitly registering MMIO regions in the p2m 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 536f69c..6143d4e 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -345,7 +345,7 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
         /* Grab the lock here, don't release until put_gfn */
         gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order);
+    mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL);
 
     if ( (q & P2M_UNSHARE) && p2m_is_shared(*t) )
     {
@@ -354,7 +354,7 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
          * sleeping. */
         if ( mem_sharing_unshare_page(p2m->domain, gfn, 0) < 0 )
             (void)mem_sharing_notify_enomem(p2m->domain, gfn, 0);
-        mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order);
+        mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL);
     }
 
     if (unlikely((p2m_is_broken(*t))))
@@ -458,7 +458,7 @@ int p2m_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         else
             order = 0;
 
-        set_rc = p2m->set_entry(p2m, gfn, mfn, order, p2mt, p2ma);
+        set_rc = p2m->set_entry(p2m, gfn, mfn, order, p2mt, p2ma, -1);
         if ( set_rc )
             rc = set_rc;
 
@@ -622,7 +622,7 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn, unsigned long mfn,
     {
         for ( i = 0; i < (1UL << page_order); i++ )
         {
-            mfn_return = p2m->get_entry(p2m, gfn + i, &t, &a, 0, NULL);
+            mfn_return = p2m->get_entry(p2m, gfn + i, &t, &a, 0, NULL, NULL);
             if ( !p2m_is_grant(t) && !p2m_is_shared(t) && !p2m_is_foreign(t) )
                 set_gpfn_from_mfn(mfn+i, INVALID_M2P_ENTRY);
             ASSERT( !p2m_is_valid(t) || mfn + i == mfn_x(mfn_return) );
@@ -685,7 +685,7 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
     /* First, remove m->p mappings for existing p->m mappings */
     for ( i = 0; i < (1UL << page_order); i++ )
     {
-        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL);
+        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL, NULL);
         if ( p2m_is_shared(ot) )
         {
             /* Do an unshare to cleanly take care of all corner 
@@ -709,7 +709,7 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
                 (void)mem_sharing_notify_enomem(p2m->domain, gfn + i, 0);
                 return rc;
             }
-            omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL);
+            omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL, NULL);
             ASSERT(!p2m_is_shared(ot));
         }
         if ( p2m_is_grant(ot) || p2m_is_foreign(ot) )
@@ -757,7 +757,7 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
              * address */
             P2M_DEBUG("aliased! mfn=%#lx, old gfn=%#lx, new gfn=%#lx\n",
                       mfn + i, ogfn, gfn + i);
-            omfn = p2m->get_entry(p2m, ogfn, &ot, &a, 0, NULL);
+            omfn = p2m->get_entry(p2m, ogfn, &ot, &a, 0, NULL, NULL);
             if ( p2m_is_ram(ot) && !p2m_is_paged(ot) )
             {
                 ASSERT(mfn_valid(omfn));
@@ -824,7 +824,7 @@ int p2m_change_type_one(struct domain *d, unsigned long gfn,
 
     gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, &pt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &pt, &a, 0, NULL, NULL);
     rc = likely(pt == ot)
          ? p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, nt,
                          p2m->default_access)
@@ -908,7 +908,7 @@ static int set_typed_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
         return -EIO;
 
     gfn_lock(p2m, gfn, 0);
-    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL);
+    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL, NULL);
     if ( p2m_is_grant(ot) || p2m_is_foreign(ot) )
     {
         p2m_unlock(p2m);
@@ -959,7 +959,7 @@ int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
         return -EIO;
 
     gfn_lock(p2m, gfn, 0);
-    actual_mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL);
+    actual_mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
 
     /* Do not use mfn_valid() here as it will usually fail for MMIO pages. */
     if ( (INVALID_MFN == mfn_x(actual_mfn)) || (t != p2m_mmio_direct) )
@@ -995,7 +995,7 @@ int set_shared_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
         return -EIO;
 
     gfn_lock(p2m, gfn, 0);
-    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL);
+    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL, NULL);
     /* At the moment we only allow p2m change if gfn has already been made
      * sharable first */
     ASSERT(p2m_is_shared(ot));
@@ -1047,7 +1047,7 @@ int p2m_mem_paging_nominate(struct domain *d, unsigned long gfn)
 
     gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
 
     /* Check if mfn is valid */
     if ( !mfn_valid(mfn) )
@@ -1109,7 +1109,7 @@ int p2m_mem_paging_evict(struct domain *d, unsigned long gfn)
     gfn_lock(p2m, gfn, 0);
 
     /* Get mfn */
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
     if ( unlikely(!mfn_valid(mfn)) )
         goto out;
 
@@ -1241,7 +1241,7 @@ void p2m_mem_paging_populate(struct domain *d, unsigned long gfn)
 
     /* Fix p2m mapping */
     gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
     /* Allow only nominated or evicted pages to enter page-in path */
     if ( p2mt == p2m_ram_paging_out || p2mt == p2m_ram_paged )
     {
@@ -1303,7 +1303,7 @@ int p2m_mem_paging_prep(struct domain *d, unsigned long gfn, uint64_t buffer)
 
     gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
 
     ret = -ENOENT;
     /* Allow missing pages */
@@ -1391,7 +1391,7 @@ void p2m_mem_paging_resume(struct domain *d, vm_event_response_t *rsp)
         unsigned long gfn = rsp->u.mem_access.gfn;
 
         gfn_lock(p2m, gfn, 0);
-        mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
         /*
          * Allow only pages which were prepared properly, or pages which
          * were nominated but not evicted.
@@ -1540,11 +1540,11 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
      * These calls to p2m->set_entry() must succeed: we have the gfn
      * locked and just did a successful get_entry(). */
     gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, NULL);
 
     if ( npfec.write_access && p2ma == p2m_access_rx2rw ) 
     {
-        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw);
+        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw, -1);
         ASSERT(rc == 0);
         gfn_unlock(p2m, gfn, 0);
         return 1;
@@ -1553,7 +1553,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     {
         ASSERT(npfec.write_access || npfec.read_access || npfec.insn_fetch);
         rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
-                            p2mt, p2m_access_rwx);
+                            p2mt, p2m_access_rwx, -1);
         ASSERT(rc == 0);
     }
     gfn_unlock(p2m, gfn, 0);
@@ -1573,14 +1573,14 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
         else
         {
             gfn_lock(p2m, gfn, 0);
-            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, NULL);
             if ( p2ma != p2m_access_n2rwx )
             {
                 /* A listener is not required, so clear the access
                  * restrictions.  This set must succeed: we have the
                  * gfn locked and just did a successful get_entry(). */
                 rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
-                                    p2mt, p2m_access_rwx);
+                                    p2mt, p2m_access_rwx, -1);
                 ASSERT(rc == 0);
             }
             gfn_unlock(p2m, gfn, 0);
@@ -1711,8 +1711,8 @@ long p2m_set_mem_access(struct domain *d, gfn_t gfn, uint32_t nr,
     p2m_lock(p2m);
     for ( gfn_l = gfn_x(gfn) + start; nr > start; ++gfn_l )
     {
-        mfn = p2m->get_entry(p2m, gfn_l, &t, &_a, 0, NULL);
-        rc = p2m->set_entry(p2m, gfn_l, mfn, PAGE_ORDER_4K, t, a);
+        mfn = p2m->get_entry(p2m, gfn_l, &t, &_a, 0, NULL, NULL);
+        rc = p2m->set_entry(p2m, gfn_l, mfn, PAGE_ORDER_4K, t, a, -1);
         if ( rc )
             break;
 
@@ -1761,7 +1761,7 @@ int p2m_get_mem_access(struct domain *d, gfn_t gfn, xenmem_access_t *access)
     }
 
     gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, gfn_x(gfn), &t, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn_x(gfn), &t, &a, 0, NULL, NULL);
     gfn_unlock(p2m, gfn, 0);
 
     if ( mfn_x(mfn) == INVALID_MFN )
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 2d56aad..e7ee135 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -226,17 +226,19 @@ struct p2m_domain {
     /* Pages used to construct the p2m */
     struct page_list_head pages;
 
-    int                (*set_entry   )(struct p2m_domain *p2m,
-                                       unsigned long gfn,
-                                       mfn_t mfn, unsigned int page_order,
-                                       p2m_type_t p2mt,
-                                       p2m_access_t p2ma);
-    mfn_t              (*get_entry   )(struct p2m_domain *p2m,
-                                       unsigned long gfn,
-                                       p2m_type_t *p2mt,
-                                       p2m_access_t *p2ma,
-                                       p2m_query_t q,
-                                       unsigned int *page_order);
+    int                (*set_entry)(struct p2m_domain *p2m,
+                                    unsigned long gfn,
+                                    mfn_t mfn, unsigned int page_order,
+                                    p2m_type_t p2mt,
+                                    p2m_access_t p2ma,
+                                    int sve);
+    mfn_t              (*get_entry)(struct p2m_domain *p2m,
+                                    unsigned long gfn,
+                                    p2m_type_t *p2mt,
+                                    p2m_access_t *p2ma,
+                                    p2m_query_t q,
+                                    unsigned int *page_order,
+                                    bool_t *sve);
     void               (*enable_hardware_log_dirty)(struct p2m_domain *p2m);
     void               (*disable_hardware_log_dirty)(struct p2m_domain *p2m);
     void               (*flush_hardware_cached_dirty)(struct p2m_domain *p2m);
-- 
1.9.1


* [PATCH v7 09/15] x86/altp2m: alternate p2m memory events.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (7 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 08/15] x86/altp2m: add control of suppress_ve Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 10/15] x86/altp2m: add remaining support routines Ed White
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Add a flag to indicate that a memory event occurred in an alternate p2m
and a field containing the p2m index. Allow any event response to switch
to a different alternate p2m using the same flag and field.

Modify p2m_mem_access_check() to handle alternate p2m's.
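
From the listener side, the new flag/field pair would be used along these
lines (a sketch only; ring setup and the remaining response fields follow
the usual vm_event pattern and are elided):

    #include <stdint.h>
    #include <string.h>
    #include <xen/vm_event.h>

    /* Hypothetical helper: ask Xen to resume the faulting vcpu in a
     * different (e.g. more restrictive) altp2m view. */
    static void fill_altp2m_response(const vm_event_request_t *req,
                                     vm_event_response_t *rsp,
                                     uint16_t view_idx)
    {
        memset(rsp, 0, sizeof(*rsp));
        rsp->vcpu_id = req->vcpu_id;
        /* Keep the pause flag so the vcpu is unpaused on resume. */
        rsp->flags = req->flags & VM_EVENT_FLAG_VCPU_PAUSED;
        /* New in this patch: request a p2m view switch. */
        rsp->flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
        rsp->altp2m_idx = view_idx;
    }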

Signed-off-by: Ed White <edmund.h.white@intel.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> for the x86 bits.
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Tamas K Lengyel <tlengyel@novetta.com>
---
Changes since v6:
        no changes

 xen/arch/x86/mm/p2m.c         | 19 ++++++++++++++++++-
 xen/common/vm_event.c         |  4 ++++
 xen/include/asm-arm/p2m.h     |  6 ++++++
 xen/include/asm-x86/p2m.h     |  3 +++
 xen/include/public/vm_event.h | 12 ++++++++++++
 5 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6143d4e..ac901ac 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1521,6 +1521,12 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
     }
 }
 
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
+{
+    if ( altp2m_active(v->domain) )
+        p2m_switch_vcpu_altp2m_by_id(v, idx);
+}
+
 bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
                             struct npfec npfec,
                             vm_event_request_t **req_ptr)
@@ -1528,7 +1534,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     struct vcpu *v = current;
     unsigned long gfn = gpa >> PAGE_SHIFT;
     struct domain *d = v->domain;    
-    struct p2m_domain* p2m = p2m_get_hostp2m(d);
+    struct p2m_domain *p2m = NULL;
     mfn_t mfn;
     p2m_type_t p2mt;
     p2m_access_t p2ma;
@@ -1536,6 +1542,11 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     int rc;
     unsigned long eip = guest_cpu_user_regs()->eip;
 
+    if ( altp2m_active(d) )
+        p2m = p2m_get_altp2m(v);
+    if ( !p2m )
+        p2m = p2m_get_hostp2m(d);
+
     /* First, handle rx2rw conversion automatically.
      * These calls to p2m->set_entry() must succeed: we have the gfn
      * locked and just did a successful get_entry(). */
@@ -1650,6 +1661,12 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
         req->vcpu_id = v->vcpu_id;
 
         p2m_vm_event_fill_regs(req);
+
+        if ( altp2m_active(v->domain) )
+        {
+            req->flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
+            req->altp2m_idx = vcpu_altp2m(v).p2midx;
+        }
     }
 
     /* Pause the current VCPU */
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 4c6bf98..f3a8736 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -412,6 +412,10 @@ void vm_event_resume(struct domain *d, struct vm_event_domain *ved)
 
         };
 
+        /* Check for altp2m switch */
+        if ( rsp.flags & VM_EVENT_FLAG_ALTERNATE_P2M )
+            p2m_altp2m_check(v, rsp.altp2m_idx);
+
         if ( rsp.flags & VM_EVENT_FLAG_VCPU_PAUSED )
         {
             if ( rsp.flags & VM_EVENT_FLAG_TOGGLE_SINGLESTEP )
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 63748ef..08bdce3 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -109,6 +109,12 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
     /* Not supported on ARM. */
 }
 
+static inline
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
+{
+    /* Not supported on ARM. */
+}
+
 #define p2m_is_foreign(_t)  ((_t) == p2m_map_foreign)
 #define p2m_is_ram(_t)      ((_t) == p2m_ram_rw || (_t) == p2m_ram_ro)
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index e7ee135..98aa12d 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -755,6 +755,9 @@ unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp);
 /* Switch alternate p2m for a single vcpu */
 bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx);
 
+/* Check to see if vcpu should be switched to a different p2m. */
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
+
 /*
  * p2m type to IOMMU flags
  */
diff --git a/xen/include/public/vm_event.h b/xen/include/public/vm_event.h
index fbc76b2..ff2f217 100644
--- a/xen/include/public/vm_event.h
+++ b/xen/include/public/vm_event.h
@@ -79,6 +79,16 @@
   * Currently only useful for MSR, CR0, CR3 and CR4 write events.
   */
 #define VM_EVENT_FLAG_DENY               (1 << 6)
+/*
+ * This flag can be set in a request or a response
+ *
+ * On a request, indicates that the event occurred in the alternate p2m
+ * specified by the altp2m_idx request field.
+ *
+ * On a response, indicates that the VCPU should resume in the alternate
+ * p2m specified by the altp2m_idx response field if possible.
+ */
+#define VM_EVENT_FLAG_ALTERNATE_P2M      (1 << 7)
 
 /*
  * Reasons for the vm event request
@@ -221,6 +231,8 @@ typedef struct vm_event_st {
     uint32_t flags;     /* VM_EVENT_FLAG_* */
     uint32_t reason;    /* VM_EVENT_REASON_* */
     uint32_t vcpu_id;
+    uint16_t altp2m_idx; /* may be used during request and response */
+    uint16_t _pad[3];
 
     union {
         struct vm_event_paging                mem_paging;
-- 
1.9.1


* [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (8 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 09/15] x86/altp2m: alternate p2m memory events Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-23 10:05   ` Jan Beulich
  2015-07-23 19:10   ` George Dunlap
  2015-07-22 23:01 ` [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
                   ` (6 subsequent siblings)
  16 siblings, 2 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Add the remaining routines required to support enabling the alternate
p2m functionality.
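
One detail worth calling out: the remap-range tracking added here lets
p2m_altp2m_propagate_change() avoid resetting every altp2m view on each
host p2m change.  Conceptually (a stand-alone model only, with field names
taken from the patch):

    #include <stdbool.h>

    struct altp2m_view {
        unsigned long min_remapped_gfn;  /* INVALID_GFN while no remaps */
        unsigned long max_remapped_gfn;
    };

    /* Only a host entry dropped inside the remap range can have left a
     * stale duplicate gfn->mfn mapping in this view, forcing a full
     * reset; any other change is simply propagated in place. */
    static bool view_needs_reset(const struct altp2m_view *v,
                                 unsigned long gfn, bool host_entry_dropped)
    {
        return host_entry_dropped &&
               gfn >= v->min_remapped_gfn && gfn <= v->max_remapped_gfn;
    }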

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v6:
        rename altp2m lazy copier, make bool_t, use __put_gfn throughout,
          and move to p2m.c, eliminating altp2m_hap.c
        change various p2m_altp2m... routines from long to int
        change uint16_t's/uint8_t's to unsigned int's
        optimize remapped gfn tracking and altp2m invalidation check
        mechanical changes due to patch 5 changes
		
		not done - abstracting some ept functionality from p2m

 xen/arch/x86/hvm/hvm.c    |  61 ++++++-
 xen/arch/x86/mm/p2m-ept.c |   3 +
 xen/arch/x86/mm/p2m.c     | 444 ++++++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/p2m.h |  39 ++++
 4 files changed, 541 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e8d2ac3..ba771c3 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2856,10 +2856,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     mfn_t mfn;
     struct vcpu *curr = current;
     struct domain *currd = curr->domain;
-    struct p2m_domain *p2m;
+    struct p2m_domain *p2m, *hostp2m;
     int rc, fall_through = 0, paged = 0;
     int sharing_enomem = 0;
     vm_event_request_t *req_ptr = NULL;
+    bool_t ap2m_active;
 
     /* On Nested Virtualization, walk the guest page table.
      * If this succeeds, all is fine.
@@ -2919,11 +2920,32 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
         goto out;
     }
 
-    p2m = p2m_get_hostp2m(currd);
-    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 
+    ap2m_active = altp2m_active(currd);
+
+    /*
+     * Take a lock on the host p2m speculatively, to avoid potential
+     * locking order problems later and to handle unshare etc.
+     */
+    hostp2m = p2m_get_hostp2m(currd);
+    mfn = get_gfn_type_access(hostp2m, gfn, &p2mt, &p2ma,
                               P2M_ALLOC | (npfec.write_access ? P2M_UNSHARE : 0),
                               NULL);
 
+    if ( ap2m_active )
+    {
+        if ( p2m_altp2m_lazy_copy(curr, gpa, gla, npfec, &p2m) )
+        {
+            /* entry was lazily copied from host -- retry */
+            __put_gfn(hostp2m, gfn);
+            rc = 1;
+            goto out;
+        }
+
+        mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+    }
+    else
+        p2m = hostp2m;
+
     /* Check access permissions first, then handle faults */
     if ( mfn_x(mfn) != INVALID_MFN )
     {
@@ -2963,6 +2985,20 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
 
         if ( violation )
         {
+            /* Should #VE be emulated for this fault? */
+            if ( p2m_is_altp2m(p2m) && !cpu_has_vmx_virt_exceptions )
+            {
+                bool_t sve;
+
+                p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
+
+                if ( !sve && altp2m_vcpu_emulate_ve(curr) )
+                {
+                    rc = 1;
+                    goto out_put_gfn;
+                }
+            }
+
             if ( p2m_mem_access_check(gpa, gla, npfec, &req_ptr) )
             {
                 fall_through = 1;
@@ -2982,7 +3018,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
          (npfec.write_access &&
           (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
     {
-        put_gfn(currd, gfn);
+        __put_gfn(p2m, gfn);
+        if ( ap2m_active )
+            __put_gfn(hostp2m, gfn);
 
         rc = 0;
         if ( unlikely(is_pvh_domain(currd)) )
@@ -3011,6 +3049,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     /* Spurious fault? PoD and log-dirty also take this path. */
     if ( p2m_is_ram(p2mt) )
     {
+        rc = 1;
         /*
          * Page log dirty is always done with order 0. If this mfn resides in
          * a large page, we do not change other pages type within that large
@@ -3019,9 +3058,17 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
         if ( npfec.write_access )
         {
             paging_mark_dirty(currd, mfn_x(mfn));
+            /*
+             * If p2m is really an altp2m, unlock here to avoid lock ordering
+             * violation when the change below is propagated from host p2m.
+             */
+            if ( ap2m_active )
+                __put_gfn(p2m, gfn);
             p2m_change_type_one(currd, gfn, p2m_ram_logdirty, p2m_ram_rw);
+            __put_gfn(ap2m_active ? hostp2m : p2m, gfn);
+
+            goto out;
         }
-        rc = 1;
         goto out_put_gfn;
     }
 
@@ -3031,7 +3078,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     rc = fall_through;
 
 out_put_gfn:
-    put_gfn(currd, gfn);
+    __put_gfn(p2m, gfn);
+    if ( ap2m_active )
+        __put_gfn(hostp2m, gfn);
 out:
     /* All of these are delayed until we exit, since we might 
      * sleep on event ring wait queues, and we must not hold
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index b4c65f9..dc82518 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -851,6 +851,9 @@ out:
     if ( is_epte_present(&old_entry) )
         ept_free_entry(p2m, &old_entry, target);
 
+    if ( rc == 0 && p2m_is_hostp2m(p2m) )
+        p2m_altp2m_propagate_change(d, _gfn(gfn), mfn, order, p2mt, p2ma);
+
     return rc;
 }
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index ac901ac..66b025c 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2058,6 +2058,450 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx)
     return rc;
 }
 
+/*
+ * If the fault is for a not-present entry:
+ *     if the entry in the host p2m has a valid mfn, copy it and retry
+ *     else indicate that the outer handler should handle the fault
+ *
+ * If the fault is for a present entry:
+ *     indicate that the outer handler should handle the fault
+ */
+
+bool_t p2m_altp2m_lazy_copy(struct vcpu *v, paddr_t gpa,
+                            unsigned long gla, struct npfec npfec,
+                            struct p2m_domain **ap2m)
+{
+    struct p2m_domain *hp2m = p2m_get_hostp2m(v->domain);
+    p2m_type_t p2mt;
+    p2m_access_t p2ma;
+    unsigned int page_order;
+    gfn_t gfn = _gfn(paddr_to_pfn(gpa));
+    unsigned long mask;
+    mfn_t mfn;
+    int rv;
+
+    *ap2m = p2m_get_altp2m(v);
+
+    mfn = get_gfn_type_access(*ap2m, gfn_x(gfn), &p2mt, &p2ma,
+                              0, &page_order);
+    __put_gfn(*ap2m, gfn_x(gfn));
+
+    if ( mfn_x(mfn) != INVALID_MFN )
+        return 0;
+
+    mfn = get_gfn_type_access(hp2m, gfn_x(gfn), &p2mt, &p2ma,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order);
+    __put_gfn(hp2m, gfn_x(gfn));
+
+    if ( mfn_x(mfn) == INVALID_MFN )
+        return 0;
+
+    p2m_lock(*ap2m);
+
+    /*
+     * If this is a superpage mapping, round down both frame numbers
+     * to the start of the superpage.
+     */
+    mask = ~((1UL << page_order) - 1);
+    mfn = _mfn(mfn_x(mfn) & mask);
+
+    rv = p2m_set_entry(*ap2m, gfn_x(gfn) & mask, mfn, page_order, p2mt, p2ma);
+    p2m_unlock(*ap2m);
+
+    if ( rv )
+    {
+        gdprintk(XENLOG_ERR,
+	    "failed to set entry for %#"PRIx64" -> %#"PRIx64" p2m %#"PRIx64"\n",
+	    gfn_x(gfn), mfn_x(mfn), (unsigned long)*ap2m);
+        domain_crash(hp2m->domain);
+    }
+
+    return 1;
+}
+
+void p2m_flush_altp2m(struct domain *d)
+{
+    unsigned int i;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        p2m_flush_table(d->arch.altp2m_p2m[i]);
+        /* Uninit and reinit ept to force TLB shootdown */
+        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
+        ept_p2m_init(d->arch.altp2m_p2m[i]);
+        d->arch.altp2m_eptp[i] = INVALID_MFN;
+    }
+
+    altp2m_list_unlock(d);
+}
+
+static void p2m_init_altp2m_helper(struct domain *d, unsigned int i)
+{
+    struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
+    struct ept_data *ept;
+
+    p2m->min_remapped_gfn = INVALID_GFN;
+    p2m->max_remapped_gfn = 0;
+    ept = &p2m->ept;
+    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
+    d->arch.altp2m_eptp[i] = ept_get_eptp(ept);
+}
+
+int p2m_init_altp2m_by_id(struct domain *d, unsigned int idx)
+{
+    int rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] == INVALID_MFN )
+    {
+        p2m_init_altp2m_helper(d, idx);
+        rc = 0;
+    }
+
+    altp2m_list_unlock(d);
+    return rc;
+}
+
+int p2m_init_next_altp2m(struct domain *d, uint16_t *idx)
+{
+    int rc = -EINVAL;
+    unsigned int i;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] != INVALID_MFN )
+            continue;
+
+        p2m_init_altp2m_helper(d, i);
+        *idx = i;
+        rc = 0;
+
+        break;
+    }
+
+    altp2m_list_unlock(d);
+    return rc;
+}
+
+int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx)
+{
+    struct p2m_domain *p2m;
+    int rc = -EINVAL;
+
+    if ( !idx || idx >= MAX_ALTP2M )
+        return rc;
+
+    domain_pause_except_self(d);
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        p2m = d->arch.altp2m_p2m[idx];
+
+        if ( !_atomic_read(p2m->active_vcpus) )
+        {
+            p2m_flush_table(d->arch.altp2m_p2m[idx]);
+            /* Uninit and reinit ept to force TLB shootdown */
+            ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
+            ept_p2m_init(d->arch.altp2m_p2m[idx]);
+            d->arch.altp2m_eptp[idx] = INVALID_MFN;
+            rc = 0;
+        }
+    }
+
+    altp2m_list_unlock(d);
+
+    domain_unpause_except_self(d);
+
+    return rc;
+}
+
+int p2m_switch_domain_altp2m_by_id(struct domain *d, unsigned int idx)
+{
+    struct vcpu *v;
+    int rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    domain_pause_except_self(d);
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        for_each_vcpu( d, v )
+            if ( idx != vcpu_altp2m(v).p2midx )
+            {
+                atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+                vcpu_altp2m(v).p2midx = idx;
+                atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+                altp2m_vcpu_update_p2m(v);
+            }
+
+        rc = 0;
+    }
+
+    altp2m_list_unlock(d);
+
+    domain_unpause_except_self(d);
+
+    return rc;
+}
+
+int p2m_set_altp2m_mem_access(struct domain *d, unsigned int idx,
+                              gfn_t gfn, xenmem_access_t access)
+{
+    struct p2m_domain *hp2m, *ap2m;
+    p2m_access_t req_a, old_a;
+    p2m_type_t t;
+    mfn_t mfn;
+    unsigned int page_order;
+    int rc = -EINVAL;
+
+    static const p2m_access_t memaccess[] = {
+#define ACCESS(ac) [XENMEM_access_##ac] = p2m_access_##ac
+        ACCESS(n),
+        ACCESS(r),
+        ACCESS(w),
+        ACCESS(rw),
+        ACCESS(x),
+        ACCESS(rx),
+        ACCESS(wx),
+        ACCESS(rwx),
+#undef ACCESS
+    };
+
+    if ( idx >= MAX_ALTP2M || d->arch.altp2m_eptp[idx] == INVALID_MFN )
+        return rc;
+
+    ap2m = d->arch.altp2m_p2m[idx];
+
+    switch ( access )
+    {
+    case 0 ... ARRAY_SIZE(memaccess) - 1:
+        req_a = memaccess[access];
+        break;
+    case XENMEM_access_default:
+        req_a = ap2m->default_access;
+        break;
+    default:
+        return rc;
+    }
+
+    /* If request to set default access */
+    if ( gfn_x(gfn) == INVALID_GFN )
+    {
+        ap2m->default_access = req_a;
+        return 0;
+    }
+
+    hp2m = p2m_get_hostp2m(d);
+
+    p2m_lock(ap2m);
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(gfn), &t, &old_a, 0, NULL, NULL);
+
+    /* Check host p2m if no valid entry in alternate */
+    if ( !mfn_valid(mfn) )
+    {
+        mfn = hp2m->get_entry(hp2m, gfn_x(gfn), &t, &old_a,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order, NULL);
+
+        if ( !mfn_valid(mfn) || t != p2m_ram_rw )
+            goto out;
+
+        /* If this is a superpage, copy that first */
+        if ( page_order != PAGE_ORDER_4K )
+        {
+            gfn_t gfn2;
+            unsigned long mask;
+            mfn_t mfn2;
+
+            mask = ~((1UL << page_order) - 1);
+            gfn2 = _gfn(gfn_x(gfn) & mask);
+            mfn2 = _mfn(mfn_x(mfn) & mask);
+
+            if ( ap2m->set_entry(ap2m, gfn_x(gfn2), mfn2, page_order, t, old_a, 1) )
+                goto out;
+        }
+    }
+
+    if ( !ap2m->set_entry(ap2m, gfn_x(gfn), mfn, PAGE_ORDER_4K, t, req_a,
+                          (current->domain != d)) )
+        rc = 0;
+
+out:
+    p2m_unlock(ap2m);
+    return rc;
+}
+
+int p2m_change_altp2m_gfn(struct domain *d, unsigned int idx,
+                          gfn_t old_gfn, gfn_t new_gfn)
+{
+    struct p2m_domain *hp2m, *ap2m;
+    p2m_access_t a;
+    p2m_type_t t;
+    mfn_t mfn;
+    unsigned int page_order;
+    int rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M || d->arch.altp2m_eptp[idx] == INVALID_MFN )
+        return rc;
+
+    hp2m = p2m_get_hostp2m(d);
+    ap2m = d->arch.altp2m_p2m[idx];
+
+    p2m_lock(ap2m);
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(old_gfn), &t, &a, 0, NULL, NULL);
+
+    if ( gfn_x(new_gfn) == INVALID_GFN )
+    {
+        if ( mfn_valid(mfn) )
+            p2m_remove_page(ap2m, gfn_x(old_gfn), mfn_x(mfn), PAGE_ORDER_4K);
+        rc = 0;
+        goto out;
+    }
+
+    /* Check host p2m if no valid entry in alternate */
+    if ( !mfn_valid(mfn) )
+    {
+        mfn = hp2m->get_entry(hp2m, gfn_x(old_gfn), &t, &a,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order, NULL);
+
+        if ( !mfn_valid(mfn) || t != p2m_ram_rw )
+            goto out;
+
+        /* If this is a superpage, copy that first */
+        if ( page_order != PAGE_ORDER_4K )
+        {
+            gfn_t gfn;
+            unsigned long mask;
+
+            mask = ~((1UL << page_order) - 1);
+            gfn = _gfn(gfn_x(old_gfn) & mask);
+            mfn = _mfn(mfn_x(mfn) & mask);
+
+            if ( ap2m->set_entry(ap2m, gfn_x(gfn), mfn, page_order, t, a, 1) )
+                goto out;
+        }
+    }
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(new_gfn), &t, &a, 0, NULL, NULL);
+
+    if ( !mfn_valid(mfn) )
+        mfn = hp2m->get_entry(hp2m, gfn_x(new_gfn), &t, &a, 0, NULL, NULL);
+
+    if ( !mfn_valid(mfn) || (t != p2m_ram_rw) )
+        goto out;
+
+    if ( !ap2m->set_entry(ap2m, gfn_x(old_gfn), mfn, PAGE_ORDER_4K, t, a,
+                          (current->domain != d)) )
+    {
+        rc = 0;
+
+        if ( gfn_x(new_gfn) < ap2m->min_remapped_gfn )
+            ap2m->min_remapped_gfn = gfn_x(new_gfn);
+        if ( gfn_x(new_gfn) > ap2m->max_remapped_gfn )
+            ap2m->max_remapped_gfn = gfn_x(new_gfn);
+    }
+
+out:
+    p2m_unlock(ap2m);
+    return rc;
+}
+
+static void p2m_reset_altp2m(struct p2m_domain *p2m)
+{
+    p2m_flush_table(p2m);
+    /* Uninit and reinit ept to force TLB shootdown */
+    ept_p2m_uninit(p2m);
+    ept_p2m_init(p2m);
+    p2m->min_remapped_gfn = INVALID_GFN;
+    p2m->max_remapped_gfn = 0;
+}
+
+void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
+                                 mfn_t mfn, unsigned int page_order,
+                                 p2m_type_t p2mt, p2m_access_t p2ma)
+{
+    struct p2m_domain *p2m;
+    p2m_access_t a;
+    p2m_type_t t;
+    mfn_t m;
+    unsigned int i;
+    bool_t reset_p2m;
+    unsigned int reset_count = 0;
+    unsigned int last_reset_idx = ~0;
+
+    if ( !altp2m_active(d) )
+        return;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
+            continue;
+
+        p2m = d->arch.altp2m_p2m[i];
+        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
+
+        reset_p2m = 0;
+
+        /* Check for a dropped page that may impact this altp2m */
+        if ( mfn_x(mfn) == INVALID_MFN &&
+             gfn_x(gfn) >= p2m->min_remapped_gfn &&
+             gfn_x(gfn) <= p2m->max_remapped_gfn )
+            reset_p2m = 1;
+
+        if ( reset_p2m )
+        {
+            if ( !reset_count++ )
+            {
+                p2m_reset_altp2m(p2m);
+                last_reset_idx = i;
+            }
+            else
+            {
+                /* At least 2 altp2m's impacted, so reset everything */
+                __put_gfn(p2m, gfn_x(gfn));
+
+                for ( i = 0; i < MAX_ALTP2M; i++ )
+                {
+                    if ( i == last_reset_idx ||
+                         d->arch.altp2m_eptp[i] == INVALID_MFN )
+                        continue;
+
+                    p2m = d->arch.altp2m_p2m[i];
+                    p2m_lock(p2m);
+                    p2m_reset_altp2m(p2m);
+                    p2m_unlock(p2m);
+                }
+
+                goto out;
+            }
+        }
+        else if ( mfn_x(m) != INVALID_MFN )
+           p2m_set_entry(p2m, gfn_x(gfn), mfn, page_order, p2mt, p2ma);
+
+        __put_gfn(p2m, gfn_x(gfn));
+    }
+
+ out:
+    altp2m_list_unlock(d);
+}
+
 /*** Audit ***/
 
 #if P2M_AUDIT
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 98aa12d..ab9485e 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -268,6 +268,13 @@ struct p2m_domain {
     /* Highest guest frame that's ever been mapped in the p2m */
     unsigned long max_mapped_pfn;
 
+    /*
+     * Alternate p2m's only: range of gfn's for which underlying
+     * mfn may have duplicate mappings
+     */
+    unsigned long min_remapped_gfn;
+    unsigned long max_remapped_gfn;
+
     /* When releasing shared gfn's in a preemptible manner, recall where
      * to resume the search */
     unsigned long next_shared_gfn_to_relinquish;
@@ -758,6 +765,38 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx);
 /* Check to see if vcpu should be switched to a different p2m. */
 void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
 
+/* Flush all the alternate p2m's for a domain */
+void p2m_flush_altp2m(struct domain *d);
+
+/* Alternate p2m paging */
+bool_t p2m_altp2m_lazy_copy(struct vcpu *v, paddr_t gpa,
+    unsigned long gla, struct npfec npfec, struct p2m_domain **ap2m);
+
+/* Make a specific alternate p2m valid */
+int p2m_init_altp2m_by_id(struct domain *d, unsigned int idx);
+
+/* Find an available alternate p2m and make it valid */
+int p2m_init_next_altp2m(struct domain *d, uint16_t *idx);
+
+/* Make a specific alternate p2m invalid */
+int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx);
+
+/* Switch alternate p2m for entire domain */
+int p2m_switch_domain_altp2m_by_id(struct domain *d, unsigned int idx);
+
+/* Set access type for a gfn */
+int p2m_set_altp2m_mem_access(struct domain *d, unsigned int idx,
+                              gfn_t gfn, xenmem_access_t access);
+
+/* Change a gfn->mfn mapping */
+int p2m_change_altp2m_gfn(struct domain *d, unsigned int idx,
+                          gfn_t old_gfn, gfn_t new_gfn);
+
+/* Propagate a host p2m change to all alternate p2m's */
+void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
+                                 mfn_t mfn, unsigned int page_order,
+                                 p2m_type_t p2mt, p2m_access_t p2ma);
+
 /*
  * p2m type to IOMMU flags
  */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
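
The invalidation scheme above hinges on the remapped-gfn window:
p2m_change_altp2m_gfn() widens each view's [min,max]_remapped_gfn range
as gfns are remapped, and p2m_altp2m_propagate_change() resets any view
whose range may cover a gfn the host p2m has just dropped. The core
predicate, reduced to a minimal sketch with the locking and the
multi-view reset path elided (illustrative only, not code from the
series):

    /* Sketch: must this altp2m view be flushed when the host p2m drops
     * the mapping for @gfn?  A dropped mapping (INVALID_MFN) inside the
     * remapped window may still be reachable via a remapped gfn. */
    static bool_t altp2m_may_hold_stale_mapping(const struct p2m_domain *ap2m,
                                                unsigned long gfn, mfn_t mfn)
    {
        return mfn_x(mfn) == INVALID_MFN &&
               gfn >= ap2m->min_remapped_gfn &&
               gfn <= ap2m->max_remapped_gfn;
    }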

* [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (9 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 10/15] x86/altp2m: add remaining support routines Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-23 10:22   ` Jan Beulich
  2015-07-22 23:01 ` [PATCH v7 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Signed-off-by: Ed White <edmund.h.white@intel.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v6:
        fix cmd range check
        rework domain locking
        add Jan's ack

 xen/arch/x86/hvm/hvm.c          | 138 ++++++++++++++++++++++++++++++++++++++++
 xen/include/public/hvm/hvm_op.h |  89 ++++++++++++++++++++++++++
 2 files changed, 227 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ba771c3..4f4cccb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6138,6 +6138,140 @@ static int hvmop_get_param(
     return rc;
 }
 
+static int do_altp2m_op(
+    XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+    struct xen_hvm_altp2m_op a;
+    struct domain *d = NULL;
+    int rc = 0;
+
+    if ( !hvm_altp2m_supported() )
+        return -EOPNOTSUPP;
+
+    if ( copy_from_guest(&a, arg, 1) )
+        return -EFAULT;
+
+    if ( a.pad1 || a.pad2 ||
+         (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
+         (a.cmd < HVMOP_altp2m_get_domain_state) ||
+         (a.cmd > HVMOP_altp2m_change_gfn) )
+        return -EINVAL;
+
+    d = (a.cmd != HVMOP_altp2m_vcpu_enable_notify) ?
+        rcu_lock_domain_by_any_id(a.domain) : rcu_lock_current_domain();
+
+    if ( d == NULL )
+        return -ESRCH;
+
+    if ( !is_hvm_domain(d) )
+    {
+        rc = -EOPNOTSUPP;
+        goto out;
+    }
+
+    if ( (a.cmd != HVMOP_altp2m_get_domain_state) &&
+         (a.cmd != HVMOP_altp2m_set_domain_state) &&
+         !d->arch.altp2m_active )
+    {
+        rc = -EOPNOTSUPP;
+        goto out;
+    }
+
+    switch ( a.cmd )
+    {
+    case HVMOP_altp2m_get_domain_state:
+        a.u.domain_state.state = altp2m_active(d);
+        rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+        break;
+
+    case HVMOP_altp2m_set_domain_state:
+    {
+        struct vcpu *v;
+        bool_t ostate;
+
+        if ( nestedhvm_enabled(d) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        ostate = d->arch.altp2m_active;
+        d->arch.altp2m_active = !!a.u.domain_state.state;
+
+        /* If the alternate p2m state has changed, handle appropriately */
+        if ( d->arch.altp2m_active != ostate &&
+             (ostate || !(rc = p2m_init_altp2m_by_id(d, 0))) )
+        {
+            for_each_vcpu( d, v )
+            {
+                if ( !ostate )
+                    altp2m_vcpu_initialise(v);
+                else
+                    altp2m_vcpu_destroy(v);
+            }
+
+            if ( ostate )
+                p2m_flush_altp2m(d);
+        }
+        break;
+    }
+
+    case HVMOP_altp2m_vcpu_enable_notify:
+    {
+        struct vcpu *curr = current;
+        p2m_type_t p2mt;
+
+        if ( a.u.enable_notify.pad || a.domain != DOMID_SELF ||
+             a.u.enable_notify.vcpu_id != curr->vcpu_id ||
+             (gfn_x(vcpu_altp2m(curr).veinfo_gfn) != INVALID_GFN) ||
+             (mfn_x(get_gfn_query_unlocked(curr->domain,
+                    a.u.enable_notify.gfn, &p2mt)) == INVALID_MFN) )
+        {
+            /* Don't return directly: the domain is still RCU-locked. */
+            rc = -EINVAL;
+            break;
+        }
+
+        vcpu_altp2m(curr).veinfo_gfn = _gfn(a.u.enable_notify.gfn);
+        altp2m_vcpu_update_vmfunc_ve(curr);
+        break;
+    }
+
+    case HVMOP_altp2m_create_p2m:
+        if ( !(rc = p2m_init_next_altp2m(d, &a.u.view.view)) )
+            rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+        break;
+
+    case HVMOP_altp2m_destroy_p2m:
+        rc = p2m_destroy_altp2m_by_id(d, a.u.view.view);
+        break;
+
+    case HVMOP_altp2m_switch_p2m:
+        rc = p2m_switch_domain_altp2m_by_id(d, a.u.view.view);
+        break;
+
+    case HVMOP_altp2m_set_mem_access:
+        if ( a.u.set_mem_access.pad )
+            rc = -EINVAL;
+        else
+            rc = p2m_set_altp2m_mem_access(d, a.u.set_mem_access.view,
+                    _gfn(a.u.set_mem_access.gfn),
+                    a.u.set_mem_access.hvmmem_access);
+        break;
+
+    case HVMOP_altp2m_change_gfn:
+        if ( a.u.change_gfn.pad1 || a.u.change_gfn.pad2 )
+            rc = -EINVAL;
+        else
+            rc = p2m_change_altp2m_gfn(d, a.u.change_gfn.view,
+                    _gfn(a.u.change_gfn.old_gfn),
+                    _gfn(a.u.change_gfn.new_gfn));
+    }
+
+ out:
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
 /*
  * Note that this value is effectively part of the ABI, even if we don't need
  * to make it a formal part of it: A guest suspended for migration in the
@@ -6567,6 +6701,10 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             rc = -EINVAL;
         break;
 
+    case HVMOP_altp2m:
+        rc = do_altp2m_op(arg);
+        break;
+
     default:
     {
         gdprintk(XENLOG_DEBUG, "Bad HVM op %ld.\n", op);
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index d053db9..014546a 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -398,6 +398,95 @@ DEFINE_XEN_GUEST_HANDLE(xen_hvm_evtchn_upcall_vector_t);
 
 #define HVMOP_guest_request_vm_event 24
 
+/* HVMOP_altp2m: perform altp2m state operations */
+#define HVMOP_altp2m 25
+
+#define HVMOP_ALTP2M_INTERFACE_VERSION 0x00000001
+
+struct xen_hvm_altp2m_domain_state {
+    /* IN or OUT variable on/off */
+    uint8_t state;
+};
+typedef struct xen_hvm_altp2m_domain_state xen_hvm_altp2m_domain_state_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_domain_state_t);
+
+struct xen_hvm_altp2m_vcpu_enable_notify {
+    uint32_t vcpu_id;
+    uint32_t pad;
+    /* #VE info area gfn */
+    uint64_t gfn;
+};
+typedef struct xen_hvm_altp2m_vcpu_enable_notify xen_hvm_altp2m_vcpu_enable_notify_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_vcpu_enable_notify_t);
+
+struct xen_hvm_altp2m_view {
+    /* IN/OUT variable */
+    uint16_t view;
+    /* Create view only: default access type
+     * NOTE: currently ignored */
+    uint16_t hvmmem_default_access; /* xenmem_access_t */
+};
+typedef struct xen_hvm_altp2m_view xen_hvm_altp2m_view_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_view_t);
+
+struct xen_hvm_altp2m_set_mem_access {
+    /* view */
+    uint16_t view;
+    /* Memory type */
+    uint16_t hvmmem_access; /* xenmem_access_t */
+    uint32_t pad;
+    /* gfn */
+    uint64_t gfn;
+};
+typedef struct xen_hvm_altp2m_set_mem_access xen_hvm_altp2m_set_mem_access_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_set_mem_access_t);
+
+struct xen_hvm_altp2m_change_gfn {
+    /* view */
+    uint16_t view;
+    uint16_t pad1;
+    uint32_t pad2;
+    /* old gfn */
+    uint64_t old_gfn;
+    /* new gfn, INVALID_GFN (~0UL) means revert */
+    uint64_t new_gfn;
+};
+typedef struct xen_hvm_altp2m_change_gfn xen_hvm_altp2m_change_gfn_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_change_gfn_t);
+
+struct xen_hvm_altp2m_op {
+    uint32_t version;   /* HVMOP_ALTP2M_INTERFACE_VERSION */
+    uint32_t cmd;
+/* Get/set the altp2m state for a domain */
+#define HVMOP_altp2m_get_domain_state     1
+#define HVMOP_altp2m_set_domain_state     2
+/* Set the current VCPU to receive altp2m event notifications */
+#define HVMOP_altp2m_vcpu_enable_notify   3
+/* Create a new view */
+#define HVMOP_altp2m_create_p2m           4
+/* Destroy a view */
+#define HVMOP_altp2m_destroy_p2m          5
+/* Switch view for an entire domain */
+#define HVMOP_altp2m_switch_p2m           6
+/* Notify that a page of memory is to have specific access types */
+#define HVMOP_altp2m_set_mem_access       7
+/* Change a p2m entry to have a different gfn->mfn mapping */
+#define HVMOP_altp2m_change_gfn           8
+    domid_t domain;
+    uint16_t pad1;
+    uint32_t pad2;
+    union {
+        struct xen_hvm_altp2m_domain_state       domain_state;
+        struct xen_hvm_altp2m_vcpu_enable_notify enable_notify;
+        struct xen_hvm_altp2m_view               view;
+        struct xen_hvm_altp2m_set_mem_access     set_mem_access;
+        struct xen_hvm_altp2m_change_gfn         change_gfn;
+        uint8_t pad[64];
+    } u;
+};
+typedef struct xen_hvm_altp2m_op xen_hvm_altp2m_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_op_t);
+
 #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
 
 /*
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
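
Of the commands above, HVMOP_altp2m_vcpu_enable_notify is the only one
meant to be issued from inside the guest, which is why do_altp2m_op()
uses rcu_lock_current_domain() for it and insists on DOMID_SELF. A
sketch of how an in-guest agent might register its #VE info page —
hypercall(), this_vcpu_id and veinfo_gfn are stand-ins for whatever the
guest kernel provides, not names from the series:

    /* Sketch: register a #VE information page for the calling vCPU.
     * Unset fields (including the pad words checked by Xen) stay zero. */
    struct xen_hvm_altp2m_op op = {
        .version = HVMOP_ALTP2M_INTERFACE_VERSION,
        .cmd     = HVMOP_altp2m_vcpu_enable_notify,
        .domain  = DOMID_SELF,
    };

    op.u.enable_notify.vcpu_id = this_vcpu_id;  /* must match 'current' */
    op.u.enable_notify.gfn     = veinfo_gfn;    /* guest-owned, mapped page */

    rc = hypercall(__HYPERVISOR_hvm_op, HVMOP_altp2m, &op);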

* [PATCH v7 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (10 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

The altp2mhvm and nestedhvm parameters are mutually exclusive;
enabling both for the same domain is rejected.

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
Changes since v6:
        no changes

 docs/man/xl.cfg.pod.5           | 12 ++++++++++++
 tools/libxl/libxl.h             |  6 ++++++
 tools/libxl/libxl_create.c      |  1 +
 tools/libxl/libxl_dom.c         |  2 ++
 tools/libxl/libxl_types.idl     |  1 +
 tools/libxl/xl_cmdimpl.c        | 10 ++++++++++
 xen/arch/x86/hvm/hvm.c          | 21 ++++++++++++++++++++-
 xen/include/public/hvm/params.h |  5 ++++-
 8 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 382f30b..e53fd45 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1027,6 +1027,18 @@ enabled by default and you should usually omit it. It may be necessary
 to disable the HPET in order to improve compatibility with guest
 Operating Systems (X86 only)
 
+=item B<altp2mhvm=BOOLEAN>
+
+Enables or disables HVM guest access to the alternate-p2m capability.
+Alternate-p2m allows a guest to manage multiple p2m guest physical
+"memory views" (as opposed to a single p2m). The option is
+disabled by default and is available only to HVM domains.
+Enable it to control or isolate access to specific guest physical
+memory pages, e.g. for HVM domain memory introspection or for
+isolating and access-controlling memory between components within
+a single guest HVM domain.
+
 =item B<nestedhvm=BOOLEAN>
 
 Enables or disables guest access to hardware virtualisation features,
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 5a7308d..6f86b21 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -758,6 +758,12 @@ typedef struct libxl__ctx libxl_ctx;
 #define LIBXL_HAVE_BUILDINFO_SERIAL_LIST 1
 
 /*
+ * LIBXL_HAVE_ALTP2M
+ * If this is defined, then libxl supports alternate p2m functionality.
+ */
+#define LIBXL_HAVE_ALTP2M 1
+
+/*
  * LIBXL_HAVE_REMUS
  * If this is defined, then libxl supports remus.
  */
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index a32e3df..b1614b2 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -277,6 +277,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
         libxl_defbool_setdefault(&b_info->u.hvm.hpet,               true);
         libxl_defbool_setdefault(&b_info->u.hvm.vpt_align,          true);
         libxl_defbool_setdefault(&b_info->u.hvm.nested_hvm,         false);
+        libxl_defbool_setdefault(&b_info->u.hvm.altp2m,             false);
         libxl_defbool_setdefault(&b_info->u.hvm.usb,                false);
         libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci,   true);
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index edd7f3f..813c4a7 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -301,6 +301,8 @@ static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
                     libxl_defbool_val(info->u.hvm.vpt_align));
     xc_hvm_param_set(handle, domid, HVM_PARAM_NESTEDHVM,
                     libxl_defbool_val(info->u.hvm.nested_hvm));
+    xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
+                    libxl_defbool_val(info->u.hvm.altp2m));
 }
 
 int libxl__build_pre(libxl__gc *gc, uint32_t domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index bc0c4ef..b9dab54 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -458,6 +458,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                        ("mmio_hole_memkb",  MemKB),
                                        ("timer_mode",       libxl_timer_mode),
                                        ("nested_hvm",       libxl_defbool),
+                                       ("altp2m",           libxl_defbool),
                                        ("smbios_firmware",  string),
                                        ("acpi_firmware",    string),
                                        ("hdtype",           libxl_hdtype),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 1d45dd5..24b9808 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1564,6 +1564,16 @@ static void parse_config_data(const char *config_source,
 
         xlu_cfg_get_defbool(config, "nestedhvm", &b_info->u.hvm.nested_hvm, 0);
 
+        xlu_cfg_get_defbool(config, "altp2mhvm", &b_info->u.hvm.altp2m, 0);
+
+        if (!libxl_defbool_is_default(b_info->u.hvm.nested_hvm) &&
+            libxl_defbool_val(b_info->u.hvm.nested_hvm) &&
+            !libxl_defbool_is_default(b_info->u.hvm.altp2m) &&
+            libxl_defbool_val(b_info->u.hvm.altp2m)) {
+            fprintf(stderr, "ERROR: nestedhvm and altp2mhvm cannot be used together\n");
+            exit(1);
+        }
+
         xlu_cfg_replace_string(config, "smbios_firmware",
                                &b_info->u.hvm.smbios_firmware, 0);
         xlu_cfg_replace_string(config, "acpi_firmware",
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4f4cccb..55e70f0 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5868,6 +5868,7 @@ static int hvm_allow_set_param(struct domain *d,
     case HVM_PARAM_VIRIDIAN:
     case HVM_PARAM_IOREQ_SERVER_PFN:
     case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
+    case HVM_PARAM_ALTP2M:
         if ( value != 0 && a->value != value )
             rc = -EEXIST;
         break;
@@ -5990,6 +5991,9 @@ static int hvmop_set_param(
          */
         if ( cpu_has_svm && !paging_mode_hap(d) && a.value )
             rc = -EINVAL;
+        if ( a.value &&
+             d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] )
+            rc = -EINVAL;
         /* Set up NHVM state for any vcpus that are already up. */
         if ( a.value &&
              !d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM] )
@@ -6000,6 +6004,13 @@ static int hvmop_set_param(
             for_each_vcpu(d, v)
                 nestedhvm_vcpu_destroy(v);
         break;
+    case HVM_PARAM_ALTP2M:
+        if ( a.value > 1 )
+            rc = -EINVAL;
+        if ( a.value &&
+             d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM] )
+            rc = -EINVAL;
+        break;
     case HVM_PARAM_BUFIOREQ_EVTCHN:
         rc = -EINVAL;
         break;
@@ -6060,6 +6071,7 @@ static int hvm_allow_get_param(struct domain *d,
     case HVM_PARAM_STORE_EVTCHN:
     case HVM_PARAM_CONSOLE_PFN:
     case HVM_PARAM_CONSOLE_EVTCHN:
+    case HVM_PARAM_ALTP2M:
         break;
     /*
      * The following parameters must not be read by the guest
@@ -6180,6 +6192,12 @@ static int do_altp2m_op(
     switch ( a.cmd )
     {
     case HVMOP_altp2m_get_domain_state:
+        if ( !d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
         a.u.domain_state.state = altp2m_active(d);
         rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
         break;
@@ -6189,7 +6207,8 @@ static int do_altp2m_op(
         struct vcpu *v;
         bool_t ostate;
 
-        if ( nestedhvm_enabled(d) )
+        if ( !d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] ||
+             nestedhvm_enabled(d) )
         {
             rc = -EINVAL;
             break;
diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
index 7c73089..147d9b8 100644
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -187,6 +187,9 @@
 /* Location of the VM Generation ID in guest physical address space. */
 #define HVM_PARAM_VM_GENERATION_ID_ADDR 34
 
-#define HVM_NR_PARAMS          35
+/* Boolean: Enable altp2m */
+#define HVM_PARAM_ALTP2M       35
+
+#define HVM_NR_PARAMS          36
 
 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
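
In an xl domain configuration the new parameter is a one-line change;
per the checks added above, combining it with nestedhvm is rejected
both at xl parse time and when the HVM param is set. A minimal config
fragment (sketch):

    # HVM guest with altp2m exposed; nestedhvm must stay off.
    builder   = "hvm"
    altp2mhvm = 1
    nestedhvm = 0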

* [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (11 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-23 16:08   ` Jan Beulich
  2015-07-22 23:01 ` [PATCH v7 14/15] tools/libxc: add support to altp2m hvmops Ed White
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

From: Ravi Sahita <ravi.sahita@intel.com>

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
Changes since v6:
        no changes

 tools/flask/policy/policy/modules/xen/xen.if |  4 ++--
 xen/arch/x86/hvm/hvm.c                       |  6 ++++++
 xen/include/xsm/dummy.h                      | 12 ++++++++++++
 xen/include/xsm/xsm.h                        | 12 ++++++++++++
 xen/xsm/dummy.c                              |  2 ++
 xen/xsm/flask/hooks.c                        | 12 ++++++++++++
 xen/xsm/flask/policy/access_vectors          |  7 +++++++
 7 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index da4c95b..a2f25e1 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -8,7 +8,7 @@
 define(`declare_domain_common', `
 	allow $1 $2:grant { query setup };
 	allow $1 $2:mmu { adjust physmap map_read map_write stat pinpage updatemp mmuext_op };
-	allow $1 $2:hvm { getparam setparam };
+	allow $1 $2:hvm { getparam setparam altp2mhvm_op };
 	allow $1 $2:domain2 get_vnumainfo;
 ')
 
@@ -58,7 +58,7 @@ define(`create_domain_common', `
 	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
 	allow $1 $2:grant setup;
 	allow $1 $2:hvm { cacheattr getparam hvmctl irqlevel pciroute sethvmc
-			setparam pcilevel trackdirtyvram nested };
+			setparam pcilevel trackdirtyvram nested altp2mhvm altp2mhvm_op };
 ')
 
 # create_domain(priv, target)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 55e70f0..cd5d8d5 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6005,6 +6005,9 @@ static int hvmop_set_param(
                 nestedhvm_vcpu_destroy(v);
         break;
     case HVM_PARAM_ALTP2M:
+        rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
+        if ( rc )
+            break;
         if ( a.value > 1 )
             rc = -EINVAL;
         if ( a.value &&
@@ -6189,6 +6192,9 @@ static int do_altp2m_op(
         goto out;
     }
 
+    if ( (rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d ? d : current->domain)) )
+        goto out;
+
     switch ( a.cmd )
     {
     case HVMOP_altp2m_get_domain_state:
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index adb02bc..bbbfce7 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -548,6 +548,18 @@ static XSM_INLINE int xsm_hvm_param_nested(XSM_DEFAULT_ARG struct domain *d)
     return xsm_default_action(action, current->domain, d);
 }
 
+static XSM_INLINE int xsm_hvm_param_altp2mhvm(XSM_DEFAULT_ARG struct domain *d)
+{
+    XSM_ASSERT_ACTION(XSM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
+
+static XSM_INLINE int xsm_hvm_altp2mhvm_op(XSM_DEFAULT_ARG struct domain *d)
+{
+    XSM_ASSERT_ACTION(XSM_TARGET);
+    return xsm_default_action(action, current->domain, d);
+}
+
 static XSM_INLINE int xsm_vm_event_control(XSM_DEFAULT_ARG struct domain *d, int mode, int op)
 {
     XSM_ASSERT_ACTION(XSM_PRIV);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 7886574..3678a93 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -147,6 +147,8 @@ struct xsm_operations {
     int (*hvm_param) (struct domain *d, unsigned long op);
     int (*hvm_control) (struct domain *d, unsigned long op);
     int (*hvm_param_nested) (struct domain *d);
+    int (*hvm_param_altp2mhvm) (struct domain *d);
+    int (*hvm_altp2mhvm_op) (struct domain *d);
     int (*get_vnumainfo) (struct domain *d);
 
     int (*vm_event_control) (struct domain *d, int mode, int op);
@@ -587,6 +589,16 @@ static inline int xsm_hvm_param_nested (xsm_default_t def, struct domain *d)
     return xsm_ops->hvm_param_nested(d);
 }
 
+static inline int xsm_hvm_param_altp2mhvm (xsm_default_t def, struct domain *d)
+{
+    return xsm_ops->hvm_param_altp2mhvm(d);
+}
+
+static inline int xsm_hvm_altp2mhvm_op (xsm_default_t def, struct domain *d)
+{
+    return xsm_ops->hvm_altp2mhvm_op(d);
+}
+
 static inline int xsm_get_vnumainfo (xsm_default_t def, struct domain *d)
 {
     return xsm_ops->get_vnumainfo(d);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 06ac911..21b1bf8 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -116,6 +116,8 @@ void xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, hvm_param);
     set_to_dummy_if_null(ops, hvm_control);
     set_to_dummy_if_null(ops, hvm_param_nested);
+    set_to_dummy_if_null(ops, hvm_param_altp2mhvm);
+    set_to_dummy_if_null(ops, hvm_altp2mhvm_op);
 
     set_to_dummy_if_null(ops, do_xsm_op);
 #ifdef CONFIG_COMPAT
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 882681f..7a4522e 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1176,6 +1176,16 @@ static int flask_hvm_param_nested(struct domain *d)
     return current_has_perm(d, SECCLASS_HVM, HVM__NESTED);
 }
 
+static int flask_hvm_param_altp2mhvm(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_HVM, HVM__ALTP2MHVM);
+}
+
+static int flask_hvm_altp2mhvm_op(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_HVM, HVM__ALTP2MHVM_OP);
+}
+
 static int flask_vm_event_control(struct domain *d, int mode, int op)
 {
     return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__VM_EVENT);
@@ -1687,6 +1697,8 @@ static struct xsm_operations flask_ops = {
     .hvm_param = flask_hvm_param,
     .hvm_control = flask_hvm_param,
     .hvm_param_nested = flask_hvm_param_nested,
+    .hvm_param_altp2mhvm = flask_hvm_param_altp2mhvm,
+    .hvm_altp2mhvm_op = flask_hvm_altp2mhvm_op,
 
     .do_xsm_op = do_flask_op,
     .get_vnumainfo = flask_get_vnumainfo,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index b2a20c3..71495fd 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -282,6 +282,13 @@ class hvm
     share_mem
 # HVMOP_set_param setting HVM_PARAM_NESTEDHVM
     nested
+# HVMOP_set_param setting HVM_PARAM_ALTP2M
+    altp2mhvm
+# HVMOP_altp2m_set_domain_state HVMOP_altp2m_get_domain_state
+# HVMOP_altp2m_vcpu_enable_notify HVMOP_altp2m_create_p2m
+# HVMOP_altp2m_destroy_p2m HVMOP_altp2m_switch_p2m
+# HVMOP_altp2m_set_mem_access HVMOP_altp2m_change_gfn
+    altp2mhvm_op
 }
 
 # Class event describes event channels.  Interdomain event channels have their
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
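
With FLASK enforcing, a domain that is to drive another domain's altp2m
views needs both new vectors; the xen.if changes above grant them along
the usual declare/create paths, but a standalone policy rule would look
like this (dom_t and target_t are placeholder types, not from the
series):

    # Sketch: let dom_t set HVM_PARAM_ALTP2M on and issue altp2m ops
    # against target_t domains.
    allow dom_t target_t:hvm { altp2mhvm altp2mhvm_op };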

* [PATCH v7 14/15] tools/libxc: add support to altp2m hvmops
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (12 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 15/15] tools/xen-access: altp2m testcases Ed White
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

From: Tamas K Lengyel <tlengyel@novetta.com>

Wrappers to issue altp2m hvmops.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes since v6:
        no changes

 tools/libxc/Makefile          |   1 +
 tools/libxc/include/xenctrl.h |  22 ++++
 tools/libxc/xc_altp2m.c       | 248 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 271 insertions(+)
 create mode 100644 tools/libxc/xc_altp2m.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 1aec848..b0a3e05 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -10,6 +10,7 @@ override CONFIG_MIGRATE := n
 endif
 
 CTRL_SRCS-y       :=
+CTRL_SRCS-y       += xc_altp2m.c
 CTRL_SRCS-y       += xc_core.c
 CTRL_SRCS-$(CONFIG_X86) += xc_core_x86.c
 CTRL_SRCS-$(CONFIG_ARM) += xc_core_arm.c
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index ce9029c..f869390 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2304,6 +2304,28 @@ void xc_tmem_save_done(xc_interface *xch, int dom);
 int xc_tmem_restore(xc_interface *xch, int dom, int fd);
 int xc_tmem_restore_extra(xc_interface *xch, int dom, int fd);
 
+/**
+ * altp2m operations
+ */
+
+int xc_altp2m_get_domain_state(xc_interface *handle, domid_t dom, bool *state);
+int xc_altp2m_set_domain_state(xc_interface *handle, domid_t dom, bool state);
+int xc_altp2m_set_vcpu_enable_notify(xc_interface *handle, domid_t domid,
+                                     uint32_t vcpuid, xen_pfn_t gfn);
+int xc_altp2m_create_view(xc_interface *handle, domid_t domid,
+                          xenmem_access_t default_access, uint16_t *view_id);
+int xc_altp2m_destroy_view(xc_interface *handle, domid_t domid,
+                           uint16_t view_id);
+/* Switch all vCPUs of the domain to the specified altp2m view */
+int xc_altp2m_switch_to_view(xc_interface *handle, domid_t domid,
+                             uint16_t view_id);
+int xc_altp2m_set_mem_access(xc_interface *handle, domid_t domid,
+                             uint16_t view_id, xen_pfn_t gfn,
+                             xenmem_access_t access);
+int xc_altp2m_change_gfn(xc_interface *handle, domid_t domid,
+                         uint16_t view_id, xen_pfn_t old_gfn,
+                         xen_pfn_t new_gfn);
+
 /** 
  * Mem paging operations.
  * Paging is supported only on the x86 architecture in 64 bit mode, with
diff --git a/tools/libxc/xc_altp2m.c b/tools/libxc/xc_altp2m.c
new file mode 100644
index 0000000..0f3c5ed
--- /dev/null
+++ b/tools/libxc/xc_altp2m.c
@@ -0,0 +1,248 @@
+/******************************************************************************
+ *
+ * xc_altp2m.c
+ *
+ * Interface to altp2m related HVMOPs
+ *
+ * Copyright (c) 2015 Tamas K Lengyel (tamas@tklengyel.com)
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+#include "xc_private.h"
+#include <stdbool.h>
+#include <xen/hvm/hvm_op.h>
+
+int xc_altp2m_get_domain_state(xc_interface *handle, domid_t dom, bool *state)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_get_domain_state;
+    arg->domain = dom;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    if ( !rc )
+        *state = arg->u.domain_state.state;
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_set_domain_state(xc_interface *handle, domid_t dom, bool state)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_set_domain_state;
+    arg->domain = dom;
+    arg->u.domain_state.state = state;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+/* Note: oddly, this operation acts on the current vCPU; the hypervisor
+ * requires domid == DOMID_SELF and vcpuid to match the calling vCPU. */
+int xc_altp2m_set_vcpu_enable_notify(xc_interface *handle, domid_t domid,
+                                     uint32_t vcpuid, xen_pfn_t gfn)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_vcpu_enable_notify;
+    arg->domain = domid;
+    arg->u.enable_notify.vcpu_id = vcpuid;
+    arg->u.enable_notify.gfn = gfn;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_create_view(xc_interface *handle, domid_t domid,
+                          xenmem_access_t default_access, uint16_t *view_id)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_create_p2m;
+    arg->domain = domid;
+    arg->u.view.view = -1;
+    arg->u.view.hvmmem_default_access = default_access;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    if ( !rc )
+        *view_id = arg->u.view.view;
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_destroy_view(xc_interface *handle, domid_t domid,
+                           uint16_t view_id)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_destroy_p2m;
+    arg->domain = domid;
+    arg->u.view.view = view_id;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+/* Switch all vCPUs of the domain to the specified altp2m view */
+int xc_altp2m_switch_to_view(xc_interface *handle, domid_t domid,
+                             uint16_t view_id)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_switch_p2m;
+    arg->domain = domid;
+    arg->u.view.view = view_id;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_set_mem_access(xc_interface *handle, domid_t domid,
+                             uint16_t view_id, xen_pfn_t gfn,
+                             xenmem_access_t access)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_set_mem_access;
+    arg->domain = domid;
+    arg->u.set_mem_access.view = view_id;
+    arg->u.set_mem_access.hvmmem_access = access;
+    arg->u.set_mem_access.gfn = gfn;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_change_gfn(xc_interface *handle, domid_t domid,
+                         uint16_t view_id, xen_pfn_t old_gfn,
+                         xen_pfn_t new_gfn)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_change_gfn;
+    arg->domain = domid;
+    arg->u.change_gfn.view = view_id;
+    arg->u.change_gfn.old_gfn = old_gfn;
+    arg->u.change_gfn.new_gfn = new_gfn;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
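
The wrappers compose into the obvious enable/create/switch sequence. A
minimal caller is sketched below; note that, per the interface header,
the view's default access is currently ignored, so permissions still
need a separate xc_altp2m_set_mem_access() pass, as the xen-access
patch that follows does:

    /* Sketch: put a domain into a freshly created altp2m view. */
    #include <xenctrl.h>

    static int enter_altp2m_view(xc_interface *xch, domid_t domid,
                                 uint16_t *view_id)
    {
        int rc = xc_altp2m_set_domain_state(xch, domid, 1);

        if ( rc < 0 )
            return rc;

        rc = xc_altp2m_create_view(xch, domid, XENMEM_access_rx, view_id);
        if ( rc < 0 )
            return rc;

        return xc_altp2m_switch_to_view(xch, domid, *view_id);
    }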

* [PATCH v7 15/15] tools/xen-access: altp2m testcases
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (13 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 14/15] tools/libxc: add support to altp2m hvmops Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-23 17:12 ` [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Wei Liu
  2015-07-24  9:56 ` Wei Liu
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

From: Tamas K Lengyel <tlengyel@novetta.com>

A working altp2m test case. The test tool is extended to support
singlestepping, to better highlight the core altp2m feature of view
switching.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
Changes since v6:
        no changes

 tools/tests/xen-access/xen-access.c | 173 ++++++++++++++++++++++++++++++------
 1 file changed, 148 insertions(+), 25 deletions(-)

diff --git a/tools/tests/xen-access/xen-access.c b/tools/tests/xen-access/xen-access.c
index e6ca9ba..cdb8f4e 100644
--- a/tools/tests/xen-access/xen-access.c
+++ b/tools/tests/xen-access/xen-access.c
@@ -275,6 +275,19 @@ xenaccess_t *xenaccess_init(xc_interface **xch_r, domid_t domain_id)
     return NULL;
 }
 
+static inline
+int control_singlestep(
+    xc_interface *xch,
+    domid_t domain_id,
+    unsigned long vcpu,
+    bool enable)
+{
+    uint32_t op = enable ?
+        XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_ON : XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_OFF;
+
+    return xc_domain_debug_control(xch, domain_id, op, vcpu);
+}
+
 /*
  * Note that this function is not thread safe.
  */
@@ -317,13 +330,15 @@ static void put_response(vm_event_t *vm_event, vm_event_response_t *rsp)
 
 void usage(char* progname)
 {
-    fprintf(stderr,
-            "Usage: %s [-m] <domain_id> write|exec|breakpoint\n"
+    fprintf(stderr, "Usage: %s [-m] <domain_id> write|exec", progname);
+#if defined(__i386__) || defined(__x86_64__)
+            fprintf(stderr, "|breakpoint|altp2m_write|altp2m_exec");
+#endif
+            fprintf(stderr,
             "\n"
             "Logs first page writes, execs, or breakpoint traps that occur on the domain.\n"
             "\n"
-            "-m requires this program to run, or else the domain may pause\n",
-            progname);
+            "-m requires this program to run, or else the domain may pause\n");
 }
 
 int main(int argc, char *argv[])
@@ -341,6 +356,8 @@ int main(int argc, char *argv[])
     int required = 0;
     int breakpoint = 0;
     int shutting_down = 0;
+    int altp2m = 0;
+    uint16_t altp2m_view_id = 0;
 
     char* progname = argv[0];
     argv++;
@@ -379,10 +396,22 @@ int main(int argc, char *argv[])
         default_access = XENMEM_access_rw;
         after_first_access = XENMEM_access_rwx;
     }
+#if defined(__i386__) || defined(__x86_64__)
     else if ( !strcmp(argv[0], "breakpoint") )
     {
         breakpoint = 1;
     }
+    else if ( !strcmp(argv[0], "altp2m_write") )
+    {
+        default_access = XENMEM_access_rx;
+        altp2m = 1;
+    }
+    else if ( !strcmp(argv[0], "altp2m_exec") )
+    {
+        default_access = XENMEM_access_rw;
+        altp2m = 1;
+    }
+#endif
     else
     {
         usage(argv[0]);
@@ -415,22 +444,73 @@ int main(int argc, char *argv[])
         goto exit;
     }
 
-    /* Set the default access type and convert all pages to it */
-    rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
-    if ( rc < 0 )
+    /* With altp2m we just create a new, restricted view of the memory */
+    if ( altp2m )
     {
-        ERROR("Error %d setting default mem access type\n", rc);
-        goto exit;
-    }
+        xen_pfn_t gfn = 0;
+        unsigned long perm_set = 0;
+
+        rc = xc_altp2m_set_domain_state( xch, domain_id, 1 );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d enabling altp2m on domain!\n", rc);
+            goto exit;
+        }
+
+        rc = xc_altp2m_create_view( xch, domain_id, default_access, &altp2m_view_id );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d creating altp2m view!\n", rc);
+            goto exit;
+        }
 
-    rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
-                           (xenaccess->max_gpfn - START_PFN) );
+        DPRINTF("altp2m view created with id %u\n", altp2m_view_id);
+        DPRINTF("Setting altp2m mem_access permissions.. ");
 
-    if ( rc < 0 )
+        for ( ; gfn < xenaccess->max_gpfn; ++gfn )
+        {
+            rc = xc_altp2m_set_mem_access( xch, domain_id, altp2m_view_id, gfn,
+                                           default_access);
+            if ( !rc )
+                perm_set++;
+        }
+
+        DPRINTF("done! Permissions set on %lu pages.\n", perm_set);
+
+        rc = xc_altp2m_switch_to_view( xch, domain_id, altp2m_view_id );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d switching to altp2m view!\n", rc);
+            goto exit;
+        }
+
+        rc = xc_monitor_singlestep( xch, domain_id, 1 );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d failed to enable singlestep monitoring!\n", rc);
+            goto exit;
+        }
+    }
+
+    if ( !altp2m )
     {
-        ERROR("Error %d setting all memory to access type %d\n", rc,
-              default_access);
-        goto exit;
+        /* Set the default access type and convert all pages to it */
+        rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
+        if ( rc < 0 )
+        {
+            ERROR("Error %d setting default mem access type\n", rc);
+            goto exit;
+        }
+
+        rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
+                               (xenaccess->max_gpfn - START_PFN) );
+
+        if ( rc < 0 )
+        {
+            ERROR("Error %d setting all memory to access type %d\n", rc,
+                  default_access);
+            goto exit;
+        }
     }
 
     if ( breakpoint )
@@ -448,13 +528,29 @@ int main(int argc, char *argv[])
     {
         if ( interrupted )
         {
+            /* Unregister for every event */
             DPRINTF("xenaccess shutting down on signal %d\n", interrupted);
 
-            /* Unregister for every event */
-            rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, ~0ull, 0);
-            rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, START_PFN,
-                                   (xenaccess->max_gpfn - START_PFN) );
-            rc = xc_monitor_software_breakpoint(xch, domain_id, 0);
+            if ( breakpoint )
+                rc = xc_monitor_software_breakpoint(xch, domain_id, 0);
+
+            if ( altp2m )
+            {
+                uint32_t vcpu_id;
+
+                rc = xc_altp2m_switch_to_view( xch, domain_id, 0 );
+                rc = xc_altp2m_destroy_view(xch, domain_id, altp2m_view_id);
+                rc = xc_altp2m_set_domain_state(xch, domain_id, 0);
+                rc = xc_monitor_singlestep(xch, domain_id, 0);
+
+                for ( vcpu_id = 0; vcpu_id < XEN_LEGACY_MAX_VCPUS; vcpu_id++ )
+                    rc = control_singlestep(xch, domain_id, vcpu_id, 0);
+
+            } else {
+                rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, ~0ull, 0);
+                rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, START_PFN,
+                                       (xenaccess->max_gpfn - START_PFN) );
+            }
 
             shutting_down = 1;
         }
@@ -500,7 +596,7 @@ int main(int argc, char *argv[])
                 }
 
                 printf("PAGE ACCESS: %c%c%c for GFN %"PRIx64" (offset %06"
-                       PRIx64") gla %016"PRIx64" (valid: %c; fault in gpt: %c; fault with gla: %c) (vcpu %u)\n",
+                       PRIx64") gla %016"PRIx64" (valid: %c; fault in gpt: %c; fault with gla: %c) (vcpu %u, altp2m view %u)\n",
                        (req.u.mem_access.flags & MEM_ACCESS_R) ? 'r' : '-',
                        (req.u.mem_access.flags & MEM_ACCESS_W) ? 'w' : '-',
                        (req.u.mem_access.flags & MEM_ACCESS_X) ? 'x' : '-',
@@ -510,9 +606,20 @@ int main(int argc, char *argv[])
                        (req.u.mem_access.flags & MEM_ACCESS_GLA_VALID) ? 'y' : 'n',
                        (req.u.mem_access.flags & MEM_ACCESS_FAULT_IN_GPT) ? 'y' : 'n',
                        (req.u.mem_access.flags & MEM_ACCESS_FAULT_WITH_GLA) ? 'y': 'n',
-                       req.vcpu_id);
+                       req.vcpu_id,
+                       req.altp2m_idx);
 
-                if ( default_access != after_first_access )
+                if ( altp2m && (req.flags & VM_EVENT_FLAG_ALTERNATE_P2M) )
+                {
+                    DPRINTF("\tSwitching back to default view!\n");
+
+                    rsp.reason = req.reason;
+                    rsp.flags = req.flags;
+                    rsp.altp2m_idx = 0;
+
+                    control_singlestep(xch, domain_id, rsp.vcpu_id, 1);
+                }
+                else if ( default_access != after_first_access )
                 {
                     rc = xc_set_mem_access(xch, domain_id, after_first_access,
                                            req.u.mem_access.gfn, 1);
@@ -525,7 +632,6 @@ int main(int argc, char *argv[])
                     }
                 }
 
-
                 rsp.u.mem_access.gfn = req.u.mem_access.gfn;
                 break;
             case VM_EVENT_REASON_SOFTWARE_BREAKPOINT:
@@ -546,6 +652,23 @@ int main(int argc, char *argv[])
                 }
 
                 break;
+            case VM_EVENT_REASON_SINGLESTEP:
+                printf("Singlestep: rip=%016"PRIx64", vcpu %d\n",
+                       req.data.regs.x86.rip,
+                       req.vcpu_id);
+
+                if ( altp2m )
+                {
+                    printf("\tSwitching altp2m to view %u!\n", altp2m_view_id);
+
+                    rsp.reason = req.reason;
+                    rsp.flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
+                    rsp.altp2m_idx = altp2m_view_id;
+                }
+
+                control_singlestep(xch, domain_id, req.vcpu_id, 0);
+
+                break;
             default:
                 fprintf(stderr, "UNKNOWN REASON CODE %d\n", req.reason);
             }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
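
For reference, the new modes are invoked like the existing ones, e.g.
(the domain id here is just an example):

    # Trap writes via a restricted altp2m view, single-stepping over
    # each faulting access and then switching back to the view:
    ./xen-access 5 altp2m_write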

* Re: [PATCH v7 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-22 23:01 ` [PATCH v7 05/15] x86/altp2m: basic data structures and support routines Ed White
@ 2015-07-23  9:22   ` Jan Beulich
  2015-07-23 14:36     ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23  9:22 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Jun Nakajima, tlengyel, Daniel De Graaf

>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
> @@ -6569,6 +6571,25 @@ void hvm_toggle_singlestep(struct vcpu *v)
>      v->arch.hvm_vcpu.single_step = !v->arch.hvm_vcpu.single_step;
>  }
>  
> +void altp2m_vcpu_update_p2m(struct vcpu *v)
> +{
> +    if ( hvm_funcs.altp2m_vcpu_update_p2m )
> +        hvm_funcs.altp2m_vcpu_update_p2m(v);
> +}
> +
> +void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v)
> +{
> +    if ( hvm_funcs.altp2m_vcpu_update_vmfunc_ve )
> +        hvm_funcs.altp2m_vcpu_update_vmfunc_ve(v);
> +}
> +
> +bool_t altp2m_vcpu_emulate_ve(struct vcpu *v)
> +{
> +    if ( hvm_funcs.altp2m_vcpu_emulate_ve )
> +        return hvm_funcs.altp2m_vcpu_emulate_ve(v);
> +    return 0;
> +}

These are _still_ not inline functions (in hvm.h), and 00/15 also
doesn't mention that this was intentionally left out.

> @@ -498,6 +498,28 @@ int hap_enable(struct domain *d, u32 mode)
>             goto out;
>      }
>  
> +    if ( hvm_altp2m_supported() )
> +    {
> +        /* Init alternate p2m data */
> +        if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
> +        {
> +            rv = -ENOMEM;
> +            goto out;
> +        }
> +
> +        for ( i = 0; i < MAX_EPTP; i++ )
> +            d->arch.altp2m_eptp[i] = INVALID_MFN;

And there is _still_ EPT-specific code left in a generic file.

> @@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct domain *d)
>      }
>  }
>  
> +static void p2m_teardown_altp2m(struct domain *d)
> +{
> +    unsigned int i;
> +    struct p2m_domain *p2m;
> +
> +    for ( i = 0; i < MAX_ALTP2M; i++ )
> +    {
> +        if ( !d->arch.altp2m_p2m[i] )
> +            continue;
> +        p2m = d->arch.altp2m_p2m[i];
> +        p2m_free_one(p2m);
> +        d->arch.altp2m_p2m[i] = NULL;

If you already think it's necessary to latch the array entry into a
local variable, why don't you zap the array entry _before_ freeing
the structure?

> @@ -1940,6 +1988,59 @@ int unmap_mmio_regions(struct domain *d,
>      return err;
>  }
>  
> +unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
> +{
> +    struct p2m_domain *p2m;
> +    struct ept_data *ept;
> +    unsigned int i;
> +
> +    altp2m_list_lock(d);
> +
> +    for ( i = 0; i < MAX_ALTP2M; i++ )
> +    {
> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
> +            continue;
> +
> +        p2m = d->arch.altp2m_p2m[i];
> +        ept = &p2m->ept;
> +
> +        if ( eptp == ept_get_eptp(ept) )
> +            goto out;
> +    }
> +
> +    i = INVALID_ALTP2M;
> +
> +out:

Just to repeat - labels should be indented by at least one space.

And you already know what the comment is regarding this being
EPT-specific code (and here one can't even debate whether it's
just unfortunate naming, since ept_get_eptp() is _very_ EPT-
specific, and that macro - if headers were properly structured -
shouldn't even be visible here).

> --- /dev/null
> +++ b/xen/include/asm-x86/altp2m.h
> @@ -0,0 +1,38 @@
> +/*
> + * Alternate p2m HVM
> + * Copyright (c) 2014, Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + */
> +
> +#ifndef _ALTP2M_H
> +#define _ALTP2M_H

I'm sure I said so before - this is not a specific enough guard symbol.
It should have at least an _X86 prefix.

> +#include <xen/types.h>
> +#include <xen/sched.h>         /* for struct vcpu, struct domain */
> +#include <asm/hvm/vcpu.h>      /* for vcpu_altp2m */

I don't see the type mentioned in the comment used anywhere
below - is this a stray include?

> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -233,6 +233,10 @@ struct paging_vcpu {
>  typedef xen_domctl_cpuid_t cpuid_input_t;
>  
>  #define MAX_NESTEDP2M 10
> +
> +#define MAX_ALTP2M      ((uint16_t)10)
> +#define INVALID_ALTP2M  ((uint16_t)~0)

Didn't you claim to have removed all stray casts?

Considering how late we are with this, this patch can have my ack
only provided you promise to address _all_ of the issues above in
follow-up patches.

Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread
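
For the first comment, the requested shape is presumably plain inline
wrappers in hvm.h; a sketch of the suggested follow-up, not code from
the series:

    static inline void altp2m_vcpu_update_p2m(struct vcpu *v)
    {
        if ( hvm_funcs.altp2m_vcpu_update_p2m )
            hvm_funcs.altp2m_vcpu_update_p2m(v);
    }

    static inline bool_t altp2m_vcpu_emulate_ve(struct vcpu *v)
    {
        return hvm_funcs.altp2m_vcpu_emulate_ve ?
               hvm_funcs.altp2m_vcpu_emulate_ve(v) : 0;
    }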

* Re: [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-22 23:01 ` [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
@ 2015-07-23  9:43   ` Jan Beulich
  2015-07-23 14:40     ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23  9:43 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Jun Nakajima, tlengyel, Daniel De Graaf

>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
> @@ -1770,6 +1771,105 @@ static bool_t vmx_is_singlestep_supported(void)
>      return cpu_has_monitor_trap_flag;
>  }
>  
> +static void vmx_vcpu_update_eptp(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    struct p2m_domain *p2m = NULL;
> +    struct ept_data *ept;
> +
> +    if ( altp2m_active(d) )
> +        p2m = p2m_get_altp2m(v);
> +    if ( !p2m )
> +        p2m = p2m_get_hostp2m(d);
> +
> +    ept = &p2m->ept;
> +    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
> +
> +    vmx_vmcs_enter(v);
> +
> +    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
> +
> +    if ( v->arch.hvm_vmx.secondary_exec_control &
> +        SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )

Indentation.
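
I.e., the continuation line should align with the first operand:

    if ( v->arch.hvm_vmx.secondary_exec_control &
         SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )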

> +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
> +
> +    if ( !cpu_has_vmx_vmfunc )
> +        return;
> +
> +    if ( cpu_has_vmx_virt_exceptions )
> +        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
> +
> +    vmx_vmcs_enter(v);
> +
> +    if ( !d->is_dying && altp2m_active(d) )
> +    {
> +        v->arch.hvm_vmx.secondary_exec_control |= mask;
> +        __vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
> +        __vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
> +
> +        if ( cpu_has_vmx_virt_exceptions )
> +        {
> +            p2m_type_t t;
> +            mfn_t mfn;
> +
> +            mfn = get_gfn_query_unlocked(d, gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
> +
> +            if ( mfn_x(mfn) != INVALID_MFN )
> +                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);

Considering that the VMCS field holds a byte-aligned address, why
do you have the (later introduced) hvmop specify a GFN instead of
a GPA?

Also you shouldn't be open coding pfn_to_paddr().
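
I.e., this would better read:

    __vmwrite(VIRT_EXCEPTION_INFO, pfn_to_paddr(mfn_x(mfn)));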

> +            else
> +                v->arch.hvm_vmx.secondary_exec_control &=
> +                    ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
> +        }
> +    }
> +    else
> +        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
> +
> +    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> +        v->arch.hvm_vmx.secondary_exec_control);

Indentation again.
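
I.e.:

    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
              v->arch.hvm_vmx.secondary_exec_control);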

Jan


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-22 23:01 ` [PATCH v7 10/15] x86/altp2m: add remaining support routines Ed White
@ 2015-07-23 10:05   ` Jan Beulich
  2015-07-23 14:51     ` Sahita, Ravi
  2015-07-23 19:10   ` George Dunlap
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 10:05 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Jun Nakajima, tlengyel, Daniel De Graaf

>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
> Add the remaining routines required to support enabling the alternate
> p2m functionality.
> 
> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> Changes since v6:
>         rename altp2m lazy copier, make bool_t, use __put_gfn throughout,
>           and move to p2m.c, eliminating altp2m_hap.c
>         change various p2m_altp2m... routines from long to int
>         change uint16_t's/uint8_t's to unsigned int's
>         optimize remapped gfn tracking and altp2m invalidation check
>         mechanical changes due to patch 5 changes

Taken together clearly requiring dropping any earlier review tags.

Non-mm parts
Acked-by: Jan Beulich <jbeulich@suse.com>

> +void p2m_flush_altp2m(struct domain *d)
> +{
> +    unsigned int i;
> +
> +    altp2m_list_lock(d);
> +
> +    for ( i = 0; i < MAX_ALTP2M; i++ )
> +    {
> +        p2m_flush_table(d->arch.altp2m_p2m[i]);
> +        /* Uninit and reinit ept to force TLB shootdown */
> +        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
> +        ept_p2m_init(d->arch.altp2m_p2m[i]);
> +        d->arch.altp2m_eptp[i] = INVALID_MFN;
> +    }

And more EPT-specific code in a generic file...

> +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx)
> +{
> +    struct p2m_domain *p2m;
> +    int rc = -EINVAL;
> +
> +    if ( !idx || idx > MAX_ALTP2M )

>= (and then also elsewhere further down)?
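
I.e., with idx indexing MAX_ALTP2M array entries starting from zero:

    if ( !idx || idx >= MAX_ALTP2M )
        return rc;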

> +        return rc;
> +
> +    domain_pause_except_self(d);
> +
> +    altp2m_list_lock(d);
> +
> +    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
> +    {
> +        p2m = d->arch.altp2m_p2m[idx];
> +
> +        if ( !_atomic_read(p2m->active_vcpus) )
> +        {
> +            p2m_flush_table(d->arch.altp2m_p2m[idx]);
> +            /* Uninit and reinit ept to force TLB shootdown */
> +            ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
> +            ept_p2m_init(d->arch.altp2m_p2m[idx]);
> +            d->arch.altp2m_eptp[idx] = INVALID_MFN;

Perhaps worth making all of the above if() body a helper function
(considering that the loop body in p2m_flush_altp2m() does
exactly the same)?
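
A sketch of what such a helper could look like, shared by both call sites
(the name is illustrative only):

    /* Name is illustrative only. */
    static void p2m_reset_altp2m(struct domain *d, unsigned int idx)
    {
        p2m_flush_table(d->arch.altp2m_p2m[idx]);
        /* Uninit and reinit ept to force TLB shootdown. */
        ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
        ept_p2m_init(d->arch.altp2m_p2m[idx]);
        d->arch.altp2m_eptp[idx] = INVALID_MFN;
    }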

> @@ -758,6 +765,38 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx);
>  /* Check to see if vcpu should be switched to a different p2m. */
>  void p2m_altp2m_check(struct vcpu *v, uint16_t idx);

The context here suggests that I overlooked a still not replaced
uint16_t.
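
I.e., presumably:

    void p2m_altp2m_check(struct vcpu *v, unsigned int idx);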

> +/* Flush all the alternate p2m's for a domain */
> +void p2m_flush_altp2m(struct domain *d);
> +
> +/* Alternate p2m paging */
> +bool_t p2m_altp2m_lazy_copy(struct vcpu *v, paddr_t gpa,
> +    unsigned long gla, struct npfec npfec, struct p2m_domain **ap2m);
> +
> +/* Make a specific alternate p2m valid */
> +int p2m_init_altp2m_by_id(struct domain *d, unsigned int idx);
> +
> +/* Find an available alternate p2m and make it valid */
> +int p2m_init_next_altp2m(struct domain *d, uint16_t *idx);

For this one, however, things really depend on the intended call
sites, i.e. I could see reasons for this to be kept as is.

Jan


* Re: [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-22 23:01 ` [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
@ 2015-07-23 10:22   ` Jan Beulich
  2015-07-23 14:56     ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 10:22 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Jun Nakajima, tlengyel, Daniel De Graaf

>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>

And I have to withdraw this ack pending clarification of (and perhaps
adjustment to) the #VE info address interface.

> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -6138,6 +6138,140 @@ static int hvmop_get_param(
>      return rc;
>  }
>  
> +static int do_altp2m_op(
> +    XEN_GUEST_HANDLE_PARAM(void) arg)
> +{
> +    struct xen_hvm_altp2m_op a;
> +    struct domain *d = NULL;
> +    int rc = 0;
> +
> +    if ( !hvm_altp2m_supported() )
> +        return -EOPNOTSUPP;
> +
> +    if ( copy_from_guest(&a, arg, 1) )
> +        return -EFAULT;
> +
> +    if ( a.pad1 || a.pad2 ||
> +         (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
> +         (a.cmd < HVMOP_altp2m_get_domain_state) ||
> +         (a.cmd > HVMOP_altp2m_change_gfn) )

I'm afraid such a change invalidates any earlier ack, even if it is
correct. Instead of this, why don't you start numbering of the
sub-ops at zero? Or if you really have a reason to start at 1,
why not simply check a.cmd against zero (without using any
particular sub-op value)? And then it escapes me why this can't
be handled in a default case in the switch statement below
anyway.
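
I.e., roughly the following shape (the error value for unknown sub-ops is
only a suggestion):

    switch ( a.cmd )
    {
    case HVMOP_altp2m_get_domain_state:
        ...
        break;

    /* ... further sub-ops ... */

    default:
        rc = -EOPNOTSUPP; /* or whichever error value is agreed upon */
        break;
    }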

Jan


* Re: [PATCH v7 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-23  9:22   ` Jan Beulich
@ 2015-07-23 14:36     ` Sahita, Ravi
  2015-07-23 14:53       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 14:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 2:22 AM
>
>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>> @@ -6569,6 +6571,25 @@ void hvm_toggle_singlestep(struct vcpu *v)
>>      v->arch.hvm_vcpu.single_step = !v->arch.hvm_vcpu.single_step;  }
>>
>> +void altp2m_vcpu_update_p2m(struct vcpu *v) {
>> +    if ( hvm_funcs.altp2m_vcpu_update_p2m )
>> +        hvm_funcs.altp2m_vcpu_update_p2m(v);
>> +}
>> +
>> +void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v) {
>> +    if ( hvm_funcs.altp2m_vcpu_update_vmfunc_ve )
>> +        hvm_funcs.altp2m_vcpu_update_vmfunc_ve(v);
>> +}
>> +
>> +bool_t altp2m_vcpu_emulate_ve(struct vcpu *v) {
>> +    if ( hvm_funcs.altp2m_vcpu_emulate_ve )
>> +        return hvm_funcs.altp2m_vcpu_emulate_ve(v);
>> +    return 0;
>> +}
>
>These are _still_ not inline functions (in hvm.h), and 00/15 also doesn't
>mention that this was intentionally left out.
>

Yup, possibly missed this one.
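
A sketch of the inline form being asked for, in hvm.h:

    static inline void altp2m_vcpu_update_p2m(struct vcpu *v)
    {
        if ( hvm_funcs.altp2m_vcpu_update_p2m )
            hvm_funcs.altp2m_vcpu_update_p2m(v);
    }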

>> @@ -498,6 +498,28 @@ int hap_enable(struct domain *d, u32 mode)
>>             goto out;
>>      }
>>
>> +    if ( hvm_altp2m_supported() )
>> +    {
>> +        /* Init alternate p2m data */
>> +        if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
>> +        {
>> +            rv = -ENOMEM;
>> +            goto out;
>> +        }
>> +
>> +        for ( i = 0; i < MAX_EPTP; i++ )
>> +            d->arch.altp2m_eptp[i] = INVALID_MFN;
>
>And there is _still_ EPT-specific code left in a generic file.
>

This was mentioned explicitly in the cover letter and in the patch header -
although our error was that the comment was only in patch 10, when it should have been in patch 5 also:
"    Patch 10:   ... not done - abstracting some ept functionality from p2m"

>> @@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct
>domain *d)
>>      }
>>  }
>>
>> +static void p2m_teardown_altp2m(struct domain *d) {
>> +    unsigned int i;
>> +    struct p2m_domain *p2m;
>> +
>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>> +    {
>> +        if ( !d->arch.altp2m_p2m[i] )
>> +            continue;
>> +        p2m = d->arch.altp2m_p2m[i];
>> +        p2m_free_one(p2m);
>> +        d->arch.altp2m_p2m[i] = NULL;
>
>If you already think it's necessary to latch the array entry into a local variable,
>why don't you zap the array entry _before_ freeing the structure?
>

Could be done.
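
I.e., roughly:

    p2m = d->arch.altp2m_p2m[i];
    d->arch.altp2m_p2m[i] = NULL;
    p2m_free_one(p2m);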

>> @@ -1940,6 +1988,59 @@ int unmap_mmio_regions(struct domain *d,
>>      return err;
>>  }
>>
>> +unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
>> +{
>> +    struct p2m_domain *p2m;
>> +    struct ept_data *ept;
>> +    unsigned int i;
>> +
>> +    altp2m_list_lock(d);
>> +
>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>> +    {
>> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>> +            continue;
>> +
>> +        p2m = d->arch.altp2m_p2m[i];
>> +        ept = &p2m->ept;
>> +
>> +        if ( eptp == ept_get_eptp(ept) )
>> +            goto out;
>> +    }
>> +
>> +    i = INVALID_ALTP2M;
>> +
>> +out:
>

Missed, sorry.


>Just to repeat - labels should be indented by at least one space.
>
>And you already know what the comment is regarding this being EPT-specific
>code (and here one can't even debate whether it's just unfortunate naming,
>since ept_get_eptp() is _very_ EPT- specific, and that macro - if headers were
>properly structured - shouldn't even be visible here).
>

It was mentioned, albeit in another patch (it should have been mentioned here also).

>> --- /dev/null
>> +++ b/xen/include/asm-x86/altp2m.h
>> @@ -0,0 +1,38 @@
>> +/*
>> + * Alternate p2m HVM
>> + * Copyright (c) 2014, Intel Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> +modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but
>> +WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of
>MERCHANTABILITY
>> +or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
>> +License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> +along with
>> + * this program; if not, write to the Free Software Foundation, Inc.,
>> +59 Temple
>> + * Place - Suite 330, Boston, MA 02111-1307 USA.
>> + */
>> +
>> +#ifndef _ALTP2M_H
>> +#define _ALTP2M_H
>
>I'm sure I said so before - this is not a specific enough guard symbol.
>It should have at least an _X86 prefix.
>

Right (I don't think I saw this before, but the comment makes sense).

>> +#include <xen/types.h>
>> +#include <xen/sched.h>         /* for struct vcpu, struct domain */
>> +#include <asm/hvm/vcpu.h>      /* for vcpu_altp2m */
>
>I don't see the type mentioned in the comment used anywhere below - is this
>a stray include?
>
>> --- a/xen/include/asm-x86/domain.h
>> +++ b/xen/include/asm-x86/domain.h
>> @@ -233,6 +233,10 @@ struct paging_vcpu {  typedef xen_domctl_cpuid_t
>> cpuid_input_t;
>>
>>  #define MAX_NESTEDP2M 10
>> +
>> +#define MAX_ALTP2M      ((uint16_t)10)
>> +#define INVALID_ALTP2M  ((uint16_t)~0)
>
>Didn't you claim to have removed all stray casts?
>
>Considering how late we are with this, this patch can have my ack only
>provided you promise to address _all_ of the issues above in follow-up
>patches.

Yes - no problem.

>
>Jan


* Re: [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-23  9:43   ` Jan Beulich
@ 2015-07-23 14:40     ` Sahita, Ravi
  2015-07-23 15:00       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 14:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 2:43 AM
>
>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>> @@ -1770,6 +1771,105 @@ static bool_t
>vmx_is_singlestep_supported(void)
>>      return cpu_has_monitor_trap_flag;  }
>>
>> +static void vmx_vcpu_update_eptp(struct vcpu *v) {
>> +    struct domain *d = v->domain;
>> +    struct p2m_domain *p2m = NULL;
>> +    struct ept_data *ept;
>> +
>> +    if ( altp2m_active(d) )
>> +        p2m = p2m_get_altp2m(v);
>> +    if ( !p2m )
>> +        p2m = p2m_get_hostp2m(d);
>> +
>> +    ept = &p2m->ept;
>> +    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
>> +
>> +    vmx_vmcs_enter(v);
>> +
>> +    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
>> +
>> +    if ( v->arch.hvm_vmx.secondary_exec_control &
>> +        SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
>
>Indentation.
>

Ok.

>> +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v) {
>> +    struct domain *d = v->domain;
>> +    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
>> +
>> +    if ( !cpu_has_vmx_vmfunc )
>> +        return;
>> +
>> +    if ( cpu_has_vmx_virt_exceptions )
>> +        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
>> +
>> +    vmx_vmcs_enter(v);
>> +
>> +    if ( !d->is_dying && altp2m_active(d) )
>> +    {
>> +        v->arch.hvm_vmx.secondary_exec_control |= mask;
>> +        __vmwrite(VM_FUNCTION_CONTROL,
>VMX_VMFUNC_EPTP_SWITCHING);
>> +        __vmwrite(EPTP_LIST_ADDR,
>> + virt_to_maddr(d->arch.altp2m_eptp));
>> +
>> +        if ( cpu_has_vmx_virt_exceptions )
>> +        {
>> +            p2m_type_t t;
>> +            mfn_t mfn;
>> +
>> +            mfn = get_gfn_query_unlocked(d,
>> + gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
>> +
>> +            if ( mfn_x(mfn) != INVALID_MFN )
>> +                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) <<
>> + PAGE_SHIFT);
>
>Considering that the VMCS field holds a byte-aligned address, why do you
>have the (later introduced) hvmop specify a GFN instead of a GPA?
>

The SDM states:
" If the "EPT-violation #VE" VM-execution control is 1, the virtualization-exception information address must
satisfy the following checks:
- Bits 11:0 of the address must be 0.
- The address must not set any bits beyond the processor's physical-address width."

>Also you shouldn't be open coding pfn_to_paddr().
>

ok

thanks,
Ravi

>> +            else
>> +                v->arch.hvm_vmx.secondary_exec_control &=
>> +                    ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
>> +        }
>> +    }
>> +    else
>> +        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
>> +
>> +    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
>> +        v->arch.hvm_vmx.secondary_exec_control);
>
>Indentation again.
>
>Jan


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-23 10:05   ` Jan Beulich
@ 2015-07-23 14:51     ` Sahita, Ravi
  2015-07-23 15:02       ` Jan Beulich
  2015-07-23 16:08       ` George Dunlap
  0 siblings, 2 replies; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 14:51 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 3:06 AM
>
>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>> Add the remaining routines required to support enabling the alternate
>> p2m functionality.
>>
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> Changes since v6:
>>         rename altp2m lazy copier, make bool_t, use __put_gfn throughout,
>>           and move to p2m.c, eliminating altp2m_hap.c
>>         change various p2m_altp2m... routines from long to int
>>         change uint16_t's/uint8_t's to unsigned int's
>>         optimize remapped gfn tracking and altp2m invalidation check
>>         mechanical changes due to patch 5 changes
>
>Taken together clearly requiring dropping any earlier review tags.
>
>Non-mm parts
>Acked-by: Jan Beulich <jbeulich@suse.com>
>

Ok - other maintainers, please review the mm parts - thanks.

>> +void p2m_flush_altp2m(struct domain *d) {
>> +    unsigned int i;
>> +
>> +    altp2m_list_lock(d);
>> +
>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>> +    {
>> +        p2m_flush_table(d->arch.altp2m_p2m[i]);
>> +        /* Uninit and reinit ept to force TLB shootdown */
>> +        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
>> +        ept_p2m_init(d->arch.altp2m_p2m[i]);
>> +        d->arch.altp2m_eptp[i] = INVALID_MFN;
>> +    }
>
>And more EPT-specific code in a generic file...
>

This was mentioned explicitly as pending in the cover letter and in the patch header.
"    Patch 10:   ... not done - abstracting some ept functionality from p2m"

>> +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx) {
>> +    struct p2m_domain *p2m;
>> +    int rc = -EINVAL;
>> +
>> +    if ( !idx || idx > MAX_ALTP2M )
>
>     >= (and then also elsewhere further down)?
>

Right.

>> +        return rc;
>> +
>> +    domain_pause_except_self(d);
>> +
>> +    altp2m_list_lock(d);
>> +
>> +    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
>> +    {
>> +        p2m = d->arch.altp2m_p2m[idx];
>> +
>> +        if ( !_atomic_read(p2m->active_vcpus) )
>> +        {
>> +            p2m_flush_table(d->arch.altp2m_p2m[idx]);
>> +            /* Uninit and reinit ept to force TLB shootdown */
>> +            ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
>> +            ept_p2m_init(d->arch.altp2m_p2m[idx]);
>> +            d->arch.altp2m_eptp[idx] = INVALID_MFN;
>
>Perhaps worth making all of the above if() body a helper function (considering
>that the loop body in p2m_flush_altp2m() does exactly the same)?
>

Ok, we can consider that.


>> @@ -758,6 +765,38 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct
>vcpu
>> *v, unsigned int idx);
>>  /* Check to see if vcpu should be switched to a different p2m. */
>> void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
>
>The context here suggests that I overlooked a still not replaced uint16_t.
>

Ok.

Just wanted to make sure these are also ok to do post 4.6.

Thanks,
Ravi

>> +/* Flush all the alternate p2m's for a domain */ void
>> +p2m_flush_altp2m(struct domain *d);
>> +
>> +/* Alternate p2m paging */
>> +bool_t p2m_altp2m_lazy_copy(struct vcpu *v, paddr_t gpa,
>> +    unsigned long gla, struct npfec npfec, struct p2m_domain **ap2m);
>> +
>> +/* Make a specific alternate p2m valid */ int
>> +p2m_init_altp2m_by_id(struct domain *d, unsigned int idx);
>> +
>> +/* Find an available alternate p2m and make it valid */ int
>> +p2m_init_next_altp2m(struct domain *d, uint16_t *idx);
>
>For this one, however, things really depend on the intended call sites, i.e. I
>could see reasons for this to be kept as is.
>
>Jan


* Re: [PATCH v7 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-23 14:36     ` Sahita, Ravi
@ 2015-07-23 14:53       ` Jan Beulich
  2015-07-23 15:00         ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 14:53 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, Jun Nakajima, tlengyel,
	Daniel De Graaf

>>> On 23.07.15 at 16:36, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 23, 2015 2:22 AM
>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>> +out:
>>
> 
> Missed sorry.
> 
> 
>>Just to repeat - labels should be indented by at least one space.

But you realize this isn't the only case, and it looked like you adjusted
only exactly the one place where the comment was made?

Jan


* Re: [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-23 10:22   ` Jan Beulich
@ 2015-07-23 14:56     ` Sahita, Ravi
  2015-07-23 15:08       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 14:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 3:22 AM
>
>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>
>> Acked-by: Jan Beulich <jbeulich@suse.com>
>
>And I have to withdraw this ack pending clarification of (and perhaps
>adjustment to) the #VE info address interface.
>

Could we have the ack back? :-) I clarified the #VE info address interface in the other email - repeating it here:

" If the "EPT-violation #VE" VM-execution control is 1, the virtualization-exception information address must
satisfy the following checks:
- Bits 11:0 of the address must be 0.
- The address must not set any bits beyond the processor's physical-address width."


>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -6138,6 +6138,140 @@ static int hvmop_get_param(
>>      return rc;
>>  }
>>
>> +static int do_altp2m_op(
>> +    XEN_GUEST_HANDLE_PARAM(void) arg) {
>> +    struct xen_hvm_altp2m_op a;
>> +    struct domain *d = NULL;
>> +    int rc = 0;
>> +
>> +    if ( !hvm_altp2m_supported() )
>> +        return -EOPNOTSUPP;
>> +
>> +    if ( copy_from_guest(&a, arg, 1) )
>> +        return -EFAULT;
>> +
>> +    if ( a.pad1 || a.pad2 ||
>> +         (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
>> +         (a.cmd < HVMOP_altp2m_get_domain_state) ||
>> +         (a.cmd > HVMOP_altp2m_change_gfn) )
>
>I'm afraid such a change invalidates any earlier ack, even if ti is correct. Instead
>of this, why don't you start numbering of the sub-ops at zero? Or if you really
>have a reason to start at 1, why not simply check a.cmd against zero (without
>using any particular sub-op value)? And then it escapes me why this can't be
>handled in a default case in the switch statement below anyway.

Hmm - is that a requirement per se? We are checking consistently, per the sub-op definition we have.
We would like this to be considered as is.

As I said in the cover letter, we have constraints on how much more we can do this week -
so we are requesting that the maintainers accept v7, with your review comments on these patches recorded as pending, to be addressed by us.

Thanks much!
Ravi



>
>Jan


* Re: [PATCH v7 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-23 14:53       ` Jan Beulich
@ 2015-07-23 15:00         ` Sahita, Ravi
  0 siblings, 0 replies; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 15:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 7:54 AM
>
>>>> On 23.07.15 at 16:36, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Thursday, July 23, 2015 2:22 AM
>>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>>> +out:
>>>
>>
>> Missed sorry.
>>
>>
>>>Just to repeat - labels should be indented by at least one space.
>
>But you realize this isn't the only case, and it looked like you adjusted only
>exactly the one place where the comment was made?
>

I agree these kinds of things are hard to catch - we did go in and scan our v7 for comment styles, indentation, case indentation, spaces in if conditional expressions, etc.,
but of course we missed some, and this could be addressed - Andrew was also suggesting that some sort of tool to do this would be good to use.

thanks
Ravi

>Jan


* Re: [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-23 14:40     ` Sahita, Ravi
@ 2015-07-23 15:00       ` Jan Beulich
  2015-07-23 15:02         ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 15:00 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, Jun Nakajima, tlengyel,
	Daniel De Graaf

>>> On 23.07.15 at 16:40, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 23, 2015 2:43 AM
>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>> +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v) {
>>> +    struct domain *d = v->domain;
>>> +    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
>>> +
>>> +    if ( !cpu_has_vmx_vmfunc )
>>> +        return;
>>> +
>>> +    if ( cpu_has_vmx_virt_exceptions )
>>> +        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
>>> +
>>> +    vmx_vmcs_enter(v);
>>> +
>>> +    if ( !d->is_dying && altp2m_active(d) )
>>> +    {
>>> +        v->arch.hvm_vmx.secondary_exec_control |= mask;
>>> +        __vmwrite(VM_FUNCTION_CONTROL,
>>VMX_VMFUNC_EPTP_SWITCHING);
>>> +        __vmwrite(EPTP_LIST_ADDR,
>>> + virt_to_maddr(d->arch.altp2m_eptp));
>>> +
>>> +        if ( cpu_has_vmx_virt_exceptions )
>>> +        {
>>> +            p2m_type_t t;
>>> +            mfn_t mfn;
>>> +
>>> +            mfn = get_gfn_query_unlocked(d,
>>> + gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
>>> +
>>> +            if ( mfn_x(mfn) != INVALID_MFN )
>>> +                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) <<
>>> + PAGE_SHIFT);
>>
>>Considering that the VMCS field holds a byte-aligned address, why do you
>>have the (later introduced) hvmop specify a GFN instead of a GPA?
>>
> 
> The SDM states:
> " If the "EPT-violation #VE" VM-execution control is 1, the 
> virtualization-exception information address must
> satisfy the following checks:
> - Bits 11:0 of the address must be 0.
> - The address must not set any bits beyond the processor's physical-address 
> width."

Ah, interesting. Knowing what to search for, I was able to find this.
But to be honest, it should also be stated in the section talking
about #VE, not just in the one talking about checks being done.

With that, just like for the other patch my ack depends on the
promise to deal with all the remaining issues.

Jan


* Re: [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-23 15:00       ` Jan Beulich
@ 2015-07-23 15:02         ` Sahita, Ravi
  0 siblings, 0 replies; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 15:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 8:00 AM
>To: Sahita, Ravi
>Cc: Andrew Cooper; Wei Liu; George Dunlap; Ian Jackson; White, Edmund H;
>Nakajima, Jun; xen-devel@lists.xen.org; tlengyel@novetta.com; Daniel De
>Graaf; Tim Deegan
>Subject: RE: [PATCH v7 06/15] VMX/altp2m: add code to support EPTP
>switching and #VE.
>
>>>> On 23.07.15 at 16:40, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Thursday, July 23, 2015 2:43 AM
>>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>>> +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v) {
>>>> +    struct domain *d = v->domain;
>>>> +    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
>>>> +
>>>> +    if ( !cpu_has_vmx_vmfunc )
>>>> +        return;
>>>> +
>>>> +    if ( cpu_has_vmx_virt_exceptions )
>>>> +        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
>>>> +
>>>> +    vmx_vmcs_enter(v);
>>>> +
>>>> +    if ( !d->is_dying && altp2m_active(d) )
>>>> +    {
>>>> +        v->arch.hvm_vmx.secondary_exec_control |= mask;
>>>> +        __vmwrite(VM_FUNCTION_CONTROL,
>>>VMX_VMFUNC_EPTP_SWITCHING);
>>>> +        __vmwrite(EPTP_LIST_ADDR,
>>>> + virt_to_maddr(d->arch.altp2m_eptp));
>>>> +
>>>> +        if ( cpu_has_vmx_virt_exceptions )
>>>> +        {
>>>> +            p2m_type_t t;
>>>> +            mfn_t mfn;
>>>> +
>>>> +            mfn = get_gfn_query_unlocked(d,
>>>> + gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
>>>> +
>>>> +            if ( mfn_x(mfn) != INVALID_MFN )
>>>> +                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) <<
>>>> + PAGE_SHIFT);
>>>
>>>Considering that the VMCS field holds a byte-aligned address, why do
>>>you have the (later introduced) hvmop specify a GFN instead of a GPA?
>>>
>>
>> The SDM states:
>> " If the "EPT-violation #VE" VM-execution control is 1, the
>> virtualization-exception information address must satisfy the
>> following checks:
>> - Bits 11:0 of the address must be 0.
>> - The address must not set any bits beyond the processor's
>> physical-address width."
>
>Ah, interesting. Knowing what to search for, I was able to find this.
>But to be honest, it should also be stated in the section talking about #VE, not
>just in the one talking about checks being done.
>
>With that, just like for the other patch my ack depends on the promise to deal
>with all the remaining issues.

Yes, our SDM has information that's not organized in the best way...
Agreed on the remaining issues.

Thanks
Ravi

>
>Jan


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-23 14:51     ` Sahita, Ravi
@ 2015-07-23 15:02       ` Jan Beulich
  2015-07-23 16:08       ` George Dunlap
  1 sibling, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 15:02 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, Jun Nakajima, tlengyel,
	Daniel De Graaf

>>> On 23.07.15 at 16:51, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 23, 2015 3:06 AM
>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>> @@ -758,6 +765,38 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct
>>vcpu
>>> *v, unsigned int idx);
>>>  /* Check to see if vcpu should be switched to a different p2m. */
>>> void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
>>
>>The context here suggests that I overlooked a still not replaced uint16_t.
>>
> 
> Ok.
> 
> Just wanted to make sure these are also ok to do post 4.6

I guess so, but you realize you're stretching it? Plus the patch can't
go in as is anyway, considering the off-by-one bugs found.

Jan


* Re: [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-23 14:56     ` Sahita, Ravi
@ 2015-07-23 15:08       ` Jan Beulich
  2015-07-23 15:16         ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 15:08 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, Jun Nakajima, tlengyel,
	Daniel De Graaf

>>> On 23.07.15 at 16:56, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 23, 2015 3:22 AM
>>
>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>>
>>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>
>>And I have to withdraw this ack pending clarification of (and perhaps
>>adjustment to) the #VE info address interface.
>>
> 
> Could we have the ack back :-) I clarified the #VE info address interface in 
> the other email - repeating here:
> 
> " If the "EPT-violation #VE" VM-execution control is 1, the 
> virtualization-exception information address must
> satisfy the following checks:
> - Bits 11:0 of the address must be 0.
> - The address must not set any bits beyond the processor's physical-address 
> width."

Yes, for this aspect.

>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -6138,6 +6138,140 @@ static int hvmop_get_param(
>>>      return rc;
>>>  }
>>>
>>> +static int do_altp2m_op(
>>> +    XEN_GUEST_HANDLE_PARAM(void) arg) {
>>> +    struct xen_hvm_altp2m_op a;
>>> +    struct domain *d = NULL;
>>> +    int rc = 0;
>>> +
>>> +    if ( !hvm_altp2m_supported() )
>>> +        return -EOPNOTSUPP;
>>> +
>>> +    if ( copy_from_guest(&a, arg, 1) )
>>> +        return -EFAULT;
>>> +
>>> +    if ( a.pad1 || a.pad2 ||
>>> +         (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
>>> +         (a.cmd < HVMOP_altp2m_get_domain_state) ||
>>> +         (a.cmd > HVMOP_altp2m_change_gfn) )
>>
>>I'm afraid such a change invalidates any earlier ack, even if it is correct. 
> Instead
>>of this, why don't you start numbering of the sub-ops at zero? Or if you 
> really
>>have a reason to start at 1, why not simply check a.cmd against zero (without
>>using any particular sub-op value)? And then it escapes me why this can't be
>>handled in a default case in the switch statement below anyway.
> 
> Hmm - is that a requirement per se? We are checking consistently, per the
> sub-op definition we have.

Well, in a way. But doing range checks like this means future
additions of sub-ops would always need to touch this code. Quite
different from doing it in the default case of a switch statement.
Plus, can you see how the expression is going to look if in
interface version 2 you need to remove one or two of the current
entries, replacing them with new, higher numbers?

> Would like this to be considered as is. 
> 
> As I said in the cover letter we have constraints on how much more we can do 
> this week now - 
> so requesting the maintainers to accept v7 with the review comments you have 
> on those recorded as pending to be addressed by us.

Yes, on that basis, albeit extremely hesitantly to be honest. If any
other maintainer would be as hesitant as I am about this, I would
likely put the two together to yield a NAK.

Jan


* Re: [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-23 15:08       ` Jan Beulich
@ 2015-07-23 15:16         ` Sahita, Ravi
  0 siblings, 0 replies; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 15:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 8:09 AM
>
>>>> On 23.07.15 at 16:56, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Thursday, July 23, 2015 3:22 AM
>>>
>>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>>>
>>>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>>
>>>And I have to withdraw this ack pending clarification of (and perhaps
>>>adjustment to) the #VE info address interface.
>>>
>>
>> Could we have the ack back :-) I clarified the #VE info address
>> interface in the other email - repeating here:
>>
>> " If the "EPT-violation #VE" VM-execution control is 1, the
>> virtualization-exception information address must satisfy the
>> following checks:
>> - Bits 11:0 of the address must be 0.
>> - The address must not set any bits beyond the processor's
>> physical-address width."
>
>Yes, for this aspect.
>
>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>> @@ -6138,6 +6138,140 @@ static int hvmop_get_param(
>>>>      return rc;
>>>>  }
>>>>
>>>> +static int do_altp2m_op(
>>>> +    XEN_GUEST_HANDLE_PARAM(void) arg) {
>>>> +    struct xen_hvm_altp2m_op a;
>>>> +    struct domain *d = NULL;
>>>> +    int rc = 0;
>>>> +
>>>> +    if ( !hvm_altp2m_supported() )
>>>> +        return -EOPNOTSUPP;
>>>> +
>>>> +    if ( copy_from_guest(&a, arg, 1) )
>>>> +        return -EFAULT;
>>>> +
>>>> +    if ( a.pad1 || a.pad2 ||
>>>> +         (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
>>>> +         (a.cmd < HVMOP_altp2m_get_domain_state) ||
>>>> +         (a.cmd > HVMOP_altp2m_change_gfn) )
>>>
>>>I'm afraid such a change invalidates any earlier ack, even if ti is correct.
>> Instead
>>>of this, why don't you start numbering of the sub-ops at zero? Or if
>>>you
>> really
>>>have a reason to start at 1, why not simply check a.cmd against zero
>>>(without using any particular sub-op value)? And then it escapes me
>>>why this can't be handled in a default case in the switch statement below
>anyway.
>>
>> Hmm - is that a requirement per se? we are consistently checking per
>> the sub-op definition we have.
>
>Well, in a way. But doing range checks like this means future additions of sub-
>ops would always need to touch this code. Quite different from doing it in the
>default case of a switch statement.
>Plus, can you see how the expression is going to look if in interface version
>2 you need to remove one or two of the current entries, replacing them with
>new, higher numbers?
>

Yes, in a revision I would handle this with the default case.

>> Would like this to be considered as is.
>>
>> As I said in the cover letter we have constraints on how much more we
>> can do this week now - so requesting the maintainers to accept v7 with
>> the review comments you have on those recorded as pending to be
>> addressed by us.
>
>Yes, on that basis, albeit extremely hesitantly to be honest. If any other
>maintainer would be as hesitant as I am about this, I would likely put the two
>together to yield a NAK.

Thanks for your flexibility and honesty on this -
I hope the other maintainers are ok with v7 in the same way, keeping the other comments you have already logged as pending for post-4.6.

Ravi

>
>Jan


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-23 14:51     ` Sahita, Ravi
  2015-07-23 15:02       ` Jan Beulich
@ 2015-07-23 16:08       ` George Dunlap
  2015-07-23 16:15         ` Jan Beulich
  1 sibling, 1 reply; 42+ messages in thread
From: George Dunlap @ 2015-07-23 16:08 UTC (permalink / raw)
  To: Sahita, Ravi, Jan Beulich
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

On 07/23/2015 03:51 PM, Sahita, Ravi wrote:
>>> +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx) {
>>> +    struct p2m_domain *p2m;
>>> +    int rc = -EINVAL;
>>> +
>>> +    if ( !idx || idx > MAX_ALTP2M )
>>
>>     >= (and then also elsewhere further down)?
>>
> 
> Right.

[snip]

> Just wanted to make sure these are also ok to do post 4.6

Well the off-by-one errors certainly need to be fixed for 4.6.

If this was the only thing holding it up, the committer could fix it up
on check-in, or we could take a fix-up patch afterwards.

 -George


* Re: [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops
  2015-07-22 23:01 ` [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
@ 2015-07-23 16:08   ` Jan Beulich
  2015-07-23 16:56     ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 16:08 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Jun Nakajima, tlengyel, Daniel De Graaf

>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -6005,6 +6005,9 @@ static int hvmop_set_param(
>                  nestedhvm_vcpu_destroy(v);
>          break;
>      case HVM_PARAM_ALTP2M:
> +        rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
> +        if ( rc )
> +            break;
>          if ( a.value > 1 )
>              rc = -EINVAL;
>          if ( a.value &&
> @@ -6189,6 +6192,9 @@ static int do_altp2m_op(
>          goto out;
>      }
>  
> +    if ( (rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d ? d : current->domain)) )
> +        goto out;
> +

Shouldn't this have got updated after the changes to patch 11?

Jan


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-23 16:08       ` George Dunlap
@ 2015-07-23 16:15         ` Jan Beulich
  2015-07-23 16:50           ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 16:15 UTC (permalink / raw)
  To: George Dunlap
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Edmund H White, xen-devel, Jun Nakajima, tlengyel,
	Daniel De Graaf

>>> On 23.07.15 at 18:08, <george.dunlap@citrix.com> wrote:
> On 07/23/2015 03:51 PM, Sahita, Ravi wrote:
>>>> +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx) {
>>>> +    struct p2m_domain *p2m;
>>>> +    int rc = -EINVAL;
>>>> +
>>>> +    if ( !idx || idx > MAX_ALTP2M )
>>>
>>>     >= (and then also elsewhere further down)?
>>>
>> 
>> Right.
> 
> [snip]
> 
>> Just wanted to make sure these are also ok to do post 4.6
> 
> Well the off-by-one errors certainly need to be fixed for 4.6.
> 
> If this was the only thing holding it up, the committer could fix it up
> on check-in, or we could take a fix-up patch afterwards.

Sure, but on this one in particular I intentionally gave my ack
only for the non-mm parts, hoping you would do another pass
over them (and then hopefully ack them).

Jan


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-23 16:15         ` Jan Beulich
@ 2015-07-23 16:50           ` Sahita, Ravi
  0 siblings, 0 replies; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 16:50 UTC (permalink / raw)
  To: Jan Beulich, George Dunlap
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 9:16 AM
>
>>>> On 23.07.15 at 18:08, <george.dunlap@citrix.com> wrote:
>> On 07/23/2015 03:51 PM, Sahita, Ravi wrote:
>>>>> +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx) {
>>>>> +    struct p2m_domain *p2m;
>>>>> +    int rc = -EINVAL;
>>>>> +
>>>>> +    if ( !idx || idx > MAX_ALTP2M )
>>>>
>>>>     >= (and then also elsewhere further down)?
>>>>
>>>
>>> Right.
>>
>> [snip]
>>
>>> Just wanted to make sure these are also ok to do post 4.6
>>
>> Well the off-by-one errors certainly need to be fixed for 4.6.
>>
>> If this was the only thing holding it up, the committer could fix it
>> up on check-in, or we could take a fix-up patch afterwards.

This was the only one for 4.6; the others were post-4.6 - so if this could be done by the committer, that would be great, George. Thanks, Ravi

>
>Sure, but on this one in particular I intentionally gave my ack only for the non-
>mm parts, hoping you would do another pass over them (and then hopefully
>ack them).
>
>Jan


* Re: [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops
  2015-07-23 16:08   ` Jan Beulich
@ 2015-07-23 16:56     ` Sahita, Ravi
  2015-07-24  7:49       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 16:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 9:09 AM
>
>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -6005,6 +6005,9 @@ static int hvmop_set_param(
>>                  nestedhvm_vcpu_destroy(v);
>>          break;
>>      case HVM_PARAM_ALTP2M:
>> +        rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
>> +        if ( rc )
>> +            break;
>>          if ( a.value > 1 )
>>              rc = -EINVAL;
>>          if ( a.value &&
>> @@ -6189,6 +6192,9 @@ static int do_altp2m_op(
>>          goto out;
>>      }
>>
>> +    if ( (rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d ? d : current->domain)) )
>> +        goto out;
>> +
>
>Shouldn't this have got updated after the changes to patch 11?

It was updated - a number of similar calls to check xsm permissions for this op were collapsed due to the sub-op'ing.
We don't differentiate across sub-ops for altp2m w.r.t. XSM permissions.

Thanks,
Ravi

>
>Jan


* Re: [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (14 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 15/15] tools/xen-access: altp2m testcases Ed White
@ 2015-07-23 17:12 ` Wei Liu
  2015-07-23 19:11   ` George Dunlap
  2015-07-24  9:56 ` Wei Liu
  16 siblings, 1 reply; 42+ messages in thread
From: Wei Liu @ 2015-07-23 17:12 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, xen-devel, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Hi all

As I understand it, most pending issues of this series are minor and the
final day for committing is tomorrow. Checking this series in as-is is
going to create some technical debt that either maintainers or Intel
developers need to pay back in the future (and Intel has signed up for
that, thank you). The clock is ticking but there is still disagreement,
so I would like to share some of my thoughts from a strategic point of
view.

I think this feature is important and gives us strategic advantages. I'm
not aware of other open source hypervisors that support such a feature. It
is essential to VM introspection. It will help the Xen project catch up in
the NFV market. The feature is interesting in its own right; I would imagine
power users / developers can do many interesting things with it.

From my point of view, the benefits outweigh the technical debt.

Wei.


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-22 23:01 ` [PATCH v7 10/15] x86/altp2m: add remaining support routines Ed White
  2015-07-23 10:05   ` Jan Beulich
@ 2015-07-23 19:10   ` George Dunlap
  1 sibling, 0 replies; 42+ messages in thread
From: George Dunlap @ 2015-07-23 19:10 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Jan Beulich, Tim Deegan, Ian Jackson,
	xen-devel, Jun Nakajima, Andrew Cooper, tlengyel,
	Daniel De Graaf

On Thu, Jul 23, 2015 at 12:01 AM, Ed White <edmund.h.white@intel.com> wrote:
> +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx)
> +{
> +    struct p2m_domain *p2m;
> +    int rc = -EINVAL;
> +
> +    if ( !idx || idx > MAX_ALTP2M )
> +        return rc;
> +
> +    domain_pause_except_self(d);
> +
> +    altp2m_list_lock(d);
> +
> +    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
> +    {
> +        p2m = d->arch.altp2m_p2m[idx];
> +
> +        if ( !_atomic_read(p2m->active_vcpus) )
> +        {
> +            p2m_flush_table(d->arch.altp2m_p2m[idx]);
> +            /* Uninit and reinit ept to force TLB shootdown */
> +            ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
> +            ept_p2m_init(d->arch.altp2m_p2m[idx]);
> +            d->arch.altp2m_eptp[idx] = INVALID_MFN;
> +            rc = 0;
> +        }
> +    }
> +
> +    altp2m_list_unlock(d);
> +
> +    domain_unpause_except_self(d);

Can you put on your list of clean-ups to send before the 4.6 release
to have this return -EBUSY rather than -EINVAL if there are active
vcpus?
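
I.e., a sketch of the suggested shape, inside the existing eptp-valid branch:

    rc = -EBUSY;
    if ( !_atomic_read(p2m->active_vcpus) )
    {
        /* Flush, uninit/reinit ept, and invalidate the entry as before. */
        rc = 0;
    }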

Thanks -- see my response to 00/15 for the rest.

 -George


* Re: [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m
  2015-07-23 17:12 ` [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Wei Liu
@ 2015-07-23 19:11   ` George Dunlap
  0 siblings, 0 replies; 42+ messages in thread
From: George Dunlap @ 2015-07-23 19:11 UTC (permalink / raw)
  To: Wei Liu, Ed White
  Cc: Ravi Sahita, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, xen-devel, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

On 07/23/2015 06:12 PM, Wei Liu wrote:
> Hi all
> 
> As I understand it most pending issues of this series are minor and the
> final day for committing is tomorrow. Checking this series in as-is is
> going to create some technical debt that either maintainers or Intel
> developers need to pay back in the future (and Intel has signed up for
> that, thank you). The clock is ticking but there is still disagreement,
> so I would like to share some of my thoughts from a strategic point of
> view.
> 
> I think this feature is important and gives us strategic advantages. I'm
> not aware of other open source hypervisors that support such a feature. It
> is essential to VM introspection. It will help the Xen project catch up in
> the NFV market. The feature is interesting in its own right; I would imagine
> power users / developers can do many interesting things with it.
> 
> From my point of view, the benefits outweigh the technical debt.

I think this makes sense.

I've just looked through the series as a whole, and although there are
lots of things I would like to change, I'm convinced that as a whole it
will work as advertised; and furthermore, as Wei said, that it is
strategically important to get in.

So with that in mind:

mm-code:

Acked-by: George Dunlap <george.dunlap@eu.citrix.com>


* Re: [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops
  2015-07-23 16:56     ` Sahita, Ravi
@ 2015-07-24  7:49       ` Jan Beulich
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2015-07-24  7:49 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, Jun Nakajima, tlengyel,
	Daniel De Graaf

>>> On 23.07.15 at 18:56, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 23, 2015 9:09 AM
>>
>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -6005,6 +6005,9 @@ static int hvmop_set_param(
>>>                  nestedhvm_vcpu_destroy(v);
>>>          break;
>>>      case HVM_PARAM_ALTP2M:
>>> +        rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
>>> +        if ( rc )
>>> +            break;
>>>          if ( a.value > 1 )
>>>              rc = -EINVAL;
>>>          if ( a.value &&
>>> @@ -6189,6 +6192,9 @@ static int do_altp2m_op(
>>>          goto out;
>>>      }
>>>
>>> +    if ( (rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d ? d : current->domain)) )
>>> +        goto out;
>>> +
>>
>>Shouldn't this have got updated after the changes to patch 11?
> 
> It was updated - a number of similar calls to check xsm permissions for this 
> op were collapsed due to the sub-op'ing.
> We don't differentiate across sub-ops for altp2m w.r.t. XSM permissions.

But I'm talking about the conditional expression still used there.

Jan


* Re: [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (15 preceding siblings ...)
  2015-07-23 17:12 ` [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Wei Liu
@ 2015-07-24  9:56 ` Wei Liu
  2015-07-24 16:06   ` Sahita, Ravi
  16 siblings, 1 reply; 42+ messages in thread
From: Wei Liu @ 2015-07-24  9:56 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, xen-devel, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Based on the understanding that:

1. George and Jan are happy with this series merged as-is.
2. Jan will do minor adjustments to address some issues.
3. Ravi and Ed will send out follow-up patches post-4.6 to fix other
   issues.

Release-acked-by: Wei Liu <wei.liu2@citrix.com>

And thanks to both sides to get this work done in time.


* Re: [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m
  2015-07-24  9:56 ` Wei Liu
@ 2015-07-24 16:06   ` Sahita, Ravi
  0 siblings, 0 replies; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-24 16:06 UTC (permalink / raw)
  To: Wei Liu, White, Edmund H
  Cc: Sahita, Ravi, Nakajima, Jun, George Dunlap, Andrew Cooper,
	Tim Deegan, xen-devel, Jan Beulich, tlengyel, Daniel De Graaf,
	Ian Jackson

>From: Wei Liu [mailto:wei.liu2@citrix.com]
>Sent: Friday, July 24, 2015 2:56 AM
>
>Based on the understanding that:
>
>1. George and Jan are happy with this series merged as-is.
>2. Jan will do minor adjustments to address some issues.
>3. Ravi and Ed will send out follow-up patches post-4.6 to fix other
>   issues.
>
>Release-acked-by: Wei Liu <wei.liu2@citrix.com>
>
>And thanks to both sides to get this work done in time.

Thanks Wei and all maintainers!

Ravi


Thread overview:
2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
2015-07-22 23:01 ` [PATCH v7 01/15] common/domain: Helpers to pause a domain while in context Ed White
2015-07-22 23:01 ` [PATCH v7 02/15] VMX: VMFUNC and #VE definitions and detection Ed White
2015-07-22 23:01 ` [PATCH v7 03/15] VMX: implement suppress #VE Ed White
2015-07-22 23:01 ` [PATCH v7 04/15] x86/HVM: Hardware alternate p2m support detection Ed White
2015-07-22 23:01 ` [PATCH v7 05/15] x86/altp2m: basic data structures and support routines Ed White
2015-07-23  9:22   ` Jan Beulich
2015-07-23 14:36     ` Sahita, Ravi
2015-07-23 14:53       ` Jan Beulich
2015-07-23 15:00         ` Sahita, Ravi
2015-07-22 23:01 ` [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
2015-07-23  9:43   ` Jan Beulich
2015-07-23 14:40     ` Sahita, Ravi
2015-07-23 15:00       ` Jan Beulich
2015-07-23 15:02         ` Sahita, Ravi
2015-07-22 23:01 ` [PATCH v7 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
2015-07-22 23:01 ` [PATCH v7 08/15] x86/altp2m: add control of suppress_ve Ed White
2015-07-22 23:01 ` [PATCH v7 09/15] x86/altp2m: alternate p2m memory events Ed White
2015-07-22 23:01 ` [PATCH v7 10/15] x86/altp2m: add remaining support routines Ed White
2015-07-23 10:05   ` Jan Beulich
2015-07-23 14:51     ` Sahita, Ravi
2015-07-23 15:02       ` Jan Beulich
2015-07-23 16:08       ` George Dunlap
2015-07-23 16:15         ` Jan Beulich
2015-07-23 16:50           ` Sahita, Ravi
2015-07-23 19:10   ` George Dunlap
2015-07-22 23:01 ` [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
2015-07-23 10:22   ` Jan Beulich
2015-07-23 14:56     ` Sahita, Ravi
2015-07-23 15:08       ` Jan Beulich
2015-07-23 15:16         ` Sahita, Ravi
2015-07-22 23:01 ` [PATCH v7 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
2015-07-22 23:01 ` [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
2015-07-23 16:08   ` Jan Beulich
2015-07-23 16:56     ` Sahita, Ravi
2015-07-24  7:49       ` Jan Beulich
2015-07-22 23:01 ` [PATCH v7 14/15] tools/libxc: add support to altp2m hvmops Ed White
2015-07-22 23:01 ` [PATCH v7 15/15] tools/xen-access: altp2m testcases Ed White
2015-07-23 17:12 ` [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Wei Liu
2015-07-23 19:11   ` George Dunlap
2015-07-24  9:56 ` Wei Liu
2015-07-24 16:06   ` Sahita, Ravi
