* [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m
@ 2015-07-10  0:52 Ed White
  2015-07-10  0:52 ` [PATCH v4 01/15] common/domain: Helpers to pause a domain while in context Ed White
                   ` (14 more replies)
  0 siblings, 15 replies; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

This set of patches adds support to hvm domains for EPTP switching by creating
multiple copies of the host p2m (currently limited to 10 copies).

The primary use of this capability is expected to be in scenarios where access
to memory needs to be monitored and/or restricted below the level at which the
guest OS page tables operate. Two examples that were discussed at the 2014 Xen
developer summit are:

    VM introspection: 
        http://www.slideshare.net/xen_com_mgr/
        zero-footprint-guest-memory-introspection-from-xen

    Secure inter-VM communication:
        http://www.slideshare.net/xen_com_mgr/nakajima-nvf

A more detailed design specification can be found at:
    http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01319.html

Each p2m copy is populated lazily on EPT violations.
Permissions for pages in alternate p2m's can be changed in a manner
similar to the existing memory access interface, and gfn->mfn mappings
can also be changed.

All this is done through extra HVMOP types.
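
To illustrate the lazy-population step, a minimal sketch (the helper
name and the simplified get/set-entry signatures are invented here for
illustration, not taken from the series):

    /* Sketch: on an EPT violation in an active altp2m, propagate the
     * host p2m entry if the alternate view has none. */
    static bool_t altp2m_lazy_copy_sketch(struct p2m_domain *hp2m,
                                          struct p2m_domain *ap2m,
                                          unsigned long gfn)
    {
        p2m_type_t t;
        p2m_access_t a;
        mfn_t mfn = ap2m->get_entry(ap2m, gfn, &t, &a, 0, NULL);

        if ( mfn_x(mfn) != INVALID_MFN )
            return 0;      /* already populated: a real violation */

        mfn = hp2m->get_entry(hp2m, gfn, &t, &a, 0, NULL);
        if ( mfn_x(mfn) == INVALID_MFN )
            return 0;      /* not mapped in the host p2m either */

        /* Copy the mapping and permissions into the alternate view. */
        ap2m->set_entry(ap2m, gfn, mfn, PAGE_ORDER_4K, t, a);
        return 1;          /* retry the faulting access */
    }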

The cross-domain HVMOP code has been compile-tested only. Also, the cross-domain
code is hypervisor-only; the toolstack has not been modified.

The intra-domain code has been tested. Violation notifications can only be received
for pages that have been modified (access permissions and/or gfn->mfn mapping) 
intra-domain, and only on VCPU's that have enabled notification.

VMFUNC and #VE will both be emulated on hardware without native support.

This code is not compatible with nested hvm functionality and will refuse to work
with nested hvm active. It is also not compatible with migration. It should be
considered experimental.


Changes since v3:

Major changes are:

    Replaced patch 8.

    Refactored patch 11 to use a single HVMOP with subcodes.

    Addressed feedback in patch 7, and some other patches.

    Added two tools/test patches from Tamas. Both are optional.

    Added various ack's and reviewed-by's.

    Rebased.

Ravi Sahita will now be the point of contact for this series.


Changes since v2:

Addressed all v2 feedback *except*:

    In patch 5, the per-domain EPTP list page is still allocated from the
    Xen heap. If allocated from the domain heap, Xen panics (IIRC on Haswell
    hardware) when walking the EPTP list during exit processing in patch 6.

    HVM_ops are not merged. Tamas suggested merging the memory access ops,
    but in practice they are not as similar as they appear on the surface.
    Razvan suggested merging the implementation code in p2m.c, but that is
    also not as common as it appears on the surface.
    Andrew suggested merging all altp2m ops into one with a subop code in
    the input structure. His point that only 255 ops can be defined is well
    taken, but altp2m uses only 2 more ops than the recently introduced
    ioreq ops, and <15% of the available ops have been defined. Since we
    don't know how to implement XSM hooks and policy with the subop model,
    we have not adopted this suggestion.
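
    For reference, the subop model under discussion would look roughly
    like this (structure and field names invented for illustration;
    this is not the interface the series defines):

        struct xen_hvm_altp2m_op {
            uint32_t cmd;       /* HVMOP_altp2m_* subop code */
            uint32_t domid;
            union {
                struct { uint16_t view; }                view;
                struct { uint64_t gfn; uint16_t view; }  page;
            } u;
        };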

    The p2m set/get interface is not modified. The altp2m code needs to
    write suppress_ve in 2 places and read it in 1 place. The original
    patch series managed this by coupling the state of suppress_ve to the
    p2m memory type, which Tim disliked. In v2 of the series, special
    set/get interfaces were added to access suppress_ve only when required.
    Jan has suggested changing the existing interfaces, but we feel this
    is inappropriate for this experimental patch series. Changing the
    existing interfaces would require a design agreement to be reached
    and would impact a large amount of existing code.
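
    The v2 special-purpose accessors mentioned above have roughly this
    shape (signatures reconstructed here for illustration, not quoted
    from the v2 series):

        int p2m_set_suppress_ve(struct domain *d, unsigned long gfn,
                                bool_t suppress_ve, unsigned int altp2m_idx);
        int p2m_get_suppress_ve(struct domain *d, unsigned long gfn,
                                bool_t *suppress_ve, unsigned int altp2m_idx);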

    Andrew kindly added some reviewed-by's to v2. I have not carried
    his reviewed-by of the memory event patch forward because Tamas
    requested significant changes to the patch.


Changes since v1:

Many changes since v1 in response to maintainer feedback, including:

    Suppress_ve state is now decoupled from memory type
    VMFUNC emulation handled in x86 emulator
    Lazy-copy algorithm copies any page where mfn != INVALID_MFN
    All nested page fault handling except lazy-copy is now in
        top-level (hvm.c) nested page fault handler
    Split p2m lock type (as suggested by Tim) to avoid lock order violations
    XSM hooks
    Xen parameter to globally enable altp2m (default disabled) and HVM parameter
    Altp2m reference counting no longer uses dirty_cpu bitmap
    Remapped page tracking to invalidate altp2m's where needed to protect Xen
    Many other minor changes

The altp2m invalidation is implemented to a level that I believe satisfies
the requirements of protecting Xen. Invalidation notification is not yet
implemented, and there may be other cases where invalidation is warranted to
protect the integrity of the restrictions placed through altp2m. We may add
further patches in this area.

Testability is still a potential issue. We have offered to make our internal
Windows test binaries available for intra-domain testing. Tamas has
been working on toolstack support for cross-domain testing with a slightly
earlier patch series, and we hope he will submit that support.

Not all of the patches will be of interest to everyone copied here. I've
copied everyone on this initial mailing to give context.

Andrew Cooper (1):
  common/domain: Helpers to pause a domain while in context

Ed White (9):
  VMX: VMFUNC and #VE definitions and detection.
  VMX: implement suppress #VE.
  x86/HVM: Hardware alternate p2m support detection.
  x86/altp2m: basic data structures and support routines.
  VMX/altp2m: add code to support EPTP switching and #VE.
  x86/altp2m: alternate p2m memory events.
  x86/altp2m: add remaining support routines.
  x86/altp2m: define and implement alternate p2m HVMOP types.
  x86/altp2m: Add altp2mhvm HVM domain parameter.

George Dunlap (1):
  x86/altp2m: add control of suppress_ve.

Ravi Sahita (2):
  VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  x86/altp2m: XSM hooks for altp2m HVM ops

Tamas K Lengyel (2):
  tools/libxc: add support to altp2m hvmops
  tools/xen-access: altp2m testcases

 docs/man/xl.cfg.pod.5                        |  12 +
 docs/misc/xen-command-line.markdown          |   7 +
 tools/flask/policy/policy/modules/xen/xen.if |   4 +-
 tools/libxc/Makefile                         |   1 +
 tools/libxc/include/xenctrl.h                |  21 +
 tools/libxc/xc_altp2m.c                      | 237 ++++++++++++
 tools/libxl/libxl.h                          |   6 +
 tools/libxl/libxl_create.c                   |   1 +
 tools/libxl/libxl_dom.c                      |   2 +
 tools/libxl/libxl_types.idl                  |   1 +
 tools/libxl/xl_cmdimpl.c                     |  10 +
 tools/tests/xen-access/xen-access.c          | 173 +++++++--
 xen/arch/x86/hvm/Makefile                    |   1 +
 xen/arch/x86/hvm/altp2m.c                    |  92 +++++
 xen/arch/x86/hvm/emulate.c                   |  19 +-
 xen/arch/x86/hvm/hvm.c                       | 249 +++++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c                  |  42 +-
 xen/arch/x86/hvm/vmx/vmx.c                   | 168 ++++++++
 xen/arch/x86/mm/hap/Makefile                 |   1 +
 xen/arch/x86/mm/hap/altp2m_hap.c             |  98 +++++
 xen/arch/x86/mm/hap/hap.c                    |  32 +-
 xen/arch/x86/mm/mem_sharing.c                |   5 +-
 xen/arch/x86/mm/mm-locks.h                   |  38 +-
 xen/arch/x86/mm/p2m-ept.c                    |  46 ++-
 xen/arch/x86/mm/p2m-pod.c                    |  12 +-
 xen/arch/x86/mm/p2m-pt.c                     |  10 +-
 xen/arch/x86/mm/p2m.c                        | 552 +++++++++++++++++++++++++--
 xen/arch/x86/x86_emulate/x86_emulate.c       |  20 +-
 xen/arch/x86/x86_emulate/x86_emulate.h       |   4 +
 xen/common/domain.c                          |  28 ++
 xen/common/vm_event.c                        |   4 +
 xen/include/asm-arm/p2m.h                    |   6 +
 xen/include/asm-x86/domain.h                 |  10 +
 xen/include/asm-x86/hvm/altp2m.h             |  42 ++
 xen/include/asm-x86/hvm/hvm.h                |  28 ++
 xen/include/asm-x86/hvm/vcpu.h               |   9 +
 xen/include/asm-x86/hvm/vmx/vmcs.h           |  14 +-
 xen/include/asm-x86/hvm/vmx/vmx.h            |  13 +-
 xen/include/asm-x86/msr-index.h              |   1 +
 xen/include/asm-x86/p2m.h                    |  90 ++++-
 xen/include/public/hvm/hvm_op.h              |  82 ++++
 xen/include/public/hvm/params.h              |   5 +-
 xen/include/public/vm_event.h                |  11 +
 xen/include/xen/sched.h                      |   5 +
 xen/include/xsm/dummy.h                      |  12 +
 xen/include/xsm/xsm.h                        |  12 +
 xen/xsm/dummy.c                              |   2 +
 xen/xsm/flask/hooks.c                        |  12 +
 xen/xsm/flask/policy/access_vectors          |   7 +
 49 files changed, 2155 insertions(+), 102 deletions(-)
 create mode 100644 tools/libxc/xc_altp2m.c
 create mode 100644 xen/arch/x86/hvm/altp2m.c
 create mode 100644 xen/arch/x86/mm/hap/altp2m_hap.c
 create mode 100644 xen/include/asm-x86/hvm/altp2m.h

-- 
1.9.1

* [PATCH v4 01/15] common/domain: Helpers to pause a domain while in context
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  0:52 ` [PATCH v4 02/15] VMX: VMFUNC and #VE definitions and detection Ed White
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Andrew Cooper <andrew.cooper3@citrix.com>

For use on codepaths which would need to use domain_pause() but might be in
the target domain's context.  In the case that the target domain is in
context, all other vcpus are paused.
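
A (hypothetical) caller that may itself be running on one of the target
domain's vcpus would use the helpers like this:

    static void example_operation(struct domain *d)
    {
        domain_pause_except_self(d);

        /*
         * All other vcpus of d are paused; if current belongs to d it
         * keeps running, so the work done here must tolerate that.
         */

        domain_unpause_except_self(d);
    }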

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/common/domain.c     | 28 ++++++++++++++++++++++++++++
 xen/include/xen/sched.h |  5 +++++
 2 files changed, 33 insertions(+)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 3bc52e6..1bb24ae 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1010,6 +1010,34 @@ int domain_unpause_by_systemcontroller(struct domain *d)
     return 0;
 }
 
+void domain_pause_except_self(struct domain *d)
+{
+    struct vcpu *v, *curr = current;
+
+    if ( curr->domain == d )
+    {
+        for_each_vcpu( d, v )
+            if ( likely(v != curr) )
+                vcpu_pause(v);
+    }
+    else
+        domain_pause(d);
+}
+
+void domain_unpause_except_self(struct domain *d)
+{
+    struct vcpu *v, *curr = current;
+
+    if ( curr->domain == d )
+    {
+        for_each_vcpu( d, v )
+            if ( likely(v != curr) )
+                vcpu_unpause(v);
+    }
+    else
+        domain_unpause(d);
+}
+
 int vcpu_reset(struct vcpu *v)
 {
     struct domain *d = v->domain;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b29d9e7..73d3bc8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -804,6 +804,11 @@ static inline int domain_pause_by_systemcontroller_nosync(struct domain *d)
 {
     return __domain_pause_by_systemcontroller(d, domain_pause_nosync);
 }
+
+/* domain_pause() but safe against trying to pause current. */
+void domain_pause_except_self(struct domain *d);
+void domain_unpause_except_self(struct domain *d);
+
 void cpu_init(void);
 
 struct scheduler;
-- 
1.9.1

* [PATCH v4 02/15] VMX: VMFUNC and #VE definitions and detection.
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
  2015-07-10  0:52 ` [PATCH v4 01/15] common/domain: Helpers to pause a domain while in context Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  0:52 ` [PATCH v4 03/15] VMX: implement suppress #VE Ed White
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Currently, neither VMFUNC nor #VE is enabled globally, but both may be
enabled on a per-VCPU basis by the altp2m code.

Remove the check for EPTE bit 63 == zero in ept_split_super_page(), as
that bit is now hardware-defined.
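
For reference, a guest invokes VMFUNC leaf 0 with EAX = 0 and ECX
holding an index into the EPTP list (a minimal sketch; the 0f 01 d4
encoding is per the SDM):

    static inline void vmfunc_switch_eptp(uint32_t eptp_index)
    {
        asm volatile ( ".byte 0x0f, 0x01, 0xd4"    /* VMFUNC */
                       :: "a" (0), "c" (eptp_index) : "memory" );
    }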

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
---
 xen/arch/x86/hvm/vmx/vmcs.c        | 42 +++++++++++++++++++++++++++++++++++---
 xen/arch/x86/mm/p2m-ept.c          |  1 -
 xen/include/asm-x86/hvm/vmx/vmcs.h | 14 +++++++++++--
 xen/include/asm-x86/hvm/vmx/vmx.h  | 13 +++++++++++-
 xen/include/asm-x86/msr-index.h    |  1 +
 5 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 4c5ceb5..bc1cabd 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -101,6 +101,8 @@ u32 vmx_secondary_exec_control __read_mostly;
 u32 vmx_vmexit_control __read_mostly;
 u32 vmx_vmentry_control __read_mostly;
 u64 vmx_ept_vpid_cap __read_mostly;
+u64 vmx_vmfunc __read_mostly;
+bool_t vmx_virt_exception __read_mostly;
 
 const u32 vmx_introspection_force_enabled_msrs[] = {
     MSR_IA32_SYSENTER_EIP,
@@ -140,6 +142,8 @@ static void __init vmx_display_features(void)
     P(cpu_has_vmx_virtual_intr_delivery, "Virtual Interrupt Delivery");
     P(cpu_has_vmx_posted_intr_processing, "Posted Interrupt Processing");
     P(cpu_has_vmx_vmcs_shadowing, "VMCS shadowing");
+    P(cpu_has_vmx_vmfunc, "VM Functions");
+    P(cpu_has_vmx_virt_exceptions, "Virtualisation Exceptions");
     P(cpu_has_vmx_pml, "Page Modification Logging");
 #undef P
 
@@ -185,6 +189,7 @@ static int vmx_init_vmcs_config(void)
     u64 _vmx_misc_cap = 0;
     u32 _vmx_vmexit_control;
     u32 _vmx_vmentry_control;
+    u64 _vmx_vmfunc = 0;
     bool_t mismatch = 0;
 
     rdmsr(MSR_IA32_VMX_BASIC, vmx_basic_msr_low, vmx_basic_msr_high);
@@ -230,7 +235,9 @@ static int vmx_init_vmcs_config(void)
                SECONDARY_EXEC_ENABLE_EPT |
                SECONDARY_EXEC_ENABLE_RDTSCP |
                SECONDARY_EXEC_PAUSE_LOOP_EXITING |
-               SECONDARY_EXEC_ENABLE_INVPCID);
+               SECONDARY_EXEC_ENABLE_INVPCID |
+               SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
+               SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
         rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
         if ( _vmx_misc_cap & VMX_MISC_VMWRITE_ALL )
             opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
@@ -341,6 +348,24 @@ static int vmx_init_vmcs_config(void)
           || !(_vmx_vmexit_control & VM_EXIT_ACK_INTR_ON_EXIT) )
         _vmx_pin_based_exec_control  &= ~ PIN_BASED_POSTED_INTERRUPT;
 
+    /* The IA32_VMX_VMFUNC MSR exists only when VMFUNC is available */
+    if ( _vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS )
+    {
+        rdmsrl(MSR_IA32_VMX_VMFUNC, _vmx_vmfunc);
+
+        /*
+         * VMFUNC leaf 0 (EPTP switching) must be supported.
+         *
+         * Or we just don't use VMFUNC.
+         */
+        if ( !(_vmx_vmfunc & VMX_VMFUNC_EPTP_SWITCHING) )
+            _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
+    }
+
+    /* Virtualization exceptions are only enabled if VMFUNC is enabled */
+    if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
+        _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+
     min = 0;
     opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS;
     _vmx_vmentry_control = adjust_vmx_controls(
@@ -361,6 +386,9 @@ static int vmx_init_vmcs_config(void)
         vmx_vmentry_control        = _vmx_vmentry_control;
         vmx_basic_msr              = ((u64)vmx_basic_msr_high << 32) |
                                      vmx_basic_msr_low;
+        vmx_vmfunc                 = _vmx_vmfunc;
+        vmx_virt_exception         = !!(_vmx_secondary_exec_control &
+                                       SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
         vmx_display_features();
 
         /* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -397,6 +425,9 @@ static int vmx_init_vmcs_config(void)
         mismatch |= cap_check(
             "EPT and VPID Capability",
             vmx_ept_vpid_cap, _vmx_ept_vpid_cap);
+        mismatch |= cap_check(
+            "VMFUNC Capability",
+            vmx_vmfunc, _vmx_vmfunc);
         if ( cpu_has_vmx_ins_outs_instr_info !=
              !!(vmx_basic_msr_high & (VMX_BASIC_INS_OUT_INFO >> 32)) )
         {
@@ -967,6 +998,11 @@ static int construct_vmcs(struct vcpu *v)
     /* Do not enable Monitor Trap Flag unless start single step debug */
     v->arch.hvm_vmx.exec_control &= ~CPU_BASED_MONITOR_TRAP_FLAG;
 
+    /* Disable VMFUNC and #VE for now: they may be enabled later by altp2m. */
+    v->arch.hvm_vmx.secondary_exec_control &=
+        ~(SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
+          SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
+
     if ( is_pvh_domain(d) )
     {
         /* Disable virtual apics, TPR */
@@ -1790,9 +1826,9 @@ void vmcs_dump_vcpu(struct vcpu *v)
         printk("PLE Gap=%08x Window=%08x\n",
                vmr32(PLE_GAP), vmr32(PLE_WINDOW));
     if ( v->arch.hvm_vmx.secondary_exec_control &
-         (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VMFUNC) )
+         (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
         printk("Virtual processor ID = 0x%04x VMfunc controls = %016lx\n",
-               vmr16(VIRTUAL_PROCESSOR_ID), vmr(VMFUNC_CONTROL));
+               vmr16(VIRTUAL_PROCESSOR_ID), vmr(VM_FUNCTION_CONTROL));
 
     vmx_vmcs_exit(v);
 }
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 5133eb6..a6c9adf 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -281,7 +281,6 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
         epte->sp = (level > 1);
         epte->mfn += i * trunk;
         epte->snp = (iommu_enabled && iommu_snoop);
-        ASSERT(!epte->avail3);
 
         ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
 
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 1104bda..cb0ee6c 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -222,9 +222,10 @@ extern u32 vmx_vmentry_control;
 #define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY    0x00000200
 #define SECONDARY_EXEC_PAUSE_LOOP_EXITING       0x00000400
 #define SECONDARY_EXEC_ENABLE_INVPCID           0x00001000
-#define SECONDARY_EXEC_ENABLE_VMFUNC            0x00002000
+#define SECONDARY_EXEC_ENABLE_VM_FUNCTIONS      0x00002000
 #define SECONDARY_EXEC_ENABLE_VMCS_SHADOWING    0x00004000
 #define SECONDARY_EXEC_ENABLE_PML               0x00020000
+#define SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS   0x00040000
 extern u32 vmx_secondary_exec_control;
 
 #define VMX_EPT_EXEC_ONLY_SUPPORTED             0x00000001
@@ -285,6 +286,10 @@ extern u32 vmx_secondary_exec_control;
     (vmx_pin_based_exec_control & PIN_BASED_POSTED_INTERRUPT)
 #define cpu_has_vmx_vmcs_shadowing \
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VMCS_SHADOWING)
+#define cpu_has_vmx_vmfunc \
+    (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS)
+#define cpu_has_vmx_virt_exceptions \
+    (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS)
 #define cpu_has_vmx_pml \
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_PML)
 
@@ -316,6 +321,9 @@ extern u64 vmx_basic_msr;
 #define VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK  0x0FF
 #define VMX_GUEST_INTR_STATUS_SVI_OFFSET        8
 
+/* VMFUNC leaf definitions */
+#define VMX_VMFUNC_EPTP_SWITCHING   (1ULL << 0)
+
 /* VMCS field encodings. */
 #define VMCS_HIGH(x) ((x) | 1)
 enum vmcs_field {
@@ -350,12 +358,14 @@ enum vmcs_field {
     VIRTUAL_APIC_PAGE_ADDR          = 0x00002012,
     APIC_ACCESS_ADDR                = 0x00002014,
     PI_DESC_ADDR                    = 0x00002016,
-    VMFUNC_CONTROL                  = 0x00002018,
+    VM_FUNCTION_CONTROL             = 0x00002018,
     EPT_POINTER                     = 0x0000201a,
     EOI_EXIT_BITMAP0                = 0x0000201c,
 #define EOI_EXIT_BITMAP(n) (EOI_EXIT_BITMAP0 + (n) * 2) /* n = 0...3 */
+    EPTP_LIST_ADDR                  = 0x00002024,
     VMREAD_BITMAP                   = 0x00002026,
     VMWRITE_BITMAP                  = 0x00002028,
+    VIRT_EXCEPTION_INFO             = 0x0000202a,
     GUEST_PHYSICAL_ADDRESS          = 0x00002400,
     VMCS_LINK_POINTER               = 0x00002800,
     GUEST_IA32_DEBUGCTL             = 0x00002802,
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 35f804a..5b59d3c 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -47,7 +47,7 @@ typedef union {
         access      :   4,  /* bits 61:58 - p2m_access_t */
         tm          :   1,  /* bit 62 - VT-d transient-mapping hint in
                                shared EPT/VT-d usage */
-        avail3      :   1;  /* bit 63 - Software available 3 */
+        suppress_ve :   1;  /* bit 63 - suppress #VE */
     };
     u64 epte;
 } ept_entry_t;
@@ -186,6 +186,7 @@ static inline unsigned long pi_get_pir(struct pi_desc *pi_desc, int group)
 #define EXIT_REASON_XSETBV              55
 #define EXIT_REASON_APIC_WRITE          56
 #define EXIT_REASON_INVPCID             58
+#define EXIT_REASON_VMFUNC              59
 #define EXIT_REASON_PML_FULL            62
 
 /*
@@ -554,4 +555,14 @@ void p2m_init_hap_data(struct p2m_domain *p2m);
 #define EPT_L4_PAGETABLE_SHIFT      39
 #define EPT_PAGETABLE_ENTRIES       512
 
+/* #VE information page */
+typedef struct {
+    u32 exit_reason;
+    u32 semaphore;
+    u64 exit_qualification;
+    u64 gla;
+    u64 gpa;
+    u16 eptp_index;
+} ve_info_t;
+
 #endif /* __ASM_X86_HVM_VMX_VMX_H__ */
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 83f2f70..8069d60 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -130,6 +130,7 @@
 #define MSR_IA32_VMX_TRUE_PROCBASED_CTLS        0x48e
 #define MSR_IA32_VMX_TRUE_EXIT_CTLS             0x48f
 #define MSR_IA32_VMX_TRUE_ENTRY_CTLS            0x490
+#define MSR_IA32_VMX_VMFUNC                     0x491
 #define IA32_FEATURE_CONTROL_MSR                0x3a
 #define IA32_FEATURE_CONTROL_MSR_LOCK                     0x0001
 #define IA32_FEATURE_CONTROL_MSR_ENABLE_VMXON_INSIDE_SMX  0x0002
-- 
1.9.1

* [PATCH v4 03/15] VMX: implement suppress #VE.
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
  2015-07-10  0:52 ` [PATCH v4 01/15] common/domain: Helpers to pause a domain while in context Ed White
  2015-07-10  0:52 ` [PATCH v4 02/15] VMX: VMFUNC and #VE definitions and detection Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  9:09   ` Jan Beulich
  2015-07-10  0:52 ` [PATCH v4 04/15] x86/HVM: Hardware alternate p2m support detection Ed White
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

In preparation for selectively enabling #VE in a later patch, set
suppress #VE on all EPTE's.

Suppress #VE should always be the default condition for two reasons:
it is generally not safe to deliver #VE into a guest unless that guest
has been modified to receive it; and even then for most EPT violations only
the hypervisor is able to handle the violation.
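
For context, a guest that has opted in would service #VE roughly as
follows (a hypothetical sketch based on the ve_info_t layout added in
patch 02; hardware suppresses further #VE delivery until the semaphore
is cleared):

    void guest_handle_ve(ve_info_t *ve)
    {
        /* For violations converted to #VE, exit_reason is EPT_VIOLATION. */
        guest_fixup_access(ve->gpa, ve->eptp_index); /* hypothetical policy hook */

        ve->semaphore = 0;  /* hardware set this on delivery; clear to re-arm */
    }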

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
---
 xen/arch/x86/mm/p2m-ept.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index a6c9adf..4111795 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -41,7 +41,8 @@
 #define is_epte_superpage(ept_entry)    ((ept_entry)->sp)
 static inline bool_t is_epte_valid(ept_entry_t *e)
 {
-    return (e->epte != 0 && e->sa_p2mt != p2m_invalid);
+    /* suppress_ve alone is not considered valid, so mask it off */
+    return ((e->epte & ~(1ul << 63)) != 0 && e->sa_p2mt != p2m_invalid);
 }
 
 /* returns : 0 for success, -errno otherwise */
@@ -219,6 +220,8 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
 static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
 {
     struct page_info *pg;
+    ept_entry_t *table;
+    unsigned int i;
 
     pg = p2m_alloc_ptp(p2m, 0);
     if ( pg == NULL )
@@ -232,6 +235,15 @@ static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
     /* Manually set A bit to avoid overhead of MMU having to write it later. */
     ept_entry->a = 1;
 
+    ept_entry->suppress_ve = 1;
+
+    table = __map_domain_page(pg);
+
+    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
+        table[i].suppress_ve = 1;
+
+    unmap_domain_page(table);
+
     return 1;
 }
 
@@ -281,6 +293,7 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
         epte->sp = (level > 1);
         epte->mfn += i * trunk;
         epte->snp = (iommu_enabled && iommu_snoop);
+        epte->suppress_ve = 1;
 
         ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
 
@@ -790,6 +803,8 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
     }
 
+    new_entry.suppress_ve = 1;
+
     rc = atomic_write_ept_entry(ept_entry, new_entry, target);
     if ( unlikely(rc) )
         old_entry.epte = 0;
@@ -1111,6 +1126,8 @@ static void ept_flush_pml_buffers(struct p2m_domain *p2m)
 int ept_p2m_init(struct p2m_domain *p2m)
 {
     struct ept_data *ept = &p2m->ept;
+    ept_entry_t *table;
+    unsigned int i;
 
     p2m->set_entry = ept_set_entry;
     p2m->get_entry = ept_get_entry;
@@ -1134,6 +1151,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
         p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
     }
 
+    table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
+
+    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
+        table[i].suppress_ve = 1;
+
+    unmap_domain_page(table);
+
     if ( !zalloc_cpumask_var(&ept->synced_mask) )
         return -ENOMEM;
 
-- 
1.9.1

* [PATCH v4 04/15] x86/HVM: Hardware alternate p2m support detection.
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (2 preceding siblings ...)
  2015-07-10  0:52 ` [PATCH v4 03/15] VMX: implement suppress #VE Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  0:52 ` [PATCH v4 05/15] x86/altp2m: basic data structures and support routines Ed White
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

As implemented here, altp2m is only supported on platforms with VMX HAP.

By default this functionality is force-disabled; it can be enabled
by specifying altp2m=1 on the Xen command line.
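
For example, with a GRUB2 boot entry (the path and the other options
are illustrative):

    multiboot2 /boot/xen.gz altp2m=1 dom0_mem=2048M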

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 docs/misc/xen-command-line.markdown | 7 +++++++
 xen/arch/x86/hvm/hvm.c              | 7 +++++++
 xen/arch/x86/hvm/vmx/vmx.c          | 1 +
 xen/include/asm-x86/hvm/hvm.h       | 9 +++++++++
 4 files changed, 24 insertions(+)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index aa684c0..3391c66 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -139,6 +139,13 @@ mode during S3 resume.
 > Default: `true`
 
 Permit Xen to use superpages when performing memory management.
+ 
+### altp2m (Intel)
+> `= <boolean>`
+
+> Default: `false`
+
+Permit multiple copies of host p2m.
 
 ### apic
 > `= bigsmp | default`
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 535d622..4019658 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -94,6 +94,10 @@ bool_t opt_hvm_fep;
 boolean_param("hvm_fep", opt_hvm_fep);
 #endif
 
+/* Xen command-line option to enable altp2m */
+static bool_t __initdata opt_altp2m_enabled = 0;
+boolean_param("altp2m", opt_altp2m_enabled);
+
 static int cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -160,6 +164,9 @@ static int __init hvm_enable(void)
     if ( !fns->pvh_supported )
         printk(XENLOG_INFO "HVM: PVH mode not supported on this platform\n");
 
+    if ( !opt_altp2m_enabled )
+        hvm_funcs.altp2m_supported = 0;
+
     /*
      * Allow direct access to the PC debug ports 0x80 and 0xed (they are
      * often used for I/O delays, but the vmexits simply slow things down).
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index fc29b89..07527dd 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1841,6 +1841,7 @@ const struct hvm_function_table * __init start_vmx(void)
     if ( cpu_has_vmx_ept && (cpu_has_vmx_pat || opt_force_ept) )
     {
         vmx_function_table.hap_supported = 1;
+        vmx_function_table.altp2m_supported = 1;
 
         vmx_function_table.hap_capabilities = 0;
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 57f9605..c61cfe7 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -94,6 +94,9 @@ struct hvm_function_table {
     /* Necessary hardware support for PVH mode? */
     int pvh_supported;
 
+    /* Necessary hardware support for alternate p2m's? */
+    bool_t altp2m_supported;
+
     /* Indicate HAP capabilities. */
     int hap_capabilities;
 
@@ -509,6 +512,12 @@ bool_t nhvm_vmcx_hap_enabled(struct vcpu *v);
 /* interrupt */
 enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v);
 
+/* returns true if hardware supports alternate p2m's */
+static inline bool_t hvm_altp2m_supported(void)
+{
+    return hvm_funcs.altp2m_supported;
+}
+
 #ifndef NDEBUG
 /* Permit use of the Forced Emulation Prefix in HVM guests */
 extern bool_t opt_hvm_fep;
-- 
1.9.1

* [PATCH v4 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (3 preceding siblings ...)
  2015-07-10  0:52 ` [PATCH v4 04/15] x86/HVM: Hardware alternate p2m support detection Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  9:13   ` Jan Beulich
  2015-07-10  0:52 ` [PATCH v4 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Add the basic data structures needed to support alternate p2m's and
the functions to initialise them and tear them down.

Although Intel hardware can handle 512 EPTP's per hardware thread
concurrently, only 10 per domain are supported in this patch for
performance reasons.

The iterator in hap_enable() does need to handle 512, so that is now
uint16_t.

This change also splits the p2m lock into one lock type for altp2m's
and another type for all other p2m's. The purpose of this is to place
the altp2m list lock between the types, so the list lock can be
acquired whilst holding the host p2m lock.
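
The resulting permitted nesting, in order of acquisition, can be
sketched as follows (hostp2m/altp2m stand for the respective
struct p2m_domain pointers):

    p2m_lock(hostp2m);       /* "p2m" lock class */
    altp2m_list_lock(d);     /* list lock ordered between the two classes */
    p2m_lock(altp2m);        /* "altp2m" lock class */
    /* ... modify the alternate view ... */
    p2m_unlock(altp2m);
    altp2m_list_unlock(d);
    p2m_unlock(hostp2m);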

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/Makefile        |  1 +
 xen/arch/x86/hvm/altp2m.c        | 92 +++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/hvm.c           | 21 +++++++++
 xen/arch/x86/mm/hap/hap.c        | 32 ++++++++++++-
 xen/arch/x86/mm/mm-locks.h       | 38 +++++++++++++++-
 xen/arch/x86/mm/p2m.c            | 98 ++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/domain.h     | 10 ++++
 xen/include/asm-x86/hvm/altp2m.h | 38 ++++++++++++++++
 xen/include/asm-x86/hvm/hvm.h    | 17 +++++++
 xen/include/asm-x86/hvm/vcpu.h   |  9 ++++
 xen/include/asm-x86/p2m.h        | 30 +++++++++++-
 11 files changed, 382 insertions(+), 4 deletions(-)
 create mode 100644 xen/arch/x86/hvm/altp2m.c
 create mode 100644 xen/include/asm-x86/hvm/altp2m.h

diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index 69af47f..eb1a37b 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -1,6 +1,7 @@
 subdir-y += svm
 subdir-y += vmx
 
+obj-y += altp2m.o
 obj-y += asid.o
 obj-y += emulate.o
 obj-y += event.o
diff --git a/xen/arch/x86/hvm/altp2m.c b/xen/arch/x86/hvm/altp2m.c
new file mode 100644
index 0000000..f98a38d
--- /dev/null
+++ b/xen/arch/x86/hvm/altp2m.c
@@ -0,0 +1,92 @@
+/*
+ * Alternate p2m HVM
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#include <asm/hvm/support.h>
+#include <asm/hvm/hvm.h>
+#include <asm/p2m.h>
+#include <asm/hvm/altp2m.h>
+
+void
+altp2m_vcpu_reset(struct vcpu *v)
+{
+    struct altp2mvcpu *av = &vcpu_altp2m(v);
+
+    av->p2midx = INVALID_ALTP2M;
+    av->veinfo_gfn = _gfn(INVALID_GFN);
+
+    if ( hvm_funcs.ap2m_vcpu_reset )
+        hvm_funcs.ap2m_vcpu_reset(v);
+}
+
+int
+altp2m_vcpu_initialise(struct vcpu *v)
+{
+    int rc = -EOPNOTSUPP;
+
+    if ( v != current )
+        vcpu_pause(v);
+
+    if ( !hvm_funcs.ap2m_vcpu_initialise ||
+         (hvm_funcs.ap2m_vcpu_initialise(v) == 0) )
+    {
+        rc = 0;
+        altp2m_vcpu_reset(v);
+        vcpu_altp2m(v).p2midx = 0;
+        atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+
+        ap2m_vcpu_update_eptp(v);
+    }
+
+    if ( v != current )
+        vcpu_unpause(v);
+
+    return rc;
+}
+
+void
+altp2m_vcpu_destroy(struct vcpu *v)
+{
+    struct p2m_domain *p2m;
+
+    if ( v != current )
+        vcpu_pause(v);
+
+    if ( hvm_funcs.ap2m_vcpu_destroy )
+        hvm_funcs.ap2m_vcpu_destroy(v);
+
+    if ( (p2m = p2m_get_altp2m(v)) )
+        atomic_dec(&p2m->active_vcpus);
+
+    altp2m_vcpu_reset(v);
+
+    ap2m_vcpu_update_eptp(v);
+    ap2m_vcpu_update_vmfunc_ve(v);
+
+    if ( v != current )
+        vcpu_unpause(v);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4019658..dbb4696 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -58,6 +58,7 @@
 #include <asm/hvm/cacheattr.h>
 #include <asm/hvm/trace.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/altp2m.h>
 #include <asm/hvm/event.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
@@ -2380,6 +2381,7 @@ void hvm_vcpu_destroy(struct vcpu *v)
 {
     hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
 
+    altp2m_vcpu_destroy(v);
     nestedhvm_vcpu_destroy(v);
 
     free_compat_arg_xlat(v);
@@ -6498,6 +6500,25 @@ enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v)
     return hvm_funcs.nhvm_intr_blocked(v);
 }
 
+void ap2m_vcpu_update_eptp(struct vcpu *v)
+{
+    if ( hvm_funcs.ap2m_vcpu_update_eptp )
+        hvm_funcs.ap2m_vcpu_update_eptp(v);
+}
+
+void ap2m_vcpu_update_vmfunc_ve(struct vcpu *v)
+{
+    if ( hvm_funcs.ap2m_vcpu_update_vmfunc_ve )
+        hvm_funcs.ap2m_vcpu_update_vmfunc_ve(v);
+}
+
+bool_t ap2m_vcpu_emulate_ve(struct vcpu *v)
+{
+    if ( hvm_funcs.ap2m_vcpu_emulate_ve )
+        return hvm_funcs.ap2m_vcpu_emulate_ve(v);
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index d0d3f1e..a16a381 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)
 int hap_enable(struct domain *d, u32 mode)
 {
     unsigned int old_pages;
-    uint8_t i;
+    uint16_t i;
     int rv = 0;
 
     domain_pause(d);
@@ -498,6 +498,25 @@ int hap_enable(struct domain *d, u32 mode)
            goto out;
     }
 
+    /* Init alternate p2m data */
+    if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
+    {
+        rv = -ENOMEM;
+        goto out;
+    }
+
+    for ( i = 0; i < MAX_EPTP; i++ )
+        d->arch.altp2m_eptp[i] = INVALID_MFN;
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        rv = p2m_alloc_table(d->arch.altp2m_p2m[i]);
+        if ( rv != 0 )
+           goto out;
+    }
+
+    d->arch.altp2m_active = 0;
+
     /* Now let other users see the new mode */
     d->arch.paging.mode = mode | PG_HAP_enable;
 
@@ -510,6 +529,17 @@ void hap_final_teardown(struct domain *d)
 {
     uint8_t i;
 
+    d->arch.altp2m_active = 0;
+
+    if ( d->arch.altp2m_eptp )
+    {
+        free_xenheap_page(d->arch.altp2m_eptp);
+        d->arch.altp2m_eptp = NULL;
+    }
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+        p2m_teardown(d->arch.altp2m_p2m[i]);
+
     /* Destroy nestedp2m's first */
     for (i = 0; i < MAX_NESTEDP2M; i++) {
         p2m_teardown(d->arch.nested_p2m[i]);
diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index b4f035e..f137bad 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -217,7 +217,7 @@ declare_mm_lock(nestedp2m)
 #define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)->arch.nested_p2m_lock)
 #define nestedp2m_unlock(d) mm_unlock(&(d)->arch.nested_p2m_lock)
 
-/* P2M lock (per-p2m-table)
+/* P2M lock (per-non-alt-p2m-table)
  *
  * This protects all queries and updates to the p2m table.
  * Queries may be made under the read lock but all modifications
@@ -225,10 +225,44 @@ declare_mm_lock(nestedp2m)
  *
  * The write lock is recursive as it is common for a code path to look
  * up a gfn and later mutate it.
+ *
+ * Note that this lock shares its implementation with the altp2m
+ * lock (not the altp2m list lock), so the implementation
+ * is found there.
  */
 
 declare_mm_rwlock(p2m);
-#define p2m_lock(p)           mm_write_lock(p2m, &(p)->lock);
+
+/* Alternate P2M list lock (per-domain)
+ *
+ * A per-domain lock that protects the list of alternate p2m's.
+ * Any operation that walks the list needs to acquire this lock.
+ * Additionally, before destroying an alternate p2m all VCPU's
+ * in the target domain must be paused.
+ */
+
+declare_mm_lock(altp2mlist)
+#define altp2m_list_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_list_lock)
+#define altp2m_list_unlock(d) mm_unlock(&(d)->arch.altp2m_list_lock)
+
+/* P2M lock (per-altp2m-table)
+ *
+ * This protects all queries and updates to the p2m table.
+ * Queries may be made under the read lock but all modifications
+ * need the main (write) lock.
+ *
+ * The write lock is recursive as it is common for a code path to look
+ * up a gfn and later mutate it.
+ */
+
+declare_mm_rwlock(altp2m);
+#define p2m_lock(p)                         \
+{                                           \
+    if ( p2m_is_altp2m(p) )                 \
+        mm_write_lock(altp2m, &(p)->lock);  \
+    else                                    \
+        mm_write_lock(p2m, &(p)->lock);     \
+}
 #define p2m_unlock(p)         mm_write_unlock(&(p)->lock);
 #define gfn_lock(p,g,o)       p2m_lock(p)
 #define gfn_unlock(p,g,o)     p2m_unlock(p)
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6b39733..90ee651 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -35,6 +35,7 @@
 #include <asm/hvm/vmx/vmx.h> /* ept_p2m_init() */
 #include <asm/mem_sharing.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/altp2m.h>
 #include <asm/hvm/svm/amd-iommu-proto.h>
 #include <xsm/xsm.h>
 
@@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct domain *d)
     }
 }
 
+static void p2m_teardown_altp2m(struct domain *d)
+{
+    unsigned int i;
+    struct p2m_domain *p2m;
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( !d->arch.altp2m_p2m[i] )
+            continue;
+        p2m = d->arch.altp2m_p2m[i];
+        p2m_free_one(p2m);
+        d->arch.altp2m_p2m[i] = NULL;
+    }
+}
+
+static int p2m_init_altp2m(struct domain *d)
+{
+    uint8_t i;
+    struct p2m_domain *p2m;
+
+    mm_lock_init(&d->arch.altp2m_list_lock);
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        d->arch.altp2m_p2m[i] = p2m = p2m_init_one(d);
+        if ( p2m == NULL )
+        {
+            p2m_teardown_altp2m(d);
+            return -ENOMEM;
+        }
+        p2m->p2m_class = p2m_alternate;
+        p2m->access_required = 1;
+        _atomic_set(&p2m->active_vcpus, 0);
+    }
+
+    return 0;
+}
+
 int p2m_init(struct domain *d)
 {
     int rc;
@@ -196,7 +234,14 @@ int p2m_init(struct domain *d)
      * (p2m_init runs too early for HVM_PARAM_* options) */
     rc = p2m_init_nestedp2m(d);
     if ( rc )
+    {
         p2m_teardown_hostp2m(d);
+        return rc;
+    }
+
+    rc = p2m_init_altp2m(d);
+    if ( rc )
+        p2m_teardown_altp2m(d);
 
     return rc;
 }
@@ -1918,6 +1963,59 @@ int unmap_mmio_regions(struct domain *d,
     return err;
 }
 
+uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
+{
+    struct p2m_domain *p2m;
+    struct ept_data *ept;
+    uint16_t i;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
+            continue;
+
+        p2m = d->arch.altp2m_p2m[i];
+        ept = &p2m->ept;
+
+        if ( eptp == ept_get_eptp(ept) )
+            goto out;
+    }
+
+    i = INVALID_ALTP2M;
+
+out:
+    altp2m_list_unlock(d);
+    return i;
+}
+
+bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx)
+{
+    struct domain *d = v->domain;
+    bool_t rc = 0;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        if ( idx != vcpu_altp2m(v).p2midx )
+        {
+            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+            vcpu_altp2m(v).p2midx = idx;
+            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+            ap2m_vcpu_update_eptp(v);
+        }
+        rc = 1;
+    }
+
+    altp2m_list_unlock(d);
+    return rc;
+}
+
 /*** Audit ***/
 
 #if P2M_AUDIT
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 96bde65..294e089 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -234,6 +234,10 @@ struct paging_vcpu {
 typedef xen_domctl_cpuid_t cpuid_input_t;
 
 #define MAX_NESTEDP2M 10
+
+#define MAX_ALTP2M      ((uint16_t)10)
+#define INVALID_ALTP2M  ((uint16_t)~0)
+#define MAX_EPTP        (PAGE_SIZE / sizeof(uint64_t))
 struct p2m_domain;
 struct time_scale {
     int shift;
@@ -293,6 +297,12 @@ struct arch_domain
     struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
     mm_lock_t nested_p2m_lock;
 
+    /* altp2m: allow multiple copies of host p2m */
+    bool_t altp2m_active;
+    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
+    mm_lock_t altp2m_list_lock;
+    uint64_t *altp2m_eptp;
+
     /* NB. protected by d->event_lock and by irq_desc[irq].lock */
     struct radix_tree_root irq_pirq;
 
diff --git a/xen/include/asm-x86/hvm/altp2m.h b/xen/include/asm-x86/hvm/altp2m.h
new file mode 100644
index 0000000..1a6f22b
--- /dev/null
+++ b/xen/include/asm-x86/hvm/altp2m.h
@@ -0,0 +1,38 @@
+/*
+ * Alternate p2m HVM
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#ifndef _ALTP2M_H
+#define _ALTP2M_H
+
+#include <xen/types.h>
+#include <xen/sched.h>         /* for struct vcpu, struct domain */
+#include <asm/hvm/vcpu.h>      /* for vcpu_altp2m */
+
+/* Alternate p2m HVM on/off per domain */
+static inline bool_t altp2m_active(const struct domain *d)
+{
+    return d->arch.altp2m_active;
+}
+
+/* Alternate p2m VCPU */
+int altp2m_vcpu_initialise(struct vcpu *v);
+void altp2m_vcpu_destroy(struct vcpu *v);
+void altp2m_vcpu_reset(struct vcpu *v);
+
+#endif /* _ALTP2M_H */
+
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index c61cfe7..723b67c 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -210,6 +210,14 @@ struct hvm_function_table {
                                   uint32_t *ecx, uint32_t *edx);
 
     void (*enable_msr_exit_interception)(struct domain *d);
+
+    /* Alternate p2m */
+    int (*ap2m_vcpu_initialise)(struct vcpu *v);
+    void (*ap2m_vcpu_destroy)(struct vcpu *v);
+    int (*ap2m_vcpu_reset)(struct vcpu *v);
+    void (*ap2m_vcpu_update_eptp)(struct vcpu *v);
+    void (*ap2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
+    bool_t (*ap2m_vcpu_emulate_ve)(struct vcpu *v);
 };
 
 extern struct hvm_function_table hvm_funcs;
@@ -525,6 +533,15 @@ extern bool_t opt_hvm_fep;
 #define opt_hvm_fep 0
 #endif
 
+/* updates the current EPTP in VMCS */
+void ap2m_vcpu_update_eptp(struct vcpu *v);
+
+/* updates VMCS fields related to VMFUNC and #VE */
+void ap2m_vcpu_update_vmfunc_ve(struct vcpu *v);
+
+/* emulates #VE */
+bool_t ap2m_vcpu_emulate_ve(struct vcpu *v);
+
 #endif /* __ASM_X86_HVM_HVM_H__ */
 
 /*
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 3d8f4dc..09f25c1 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -118,6 +118,13 @@ struct nestedvcpu {
 
 #define vcpu_nestedhvm(v) ((v)->arch.hvm_vcpu.nvcpu)
 
+struct altp2mvcpu {
+    uint16_t    p2midx;         /* alternate p2m index */
+    gfn_t       veinfo_gfn;     /* #VE information page gfn */
+};
+
+#define vcpu_altp2m(v) ((v)->arch.hvm_vcpu.avcpu)
+
 struct hvm_vcpu {
     /* Guest control-register and EFER values, just as the guest sees them. */
     unsigned long       guest_cr[5];
@@ -163,6 +170,8 @@ struct hvm_vcpu {
 
     struct nestedvcpu   nvcpu;
 
+    struct altp2mvcpu   avcpu;
+
     struct mtrr_state   mtrr;
     u64                 pat_cr;
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index b49c09b..079a298 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -175,6 +175,7 @@ typedef unsigned int p2m_query_t;
 typedef enum {
     p2m_host,
     p2m_nested,
+    p2m_alternate,
 } p2m_class_t;
 
 /* Per-p2m-table state */
@@ -193,7 +194,7 @@ struct p2m_domain {
 
     struct domain     *domain;   /* back pointer to domain */
 
-    p2m_class_t       p2m_class; /* host/nested/? */
+    p2m_class_t       p2m_class; /* host/nested/alternate */
 
     /* Nested p2ms only: nested p2m base value that this p2m shadows.
      * This can be cleared to P2M_BASE_EADDR under the per-p2m lock but
@@ -219,6 +220,9 @@ struct p2m_domain {
      * host p2m's lock. */
     int                defer_nested_flush;
 
+    /* Alternate p2m: count of vcpu's currently using this p2m. */
+    atomic_t           active_vcpus;
+
     /* Pages used to construct the p2m */
     struct page_list_head pages;
 
@@ -317,6 +321,11 @@ static inline bool_t p2m_is_nestedp2m(const struct p2m_domain *p2m)
     return p2m->p2m_class == p2m_nested;
 }
 
+static inline bool_t p2m_is_altp2m(const struct p2m_domain *p2m)
+{
+    return p2m->p2m_class == p2m_alternate;
+}
+
 #define p2m_get_pagetable(p2m)  ((p2m)->phys_table)
 
 /**** p2m query accessors. They lock p2m_lock, and thus serialize
@@ -722,6 +731,25 @@ void nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
     l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
 
 /*
+ * Alternate p2m: shadow p2m tables used for alternate memory views
+ */
+
+/* get current alternate p2m table */
+static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    uint16_t index = vcpu_altp2m(v).p2midx;
+
+    return (index == INVALID_ALTP2M) ? NULL : d->arch.altp2m_p2m[index];
+}
+
+/* Locate an alternate p2m by its EPTP */
+uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp);
+
+/* Switch alternate p2m for a single vcpu */
+bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx);
+
+/*
  * p2m type to IOMMU flags
  */
 static inline unsigned int p2m_get_iommu_flags(p2m_type_t p2mt)
-- 
1.9.1

* [PATCH v4 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (4 preceding siblings ...)
  2015-07-10  0:52 ` [PATCH v4 05/15] x86/altp2m: basic data structures and support routines Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10 16:48   ` George Dunlap
  2015-07-10  0:52 ` [PATCH v4 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Implement and hook up the code to enable VMX support of VMFUNC and #VE.

VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.
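
Taken together with the later emulation patch, the guest-side flow this
enables looks like (hypothetical guest code, reusing the
vmfunc_switch_eptp() sketch from patch 02):

    vmfunc_switch_eptp(1);   /* enter a restricted view, no VM exit */
    /* ... run until a #VE arrives; its handler may switch back: */
    vmfunc_switch_eptp(0);   /* return to the default view */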

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 138 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 138 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 07527dd..28afdaa 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -56,6 +56,7 @@
 #include <asm/debugger.h>
 #include <asm/apic.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/altp2m.h>
 #include <asm/event.h>
 #include <asm/monitor.h>
 #include <public/arch-x86/cpuid.h>
@@ -1763,6 +1764,104 @@ static void vmx_enable_msr_exit_interception(struct domain *d)
                                          MSR_TYPE_W);
 }
 
+static void vmx_vcpu_update_eptp(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    struct p2m_domain *p2m = NULL;
+    struct ept_data *ept;
+
+    if ( altp2m_active(d) )
+        p2m = p2m_get_altp2m(v);
+    if ( !p2m )
+        p2m = p2m_get_hostp2m(d);
+
+    ept = &p2m->ept;
+    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
+
+    vmx_vmcs_enter(v);
+
+    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
+
+    if ( v->arch.hvm_vmx.secondary_exec_control &
+        SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
+        __vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);
+
+    vmx_vmcs_exit(v);
+}
+
+static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
+
+    if ( !cpu_has_vmx_vmfunc )
+        return;
+
+    if ( cpu_has_vmx_virt_exceptions )
+        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+
+    vmx_vmcs_enter(v);
+
+    if ( !d->is_dying && altp2m_active(d) )
+    {
+        v->arch.hvm_vmx.secondary_exec_control |= mask;
+        __vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
+        __vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
+
+        if ( cpu_has_vmx_virt_exceptions )
+        {
+            p2m_type_t t;
+            mfn_t mfn;
+
+            mfn = get_gfn_query_unlocked(d, gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
+
+            if ( mfn_x(mfn) != INVALID_MFN )
+                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
+            else
+                mask &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+        }
+    }
+    else
+        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
+
+    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
+        v->arch.hvm_vmx.secondary_exec_control);
+
+    vmx_vmcs_exit(v);
+}
+
+static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
+{
+    bool_t rc = 0;
+    ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) != INVALID_GFN ?
+        hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn), 0) : NULL;
+
+    if ( !veinfo )
+        return 0;
+
+    if ( veinfo->semaphore != 0 )
+        goto out;
+
+    rc = 1;
+
+    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
+    veinfo->semaphore = ~0l;
+    veinfo->eptp_index = vcpu_altp2m(v).p2midx;
+
+    vmx_vmcs_enter(v);
+    __vmread(EXIT_QUALIFICATION, &veinfo->exit_qualification);
+    __vmread(GUEST_LINEAR_ADDRESS, &veinfo->gla);
+    __vmread(GUEST_PHYSICAL_ADDRESS, &veinfo->gpa);
+    vmx_vmcs_exit(v);
+
+    hvm_inject_hw_exception(TRAP_virtualisation,
+                            HVM_DELIVER_NO_ERROR_CODE);
+
+out:
+    hvm_unmap_guest_frame(veinfo, 0);
+    return rc;
+}
+
 static struct hvm_function_table __initdata vmx_function_table = {
     .name                 = "VMX",
     .cpu_up_prepare       = vmx_cpu_up_prepare,
@@ -1822,6 +1921,9 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
     .hypervisor_cpuid_leaf = vmx_hypervisor_cpuid_leaf,
     .enable_msr_exit_interception = vmx_enable_msr_exit_interception,
+    .ap2m_vcpu_update_eptp = vmx_vcpu_update_eptp,
+    .ap2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
+    .ap2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2743,6 +2845,42 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 
     /* Now enable interrupts so it's safe to take locks. */
     local_irq_enable();
+ 
+    /*
+     * If the guest has the ability to switch EPTP without an exit,
+     * figure out whether it has done so and update the altp2m data.
+     */
+    if ( altp2m_active(v->domain) &&
+        (v->arch.hvm_vmx.secondary_exec_control &
+        SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
+    {
+        unsigned long idx;
+
+        if ( v->arch.hvm_vmx.secondary_exec_control &
+            SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
+            __vmread(EPTP_INDEX, &idx);
+        else
+        {
+            unsigned long eptp;
+
+            __vmread(EPT_POINTER, &eptp);
+
+            if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
+                 INVALID_ALTP2M )
+            {
+                gdprintk(XENLOG_ERR, "EPTP not found in alternate p2m list\n");
+                domain_crash(v->domain);
+            }
+        }
+
+        if ( (uint16_t)idx != vcpu_altp2m(v).p2midx )
+        {
+            BUG_ON(idx >= MAX_ALTP2M);
+            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+            vcpu_altp2m(v).p2midx = (uint16_t)idx;
+            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+        }
+    }
 
     /* XXX: This looks ugly, but we need a mechanism to ensure
      * any pending vmresume has really happened
-- 
1.9.1


* [PATCH v4 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (5 preceding siblings ...)
  2015-07-10  0:52 ` [PATCH v4 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  9:30   ` Jan Beulich
  2015-07-10  0:52 ` [PATCH v4 08/15] x86/altp2m: add control of suppress_ve Ed White
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Ravi Sahita <ravi.sahita@intel.com>
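
This wires the VMFUNC instruction into the x86 emulator so that EPTP
switching (VMFUNC leaf 0) can be emulated on hardware without native
VMFUNC support.  As a sketch of what a guest executes to request a
view switch (the helper name is made up; the 0f 01 d4 encoding and
the EAX=0/ECX=index convention match the emulator case added below):

    /* Sketch: select altp2m view 'idx' via VMFUNC leaf 0.  Without
     * hardware support the instruction faults and is handled by the
     * emulator hooks added in this patch. */
    static inline void vmfunc_switch_eptp(uint32_t idx)
    {
        asm volatile ( ".byte 0x0f, 0x01, 0xd4" /* vmfunc */
                       :: "a" (0), "c" (idx) : "memory" );
    }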

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
---
 xen/arch/x86/hvm/emulate.c             | 19 +++++++++++++++++--
 xen/arch/x86/hvm/vmx/vmx.c             | 29 +++++++++++++++++++++++++++++
 xen/arch/x86/x86_emulate/x86_emulate.c | 20 +++++++++++++++-----
 xen/arch/x86/x86_emulate/x86_emulate.h |  4 ++++
 xen/include/asm-x86/hvm/hvm.h          |  2 ++
 5 files changed, 67 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index fe5661d..1c90832 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1436,6 +1436,19 @@ static int hvmemul_invlpg(
     return rc;
 }
 
+static int hvmemul_vmfunc(
+    struct x86_emulate_ctxt *ctxt)
+{
+    int rc;
+
+    rc = hvm_funcs.ap2m_vcpu_emulate_vmfunc(ctxt->regs);
+    if ( rc != X86EMUL_OKAY )
+    {
+        hvmemul_inject_hw_exception(TRAP_invalid_op, 0, ctxt);
+    }
+    return rc;
+}
+
 static const struct x86_emulate_ops hvm_emulate_ops = {
     .read          = hvmemul_read,
     .insn_fetch    = hvmemul_insn_fetch,
@@ -1459,7 +1472,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
     .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
-    .invlpg        = hvmemul_invlpg
+    .invlpg        = hvmemul_invlpg,
+    .vmfunc        = hvmemul_vmfunc,
 };
 
 static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
@@ -1485,7 +1499,8 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
     .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
-    .invlpg        = hvmemul_invlpg
+    .invlpg        = hvmemul_invlpg,
+    .vmfunc        = hvmemul_vmfunc,
 };
 
 static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 28afdaa..2664673 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -82,6 +82,7 @@ static void vmx_fpu_dirty_intercept(void);
 static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content);
 static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
 static void vmx_invlpg_intercept(unsigned long vaddr);
+static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
 
 uint8_t __read_mostly posted_intr_vector;
 
@@ -1830,6 +1831,19 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
     vmx_vmcs_exit(v);
 }
 
+static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
+{
+    int rc = X86EMUL_EXCEPTION;
+    struct vcpu *curr = current;
+
+    if ( !cpu_has_vmx_vmfunc && altp2m_active(curr->domain) &&
+         regs->eax == 0 &&
+         p2m_switch_vcpu_altp2m_by_id(curr, (uint16_t)regs->ecx) )
+        rc = X86EMUL_OKAY;
+
+    return rc;
+}
+
 static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
 {
     bool_t rc = 0;
@@ -1898,6 +1912,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .msr_read_intercept   = vmx_msr_read_intercept,
     .msr_write_intercept  = vmx_msr_write_intercept,
     .invlpg_intercept     = vmx_invlpg_intercept,
+    .vmfunc_intercept     = vmx_vmfunc_intercept,
     .handle_cd            = vmx_handle_cd,
     .set_info_guest       = vmx_set_info_guest,
     .set_rdtsc_exiting    = vmx_set_rdtsc_exiting,
@@ -1924,6 +1939,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .ap2m_vcpu_update_eptp = vmx_vcpu_update_eptp,
     .ap2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
     .ap2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
+    .ap2m_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2095,6 +2111,12 @@ static void vmx_invlpg_intercept(unsigned long vaddr)
         vpid_sync_vcpu_gva(curr, vaddr);
 }
 
+static int vmx_vmfunc_intercept(struct cpu_user_regs *regs)
+{
+    gdprintk(XENLOG_ERR, "Failed guest VMFUNC execution\n");
+    return X86EMUL_EXCEPTION;
+}
+
 static int vmx_cr_access(unsigned long exit_qualification)
 {
     struct vcpu *curr = current;
@@ -3234,6 +3256,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             update_guest_eip();
         break;
 
+    case EXIT_REASON_VMFUNC:
+        if ( vmx_vmfunc_intercept(regs) == X86EMUL_EXCEPTION )
+            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        else
+            update_guest_eip();
+        break;
+
     case EXIT_REASON_MWAIT_INSTRUCTION:
     case EXIT_REASON_MONITOR_INSTRUCTION:
     case EXIT_REASON_GETSEC:
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index c017c69..d941771 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -3816,8 +3816,9 @@ x86_emulate(
         struct segment_register reg;
         unsigned long base, limit, cr0, cr0w;
 
-        if ( modrm == 0xdf ) /* invlpga */
+        switch( modrm )
         {
+        case 0xdf: /* invlpga AMD */
             generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
             generate_exception_if(!mode_ring0(), EXC_GP, 0);
             fail_if(ops->invlpg == NULL);
@@ -3825,10 +3826,7 @@ x86_emulate(
                                    ctxt)) )
                 goto done;
             break;
-        }
-
-        if ( modrm == 0xf9 ) /* rdtscp */
-        {
+        case 0xf9: /* rdtscp */ {
             uint64_t tsc_aux;
             fail_if(ops->read_msr == NULL);
             if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
@@ -3836,7 +3834,19 @@ x86_emulate(
             _regs.ecx = (uint32_t)tsc_aux;
             goto rdtsc;
         }
+        case 0xd4: /* vmfunc */
+            generate_exception_if(lock_prefix | rep_prefix() | (vex.pfx == vex_66),
+                                  EXC_UD, -1);
+            fail_if(ops->vmfunc == NULL);
+            if ( (rc = ops->vmfunc(ctxt)) != X86EMUL_OKAY )
+                goto done;
+            break;
+        default:
+            goto continue_grp7;
+        }
+        break;
 
+ continue_grp7:
         switch ( modrm_reg & 7 )
         {
         case 0: /* sgdt */
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 064b8f4..a4d4ec8 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -397,6 +397,10 @@ struct x86_emulate_ops
         enum x86_segment seg,
         unsigned long offset,
         struct x86_emulate_ctxt *ctxt);
+
+    /* vmfunc: Emulate the VMFUNC instruction, with inputs in EAX/ECX. */
+    int (*vmfunc)(
+        struct x86_emulate_ctxt *ctxt);
 };
 
 struct cpu_user_regs;
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 723b67c..fa21e84 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -167,6 +167,7 @@ struct hvm_function_table {
     int (*msr_read_intercept)(unsigned int msr, uint64_t *msr_content);
     int (*msr_write_intercept)(unsigned int msr, uint64_t msr_content);
     void (*invlpg_intercept)(unsigned long vaddr);
+    int (*vmfunc_intercept)(struct cpu_user_regs *regs);
     void (*handle_cd)(struct vcpu *v, unsigned long value);
     void (*set_info_guest)(struct vcpu *v);
     void (*set_rdtsc_exiting)(struct vcpu *v, bool_t);
@@ -218,6 +219,7 @@ struct hvm_function_table {
     void (*ap2m_vcpu_update_eptp)(struct vcpu *v);
     void (*ap2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
     bool_t (*ap2m_vcpu_emulate_ve)(struct vcpu *v);
+    int (*ap2m_vcpu_emulate_vmfunc)(struct cpu_user_regs *regs);
 };
 
 extern struct hvm_function_table hvm_funcs;
-- 
1.9.1


* [PATCH v4 08/15] x86/altp2m: add control of suppress_ve.
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (6 preceding siblings ...)
  2015-07-10  0:52 ` [PATCH v4 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  9:39   ` Jan Beulich
  2015-07-10 17:02   ` George Dunlap
  2015-07-10  0:52 ` [PATCH v4 09/15] x86/altp2m: alternate p2m memory events Ed White
                   ` (6 subsequent siblings)
  14 siblings, 2 replies; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: George Dunlap <george.dunlap@eu.citrix.com>

The existing ept_set_entry() and ept_get_entry() routines are extended
to optionally set/get suppress_ve.  Passing -1 for suppress_ve sets it on
newly created p2m entries and retains the existing flag on entries that
are rewritten.
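
For illustration, the two caller patterns against the modified hooks
(a sketch, not part of the patch; p2m/gfn as in the callers changed
below):

    p2m_type_t t;
    p2m_access_t a;
    bool_t sve;
    mfn_t mfn;
    int rc;

    /* Read the mapping together with its current suppress_ve bit. */
    mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, &sve);

    /* Rewrite it, preserving suppress_ve exactly as read... */
    rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, t, a, sve);

    /* ...or pass -1 to keep an existing entry's flag (1 for new ones). */
    rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, t, a, -1);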

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
---
 xen/arch/x86/mm/mem_sharing.c |  5 +++--
 xen/arch/x86/mm/p2m-ept.c     | 18 ++++++++++++----
 xen/arch/x86/mm/p2m-pod.c     | 12 +++++------
 xen/arch/x86/mm/p2m-pt.c      | 10 +++++++--
 xen/arch/x86/mm/p2m.c         | 50 ++++++++++++++++++++++---------------------
 xen/include/asm-x86/p2m.h     | 24 +++++++++++----------
 6 files changed, 70 insertions(+), 49 deletions(-)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 16e329e..5780a26 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1257,10 +1257,11 @@ int relinquish_shared_pages(struct domain *d)
         p2m_type_t t;
         mfn_t mfn;
         int set_rc;
+        bool_t sve;
 
         if ( atomic_read(&d->shr_pages) == 0 )
             break;
-        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, &sve);
         if ( mfn_valid(mfn) && (t == p2m_ram_shared) )
         {
             /* Does not fail with ENOMEM given the DESTROY flag */
@@ -1270,7 +1271,7 @@ int relinquish_shared_pages(struct domain *d)
              * unshare.  Must succeed: we just read the old entry and
              * we hold the p2m lock. */
             set_rc = p2m->set_entry(p2m, gfn, _mfn(0), PAGE_ORDER_4K,
-                                    p2m_invalid, p2m_access_rwx);
+                                    p2m_invalid, p2m_access_rwx, sve);
             ASSERT(set_rc == 0);
             count += 0x10;
         }
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 4111795..1106235 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -657,7 +657,8 @@ bool_t ept_handle_misconfig(uint64_t gpa)
  */
 static int
 ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn, 
-              unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma)
+              unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma,
+              int sve)
 {
     ept_entry_t *table, *ept_entry = NULL;
     unsigned long gfn_remainder = gfn;
@@ -803,7 +804,11 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
     }
 
-    new_entry.suppress_ve = 1;
+    if ( sve != -1 )
+        new_entry.suppress_ve = !!sve;
+    else
+        new_entry.suppress_ve = is_epte_valid(&old_entry) ?
+                                    old_entry.suppress_ve : 1;
 
     rc = atomic_write_ept_entry(ept_entry, new_entry, target);
     if ( unlikely(rc) )
@@ -850,8 +855,9 @@ out:
 
 /* Read ept p2m entries */
 static mfn_t ept_get_entry(struct p2m_domain *p2m,
-                           unsigned long gfn, p2m_type_t *t, p2m_access_t* a,
-                           p2m_query_t q, unsigned int *page_order)
+                            unsigned long gfn, p2m_type_t *t, p2m_access_t* a,
+                            p2m_query_t q, unsigned int *page_order,
+                            bool_t *sve)
 {
     ept_entry_t *table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
     unsigned long gfn_remainder = gfn;
@@ -865,6 +871,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
 
     *t = p2m_mmio_dm;
     *a = p2m_access_n;
+    if ( sve )
+        *sve = 1;
 
     /* This pfn is higher than the highest the p2m map currently holds */
     if ( gfn > p2m->max_mapped_pfn )
@@ -930,6 +938,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
         else
             *t = ept_entry->sa_p2mt;
         *a = ept_entry->access;
+        if ( sve )
+            *sve = ept_entry->suppress_ve;
 
         mfn = _mfn(ept_entry->mfn);
         if ( i )
diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 0679f00..a2f6d02 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -536,7 +536,7 @@ recount:
         p2m_access_t a;
         p2m_type_t t;
 
-        (void)p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL);
+        (void)p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL, NULL);
 
         if ( t == p2m_populate_on_demand )
             pod++;
@@ -587,7 +587,7 @@ recount:
         p2m_type_t t;
         p2m_access_t a;
 
-        mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL, NULL);
         if ( t == p2m_populate_on_demand )
         {
             p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0, p2m_invalid,
@@ -676,7 +676,7 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     for ( i=0; i<SUPERPAGE_PAGES; i++ )
     {
         p2m_access_t a; 
-        mfn = p2m->get_entry(p2m, gfn + i, &type, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gfn + i, &type, &a, 0, NULL, NULL);
 
         if ( i == 0 )
         {
@@ -808,7 +808,7 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
     for ( i=0; i<count; i++ )
     {
         p2m_access_t a;
-        mfns[i] = p2m->get_entry(p2m, gfns[i], types + i, &a, 0, NULL);
+        mfns[i] = p2m->get_entry(p2m, gfns[i], types + i, &a, 0, NULL, NULL);
         /* If this is ram, and not a pagetable or from the xen heap, and probably not mapped
            elsewhere, map it; otherwise, skip. */
         if ( p2m_is_ram(types[i])
@@ -947,7 +947,7 @@ p2m_pod_emergency_sweep(struct p2m_domain *p2m)
     for ( i=p2m->pod.reclaim_single; i > 0 ; i-- )
     {
         p2m_access_t a;
-        (void)p2m->get_entry(p2m, i, &t, &a, 0, NULL);
+        (void)p2m->get_entry(p2m, i, &t, &a, 0, NULL, NULL);
         if ( p2m_is_ram(t) )
         {
             gfns[j] = i;
@@ -1135,7 +1135,7 @@ guest_physmap_mark_populate_on_demand(struct domain *d, unsigned long gfn,
     for ( i = 0; i < (1UL << order); i++ )
     {
         p2m_access_t a;
-        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL);
+        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL, NULL);
         if ( p2m_is_ram(ot) )
         {
             P2M_DEBUG("gfn_to_mfn returned type %d!\n", ot);
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index e50b6fa..f6944cd 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -482,7 +482,8 @@ int p2m_pt_handle_deferred_changes(uint64_t gpa)
 /* Returns: 0 for success, -errno for failure */
 static int
 p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
-                 unsigned int page_order, p2m_type_t p2mt, p2m_access_t p2ma)
+                 unsigned int page_order, p2m_type_t p2mt, p2m_access_t p2ma,
+                 int sve)
 {
     /* XXX -- this might be able to be faster iff current->domain == d */
     void *table;
@@ -495,6 +496,8 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
     unsigned int iommu_pte_flags = p2m_get_iommu_flags(p2mt);
     unsigned long old_mfn = 0;
 
+    ASSERT(sve != 0);
+
     if ( tb_init_done )
     {
         struct {
@@ -689,7 +692,7 @@ static inline p2m_type_t recalc_type(bool_t recalc, p2m_type_t t,
 static mfn_t
 p2m_pt_get_entry(struct p2m_domain *p2m, unsigned long gfn,
                  p2m_type_t *t, p2m_access_t *a, p2m_query_t q,
-                 unsigned int *page_order)
+                 unsigned int *page_order, bool_t *sve)
 {
     mfn_t mfn;
     paddr_t addr = ((paddr_t)gfn) << PAGE_SHIFT;
@@ -701,6 +704,9 @@ p2m_pt_get_entry(struct p2m_domain *p2m, unsigned long gfn,
 
     ASSERT(paging_mode_translate(p2m->domain));
 
+    if ( sve )
+        *sve = 1;
+
     /* XXX This is for compatibility with the old model, where anything not 
      * XXX marked as RAM was considered to be emulated MMIO space.
      * XXX Once we start explicitly registering MMIO regions in the p2m 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 90ee651..561a83c 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -342,7 +342,7 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
         /* Grab the lock here, don't release until put_gfn */
         gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order);
+    mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL);
 
     if ( (q & P2M_UNSHARE) && p2m_is_shared(*t) )
     {
@@ -351,7 +351,7 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
          * sleeping. */
         if ( mem_sharing_unshare_page(p2m->domain, gfn, 0) < 0 )
             (void)mem_sharing_notify_enomem(p2m->domain, gfn, 0);
-        mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order);
+        mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL);
     }
 
     if (unlikely((p2m_is_broken(*t))))
@@ -455,7 +455,7 @@ int p2m_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         else
             order = 0;
 
-        set_rc = p2m->set_entry(p2m, gfn, mfn, order, p2mt, p2ma);
+        set_rc = p2m->set_entry(p2m, gfn, mfn, order, p2mt, p2ma, -1);
         if ( set_rc )
             rc = set_rc;
 
@@ -619,7 +619,7 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn, unsigned long mfn,
     {
         for ( i = 0; i < (1UL << page_order); i++ )
         {
-            mfn_return = p2m->get_entry(p2m, gfn + i, &t, &a, 0, NULL);
+            mfn_return = p2m->get_entry(p2m, gfn + i, &t, &a, 0, NULL, NULL);
             if ( !p2m_is_grant(t) && !p2m_is_shared(t) && !p2m_is_foreign(t) )
                 set_gpfn_from_mfn(mfn+i, INVALID_M2P_ENTRY);
             ASSERT( !p2m_is_valid(t) || mfn + i == mfn_x(mfn_return) );
@@ -682,7 +682,7 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
     /* First, remove m->p mappings for existing p->m mappings */
     for ( i = 0; i < (1UL << page_order); i++ )
     {
-        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL);
+        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL, NULL);
         if ( p2m_is_shared(ot) )
         {
             /* Do an unshare to cleanly take care of all corner 
@@ -706,7 +706,7 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
                 (void)mem_sharing_notify_enomem(p2m->domain, gfn + i, 0);
                 return rc;
             }
-            omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL);
+            omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL, NULL);
             ASSERT(!p2m_is_shared(ot));
         }
         if ( p2m_is_grant(ot) || p2m_is_foreign(ot) )
@@ -754,7 +754,7 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
              * address */
             P2M_DEBUG("aliased! mfn=%#lx, old gfn=%#lx, new gfn=%#lx\n",
                       mfn + i, ogfn, gfn + i);
-            omfn = p2m->get_entry(p2m, ogfn, &ot, &a, 0, NULL);
+            omfn = p2m->get_entry(p2m, ogfn, &ot, &a, 0, NULL, NULL);
             if ( p2m_is_ram(ot) && !p2m_is_paged(ot) )
             {
                 ASSERT(mfn_valid(omfn));
@@ -821,7 +821,7 @@ int p2m_change_type_one(struct domain *d, unsigned long gfn,
 
     gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, &pt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &pt, &a, 0, NULL, NULL);
     rc = likely(pt == ot)
          ? p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, nt,
                          p2m->default_access)
@@ -905,7 +905,7 @@ static int set_typed_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
         return -EIO;
 
     gfn_lock(p2m, gfn, 0);
-    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL);
+    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL, NULL);
     if ( p2m_is_grant(ot) || p2m_is_foreign(ot) )
     {
         p2m_unlock(p2m);
@@ -956,7 +956,7 @@ int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
         return -EIO;
 
     gfn_lock(p2m, gfn, 0);
-    actual_mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL);
+    actual_mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
 
     /* Do not use mfn_valid() here as it will usually fail for MMIO pages. */
     if ( (INVALID_MFN == mfn_x(actual_mfn)) || (t != p2m_mmio_direct) )
@@ -992,7 +992,7 @@ int set_shared_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
         return -EIO;
 
     gfn_lock(p2m, gfn, 0);
-    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL);
+    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL, NULL);
     /* At the moment we only allow p2m change if gfn has already been made
      * sharable first */
     ASSERT(p2m_is_shared(ot));
@@ -1044,7 +1044,7 @@ int p2m_mem_paging_nominate(struct domain *d, unsigned long gfn)
 
     gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
 
     /* Check if mfn is valid */
     if ( !mfn_valid(mfn) )
@@ -1106,7 +1106,7 @@ int p2m_mem_paging_evict(struct domain *d, unsigned long gfn)
     gfn_lock(p2m, gfn, 0);
 
     /* Get mfn */
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
     if ( unlikely(!mfn_valid(mfn)) )
         goto out;
 
@@ -1238,7 +1238,7 @@ void p2m_mem_paging_populate(struct domain *d, unsigned long gfn)
 
     /* Fix p2m mapping */
     gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
     /* Allow only nominated or evicted pages to enter page-in path */
     if ( p2mt == p2m_ram_paging_out || p2mt == p2m_ram_paged )
     {
@@ -1300,7 +1300,7 @@ int p2m_mem_paging_prep(struct domain *d, unsigned long gfn, uint64_t buffer)
 
     gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
 
     ret = -ENOENT;
     /* Allow missing pages */
@@ -1388,7 +1388,7 @@ void p2m_mem_paging_resume(struct domain *d, vm_event_response_t *rsp)
         unsigned long gfn = rsp->u.mem_access.gfn;
 
         gfn_lock(p2m, gfn, 0);
-        mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
         /*
          * Allow only pages which were prepared properly, or pages which
          * were nominated but not evicted.
@@ -1528,16 +1528,17 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     vm_event_request_t *req;
     int rc;
     unsigned long eip = guest_cpu_user_regs()->eip;
+    bool_t sve;
 
     /* First, handle rx2rw conversion automatically.
      * These calls to p2m->set_entry() must succeed: we have the gfn
      * locked and just did a successful get_entry(). */
     gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
 
     if ( npfec.write_access && p2ma == p2m_access_rx2rw ) 
     {
-        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw);
+        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw, sve);
         ASSERT(rc == 0);
         gfn_unlock(p2m, gfn, 0);
         return 1;
@@ -1546,7 +1547,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     {
         ASSERT(npfec.write_access || npfec.read_access || npfec.insn_fetch);
         rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
-                            p2mt, p2m_access_rwx);
+                            p2mt, p2m_access_rwx, -1);
         ASSERT(rc == 0);
     }
     gfn_unlock(p2m, gfn, 0);
@@ -1566,14 +1567,14 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
         else
         {
             gfn_lock(p2m, gfn, 0);
-            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
             if ( p2ma != p2m_access_n2rwx )
             {
                 /* A listener is not required, so clear the access
                  * restrictions.  This set must succeed: we have the
                  * gfn locked and just did a successful get_entry(). */
                 rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
-                                    p2mt, p2m_access_rwx);
+                                    p2mt, p2m_access_rwx, sve);
                 ASSERT(rc == 0);
             }
             gfn_unlock(p2m, gfn, 0);
@@ -1652,6 +1653,7 @@ long p2m_set_mem_access(struct domain *d, unsigned long pfn, uint32_t nr,
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     p2m_access_t a, _a;
+    bool_t sve;
     p2m_type_t t;
     mfn_t mfn;
     long rc = 0;
@@ -1693,8 +1695,8 @@ long p2m_set_mem_access(struct domain *d, unsigned long pfn, uint32_t nr,
     p2m_lock(p2m);
     for ( pfn += start; nr > start; ++pfn )
     {
-        mfn = p2m->get_entry(p2m, pfn, &t, &_a, 0, NULL);
-        rc = p2m->set_entry(p2m, pfn, mfn, PAGE_ORDER_4K, t, a);
+        mfn = p2m->get_entry(p2m, pfn, &t, &_a, 0, NULL, &sve);
+        rc = p2m->set_entry(p2m, pfn, mfn, PAGE_ORDER_4K, t, a, sve);
         if ( rc )
             break;
 
@@ -1742,7 +1744,7 @@ int p2m_get_mem_access(struct domain *d, unsigned long pfn,
     }
 
     gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, pfn, &t, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, pfn, &t, &a, 0, NULL, NULL);
     gfn_unlock(p2m, gfn, 0);
 
     if ( mfn_x(mfn) == INVALID_MFN )
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 079a298..0a172e0 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -226,17 +226,19 @@ struct p2m_domain {
     /* Pages used to construct the p2m */
     struct page_list_head pages;
 
-    int                (*set_entry   )(struct p2m_domain *p2m,
-                                       unsigned long gfn,
-                                       mfn_t mfn, unsigned int page_order,
-                                       p2m_type_t p2mt,
-                                       p2m_access_t p2ma);
-    mfn_t              (*get_entry   )(struct p2m_domain *p2m,
-                                       unsigned long gfn,
-                                       p2m_type_t *p2mt,
-                                       p2m_access_t *p2ma,
-                                       p2m_query_t q,
-                                       unsigned int *page_order);
+    int                (*set_entry)(struct p2m_domain *p2m,
+                                         unsigned long gfn,
+                                         mfn_t mfn, unsigned int page_order,
+                                         p2m_type_t p2mt,
+                                         p2m_access_t p2ma,
+                                         int sve);
+    mfn_t              (*get_entry)(struct p2m_domain *p2m,
+                                         unsigned long gfn,
+                                         p2m_type_t *p2mt,
+                                         p2m_access_t *p2ma,
+                                         p2m_query_t q,
+                                         unsigned int *page_order,
+                                         bool_t *sve);
     void               (*enable_hardware_log_dirty)(struct p2m_domain *p2m);
     void               (*disable_hardware_log_dirty)(struct p2m_domain *p2m);
     void               (*flush_hardware_cached_dirty)(struct p2m_domain *p2m);
-- 
1.9.1


* [PATCH v4 09/15] x86/altp2m: alternate p2m memory events.
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (7 preceding siblings ...)
  2015-07-10  0:52 ` [PATCH v4 08/15] x86/altp2m: add control of suppress_ve Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  1:01   ` Lengyel, Tamas
  2015-07-10  0:52 ` [PATCH v4 10/15] x86/altp2m: add remaining support routines Ed White
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Add a flag to indicate that a memory event occurred in an alternate p2m
and a field containing the p2m index. Allow any event response to switch
to a different alternate p2m using the same flag and field.

Modify p2m_mem_access_check() to handle alternate p2m's.
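
From a vm_event listener's point of view, a response that moves the
faulting VCPU into another view could be filled in as below (a sketch
using only fields from the public header; 'target_idx' is a made-up
view index):

    vm_event_response_t rsp = {
        .reason     = req.reason,
        .vcpu_id    = req.vcpu_id,
        .flags      = VM_EVENT_FLAG_VCPU_PAUSED |
                      VM_EVENT_FLAG_ALTERNATE_P2M,
        .altp2m_idx = target_idx,   /* view to resume in */
    };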

Signed-off-by: Ed White <edmund.h.white@intel.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> for the x86 bits.
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
---
 xen/arch/x86/mm/p2m.c         | 19 ++++++++++++++++++-
 xen/common/vm_event.c         |  4 ++++
 xen/include/asm-arm/p2m.h     |  6 ++++++
 xen/include/asm-x86/p2m.h     |  3 +++
 xen/include/public/vm_event.h | 11 +++++++++++
 5 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 561a83c..d4d1ba1 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1514,6 +1514,12 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
     }
 }
 
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
+{
+    if ( altp2m_active(v->domain) )
+        p2m_switch_vcpu_altp2m_by_id(v, idx);
+}
+
 bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
                             struct npfec npfec,
                             vm_event_request_t **req_ptr)
@@ -1521,7 +1527,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     struct vcpu *v = current;
     unsigned long gfn = gpa >> PAGE_SHIFT;
     struct domain *d = v->domain;    
-    struct p2m_domain* p2m = p2m_get_hostp2m(d);
+    struct p2m_domain *p2m = NULL;
     mfn_t mfn;
     p2m_type_t p2mt;
     p2m_access_t p2ma;
@@ -1530,6 +1536,11 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     unsigned long eip = guest_cpu_user_regs()->eip;
     bool_t sve;
 
+    if ( altp2m_active(d) )
+        p2m = p2m_get_altp2m(v);
+    if ( !p2m )
+        p2m = p2m_get_hostp2m(d);
+
     /* First, handle rx2rw conversion automatically.
      * These calls to p2m->set_entry() must succeed: we have the gfn
      * locked and just did a successful get_entry(). */
@@ -1636,6 +1647,12 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
         req->vcpu_id = v->vcpu_id;
 
         p2m_vm_event_fill_regs(req);
+
+        if ( altp2m_active(v->domain) )
+        {
+            req->flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
+            req->altp2m_idx = vcpu_altp2m(v).p2midx;
+        }
     }
 
     /* Pause the current VCPU */
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 120a78a..13224e2 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -399,6 +399,10 @@ void vm_event_resume(struct domain *d, struct vm_event_domain *ved)
 
         };
 
+        /* Check for altp2m switch */
+        if ( rsp.flags & VM_EVENT_FLAG_ALTERNATE_P2M )
+            p2m_altp2m_check(v, rsp.altp2m_idx);
+
         /* Unpause domain. */
         if ( rsp.flags & VM_EVENT_FLAG_VCPU_PAUSED )
             vm_event_vcpu_unpause(v);
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 63748ef..08bdce3 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -109,6 +109,12 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
     /* Not supported on ARM. */
 }
 
+static inline
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
+{
+    /* Not supported on ARM. */
+}
+
 #define p2m_is_foreign(_t)  ((_t) == p2m_map_foreign)
 #define p2m_is_ram(_t)      ((_t) == p2m_ram_rw || (_t) == p2m_ram_ro)
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 0a172e0..722e54c 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -751,6 +751,9 @@ uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp);
 /* Switch alternate p2m for a single vcpu */
 bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx);
 
+/* Check to see if vcpu should be switched to a different p2m. */
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
+
 /*
  * p2m type to IOMMU flags
  */
diff --git a/xen/include/public/vm_event.h b/xen/include/public/vm_event.h
index 577e971..6dfa9db 100644
--- a/xen/include/public/vm_event.h
+++ b/xen/include/public/vm_event.h
@@ -47,6 +47,16 @@
 #define VM_EVENT_FLAG_VCPU_PAUSED     (1 << 0)
 /* Flags to aid debugging mem_event */
 #define VM_EVENT_FLAG_FOREIGN         (1 << 1)
+/*
+ * This flag can be set in a request or a response.
+ *
+ * On a request, indicates that the event occurred in the alternate p2m
+ * specified by the altp2m_idx request field.
+ *
+ * On a response, indicates that the VCPU should resume in the alternate
+ * p2m specified by the altp2m_idx response field, if possible.
+ */
+#define VM_EVENT_FLAG_ALTERNATE_P2M   (1 << 2)
 
 /*
  * Reasons for the vm event request
@@ -194,6 +204,7 @@ typedef struct vm_event_st {
     uint32_t flags;     /* VM_EVENT_FLAG_* */
     uint32_t reason;    /* VM_EVENT_REASON_* */
     uint32_t vcpu_id;
+    uint16_t altp2m_idx; /* may be used during request and response */
 
     union {
         struct vm_event_paging                mem_paging;
-- 
1.9.1


* [PATCH v4 10/15] x86/altp2m: add remaining support routines.
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (8 preceding siblings ...)
  2015-07-10  0:52 ` [PATCH v4 09/15] x86/altp2m: alternate p2m memory events Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  9:41   ` Jan Beulich
  2015-07-10  0:52 ` [PATCH v4 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Add the remaining routines required to support enabling the alternate
p2m functionality.
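
Roughly how the routines compose (a sketch with error handling
elided; 'd' is the domain and 'gfn' a gfn_t of interest; the HVMOP
plumbing added later in the series follows the same shape):

    uint16_t idx;

    /* Create a view, restrict a page in it, then make it current. */
    if ( !p2m_init_next_altp2m(d, &idx) )
    {
        if ( p2m_set_altp2m_mem_access(d, idx, gfn, XENMEM_access_r) ||
             p2m_switch_domain_altp2m_by_id(d, idx) )
            p2m_destroy_altp2m_by_id(d, idx);
    }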

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c           |  58 +++++-
 xen/arch/x86/mm/hap/Makefile     |   1 +
 xen/arch/x86/mm/hap/altp2m_hap.c |  98 ++++++++++
 xen/arch/x86/mm/p2m-ept.c        |   3 +
 xen/arch/x86/mm/p2m.c            | 385 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/altp2m.h |   4 +
 xen/include/asm-x86/p2m.h        |  33 ++++
 7 files changed, 576 insertions(+), 6 deletions(-)
 create mode 100644 xen/arch/x86/mm/hap/altp2m_hap.c

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index dbb4696..bda6c1e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2802,10 +2802,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     mfn_t mfn;
     struct vcpu *curr = current;
     struct domain *currd = curr->domain;
-    struct p2m_domain *p2m;
+    struct p2m_domain *p2m, *hostp2m;
     int rc, fall_through = 0, paged = 0;
     int sharing_enomem = 0;
     vm_event_request_t *req_ptr = NULL;
+    bool_t ap2m_active = 0;
 
     /* On Nested Virtualization, walk the guest page table.
      * If this succeeds, all is fine.
@@ -2865,11 +2866,31 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
         goto out;
     }
 
-    p2m = p2m_get_hostp2m(currd);
-    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 
+    ap2m_active = altp2m_active(currd);
+
+    /* Take a lock on the host p2m speculatively, to avoid potential
+     * locking order problems later and to handle unshare etc.
+     */
+    hostp2m = p2m_get_hostp2m(currd);
+    mfn = get_gfn_type_access(hostp2m, gfn, &p2mt, &p2ma,
                               P2M_ALLOC | (npfec.write_access ? P2M_UNSHARE : 0),
                               NULL);
 
+    if ( ap2m_active )
+    {
+        if ( altp2m_hap_nested_page_fault(curr, gpa, gla, npfec, &p2m) == 1 )
+        {
+            /* entry was lazily copied from host -- retry */
+            __put_gfn(hostp2m, gfn);
+            rc = 1;
+            goto out;
+        }
+
+        mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+    }
+    else
+        p2m = hostp2m;
+
     /* Check access permissions first, then handle faults */
     if ( mfn_x(mfn) != INVALID_MFN )
     {
@@ -2909,6 +2930,20 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
 
         if ( violation )
         {
+            /* Should #VE be emulated for this fault? */
+            if ( p2m_is_altp2m(p2m) && !cpu_has_vmx_virt_exceptions )
+            {
+                bool_t sve;
+
+                p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
+
+                if ( !sve && ap2m_vcpu_emulate_ve(curr) )
+                {
+                    rc = 1;
+                    goto out_put_gfn;
+                }
+            }
+
             if ( p2m_mem_access_check(gpa, gla, npfec, &req_ptr) )
             {
                 fall_through = 1;
@@ -2928,7 +2963,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
          (npfec.write_access &&
           (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
     {
-        put_gfn(currd, gfn);
+        __put_gfn(p2m, gfn);
+        if ( ap2m_active )
+            __put_gfn(hostp2m, gfn);
 
         rc = 0;
         if ( unlikely(is_pvh_domain(currd)) )
@@ -2957,6 +2994,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     /* Spurious fault? PoD and log-dirty also take this path. */
     if ( p2m_is_ram(p2mt) )
     {
+        rc = 1;
         /*
          * Page log dirty is always done with order 0. If this mfn resides in
          * a large page, we do not change other pages type within that large
@@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
         if ( npfec.write_access )
         {
             paging_mark_dirty(currd, mfn_x(mfn));
+            /* If p2m is really an altp2m, unlock here to avoid lock ordering
+             * violation when the change below is propagated from host p2m */
+            if ( ap2m_active )
+                __put_gfn(p2m, gfn);
             p2m_change_type_one(currd, gfn, p2m_ram_logdirty, p2m_ram_rw);
+            __put_gfn(ap2m_active ? hostp2m : p2m, gfn);
+
+            goto out;
         }
-        rc = 1;
         goto out_put_gfn;
     }
 
@@ -2977,7 +3021,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     rc = fall_through;
 
 out_put_gfn:
-    put_gfn(currd, gfn);
+    __put_gfn(p2m, gfn);
+    if ( ap2m_active )
+        __put_gfn(hostp2m, gfn);
 out:
     /* All of these are delayed until we exit, since we might 
      * sleep on event ring wait queues, and we must not hold
diff --git a/xen/arch/x86/mm/hap/Makefile b/xen/arch/x86/mm/hap/Makefile
index 68f2bb5..216cd90 100644
--- a/xen/arch/x86/mm/hap/Makefile
+++ b/xen/arch/x86/mm/hap/Makefile
@@ -4,6 +4,7 @@ obj-y += guest_walk_3level.o
 obj-$(x86_64) += guest_walk_4level.o
 obj-y += nested_hap.o
 obj-y += nested_ept.o
+obj-y += altp2m_hap.o
 
 guest_walk_%level.o: guest_walk.c Makefile
 	$(CC) $(CFLAGS) -DGUEST_PAGING_LEVELS=$* -c $< -o $@
diff --git a/xen/arch/x86/mm/hap/altp2m_hap.c b/xen/arch/x86/mm/hap/altp2m_hap.c
new file mode 100644
index 0000000..52c7877
--- /dev/null
+++ b/xen/arch/x86/mm/hap/altp2m_hap.c
@@ -0,0 +1,98 @@
+/******************************************************************************
+ * arch/x86/mm/hap/altp2m_hap.c
+ *
+ * Copyright (c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <asm/domain.h>
+#include <asm/page.h>
+#include <asm/paging.h>
+#include <asm/p2m.h>
+#include <asm/hap.h>
+#include <asm/hvm/altp2m.h>
+
+#include "private.h"
+
+/*
+ * If the fault is for a not present entry:
+ *     if the entry in the host p2m has a valid mfn, copy it and retry
+ *     else indicate that outer handler should handle fault
+ *
+ * If the fault is for a present entry:
+ *     indicate that outer handler should handle fault
+ */
+
+int
+altp2m_hap_nested_page_fault(struct vcpu *v, paddr_t gpa,
+                                unsigned long gla, struct npfec npfec,
+                                struct p2m_domain **ap2m)
+{
+    struct p2m_domain *hp2m = p2m_get_hostp2m(v->domain);
+    p2m_type_t p2mt;
+    p2m_access_t p2ma;
+    unsigned int page_order;
+    gfn_t gfn = _gfn(paddr_to_pfn(gpa));
+    unsigned long mask;
+    mfn_t mfn;
+    int rv;
+
+    *ap2m = p2m_get_altp2m(v);
+
+    mfn = get_gfn_type_access(*ap2m, gfn_x(gfn), &p2mt, &p2ma,
+                              0, &page_order);
+    __put_gfn(*ap2m, gfn_x(gfn));
+
+    if ( mfn_x(mfn) != INVALID_MFN )
+        return 0;
+
+    mfn = get_gfn_type_access(hp2m, gfn_x(gfn), &p2mt, &p2ma,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order);
+    put_gfn(hp2m->domain, gfn_x(gfn));
+
+    if ( mfn_x(mfn) == INVALID_MFN )
+        return 0;
+
+    p2m_lock(*ap2m);
+
+    /* If this is a superpage mapping, round down both frame numbers
+     * to the start of the superpage. */
+    mask = ~((1UL << page_order) - 1);
+    mfn = _mfn(mfn_x(mfn) & mask);
+
+    rv = p2m_set_entry(*ap2m, gfn_x(gfn) & mask, mfn, page_order, p2mt, p2ma);
+    p2m_unlock(*ap2m);
+
+    if ( rv )
+    {
+        gdprintk(XENLOG_ERR,
+                 "failed to set entry for %#"PRIx64" -> %#"PRIx64" p2m %#"PRIx64"\n",
+                 gfn_x(gfn), mfn_x(mfn), (unsigned long)*ap2m);
+        domain_crash(hp2m->domain);
+    }
+
+    return 1;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 1106235..15a69f0 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -850,6 +850,9 @@ out:
     if ( is_epte_present(&old_entry) )
         ept_free_entry(p2m, &old_entry, target);
 
+    if ( rc == 0 && p2m_is_hostp2m(p2m) )
+        p2m_altp2m_propagate_change(d, _gfn(gfn), mfn, order, p2mt, p2ma);
+
     return rc;
 }
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index d4d1ba1..ced60fe 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2035,6 +2035,391 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx)
     return rc;
 }
 
+void p2m_flush_altp2m(struct domain *d)
+{
+    uint16_t i;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        p2m_flush_table(d->arch.altp2m_p2m[i]);
+        /* Uninit and reinit ept to force TLB shootdown */
+        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
+        ept_p2m_init(d->arch.altp2m_p2m[i]);
+        d->arch.altp2m_eptp[i] = INVALID_MFN;
+    }
+
+    altp2m_list_unlock(d);
+}
+
+static void p2m_init_altp2m_helper(struct domain *d, uint16_t i)
+{
+    struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
+    struct ept_data *ept;
+
+    p2m->min_remapped_gfn = INVALID_GFN;
+    p2m->max_remapped_gfn = INVALID_GFN;
+    ept = &p2m->ept;
+    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
+    d->arch.altp2m_eptp[i] = ept_get_eptp(ept);
+}
+
+long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx)
+{
+    long rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] == INVALID_MFN )
+    {
+        p2m_init_altp2m_helper(d, idx);
+        rc = 0;
+    }
+
+    altp2m_list_unlock(d);
+    return rc;
+}
+
+long p2m_init_next_altp2m(struct domain *d, uint16_t *idx)
+{
+    long rc = -EINVAL;
+    uint16_t i;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] != INVALID_MFN )
+            continue;
+
+        p2m_init_altp2m_helper(d, i);
+        *idx = i;
+        rc = 0;
+
+        break;
+    }
+
+    altp2m_list_unlock(d);
+    return rc;
+}
+
+long p2m_destroy_altp2m_by_id(struct domain *d, uint16_t idx)
+{
+    struct p2m_domain *p2m;
+    long rc = -EINVAL;
+
+    if ( !idx || idx >= MAX_ALTP2M )
+        return rc;
+
+    domain_pause_except_self(d);
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        p2m = d->arch.altp2m_p2m[idx];
+
+        if ( !_atomic_read(p2m->active_vcpus) )
+        {
+            p2m_flush_table(d->arch.altp2m_p2m[idx]);
+            /* Uninit and reinit ept to force TLB shootdown */
+            ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
+            ept_p2m_init(d->arch.altp2m_p2m[idx]);
+            d->arch.altp2m_eptp[idx] = INVALID_MFN;
+            rc = 0;
+        }
+    }
+
+    altp2m_list_unlock(d);
+
+    domain_unpause_except_self(d);
+
+    return rc;
+}
+
+long p2m_switch_domain_altp2m_by_id(struct domain *d, uint16_t idx)
+{
+    struct vcpu *v;
+    long rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    domain_pause_except_self(d);
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        for_each_vcpu( d, v )
+            if ( idx != vcpu_altp2m(v).p2midx )
+            {
+                atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+                vcpu_altp2m(v).p2midx = idx;
+                atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+                ap2m_vcpu_update_eptp(v);
+            }
+
+        rc = 0;
+    }
+
+    altp2m_list_unlock(d);
+
+    domain_unpause_except_self(d);
+
+    return rc;
+}
+
+long p2m_set_altp2m_mem_access(struct domain *d, uint16_t idx,
+                                 gfn_t gfn, xenmem_access_t access)
+{
+    struct p2m_domain *hp2m, *ap2m;
+    p2m_access_t req_a, old_a;
+    p2m_type_t t;
+    mfn_t mfn;
+    unsigned int page_order;
+    long rc = -EINVAL;
+
+    static const p2m_access_t memaccess[] = {
+#define ACCESS(ac) [XENMEM_access_##ac] = p2m_access_##ac
+        ACCESS(n),
+        ACCESS(r),
+        ACCESS(w),
+        ACCESS(rw),
+        ACCESS(x),
+        ACCESS(rx),
+        ACCESS(wx),
+        ACCESS(rwx),
+#undef ACCESS
+    };
+
+    if ( idx >= MAX_ALTP2M || d->arch.altp2m_eptp[idx] == INVALID_MFN )
+        return rc;
+
+    ap2m = d->arch.altp2m_p2m[idx];
+
+    switch ( access )
+    {
+    case 0 ... ARRAY_SIZE(memaccess) - 1:
+        req_a = memaccess[access];
+        break;
+    case XENMEM_access_default:
+        req_a = ap2m->default_access;
+        break;
+    default:
+        return rc;
+    }
+
+    /* If request to set default access */
+    if ( gfn_x(gfn) == INVALID_GFN )
+    {
+        ap2m->default_access = req_a;
+        return 0;
+    }
+
+    hp2m = p2m_get_hostp2m(d);
+
+    p2m_lock(ap2m);
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(gfn), &t, &old_a, 0, NULL, NULL);
+
+    /* Check host p2m if no valid entry in alternate */
+    if ( !mfn_valid(mfn) )
+    {
+        mfn = hp2m->get_entry(hp2m, gfn_x(gfn), &t, &old_a,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order, NULL);
+
+        if ( !mfn_valid(mfn) || t != p2m_ram_rw )
+            goto out;
+
+        /* If this is a superpage, copy that first */
+        if ( page_order != PAGE_ORDER_4K )
+        {
+            gfn_t gfn2;
+            unsigned long mask;
+            mfn_t mfn2;
+
+            mask = ~((1UL << page_order) - 1);
+            gfn2 = _gfn(gfn_x(gfn) & mask);
+            mfn2 = _mfn(mfn_x(mfn) & mask);
+
+            if ( ap2m->set_entry(ap2m, gfn_x(gfn2), mfn2, page_order, t, old_a, 1) )
+                goto out;
+        }
+    }
+
+    if ( !ap2m->set_entry(ap2m, gfn_x(gfn), mfn, PAGE_ORDER_4K, t, req_a,
+                          (current->domain != d)) )
+        rc = 0;
+
+out:
+    p2m_unlock(ap2m);
+    return rc;
+}
+
+long p2m_change_altp2m_gfn(struct domain *d, uint16_t idx,
+                             gfn_t old_gfn, gfn_t new_gfn)
+{
+    struct p2m_domain *hp2m, *ap2m;
+    p2m_access_t a;
+    p2m_type_t t;
+    mfn_t mfn;
+    unsigned int page_order;
+    long rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M || d->arch.altp2m_eptp[idx] == INVALID_MFN )
+        return rc;
+
+    hp2m = p2m_get_hostp2m(d);
+    ap2m = d->arch.altp2m_p2m[idx];
+
+    p2m_lock(ap2m);
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(old_gfn), &t, &a, 0, NULL, NULL);
+
+    if ( gfn_x(new_gfn) == INVALID_GFN )
+    {
+        if ( mfn_valid(mfn) )
+            p2m_remove_page(ap2m, gfn_x(old_gfn), mfn_x(mfn), PAGE_ORDER_4K);
+        rc = 0;
+        goto out;
+    }
+
+    /* Check host p2m if no valid entry in alternate */
+    if ( !mfn_valid(mfn) )
+    {
+        mfn = hp2m->get_entry(hp2m, gfn_x(old_gfn), &t, &a,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order, NULL);
+
+        if ( !mfn_valid(mfn) || t != p2m_ram_rw )
+            goto out;
+
+        /* If this is a superpage, copy that first */
+        if ( page_order != PAGE_ORDER_4K )
+        {
+            gfn_t gfn;
+            unsigned long mask;
+
+            mask = ~((1UL << page_order) - 1);
+            gfn = _gfn(gfn_x(old_gfn) & mask);
+            mfn = _mfn(mfn_x(mfn) & mask);
+
+            if ( ap2m->set_entry(ap2m, gfn_x(gfn), mfn, page_order, t, a, 1) )
+                goto out;
+        }
+    }
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(new_gfn), &t, &a, 0, NULL, NULL);
+
+    if ( !mfn_valid(mfn) )
+        mfn = hp2m->get_entry(hp2m, gfn_x(new_gfn), &t, &a, 0, NULL, NULL);
+
+    if ( !mfn_valid(mfn) || (t != p2m_ram_rw) )
+        goto out;
+
+    if ( !ap2m->set_entry(ap2m, gfn_x(old_gfn), mfn, PAGE_ORDER_4K, t, a,
+                          (current->domain != d)) )
+    {
+        rc = 0;
+
+        if ( ap2m->min_remapped_gfn == INVALID_GFN ||
+             gfn_x(new_gfn) < ap2m->min_remapped_gfn )
+            ap2m->min_remapped_gfn = gfn_x(new_gfn);
+        if ( ap2m->max_remapped_gfn == INVALID_GFN ||
+             gfn_x(new_gfn) > ap2m->max_remapped_gfn )
+            ap2m->max_remapped_gfn = gfn_x(new_gfn);
+    }
+
+out:
+    p2m_unlock(ap2m);
+    return rc;
+}
+
+static void p2m_reset_altp2m(struct p2m_domain *p2m)
+{
+    p2m_flush_table(p2m);
+    /* Uninit and reinit ept to force TLB shootdown */
+    ept_p2m_uninit(p2m);
+    ept_p2m_init(p2m);
+    p2m->min_remapped_gfn = INVALID_GFN;
+    p2m->max_remapped_gfn = INVALID_GFN;
+}
+
+void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
+                                 mfn_t mfn, unsigned int page_order,
+                                 p2m_type_t p2mt, p2m_access_t p2ma)
+{
+    struct p2m_domain *p2m;
+    p2m_access_t a;
+    p2m_type_t t;
+    mfn_t m;
+    uint16_t i;
+    bool_t reset_p2m;
+    unsigned int reset_count = 0;
+    uint16_t last_reset_idx = ~0;
+
+    if ( !altp2m_active(d) )
+        return;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
+            continue;
+
+        p2m = d->arch.altp2m_p2m[i];
+        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
+
+        reset_p2m = 0;
+
+        /* Check for a dropped page that may impact this altp2m */
+        if ( mfn_x(mfn) == INVALID_MFN &&
+             gfn_x(gfn) >= p2m->min_remapped_gfn &&
+             gfn_x(gfn) <= p2m->max_remapped_gfn )
+            reset_p2m = 1;
+
+        if ( reset_p2m )
+        {
+            if ( !reset_count++ )
+            {
+                p2m_reset_altp2m(p2m);
+                last_reset_idx = i;
+            }
+            else
+            {
+                /* At least 2 altp2m's impacted, so reset everything */
+                __put_gfn(p2m, gfn_x(gfn));
+
+                for ( i = 0; i < MAX_ALTP2M; i++ )
+                {
+                    if ( i == last_reset_idx ||
+                         d->arch.altp2m_eptp[i] == INVALID_MFN )
+                        continue;
+
+                    p2m = d->arch.altp2m_p2m[i];
+                    p2m_lock(p2m);
+                    p2m_reset_altp2m(p2m);
+                    p2m_unlock(p2m);
+                }
+
+                goto out;
+            }
+        }
+        else if ( mfn_x(m) != INVALID_MFN )
+           p2m_set_entry(p2m, gfn_x(gfn), mfn, page_order, p2mt, p2ma);
+
+        __put_gfn(p2m, gfn_x(gfn));
+    }
+
+out:
+    altp2m_list_unlock(d);
+}
+
 /*** Audit ***/
 
 #if P2M_AUDIT
diff --git a/xen/include/asm-x86/hvm/altp2m.h b/xen/include/asm-x86/hvm/altp2m.h
index 1a6f22b..f90b5af 100644
--- a/xen/include/asm-x86/hvm/altp2m.h
+++ b/xen/include/asm-x86/hvm/altp2m.h
@@ -34,5 +34,9 @@ int altp2m_vcpu_initialise(struct vcpu *v);
 void altp2m_vcpu_destroy(struct vcpu *v);
 void altp2m_vcpu_reset(struct vcpu *v);
 
+/* Alternate p2m paging */
+int altp2m_hap_nested_page_fault(struct vcpu *v, paddr_t gpa,
+    unsigned long gla, struct npfec npfec, struct p2m_domain **ap2m);
+
 #endif /* _ALTP2M_H */
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 722e54c..6952f2b 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -268,6 +268,11 @@ struct p2m_domain {
     /* Highest guest frame that's ever been mapped in the p2m */
     unsigned long max_mapped_pfn;
 
+    /* Alternate p2m's only: range of gfn's for which underlying
+     * mfn may have duplicate mappings */
+    unsigned long min_remapped_gfn;
+    unsigned long max_remapped_gfn;
+
     /* When releasing shared gfn's in a preemptible manner, recall where
      * to resume the search */
     unsigned long next_shared_gfn_to_relinquish;
@@ -754,6 +759,34 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx);
 /* Check to see if vcpu should be switched to a different p2m. */
 void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
 
+/* Flush all the alternate p2m's for a domain */
+void p2m_flush_altp2m(struct domain *d);
+
+/* Make a specific alternate p2m valid */
+long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx);
+
+/* Find an available alternate p2m and make it valid */
+long p2m_init_next_altp2m(struct domain *d, uint16_t *idx);
+
+/* Make a specific alternate p2m invalid */
+long p2m_destroy_altp2m_by_id(struct domain *d, uint16_t idx);
+
+/* Switch alternate p2m for entire domain */
+long p2m_switch_domain_altp2m_by_id(struct domain *d, uint16_t idx);
+
+/* Set access type for a gfn */
+long p2m_set_altp2m_mem_access(struct domain *d, uint16_t idx,
+                                 gfn_t gfn, xenmem_access_t access);
+
+/* Change a gfn->mfn mapping */
+long p2m_change_altp2m_gfn(struct domain *d, uint16_t idx,
+                             gfn_t old_gfn, gfn_t new_gfn);
+
+/* Propagate a host p2m change to all alternate p2m's */
+void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
+                                 mfn_t mfn, unsigned int page_order,
+                                 p2m_type_t p2mt, p2m_access_t p2ma);
+
 /*
  * p2m type to IOMMU flags
  */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v4 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (9 preceding siblings ...)
  2015-07-10  0:52 ` [PATCH v4 10/15] x86/altp2m: add remaining support routines Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10 10:01   ` Jan Beulich
  2015-07-10  0:52 ` [PATCH v4 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Signed-off-by: Ed White <edmund.h.white@intel.com>
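
To illustrate the interface: a caller fills in struct xen_hvm_altp2m_op and
selects the action via the cmd subop code. A minimal sketch (the hypercall
issue itself is elided; patch 14 adds proper libxc wrappers):

    struct xen_hvm_altp2m_op op = { 0 };

    op.cmd = HVMOP_altp2m_switch_p2m;  /* subop code selects the action */
    op.domain = domid;                 /* or DOMID_SELF for intra-domain use */
    op.u.view.view = view_id;          /* per-subop arguments live in the union */

    /* ... then issue HYPERVISOR_hvm_op(HVMOP_altp2m, &op) ... */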
---
 xen/arch/x86/hvm/hvm.c          | 138 ++++++++++++++++++++++++++++++++++++++++
 xen/include/public/hvm/hvm_op.h |  82 ++++++++++++++++++++++++
 2 files changed, 220 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index bda6c1e..23cd507 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6443,6 +6443,144 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case HVMOP_altp2m:
+    {
+        struct xen_hvm_altp2m_op a;
+        struct domain *d = NULL;
+
+        if ( copy_from_guest(&a, arg, 1) )
+            return -EFAULT;
+
+        switch ( a.cmd )
+        {
+        case HVMOP_altp2m_get_domain_state:
+        case HVMOP_altp2m_set_domain_state:
+        case HVMOP_altp2m_create_p2m:
+        case HVMOP_altp2m_destroy_p2m:
+        case HVMOP_altp2m_switch_p2m:
+        case HVMOP_altp2m_set_mem_access:
+        case HVMOP_altp2m_change_gfn:
+            d = rcu_lock_domain_by_any_id(a.domain);
+            if ( d == NULL )
+                return -ESRCH;
+
+            if ( !is_hvm_domain(d) || !hvm_altp2m_supported() )
+                rc = -EINVAL;
+
+            break;
+        case HVMOP_altp2m_vcpu_enable_notify:
+
+            break;
+        default:
+            return -ENOSYS;
+
+            break;
+        }
+
+        if ( !rc )
+        {
+            switch ( a.cmd )
+            {
+            case HVMOP_altp2m_get_domain_state:
+                a.u.domain_state.state = altp2m_active(d);
+                rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+
+                break;
+            case HVMOP_altp2m_set_domain_state:
+            {
+                struct vcpu *v;
+                bool_t ostate;
+                
+                if ( nestedhvm_enabled(d) )
+                {
+                    rc = -EINVAL;
+                    break;
+                }
+
+                ostate = d->arch.altp2m_active;
+                d->arch.altp2m_active = !!a.u.domain_state.state;
+
+                /* If the alternate p2m state has changed, handle appropriately */
+                if ( d->arch.altp2m_active != ostate &&
+                     (ostate || !(rc = p2m_init_altp2m_by_id(d, 0))) )
+                {
+                    for_each_vcpu( d, v )
+                    {
+                        if ( !ostate )
+                            altp2m_vcpu_initialise(v);
+                        else
+                            altp2m_vcpu_destroy(v);
+                    }
+
+                    if ( ostate )
+                        p2m_flush_altp2m(d);
+                }
+
+                break;
+            }
+            default:
+            {
+                if ( !(d ? d : current->domain)->arch.altp2m_active )
+                {
+                    rc = -EINVAL;
+                    break;
+                }
+
+                switch ( a.cmd )
+                {
+                case HVMOP_altp2m_vcpu_enable_notify:
+                {
+                    struct vcpu *curr = current;
+                    p2m_type_t p2mt;
+
+                    if ( (gfn_x(vcpu_altp2m(curr).veinfo_gfn) != INVALID_GFN) ||
+                         (mfn_x(get_gfn_query_unlocked(curr->domain,
+                                a.u.enable_notify.gfn, &p2mt)) == INVALID_MFN) )
+                        return -EINVAL;
+
+                    vcpu_altp2m(curr).veinfo_gfn = _gfn(a.u.enable_notify.gfn);
+                    ap2m_vcpu_update_vmfunc_ve(curr);
+
+                    break;
+                }
+                case HVMOP_altp2m_create_p2m:
+                    if ( !(rc = p2m_init_next_altp2m(d, &a.u.view.view)) )
+                        rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+
+                    break;
+                case HVMOP_altp2m_destroy_p2m:
+                    rc = p2m_destroy_altp2m_by_id(d, a.u.view.view);
+
+                    break;
+                case HVMOP_altp2m_switch_p2m:
+                    rc = p2m_switch_domain_altp2m_by_id(d, a.u.view.view);
+
+                    break;
+                case HVMOP_altp2m_set_mem_access:
+                    rc = p2m_set_altp2m_mem_access(d, a.u.set_mem_access.view,
+                            _gfn(a.u.set_mem_access.gfn),
+                            a.u.set_mem_access.hvmmem_access);
+
+                    break;
+                case HVMOP_altp2m_change_gfn:
+                    rc = p2m_change_altp2m_gfn(d, a.u.change_gfn.view,
+                            _gfn(a.u.change_gfn.old_gfn),
+                            _gfn(a.u.change_gfn.new_gfn));
+
+                    break;
+                }
+
+                break;
+            }
+            }
+        }
+
+        if ( d )
+            rcu_unlock_domain(d);
+
+        break;
+    }
+
     default:
     {
         gdprintk(XENLOG_DEBUG, "Bad HVM op %ld.\n", op);
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index 9b84e84..05d42c4 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -396,6 +396,88 @@ DEFINE_XEN_GUEST_HANDLE(xen_hvm_evtchn_upcall_vector_t);
 
 #endif /* defined(__i386__) || defined(__x86_64__) */
 
+/* HVMOP_altp2m: perform altp2m state operations */
+#define HVMOP_altp2m 24
+
+struct xen_hvm_altp2m_domain_state {
+    /* IN or OUT variable on/off */
+    uint8_t state;
+};
+typedef struct xen_hvm_altp2m_domain_state xen_hvm_altp2m_domain_state_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_domain_state_t);
+
+struct xen_hvm_altp2m_vcpu_enable_notify {
+    /* #VE info area gfn */
+    uint64_t gfn;
+};
+typedef struct xen_hvm_altp2m_vcpu_enable_notify xen_hvm_altp2m_vcpu_enable_notify_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_vcpu_enable_notify_t);
+
+struct xen_hvm_altp2m_view {
+    /* IN/OUT variable */
+    uint16_t view;
+    /* Create view only: default access type
+     * NOTE: currently ignored */
+    uint16_t hvmmem_default_access; /* xenmem_access_t */
+};
+typedef struct xen_hvm_altp2m_view xen_hvm_altp2m_view_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_view_t);
+
+struct xen_hvm_altp2m_set_mem_access {
+    /* view */
+    uint16_t view;
+    /* Memory type */
+    uint16_t hvmmem_access; /* xenmem_access_t */
+    uint8_t pad[4];
+    /* gfn */
+    uint64_t gfn;
+};
+typedef struct xen_hvm_altp2m_set_mem_access xen_hvm_altp2m_set_mem_access_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_set_mem_access_t);
+
+struct xen_hvm_altp2m_change_gfn {
+    /* view */
+    uint16_t view;
+    uint8_t pad[6];
+    /* old gfn */
+    uint64_t old_gfn;
+    /* new gfn, INVALID_GFN (~0UL) means revert */
+    uint64_t new_gfn;
+};
+typedef struct xen_hvm_altp2m_change_gfn xen_hvm_altp2m_change_gfn_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_change_gfn_t);
+
+struct xen_hvm_altp2m_op {
+    uint32_t cmd;
+/* Get/set the altp2m state for a domain */
+#define HVMOP_altp2m_get_domain_state     1
+#define HVMOP_altp2m_set_domain_state     2
+/* Set the current VCPU to receive altp2m event notifications */
+#define HVMOP_altp2m_vcpu_enable_notify   3
+/* Create a new view */
+#define HVMOP_altp2m_create_p2m           4
+/* Destroy a view */
+#define HVMOP_altp2m_destroy_p2m          5
+/* Switch view for an entire domain */
+#define HVMOP_altp2m_switch_p2m           6
+/* Notify that a page of memory is to have specific access types */
+#define HVMOP_altp2m_set_mem_access       7
+/* Change a p2m entry to have a different gfn->mfn mapping */
+#define HVMOP_altp2m_change_gfn           8
+    domid_t domain;
+    uint8_t pad[2];
+    union {
+        struct xen_hvm_altp2m_domain_state       domain_state;
+        struct xen_hvm_altp2m_vcpu_enable_notify enable_notify;
+        struct xen_hvm_altp2m_view               view;
+        struct xen_hvm_altp2m_set_mem_access     set_mem_access;
+        struct xen_hvm_altp2m_change_gfn         change_gfn;
+        uint8_t pad[64];
+    } u;
+};
+typedef struct xen_hvm_altp2m_op xen_hvm_altp2m_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_op_t);
+
 #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
 
 /*
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v4 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (10 preceding siblings ...)
  2015-07-10  0:52 ` [PATCH v4 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  8:53   ` Wei Liu
  2015-07-10 17:32   ` George Dunlap
  2015-07-10  0:52 ` [PATCH v4 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
                   ` (2 subsequent siblings)
  14 siblings, 2 replies; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

The altp2mhvm and nestedhvm parameters are mutually
exclusive and cannot be set together.

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> for the hypervisor bits.
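
For illustration, a guest config fragment enabling the feature (the file
name is hypothetical and the rest of the config is elided):

    # /etc/xen/hvm-introspected.cfg
    builder   = "hvm"
    altp2mhvm = 1   # enable the alternate-p2m capability
    nestedhvm = 0   # must stay off: mutually exclusive with altp2mhvm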
---
 docs/man/xl.cfg.pod.5           | 12 ++++++++++++
 tools/libxl/libxl.h             |  6 ++++++
 tools/libxl/libxl_create.c      |  1 +
 tools/libxl/libxl_dom.c         |  2 ++
 tools/libxl/libxl_types.idl     |  1 +
 tools/libxl/xl_cmdimpl.c        | 10 ++++++++++
 xen/arch/x86/hvm/hvm.c          | 23 +++++++++++++++++++++--
 xen/include/public/hvm/params.h |  5 ++++-
 8 files changed, 57 insertions(+), 3 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index a3e0e2e..18afd46 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1035,6 +1035,18 @@ enabled by default and you should usually omit it. It may be necessary
 to disable the HPET in order to improve compatibility with guest
 Operating Systems (X86 only)
 
+=item B<altp2mhvm=BOOLEAN>
+
+Enables or disables HVM guest access to the alternate-p2m capability.
+Alternate-p2m allows a guest to manage multiple guest-physical
+"memory views" (as opposed to a single p2m). This option is
+disabled by default and is available only to HVM domains.
+Enable it to control or isolate access to specific guest
+physical memory pages, e.g. for HVM domain memory
+introspection, or to isolate and access-control memory
+shared between components within a single HVM guest
+domain.
+
 =item B<nestedhvm=BOOLEAN>
 
 Enable or disables guest access to hardware virtualisation features,
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index a1c5d15..17222e7 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -745,6 +745,12 @@ typedef struct libxl__ctx libxl_ctx;
 #define LIBXL_HAVE_BUILDINFO_SERIAL_LIST 1
 
 /*
+ * LIBXL_HAVE_ALTP2M
+ * If this is defined, then libxl supports alternate p2m functionality.
+ */
+#define LIBXL_HAVE_ALTP2M 1
+
+/*
  * LIBXL_HAVE_REMUS
  * If this is defined, then libxl supports remus.
  */
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index f366a09..418deee 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -329,6 +329,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
         libxl_defbool_setdefault(&b_info->u.hvm.hpet,               true);
         libxl_defbool_setdefault(&b_info->u.hvm.vpt_align,          true);
         libxl_defbool_setdefault(&b_info->u.hvm.nested_hvm,         false);
+        libxl_defbool_setdefault(&b_info->u.hvm.altp2m,             false);
         libxl_defbool_setdefault(&b_info->u.hvm.usb,                false);
         libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci,   true);
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index bdc0465..2f1200e 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -300,6 +300,8 @@ static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
                     libxl_defbool_val(info->u.hvm.vpt_align));
     xc_hvm_param_set(handle, domid, HVM_PARAM_NESTEDHVM,
                     libxl_defbool_val(info->u.hvm.nested_hvm));
+    xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2MHVM,
+                    libxl_defbool_val(info->u.hvm.altp2m));
 }
 
 int libxl__build_pre(libxl__gc *gc, uint32_t domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index e1632fa..fb641fe 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -440,6 +440,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                        ("mmio_hole_memkb",  MemKB),
                                        ("timer_mode",       libxl_timer_mode),
                                        ("nested_hvm",       libxl_defbool),
+                                       ("altp2m",           libxl_defbool),
                                        ("smbios_firmware",  string),
                                        ("acpi_firmware",    string),
                                        ("nographic",        libxl_defbool),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index c858068..43cf6bf 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1500,6 +1500,16 @@ static void parse_config_data(const char *config_source,
 
         xlu_cfg_get_defbool(config, "nestedhvm", &b_info->u.hvm.nested_hvm, 0);
 
+        xlu_cfg_get_defbool(config, "altp2mhvm", &b_info->u.hvm.altp2m, 0);
+
+        if (!libxl_defbool_is_default(b_info->u.hvm.nested_hvm) &&
+            libxl_defbool_val(b_info->u.hvm.nested_hvm) &&
+            !libxl_defbool_is_default(b_info->u.hvm.altp2m) &&
+            libxl_defbool_val(b_info->u.hvm.altp2m)) {
+            fprintf(stderr, "ERROR: nestedhvm and altp2mhvm cannot be used together\n");
+            exit(1);
+        }
+
         xlu_cfg_replace_string(config, "smbios_firmware",
                                &b_info->u.hvm.smbios_firmware, 0);
         xlu_cfg_replace_string(config, "acpi_firmware",
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 23cd507..6e59e68 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5750,6 +5750,7 @@ static int hvm_allow_set_param(struct domain *d,
     case HVM_PARAM_VIRIDIAN:
     case HVM_PARAM_IOREQ_SERVER_PFN:
     case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
+    case HVM_PARAM_ALTP2MHVM:
         if ( value != 0 && a->value != value )
             rc = -EEXIST;
         break;
@@ -5872,6 +5873,9 @@ static int hvmop_set_param(
          */
         if ( cpu_has_svm && !paging_mode_hap(d) && a.value )
             rc = -EINVAL;
+        if ( a.value &&
+             d->arch.hvm_domain.params[HVM_PARAM_ALTP2MHVM] )
+            rc = -EINVAL;
         /* Set up NHVM state for any vcpus that are already up. */
         if ( a.value &&
              !d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM] )
@@ -5882,6 +5886,13 @@ static int hvmop_set_param(
             for_each_vcpu(d, v)
                 nestedhvm_vcpu_destroy(v);
         break;
+    case HVM_PARAM_ALTP2MHVM:
+        if ( a.value > 1 )
+            rc = -EINVAL;
+        if ( a.value &&
+             d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM] )
+            rc = -EINVAL;
+        break;
     case HVM_PARAM_BUFIOREQ_EVTCHN:
         rc = -EINVAL;
         break;
@@ -5942,6 +5953,7 @@ static int hvm_allow_get_param(struct domain *d,
     case HVM_PARAM_STORE_EVTCHN:
     case HVM_PARAM_CONSOLE_PFN:
     case HVM_PARAM_CONSOLE_EVTCHN:
+    case HVM_PARAM_ALTP2MHVM:
         break;
     /*
      * The following parameters must not be read by the guest
@@ -6482,6 +6494,12 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             switch ( a.cmd )
             {
             case HVMOP_altp2m_get_domain_state:
+                if ( !d->arch.hvm_domain.params[HVM_PARAM_ALTP2MHVM] )
+                {
+                    rc = -EINVAL;
+                    break;
+                }
+
                 a.u.domain_state.state = altp2m_active(d);
                 rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
 
@@ -6490,8 +6508,9 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             {
                 struct vcpu *v;
                 bool_t ostate;
-                
-                if ( nestedhvm_enabled(d) )
+
+                if ( !d->arch.hvm_domain.params[HVM_PARAM_ALTP2MHVM] ||
+                     nestedhvm_enabled(d) )
                 {
                     rc = -EINVAL;
                     break;
diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
index 7c73089..51daad1 100644
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -187,6 +187,9 @@
 /* Location of the VM Generation ID in guest physical address space. */
 #define HVM_PARAM_VM_GENERATION_ID_ADDR 34
 
-#define HVM_NR_PARAMS          35
+/* Boolean: Enable altp2m */
+#define HVM_PARAM_ALTP2MHVM    35
+
+#define HVM_NR_PARAMS          36
 
 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v4 13/15] x86/altp2m: XSM hooks for altp2m HVM ops
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (11 preceding siblings ...)
  2015-07-10  0:52 ` [PATCH v4 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  0:52 ` [PATCH v4 14/15] tools/libxc: add support to altp2m hvmops Ed White
  2015-07-10  0:52 ` [PATCH v4 15/15] tools/xen-access: altp2m testcases Ed White
  14 siblings, 0 replies; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Ravi Sahita <ravi.sahita@intel.com>

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
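
As an illustrative policy fragment (the monitor_t type is hypothetical;
dom0_t and domU_t are the stock policy types):

    # Let a monitoring domain issue altp2m HVMOPs against its target
    allow monitor_t domU_t:hvm altp2mhvm_op;

    # Let the toolstack set HVM_PARAM_ALTP2MHVM at domain-build time
    allow dom0_t domU_t:hvm altp2mhvm;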
---
 tools/flask/policy/policy/modules/xen/xen.if |  4 ++--
 xen/arch/x86/hvm/hvm.c                       |  6 ++++++
 xen/include/xsm/dummy.h                      | 12 ++++++++++++
 xen/include/xsm/xsm.h                        | 12 ++++++++++++
 xen/xsm/dummy.c                              |  2 ++
 xen/xsm/flask/hooks.c                        | 12 ++++++++++++
 xen/xsm/flask/policy/access_vectors          |  7 +++++++
 7 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index f4cde11..6177fe9 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -8,7 +8,7 @@
 define(`declare_domain_common', `
 	allow $1 $2:grant { query setup };
 	allow $1 $2:mmu { adjust physmap map_read map_write stat pinpage updatemp mmuext_op };
-	allow $1 $2:hvm { getparam setparam };
+	allow $1 $2:hvm { getparam setparam altp2mhvm_op };
 	allow $1 $2:domain2 get_vnumainfo;
 ')
 
@@ -58,7 +58,7 @@ define(`create_domain_common', `
 	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
 	allow $1 $2:grant setup;
 	allow $1 $2:hvm { cacheattr getparam hvmctl irqlevel pciroute sethvmc
-			setparam pcilevel trackdirtyvram nested };
+			setparam pcilevel trackdirtyvram nested altp2mhvm altp2mhvm_op };
 ')
 
 # create_domain(priv, target)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 6e59e68..7c82e89 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5887,6 +5887,9 @@ static int hvmop_set_param(
                 nestedhvm_vcpu_destroy(v);
         break;
     case HVM_PARAM_ALTP2MHVM:
+        rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
+        if ( rc )
+            break;
         if ( a.value > 1 )
             rc = -EINVAL;
         if ( a.value &&
@@ -6490,6 +6493,9 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         }
 
         if ( !rc )
+            rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d ? d : current->domain);
+
+        if ( !rc )
         {
             switch ( a.cmd )
             {
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index f044c0f..e0b561d 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -548,6 +548,18 @@ static XSM_INLINE int xsm_hvm_param_nested(XSM_DEFAULT_ARG struct domain *d)
     return xsm_default_action(action, current->domain, d);
 }
 
+static XSM_INLINE int xsm_hvm_param_altp2mhvm(XSM_DEFAULT_ARG struct domain *d)
+{
+    XSM_ASSERT_ACTION(XSM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
+
+static XSM_INLINE int xsm_hvm_altp2mhvm_op(XSM_DEFAULT_ARG struct domain *d)
+{
+    XSM_ASSERT_ACTION(XSM_TARGET);
+    return xsm_default_action(action, current->domain, d);
+}
+
 static XSM_INLINE int xsm_vm_event_control(XSM_DEFAULT_ARG struct domain *d, int mode, int op)
 {
     XSM_ASSERT_ACTION(XSM_PRIV);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index c872d44..dc48d23 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -147,6 +147,8 @@ struct xsm_operations {
     int (*hvm_param) (struct domain *d, unsigned long op);
     int (*hvm_control) (struct domain *d, unsigned long op);
     int (*hvm_param_nested) (struct domain *d);
+    int (*hvm_param_altp2mhvm) (struct domain *d);
+    int (*hvm_altp2mhvm_op) (struct domain *d);
     int (*get_vnumainfo) (struct domain *d);
 
     int (*vm_event_control) (struct domain *d, int mode, int op);
@@ -586,6 +588,16 @@ static inline int xsm_hvm_param_nested (xsm_default_t def, struct domain *d)
     return xsm_ops->hvm_param_nested(d);
 }
 
+static inline int xsm_hvm_param_altp2mhvm (xsm_default_t def, struct domain *d)
+{
+    return xsm_ops->hvm_param_altp2mhvm(d);
+}
+
+static inline int xsm_hvm_altp2mhvm_op (xsm_default_t def, struct domain *d)
+{
+    return xsm_ops->hvm_altp2mhvm_op(d);
+}
+
 static inline int xsm_get_vnumainfo (xsm_default_t def, struct domain *d)
 {
     return xsm_ops->get_vnumainfo(d);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index e84b0e4..3461d4f 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -116,6 +116,8 @@ void xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, hvm_param);
     set_to_dummy_if_null(ops, hvm_control);
     set_to_dummy_if_null(ops, hvm_param_nested);
+    set_to_dummy_if_null(ops, hvm_param_altp2mhvm);
+    set_to_dummy_if_null(ops, hvm_altp2mhvm_op);
 
     set_to_dummy_if_null(ops, do_xsm_op);
 #ifdef CONFIG_COMPAT
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 6e37d29..2b998c9 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1170,6 +1170,16 @@ static int flask_hvm_param_nested(struct domain *d)
     return current_has_perm(d, SECCLASS_HVM, HVM__NESTED);
 }
 
+static int flask_hvm_param_altp2mhvm(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_HVM, HVM__ALTP2MHVM);
+}
+
+static int flask_hvm_altp2mhvm_op(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_HVM, HVM__ALTP2MHVM_OP);
+}
+
 static int flask_vm_event_control(struct domain *d, int mode, int op)
 {
     return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__VM_EVENT);
@@ -1654,6 +1664,8 @@ static struct xsm_operations flask_ops = {
     .hvm_param = flask_hvm_param,
     .hvm_control = flask_hvm_param,
     .hvm_param_nested = flask_hvm_param_nested,
+    .hvm_param_altp2mhvm = flask_hvm_param_altp2mhvm,
+    .hvm_altp2mhvm_op = flask_hvm_altp2mhvm_op,
 
     .do_xsm_op = do_flask_op,
     .get_vnumainfo = flask_get_vnumainfo,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 68284d5..d168de2 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -272,6 +272,13 @@ class hvm
     share_mem
 # HVMOP_set_param setting HVM_PARAM_NESTEDHVM
     nested
+# HVMOP_set_param setting HVM_PARAM_ALTP2MHVM
+    altp2mhvm
+# HVMOP_altp2m_set_domain_state HVMOP_altp2m_get_domain_state
+# HVMOP_altp2m_vcpu_enable_notify HVMOP_altp2m_create_p2m
+# HVMOP_altp2m_destroy_p2m HVMOP_altp2m_switch_p2m
+# HVMOP_altp2m_set_mem_access HVMOP_altp2m_change_gfn
+    altp2mhvm_op
 }
 
 # Class event describes event channels.  Interdomain event channels have their
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v4 14/15] tools/libxc: add support to altp2m hvmops
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (12 preceding siblings ...)
  2015-07-10  0:52 ` [PATCH v4 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  8:46   ` Ian Campbell
  2015-07-10  0:52 ` [PATCH v4 15/15] tools/xen-access: altp2m testcases Ed White
  14 siblings, 1 reply; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Tamas K Lengyel <tlengyel@novetta.com>

Wrappers to issue altp2m hvmops.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
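
As a usage illustration (not part of this patch: the function name is made
up, error handling is trimmed, and the gfn to protect is assumed known), a
monitoring tool could drive the wrappers like so:

    #include <stdbool.h>
    #include <xenctrl.h>

    /* Put every vCPU of 'domid' on a view where 'gfn' is read/exec only. */
    static int monitor_setup(xc_interface *xch, domid_t domid, xen_pfn_t gfn)
    {
        uint16_t view_id;
        int rc;

        rc = xc_altp2m_set_domain_state(xch, domid, true);
        if ( rc < 0 )
            return rc;

        /* Default access is currently ignored by the hypervisor side. */
        rc = xc_altp2m_create_view(xch, domid, XENMEM_access_rwx, &view_id);
        if ( rc < 0 )
            return rc;

        rc = xc_altp2m_set_mem_access(xch, domid, view_id, gfn,
                                      XENMEM_access_rx);
        if ( rc < 0 )
            return rc;

        return xc_altp2m_switch_to_view(xch, domid, view_id);
    }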
---
 tools/libxc/Makefile          |   1 +
 tools/libxc/include/xenctrl.h |  21 ++++
 tools/libxc/xc_altp2m.c       | 237 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 259 insertions(+)
 create mode 100644 tools/libxc/xc_altp2m.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 153b79e..c2c2b1c 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -10,6 +10,7 @@ override CONFIG_MIGRATE := n
 endif
 
 CTRL_SRCS-y       :=
+CTRL_SRCS-y       += xc_altp2m.c
 CTRL_SRCS-y       += xc_core.c
 CTRL_SRCS-$(CONFIG_X86) += xc_core_x86.c
 CTRL_SRCS-$(CONFIG_ARM) += xc_core_arm.c
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index d1d2ab3..ecddf28 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2316,6 +2316,27 @@ void xc_tmem_save_done(xc_interface *xch, int dom);
 int xc_tmem_restore(xc_interface *xch, int dom, int fd);
 int xc_tmem_restore_extra(xc_interface *xch, int dom, int fd);
 
+/**
+ * altp2m operations
+ */
+
+int xc_altp2m_get_domain_state(xc_interface *handle, domid_t dom, bool *state);
+int xc_altp2m_set_domain_state(xc_interface *handle, domid_t dom, bool state);
+int xc_altp2m_set_vcpu_enable_notify(xc_interface *handle, xen_pfn_t gfn);
+int xc_altp2m_create_view(xc_interface *handle, domid_t domid,
+                          xenmem_access_t default_access, uint16_t *view_id);
+int xc_altp2m_destroy_view(xc_interface *handle, domid_t domid,
+                           uint16_t view_id);
+/* Switch all vCPUs of the domain to the specified altp2m view */
+int xc_altp2m_switch_to_view(xc_interface *handle, domid_t domid,
+                             uint16_t view_id);
+int xc_altp2m_set_mem_access(xc_interface *handle, domid_t domid,
+                             uint16_t view_id, xen_pfn_t gfn,
+                             xenmem_access_t access);
+int xc_altp2m_change_gfn(xc_interface *handle, domid_t domid,
+                         uint16_t view_id, xen_pfn_t old_gfn,
+                         xen_pfn_t new_gfn);
+
 /** 
  * Mem paging operations.
  * Paging is supported only on the x86 architecture in 64 bit mode, with
diff --git a/tools/libxc/xc_altp2m.c b/tools/libxc/xc_altp2m.c
new file mode 100644
index 0000000..a4be36b
--- /dev/null
+++ b/tools/libxc/xc_altp2m.c
@@ -0,0 +1,237 @@
+/******************************************************************************
+ *
+ * xc_altp2m.c
+ *
+ * Interface to altp2m related HVMOPs
+ *
+ * Copyright (c) 2015 Tamas K Lengyel (tamas@tklengyel.com)
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+#include "xc_private.h"
+#include <stdbool.h>
+#include <xen/hvm/hvm_op.h>
+
+int xc_altp2m_get_domain_state(xc_interface *handle, domid_t dom, bool *state)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_get_domain_state;
+    arg->domain = dom;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    if ( !rc )
+        *state = arg->u.domain_state.state;
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_set_domain_state(xc_interface *handle, domid_t dom, bool state)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_set_domain_state;
+    arg->domain = dom;
+    arg->u.domain_state.state = state;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+/* Note: a bit odd in that this op acts on the current (calling) vCPU. */
+int xc_altp2m_set_vcpu_enable_notify(xc_interface *handle, xen_pfn_t gfn)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_vcpu_enable_notify;
+    arg->u.enable_notify.gfn = gfn;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_create_view(xc_interface *handle, domid_t domid,
+                          xenmem_access_t default_access, uint16_t *view_id)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_create_p2m;
+    arg->domain = domid;
+    arg->u.view.view = -1;
+    arg->u.view.hvmmem_default_access = default_access;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    if ( !rc )
+        *view_id = arg->u.view.view;
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_destroy_view(xc_interface *handle, domid_t domid,
+                           uint16_t view_id)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_destroy_p2m;
+    arg->domain = domid;
+    arg->u.view.view = view_id;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+/* Switch all vCPUs of the domain to the specified altp2m view */
+int xc_altp2m_switch_to_view(xc_interface *handle, domid_t domid,
+                             uint16_t view_id)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_switch_p2m;
+    arg->domain = domid;
+    arg->u.view.view = view_id;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_set_mem_access(xc_interface *handle, domid_t domid,
+                             uint16_t view_id, xen_pfn_t gfn,
+                             xenmem_access_t access)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_set_mem_access;
+    arg->domain = domid;
+    arg->u.set_mem_access.view = view_id;
+    arg->u.set_mem_access.hvmmem_access = access;
+    arg->u.set_mem_access.gfn = gfn;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_change_gfn(xc_interface *handle, domid_t domid,
+                         uint16_t view_id, xen_pfn_t old_gfn,
+                         xen_pfn_t new_gfn)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->cmd = HVMOP_altp2m_change_gfn;
+    arg->domain = domid;
+    arg->u.change_gfn.view = view_id;
+    arg->u.change_gfn.old_gfn = old_gfn;
+    arg->u.change_gfn.new_gfn = new_gfn;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH v4 15/15] tools/xen-access: altp2m testcases
  2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (13 preceding siblings ...)
  2015-07-10  0:52 ` [PATCH v4 14/15] tools/libxc: add support to altp2m hvmops Ed White
@ 2015-07-10  0:52 ` Ed White
  2015-07-10  1:35   ` Lengyel, Tamas
  2015-07-10  8:50   ` Ian Campbell
  14 siblings, 2 replies; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Tamas K Lengyel <tlengyel@novetta.com>

Working altp2m test-case. Extended the test tool to support singlestepping
to better highlight the core feature of altp2m view switching.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Signed-off-by: Ed White <edmund.h.white@intel.com>
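
For example (domain ID 5 here is arbitrary), the new modes are exercised
with:

    # xen-access 5 altp2m_write   # trap writes via a restricted altp2m view
    # xen-access 5 altp2m_exec    # trap executes via a restricted altp2m view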
---
 tools/tests/xen-access/xen-access.c | 173 ++++++++++++++++++++++++++++++------
 1 file changed, 148 insertions(+), 25 deletions(-)

diff --git a/tools/tests/xen-access/xen-access.c b/tools/tests/xen-access/xen-access.c
index 12ab921..6daa408 100644
--- a/tools/tests/xen-access/xen-access.c
+++ b/tools/tests/xen-access/xen-access.c
@@ -275,6 +275,19 @@ xenaccess_t *xenaccess_init(xc_interface **xch_r, domid_t domain_id)
     return NULL;
 }
 
+static inline
+int control_singlestep(
+    xc_interface *xch,
+    domid_t domain_id,
+    unsigned long vcpu,
+    bool enable)
+{
+    uint32_t op = enable ?
+        XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_ON : XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_OFF;
+
+    return xc_domain_debug_control(xch, domain_id, op, vcpu);
+}
+
 /*
  * Note that this function is not thread safe.
  */
@@ -317,13 +330,15 @@ static void put_response(vm_event_t *vm_event, vm_event_response_t *rsp)
 
 void usage(char* progname)
 {
-    fprintf(stderr,
-            "Usage: %s [-m] <domain_id> write|exec|breakpoint\n"
+    fprintf(stderr, "Usage: %s [-m] <domain_id> write|exec", progname);
+#if defined(__i386__) || defined(__x86_64__)
+            fprintf(stderr, "|breakpoint|altp2m_write|altp2m_exec");
+#endif
+            fprintf(stderr,
             "\n"
             "Logs first page writes, execs, or breakpoint traps that occur on the domain.\n"
             "\n"
-            "-m requires this program to run, or else the domain may pause\n",
-            progname);
+            "-m requires this program to run, or else the domain may pause\n");
 }
 
 int main(int argc, char *argv[])
@@ -341,6 +356,8 @@ int main(int argc, char *argv[])
     int required = 0;
     int breakpoint = 0;
     int shutting_down = 0;
+    int altp2m = 0;
+    uint16_t altp2m_view_id = 0;
 
     char* progname = argv[0];
     argv++;
@@ -379,10 +396,22 @@ int main(int argc, char *argv[])
         default_access = XENMEM_access_rw;
         after_first_access = XENMEM_access_rwx;
     }
+#if defined(__i386__) || defined(__x86_64__)
     else if ( !strcmp(argv[0], "breakpoint") )
     {
         breakpoint = 1;
     }
+    else if ( !strcmp(argv[0], "altp2m_write") )
+    {
+        default_access = XENMEM_access_rx;
+        altp2m = 1;
+    }
+    else if ( !strcmp(argv[0], "altp2m_exec") )
+    {
+        default_access = XENMEM_access_rw;
+        altp2m = 1;
+    }
+#endif
     else
     {
         usage(argv[0]);
@@ -415,22 +444,73 @@ int main(int argc, char *argv[])
         goto exit;
     }
 
-    /* Set the default access type and convert all pages to it */
-    rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
-    if ( rc < 0 )
+    /* With altp2m we just create a new, restricted view of the memory */
+    if ( altp2m )
     {
-        ERROR("Error %d setting default mem access type\n", rc);
-        goto exit;
-    }
+        xen_pfn_t gfn = 0;
+        unsigned long perm_set = 0;
+
+        rc = xc_altp2m_set_domain_state( xch, domain_id, 1 );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d enabling altp2m on domain!\n", rc);
+            goto exit;
+        }
+
+        rc = xc_altp2m_create_view( xch, domain_id, default_access, &altp2m_view_id );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d creating altp2m view!\n", rc);
+            goto exit;
+        }
 
-    rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
-                           (xenaccess->max_gpfn - START_PFN) );
+        DPRINTF("altp2m view created with id %u\n", altp2m_view_id);
+        DPRINTF("Setting altp2m mem_access permissions.. ");
 
-    if ( rc < 0 )
+        for(; gfn < xenaccess->max_gpfn; ++gfn)
+        {
+            rc = xc_altp2m_set_mem_access( xch, domain_id, altp2m_view_id, gfn,
+                                           default_access);
+            if ( !rc )
+                perm_set++;
+        }
+
+        DPRINTF("done! Permissions set on %lu pages.\n", perm_set);
+
+        rc = xc_altp2m_switch_to_view( xch, domain_id, altp2m_view_id );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d switching to altp2m view!\n", rc);
+            goto exit;
+        }
+
+        rc = xc_monitor_singlestep( xch, domain_id, 1 );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d failed to enable singlestep monitoring!\n", rc);
+            goto exit;
+        }
+    }
+
+    if ( !altp2m )
     {
-        ERROR("Error %d setting all memory to access type %d\n", rc,
-              default_access);
-        goto exit;
+        /* Set the default access type and convert all pages to it */
+        rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
+        if ( rc < 0 )
+        {
+            ERROR("Error %d setting default mem access type\n", rc);
+            goto exit;
+        }
+
+        rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
+                               (xenaccess->max_gpfn - START_PFN) );
+
+        if ( rc < 0 )
+        {
+            ERROR("Error %d setting all memory to access type %d\n", rc,
+                  default_access);
+            goto exit;
+        }
     }
 
     if ( breakpoint )
@@ -448,13 +528,29 @@ int main(int argc, char *argv[])
     {
         if ( interrupted )
         {
+            /* Unregister for every event */
             DPRINTF("xenaccess shutting down on signal %d\n", interrupted);
 
-            /* Unregister for every event */
-            rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, ~0ull, 0);
-            rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, START_PFN,
-                                   (xenaccess->max_gpfn - START_PFN) );
-            rc = xc_monitor_software_breakpoint(xch, domain_id, 0);
+            if ( breakpoint )
+                rc = xc_monitor_software_breakpoint(xch, domain_id, 0);
+
+            if ( altp2m )
+            {
+                uint32_t vcpu_id;
+
+                rc = xc_altp2m_switch_to_view( xch, domain_id, 0 );
+                rc = xc_altp2m_destroy_view(xch, domain_id, altp2m_view_id);
+                rc = xc_altp2m_set_domain_state(xch, domain_id, 0);
+                rc = xc_monitor_singlestep(xch, domain_id, 0);
+
+                for ( vcpu_id = 0; vcpu_id<XEN_LEGACY_MAX_VCPUS; vcpu_id++)
+                    rc = control_singlestep(xch, domain_id, vcpu_id, 0);
+
+            } else {
+                rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, ~0ull, 0);
+                rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, START_PFN,
+                                       (xenaccess->max_gpfn - START_PFN) );
+            }
 
             shutting_down = 1;
         }
@@ -500,7 +596,7 @@ int main(int argc, char *argv[])
                 }
 
                 printf("PAGE ACCESS: %c%c%c for GFN %"PRIx64" (offset %06"
-                       PRIx64") gla %016"PRIx64" (valid: %c; fault in gpt: %c; fault with gla: %c) (vcpu %u)\n",
+                       PRIx64") gla %016"PRIx64" (valid: %c; fault in gpt: %c; fault with gla: %c) (vcpu %u, altp2m view %u)\n",
                        (req.u.mem_access.flags & MEM_ACCESS_R) ? 'r' : '-',
                        (req.u.mem_access.flags & MEM_ACCESS_W) ? 'w' : '-',
                        (req.u.mem_access.flags & MEM_ACCESS_X) ? 'x' : '-',
@@ -510,9 +606,20 @@ int main(int argc, char *argv[])
                        (req.u.mem_access.flags & MEM_ACCESS_GLA_VALID) ? 'y' : 'n',
                        (req.u.mem_access.flags & MEM_ACCESS_FAULT_IN_GPT) ? 'y' : 'n',
                        (req.u.mem_access.flags & MEM_ACCESS_FAULT_WITH_GLA) ? 'y': 'n',
-                       req.vcpu_id);
+                       req.vcpu_id,
+                       req.altp2m_idx);
 
-                if ( default_access != after_first_access )
+                if ( altp2m && req.flags & VM_EVENT_FLAG_ALTERNATE_P2M)
+                {
+                    DPRINTF("\tSwitching back to default view!\n");
+
+                    rsp.reason = req.reason;
+                    rsp.flags = req.flags;
+                    rsp.altp2m_idx = 0;
+
+                    control_singlestep(xch, domain_id, rsp.vcpu_id, 1);
+                }
+                else if ( default_access != after_first_access )
                 {
                     rc = xc_set_mem_access(xch, domain_id, after_first_access,
                                            req.u.mem_access.gfn, 1);
@@ -525,7 +632,6 @@ int main(int argc, char *argv[])
                     }
                 }
 
-
                 rsp.u.mem_access.gfn = req.u.mem_access.gfn;
                 break;
             case VM_EVENT_REASON_SOFTWARE_BREAKPOINT:
@@ -546,6 +652,23 @@ int main(int argc, char *argv[])
                 }
 
                 break;
+            case VM_EVENT_REASON_SINGLESTEP:
+                printf("Singlestep: rip=%016"PRIx64", vcpu %d\n",
+                       req.regs.x86.rip,
+                       req.vcpu_id);
+
+                if ( altp2m )
+                {
+                    printf("\tSwitching altp2m to view %u!\n", altp2m_view_id);
+
+                    rsp.reason = VM_EVENT_REASON_MEM_ACCESS;
+                    rsp.flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
+                    rsp.altp2m_idx = altp2m_view_id;
+                }
+
+                control_singlestep(xch, domain_id, req.vcpu_id, 0);
+
+                break;
             default:
                 fprintf(stderr, "UNKNOWN REASON CODE %d\n", req.reason);
             }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 09/15] x86/altp2m: alternate p2m memory events.
  2015-07-10  0:52 ` [PATCH v4 09/15] x86/altp2m: alternate p2m memory events Ed White
@ 2015-07-10  1:01   ` Lengyel, Tamas
  0 siblings, 0 replies; 51+ messages in thread
From: Lengyel, Tamas @ 2015-07-10  1:01 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Xen-devel, Jan Beulich, Andrew Cooper, Daniel De Graaf


> diff --git a/xen/include/public/vm_event.h b/xen/include/public/vm_event.h
> index 577e971..6dfa9db 100644
> --- a/xen/include/public/vm_event.h
> +++ b/xen/include/public/vm_event.h
> @@ -47,6 +47,16 @@
>  #define VM_EVENT_FLAG_VCPU_PAUSED     (1 << 0)
>  /* Flags to aid debugging mem_event */
>  #define VM_EVENT_FLAG_FOREIGN         (1 << 1)
> +/*
> + * This flag can be set in a request or a response
> + *
> + * On a request, indicates that the event occurred in the alternate p2m
> + * specified by the altp2m_idx request field.
> + *
> + * On a response, indicates that the VCPU should resume in the alternate
> + * p2m specified by the altp2m_idx response field if possible.
> + */
> +#define VM_EVENT_FLAG_ALTERNATE_P2M   (1 << 2)
>

This will now collide with staging following Razvan's patch that recently
got merged. Otherwise:

Acked-by: Tamas K Lengyel <tlengyel@novetta.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 15/15] tools/xen-access: altp2m testcases
  2015-07-10  0:52 ` [PATCH v4 15/15] tools/xen-access: altp2m testcases Ed White
@ 2015-07-10  1:35   ` Lengyel, Tamas
  2015-07-11  6:06     ` Razvan Cojocaru
  2015-07-10  8:50   ` Ian Campbell
  1 sibling, 1 reply; 51+ messages in thread
From: Lengyel, Tamas @ 2015-07-10  1:35 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Xen-devel, Jan Beulich, Andrew Cooper, Daniel De Graaf


> @@ -546,6 +652,23 @@ int main(int argc, char *argv[])
>                  }
>
>                  break;
> +            case VM_EVENT_REASON_SINGLESTEP:
> +                printf("Singlestep: rip=%016"PRIx64", vcpu %d\n",
> +                       req.regs.x86.rip,
> +                       req.vcpu_id);
> +
> +                if ( altp2m )
> +                {
> +                    printf("\tSwitching altp2m to view %u!\n", altp2m_view_id);
> +
> +                    rsp.reason = VM_EVENT_REASON_MEM_ACCESS;
>

So this was a workaround for v3 of the series that is no longer necessary;
it's cleaner to set the same reason on the response as was on the request.
It's not against any rule, so the code is still correct and works, it's just
not best practice. If there is another round of the series, it could be
fixed then.
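
Concretely, a sketch of the suggested change in the singlestep handler
(names as in the patch):

    rsp.reason = req.reason;  /* echo the request's reason back */
    rsp.flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
    rsp.altp2m_idx = altp2m_view_id;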

Tamas

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 14/15] tools/libxc: add support to altp2m hvmops
  2015-07-10  0:52 ` [PATCH v4 14/15] tools/libxc: add support to altp2m hvmops Ed White
@ 2015-07-10  8:46   ` Ian Campbell
  0 siblings, 0 replies; 51+ messages in thread
From: Ian Campbell @ 2015-07-10  8:46 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson,
	xen-devel, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Thu, 2015-07-09 at 17:52 -0700, Ed White wrote:
> From: Tamas K Lengyel <tlengyel@novetta.com>
> 
> Wrappers to issue altp2m hvmops.
> 
> Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
> Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

These all appear to be valid wrappings of the hypercall interfaces, so
if the h/v folks are fine with the interface itself:
        Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 15/15] tools/xen-access: altp2m testcases
  2015-07-10  0:52 ` [PATCH v4 15/15] tools/xen-access: altp2m testcases Ed White
  2015-07-10  1:35   ` Lengyel, Tamas
@ 2015-07-10  8:50   ` Ian Campbell
  2015-07-10  8:55     ` Wei Liu
  1 sibling, 1 reply; 51+ messages in thread
From: Ian Campbell @ 2015-07-10  8:50 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson,
	xen-devel, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Thu, 2015-07-09 at 17:52 -0700, Ed White wrote:
> From: Tamas K Lengyel <tlengyel@novetta.com>
> 
> Working altp2m test-case. Extended the test tool to support singlestepping
> to better highlight the core feature of altp2m view switching.

Is this the only higher level tool integration which is required for
this feature? I was expecting to see at a minimum some libxl/xl
integration for enabling/disabling the feature on a per domain basis
since AIUI it is a feature which is (or can be) exposed to the guest.

From looking at the 00 patch it seems like the ability for a domain to
do altp2m on itself is perhaps not included in this iteration of the
series, is that correct? Is that functionality disabled by default or
simply not present?

Ian.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-10  0:52 ` [PATCH v4 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
@ 2015-07-10  8:53   ` Wei Liu
  2015-07-10 17:32   ` George Dunlap
  1 sibling, 0 replies; 51+ messages in thread
From: Wei Liu @ 2015-07-10  8:53 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	xen-devel, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Thu, Jul 09, 2015 at 05:52:30PM -0700, Ed White wrote:
> The altp2mhvm and nestedhvm parameters are mutually
> exclusive and cannot be set together.
> 
> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> for the hypervisor bits.

Drop "for the hypervisor bits" if you happen to resend.

For tools:

Acked-by: Wei Liu <wei.liu2@citrix.com>

I actually discovered a minor problem that I should have mentioned in the
last iteration, but since time is short I will fix that up for you once
this series is applied.

Wei.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 15/15] tools/xen-access: altp2m testcases
  2015-07-10  8:50   ` Ian Campbell
@ 2015-07-10  8:55     ` Wei Liu
  2015-07-10  9:12       ` Wei Liu
  2015-07-10  9:20       ` Ian Campbell
  0 siblings, 2 replies; 51+ messages in thread
From: Wei Liu @ 2015-07-10  8:55 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson,
	Ed White, xen-devel, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

On Fri, Jul 10, 2015 at 09:50:25AM +0100, Ian Campbell wrote:
> On Thu, 2015-07-09 at 17:52 -0700, Ed White wrote:
> > From: Tamas K Lengyel <tlengyel@novetta.com>
> > 
> > Working altp2m test-case. Extended the test tool to support singlestepping
> > to better highlight the core feature of altp2m view switching.
> 
> Is this the only higher level tool integration which is required for
> this feature? I was expecting to see at a minimum some libxl/xl
> integration for enabling/disabling the feature on a per domain basis
> since AIUI it is a feature which is (or can be) exposed to the guest.

There is xl/libxl integration in one patch, which is trivial at the
moment.

As I understand it there will be more libxl patches for this feature,
but they are not included in this series at the moment.

> 
> From looking at the 00 patch it seems like the ability for a domain to
> do altp2m on itself is perhaps not included in this iteration of the
> series, is that correct? Is that functionality disabled by default or
> simply not present?
> 

Disabled by default in libxl.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 03/15] VMX: implement suppress #VE.
  2015-07-10  0:52 ` [PATCH v4 03/15] VMX: implement suppress #VE Ed White
@ 2015-07-10  9:09   ` Jan Beulich
  2015-07-10 19:22     ` Sahita, Ravi
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2015-07-10  9:09 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
> @@ -1134,6 +1151,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
>          p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
>      }
>  
> +    table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
> +
> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
> +        table[i].suppress_ve = 1;
> +
> +    unmap_domain_page(table);

See my comments/questions on v3. I find it irritating for new patch
versions to be sent without addressing comments on the previous
one (verbally or by adjusting code).

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 15/15] tools/xen-access: altp2m testcases
  2015-07-10  8:55     ` Wei Liu
@ 2015-07-10  9:12       ` Wei Liu
  2015-07-10  9:20       ` Ian Campbell
  1 sibling, 0 replies; 51+ messages in thread
From: Wei Liu @ 2015-07-10  9:12 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson,
	Ed White, xen-devel, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

On Fri, Jul 10, 2015 at 09:55:52AM +0100, Wei Liu wrote:
> On Fri, Jul 10, 2015 at 09:50:25AM +0100, Ian Campbell wrote:
> > On Thu, 2015-07-09 at 17:52 -0700, Ed White wrote:
> > > From: Tamas K Lengyel <tlengyel@novetta.com>
> > > 
> > > Working altp2m test-case. Extended the test tool to support singlestepping
> > > to better highlight the core feature of altp2m view switching.
> > 
> > Is this the only higher level tool integration which is required for
> > this feature? I was expecting to see at a minimum some libxl/xl
> > integration for enabling/disabling the feature on a per domain basis
> > since AIUI it is a feature which is (or can be) exposed to the guest.
> 
> There is xl/libxl integration in one patch, which is trivial at the
> moment.
> 
> As I understand it there will be more libxl patches for this feature,
> but they are not included in this series at the moment.

Oh, I could be wrong on this. Maybe these last two patches are the ones
he talked about. I will let Ed and Ravi confirm.

Wei.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-10  0:52 ` [PATCH v4 05/15] x86/altp2m: basic data structures and support routines Ed White
@ 2015-07-10  9:13   ` Jan Beulich
  0 siblings, 0 replies; 51+ messages in thread
From: Jan Beulich @ 2015-07-10  9:13 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
> Add the basic data structures needed to support alternate p2m's and
> the functions to initialise them and tear them down.
> 
> Although Intel hardware can handle 512 EPTP's per hardware thread
> concurrently, only 10 per domain are supported in this patch for
> performance reasons.
> 
> The iterator in hap_enable() does need to handle 512, so that is now
> uint16_t.
> 
> This change also splits the p2m lock into one lock type for altp2m's
> and another type for all other p2m's. The purpose of this is to place
> the altp2m list lock between the types, so the list lock can be
> acquired whilst holding the host p2m lock.
> 
> Signed-off-by: Ed White <edmund.h.white@intel.com>

Same here - none of my comments on v3 got addressed in any way.
Yes, I sent them only yesterday, but still hours before you sent the
new version.

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 15/15] tools/xen-access: altp2m testcases
  2015-07-10  8:55     ` Wei Liu
  2015-07-10  9:12       ` Wei Liu
@ 2015-07-10  9:20       ` Ian Campbell
  1 sibling, 0 replies; 51+ messages in thread
From: Ian Campbell @ 2015-07-10  9:20 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ravi Sahita, George Dunlap, Ian Jackson, Tim Deegan, Ed White,
	xen-devel, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Fri, 2015-07-10 at 09:55 +0100, Wei Liu wrote:
> On Fri, Jul 10, 2015 at 09:50:25AM +0100, Ian Campbell wrote:
> > On Thu, 2015-07-09 at 17:52 -0700, Ed White wrote:
> > > From: Tamas K Lengyel <tlengyel@novetta.com>
> > > 
> > > Working altp2m test-case. Extended the test tool to support singlestepping
> > > to better highlight the core feature of altp2m view switching.
> > 
> > Is this the only higher level tool integration which is required for
> > this feature? I was expecting to see at a minimum some libxl/xl
> > integration for enabling/disabling the feature on a per domain basis
> > since AIUI it is a feature which is (or can be) exposed to the guest.
> 
> There is xl/libxl integration in one patch, which is trivial at the
> moment.

Thanks, armed with that I found it. I missed it the first time because
the subject didn't mention the tools.

Ian.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-10  0:52 ` [PATCH v4 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
@ 2015-07-10  9:30   ` Jan Beulich
  2015-07-11 20:01     ` Sahita, Ravi
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2015-07-10  9:30 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
> @@ -3234,6 +3256,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>              update_guest_eip();
>          break;
>  
> +    case EXIT_REASON_VMFUNC:
> +        if ( vmx_vmfunc_intercept(regs) == X86EMUL_EXCEPTION )
> +            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
> +        else
> +            update_guest_eip();
> +        break;

How about X86EMUL_UNHANDLEABLE and X86EMUL_RETRY? As said
before, either get this right, or simply fold the relatively pointless
helper into here.
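
For illustration, a minimal sketch of what covering all return codes could
look like here (assuming the usual vmexit-handler context with v == current;
the right policy for RETRY/UNHANDLEABLE is exactly the open question):

    case EXIT_REASON_VMFUNC:
        switch ( vmx_vmfunc_intercept(regs) )
        {
        case X86EMUL_OKAY:
            update_guest_eip();
            break;
        case X86EMUL_EXCEPTION:
            hvm_inject_hw_exception(TRAP_invalid_op,
                                    HVM_DELIVER_NO_ERROR_CODE);
            break;
        case X86EMUL_RETRY:
            /* Nothing to do - the instruction will be re-executed. */
            break;
        default: /* X86EMUL_UNHANDLEABLE */
            domain_crash(v->domain);
            break;
        }
        break;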

> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -3816,8 +3816,9 @@ x86_emulate(
>          struct segment_register reg;
>          unsigned long base, limit, cr0, cr0w;
>  
> -        if ( modrm == 0xdf ) /* invlpga */
> +        switch( modrm )
>          {
> +        case 0xdf: /* invlpga AMD */
>              generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
>              generate_exception_if(!mode_ring0(), EXC_GP, 0);
>              fail_if(ops->invlpg == NULL);

The diff now looks much better. Yet I don't see why you added "AMD"
to the comment - we don't elsewhere note that certain instructions
are vendor specific (and really which ones are also changes over time,
see RDTSCP for a prominent example).

> @@ -3825,10 +3826,7 @@ x86_emulate(
>                                     ctxt)) )
>                  goto done;
>              break;
> -        }
> -
> -        if ( modrm == 0xf9 ) /* rdtscp */
> -        {
> +        case 0xf9: /* rdtscp */ {
>              uint64_t tsc_aux;
>              fail_if(ops->read_msr == NULL);
>              if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
> @@ -3836,7 +3834,19 @@ x86_emulate(
>              _regs.ecx = (uint32_t)tsc_aux;
>              goto rdtsc;
>          }
> +        case 0xd4: /* vmfunc */
> +            generate_exception_if(lock_prefix | rep_prefix() | (vex.pfx == vex_66),
> +                                  EXC_UD, -1);
> +            fail_if(ops->vmfunc == NULL);
> +            if ( (rc = ops->vmfunc(ctxt) != X86EMUL_OKAY) )
> +                goto done;
> +            break;
> +        default:
> +            goto continue_grp7;
> +        }
> +        break;
>  
> + continue_grp7:

Already when first looking at this I disliked this label. Looking at it
again, I'd really like to see it gone: RDTSCP handling already ends
in a goto. Since the only VMFUNC currently implemented doesn't
modify any register state either, its handling could end in an
unconditional "goto done" too for now. And INVLPG, not modifying
any register state, could follow suit.

And even if you really wanted to cater for future VMFUNCs to alter
register state, I'd still like this ugliness to be avoided - e.g. by
setting rc to a negative value before the switch and break-ing
afterwards when it's no longer negative.
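
A rough sketch of that shape (illustrative only, reusing the names from the
patch context):

    rc = -1;                              /* sentinel: not handled yet */
    switch ( modrm )
    {
    case 0xd4: /* vmfunc */
        generate_exception_if(lock_prefix | rep_prefix() | (vex.pfx == vex_66),
                              EXC_UD, -1);
        fail_if(ops->vmfunc == NULL);
        rc = ops->vmfunc(ctxt);
        break;
    /* other grp7 cases that fully handle the insn set rc likewise */
    }
    if ( rc >= 0 )
    {
        if ( rc != X86EMUL_OKAY )
            goto done;
        break;                            /* leaves the opcode switch */
    }
    /* rc still negative: fall through to the remaining grp7 decoding */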

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 08/15] x86/altp2m: add control of suppress_ve.
  2015-07-10  0:52 ` [PATCH v4 08/15] x86/altp2m: add control of suppress_ve Ed White
@ 2015-07-10  9:39   ` Jan Beulich
  2015-07-10 11:11     ` George Dunlap
  2015-07-10 17:02   ` George Dunlap
  1 sibling, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2015-07-10  9:39 UTC (permalink / raw)
  To: George Dunlap, Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, Andrew Cooper, Ian Jackson,
	xen-devel, tlengyel, Daniel De Graaf

>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
> @@ -1528,16 +1528,17 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>      vm_event_request_t *req;
>      int rc;
>      unsigned long eip = guest_cpu_user_regs()->eip;
> +    bool_t sve;
>  
>      /* First, handle rx2rw conversion automatically.
>       * These calls to p2m->set_entry() must succeed: we have the gfn
>       * locked and just did a successful get_entry(). */
>      gfn_lock(p2m, gfn, 0);
> -    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
> +    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
>  
>      if ( npfec.write_access && p2ma == p2m_access_rx2rw ) 
>      {
> -        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw);
> +        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw, sve);
>          ASSERT(rc == 0);
>          gfn_unlock(p2m, gfn, 0);
>          return 1;
> @@ -1546,7 +1547,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>      {
>          ASSERT(npfec.write_access || npfec.read_access || npfec.insn_fetch);
>          rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
> -                            p2mt, p2m_access_rwx);
> +                            p2mt, p2m_access_rwx, -1);

So why -1 here ...

> @@ -1566,14 +1567,14 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>          else
>          {
>              gfn_lock(p2m, gfn, 0);
> -            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
> +            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
>              if ( p2ma != p2m_access_n2rwx )
>              {
>                  /* A listener is not required, so clear the access
>                   * restrictions.  This set must succeed: we have the
>                   * gfn locked and just did a successful get_entry(). */
>                  rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
> -                                    p2mt, p2m_access_rwx);
> +                                    p2mt, p2m_access_rwx, sve);

... but sve here, when -1 means "retain current setting" anyway?
(Same question applies elsewhere.)

While I'd prefer to see this simplified, either way
Reviewed-by: Jan Beulich <jbeulich@suse.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 10/15] x86/altp2m: add remaining support routines.
  2015-07-10  0:52 ` [PATCH v4 10/15] x86/altp2m: add remaining support routines Ed White
@ 2015-07-10  9:41   ` Jan Beulich
  2015-07-10 17:15     ` George Dunlap
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2015-07-10  9:41 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
> Add the remaining routines required to support enabling the alternate
> p2m functionality.

So despite George's comments on v3 these are still all disconnected
from their users...

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-10  0:52 ` [PATCH v4 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
@ 2015-07-10 10:01   ` Jan Beulich
  2015-07-10 22:03     ` Sahita, Ravi
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2015-07-10 10:01 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -6443,6 +6443,144 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
>          break;
>      }
>  
> +    case HVMOP_altp2m:
> +    {
> +        struct xen_hvm_altp2m_op a;
> +        struct domain *d = NULL;
> +
> +        if ( copy_from_guest(&a, arg, 1) )
> +            return -EFAULT;
> +
> +        switch ( a.cmd )
> +        {
> +        case HVMOP_altp2m_get_domain_state:
> +        case HVMOP_altp2m_set_domain_state:
> +        case HVMOP_altp2m_create_p2m:
> +        case HVMOP_altp2m_destroy_p2m:
> +        case HVMOP_altp2m_switch_p2m:
> +        case HVMOP_altp2m_set_mem_access:
> +        case HVMOP_altp2m_change_gfn:
> +            d = rcu_lock_domain_by_any_id(a.domain);
> +            if ( d == NULL )
> +                return -ESRCH;
> +
> +            if ( !is_hvm_domain(d) || !hvm_altp2m_supported() )
> +                rc = -EINVAL;
> +
> +            break;
> +        case HVMOP_altp2m_vcpu_enable_notify:
> +
> +            break;

The blank line ought to go ahead of the case label.

> +        default:
> +            return -ENOSYS;
> +
> +            break;

Bogus (unreachable) break.

> +        }
> +
> +        if ( !rc )
> +        {
> +            switch ( a.cmd )
> +            {
> +            case HVMOP_altp2m_get_domain_state:
> +                a.u.domain_state.state = altp2m_active(d);
> +                rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
> +
> +                break;
> +            case HVMOP_altp2m_set_domain_state:
> +            {
> +                struct vcpu *v;
> +                bool_t ostate;
> +                
> +                if ( nestedhvm_enabled(d) )
> +                {
> +                    rc = -EINVAL;
> +                    break;
> +                }
> +
> +                ostate = d->arch.altp2m_active;
> +                d->arch.altp2m_active = !!a.u.domain_state.state;
> +
> +                /* If the alternate p2m state has changed, handle appropriately */
> +                if ( d->arch.altp2m_active != ostate &&
> +                     (ostate || !(rc = p2m_init_altp2m_by_id(d, 0))) )
> +                {
> +                    for_each_vcpu( d, v )
> +                    {
> +                        if ( !ostate )
> +                            altp2m_vcpu_initialise(v);
> +                        else
> +                            altp2m_vcpu_destroy(v);
> +                    }
> +
> +                    if ( ostate )
> +                        p2m_flush_altp2m(d);
> +                }
> +
> +                break;
> +            }
> +            default:
> +            {

Pointless brace.

> +                if ( !(d ? d : current->domain)->arch.altp2m_active )

This is bogus: d is NULL if and only if altp2m_vcpu_enable_notify,
i.e. I don't see why you can't just use current->domain inside that
case (and you really do). That would then also eliminate the need
for this redundant and obfuscating switch() nesting you use.
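
I.e. something like (a sketch only; the error value is an assumption):

    case HVMOP_altp2m_vcpu_enable_notify:
    {
        struct domain *curr_d = current->domain;

        if ( !curr_d->arch.altp2m_active )
        {
            rc = -EOPNOTSUPP;  /* assumed for illustration */
            break;
        }
        /* ... set up #VE notification for the current vcpu ... */
        break;
    }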

> +
> +struct xen_hvm_altp2m_set_mem_access {
> +    /* view */
> +    uint16_t view;
> +    /* Memory type */
> +    uint16_t hvmmem_access; /* xenmem_access_t */
> +    uint8_t pad[4];
> +    /* gfn */
> +    uint64_t gfn;
> +};
> +typedef struct xen_hvm_altp2m_set_mem_access 
> xen_hvm_altp2m_set_mem_access_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_set_mem_access_t);
> +
> +struct xen_hvm_altp2m_change_gfn {
> +    /* view */
> +    uint16_t view;
> +    uint8_t pad[6];
> +    /* old gfn */
> +    uint64_t old_gfn;
> +    /* new gfn, INVALID_GFN (~0UL) means revert */
> +    uint64_t new_gfn;
> +};
> +typedef struct xen_hvm_altp2m_change_gfn xen_hvm_altp2m_change_gfn_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_change_gfn_t);
> +
> +struct xen_hvm_altp2m_op {
> +    uint32_t cmd;
> +/* Get/set the altp2m state for a domain */
> +#define HVMOP_altp2m_get_domain_state     1
> +#define HVMOP_altp2m_set_domain_state     2
> +/* Set the current VCPU to receive altp2m event notifications */
> +#define HVMOP_altp2m_vcpu_enable_notify   3
> +/* Create a new view */
> +#define HVMOP_altp2m_create_p2m           4
> +/* Destroy a view */
> +#define HVMOP_altp2m_destroy_p2m          5
> +/* Switch view for an entire domain */
> +#define HVMOP_altp2m_switch_p2m           6
> +/* Notify that a page of memory is to have specific access types */
> +#define HVMOP_altp2m_set_mem_access       7
> +/* Change a p2m entry to have a different gfn->mfn mapping */
> +#define HVMOP_altp2m_change_gfn           8
> +    domid_t domain;
> +    uint8_t pad[2];

While you added padding fields as asked for, you still don't verify
them to be zero on input.
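
The kind of check being asked for (a sketch; the concern is that padding
accepted as "don't care" today can never be given meaning later without
silently changing behaviour for existing callers):

    if ( a.pad[0] || a.pad[1] )
        return -EINVAL;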

Afaict all other questions raised on v3 still stand.

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 08/15] x86/altp2m: add control of suppress_ve.
  2015-07-10  9:39   ` Jan Beulich
@ 2015-07-10 11:11     ` George Dunlap
  2015-07-10 11:49       ` Jan Beulich
  0 siblings, 1 reply; 51+ messages in thread
From: George Dunlap @ 2015-07-10 11:11 UTC (permalink / raw)
  To: Jan Beulich, Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, Andrew Cooper, Ian Jackson,
	xen-devel, tlengyel, Daniel De Graaf

On 07/10/2015 10:39 AM, Jan Beulich wrote:
>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>> @@ -1528,16 +1528,17 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>>      vm_event_request_t *req;
>>      int rc;
>>      unsigned long eip = guest_cpu_user_regs()->eip;
>> +    bool_t sve;
>>  
>>      /* First, handle rx2rw conversion automatically.
>>       * These calls to p2m->set_entry() must succeed: we have the gfn
>>       * locked and just did a successful get_entry(). */
>>      gfn_lock(p2m, gfn, 0);
>> -    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
>> +    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
>>  
>>      if ( npfec.write_access && p2ma == p2m_access_rx2rw ) 
>>      {
>> -        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw);
>> +        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw, sve);
>>          ASSERT(rc == 0);
>>          gfn_unlock(p2m, gfn, 0);
>>          return 1;
>> @@ -1546,7 +1547,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>>      {
>>          ASSERT(npfec.write_access || npfec.read_access || npfec.insn_fetch);
>>          rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
>> -                            p2mt, p2m_access_rwx);
>> +                            p2mt, p2m_access_rwx, -1);
> 
> So why -1 here ...
> 
>> @@ -1566,14 +1567,14 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>>          else
>>          {
>>              gfn_lock(p2m, gfn, 0);
>> -            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
>> +            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
>>              if ( p2ma != p2m_access_n2rwx )
>>              {
>>                  /* A listener is not required, so clear the access
>>                   * restrictions.  This set must succeed: we have the
>>                   * gfn locked and just did a successful get_entry(). */
>>                  rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
>> -                                    p2mt, p2m_access_rwx);
>> +                                    p2mt, p2m_access_rwx, sve);
> 
> ... but sve here, when -1 means "retain current setting" anyway?
> (Same question applies elsewhere.)

This is my code. I considered whether to use -1 here, but since we're
reading and retaining gfn, mfn, and p2mt, it seemed more consistent
stylistically to just read and re-write it along with the others.

In any case I don't have strong opinions.

 -G

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 08/15] x86/altp2m: add control of suppress_ve.
  2015-07-10 11:11     ` George Dunlap
@ 2015-07-10 11:49       ` Jan Beulich
  2015-07-10 11:56         ` George Dunlap
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2015-07-10 11:49 UTC (permalink / raw)
  To: George Dunlap
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, Andrew Cooper, Ian Jackson,
	Ed White, xen-devel, tlengyel, Daniel De Graaf

>>> On 10.07.15 at 13:11, <george.dunlap@eu.citrix.com> wrote:
> On 07/10/2015 10:39 AM, Jan Beulich wrote:
>>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>>> @@ -1528,16 +1528,17 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>>>      vm_event_request_t *req;
>>>      int rc;
>>>      unsigned long eip = guest_cpu_user_regs()->eip;
>>> +    bool_t sve;
>>>  
>>>      /* First, handle rx2rw conversion automatically.
>>>       * These calls to p2m->set_entry() must succeed: we have the gfn
>>>       * locked and just did a successful get_entry(). */
>>>      gfn_lock(p2m, gfn, 0);
>>> -    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
>>> +    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
>>>  
>>>      if ( npfec.write_access && p2ma == p2m_access_rx2rw ) 
>>>      {
>>> -        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw);
>>> +        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw, sve);
>>>          ASSERT(rc == 0);
>>>          gfn_unlock(p2m, gfn, 0);
>>>          return 1;
>>> @@ -1546,7 +1547,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>>>      {
>>>          ASSERT(npfec.write_access || npfec.read_access || npfec.insn_fetch);
>>>          rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
>>> -                            p2mt, p2m_access_rwx);
>>> +                            p2mt, p2m_access_rwx, -1);
>> 
>> So why -1 here ...
>> 
>>> @@ -1566,14 +1567,14 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>>>          else
>>>          {
>>>              gfn_lock(p2m, gfn, 0);
>>> -            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
>>> +            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
>>>              if ( p2ma != p2m_access_n2rwx )
>>>              {
>>>                  /* A listener is not required, so clear the access
>>>                   * restrictions.  This set must succeed: we have the
>>>                   * gfn locked and just did a successful get_entry(). */
>>>                  rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
>>> -                                    p2mt, p2m_access_rwx);
>>> +                                    p2mt, p2m_access_rwx, sve);
>> 
>> ... but sve here, when -1 means "retain current setting" anyway?
>> (Same question applies elsewhere.)
> 
> This is my code. I considered whether to use -1 here, but since we're
> reading and retaining gfn, mfn, and p2mt, it seemed more consistent
> stylistically to just read and re-write it along with the others.
> 
> In any case I don't have strong opinions.

I'd suggest the other mechanism so one can easily see which places
actually want to change the flag (or set it to a specific value). But in
the end it's your call which way to go.

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 08/15] x86/altp2m: add control of suppress_ve.
  2015-07-10 11:49       ` Jan Beulich
@ 2015-07-10 11:56         ` George Dunlap
  0 siblings, 0 replies; 51+ messages in thread
From: George Dunlap @ 2015-07-10 11:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, Andrew Cooper, Ian Jackson,
	Ed White, xen-devel, tlengyel, Daniel De Graaf

On 07/10/2015 12:49 PM, Jan Beulich wrote:
>>>> On 10.07.15 at 13:11, <george.dunlap@eu.citrix.com> wrote:
>> On 07/10/2015 10:39 AM, Jan Beulich wrote:
>>>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>>>> @@ -1528,16 +1528,17 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>>>>      vm_event_request_t *req;
>>>>      int rc;
>>>>      unsigned long eip = guest_cpu_user_regs()->eip;
>>>> +    bool_t sve;
>>>>  
>>>>      /* First, handle rx2rw conversion automatically.
>>>>       * These calls to p2m->set_entry() must succeed: we have the gfn
>>>>       * locked and just did a successful get_entry(). */
>>>>      gfn_lock(p2m, gfn, 0);
>>>> -    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
>>>> +    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
>>>>  
>>>>      if ( npfec.write_access && p2ma == p2m_access_rx2rw ) 
>>>>      {
>>>> -        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw);
>>>> +        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw, sve);
>>>>          ASSERT(rc == 0);
>>>>          gfn_unlock(p2m, gfn, 0);
>>>>          return 1;
>>>> @@ -1546,7 +1547,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>>>>      {
>>>>          ASSERT(npfec.write_access || npfec.read_access || npfec.insn_fetch);
>>>>          rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
>>>> -                            p2mt, p2m_access_rwx);
>>>> +                            p2mt, p2m_access_rwx, -1);
>>>
>>> So why -1 here ...
>>>
>>>> @@ -1566,14 +1567,14 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>>>>          else
>>>>          {
>>>>              gfn_lock(p2m, gfn, 0);
>>>> -            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
>>>> +            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
>>>>              if ( p2ma != p2m_access_n2rwx )
>>>>              {
>>>>                  /* A listener is not required, so clear the access
>>>>                   * restrictions.  This set must succeed: we have the
>>>>                   * gfn locked and just did a successful get_entry(). */
>>>>                  rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
>>>> -                                    p2mt, p2m_access_rwx);
>>>> +                                    p2mt, p2m_access_rwx, sve);
>>>
>>> ... but sve here, when -1 means "retain current setting" anyway?
>>> (Same question applies elsewhere.)
>>
>> This is my code. I considered whether to use -1 here, but since we're
>> reading and retaining gfn, mfn, and p2mt, it seemed more consistent
>> stylistically to just read and re-write it along with the others.
>>
>> In any case I don't have strong opinions.
> 
> I'd suggest the other mechanism so one can easily see which places
> actually want to change the flag (or set it to a specific value). But in
> the end it's your call which way to go.

That does make sense.  In fact, if there were "leave the default"
options for the other values (mfn, p2mt, &c) it would be clearer that
only the page order and the access rights were being changed here.
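
Hypothetically (no such markers exist today) a call site could then read:

    /* P2M_KEEP_* are hypothetical "retain current value" markers. */
    rc = p2m->set_entry(p2m, gfn, P2M_KEEP_MFN, PAGE_ORDER_4K,
                        P2M_KEEP_TYPE, p2m_access_rwx, -1 /* keep sve */);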

Anyway that's a minor issue at this point.  Ed / Ravi, feel free to
change it according to Jan's suggestion, or leave it as it is for now.

 -George

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-10  0:52 ` [PATCH v4 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
@ 2015-07-10 16:48   ` George Dunlap
  0 siblings, 0 replies; 51+ messages in thread
From: George Dunlap @ 2015-07-10 16:48 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Tim Deegan, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Fri, Jul 10, 2015 at 1:52 AM, Ed White <edmund.h.white@intel.com> wrote:
> Implement and hook up the code to enable VMX support of VMFUNC and #VE.
>
> VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Jun Nakajima <jun.nakajima@intel.com>
> ---
>  xen/arch/x86/hvm/vmx/vmx.c | 138 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 138 insertions(+)
>
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 07527dd..28afdaa 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -56,6 +56,7 @@
>  #include <asm/debugger.h>
>  #include <asm/apic.h>
>  #include <asm/hvm/nestedhvm.h>
> +#include <asm/hvm/altp2m.h>
>  #include <asm/event.h>
>  #include <asm/monitor.h>
>  #include <public/arch-x86/cpuid.h>
> @@ -1763,6 +1764,104 @@ static void vmx_enable_msr_exit_interception(struct domain *d)
>                                           MSR_TYPE_W);
>  }
>
> +static void vmx_vcpu_update_eptp(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    struct p2m_domain *p2m = NULL;
> +    struct ept_data *ept;
> +
> +    if ( altp2m_active(d) )
> +        p2m = p2m_get_altp2m(v);
> +    if ( !p2m )
> +        p2m = p2m_get_hostp2m(d);
> +
> +    ept = &p2m->ept;
> +    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
> +
> +    vmx_vmcs_enter(v);
> +
> +    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
> +
> +    if ( v->arch.hvm_vmx.secondary_exec_control &
> +        SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
> +        __vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);
> +
> +    vmx_vmcs_exit(v);
> +}
> +
> +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
> +
> +    if ( !cpu_has_vmx_vmfunc )
> +        return;
> +
> +    if ( cpu_has_vmx_virt_exceptions )
> +        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
> +
> +    vmx_vmcs_enter(v);
> +
> +    if ( !d->is_dying && altp2m_active(d) )
> +    {
> +        v->arch.hvm_vmx.secondary_exec_control |= mask;
> +        __vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
> +        __vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
> +
> +        if ( cpu_has_vmx_virt_exceptions )
> +        {
> +            p2m_type_t t;
> +            mfn_t mfn;
> +
> +            mfn = get_gfn_query_unlocked(d, gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
> +
> +            if ( mfn_x(mfn) != INVALID_MFN )
> +                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
> +            else
> +                mask &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
> +        }
> +    }
> +    else
> +        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
> +
> +    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> +        v->arch.hvm_vmx.secondary_exec_control);
> +
> +    vmx_vmcs_exit(v);
> +}
> +
> +static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
> +{
> +    bool_t rc = 0;
> +    ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) != INVALID_GFN ?
> +        hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn), 0) : NULL;
> +
> +    if ( !veinfo )
> +        return 0;
> +
> +    if ( veinfo->semaphore != 0 )
> +        goto out;
> +
> +    rc = 1;
> +
> +    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
> +    veinfo->semaphore = ~0l;
> +    veinfo->eptp_index = vcpu_altp2m(v).p2midx;
> +
> +    vmx_vmcs_enter(v);
> +    __vmread(EXIT_QUALIFICATION, &veinfo->exit_qualification);
> +    __vmread(GUEST_LINEAR_ADDRESS, &veinfo->gla);
> +    __vmread(GUEST_PHYSICAL_ADDRESS, &veinfo->gpa);
> +    vmx_vmcs_exit(v);
> +
> +    hvm_inject_hw_exception(TRAP_virtualisation,
> +                            HVM_DELIVER_NO_ERROR_CODE);
> +
> +out:
> +    hvm_unmap_guest_frame(veinfo, 0);
> +    return rc;
> +}
> +
>  static struct hvm_function_table __initdata vmx_function_table = {
>      .name                 = "VMX",
>      .cpu_up_prepare       = vmx_cpu_up_prepare,
> @@ -1822,6 +1921,9 @@ static struct hvm_function_table __initdata vmx_function_table = {
>      .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
>      .hypervisor_cpuid_leaf = vmx_hypervisor_cpuid_leaf,
>      .enable_msr_exit_interception = vmx_enable_msr_exit_interception,
> +    .ap2m_vcpu_update_eptp = vmx_vcpu_update_eptp,
> +    .ap2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
> +    .ap2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,

Just a bit of feedback for future patch series: This would have been a
lot easier to review if these hooks, and the wrappers which call them,
had all been added in a single patch, rather than having the hooks &
wrappers added in the previous patch and the functions implemented in
this patch.

(I'm trying to focus on the p2m-related stuff, so I'm just skimming this one.)

 -George

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 08/15] x86/altp2m: add control of suppress_ve.
  2015-07-10  0:52 ` [PATCH v4 08/15] x86/altp2m: add control of suppress_ve Ed White
  2015-07-10  9:39   ` Jan Beulich
@ 2015-07-10 17:02   ` George Dunlap
  2015-07-11 21:29     ` Sahita, Ravi
  1 sibling, 1 reply; 51+ messages in thread
From: George Dunlap @ 2015-07-10 17:02 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Tim Deegan, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Fri, Jul 10, 2015 at 1:52 AM, Ed White <edmund.h.white@intel.com> wrote:
> From: George Dunlap <george.dunlap@eu.citrix.com>
>
> The existing ept_set_entry() and ept_get_entry() routines are extended
> to optionally set/get suppress_ve.  Passing -1 will set suppress_ve on
> new p2m entries, or retain suppress_ve flag on existing entries.
>
> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

So because my patch contained code written by Ed, and this patch now
contains code written by you, I'm pretty sure that a strict observance
of protocol would require his SoB to be retained (as I think I did
when I sent it), and your SoB to be added, for copyright purposes.

In this particular case a lawyer might argue that the code snippets
in question were so small or obvious as to be uncopyrightable, but it
doesn't really hurt to be a bit more strict than we need to be. :-)

Also, a description of what you had changed could have helped speed
review. (It seems you've only added the bits requested to the p2m-pt
implementation?)

Finally, one thing I missed in the discussion before...

> @@ -1528,16 +1528,17 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>      vm_event_request_t *req;
>      int rc;
>      unsigned long eip = guest_cpu_user_regs()->eip;
> +    bool_t sve;
>
>      /* First, handle rx2rw conversion automatically.
>       * These calls to p2m->set_entry() must succeed: we have the gfn
>       * locked and just did a successful get_entry(). */
>      gfn_lock(p2m, gfn, 0);
> -    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
> +    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
>
>      if ( npfec.write_access && p2ma == p2m_access_rx2rw )
>      {
> -        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw);
> +        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw, sve);
>          ASSERT(rc == 0);
>          gfn_unlock(p2m, gfn, 0);
>          return 1;
> @@ -1546,7 +1547,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>      {
>          ASSERT(npfec.write_access || npfec.read_access || npfec.insn_fetch);
>          rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
> -                            p2mt, p2m_access_rwx);
> +                            p2mt, p2m_access_rwx, -1);
>          ASSERT(rc == 0);
>      }
>      gfn_unlock(p2m, gfn, 0);

This definitely should not be "sve" in the 'if' clause and "-1" in the
'else' clause.  Because I was looking only at the patch, I missed that
when Jan raised the issue before. That's a mistake on my part -- would
you mind doing as Jan suggests, and just making these "NULL" and "-1"
throughout this file?
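
I.e. a sketch of the requested shape for this file:

    /* Don't read the current suppress_ve setting at all ... */
    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, NULL);
    /* ... and retain it unchanged on writes. */
    rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
                        p2mt, p2m_access_rwx, -1);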

Thanks!
 -George

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 10/15] x86/altp2m: add remaining support routines.
  2015-07-10  9:41   ` Jan Beulich
@ 2015-07-10 17:15     ` George Dunlap
  2015-07-11 20:20       ` Sahita, Ravi
  0 siblings, 1 reply; 51+ messages in thread
From: George Dunlap @ 2015-07-10 17:15 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ravi Sahita, Wei Liu, Tim Deegan, Ian Jackson, Ed White,
	xen-devel, Andrew Cooper, tlengyel, Daniel De Graaf

On Fri, Jul 10, 2015 at 10:41 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>> Add the remaining routines required to support enabling the alternate
>> p2m functionality.
>
> So despite George's comments on v3 these are still all disconnected
> from their users...

I did try to make it clear that I wasn't asking for things to be moved
around for this patch series;  "for future reference" was meant to
mean "for future patch series".  Reorganizing the patch series at this
point is a double-edged sword -- it might make the new version in
isolation easier to review, but it makes it more difficult to compare
what's been said about previous patch series; additionally it takes
time away from addressing more substantive comments.

 -George

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-10  0:52 ` [PATCH v4 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
  2015-07-10  8:53   ` Wei Liu
@ 2015-07-10 17:32   ` George Dunlap
  2015-07-10 22:12     ` Sahita, Ravi
  1 sibling, 1 reply; 51+ messages in thread
From: George Dunlap @ 2015-07-10 17:32 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Tim Deegan, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Fri, Jul 10, 2015 at 1:52 AM, Ed White <edmund.h.white@intel.com> wrote:
> The altp2mhvm and nestedhvm parameters are mutually
> exclusive and cannot be set together.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> for the hypervisor bits.
> ---
>  docs/man/xl.cfg.pod.5           | 12 ++++++++++++
>  tools/libxl/libxl.h             |  6 ++++++
>  tools/libxl/libxl_create.c      |  1 +
>  tools/libxl/libxl_dom.c         |  2 ++
>  tools/libxl/libxl_types.idl     |  1 +
>  tools/libxl/xl_cmdimpl.c        | 10 ++++++++++
>  xen/arch/x86/hvm/hvm.c          | 23 +++++++++++++++++++++--
>  xen/include/public/hvm/params.h |  5 ++++-
>  8 files changed, 57 insertions(+), 3 deletions(-)
>
> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> index a3e0e2e..18afd46 100644
> --- a/docs/man/xl.cfg.pod.5
> +++ b/docs/man/xl.cfg.pod.5
> @@ -1035,6 +1035,18 @@ enabled by default and you should usually omit it. It may be necessary
>  to disable the HPET in order to improve compatibility with guest
>  Operating Systems (X86 only)
>
> +=item B<altp2mhvm=BOOLEAN>
> +
> +Enables or disables hvm guest access to alternate-p2m capability.
> +Alternate-p2m allows a guest to manage multiple p2m guest physical
> +"memory views" (as opposed to a single p2m). This option is
> +disabled by default and is available only to hvm domains.
> +You may want this option if you want to access-control/isolate
> +access to specific guest physical memory pages accessed by
> +the guest, e.g. for HVM domain memory introspection or
> +for isolation/access-control of memory between components within
> +a single guest hvm domain.
> +
>  =item B<nestedhvm=BOOLEAN>
>
>  Enable or disables guest access to hardware virtualisation features,
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index a1c5d15..17222e7 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -745,6 +745,12 @@ typedef struct libxl__ctx libxl_ctx;
>  #define LIBXL_HAVE_BUILDINFO_SERIAL_LIST 1
>
>  /*
> + * LIBXL_HAVE_ALTP2M
> + * If this is defined, then libxl supports alternate p2m functionality.
> + */
> +#define LIBXL_HAVE_ALTP2M 1
> +
> +/*
>   * LIBXL_HAVE_REMUS
>   * If this is defined, then libxl supports remus.
>   */
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index f366a09..418deee 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -329,6 +329,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>          libxl_defbool_setdefault(&b_info->u.hvm.hpet,               true);
>          libxl_defbool_setdefault(&b_info->u.hvm.vpt_align,          true);
>          libxl_defbool_setdefault(&b_info->u.hvm.nested_hvm,         false);
> +        libxl_defbool_setdefault(&b_info->u.hvm.altp2m,             false);
>          libxl_defbool_setdefault(&b_info->u.hvm.usb,                false);
>          libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci,   true);
>
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index bdc0465..2f1200e 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -300,6 +300,8 @@ static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
>                      libxl_defbool_val(info->u.hvm.vpt_align));
>      xc_hvm_param_set(handle, domid, HVM_PARAM_NESTEDHVM,
>                      libxl_defbool_val(info->u.hvm.nested_hvm));
> +    xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2MHVM,
> +                    libxl_defbool_val(info->u.hvm.altp2m));
>  }
>
>  int libxl__build_pre(libxl__gc *gc, uint32_t domid,
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index e1632fa..fb641fe 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -440,6 +440,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>                                         ("mmio_hole_memkb",  MemKB),
>                                         ("timer_mode",       libxl_timer_mode),
>                                         ("nested_hvm",       libxl_defbool),
> +                                       ("altp2m",           libxl_defbool),
>                                         ("smbios_firmware",  string),
>                                         ("acpi_firmware",    string),
>                                         ("nographic",        libxl_defbool),
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index c858068..43cf6bf 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -1500,6 +1500,16 @@ static void parse_config_data(const char *config_source,
>
>          xlu_cfg_get_defbool(config, "nestedhvm", &b_info->u.hvm.nested_hvm, 0);
>
> +        xlu_cfg_get_defbool(config, "altp2mhvm", &b_info->u.hvm.altp2m, 0);
> +
> +        if (!libxl_defbool_is_default(b_info->u.hvm.nested_hvm) &&
> +            libxl_defbool_val(b_info->u.hvm.nested_hvm) &&
> +            !libxl_defbool_is_default(b_info->u.hvm.altp2m) &&
> +            libxl_defbool_val(b_info->u.hvm.altp2m)) {
> +            fprintf(stderr, "ERROR: nestedhvm and altp2mhvm cannot be used together\n");
> +            exit(1);
> +        }
> +
>          xlu_cfg_replace_string(config, "smbios_firmware",
>                                 &b_info->u.hvm.smbios_firmware, 0);
>          xlu_cfg_replace_string(config, "acpi_firmware",
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 23cd507..6e59e68 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -5750,6 +5750,7 @@ static int hvm_allow_set_param(struct domain *d,
>      case HVM_PARAM_VIRIDIAN:
>      case HVM_PARAM_IOREQ_SERVER_PFN:
>      case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
> +    case HVM_PARAM_ALTP2MHVM:

Sorry I missed this -- when I was skimming the reviews of the previous
version, I assumed that when Wei asked "hvm" to be taken out because
it was redundant, it would include the HVM at the end of this
HVM_PARAM.  It seems fairly redundant to have HVM both at the
beginning and the end.  (Note that argument doesn't apply to
NESTEDHVM, because in that case, it's the HVM itself which is nested.)

(I also have an idea this may have been discussed before, but I can't
find the relevant conversation now, so let me know if I'm covering old
ground...)

 -George

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 03/15] VMX: implement suppress #VE.
  2015-07-10  9:09   ` Jan Beulich
@ 2015-07-10 19:22     ` Sahita, Ravi
  0 siblings, 0 replies; 51+ messages in thread
From: Sahita, Ravi @ 2015-07-10 19:22 UTC (permalink / raw)
  To: Jan Beulich, White, Edmund H
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, tlengyel, Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Friday, July 10, 2015 2:10 AM
>
>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>> @@ -1134,6 +1151,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
>>          p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
>>      }
>>
>> +    table =
>> + map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
>> +
>> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
>> +        table[i].suppress_ve = 1;
>> +
>> +    unmap_domain_page(table);
>
>See my comments/questions on v3. I find it irritating for new patch versions to
>be sent without addressing comments on the previous one (verbally or by
>adjusting code).
>
>Jan

Apologies - we did go through all your comments on v3 for the patches affected, and I will respond to them on those threads next.
We posted v4 because we felt we had addressed the major code changes needed on two of the patches based on feedback, and we wanted to post that before the deadline.

Ravi

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-10 10:01   ` Jan Beulich
@ 2015-07-10 22:03     ` Sahita, Ravi
  2015-07-13  7:25       ` Jan Beulich
  0 siblings, 1 reply; 51+ messages in thread
From: Sahita, Ravi @ 2015-07-10 22:03 UTC (permalink / raw)
  To: Jan Beulich, White, Edmund H
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Friday, July 10, 2015 3:01 AM
>
>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -6443,6 +6443,144 @@ long do_hvm_op(unsigned long op,
>XEN_GUEST_HANDLE_PARAM(void) arg)
>>          break;
>>      }
>>
>> +    case HVMOP_altp2m:
>> +    {
>> +        struct xen_hvm_altp2m_op a;
>> +        struct domain *d = NULL;
>> +
>> +        if ( copy_from_guest(&a, arg, 1) )
>> +            return -EFAULT;
>> +
>> +        switch ( a.cmd )
>> +        {
>> +        case HVMOP_altp2m_get_domain_state:
>> +        case HVMOP_altp2m_set_domain_state:
>> +        case HVMOP_altp2m_create_p2m:
>> +        case HVMOP_altp2m_destroy_p2m:
>> +        case HVMOP_altp2m_switch_p2m:
>> +        case HVMOP_altp2m_set_mem_access:
>> +        case HVMOP_altp2m_change_gfn:
>> +            d = rcu_lock_domain_by_any_id(a.domain);
>> +            if ( d == NULL )
>> +                return -ESRCH;
>> +
>> +            if ( !is_hvm_domain(d) || !hvm_altp2m_supported() )
>> +                rc = -EINVAL;
>> +
>> +            break;
>> +        case HVMOP_altp2m_vcpu_enable_notify:
>> +
>> +            break;
>
>The blank line ought to go ahead of the case label.

ok

>
>> +        default:
>> +            return -ENOSYS;
>> +
>> +            break;
>
>Bogus (unreachable) break.

We wanted to keep this so that if someone removes the error return, they don't introduce an invalid fall-through.
But OK with removing it if you prefer.

>
>> +        }
>> +
>> +        if ( !rc )
>> +        {
>> +            switch ( a.cmd )
>> +            {
>> +            case HVMOP_altp2m_get_domain_state:
>> +                a.u.domain_state.state = altp2m_active(d);
>> +                rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
>> +
>> +                break;
>> +            case HVMOP_altp2m_set_domain_state:
>> +            {
>> +                struct vcpu *v;
>> +                bool_t ostate;
>> +
>> +                if ( nestedhvm_enabled(d) )
>> +                {
>> +                    rc = -EINVAL;
>> +                    break;
>> +                }
>> +
>> +                ostate = d->arch.altp2m_active;
>> +                d->arch.altp2m_active = !!a.u.domain_state.state;
>> +
>> +                /* If the alternate p2m state has changed, handle appropriately */
>> +                if ( d->arch.altp2m_active != ostate &&
>> +                     (ostate || !(rc = p2m_init_altp2m_by_id(d, 0))) )
>> +                {
>> +                    for_each_vcpu( d, v )
>> +                    {
>> +                        if ( !ostate )
>> +                            altp2m_vcpu_initialise(v);
>> +                        else
>> +                            altp2m_vcpu_destroy(v);
>> +                    }
>> +
>> +                    if ( ostate )
>> +                        p2m_flush_altp2m(d);
>> +                }
>> +
>> +                break;
>> +            }
>> +            default:
>> +            {
>
>Pointless brace.

ok

>
>> +                if ( !(d ? d : current->domain)->arch.altp2m_active )
>
>This is bogus: d is NULL if and only if altp2m_vcpu_enable_notify, i.e. I don't
>see why you can't just use current->domain inside that case (and you really
>do). That would then also eliminate the need for this redundant and
>obfuscating switch() nesting you use.
>

We need to check that the target domain is in altp2m mode for all the following sub-ops.
If we removed this check, we would need to repeat it in each of the following cases.
Andrew wanted to refactor and pull common code up, and this is one case of that for hvm_op.

>> +
>> +struct xen_hvm_altp2m_set_mem_access {
>> +    /* view */
>> +    uint16_t view;
>> +    /* Memory type */
>> +    uint16_t hvmmem_access; /* xenmem_access_t */
>> +    uint8_t pad[4];
>> +    /* gfn */
>> +    uint64_t gfn;
>> +};
>> +typedef struct xen_hvm_altp2m_set_mem_access
>> xen_hvm_altp2m_set_mem_access_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_set_mem_access_t);
>> +
>> +struct xen_hvm_altp2m_change_gfn {
>> +    /* view */
>> +    uint16_t view;
>> +    uint8_t pad[6];
>> +    /* old gfn */
>> +    uint64_t old_gfn;
>> +    /* new gfn, INVALID_GFN (~0UL) means revert */
>> +    uint64_t new_gfn;
>> +};
>> +typedef struct xen_hvm_altp2m_change_gfn
>xen_hvm_altp2m_change_gfn_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_change_gfn_t);
>> +
>> +struct xen_hvm_altp2m_op {
>> +    uint32_t cmd;
>> +/* Get/set the altp2m state for a domain */
>> +#define HVMOP_altp2m_get_domain_state     1
>> +#define HVMOP_altp2m_set_domain_state     2
>> +/* Set the current VCPU to receive altp2m event notifications */
>> +#define HVMOP_altp2m_vcpu_enable_notify   3
>> +/* Create a new view */
>> +#define HVMOP_altp2m_create_p2m           4
>> +/* Destroy a view */
>> +#define HVMOP_altp2m_destroy_p2m          5
>> +/* Switch view for an entire domain */
>> +#define HVMOP_altp2m_switch_p2m           6
>> +/* Notify that a page of memory is to have specific access types */
>> +#define HVMOP_altp2m_set_mem_access       7
>> +/* Change a p2m entry to have a different gfn->mfn mapping */
>> +#define HVMOP_altp2m_change_gfn           8
>> +    domid_t domain;
>> +    uint8_t pad[2];
>
>While you added padding fields as asked for, you still don't verify them to be
>zero on input.

Specifically, what were you thinking we need to do here? It would also be good if you could explain the underlying concern. (Thanks.)

>
>Afaict all other questions raised on v3 still stand.
>

Those were addressed in another thread.

Thanks,
Ravi

>Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-10 17:32   ` George Dunlap
@ 2015-07-10 22:12     ` Sahita, Ravi
  2015-07-14 11:50       ` George Dunlap
  0 siblings, 1 reply; 51+ messages in thread
From: Sahita, Ravi @ 2015-07-10 22:12 UTC (permalink / raw)
  To: George Dunlap, White, Edmund H
  Cc: Sahita, Ravi, Wei Liu, Tim Deegan, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

>From: dunlapg@gmail.com [mailto:dunlapg@gmail.com] On Behalf Of George
>Dunlap
>Sent: Friday, July 10, 2015 10:32 AM
>
>On Fri, Jul 10, 2015 at 1:52 AM, Ed White <edmund.h.white@intel.com>
>wrote:
>> The altp2mhvm and nestedhvm parameters are mutually exclusive and
>> cannot be set together.
>>
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> for the
>hypervisor bits.
>> ---
>>  docs/man/xl.cfg.pod.5           | 12 ++++++++++++
>>  tools/libxl/libxl.h             |  6 ++++++
>>  tools/libxl/libxl_create.c      |  1 +
>>  tools/libxl/libxl_dom.c         |  2 ++
>>  tools/libxl/libxl_types.idl     |  1 +
>>  tools/libxl/xl_cmdimpl.c        | 10 ++++++++++
>>  xen/arch/x86/hvm/hvm.c          | 23 +++++++++++++++++++++--
>>  xen/include/public/hvm/params.h |  5 ++++-
>>  8 files changed, 57 insertions(+), 3 deletions(-)
>>
>> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 index
>> a3e0e2e..18afd46 100644
>> --- a/docs/man/xl.cfg.pod.5
>> +++ b/docs/man/xl.cfg.pod.5
>> @@ -1035,6 +1035,18 @@ enabled by default and you should usually omit
>> it. It may be necessary  to disable the HPET in order to improve
>> compatibility with guest  Operating Systems (X86 only)
>>
>> +=item B<altp2mhvm=BOOLEAN>
>> +
>> +Enables or disables hvm guest access to alternate-p2m capability.
>> +Alternate-p2m allows a guest to manage multiple p2m guest physical
>> +"memory views" (as opposed to a single p2m). This option is disabled
>> +by default and is available only to hvm domains.
>> +You may want this option if you want to access-control/isolate access
>> +to specific guest physical memory pages accessed by the guest, e.g.
>> +for HVM domain memory introspection or for isolation/access-control
>> +of memory between components within a single guest hvm domain.
>> +
>>  =item B<nestedhvm=BOOLEAN>
>>
>>  Enable or disables guest access to hardware virtualisation features,
>> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index
>> a1c5d15..17222e7 100644
>> --- a/tools/libxl/libxl.h
>> +++ b/tools/libxl/libxl.h
>> @@ -745,6 +745,12 @@ typedef struct libxl__ctx libxl_ctx;  #define
>> LIBXL_HAVE_BUILDINFO_SERIAL_LIST 1
>>
>>  /*
>> + * LIBXL_HAVE_ALTP2M
>> + * If this is defined, then libxl supports alternate p2m functionality.
>> + */
>> +#define LIBXL_HAVE_ALTP2M 1
>> +
>> +/*
>>   * LIBXL_HAVE_REMUS
>>   * If this is defined, then libxl supports remus.
>>   */
>> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
>> index f366a09..418deee 100644
>> --- a/tools/libxl/libxl_create.c
>> +++ b/tools/libxl/libxl_create.c
>> @@ -329,6 +329,7 @@ int libxl__domain_build_info_setdefault(libxl__gc
>*gc,
>>          libxl_defbool_setdefault(&b_info->u.hvm.hpet,               true);
>>          libxl_defbool_setdefault(&b_info->u.hvm.vpt_align,          true);
>>          libxl_defbool_setdefault(&b_info->u.hvm.nested_hvm,         false);
>> +        libxl_defbool_setdefault(&b_info->u.hvm.altp2m,             false);
>>          libxl_defbool_setdefault(&b_info->u.hvm.usb,                false);
>>          libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci,   true);
>>
>> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index
>> bdc0465..2f1200e 100644
>> --- a/tools/libxl/libxl_dom.c
>> +++ b/tools/libxl/libxl_dom.c
>> @@ -300,6 +300,8 @@ static void hvm_set_conf_params(xc_interface
>*handle, uint32_t domid,
>>                      libxl_defbool_val(info->u.hvm.vpt_align));
>>      xc_hvm_param_set(handle, domid, HVM_PARAM_NESTEDHVM,
>>                      libxl_defbool_val(info->u.hvm.nested_hvm));
>> +    xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2MHVM,
>> +                    libxl_defbool_val(info->u.hvm.altp2m));
>>  }
>>
>>  int libxl__build_pre(libxl__gc *gc, uint32_t domid, diff --git
>> a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index
>> e1632fa..fb641fe 100644
>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -440,6 +440,7 @@ libxl_domain_build_info =
>Struct("domain_build_info",[
>>                                         ("mmio_hole_memkb",  MemKB),
>>                                         ("timer_mode",       libxl_timer_mode),
>>                                         ("nested_hvm",       libxl_defbool),
>> +                                       ("altp2m",           libxl_defbool),
>>                                         ("smbios_firmware",  string),
>>                                         ("acpi_firmware",    string),
>>                                         ("nographic",        libxl_defbool),
>> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c index
>> c858068..43cf6bf 100644
>> --- a/tools/libxl/xl_cmdimpl.c
>> +++ b/tools/libxl/xl_cmdimpl.c
>> @@ -1500,6 +1500,16 @@ static void parse_config_data(const char
>> *config_source,
>>
>>          xlu_cfg_get_defbool(config, "nestedhvm",
>> &b_info->u.hvm.nested_hvm, 0);
>>
>> +        xlu_cfg_get_defbool(config, "altp2mhvm",
>> + &b_info->u.hvm.altp2m, 0);
>> +
>> +        if (!libxl_defbool_is_default(b_info->u.hvm.nested_hvm) &&
>> +            libxl_defbool_val(b_info->u.hvm.nested_hvm) &&
>> +            !libxl_defbool_is_default(b_info->u.hvm.altp2m) &&
>> +            libxl_defbool_val(b_info->u.hvm.altp2m)) {
>> +            fprintf(stderr, "ERROR: nestedhvm and altp2mhvm cannot be used together\n");
>> +            exit(1);
>> +        }
>> +
>>          xlu_cfg_replace_string(config, "smbios_firmware",
>>                                 &b_info->u.hvm.smbios_firmware, 0);
>>          xlu_cfg_replace_string(config, "acpi_firmware", diff --git
>> a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index
>> 23cd507..6e59e68 100644
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -5750,6 +5750,7 @@ static int hvm_allow_set_param(struct domain *d,
>>      case HVM_PARAM_VIRIDIAN:
>>      case HVM_PARAM_IOREQ_SERVER_PFN:
>>      case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
>> +    case HVM_PARAM_ALTP2MHVM:
>
>Sorry I missed this -- when I was skimming the reviews of the previous
>version, I assumed that when Wei asked "hvm" to be taken out because it
>was redundant, it would include the HVM at the end of this HVM_PARAM.  It
>seems fairly redundant to have HVM both at the beginning and the end.
>(Note that argument doesn't apply to NESTEDHVM, because in that case, it's
>the HVM itself which is nested.)
>
>(I also have an idea this may have been discussed before, but I can't find the
>relevant conversation now, so let me know if I'm covering old
>ground...)


Wei acked this earlier this morning.

Ravi

>
> -George

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 15/15] tools/xen-access: altp2m testcases
  2015-07-10  1:35   ` Lengyel, Tamas
@ 2015-07-11  6:06     ` Razvan Cojocaru
  0 siblings, 0 replies; 51+ messages in thread
From: Razvan Cojocaru @ 2015-07-11  6:06 UTC (permalink / raw)
  To: Lengyel, Tamas, Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson,
	Xen-devel, Jan Beulich, Andrew Cooper, Daniel De Graaf

On 07/10/2015 04:35 AM, Lengyel, Tamas wrote:
> 
>     @@ -546,6 +652,23 @@ int main(int argc, char *argv[])
>                      }
> 
>                      break;
>     +            case VM_EVENT_REASON_SINGLESTEP:
>     +                printf("Singlestep: rip=%016"PRIx64", vcpu %d\n",
>     +                       req.regs.x86.rip,
>     +                       req.vcpu_id);
>     +
>     +                if ( altp2m )
>     +                {
>     +                    printf("\tSwitching altp2m to view %u!\n",
>     altp2m_view_id);
>     +
>     +                    rsp.reason = VM_EVENT_REASON_MEM_ACCESS;
> 
> 
> So this was a workaround for v3 of the series that is no longer
> necessary - it's probably cleaner to set the same reason in the
> response as was set in the request. It's not against any rule, so the
> code is still correct and works; it's just not best practice. So in
> case there is another round on the series, it could be fixed then.
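> 
> Concretely, all it would take is something like:
> 
>     rsp.reason = req.reason;    /* mirror the request's reason */
> 
> instead of hard-coding VM_EVENT_REASON_MEM_ACCESS in the singlestep path.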

With or without that change (but preferably with it):

Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-10  9:30   ` Jan Beulich
@ 2015-07-11 20:01     ` Sahita, Ravi
  2015-07-11 21:25       ` Sahita, Ravi
  2015-07-13  7:13       ` Jan Beulich
  0 siblings, 2 replies; 51+ messages in thread
From: Sahita, Ravi @ 2015-07-11 20:01 UTC (permalink / raw)
  To: Jan Beulich, White, Edmund H
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Friday, July 10, 2015 2:31 AM
>
>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>> @@ -3234,6 +3256,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>              update_guest_eip();
>>          break;
>>
>> +    case EXIT_REASON_VMFUNC:
>> +        if ( vmx_vmfunc_intercept(regs) == X86EMUL_EXCEPTION )
>> +            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
>> +        else
>> +            update_guest_eip();
>> +        break;
>
>How about X86EMUL_UNHANDLEABLE and X86EMUL_RETRY? As said before,
>either get this right, or simply fold the relatively pointless helper into here.

Sure, I can add the other error conditions, but note that they will be handled as EXCEPTION. Let me explain the point of the relatively pointless :-) helper: it was to keep the interface complete, so that if someone in the future wanted to handle VMFUNC exits (perhaps for lazily managing the EPTP list in nesting scenarios), they could do so by extending vmx_vmfunc_intercept. I can also add a comment there; will that be sufficient? (I'm trying to avoid another revision after I revise it to add the other error conditions as stated.)
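
For reference, a minimal sketch of the revised handler I have in mind, with every non-OKAY result (UNHANDLEABLE, RETRY, EXCEPTION) collapsed to #UD for now:

    case EXIT_REASON_VMFUNC:
        /* Any failure from the helper is surfaced to the guest as #UD. */
        if ( vmx_vmfunc_intercept(regs) != X86EMUL_OKAY )
            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
        else
            update_guest_eip();
        break;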

>
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -3816,8 +3816,9 @@ x86_emulate(
>>          struct segment_register reg;
>>          unsigned long base, limit, cr0, cr0w;
>>
>> -        if ( modrm == 0xdf ) /* invlpga */
>> +        switch( modrm )
>>          {
>> +        case 0xdf: /* invlpga AMD */
>>              generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
>>              generate_exception_if(!mode_ring0(), EXC_GP, 0);
>>              fail_if(ops->invlpg == NULL);
>
>The diff now looks much better. Yet I don't see why you added "AMD"
>to the comment - we don't elsewhere note that certain instructions are
>vendor specific (and really which ones are also changes over time, see RDTSCP
>for a prominent example).
>

I thought it would be better to call out instructions that are unique to a specific vendor's CPUs, but I can remove it.

>> @@ -3825,10 +3826,7 @@ x86_emulate(
>>                                     ctxt)) )
>>                  goto done;
>>              break;
>> -        }
>> -
>> -        if ( modrm == 0xf9 ) /* rdtscp */
>> -        {
>> +        case 0xf9: /* rdtscp */ {
>>              uint64_t tsc_aux;
>>              fail_if(ops->read_msr == NULL);
>>              if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
>> @@ -3836,7 +3834,19 @@ x86_emulate(
>>              _regs.ecx = (uint32_t)tsc_aux;
>>              goto rdtsc;
>>          }
>> +        case 0xd4: /* vmfunc */
>> +            generate_exception_if(lock_prefix | rep_prefix() | (vex.pfx == vex_66),
>> +                                  EXC_UD, -1);
>> +            fail_if(ops->vmfunc == NULL);
>> +            if ( (rc = ops->vmfunc(ctxt) != X86EMUL_OKAY) )
>> +                goto done;
>> +            break;
>> +        default:
>> +            goto continue_grp7;
>> +        }
>> +        break;
>>
>> + continue_grp7:
>
>Already when first looking at this I disliked this label. Looking at it again, I'd
>really like to see it gone: RDTSCP handling already ends in a goto. Since the
>only VMFUNC currently implemented doesn't modify any register state
>either, its handling could end in an unconditional "goto done" too for now.
>And INVLPG, not modifying any register state, could follow suit.
>

Sure - no issues with that.

Thanks,
Ravi

>And even if you really wanted to cater for future VMFUNCs to alter register
>state, I'd still like this ugliness to be avoided - e.g. by setting rc to a negative
>value before the switch and break-ing afterwards when it's no longer
>negative.
>
>Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 10/15] x86/altp2m: add remaining support routines.
  2015-07-10 17:15     ` George Dunlap
@ 2015-07-11 20:20       ` Sahita, Ravi
  0 siblings, 0 replies; 51+ messages in thread
From: Sahita, Ravi @ 2015-07-11 20:20 UTC (permalink / raw)
  To: George Dunlap, Jan Beulich
  Cc: Sahita, Ravi, Wei Liu, Tim Deegan, Ian Jackson, White, Edmund H,
	xen-devel, Andrew Cooper, tlengyel, Daniel De Graaf

>From: dunlapg@gmail.com [mailto:dunlapg@gmail.com] On Behalf Of George
>Dunlap
>Sent: Friday, July 10, 2015 10:15 AM
>To: Jan Beulich
>Cc: White, Edmund H; Tim Deegan; Sahita, Ravi; Wei Liu; Andrew Cooper; Ian
>Jackson; xen-devel@lists.xen.org; tlengyel@novetta.com; Daniel De Graaf
>Subject: Re: [Xen-devel] [PATCH v4 10/15] x86/altp2m: add remaining support
>routines.
>
>On Fri, Jul 10, 2015 at 10:41 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>>> Add the remaining routines required to support enabling the alternate
>>> p2m functionality.
>>
>> So despite George's comments on v3 these are still all disconnected
>> from their users...
>
>I did try to make it clear that I wasn't asking for things to be moved around for
>this patch series;  "for future reference" was meant to mean "for future patch
>series".  Reorganizing the patch series at this point is a double-edged sword --
>it might make the new version in isolation easier to review, but it makes it
>more difficult to compare what's been said about previous patch series;
>additionally it takes time away from addressing more substantive comments.

Thanks for the clarification, George. If it's OK, we'd like to leave this organization as is for now, for exactly the reasons you mention.

Ravi

>
> -George

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-11 20:01     ` Sahita, Ravi
@ 2015-07-11 21:25       ` Sahita, Ravi
  2015-07-13  7:18         ` Jan Beulich
  2015-07-13  7:13       ` Jan Beulich
  1 sibling, 1 reply; 51+ messages in thread
From: Sahita, Ravi @ 2015-07-11 21:25 UTC (permalink / raw)
  To: Jan Beulich, White, Edmund H
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>From: Sahita, Ravi
>Sent: Saturday, July 11, 2015 1:01 PM
>
>>From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Friday, July 10, 2015 2:31 AM
>>
>>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>>> @@ -3234,6 +3256,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>>              update_guest_eip();
>>>          break;
>>>
>>> +    case EXIT_REASON_VMFUNC:
>>> +        if ( vmx_vmfunc_intercept(regs) == X86EMUL_EXCEPTION )
>>> +            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
>>> +        else
>>> +            update_guest_eip();
>>> +        break;
>>
>>How about X86EMUL_UNHANDLEABLE and X86EMUL_RETRY? As said before,
>>either get this right, or simply fold the relatively pointless helper into here.
>
>Sure I can add the other error conditions but note that they will be handled as
>EXCEPTION. Let me explain the point of the relatively pointless :-) helper was
>to have the interface complete so that if someone in the future wanted to
>handle VMFUNC exits (perhaps for lazily managing EPTP list for nesting
>scenarios) they could do that by extending the vmx_vmfunc_intercept. I can
>also add a comment there - Will that be sufficient? (I'm trying to avoid another
>revision after I revise it to add the other exception conditions as stated)
>
>>
>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> @@ -3816,8 +3816,9 @@ x86_emulate(
>>>          struct segment_register reg;
>>>          unsigned long base, limit, cr0, cr0w;
>>>
>>> -        if ( modrm == 0xdf ) /* invlpga */
>>> +        switch( modrm )
>>>          {
>>> +        case 0xdf: /* invlpga AMD */
>>>              generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
>>>              generate_exception_if(!mode_ring0(), EXC_GP, 0);
>>>              fail_if(ops->invlpg == NULL);
>>
>>The diff now looks much better. Yet I don't see why you added "AMD"
>>to the comment - we don't elsewhere note that certain instructions are
>>vendor specific (and really which ones are also changes over time, see
>>RDTSCP for a prominent example).
>>
>
>I thought it would be better to specify instructions that are unique to a specific
>CPU.
>But I can remove it.
>
>>> @@ -3825,10 +3826,7 @@ x86_emulate(
>>>                                     ctxt)) )
>>>                  goto done;
>>>              break;
>>> -        }
>>> -
>>> -        if ( modrm == 0xf9 ) /* rdtscp */
>>> -        {
>>> +        case 0xf9: /* rdtscp */ {
>>>              uint64_t tsc_aux;
>>>              fail_if(ops->read_msr == NULL);
>>>              if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
>>> @@ -3836,7 +3834,19 @@ x86_emulate(
>>>              _regs.ecx = (uint32_t)tsc_aux;
>>>              goto rdtsc;
>>>          }
>>> +        case 0xd4: /* vmfunc */
>>> +            generate_exception_if(lock_prefix | rep_prefix() | (vex.pfx == vex_66),
>>> +                                  EXC_UD, -1);
>>> +            fail_if(ops->vmfunc == NULL);
>>> +            if ( (rc = ops->vmfunc(ctxt) != X86EMUL_OKAY) )
>>> +                goto done;
>>> +            break;
>>> +        default:
>>> +            goto continue_grp7;
>>> +        }
>>> +        break;
>>>
>>> + continue_grp7:
>>
>>Already when first looking at this I disliked this label. Looking at it
>>again, I'd really like to see it gone: RDTSCP handling already ends in
>>a goto. Since the only VMFUNC currently implemented doesn't modify any
>>register state either, its handling could end in an unconditional "goto done"
>too for now.
>>And INVLPG, not modifying any register state, could follow suit.
>>
>
>Sure - no issues with that.
>

On second thought, I cannot really use a "goto done" for these two cases, since that would skip the single-step tracing check that's performed in the existing flow.
So I can add a new label before the tracing check, use "goto writeback" (with the dst.type switch there being a wasted check), or keep the flow as is. Which would you prefer?

Thanks,
Ravi

>Thanks,
>Ravi
>
>>And even if you really wanted to cater for future VMFUNCs to alter
>>register state, I'd still like this ugliness to be avoided - e.g. by
>>setting rc to a negative value before the switch and break-ing
>>afterwards when it's no longer negative.
>>
>>Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 08/15] x86/altp2m: add control of suppress_ve.
  2015-07-10 17:02   ` George Dunlap
@ 2015-07-11 21:29     ` Sahita, Ravi
  0 siblings, 0 replies; 51+ messages in thread
From: Sahita, Ravi @ 2015-07-11 21:29 UTC (permalink / raw)
  To: George Dunlap, White, Edmund H
  Cc: Sahita, Ravi, Wei Liu, Tim Deegan, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

>From: dunlapg@gmail.com [mailto:dunlapg@gmail.com] On Behalf Of George
>Dunlap
>Sent: Friday, July 10, 2015 10:03 AM
>
>On Fri, Jul 10, 2015 at 1:52 AM, Ed White <edmund.h.white@intel.com>
>wrote:
>> From: George Dunlap <george.dunlap@eu.citrix.com>
>>
>> The existing ept_set_entry() and ept_get_entry() routines are extended
>> to optionally set/get suppress_ve.  Passing -1 will set suppress_ve on
>> new p2m entries, or retain suppress_ve flag on existing entries.
>>
>> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
>
>So because my patch contained code written by Ed, and this patch now
>contains code written by you, I'm pretty sure that a strict observance of
>protocol would require his SoB to be retained (as I think I did when I sent it),
>and your SoB to be added, for copyright purposes.
>
>In this particular case a lawyer might argue that the code snippets in question
>were so small or obvious as to be uncopyrightable, but it doesn't really hurt to
>be a bit more strict than we need to be. :-)

I can add my Signed-off-by to cover both Ed's code and mine (from an Intel perspective).

>
>Also, a description of what you had changed could have helped speed review.
>(It seems you've only added the bits requested to the p2m-pt
>implementation?)

That’s correct (from discussion with Ed).

>
>Finally, one thing I missed in the discussion before...
>
>> @@ -1528,16 +1528,17 @@ bool_t p2m_mem_access_check(paddr_t gpa,
>unsigned long gla,
>>      vm_event_request_t *req;
>>      int rc;
>>      unsigned long eip = guest_cpu_user_regs()->eip;
>> +    bool_t sve;
>>
>>      /* First, handle rx2rw conversion automatically.
>>       * These calls to p2m->set_entry() must succeed: we have the gfn
>>       * locked and just did a successful get_entry(). */
>>      gfn_lock(p2m, gfn, 0);
>> -    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
>> +    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
>>
>>      if ( npfec.write_access && p2ma == p2m_access_rx2rw )
>>      {
>> -        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt,
>p2m_access_rw);
>> +        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt,
>> + p2m_access_rw, sve);
>>          ASSERT(rc == 0);
>>          gfn_unlock(p2m, gfn, 0);
>>          return 1;
>> @@ -1546,7 +1547,7 @@ bool_t p2m_mem_access_check(paddr_t gpa,
>unsigned long gla,
>>      {
>>          ASSERT(npfec.write_access || npfec.read_access || npfec.insn_fetch);
>>          rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
>> -                            p2mt, p2m_access_rwx);
>> +                            p2mt, p2m_access_rwx, -1);
>>          ASSERT(rc == 0);
>>      }
>>      gfn_unlock(p2m, gfn, 0);
>
>This definitely should not be "sve" in the 'if' clause and "-1" in the 'else' clause.
>Because I was looking only at the patch, I missed that when Jan raised the
>issue before. That's a mistake on my part -- would you mind doing as Jan
>suggests, and just making these "NULL" and "-1"
>throughout this file?
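>
>Concretely, a sketch of the substitution I'm asking for, on the rx2rw
>path above:
>
>    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, NULL);
>
>    if ( npfec.write_access && p2ma == p2m_access_rx2rw )
>    {
>        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt,
>                            p2m_access_rw, -1);
>        ...
>    }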

OK, I will make this change, but I would like you to review it. Could I send you a version on Saturday (since you are not working on Monday)?

Thanks,
Ravi

>
>Thanks!
> -George

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-11 20:01     ` Sahita, Ravi
  2015-07-11 21:25       ` Sahita, Ravi
@ 2015-07-13  7:13       ` Jan Beulich
  1 sibling, 0 replies; 51+ messages in thread
From: Jan Beulich @ 2015-07-13  7:13 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 11.07.15 at 22:01, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Friday, July 10, 2015 2:31 AM
>>
>>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>>> @@ -3234,6 +3256,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>>>              update_guest_eip();
>>>          break;
>>>
>>> +    case EXIT_REASON_VMFUNC:
>>> +        if ( vmx_vmfunc_intercept(regs) == X86EMUL_EXCEPTION )
>>> +            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
>>> +        else
>>> +            update_guest_eip();
>>> +        break;
>>
>>How about X86EMUL_UNHANDLEABLE and X86EMUL_RETRY? As said before,
>>either get this right, or simply fold the relatively pointless helper into here.
> 
> Sure I can add the other error conditions but note that they will be handled 
> as EXCEPTION.

The reason for this would need to go into ...

> Let me explain the point of the relatively pointless :-) helper 
> was to have the interface complete so that if someone in the future wanted to 
> handle VMFUNC exits (perhaps for lazily managing EPTP list for nesting 
> scenarios) they could do that by extending the vmx_vmfunc_intercept. I can 
> also add a comment there - Will that be sufficient? (I'm trying to avoid 
> another revision after I revise it to add the other exception conditions as 
> stated)

... such a comment. And yes, I'd be as fine with just a comment as
with the wrapper being folded in.
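
E.g. something along the lines of (just a sketch):

    case EXIT_REASON_VMFUNC:
        /*
         * All non-OKAY results are surfaced as #UD for now; the helper
         * exists so that future VMFUNC leaves (e.g. lazy EPTP-list
         * management for nested scenarios) have a place to hook in.
         */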

>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> @@ -3816,8 +3816,9 @@ x86_emulate(
>>>          struct segment_register reg;
>>>          unsigned long base, limit, cr0, cr0w;
>>>
>>> -        if ( modrm == 0xdf ) /* invlpga */
>>> +        switch( modrm )
>>>          {
>>> +        case 0xdf: /* invlpga AMD */
>>>              generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
>>>              generate_exception_if(!mode_ring0(), EXC_GP, 0);
>>>              fail_if(ops->invlpg == NULL);
>>
>>The diff now looks much better. Yet I don't see why you added "AMD"
>>to the comment - we don't elsewhere note that certain instructions are
>>vendor specific (and really which ones are also changes over time, see RDTSCP
>>for a prominent example).
>>
> 
> I thought it would be better to specify instructions that are unique to a 
> specific CPU.
> But I can remove it.

Yes please.

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-11 21:25       ` Sahita, Ravi
@ 2015-07-13  7:18         ` Jan Beulich
  0 siblings, 0 replies; 51+ messages in thread
From: Jan Beulich @ 2015-07-13  7:18 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 11.07.15 at 23:25, <ravi.sahita@intel.com> wrote:
>> From: Sahita, Ravi
>>Sent: Saturday, July 11, 2015 1:01 PM
>>>From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Friday, July 10, 2015 2:31 AM
>>>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>>>> @@ -3825,10 +3826,7 @@ x86_emulate(
>>>>                                     ctxt)) )
>>>>                  goto done;
>>>>              break;
>>>> -        }
>>>> -
>>>> -        if ( modrm == 0xf9 ) /* rdtscp */
>>>> -        {
>>>> +        case 0xf9: /* rdtscp */ {
>>>>              uint64_t tsc_aux;
>>>>              fail_if(ops->read_msr == NULL);
>>>>              if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
>>>> @@ -3836,7 +3834,19 @@ x86_emulate(
>>>>              _regs.ecx = (uint32_t)tsc_aux;
>>>>              goto rdtsc;
>>>>          }
>>>> +        case 0xd4: /* vmfunc */
>>>> +            generate_exception_if(lock_prefix | rep_prefix() | (vex.pfx == vex_66),
>>>> +                                  EXC_UD, -1);
>>>> +            fail_if(ops->vmfunc == NULL);
>>>> +            if ( (rc = ops->vmfunc(ctxt) != X86EMUL_OKAY) )
>>>> +                goto done;
>>>> +            break;
>>>> +        default:
>>>> +            goto continue_grp7;
>>>> +        }
>>>> +        break;
>>>>
>>>> + continue_grp7:
>>>
>>>Already when first looking at this I disliked this label. Looking at it
>>>again, I'd really like to see it gone: RDTSCP handling already ends in
>>>a goto. Since the only VMFUNC currently implemented doesn't modify any
>>>register state either, its handling could end in an unconditional "goto done"
>>too for now.
>>>And INVLPG, not modifying any register state, could follow suit.
>>>
>>
>>Sure - no issues with that.
>>
> 
> On second thoughts, I cannot really use a goto done for these 2 cases since 
> that will skip the single-step tracing check that's performed in the existing 
> flow.

Good point.

> So I can add a new label entrypoint before the tracing check, or goto 
> writeback (with the dst.type switch there being a wasted check), or I can 
> keep the flow as is - which would you prefer?

I think "goto writeback" for an insn not having any register state to
write back may end up being confusing to future readers. I.e. such
use would need to at least be annotated with a brief comment.
Whether to go that route or add a new label no_writeback or
insn_done or some such (again accompanied by a brief comment)
I'd leave up to you.
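
Roughly, the label route could look like this (a sketch only, with the
hypothetical name no_writeback and the surrounding flow abbreviated):

        case 0xd4: /* vmfunc */
            ...
            if ( (rc = ops->vmfunc(ctxt)) != X86EMUL_OKAY )
                goto done;
            goto no_writeback; /* leaf 0 alters no register state */

     writeback:
        switch ( dst.type )
        {
            /* existing register/memory writeback */
        }

     no_writeback:
        /* the single-step tracing check continues to run from here */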

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-10 22:03     ` Sahita, Ravi
@ 2015-07-13  7:25       ` Jan Beulich
  2015-07-13 23:39         ` Sahita, Ravi
  0 siblings, 1 reply; 51+ messages in thread
From: Jan Beulich @ 2015-07-13  7:25 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 11.07.15 at 00:03, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Friday, July 10, 2015 3:01 AM
>>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>>> +        default:
>>> +            return -ENOSYS;
>>> +
>>> +            break;
>>
>>Bogus (unreachable) break.
> 
> I wanted to keep this so that if someone removes the error return they
> don't cause an invalid fall-through.
> But I'm OK with removing it if you think so.

We don't (intentionally) do this anywhere else, so it should be
removed.
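
I.e. simply:

        default:
            return -ENOSYS;
        }

with nothing between the return and the closing brace.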

>>> +                if ( !(d ? d : current->domain)->arch.altp2m_active )
>>
>>This is bogus: d is NULL if and only if altp2m_vcpu_enable_notify, i.e. I don't
>>see why you can't just use current->domain inside that case (and you really
>>do). That would then also eliminate the need for this redundant and
>>obfuscating switch() nesting you use.
>>
> 
> We need to check whether the target domain is in altp2m mode for all of
> the following sub-ops; if we removed this check here, we would have to
> repeat it in each of those cases.
> Andrew wanted to refactor and pull common code up, and this is one case
> of that for hvm_op.

I'd be fine with such refactoring if it didn't result in nested switch()-es
using the same control expression.

>>> +
>>> +struct xen_hvm_altp2m_set_mem_access {
>>> +    /* view */
>>> +    uint16_t view;
>>> +    /* Memory type */
>>> +    uint16_t hvmmem_access; /* xenmem_access_t */
>>> +    uint8_t pad[4];
>>> +    /* gfn */
>>> +    uint64_t gfn;
>>> +};
>>> +typedef struct xen_hvm_altp2m_set_mem_access xen_hvm_altp2m_set_mem_access_t;
>>> +DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_set_mem_access_t);
>>> +
>>> +struct xen_hvm_altp2m_change_gfn {
>>> +    /* view */
>>> +    uint16_t view;
>>> +    uint8_t pad[6];
>>> +    /* old gfn */
>>> +    uint64_t old_gfn;
>>> +    /* new gfn, INVALID_GFN (~0UL) means revert */
>>> +    uint64_t new_gfn;
>>> +};
>>> +typedef struct xen_hvm_altp2m_change_gfn xen_hvm_altp2m_change_gfn_t;
>>> +DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_change_gfn_t);
>>> +
>>> +struct xen_hvm_altp2m_op {
>>> +    uint32_t cmd;
>>> +/* Get/set the altp2m state for a domain */
>>> +#define HVMOP_altp2m_get_domain_state     1
>>> +#define HVMOP_altp2m_set_domain_state     2
>>> +/* Set the current VCPU to receive altp2m event notifications */
>>> +#define HVMOP_altp2m_vcpu_enable_notify   3
>>> +/* Create a new view */
>>> +#define HVMOP_altp2m_create_p2m           4
>>> +/* Destroy a view */
>>> +#define HVMOP_altp2m_destroy_p2m          5
>>> +/* Switch view for an entire domain */
>>> +#define HVMOP_altp2m_switch_p2m           6
>>> +/* Notify that a page of memory is to have specific access types */
>>> +#define HVMOP_altp2m_set_mem_access       7
>>> +/* Change a p2m entry to have a different gfn->mfn mapping */
>>> +#define HVMOP_altp2m_change_gfn           8
>>> +    domid_t domain;
>>> +    uint8_t pad[2];
>>
>>While you added padding fields as asked for, you still don't verify them to be
>>zero on input.
> 
> Specifically, what were you thinking we need to do here? It would also
> be good if you could explain the underlying concern. (Thanks.)

I'm pretty sure I said so before - future extensibility. I.e. a means to
make use of the now unused (padding) fields, which can only be done
if the fields are being checked to be zero while unused.
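
E.g. (just a sketch, assuming the usual local copy "a" of the interface
structure; the exact error code is up to you):

    if ( a.pad[0] || a.pad[1] )
        rc = -EINVAL;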

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-13  7:25       ` Jan Beulich
@ 2015-07-13 23:39         ` Sahita, Ravi
  2015-07-14  8:58           ` Jan Beulich
  0 siblings, 1 reply; 51+ messages in thread
From: Sahita, Ravi @ 2015-07-13 23:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf



>-----Original Message-----
>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Monday, July 13, 2015 12:26 AM
>To: Sahita, Ravi
>Cc: Andrew Cooper; Wei Liu; George Dunlap; Ian Jackson; White, Edmund H;
>xen-devel@lists.xen.org; tlengyel@novetta.com; Daniel De Graaf; Tim
>Deegan
>Subject: RE: [PATCH v4 11/15] x86/altp2m: define and implement alternate
>p2m HVMOP types.
>
>>>> On 11.07.15 at 00:03, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Friday, July 10, 2015 3:01 AM
>>>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>>>> +        default:
>>>> +            return -ENOSYS;
>>>> +
>>>> +            break;
>>>
>>>Bogus (unreachable) break.
>>
>> I wanted to keep this so that if someone removes the error return they
>> don't cause an invalid fall-through.
>> But I'm OK with removing it if you think so.
>
>We don't (intentionally) do this anywhere else, so it should be removed.

Done

>
>>>> +                if ( !(d ? d : current->domain)->arch.altp2m_active )
>>>
>>>This is bogus: d is NULL if and only if altp2m_vcpu_enable_notify, i.e. I don't
>>>see why you can't just use current->domain inside that case (and you
>>>really do). That would then also eliminate the need for this redundant
>>>and obfuscating switch() nesting you use.
>>>
>>
>> We need to check whether the target domain is in altp2m mode for all
>> of the following sub-ops; if we removed this check here, we would have
>> to repeat it in each of those cases.
>> Andrew wanted to refactor and pull common code up, and this is one
>> case of that for hvm_op.
>
>I'd be fine with such refactoring if it didn't result in nested switch()-es using
>the same control expression.
>

Agreed that removing the current->domain test will allow us to remove the switch() nesting; however, that will also require replicating the common code blocks. I'm OK with making this change, but I'd like an OK from Andrew before we make it, to avoid thrashing (since he was the one asking for the refactor).


>>>> +
>>>> +struct xen_hvm_altp2m_set_mem_access {
>>>> +    /* view */
>>>> +    uint16_t view;
>>>> +    /* Memory type */
>>>> +    uint16_t hvmmem_access; /* xenmem_access_t */
>>>> +    uint8_t pad[4];
>>>> +    /* gfn */
>>>> +    uint64_t gfn;
>>>> +};
>>>> +typedef struct xen_hvm_altp2m_set_mem_access xen_hvm_altp2m_set_mem_access_t;
>>>> +DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_set_mem_access_t);
>>>> +
>>>> +struct xen_hvm_altp2m_change_gfn {
>>>> +    /* view */
>>>> +    uint16_t view;
>>>> +    uint8_t pad[6];
>>>> +    /* old gfn */
>>>> +    uint64_t old_gfn;
>>>> +    /* new gfn, INVALID_GFN (~0UL) means revert */
>>>> +    uint64_t new_gfn;
>>>> +};
>>>> +typedef struct xen_hvm_altp2m_change_gfn xen_hvm_altp2m_change_gfn_t;
>>>> +DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_change_gfn_t);
>>>> +
>>>> +struct xen_hvm_altp2m_op {
>>>> +    uint32_t cmd;
>>>> +/* Get/set the altp2m state for a domain */
>>>> +#define HVMOP_altp2m_get_domain_state     1
>>>> +#define HVMOP_altp2m_set_domain_state     2
>>>> +/* Set the current VCPU to receive altp2m event notifications */
>>>> +#define HVMOP_altp2m_vcpu_enable_notify   3
>>>> +/* Create a new view */
>>>> +#define HVMOP_altp2m_create_p2m           4
>>>> +/* Destroy a view */
>>>> +#define HVMOP_altp2m_destroy_p2m          5
>>>> +/* Switch view for an entire domain */
>>>> +#define HVMOP_altp2m_switch_p2m           6
>>>> +/* Notify that a page of memory is to have specific access types */
>>>> +#define HVMOP_altp2m_set_mem_access       7
>>>> +/* Change a p2m entry to have a different gfn->mfn mapping */
>>>> +#define HVMOP_altp2m_change_gfn           8
>>>> +    domid_t domain;
>>>> +    uint8_t pad[2];
>>>
>>>While you added padding fields as asked for, you still don't verify them to be
>>>zero on input.
>>
>> Specifically, what were you thinking we need to do here? It would also
>> be good if you could explain the underlying concern. (Thanks.)
>
>I'm pretty sure I said so before - future extensibility. I.e. a means to make use
>of the now unused (padding) fields, which can only be done if the fields are
>being checked to be zero while unused.
>
>Jan

Fixed this

Ravi

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-13 23:39         ` Sahita, Ravi
@ 2015-07-14  8:58           ` Jan Beulich
  0 siblings, 0 replies; 51+ messages in thread
From: Jan Beulich @ 2015-07-14  8:58 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 14.07.15 at 01:39, <ravi.sahita@intel.com> wrote:

> 
>>-----Original Message-----
>>From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Monday, July 13, 2015 12:26 AM
>>To: Sahita, Ravi
>>Cc: Andrew Cooper; Wei Liu; George Dunlap; Ian Jackson; White, Edmund H;
>>xen-devel@lists.xen.org; tlengyel@novetta.com; Daniel De Graaf; Tim
>>Deegan
>>Subject: RE: [PATCH v4 11/15] x86/altp2m: define and implement alternate
>>p2m HVMOP types.
>>
>>>>> On 11.07.15 at 00:03, <ravi.sahita@intel.com> wrote:
>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>Sent: Friday, July 10, 2015 3:01 AM
>>>>>>> On 10.07.15 at 02:52, <edmund.h.white@intel.com> wrote:
>>>>> +        default:
>>>>> +            return -ENOSYS;
>>>>> +
>>>>> +            break;
>>>>
>>>>Bogus (unreachable) break.
>>>
>>> I wanted to keep this so that if someone removes the error return they
>>> don't cause an invalid fall-through.
>>> But I'm OK with removing it if you think so.
>>
>>We don't (intentionally) do this anywhere else, so it should be removed.
> 
> Done
> 
>>
>>>>> +                if ( !(d ? d : current->domain)->arch.altp2m_active )
>>>>
>>>>This is bogus: d is NULL if and only if altp2m_vcpu_enable_notify,
>>>>i.e. I
>>> don't
>>>>see why you can't just use current->domain inside that case (and you
>>>>really do). That would then also eliminate the need for this redundant
>>>>and obfuscating switch() nesting you use.
>>>>
>>>
>>> We need to check whether the target domain is in altp2m mode for all
>>> of the following sub-ops; if we removed this check here, we would have
>>> to repeat it in each of those cases.
>>> Andrew wanted to refactor and pull common code up, and this is one
>>> case of that for hvm_op.
>>
>>I'd be fine with such refactoring if it didn't result in nested switch()-es using
>>the same control expression.
>>
> 
> Agreed that removing the current->domain test will allow us to remove
> the switch() nesting; however, that will also require replicating the
> common code blocks. I'm OK with making this change, but I'd like an OK
> from Andrew before we make it, to avoid thrashing (since he was the one
> asking for the refactor).

I don't see why you shouldn't be able to factor this out without this
odd switch() nesting. Note that I didn't object to multiple sequential
switch() statements, only to their strange nesting...
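
Schematically, what I have in mind (a sketch only; the actual checks and
sub-ops are abbreviated, and "a" is assumed to be the usual local copy
of the interface structure):

    switch ( a.cmd )    /* common validation shared by the sub-ops */
    {
    case HVMOP_altp2m_vcpu_enable_notify:
        d = current->domain;
        break;
    default:
        /* look up the target domain, check altp2m_active, ... */
        break;
    }

    switch ( a.cmd )    /* sequential, not nested: per-sub-op handling */
    {
    case HVMOP_altp2m_get_domain_state:
        ...
        break;
    }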

Jan

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH v4 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-10 22:12     ` Sahita, Ravi
@ 2015-07-14 11:50       ` George Dunlap
  0 siblings, 0 replies; 51+ messages in thread
From: George Dunlap @ 2015-07-14 11:50 UTC (permalink / raw)
  To: Sahita, Ravi, White, Edmund H
  Cc: Wei Liu, Tim Deegan, Ian Jackson, xen-devel, Jan Beulich,
	Andrew Cooper, tlengyel, Daniel De Graaf

On 07/10/2015 11:12 PM, Sahita, Ravi wrote:
>> From: dunlapg@gmail.com [mailto:dunlapg@gmail.com] On Behalf Of George
>> Dunlap
>> Sent: Friday, July 10, 2015 10:32 AM
>>
>> On Fri, Jul 10, 2015 at 1:52 AM, Ed White <edmund.h.white@intel.com>
>> wrote:
>>> The altp2mhvm and nestedhvm parameters are mutually exclusive and
>>> cannot be set together.
>>>
>>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>>
>>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> for the
>> hypervisor bits.
>>> ---
>>>  docs/man/xl.cfg.pod.5           | 12 ++++++++++++
>>>  tools/libxl/libxl.h             |  6 ++++++
>>>  tools/libxl/libxl_create.c      |  1 +
>>>  tools/libxl/libxl_dom.c         |  2 ++
>>>  tools/libxl/libxl_types.idl     |  1 +
>>>  tools/libxl/xl_cmdimpl.c        | 10 ++++++++++
>>>  xen/arch/x86/hvm/hvm.c          | 23 +++++++++++++++++++++--
>>>  xen/include/public/hvm/params.h |  5 ++++-
>>>  8 files changed, 57 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
>>> index a3e0e2e..18afd46 100644
>>> --- a/docs/man/xl.cfg.pod.5
>>> +++ b/docs/man/xl.cfg.pod.5
>>> @@ -1035,6 +1035,18 @@ enabled by default and you should usually omit
>>>  it. It may be necessary
>>>  to disable the HPET in order to improve compatibility with guest
>>>  Operating Systems (X86 only)
>>>
>>> +=item B<altp2mhvm=BOOLEAN>
>>> +
>>> +Enables or disables hvm guest access to alternate-p2m capability.
>>> +Alternate-p2m allows a guest to manage multiple p2m guest physical
>>> +"memory views" (as opposed to a single p2m). This option is disabled
>>> +by default and is available only to hvm domains.
>>> +You may want this option if you want to access-control/isolate access
>>> +to specific guest physical memory pages accessed by the guest, e.g.
>>> +for HVM domain memory introspection or for isolation/access-control
>>> +of memory between components within a single guest hvm domain.
>>> +
>>>  =item B<nestedhvm=BOOLEAN>
>>>
>>>  Enable or disables guest access to hardware virtualisation features,
>>> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
>>> index a1c5d15..17222e7 100644
>>> --- a/tools/libxl/libxl.h
>>> +++ b/tools/libxl/libxl.h
>>> @@ -745,6 +745,12 @@ typedef struct libxl__ctx libxl_ctx;
>>>  #define LIBXL_HAVE_BUILDINFO_SERIAL_LIST 1
>>>
>>>  /*
>>> + * LIBXL_HAVE_ALTP2M
>>> + * If this is defined, then libxl supports alternate p2m functionality.
>>> + */
>>> +#define LIBXL_HAVE_ALTP2M 1
>>> +
>>> +/*
>>>   * LIBXL_HAVE_REMUS
>>>   * If this is defined, then libxl supports remus.
>>>   */
>>> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
>>> index f366a09..418deee 100644
>>> --- a/tools/libxl/libxl_create.c
>>> +++ b/tools/libxl/libxl_create.c
>>> @@ -329,6 +329,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>>>          libxl_defbool_setdefault(&b_info->u.hvm.hpet,               true);
>>>          libxl_defbool_setdefault(&b_info->u.hvm.vpt_align,          true);
>>>          libxl_defbool_setdefault(&b_info->u.hvm.nested_hvm,         false);
>>> +        libxl_defbool_setdefault(&b_info->u.hvm.altp2m,             false);
>>>          libxl_defbool_setdefault(&b_info->u.hvm.usb,                false);
>>>          libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci,   true);
>>>
>>> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
>>> index bdc0465..2f1200e 100644
>>> --- a/tools/libxl/libxl_dom.c
>>> +++ b/tools/libxl/libxl_dom.c
>>> @@ -300,6 +300,8 @@ static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
>>>                      libxl_defbool_val(info->u.hvm.vpt_align));
>>>      xc_hvm_param_set(handle, domid, HVM_PARAM_NESTEDHVM,
>>>                      libxl_defbool_val(info->u.hvm.nested_hvm));
>>> +    xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2MHVM,
>>> +                    libxl_defbool_val(info->u.hvm.altp2m));
>>>  }
>>>
>>>  int libxl__build_pre(libxl__gc *gc, uint32_t domid,
>>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>>> index e1632fa..fb641fe 100644
>>> --- a/tools/libxl/libxl_types.idl
>>> +++ b/tools/libxl/libxl_types.idl
>>> @@ -440,6 +440,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>>>                                         ("mmio_hole_memkb",  MemKB),
>>>                                         ("timer_mode",       libxl_timer_mode),
>>>                                         ("nested_hvm",       libxl_defbool),
>>> +                                       ("altp2m",           libxl_defbool),
>>>                                         ("smbios_firmware",  string),
>>>                                         ("acpi_firmware",    string),
>>>                                         ("nographic",        libxl_defbool),
>>> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
>>> index c858068..43cf6bf 100644
>>> --- a/tools/libxl/xl_cmdimpl.c
>>> +++ b/tools/libxl/xl_cmdimpl.c
>>> @@ -1500,6 +1500,16 @@ static void parse_config_data(const char *config_source,
>>>
>>>          xlu_cfg_get_defbool(config, "nestedhvm",
>>> &b_info->u.hvm.nested_hvm, 0);
>>>
>>> +        xlu_cfg_get_defbool(config, "altp2mhvm",
>>> + &b_info->u.hvm.altp2m, 0);
>>> +
>>> +        if (!libxl_defbool_is_default(b_info->u.hvm.nested_hvm) &&
>>> +            libxl_defbool_val(b_info->u.hvm.nested_hvm) &&
>>> +            !libxl_defbool_is_default(b_info->u.hvm.altp2m) &&
>>> +            libxl_defbool_val(b_info->u.hvm.altp2m)) {
>>> +            fprintf(stderr, "ERROR: nestedhvm and altp2mhvm cannot be used together\n");
>>> +            exit(1);
>>> +        }
>>> +
>>>          xlu_cfg_replace_string(config, "smbios_firmware",
>>>                                 &b_info->u.hvm.smbios_firmware, 0);
>>>          xlu_cfg_replace_string(config, "acpi_firmware", diff --git
>>> a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index
>>> 23cd507..6e59e68 100644
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -5750,6 +5750,7 @@ static int hvm_allow_set_param(struct domain *d,
>>>      case HVM_PARAM_VIRIDIAN:
>>>      case HVM_PARAM_IOREQ_SERVER_PFN:
>>>      case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
>>> +    case HVM_PARAM_ALTP2MHVM:
>>
>> Sorry I missed this -- when I was skimming the reviews of the previous
>> version, I assumed that when Wei asked "hvm" to be taken out because it
>> was redundant, it would include the HVM at the end of this HVM_PARAM.  It
>> seems fairly redundant to have HVM both at the beginning and the end.
>> (Note that argument doesn't apply to NESTEDHVM, because in that case, it's
>> the HVM itself which is nested.)
>>
>> (I also have an idea this may have been discussed before, but I can't find the
>> relevant conversation now, so let me know if I'm covering old
>> ground...)
> 
> 
> Wei acked this earlier this morning.

That means he has no objections, but it doesn't mean nobody else has
objections.  :-)

 -George

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2015-07-14 11:50 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-10  0:52 [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m Ed White
2015-07-10  0:52 ` [PATCH v4 01/15] common/domain: Helpers to pause a domain while in context Ed White
2015-07-10  0:52 ` [PATCH v4 02/15] VMX: VMFUNC and #VE definitions and detection Ed White
2015-07-10  0:52 ` [PATCH v4 03/15] VMX: implement suppress #VE Ed White
2015-07-10  9:09   ` Jan Beulich
2015-07-10 19:22     ` Sahita, Ravi
2015-07-10  0:52 ` [PATCH v4 04/15] x86/HVM: Hardware alternate p2m support detection Ed White
2015-07-10  0:52 ` [PATCH v4 05/15] x86/altp2m: basic data structures and support routines Ed White
2015-07-10  9:13   ` Jan Beulich
2015-07-10  0:52 ` [PATCH v4 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
2015-07-10 16:48   ` George Dunlap
2015-07-10  0:52 ` [PATCH v4 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
2015-07-10  9:30   ` Jan Beulich
2015-07-11 20:01     ` Sahita, Ravi
2015-07-11 21:25       ` Sahita, Ravi
2015-07-13  7:18         ` Jan Beulich
2015-07-13  7:13       ` Jan Beulich
2015-07-10  0:52 ` [PATCH v4 08/15] x86/altp2m: add control of suppress_ve Ed White
2015-07-10  9:39   ` Jan Beulich
2015-07-10 11:11     ` George Dunlap
2015-07-10 11:49       ` Jan Beulich
2015-07-10 11:56         ` George Dunlap
2015-07-10 17:02   ` George Dunlap
2015-07-11 21:29     ` Sahita, Ravi
2015-07-10  0:52 ` [PATCH v4 09/15] x86/altp2m: alternate p2m memory events Ed White
2015-07-10  1:01   ` Lengyel, Tamas
2015-07-10  0:52 ` [PATCH v4 10/15] x86/altp2m: add remaining support routines Ed White
2015-07-10  9:41   ` Jan Beulich
2015-07-10 17:15     ` George Dunlap
2015-07-11 20:20       ` Sahita, Ravi
2015-07-10  0:52 ` [PATCH v4 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
2015-07-10 10:01   ` Jan Beulich
2015-07-10 22:03     ` Sahita, Ravi
2015-07-13  7:25       ` Jan Beulich
2015-07-13 23:39         ` Sahita, Ravi
2015-07-14  8:58           ` Jan Beulich
2015-07-10  0:52 ` [PATCH v4 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
2015-07-10  8:53   ` Wei Liu
2015-07-10 17:32   ` George Dunlap
2015-07-10 22:12     ` Sahita, Ravi
2015-07-14 11:50       ` George Dunlap
2015-07-10  0:52 ` [PATCH v4 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
2015-07-10  0:52 ` [PATCH v4 14/15] tools/libxc: add support to altp2m hvmops Ed White
2015-07-10  8:46   ` Ian Campbell
2015-07-10  0:52 ` [PATCH v4 15/15] tools/xen-access: altp2m testcases Ed White
2015-07-10  1:35   ` Lengyel, Tamas
2015-07-11  6:06     ` Razvan Cojocaru
2015-07-10  8:50   ` Ian Campbell
2015-07-10  8:55     ` Wei Liu
2015-07-10  9:12       ` Wei Liu
2015-07-10  9:20       ` Ian Campbell
