* [PATCH v3 00/13] Alternate p2m: support multiple copies of host p2m
@ 2015-07-01 18:09 Ed White
  2015-07-01 18:09 ` [PATCH v3 01/13] common/domain: Helpers to pause a domain while in context Ed White
                   ` (14 more replies)
  0 siblings, 15 replies; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

This set of patches adds support to HVM domains for EPTP switching by creating
multiple copies of the host p2m (currently limited to 10 copies).

The primary use of this capability is expected to be in scenarios where access
to memory needs to be monitored and/or restricted below the level at which the
guest OS page tables operate. Two examples that were discussed at the 2014 Xen
developer summit are:

    VM introspection:
        http://www.slideshare.net/xen_com_mgr/zero-footprint-guest-memory-introspection-from-xen

    Secure inter-VM communication:
        http://www.slideshare.net/xen_com_mgr/nakajima-nvf

A more detailed design specification can be found at:
    http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01319.html

Each p2m copy is populated lazily on EPT violations.
Permissions for pages in alternate p2ms can be changed in a similar way
to the existing memory access interface, and gfn->mfn mappings can be changed.

All this is done through extra HVMOP types.
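
To give a feel for the interface, a hypothetical in-guest caller might
switch its current VCPU to another view roughly as follows (the op name
and argument layout below are illustrative placeholders, not the exact
ABI added later in this series):

    /* Sketch: ask Xen to switch this vcpu to alternate p2m index 1. */
    struct xen_hvm_altp2m_switch arg = {
        .domid = DOMID_SELF,   /* intra-domain use, as tested here */
        .index = 1,            /* target view */
    };
    HYPERVISOR_hvm_op(HVMOP_altp2m_switch_p2m, &arg);  /* assumed names */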

The cross-domain HVMOP code has been compile-tested only. It is also
hypervisor-only; the toolstack has not been modified.

The intra-domain code has been tested. Violation notifications can only
be received for pages that have been modified (access permissions and/or
gfn->mfn mapping) intra-domain, and only on VCPUs that have enabled notification.

VMFUNC and #VE will both be emulated on hardware without native support.
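
On hardware with native support, a suitably modified guest can also
switch views itself, without a VM exit, by executing VMFUNC leaf 0
directly. A minimal guest-side sketch (encoding per the Intel SDM:
EAX selects the leaf, ECX the EPTP-list index):

    static inline void vmfunc_switch_eptp(uint32_t index)
    {
        /* VMFUNC is encoded as 0f 01 d4; leaf 0 == EPTP switching. */
        asm volatile ( ".byte 0x0f, 0x01, 0xd4"
                       :: "a" (0), "c" (index) : "memory" );
    }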

This code is not compatible with nested HVM functionality and will refuse
to work with nested HVM active. It is also not compatible with migration.
It should be considered experimental.


Changes since v2:

Addressed all v2 feedback *except*:

    In patch 5, the per-domain EPTP list page is still allocated from the
    Xen heap. If allocated from the domain heap, Xen panics (IIRC on
    Haswell hardware) when walking the EPTP list during exit processing
    in patch 6.

    HVM_ops are not merged. Tamas suggested merging the memory access ops,
    but in practice they are not as similar as they appear on the surface.
    Razvan suggested merging the implementation code in p2m.c, but that is
    also not as common as it appears on the surface.
    Andrew suggested merging all altp2m ops into one with a subop code in
    the input structure. His point that only 255 ops can be defined is well
    taken, but altp2m uses only 2 more ops than the recently introduced
    ioreq ops, and <15% of the available ops have been defined. Since we
    don't know how to implement XSM hooks and policy with the subop model,
    we have not adopted this suggestion.

    The p2m set/get interface is not modified. The altp2m code needs to
    write suppress_ve in 2 places and read it in 1 place. The original
    patch series managed this by coupling the state of suppress_ve to the
    p2m memory type, which Tim disliked. In v2 of the series, special
    set/get interfaces were added to access suppress_ve only when required.
    Jan has suggested changing the existing interfaces, but we feel this
    is inappropriate for this experimental patch series. Changing the
    existing interfaces would require a design agreement to be reached
    and would impact a large amount of existing code.

    Andrew kindly added some reviewed-by's to v2. I have not carried
    his reviewed-by of the memory event patch forward because Tamas
    requested significant changes to the patch.


Changes since v1:

Many changes since v1 in response to maintainer feedback, including:

    Suppress_ve state is now decoupled from memory type
    VMFUNC emulation handled in x86 emulator
    Lazy-copy algorithm copies any page where mfn != INVALID_MFN
    All nested page fault handling except lazy-copy is now in
        top-level (hvm.c) nested page fault handler
    Split p2m lock type (as suggested by Tim) to avoid lock order violations
    XSM hooks
    Xen parameter to globally enable altp2m (default disabled) and HVM parameter
    Altp2m reference counting no longer uses dirty_cpu bitmap
    Remapped page tracking to invalidate altp2m's where needed to protect Xen
    Many other minor changes

The altp2m invalidation is implemented to a level that I believe satisfies
the requirements of protecting Xen. Invalidation notification is not yet
implemented, and there may be other cases where invalidation is warranted to
protect the integrity of the restrictions placed through altp2m. We may add
further patches in this area.

Testability is still a potential issue. We have offered to make our internal
Windows test binaries available for intra-domain testing. Tamas has
been working on toolstack support for cross-domain testing with a slightly
earlier patch series, and we hope he will submit that support.

Not all of the patches will be of interest to everyone copied here. I've
copied everyone on this initial mailing to give context.
   
Andrew Cooper (1):
  common/domain: Helpers to pause a domain while in context

Ed White (10):
  VMX: VMFUNC and #VE definitions and detection.
  VMX: implement suppress #VE.
  x86/HVM: Hardware alternate p2m support detection.
  x86/altp2m: basic data structures and support routines.
  VMX/altp2m: add code to support EPTP switching and #VE.
  x86/altp2m: add control of suppress_ve.
  x86/altp2m: alternate p2m memory events.
  x86/altp2m: add remaining support routines.
  x86/altp2m: define and implement alternate p2m HVMOP types.
  x86/altp2m: Add altp2mhvm HVM domain parameter.

Ravi Sahita (2):
  VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  x86/altp2m: XSM hooks for altp2m HVM ops

 docs/man/xl.cfg.pod.5                        |  12 +
 docs/misc/xen-command-line.markdown          |   7 +
 tools/flask/policy/policy/modules/xen/xen.if |   4 +-
 tools/libxl/libxl_create.c                   |   1 +
 tools/libxl/libxl_dom.c                      |   2 +
 tools/libxl/libxl_types.idl                  |   1 +
 tools/libxl/xl_cmdimpl.c                     |   8 +
 xen/arch/x86/hvm/Makefile                    |   1 +
 xen/arch/x86/hvm/altp2m.c                    |  92 +++++
 xen/arch/x86/hvm/emulate.c                   |  12 +-
 xen/arch/x86/hvm/hvm.c                       | 327 ++++++++++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c                  |  42 ++-
 xen/arch/x86/hvm/vmx/vmx.c                   | 169 +++++++++
 xen/arch/x86/mm/hap/Makefile                 |   1 +
 xen/arch/x86/mm/hap/altp2m_hap.c             |  98 ++++++
 xen/arch/x86/mm/hap/hap.c                    |  31 +-
 xen/arch/x86/mm/mm-locks.h                   |  38 +-
 xen/arch/x86/mm/p2m-ept.c                    |  68 +++-
 xen/arch/x86/mm/p2m.c                        | 503 ++++++++++++++++++++++++++-
 xen/arch/x86/x86_emulate/x86_emulate.c       |  48 ++-
 xen/arch/x86/x86_emulate/x86_emulate.h       |   4 +
 xen/common/domain.c                          |  28 ++
 xen/common/vm_event.c                        |   3 +
 xen/include/asm-arm/p2m.h                    |   6 +
 xen/include/asm-x86/domain.h                 |  10 +
 xen/include/asm-x86/hvm/altp2m.h             |  42 +++
 xen/include/asm-x86/hvm/hvm.h                |  28 ++
 xen/include/asm-x86/hvm/vcpu.h               |   9 +
 xen/include/asm-x86/hvm/vmx/vmcs.h           |  14 +-
 xen/include/asm-x86/hvm/vmx/vmx.h            |  13 +-
 xen/include/asm-x86/msr-index.h              |   1 +
 xen/include/asm-x86/p2m.h                    |  79 ++++-
 xen/include/public/hvm/hvm_op.h              |  69 ++++
 xen/include/public/hvm/params.h              |   5 +-
 xen/include/public/vm_event.h                |  11 +
 xen/include/xen/sched.h                      |   5 +
 xen/include/xsm/dummy.h                      |  12 +
 xen/include/xsm/xsm.h                        |  12 +
 xen/xsm/dummy.c                              |   2 +
 xen/xsm/flask/hooks.c                        |  12 +
 xen/xsm/flask/policy/access_vectors          |   7 +
 41 files changed, 1789 insertions(+), 48 deletions(-)
 create mode 100644 xen/arch/x86/hvm/altp2m.c
 create mode 100644 xen/arch/x86/mm/hap/altp2m_hap.c
 create mode 100644 xen/include/asm-x86/hvm/altp2m.h

-- 
1.9.1


* [PATCH v3 01/13] common/domain: Helpers to pause a domain while in context
  2015-07-01 18:09 [PATCH v3 00/13] Alternate p2m: support multiple copies of host p2m Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-01 18:09 ` [PATCH v3 02/13] VMX: VMFUNC and #VE definitions and detection Ed White
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Andrew Cooper <andrew.cooper3@citrix.com>

For use on codepaths which would need to use domain_pause() but might be in
the target domain's context.  In the case that the target domain is in
context, all other vcpus are paused.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
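A minimal usage sketch (illustrative; the altp2m code added later in
this series is the intended caller):

    /* Safe even when current is one of d's vcpus: in that case only
     * the other vcpus are paused, avoiding a self-pause deadlock. */
    domain_pause_except_self(d);
    /* ... update state that no vcpu of d may observe mid-change ... */
    domain_unpause_except_self(d);
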
 xen/common/domain.c     | 28 ++++++++++++++++++++++++++++
 xen/include/xen/sched.h |  5 +++++
 2 files changed, 33 insertions(+)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 3bc52e6..1bb24ae 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1010,6 +1010,34 @@ int domain_unpause_by_systemcontroller(struct domain *d)
     return 0;
 }
 
+void domain_pause_except_self(struct domain *d)
+{
+    struct vcpu *v, *curr = current;
+
+    if ( curr->domain == d )
+    {
+        for_each_vcpu( d, v )
+            if ( likely(v != curr) )
+                vcpu_pause(v);
+    }
+    else
+        domain_pause(d);
+}
+
+void domain_unpause_except_self(struct domain *d)
+{
+    struct vcpu *v, *curr = current;
+
+    if ( curr->domain == d )
+    {
+        for_each_vcpu( d, v )
+            if ( likely(v != curr) )
+                vcpu_unpause(v);
+    }
+    else
+        domain_unpause(d);
+}
+
 int vcpu_reset(struct vcpu *v)
 {
     struct domain *d = v->domain;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index d810e1c..e68a860 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -803,6 +803,11 @@ static inline int domain_pause_by_systemcontroller_nosync(struct domain *d)
 {
     return __domain_pause_by_systemcontroller(d, domain_pause_nosync);
 }
+
+/* domain_pause() but safe against trying to pause current. */
+void domain_pause_except_self(struct domain *d);
+void domain_unpause_except_self(struct domain *d);
+
 void cpu_init(void);
 
 struct scheduler;
-- 
1.9.1


* [PATCH v3 02/13] VMX: VMFUNC and #VE definitions and detection.
  2015-07-01 18:09 [PATCH v3 00/13] Alternate p2m: support multiple copies of host p2m Ed White
  2015-07-01 18:09 ` [PATCH v3 01/13] common/domain: Helpers to pause a domain while in context Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-06 17:16   ` George Dunlap
  2015-07-07 18:58   ` Nakajima, Jun
  2015-07-01 18:09 ` [PATCH v3 03/13] VMX: implement suppress #VE Ed White
                   ` (12 subsequent siblings)
  14 siblings, 2 replies; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Currently, neither VMFUNC nor #VE is enabled globally; both may be enabled
on a per-VCPU basis by the altp2m code.

Remove the check for EPTE bit 63 == zero in ept_split_super_page(), as
that bit is now hardware-defined.

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
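For context, a sketch of how a #VE-aware guest is expected to consume
the ve_info_t page defined below (guest-side and illustrative; per the
SDM, the processor only delivers #VE while the semaphore field is zero,
so the guest must clear it to re-arm delivery):

    void handle_ve(ve_info_t *ve)
    {
        uint64_t gpa = ve->gpa;           /* faulting guest-physical addr */
        uint16_t view = ve->eptp_index;   /* altp2m view active at fault */

        /* ... resolve the violation in-guest, e.g. via VMFUNC ... */

        ve->semaphore = 0;                /* re-enable #VE delivery */
    }
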
 xen/arch/x86/hvm/vmx/vmcs.c        | 42 +++++++++++++++++++++++++++++++++++---
 xen/arch/x86/mm/p2m-ept.c          |  1 -
 xen/include/asm-x86/hvm/vmx/vmcs.h | 14 +++++++++++--
 xen/include/asm-x86/hvm/vmx/vmx.h  | 13 +++++++++++-
 xen/include/asm-x86/msr-index.h    |  1 +
 5 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 4c5ceb5..bc1cabd 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -101,6 +101,8 @@ u32 vmx_secondary_exec_control __read_mostly;
 u32 vmx_vmexit_control __read_mostly;
 u32 vmx_vmentry_control __read_mostly;
 u64 vmx_ept_vpid_cap __read_mostly;
+u64 vmx_vmfunc __read_mostly;
+bool_t vmx_virt_exception __read_mostly;
 
 const u32 vmx_introspection_force_enabled_msrs[] = {
     MSR_IA32_SYSENTER_EIP,
@@ -140,6 +142,8 @@ static void __init vmx_display_features(void)
     P(cpu_has_vmx_virtual_intr_delivery, "Virtual Interrupt Delivery");
     P(cpu_has_vmx_posted_intr_processing, "Posted Interrupt Processing");
     P(cpu_has_vmx_vmcs_shadowing, "VMCS shadowing");
+    P(cpu_has_vmx_vmfunc, "VM Functions");
+    P(cpu_has_vmx_virt_exceptions, "Virtualisation Exceptions");
     P(cpu_has_vmx_pml, "Page Modification Logging");
 #undef P
 
@@ -185,6 +189,7 @@ static int vmx_init_vmcs_config(void)
     u64 _vmx_misc_cap = 0;
     u32 _vmx_vmexit_control;
     u32 _vmx_vmentry_control;
+    u64 _vmx_vmfunc = 0;
     bool_t mismatch = 0;
 
     rdmsr(MSR_IA32_VMX_BASIC, vmx_basic_msr_low, vmx_basic_msr_high);
@@ -230,7 +235,9 @@ static int vmx_init_vmcs_config(void)
                SECONDARY_EXEC_ENABLE_EPT |
                SECONDARY_EXEC_ENABLE_RDTSCP |
                SECONDARY_EXEC_PAUSE_LOOP_EXITING |
-               SECONDARY_EXEC_ENABLE_INVPCID);
+               SECONDARY_EXEC_ENABLE_INVPCID |
+               SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
+               SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
         rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
         if ( _vmx_misc_cap & VMX_MISC_VMWRITE_ALL )
             opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
@@ -341,6 +348,24 @@ static int vmx_init_vmcs_config(void)
           || !(_vmx_vmexit_control & VM_EXIT_ACK_INTR_ON_EXIT) )
         _vmx_pin_based_exec_control  &= ~ PIN_BASED_POSTED_INTERRUPT;
 
+    /* The IA32_VMX_VMFUNC MSR exists only when VMFUNC is available */
+    if ( _vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS )
+    {
+        rdmsrl(MSR_IA32_VMX_VMFUNC, _vmx_vmfunc);
+
+        /*
+         * VMFUNC leaf 0 (EPTP switching) must be supported.
+         *
+         * Or we just don't use VMFUNC.
+         */
+        if ( !(_vmx_vmfunc & VMX_VMFUNC_EPTP_SWITCHING) )
+            _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
+    }
+
+    /* Virtualization exceptions are only enabled if VMFUNC is enabled */
+    if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
+        _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+
     min = 0;
     opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS;
     _vmx_vmentry_control = adjust_vmx_controls(
@@ -361,6 +386,9 @@ static int vmx_init_vmcs_config(void)
         vmx_vmentry_control        = _vmx_vmentry_control;
         vmx_basic_msr              = ((u64)vmx_basic_msr_high << 32) |
                                      vmx_basic_msr_low;
+        vmx_vmfunc                 = _vmx_vmfunc;
+        vmx_virt_exception         = !!(_vmx_secondary_exec_control &
+                                       SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
         vmx_display_features();
 
         /* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -397,6 +425,9 @@ static int vmx_init_vmcs_config(void)
         mismatch |= cap_check(
             "EPT and VPID Capability",
             vmx_ept_vpid_cap, _vmx_ept_vpid_cap);
+        mismatch |= cap_check(
+            "VMFUNC Capability",
+            vmx_vmfunc, _vmx_vmfunc);
         if ( cpu_has_vmx_ins_outs_instr_info !=
              !!(vmx_basic_msr_high & (VMX_BASIC_INS_OUT_INFO >> 32)) )
         {
@@ -967,6 +998,11 @@ static int construct_vmcs(struct vcpu *v)
     /* Do not enable Monitor Trap Flag unless start single step debug */
     v->arch.hvm_vmx.exec_control &= ~CPU_BASED_MONITOR_TRAP_FLAG;
 
+    /* Disable VMFUNC and #VE for now: they may be enabled later by altp2m. */
+    v->arch.hvm_vmx.secondary_exec_control &=
+        ~(SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
+          SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
+
     if ( is_pvh_domain(d) )
     {
         /* Disable virtual apics, TPR */
@@ -1790,9 +1826,9 @@ void vmcs_dump_vcpu(struct vcpu *v)
         printk("PLE Gap=%08x Window=%08x\n",
                vmr32(PLE_GAP), vmr32(PLE_WINDOW));
     if ( v->arch.hvm_vmx.secondary_exec_control &
-         (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VMFUNC) )
+         (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
         printk("Virtual processor ID = 0x%04x VMfunc controls = %016lx\n",
-               vmr16(VIRTUAL_PROCESSOR_ID), vmr(VMFUNC_CONTROL));
+               vmr16(VIRTUAL_PROCESSOR_ID), vmr(VM_FUNCTION_CONTROL));
 
     vmx_vmcs_exit(v);
 }
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 5133eb6..a6c9adf 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -281,7 +281,6 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
         epte->sp = (level > 1);
         epte->mfn += i * trunk;
         epte->snp = (iommu_enabled && iommu_snoop);
-        ASSERT(!epte->avail3);
 
         ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
 
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 1104bda..cb0ee6c 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -222,9 +222,10 @@ extern u32 vmx_vmentry_control;
 #define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY    0x00000200
 #define SECONDARY_EXEC_PAUSE_LOOP_EXITING       0x00000400
 #define SECONDARY_EXEC_ENABLE_INVPCID           0x00001000
-#define SECONDARY_EXEC_ENABLE_VMFUNC            0x00002000
+#define SECONDARY_EXEC_ENABLE_VM_FUNCTIONS      0x00002000
 #define SECONDARY_EXEC_ENABLE_VMCS_SHADOWING    0x00004000
 #define SECONDARY_EXEC_ENABLE_PML               0x00020000
+#define SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS   0x00040000
 extern u32 vmx_secondary_exec_control;
 
 #define VMX_EPT_EXEC_ONLY_SUPPORTED             0x00000001
@@ -285,6 +286,10 @@ extern u32 vmx_secondary_exec_control;
     (vmx_pin_based_exec_control & PIN_BASED_POSTED_INTERRUPT)
 #define cpu_has_vmx_vmcs_shadowing \
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VMCS_SHADOWING)
+#define cpu_has_vmx_vmfunc \
+    (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS)
+#define cpu_has_vmx_virt_exceptions \
+    (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS)
 #define cpu_has_vmx_pml \
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_PML)
 
@@ -316,6 +321,9 @@ extern u64 vmx_basic_msr;
 #define VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK  0x0FF
 #define VMX_GUEST_INTR_STATUS_SVI_OFFSET        8
 
+/* VMFUNC leaf definitions */
+#define VMX_VMFUNC_EPTP_SWITCHING   (1ULL << 0)
+
 /* VMCS field encodings. */
 #define VMCS_HIGH(x) ((x) | 1)
 enum vmcs_field {
@@ -350,12 +358,14 @@ enum vmcs_field {
     VIRTUAL_APIC_PAGE_ADDR          = 0x00002012,
     APIC_ACCESS_ADDR                = 0x00002014,
     PI_DESC_ADDR                    = 0x00002016,
-    VMFUNC_CONTROL                  = 0x00002018,
+    VM_FUNCTION_CONTROL             = 0x00002018,
     EPT_POINTER                     = 0x0000201a,
     EOI_EXIT_BITMAP0                = 0x0000201c,
 #define EOI_EXIT_BITMAP(n) (EOI_EXIT_BITMAP0 + (n) * 2) /* n = 0...3 */
+    EPTP_LIST_ADDR                  = 0x00002024,
     VMREAD_BITMAP                   = 0x00002026,
     VMWRITE_BITMAP                  = 0x00002028,
+    VIRT_EXCEPTION_INFO             = 0x0000202a,
     GUEST_PHYSICAL_ADDRESS          = 0x00002400,
     VMCS_LINK_POINTER               = 0x00002800,
     GUEST_IA32_DEBUGCTL             = 0x00002802,
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 35f804a..5b59d3c 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -47,7 +47,7 @@ typedef union {
         access      :   4,  /* bits 61:58 - p2m_access_t */
         tm          :   1,  /* bit 62 - VT-d transient-mapping hint in
                                shared EPT/VT-d usage */
-        avail3      :   1;  /* bit 63 - Software available 3 */
+        suppress_ve :   1;  /* bit 63 - suppress #VE */
     };
     u64 epte;
 } ept_entry_t;
@@ -186,6 +186,7 @@ static inline unsigned long pi_get_pir(struct pi_desc *pi_desc, int group)
 #define EXIT_REASON_XSETBV              55
 #define EXIT_REASON_APIC_WRITE          56
 #define EXIT_REASON_INVPCID             58
+#define EXIT_REASON_VMFUNC              59
 #define EXIT_REASON_PML_FULL            62
 
 /*
@@ -554,4 +555,14 @@ void p2m_init_hap_data(struct p2m_domain *p2m);
 #define EPT_L4_PAGETABLE_SHIFT      39
 #define EPT_PAGETABLE_ENTRIES       512
 
+/* #VE information page */
+typedef struct {
+    u32 exit_reason;
+    u32 semaphore;
+    u64 exit_qualification;
+    u64 gla;
+    u64 gpa;
+    u16 eptp_index;
+} ve_info_t;
+
 #endif /* __ASM_X86_HVM_VMX_VMX_H__ */
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 83f2f70..8069d60 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -130,6 +130,7 @@
 #define MSR_IA32_VMX_TRUE_PROCBASED_CTLS        0x48e
 #define MSR_IA32_VMX_TRUE_EXIT_CTLS             0x48f
 #define MSR_IA32_VMX_TRUE_ENTRY_CTLS            0x490
+#define MSR_IA32_VMX_VMFUNC                     0x491
 #define IA32_FEATURE_CONTROL_MSR                0x3a
 #define IA32_FEATURE_CONTROL_MSR_LOCK                     0x0001
 #define IA32_FEATURE_CONTROL_MSR_ENABLE_VMXON_INSIDE_SMX  0x0002
-- 
1.9.1


* [PATCH v3 03/13] VMX: implement suppress #VE.
  2015-07-01 18:09 [PATCH v3 00/13] Alternate p2m: support multiple copies of host p2m Ed White
  2015-07-01 18:09 ` [PATCH v3 01/13] common/domain: Helpers to pause a domain while in context Ed White
  2015-07-01 18:09 ` [PATCH v3 02/13] VMX: VMFUNC and #VE definitions and detection Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-06 17:26   ` George Dunlap
                     ` (2 more replies)
  2015-07-01 18:09 ` [PATCH v3 04/13] x86/HVM: Hardware alternate p2m support detection Ed White
                   ` (11 subsequent siblings)
  14 siblings, 3 replies; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

In preparation for selectively enabling #VE in a later patch, set
suppress #VE on all EPTEs.

Suppress #VE should always be the default condition for two reasons:
it is generally not safe to deliver #VE into a guest unless that guest
has been modified to receive it; and even then, for most EPT violations,
only the hypervisor is able to handle the violation.

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
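To illustrate the is_epte_valid() change (a sketch, not part of the
patch): with suppress #VE now set by default, an entry carrying nothing
but bit 63 must still be treated as empty:

    ept_entry_t e = { .epte = 1ul << 63 };  /* only suppress_ve set */
    ASSERT(!is_epte_valid(&e));             /* bit 63 is masked off */
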
 xen/arch/x86/mm/p2m-ept.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index a6c9adf..4111795 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -41,7 +41,8 @@
 #define is_epte_superpage(ept_entry)    ((ept_entry)->sp)
 static inline bool_t is_epte_valid(ept_entry_t *e)
 {
-    return (e->epte != 0 && e->sa_p2mt != p2m_invalid);
+    /* suppress_ve alone is not considered valid, so mask it off */
+    return ((e->epte & ~(1ul << 63)) != 0 && e->sa_p2mt != p2m_invalid);
 }
 
 /* returns : 0 for success, -errno otherwise */
@@ -219,6 +220,8 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
 static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
 {
     struct page_info *pg;
+    ept_entry_t *table;
+    unsigned int i;
 
     pg = p2m_alloc_ptp(p2m, 0);
     if ( pg == NULL )
@@ -232,6 +235,15 @@ static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
     /* Manually set A bit to avoid overhead of MMU having to write it later. */
     ept_entry->a = 1;
 
+    ept_entry->suppress_ve = 1;
+
+    table = __map_domain_page(pg);
+
+    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
+        table[i].suppress_ve = 1;
+
+    unmap_domain_page(table);
+
     return 1;
 }
 
@@ -281,6 +293,7 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
         epte->sp = (level > 1);
         epte->mfn += i * trunk;
         epte->snp = (iommu_enabled && iommu_snoop);
+        epte->suppress_ve = 1;
 
         ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
 
@@ -790,6 +803,8 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
     }
 
+    new_entry.suppress_ve = 1;
+
     rc = atomic_write_ept_entry(ept_entry, new_entry, target);
     if ( unlikely(rc) )
         old_entry.epte = 0;
@@ -1111,6 +1126,8 @@ static void ept_flush_pml_buffers(struct p2m_domain *p2m)
 int ept_p2m_init(struct p2m_domain *p2m)
 {
     struct ept_data *ept = &p2m->ept;
+    ept_entry_t *table;
+    unsigned int i;
 
     p2m->set_entry = ept_set_entry;
     p2m->get_entry = ept_get_entry;
@@ -1134,6 +1151,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
         p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
     }
 
+    table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
+
+    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
+        table[i].suppress_ve = 1;
+
+    unmap_domain_page(table);
+
     if ( !zalloc_cpumask_var(&ept->synced_mask) )
         return -ENOMEM;
 
-- 
1.9.1


* [PATCH v3 04/13] x86/HVM: Hardware alternate p2m support detection.
  2015-07-01 18:09 [PATCH v3 00/13] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (2 preceding siblings ...)
  2015-07-01 18:09 ` [PATCH v3 03/13] VMX: implement suppress #VE Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-01 18:09 ` [PATCH v3 05/13] x86/altp2m: basic data structures and support routines Ed White
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

As implemented here, alternate p2m support is only available on
platforms with VMX HAP.

By default this functionality is force-disabled; it can be enabled
by specifying altp2m=1 on the Xen command line.

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
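For example, with an assumed grub entry (bootloader syntax varies),
the feature remains off unless the hypervisor is booted with:

    multiboot /boot/xen.gz ... altp2m=1
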
 docs/misc/xen-command-line.markdown | 7 +++++++
 xen/arch/x86/hvm/hvm.c              | 7 +++++++
 xen/arch/x86/hvm/vmx/vmx.c          | 1 +
 xen/include/asm-x86/hvm/hvm.h       | 9 +++++++++
 4 files changed, 24 insertions(+)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index aa684c0..3391c66 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -139,6 +139,13 @@ mode during S3 resume.
 > Default: `true`
 
 Permit Xen to use superpages when performing memory management.
+
+### altp2m (Intel)
+> `= <boolean>`
+
+> Default: `false`
+
+Permit multiple copies of the host p2m.
 
 ### apic
 > `= bigsmp | default`
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index d5e5242..6faf3f4 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -94,6 +94,10 @@ bool_t opt_hvm_fep;
 boolean_param("hvm_fep", opt_hvm_fep);
 #endif
 
+/* Xen command-line option to enable altp2m */
+static bool_t __initdata opt_altp2m_enabled = 0;
+boolean_param("altp2m", opt_altp2m_enabled);
+
 static int cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -160,6 +164,9 @@ static int __init hvm_enable(void)
     if ( !fns->pvh_supported )
         printk(XENLOG_INFO "HVM: PVH mode not supported on this platform\n");
 
+    if ( !opt_altp2m_enabled )
+        hvm_funcs.altp2m_supported = 0;
+
     /*
      * Allow direct access to the PC debug ports 0x80 and 0xed (they are
      * often used for I/O delays, but the vmexits simply slow things down).
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 0837627..2d3ad63 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1841,6 +1841,7 @@ const struct hvm_function_table * __init start_vmx(void)
     if ( cpu_has_vmx_ept && (cpu_has_vmx_pat || opt_force_ept) )
     {
         vmx_function_table.hap_supported = 1;
+        vmx_function_table.altp2m_supported = 1;
 
         vmx_function_table.hap_capabilities = 0;
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 77eeac5..8aa2bb3 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -94,6 +94,9 @@ struct hvm_function_table {
     /* Necessary hardware support for PVH mode? */
     int pvh_supported;
 
+    /* Necessary hardware support for alternate p2m's? */
+    bool_t altp2m_supported;
+
     /* Indicate HAP capabilities. */
     int hap_capabilities;
 
@@ -509,6 +512,12 @@ bool_t nhvm_vmcx_hap_enabled(struct vcpu *v);
 /* interrupt */
 enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v);
 
+/* returns true if hardware supports alternate p2m's */
+static inline bool_t hvm_altp2m_supported(void)
+{
+    return hvm_funcs.altp2m_supported;
+}
+
 #ifndef NDEBUG
 /* Permit use of the Forced Emulation Prefix in HVM guests */
 extern bool_t opt_hvm_fep;
-- 
1.9.1


* [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-01 18:09 [PATCH v3 00/13] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (3 preceding siblings ...)
  2015-07-01 18:09 ` [PATCH v3 04/13] x86/HVM: Hardware alternate p2m support detection Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-03 16:22   ` Andrew Cooper
                     ` (3 more replies)
  2015-07-01 18:09 ` [PATCH v3 06/13] VMX/altp2m: add code to support EPTP switching and #VE Ed White
                   ` (9 subsequent siblings)
  14 siblings, 4 replies; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Add the basic data structures needed to support alternate p2m's and
the functions to initialise them and tear them down.

Although Intel hardware can handle 512 EPTPs per hardware thread
concurrently, only 10 per domain are supported in this patch for
performance reasons.

The iterator in hap_enable() does need to handle 512, so that is now
uint16_t.

This change also splits the p2m lock into one lock type for altp2ms
and another type for all other p2ms. The purpose of this is to place
the altp2m list lock between the two types, so the list lock can be
acquired whilst holding the host p2m lock.

Signed-off-by: Ed White <edmund.h.white@intel.com>
---
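A sketch of the resulting lock ordering (illustrative; hostp2m and ap2m
stand for the relevant struct p2m_domain pointers, and the real lock
definitions are in the mm-locks.h hunk below):

    p2m_lock(hostp2m);   /* host p2m: "p2m" lock class                  */
    altp2m_lock(d);      /* per-domain altp2m list lock nests inside it */
    p2m_lock(ap2m);      /* altp2m table: separate "altp2m" lock class  */
    /* ... propagate a host p2m change into an active view ... */
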
 xen/arch/x86/hvm/Makefile        |  1 +
 xen/arch/x86/hvm/altp2m.c        | 92 +++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/hvm.c           | 21 +++++++++
 xen/arch/x86/mm/hap/hap.c        | 31 ++++++++++++-
 xen/arch/x86/mm/mm-locks.h       | 38 +++++++++++++++-
 xen/arch/x86/mm/p2m.c            | 98 ++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/domain.h     | 10 ++++
 xen/include/asm-x86/hvm/altp2m.h | 38 ++++++++++++++++
 xen/include/asm-x86/hvm/hvm.h    | 17 +++++++
 xen/include/asm-x86/hvm/vcpu.h   |  9 ++++
 xen/include/asm-x86/p2m.h        | 30 +++++++++++-
 11 files changed, 381 insertions(+), 4 deletions(-)
 create mode 100644 xen/arch/x86/hvm/altp2m.c
 create mode 100644 xen/include/asm-x86/hvm/altp2m.h

diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index 69af47f..eb1a37b 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -1,6 +1,7 @@
 subdir-y += svm
 subdir-y += vmx
 
+obj-y += altp2m.o
 obj-y += asid.o
 obj-y += emulate.o
 obj-y += event.o
diff --git a/xen/arch/x86/hvm/altp2m.c b/xen/arch/x86/hvm/altp2m.c
new file mode 100644
index 0000000..f98a38d
--- /dev/null
+++ b/xen/arch/x86/hvm/altp2m.c
@@ -0,0 +1,92 @@
+/*
+ * Alternate p2m HVM
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#include <asm/hvm/support.h>
+#include <asm/hvm/hvm.h>
+#include <asm/p2m.h>
+#include <asm/hvm/altp2m.h>
+
+void
+altp2m_vcpu_reset(struct vcpu *v)
+{
+    struct altp2mvcpu *av = &vcpu_altp2m(v);
+
+    av->p2midx = INVALID_ALTP2M;
+    av->veinfo_gfn = _gfn(INVALID_GFN);
+
+    if ( hvm_funcs.ap2m_vcpu_reset )
+        hvm_funcs.ap2m_vcpu_reset(v);
+}
+
+int
+altp2m_vcpu_initialise(struct vcpu *v)
+{
+    int rc = -EOPNOTSUPP;
+
+    if ( v != current )
+        vcpu_pause(v);
+
+    if ( !hvm_funcs.ap2m_vcpu_initialise ||
+         (hvm_funcs.ap2m_vcpu_initialise(v) == 0) )
+    {
+        rc = 0;
+        altp2m_vcpu_reset(v);
+        vcpu_altp2m(v).p2midx = 0;
+        atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+
+        ap2m_vcpu_update_eptp(v);
+    }
+
+    if ( v != current )
+        vcpu_unpause(v);
+
+    return rc;
+}
+
+void
+altp2m_vcpu_destroy(struct vcpu *v)
+{
+    struct p2m_domain *p2m;
+
+    if ( v != current )
+        vcpu_pause(v);
+
+    if ( hvm_funcs.ap2m_vcpu_destroy )
+        hvm_funcs.ap2m_vcpu_destroy(v);
+
+    if ( (p2m = p2m_get_altp2m(v)) )
+        atomic_dec(&p2m->active_vcpus);
+
+    altp2m_vcpu_reset(v);
+
+    ap2m_vcpu_update_eptp(v);
+    ap2m_vcpu_update_vmfunc_ve(v);
+
+    if ( v != current )
+        vcpu_unpause(v);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 6faf3f4..f21d34d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -58,6 +58,7 @@
 #include <asm/hvm/cacheattr.h>
 #include <asm/hvm/trace.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/altp2m.h>
 #include <asm/hvm/event.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
@@ -2380,6 +2381,7 @@ void hvm_vcpu_destroy(struct vcpu *v)
 {
     hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
 
+    altp2m_vcpu_destroy(v);
     nestedhvm_vcpu_destroy(v);
 
     free_compat_arg_xlat(v);
@@ -6502,6 +6504,25 @@ enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v)
     return hvm_funcs.nhvm_intr_blocked(v);
 }
 
+void ap2m_vcpu_update_eptp(struct vcpu *v)
+{
+    if (hvm_funcs.ap2m_vcpu_update_eptp)
+        hvm_funcs.ap2m_vcpu_update_eptp(v);
+}
+
+void ap2m_vcpu_update_vmfunc_ve(struct vcpu *v)
+{
+    if (hvm_funcs.ap2m_vcpu_update_vmfunc_ve)
+        hvm_funcs.ap2m_vcpu_update_vmfunc_ve(v);
+}
+
+bool_t ap2m_vcpu_emulate_ve(struct vcpu *v)
+{
+    if (hvm_funcs.ap2m_vcpu_emulate_ve)
+        return hvm_funcs.ap2m_vcpu_emulate_ve(v);
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index d0d3f1e..c00282c 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)
 int hap_enable(struct domain *d, u32 mode)
 {
     unsigned int old_pages;
-    uint8_t i;
+    uint16_t i;
     int rv = 0;
 
     domain_pause(d);
@@ -498,6 +498,24 @@ int hap_enable(struct domain *d, u32 mode)
            goto out;
     }
 
+    /* Init alternate p2m data */
+    if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
+    {
+        rv = -ENOMEM;
+        goto out;
+    }
+
+    for (i = 0; i < MAX_EPTP; i++)
+        d->arch.altp2m_eptp[i] = INVALID_MFN;
+
+    for (i = 0; i < MAX_ALTP2M; i++) {
+        rv = p2m_alloc_table(d->arch.altp2m_p2m[i]);
+        if ( rv != 0 )
+           goto out;
+    }
+
+    d->arch.altp2m_active = 0;
+
     /* Now let other users see the new mode */
     d->arch.paging.mode = mode | PG_HAP_enable;
 
@@ -510,6 +528,17 @@ void hap_final_teardown(struct domain *d)
 {
     uint8_t i;
 
+    d->arch.altp2m_active = 0;
+
+    if ( d->arch.altp2m_eptp ) {
+        free_xenheap_page(d->arch.altp2m_eptp);
+        d->arch.altp2m_eptp = NULL;
+    }
+
+    for (i = 0; i < MAX_ALTP2M; i++) {
+        p2m_teardown(d->arch.altp2m_p2m[i]);
+    }
+
     /* Destroy nestedp2m's first */
     for (i = 0; i < MAX_NESTEDP2M; i++) {
         p2m_teardown(d->arch.nested_p2m[i]);
diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index b4f035e..301ca59 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -217,7 +217,7 @@ declare_mm_lock(nestedp2m)
 #define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)->arch.nested_p2m_lock)
 #define nestedp2m_unlock(d) mm_unlock(&(d)->arch.nested_p2m_lock)
 
-/* P2M lock (per-p2m-table)
+/* P2M lock (per-non-alt-p2m-table)
  *
  * This protects all queries and updates to the p2m table.
  * Queries may be made under the read lock but all modifications
@@ -225,10 +225,44 @@ declare_mm_lock(nestedp2m)
  *
  * The write lock is recursive as it is common for a code path to look
  * up a gfn and later mutate it.
+ *
+ * Note that this lock shares its implementation with the altp2m
+ * lock (not the altp2m list lock), so the implementation
+ * is found there.
  */
 
 declare_mm_rwlock(p2m);
-#define p2m_lock(p)           mm_write_lock(p2m, &(p)->lock);
+
+/* Alternate P2M list lock (per-domain)
+ *
+ * A per-domain lock that protects the list of alternate p2m's.
+ * Any operation that walks the list needs to acquire this lock.
+ * Additionally, before destroying an alternate p2m all VCPU's
+ * in the target domain must be paused.
+ */
+
+declare_mm_lock(altp2mlist)
+#define altp2m_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_lock)
+#define altp2m_unlock(d) mm_unlock(&(d)->arch.altp2m_lock)
+
+/* P2M lock (per-altp2m-table)
+ *
+ * This protects all queries and updates to the p2m table.
+ * Queries may be made under the read lock but all modifications
+ * need the main (write) lock.
+ *
+ * The write lock is recursive as it is common for a code path to look
+ * up a gfn and later mutate it.
+ */
+
+declare_mm_rwlock(altp2m);
+#define p2m_lock(p)                            \
+    do {                                       \
+        if ( p2m_is_altp2m(p) )                \
+            mm_write_lock(altp2m, &(p)->lock); \
+        else                                   \
+            mm_write_lock(p2m, &(p)->lock);    \
+    } while ( 0 )
 #define p2m_unlock(p)         mm_write_unlock(&(p)->lock);
 #define gfn_lock(p,g,o)       p2m_lock(p)
 #define gfn_unlock(p,g,o)     p2m_unlock(p)
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 1fd1194..58d4951 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -35,6 +35,7 @@
 #include <asm/hvm/vmx/vmx.h> /* ept_p2m_init() */
 #include <asm/mem_sharing.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/altp2m.h>
 #include <asm/hvm/svm/amd-iommu-proto.h>
 #include <xsm/xsm.h>
 
@@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct domain *d)
     }
 }
 
+static void p2m_teardown_altp2m(struct domain *d)
+{
+    uint8_t i;
+    struct p2m_domain *p2m;
+
+    for (i = 0; i < MAX_ALTP2M; i++)
+    {
+        if ( !d->arch.altp2m_p2m[i] )
+            continue;
+        p2m = d->arch.altp2m_p2m[i];
+        p2m_free_one(p2m);
+        d->arch.altp2m_p2m[i] = NULL;
+    }
+}
+
+static int p2m_init_altp2m(struct domain *d)
+{
+    uint8_t i;
+    struct p2m_domain *p2m;
+
+    mm_lock_init(&d->arch.altp2m_lock);
+    for (i = 0; i < MAX_ALTP2M; i++)
+    {
+        d->arch.altp2m_p2m[i] = p2m = p2m_init_one(d);
+        if ( p2m == NULL )
+        {
+            p2m_teardown_altp2m(d);
+            return -ENOMEM;
+        }
+        p2m->p2m_class = p2m_alternate;
+        p2m->access_required = 1;
+        _atomic_set(&p2m->active_vcpus, 0);
+    }
+
+    return 0;
+}
+
 int p2m_init(struct domain *d)
 {
     int rc;
@@ -196,7 +234,14 @@ int p2m_init(struct domain *d)
      * (p2m_init runs too early for HVM_PARAM_* options) */
     rc = p2m_init_nestedp2m(d);
     if ( rc )
+    {
         p2m_teardown_hostp2m(d);
+        return rc;
+    }
+
+    rc = p2m_init_altp2m(d);
+    if ( rc )
+        p2m_teardown_altp2m(d);
 
     return rc;
 }
@@ -1920,6 +1965,59 @@ int unmap_mmio_regions(struct domain *d,
     return err;
 }
 
+uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
+{
+    struct p2m_domain *p2m;
+    struct ept_data *ept;
+    uint16_t i;
+
+    altp2m_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
+            continue;
+
+        p2m = d->arch.altp2m_p2m[i];
+        ept = &p2m->ept;
+
+        if ( eptp == ept_get_eptp(ept) )
+            goto out;
+    }
+
+    i = INVALID_ALTP2M;
+
+out:
+    altp2m_unlock(d);
+    return i;
+}
+
+bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx)
+{
+    struct domain *d = v->domain;
+    bool_t rc = 0;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    altp2m_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        if ( idx != vcpu_altp2m(v).p2midx )
+        {
+            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+            vcpu_altp2m(v).p2midx = idx;
+            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+            ap2m_vcpu_update_eptp(v);
+        }
+        rc = 1;
+    }
+
+    altp2m_unlock(d);
+    return rc;
+}
+
 /*** Audit ***/
 
 #if P2M_AUDIT
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 5eb6832..744f54f 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -235,6 +235,10 @@ struct paging_vcpu {
 typedef xen_domctl_cpuid_t cpuid_input_t;
 
 #define MAX_NESTEDP2M 10
+
+#define MAX_ALTP2M      ((uint16_t)10)
+#define INVALID_ALTP2M  ((uint16_t)~0)
+#define MAX_EPTP        (PAGE_SIZE / sizeof(uint64_t))
 struct p2m_domain;
 struct time_scale {
     int shift;
@@ -294,6 +298,12 @@ struct arch_domain
     struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
     mm_lock_t nested_p2m_lock;
 
+    /* altp2m: allow multiple copies of host p2m */
+    bool_t altp2m_active;
+    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
+    mm_lock_t altp2m_lock;
+    uint64_t *altp2m_eptp;
+
     /* NB. protected by d->event_lock and by irq_desc[irq].lock */
     struct radix_tree_root irq_pirq;
 
diff --git a/xen/include/asm-x86/hvm/altp2m.h b/xen/include/asm-x86/hvm/altp2m.h
new file mode 100644
index 0000000..1a6f22b
--- /dev/null
+++ b/xen/include/asm-x86/hvm/altp2m.h
@@ -0,0 +1,38 @@
+/*
+ * Alternate p2m HVM
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#ifndef _ALTP2M_H
+#define _ALTP2M_H
+
+#include <xen/types.h>
+#include <xen/sched.h>         /* for struct vcpu, struct domain */
+#include <asm/hvm/vcpu.h>      /* for vcpu_altp2m */
+
+/* Alternate p2m HVM on/off per domain */
+static inline bool_t altp2m_active(const struct domain *d)
+{
+    return d->arch.altp2m_active;
+}
+
+/* Alternate p2m VCPU */
+int altp2m_vcpu_initialise(struct vcpu *v);
+void altp2m_vcpu_destroy(struct vcpu *v);
+void altp2m_vcpu_reset(struct vcpu *v);
+
+#endif /* _ALTP2M_H */
+
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 8aa2bb3..36f1b74 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -210,6 +210,14 @@ struct hvm_function_table {
                                   uint32_t *ecx, uint32_t *edx);
 
     void (*enable_msr_exit_interception)(struct domain *d);
+
+    /* Alternate p2m */
+    int (*ap2m_vcpu_initialise)(struct vcpu *v);
+    void (*ap2m_vcpu_destroy)(struct vcpu *v);
+    int (*ap2m_vcpu_reset)(struct vcpu *v);
+    void (*ap2m_vcpu_update_eptp)(struct vcpu *v);
+    void (*ap2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
+    bool_t (*ap2m_vcpu_emulate_ve)(struct vcpu *v);
 };
 
 extern struct hvm_function_table hvm_funcs;
@@ -525,6 +533,15 @@ extern bool_t opt_hvm_fep;
 #define opt_hvm_fep 0
 #endif
 
+/* updates the current EPTP in VMCS */
+void ap2m_vcpu_update_eptp(struct vcpu *v);
+
+/* updates VMCS fields related to VMFUNC and #VE */
+void ap2m_vcpu_update_vmfunc_ve(struct vcpu *v);
+
+/* emulates #VE */
+bool_t ap2m_vcpu_emulate_ve(struct vcpu *v);
+
 #endif /* __ASM_X86_HVM_HVM_H__ */
 
 /*
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 3d8f4dc..09f25c1 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -118,6 +118,13 @@ struct nestedvcpu {
 
 #define vcpu_nestedhvm(v) ((v)->arch.hvm_vcpu.nvcpu)
 
+struct altp2mvcpu {
+    uint16_t    p2midx;         /* alternate p2m index */
+    gfn_t       veinfo_gfn;     /* #VE information page gfn */
+};
+
+#define vcpu_altp2m(v) ((v)->arch.hvm_vcpu.avcpu)
+
 struct hvm_vcpu {
     /* Guest control-register and EFER values, just as the guest sees them. */
     unsigned long       guest_cr[5];
@@ -163,6 +170,8 @@ struct hvm_vcpu {
 
     struct nestedvcpu   nvcpu;
 
+    struct altp2mvcpu   avcpu;
+
     struct mtrr_state   mtrr;
     u64                 pat_cr;
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index b49c09b..079a298 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -175,6 +175,7 @@ typedef unsigned int p2m_query_t;
 typedef enum {
     p2m_host,
     p2m_nested,
+    p2m_alternate,
 } p2m_class_t;
 
 /* Per-p2m-table state */
@@ -193,7 +194,7 @@ struct p2m_domain {
 
     struct domain     *domain;   /* back pointer to domain */
 
-    p2m_class_t       p2m_class; /* host/nested/? */
+    p2m_class_t       p2m_class; /* host/nested/alternate */
 
     /* Nested p2ms only: nested p2m base value that this p2m shadows.
      * This can be cleared to P2M_BASE_EADDR under the per-p2m lock but
@@ -219,6 +220,9 @@ struct p2m_domain {
      * host p2m's lock. */
     int                defer_nested_flush;
 
+    /* Alternate p2m: count of vcpu's currently using this p2m. */
+    atomic_t           active_vcpus;
+
     /* Pages used to construct the p2m */
     struct page_list_head pages;
 
@@ -317,6 +321,11 @@ static inline bool_t p2m_is_nestedp2m(const struct p2m_domain *p2m)
     return p2m->p2m_class == p2m_nested;
 }
 
+static inline bool_t p2m_is_altp2m(const struct p2m_domain *p2m)
+{
+    return p2m->p2m_class == p2m_alternate;
+}
+
 #define p2m_get_pagetable(p2m)  ((p2m)->phys_table)
 
 /**** p2m query accessors. They lock p2m_lock, and thus serialize
@@ -722,6 +731,25 @@ void nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
     l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
 
 /*
+ * Alternate p2m: shadow p2m tables used for alternate memory views
+ */
+
+/* get current alternate p2m table */
+static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    uint16_t index = vcpu_altp2m(v).p2midx;
+
+    return (index == INVALID_ALTP2M) ? NULL : d->arch.altp2m_p2m[index];
+}
+
+/* Locate an alternate p2m by its EPTP */
+uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp);
+
+/* Switch alternate p2m for a single vcpu */
+bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx);
+
+/*
  * p2m type to IOMMU flags
  */
 static inline unsigned int p2m_get_iommu_flags(p2m_type_t p2mt)
-- 
1.9.1


* [PATCH v3 06/13] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-01 18:09 [PATCH v3 00/13] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (4 preceding siblings ...)
  2015-07-01 18:09 ` [PATCH v3 05/13] x86/altp2m: basic data structures and support routines Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-03 16:29   ` Andrew Cooper
  2015-07-07 19:02   ` Nakajima, Jun
  2015-07-01 18:09 ` [PATCH v3 07/13] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
                   ` (8 subsequent siblings)
  14 siblings, 2 replies; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Implement and hook up the code to enable VMX support of VMFUNC and #VE.

VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.

Signed-off-by: Ed White <edmund.h.white@intel.com>
---
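For context, a sketch of the EPTP list that VM_FUNCTION_CONTROL and
VMFUNC leaf 0 operate on (illustrative; the list layout is per the SDM,
and MAX_EPTP is defined in patch 5 as PAGE_SIZE / sizeof(uint64_t),
i.e. 512 entries):

    /* One 4K page of up to 512 EPTPs.  VMFUNC leaf 0 with ECX == i
     * loads entry i into the VMCS EPT_POINTER without a VM exit. */
    uint64_t *eptp_list = d->arch.altp2m_eptp;
    eptp_list[i] = ept_get_eptp(&d->arch.altp2m_p2m[i]->ept);
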
 xen/arch/x86/hvm/vmx/vmx.c | 138 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 138 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 2d3ad63..9585aa3 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -56,6 +56,7 @@
 #include <asm/debugger.h>
 #include <asm/apic.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/altp2m.h>
 #include <asm/event.h>
 #include <asm/monitor.h>
 #include <public/arch-x86/cpuid.h>
@@ -1763,6 +1764,104 @@ static void vmx_enable_msr_exit_interception(struct domain *d)
                                          MSR_TYPE_W);
 }
 
+static void vmx_vcpu_update_eptp(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    struct p2m_domain *p2m = NULL;
+    struct ept_data *ept;
+
+    if ( altp2m_active(d) )
+        p2m = p2m_get_altp2m(v);
+    if ( !p2m )
+        p2m = p2m_get_hostp2m(d);
+
+    ept = &p2m->ept;
+    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
+
+    vmx_vmcs_enter(v);
+
+    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
+
+    if ( v->arch.hvm_vmx.secondary_exec_control &
+        SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
+        __vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);
+
+    vmx_vmcs_exit(v);
+}
+
+static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
+
+    if ( !cpu_has_vmx_vmfunc )
+        return;
+
+    if ( cpu_has_vmx_virt_exceptions )
+        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+
+    vmx_vmcs_enter(v);
+
+    if ( !d->is_dying && altp2m_active(d) )
+    {
+        v->arch.hvm_vmx.secondary_exec_control |= mask;
+        __vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
+        __vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
+
+        if ( cpu_has_vmx_virt_exceptions )
+        {
+            p2m_type_t t;
+            mfn_t mfn;
+
+            mfn = get_gfn_query_unlocked(d, gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
+
+            if ( mfn_x(mfn) != INVALID_MFN )
+                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
+            else
+                mask &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+        }
+    }
+    else
+        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
+
+    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
+        v->arch.hvm_vmx.secondary_exec_control);
+
+    vmx_vmcs_exit(v);
+}
+
+static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
+{
+    bool_t rc = 0;
+    ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) != INVALID_GFN ?
+        hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn), 0) : NULL;
+
+    if ( !veinfo )
+        return 0;
+
+    if ( veinfo->semaphore != 0 )
+        goto out;
+
+    rc = 1;
+
+    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
+    veinfo->semaphore = ~0l;
+    veinfo->eptp_index = vcpu_altp2m(v).p2midx;
+
+    vmx_vmcs_enter(v);
+    __vmread(EXIT_QUALIFICATION, &veinfo->exit_qualification);
+    __vmread(GUEST_LINEAR_ADDRESS, &veinfo->gla);
+    __vmread(GUEST_PHYSICAL_ADDRESS, &veinfo->gpa);
+    vmx_vmcs_exit(v);
+
+    hvm_inject_hw_exception(TRAP_virtualisation,
+                            HVM_DELIVER_NO_ERROR_CODE);
+
+out:
+    hvm_unmap_guest_frame(veinfo, 0);
+    return rc;
+}
+
 static struct hvm_function_table __initdata vmx_function_table = {
     .name                 = "VMX",
     .cpu_up_prepare       = vmx_cpu_up_prepare,
@@ -1822,6 +1921,9 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
     .hypervisor_cpuid_leaf = vmx_hypervisor_cpuid_leaf,
     .enable_msr_exit_interception = vmx_enable_msr_exit_interception,
+    .ap2m_vcpu_update_eptp = vmx_vcpu_update_eptp,
+    .ap2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
+    .ap2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2754,6 +2856,42 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 
     /* Now enable interrupts so it's safe to take locks. */
     local_irq_enable();
+
+    /*
+     * If the guest has the ability to switch EPTP without an exit,
+     * figure out whether it has done so and update the altp2m data.
+     */
+    if ( altp2m_active(v->domain) &&
+        (v->arch.hvm_vmx.secondary_exec_control &
+        SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
+    {
+        unsigned long idx;
+
+        if ( v->arch.hvm_vmx.secondary_exec_control &
+            SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
+            __vmread(EPTP_INDEX, &idx);
+        else
+        {
+            unsigned long eptp;
+
+            __vmread(EPT_POINTER, &eptp);
+
+            if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
+                 INVALID_ALTP2M )
+            {
+                gdprintk(XENLOG_ERR, "EPTP not found in alternate p2m list\n");
+                domain_crash(v->domain);
+            }
+        }
+
+        if ( (uint16_t)idx != vcpu_altp2m(v).p2midx )
+        {
+            BUG_ON(idx >= MAX_ALTP2M);
+            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+            vcpu_altp2m(v).p2midx = (uint16_t)idx;
+            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+        }
+    }
 
     /* XXX: This looks ugly, but we need a mechanism to ensure
      * any pending vmresume has really happened
-- 
1.9.1


* [PATCH v3 07/13] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-01 18:09 [PATCH v3 00/13] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (5 preceding siblings ...)
  2015-07-01 18:09 ` [PATCH v3 06/13] VMX/altp2m: add code to support EPTP switching and #VE Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-03 16:40   ` Andrew Cooper
  2015-07-09 14:05   ` Jan Beulich
  2015-07-01 18:09 ` [PATCH v3 08/13] x86/altp2m: add control of suppress_ve Ed White
                   ` (7 subsequent siblings)
  14 siblings, 2 replies; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Ravi Sahita <ravi.sahita@intel.com>

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
---
 xen/arch/x86/hvm/emulate.c             | 12 +++++++--
 xen/arch/x86/hvm/vmx/vmx.c             | 30 +++++++++++++++++++++
 xen/arch/x86/x86_emulate/x86_emulate.c | 48 +++++++++++++++++++++-------------
 xen/arch/x86/x86_emulate/x86_emulate.h |  4 +++
 xen/include/asm-x86/hvm/hvm.h          |  2 ++
 5 files changed, 76 insertions(+), 20 deletions(-)
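
For illustration only (not part of this patch): VMFUNC is encoded
0f 01 d4 (group 7, modrm 0xd4), and leaf 0 (EPTP switching) takes the
function number in EAX and the EPTP index in ECX. A guest-side helper
invoking the real instruction might look like this sketch:

    /* Illustrative: switch to EPTP index 'idx' via VMFUNC leaf 0. */
    static inline void vmfunc_switch_eptp(uint32_t idx)
    {
        asm volatile ( ".byte 0x0f,0x01,0xd4"
                       :: "a" (0), "c" (idx) : "memory" );
    }

On hardware without native VMFUNC support the instruction raises #UD
in the guest; the ops->vmfunc/hvmemul_vmfunc path added here provides
the equivalent behaviour through the emulator.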

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index ac9c9d6..157fe78 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1356,6 +1356,12 @@ static int hvmemul_invlpg(
     return rc;
 }
 
+static int hvmemul_vmfunc(
+    struct x86_emulate_ctxt *ctxt)
+{
+    return hvm_funcs.ap2m_vcpu_emulate_vmfunc(ctxt->regs);
+}
+
 static const struct x86_emulate_ops hvm_emulate_ops = {
     .read          = hvmemul_read,
     .insn_fetch    = hvmemul_insn_fetch,
@@ -1379,7 +1385,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
     .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
-    .invlpg        = hvmemul_invlpg
+    .invlpg        = hvmemul_invlpg,
+    .vmfunc        = hvmemul_vmfunc,
 };
 
 static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
@@ -1405,7 +1412,8 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
     .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
-    .invlpg        = hvmemul_invlpg
+    .invlpg        = hvmemul_invlpg,
+    .vmfunc        = hvmemul_vmfunc,
 };
 
 static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 9585aa3..c6feeae 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -82,6 +82,7 @@ static void vmx_fpu_dirty_intercept(void);
 static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content);
 static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
 static void vmx_invlpg_intercept(unsigned long vaddr);
+static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
 
 uint8_t __read_mostly posted_intr_vector;
 
@@ -1830,6 +1831,20 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
     vmx_vmcs_exit(v);
 }
 
+static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
+{
+    int rc = X86EMUL_EXCEPTION;
+    struct vcpu *v = current;
+
+    if ( !cpu_has_vmx_vmfunc && altp2m_active(v->domain) &&
+         regs->eax == 0 &&
+         p2m_switch_vcpu_altp2m_by_id(v, (uint16_t)regs->ecx) )
+    {
+        rc = X86EMUL_OKAY;
+    }
+    return rc;
+}
+
 static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
 {
     bool_t rc = 0;
@@ -1898,6 +1913,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .msr_read_intercept   = vmx_msr_read_intercept,
     .msr_write_intercept  = vmx_msr_write_intercept,
     .invlpg_intercept     = vmx_invlpg_intercept,
+    .vmfunc_intercept     = vmx_vmfunc_intercept,
     .handle_cd            = vmx_handle_cd,
     .set_info_guest       = vmx_set_info_guest,
     .set_rdtsc_exiting    = vmx_set_rdtsc_exiting,
@@ -1924,6 +1940,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .ap2m_vcpu_update_eptp = vmx_vcpu_update_eptp,
     .ap2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
     .ap2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
+    .ap2m_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2095,6 +2112,12 @@ static void vmx_invlpg_intercept(unsigned long vaddr)
         vpid_sync_vcpu_gva(curr, vaddr);
 }
 
+static int vmx_vmfunc_intercept(struct cpu_user_regs *regs)
+{
+    gdprintk(XENLOG_ERR, "Failed guest VMFUNC execution\n");
+    return X86EMUL_EXCEPTION;
+}
+
 static int vmx_cr_access(unsigned long exit_qualification)
 {
     struct vcpu *curr = current;
@@ -3245,6 +3268,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             update_guest_eip();
         break;
 
+    case EXIT_REASON_VMFUNC:
+        if ( vmx_vmfunc_intercept(regs) == X86EMUL_OKAY )
+            update_guest_eip();
+        else
+            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        break;
+
     case EXIT_REASON_MWAIT_INSTRUCTION:
     case EXIT_REASON_MONITOR_INSTRUCTION:
     case EXIT_REASON_GETSEC:
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index c017c69..adf64d0 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -3815,28 +3815,40 @@ x86_emulate(
     case 0x01: /* Grp7 */ {
         struct segment_register reg;
         unsigned long base, limit, cr0, cr0w;
+        uint64_t tsc_aux;
 
-        if ( modrm == 0xdf ) /* invlpga */
+        switch ( modrm )
         {
-            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
-            generate_exception_if(!mode_ring0(), EXC_GP, 0);
-            fail_if(ops->invlpg == NULL);
-            if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.eax),
-                                   ctxt)) )
-                goto done;
-            break;
-        }
-
-        if ( modrm == 0xf9 ) /* rdtscp */
-        {
-            uint64_t tsc_aux;
-            fail_if(ops->read_msr == NULL);
-            if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
-                goto done;
-            _regs.ecx = (uint32_t)tsc_aux;
-            goto rdtsc;
+            case 0xdf: /* invlpga AMD */
+                generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
+                generate_exception_if(!mode_ring0(), EXC_GP, 0);
+                fail_if(ops->invlpg == NULL);
+                if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.eax),
+                                       ctxt)) )
+                    goto done;
+                break;
+            case 0xf9: /* rdtscp */
+                fail_if(ops->read_msr == NULL);
+                if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
+                    goto done;
+                _regs.ecx = (uint32_t)tsc_aux;
+                goto rdtsc;
+            case 0xd4: /* vmfunc */
+                generate_exception_if(
+                    (lock_prefix |
+                    rep_prefix() |
+                    (vex.pfx == vex_66)),
+                    EXC_UD, -1);
+                fail_if(ops->vmfunc == NULL);
+                if ( (rc = ops->vmfunc(ctxt)) != X86EMUL_OKAY )
+                    goto done;
+                break;
+            default:
+                goto continue_grp7;
         }
+        break;
 
+continue_grp7:
         switch ( modrm_reg & 7 )
         {
         case 0: /* sgdt */
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 064b8f4..a4d4ec8 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -397,6 +397,10 @@ struct x86_emulate_ops
         enum x86_segment seg,
         unsigned long offset,
         struct x86_emulate_ctxt *ctxt);
+
+    /* vmfunc: Emulate VMFUNC, with the leaf in EAX and any argument in ECX */
+    int (*vmfunc)(
+        struct x86_emulate_ctxt *ctxt);
 };
 
 struct cpu_user_regs;
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 36f1b74..595b399 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -167,6 +167,7 @@ struct hvm_function_table {
     int (*msr_read_intercept)(unsigned int msr, uint64_t *msr_content);
     int (*msr_write_intercept)(unsigned int msr, uint64_t msr_content);
     void (*invlpg_intercept)(unsigned long vaddr);
+    int (*vmfunc_intercept)(struct cpu_user_regs *regs);
     void (*handle_cd)(struct vcpu *v, unsigned long value);
     void (*set_info_guest)(struct vcpu *v);
     void (*set_rdtsc_exiting)(struct vcpu *v, bool_t);
@@ -218,6 +219,7 @@ struct hvm_function_table {
     void (*ap2m_vcpu_update_eptp)(struct vcpu *v);
     void (*ap2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
     bool_t (*ap2m_vcpu_emulate_ve)(struct vcpu *v);
+    int (*ap2m_vcpu_emulate_vmfunc)(struct cpu_user_regs *regs);
 };
 
 extern struct hvm_function_table hvm_funcs;
-- 
1.9.1


* [PATCH v3 08/13] x86/altp2m: add control of suppress_ve.
  2015-07-01 18:09 [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (6 preceding siblings ...)
  2015-07-01 18:09 ` [PATCH v3 07/13] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-03 16:43   ` Andrew Cooper
  2015-07-01 18:09 ` [PATCH v3 09/13] x86/altp2m: alternate p2m memory events Ed White
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

The existing ept_set_entry() and ept_get_entry() routines are extended
to optionally set/get suppress_ve, and renamed _ept_set_entry() and
_ept_get_entry(). New ept_set_entry() and ept_get_entry() routines are
provided as wrappers, where set preserves suppress_ve for an existing
entry and sets it for a new entry.

Additional function pointers are added to p2m_domain to allow direct
access to the extended routines.

Signed-off-by: Ed White <edmund.h.white@intel.com>
---
 xen/arch/x86/mm/p2m-ept.c | 40 +++++++++++++++++++++++++++++++++-------
 xen/include/asm-x86/p2m.h | 13 +++++++++++++
 2 files changed, 46 insertions(+), 7 deletions(-)
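
For illustration only (not part of this patch): given a struct
p2m_domain *p2m and an unsigned long gfn, reading the suppress_ve bit
through the extended hook might look like the sketch below, assuming
the usual gfn_lock()/gfn_unlock() discipline around the access.

    p2m_type_t t;
    p2m_access_t a;
    bool_t sve;
    mfn_t mfn;

    gfn_lock(p2m, gfn, 0);
    mfn = p2m->get_entry_full(p2m, gfn, &t, &a, 0, NULL, &sve);
    gfn_unlock(p2m, gfn, 0);

    /* sve now holds the entry's suppress_ve bit (1 if no entry exists). */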

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 4111795..bcb9381 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -650,14 +650,15 @@ bool_t ept_handle_misconfig(uint64_t gpa)
 }
 
 /*
- * ept_set_entry() computes 'need_modify_vtd_table' for itself,
+ * _ept_set_entry() computes 'need_modify_vtd_table' for itself,
  * by observing whether any gfn->mfn translations are modified.
  *
  * Returns: 0 for success, -errno for failure
  */
 static int
-ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn, 
-              unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma)
+_ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
+               unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma,
+               int sve)
 {
     ept_entry_t *table, *ept_entry = NULL;
     unsigned long gfn_remainder = gfn;
@@ -803,7 +804,11 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
     }
 
-    new_entry.suppress_ve = 1;
+    if ( sve != -1 )
+        new_entry.suppress_ve = !!sve;
+    else
+        new_entry.suppress_ve = is_epte_valid(&old_entry) ?
+                                    old_entry.suppress_ve : 1;
 
     rc = atomic_write_ept_entry(ept_entry, new_entry, target);
     if ( unlikely(rc) )
@@ -848,10 +853,18 @@ out:
     return rc;
 }
 
+static int
+ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
+              unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma)
+{
+    return _ept_set_entry(p2m, gfn, mfn, order, p2mt, p2ma, -1);
+}
+
 /* Read ept p2m entries */
-static mfn_t ept_get_entry(struct p2m_domain *p2m,
-                           unsigned long gfn, p2m_type_t *t, p2m_access_t* a,
-                           p2m_query_t q, unsigned int *page_order)
+static mfn_t _ept_get_entry(struct p2m_domain *p2m,
+                            unsigned long gfn, p2m_type_t *t, p2m_access_t* a,
+                            p2m_query_t q, unsigned int *page_order,
+                            bool_t *sve)
 {
     ept_entry_t *table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
     unsigned long gfn_remainder = gfn;
@@ -865,6 +878,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
 
     *t = p2m_mmio_dm;
     *a = p2m_access_n;
+    if ( sve )
+        *sve = 1;
 
     /* This pfn is higher than the highest the p2m map currently holds */
     if ( gfn > p2m->max_mapped_pfn )
@@ -930,6 +945,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
         else
             *t = ept_entry->sa_p2mt;
         *a = ept_entry->access;
+        if ( sve )
+            *sve = ept_entry->suppress_ve;
 
         mfn = _mfn(ept_entry->mfn);
         if ( i )
@@ -953,6 +970,13 @@ out:
     return mfn;
 }
 
+static mfn_t ept_get_entry(struct p2m_domain *p2m,
+                           unsigned long gfn, p2m_type_t *t, p2m_access_t* a,
+                           p2m_query_t q, unsigned int *page_order)
+{
+    return _ept_get_entry(p2m, gfn, t, a, q, page_order, NULL);
+}
+
 void ept_walk_table(struct domain *d, unsigned long gfn)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
@@ -1131,6 +1155,8 @@ int ept_p2m_init(struct p2m_domain *p2m)
 
     p2m->set_entry = ept_set_entry;
     p2m->get_entry = ept_get_entry;
+    p2m->set_entry_full = _ept_set_entry;
+    p2m->get_entry_full = _ept_get_entry;
     p2m->change_entry_type_global = ept_change_entry_type_global;
     p2m->change_entry_type_range = ept_change_entry_type_range;
     p2m->memory_type_changed = ept_memory_type_changed;
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 079a298..bf5e5cb 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -237,6 +237,19 @@ struct p2m_domain {
                                        p2m_access_t *p2ma,
                                        p2m_query_t q,
                                        unsigned int *page_order);
+    int                (*set_entry_full)(struct p2m_domain *p2m,
+                                         unsigned long gfn,
+                                         mfn_t mfn, unsigned int page_order,
+                                         p2m_type_t p2mt,
+                                         p2m_access_t p2ma,
+                                         int sve);
+    mfn_t              (*get_entry_full)(struct p2m_domain *p2m,
+                                         unsigned long gfn,
+                                         p2m_type_t *p2mt,
+                                         p2m_access_t *p2ma,
+                                         p2m_query_t q,
+                                         unsigned int *page_order,
+                                         bool_t *sve);
     void               (*enable_hardware_log_dirty)(struct p2m_domain *p2m);
     void               (*disable_hardware_log_dirty)(struct p2m_domain *p2m);
     void               (*flush_hardware_cached_dirty)(struct p2m_domain *p2m);
-- 
1.9.1


* [PATCH v3 09/13] x86/altp2m: alternate p2m memory events.
  2015-07-01 18:09 [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (7 preceding siblings ...)
  2015-07-01 18:09 ` [PATCH v3 08/13] x86/altp2m: add control of suppress_ve Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-01 18:29   ` Lengyel, Tamas
                     ` (2 more replies)
  2015-07-01 18:09 ` [PATCH v3 10/13] x86/altp2m: add remaining support routines Ed White
                   ` (5 subsequent siblings)
  14 siblings, 3 replies; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Add a flag to indicate that a memory event occurred in an alternate p2m
and a field containing the p2m index. Allow any event response to switch
to a different alternate p2m using the same flag and field.

Modify p2m_memory_access_check() to handle alternate p2m's.

Signed-off-by: Ed White <edmund.h.white@intel.com>
---
 xen/arch/x86/mm/p2m.c         | 20 +++++++++++++++++++-
 xen/common/vm_event.c         |  3 +++
 xen/include/asm-arm/p2m.h     |  6 ++++++
 xen/include/asm-x86/p2m.h     |  3 +++
 xen/include/public/vm_event.h | 11 +++++++++++
 5 files changed, 42 insertions(+), 1 deletion(-)
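
For illustration only (not part of this patch): an introspection agent
that wants the faulting vCPU to resume in a different view could reply
to a mem_access request roughly as below, where req is the
vm_event_request_t just taken off the ring and view 2 is an arbitrary,
previously created alternate p2m.

    vm_event_response_t rsp = { 0 };

    rsp.vcpu_id    = req.vcpu_id;
    rsp.flags      = req.flags | VM_EVENT_FLAG_ALTERNATE_P2M;
    rsp.altp2m_idx = 2;
    /* ... place rsp on the event ring and notify the event channel ... */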

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 58d4951..576b28d 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1514,6 +1514,13 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
     }
 }
 
+void p2m_altp2m_check(struct vcpu *v, const vm_event_response_t *rsp)
+{
+    if ( (rsp->flags & VM_EVENT_FLAG_ALTERNATE_P2M) &&
+         altp2m_active(v->domain) )
+        p2m_switch_vcpu_altp2m_by_id(v, rsp->altp2m_idx);
+}
+
 bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
                             struct npfec npfec,
                             vm_event_request_t **req_ptr)
@@ -1521,7 +1528,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     struct vcpu *v = current;
     unsigned long gfn = gpa >> PAGE_SHIFT;
     struct domain *d = v->domain;    
-    struct p2m_domain* p2m = p2m_get_hostp2m(d);
+    struct p2m_domain *p2m = NULL;
     mfn_t mfn;
     p2m_type_t p2mt;
     p2m_access_t p2ma;
@@ -1529,6 +1536,11 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     int rc;
     unsigned long eip = guest_cpu_user_regs()->eip;
 
+    if ( altp2m_active(d) )
+        p2m = p2m_get_altp2m(v);
+    if ( !p2m )
+        p2m = p2m_get_hostp2m(d);
+
     /* First, handle rx2rw conversion automatically.
      * These calls to p2m->set_entry() must succeed: we have the gfn
      * locked and just did a successful get_entry(). */
@@ -1635,6 +1647,12 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
         req->vcpu_id = v->vcpu_id;
 
         p2m_vm_event_fill_regs(req);
+
+        if ( altp2m_active(v->domain) )
+        {
+            req->flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
+            req->altp2m_idx = vcpu_altp2m(v).p2midx;
+        }
     }
 
     /* Pause the current VCPU */
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 120a78a..57095d8 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -399,6 +399,9 @@ void vm_event_resume(struct domain *d, struct vm_event_domain *ved)
 
         };
 
+        /* Check for altp2m switch */
+        p2m_altp2m_check(v, &rsp);
+
         /* Unpause domain. */
         if ( rsp.flags & VM_EVENT_FLAG_VCPU_PAUSED )
             vm_event_vcpu_unpause(v);
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 63748ef..3206c75 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -109,6 +109,12 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
     /* Not supported on ARM. */
 }
 
+static inline
+void p2m_altp2m_check(struct vcpu *v, const vm_event_response_t *rsp)
+{
+    /* Not supported on ARM. */
+}
+
 #define p2m_is_foreign(_t)  ((_t) == p2m_map_foreign)
 #define p2m_is_ram(_t)      ((_t) == p2m_ram_rw || (_t) == p2m_ram_ro)
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index bf5e5cb..5689f40 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -762,6 +762,9 @@ uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp);
 /* Switch alternate p2m for a single vcpu */
 bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx);
 
+/* Check to see if vcpu should be switched to a different p2m. */
+void p2m_altp2m_check(struct vcpu *v, const vm_event_response_t *rsp);
+
 /*
  * p2m type to IOMMU flags
  */
diff --git a/xen/include/public/vm_event.h b/xen/include/public/vm_event.h
index 577e971..6dfa9db 100644
--- a/xen/include/public/vm_event.h
+++ b/xen/include/public/vm_event.h
@@ -47,6 +47,16 @@
 #define VM_EVENT_FLAG_VCPU_PAUSED     (1 << 0)
 /* Flags to aid debugging mem_event */
 #define VM_EVENT_FLAG_FOREIGN         (1 << 1)
+/*
+ * This flag can be set in a request or a response
+ *
+ * On a request, indicates that the event occurred in the alternate p2m specified by
+ * the altp2m_idx request field.
+ *
+ * On a response, indicates that the VCPU should resume in the alternate p2m specified
+ * by the altp2m_idx response field if possible.
+ */
+#define VM_EVENT_FLAG_ALTERNATE_P2M   (1 << 2)
 
 /*
  * Reasons for the vm event request
@@ -194,6 +204,7 @@ typedef struct vm_event_st {
     uint32_t flags;     /* VM_EVENT_FLAG_* */
     uint32_t reason;    /* VM_EVENT_REASON_* */
     uint32_t vcpu_id;
+    uint16_t altp2m_idx; /* may be used during request and response */
 
     union {
         struct vm_event_paging                mem_paging;
-- 
1.9.1


* [PATCH v3 10/13] x86/altp2m: add remaining support routines.
  2015-07-01 18:09 [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (8 preceding siblings ...)
  2015-07-01 18:09 ` [PATCH v3 09/13] x86/altp2m: alternate p2m memory events Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-03 16:56   ` Andrew Cooper
  2015-07-09 15:07   ` George Dunlap
  2015-07-01 18:09 ` [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
                   ` (4 subsequent siblings)
  14 siblings, 2 replies; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Add the remaining routines required to support enabling the alternate
p2m functionality: lazy population of alternate p2m's on nested page
faults, flushing/initialising/destroying/switching views, per-view
mem-access control and gfn->mfn remapping, and propagation of host
p2m changes to all active alternate p2m's.

Signed-off-by: Ed White <edmund.h.white@intel.com>
---
 xen/arch/x86/hvm/hvm.c           |  58 +++++-
 xen/arch/x86/mm/hap/Makefile     |   1 +
 xen/arch/x86/mm/hap/altp2m_hap.c |  98 ++++++++++
 xen/arch/x86/mm/p2m-ept.c        |   3 +
 xen/arch/x86/mm/p2m.c            | 385 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/altp2m.h |   4 +
 xen/include/asm-x86/p2m.h        |  33 ++++
 7 files changed, 576 insertions(+), 6 deletions(-)
 create mode 100644 xen/arch/x86/mm/hap/altp2m_hap.c

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index f21d34d..d2d90c8 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2806,10 +2806,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     mfn_t mfn;
     struct vcpu *curr = current;
     struct domain *currd = curr->domain;
-    struct p2m_domain *p2m;
+    struct p2m_domain *p2m, *hostp2m;
     int rc, fall_through = 0, paged = 0;
     int sharing_enomem = 0;
     vm_event_request_t *req_ptr = NULL;
+    bool_t ap2m_active = 0;
 
     /* On Nested Virtualization, walk the guest page table.
      * If this succeeds, all is fine.
@@ -2869,11 +2870,31 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
         goto out;
     }
 
-    p2m = p2m_get_hostp2m(currd);
-    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 
+    ap2m_active = altp2m_active(currd);
+
+    /* Take a lock on the host p2m speculatively, to avoid potential
+     * locking order problems later and to handle unshare etc.
+     */
+    hostp2m = p2m_get_hostp2m(currd);
+    mfn = get_gfn_type_access(hostp2m, gfn, &p2mt, &p2ma,
                               P2M_ALLOC | (npfec.write_access ? P2M_UNSHARE : 0),
                               NULL);
 
+    if ( ap2m_active )
+    {
+        if ( altp2m_hap_nested_page_fault(curr, gpa, gla, npfec, &p2m) == 1 )
+        {
+            /* entry was lazily copied from host -- retry */
+            __put_gfn(hostp2m, gfn);
+            rc = 1;
+            goto out;
+        }
+
+        mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+    }
+    else
+        p2m = hostp2m;
+
     /* Check access permissions first, then handle faults */
     if ( mfn_x(mfn) != INVALID_MFN )
     {
@@ -2913,6 +2934,20 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
 
         if ( violation )
         {
+            /* Should #VE be emulated for this fault? */
+            if ( p2m_is_altp2m(p2m) && !cpu_has_vmx_virt_exceptions )
+            {
+                bool_t sve;
+
+                p2m->get_entry_full(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
+
+                if ( !sve && ap2m_vcpu_emulate_ve(curr) )
+                {
+                    rc = 1;
+                    goto out_put_gfn;
+                }
+            }
+
             if ( p2m_mem_access_check(gpa, gla, npfec, &req_ptr) )
             {
                 fall_through = 1;
@@ -2932,7 +2967,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
          (npfec.write_access &&
           (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
     {
-        put_gfn(currd, gfn);
+        __put_gfn(p2m, gfn);
+        if ( ap2m_active )
+            __put_gfn(hostp2m, gfn);
 
         rc = 0;
         if ( unlikely(is_pvh_domain(currd)) )
@@ -2961,6 +2998,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     /* Spurious fault? PoD and log-dirty also take this path. */
     if ( p2m_is_ram(p2mt) )
     {
+        rc = 1;
         /*
          * Page log dirty is always done with order 0. If this mfn resides in
          * a large page, we do not change other pages type within that large
@@ -2969,9 +3007,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
         if ( npfec.write_access )
         {
             paging_mark_dirty(currd, mfn_x(mfn));
+            /* If p2m is really an altp2m, unlock here to avoid lock ordering
+             * violation when the change below is propagated from host p2m */
+            if ( ap2m_active )
+                __put_gfn(p2m, gfn);
             p2m_change_type_one(currd, gfn, p2m_ram_logdirty, p2m_ram_rw);
+            __put_gfn(ap2m_active ? hostp2m : p2m, gfn);
+
+            goto out;
         }
-        rc = 1;
         goto out_put_gfn;
     }
 
@@ -2981,7 +3025,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     rc = fall_through;
 
 out_put_gfn:
-    put_gfn(currd, gfn);
+    __put_gfn(p2m, gfn);
+    if ( ap2m_active )
+        __put_gfn(hostp2m, gfn);
 out:
     /* All of these are delayed until we exit, since we might 
      * sleep on event ring wait queues, and we must not hold
diff --git a/xen/arch/x86/mm/hap/Makefile b/xen/arch/x86/mm/hap/Makefile
index 68f2bb5..216cd90 100644
--- a/xen/arch/x86/mm/hap/Makefile
+++ b/xen/arch/x86/mm/hap/Makefile
@@ -4,6 +4,7 @@ obj-y += guest_walk_3level.o
 obj-$(x86_64) += guest_walk_4level.o
 obj-y += nested_hap.o
 obj-y += nested_ept.o
+obj-y += altp2m_hap.o
 
 guest_walk_%level.o: guest_walk.c Makefile
 	$(CC) $(CFLAGS) -DGUEST_PAGING_LEVELS=$* -c $< -o $@
diff --git a/xen/arch/x86/mm/hap/altp2m_hap.c b/xen/arch/x86/mm/hap/altp2m_hap.c
new file mode 100644
index 0000000..52c7877
--- /dev/null
+++ b/xen/arch/x86/mm/hap/altp2m_hap.c
@@ -0,0 +1,98 @@
+/******************************************************************************
+ * arch/x86/mm/hap/altp2m_hap.c
+ *
+ * Copyright (c) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <asm/domain.h>
+#include <asm/page.h>
+#include <asm/paging.h>
+#include <asm/p2m.h>
+#include <asm/hap.h>
+#include <asm/hvm/altp2m.h>
+
+#include "private.h"
+
+/*
+ * If the fault is for a not present entry:
+ *     if the entry in the host p2m has a valid mfn, copy it and retry
+ *     else indicate that outer handler should handle fault
+ *
+ * If the fault is for a present entry:
+ *     indicate that outer handler should handle fault
+ */
+
+int
+altp2m_hap_nested_page_fault(struct vcpu *v, paddr_t gpa,
+                                unsigned long gla, struct npfec npfec,
+                                struct p2m_domain **ap2m)
+{
+    struct p2m_domain *hp2m = p2m_get_hostp2m(v->domain);
+    p2m_type_t p2mt;
+    p2m_access_t p2ma;
+    unsigned int page_order;
+    gfn_t gfn = _gfn(paddr_to_pfn(gpa));
+    unsigned long mask;
+    mfn_t mfn;
+    int rv;
+
+    *ap2m = p2m_get_altp2m(v);
+
+    mfn = get_gfn_type_access(*ap2m, gfn_x(gfn), &p2mt, &p2ma,
+                              0, &page_order);
+    __put_gfn(*ap2m, gfn_x(gfn));
+
+    if ( mfn_x(mfn) != INVALID_MFN )
+        return 0;
+
+    mfn = get_gfn_type_access(hp2m, gfn_x(gfn), &p2mt, &p2ma,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order);
+    put_gfn(hp2m->domain, gfn_x(gfn));
+
+    if ( mfn_x(mfn) == INVALID_MFN )
+        return 0;
+
+    p2m_lock(*ap2m);
+
+    /* If this is a superpage mapping, round down both frame numbers
+     * to the start of the superpage. */
+    mask = ~((1UL << page_order) - 1);
+    mfn = _mfn(mfn_x(mfn) & mask);
+
+    rv = p2m_set_entry(*ap2m, gfn_x(gfn) & mask, mfn, page_order, p2mt, p2ma);
+    p2m_unlock(*ap2m);
+
+    if ( rv )
+    {
+        gdprintk(XENLOG_ERR,
+                 "failed to set entry for %#"PRIx64" -> %#"PRIx64" p2m %#"PRIx64"\n",
+                 gfn_x(gfn), mfn_x(mfn), (unsigned long)*ap2m);
+        domain_crash(hp2m->domain);
+    }
+
+    return 1;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index bcb9381..36fb6a4 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -850,6 +850,9 @@ out:
     if ( is_epte_present(&old_entry) )
         ept_free_entry(p2m, &old_entry, target);
 
+    if ( rc == 0 && p2m_is_hostp2m(p2m) )
+        p2m_altp2m_propagate_change(d, _gfn(gfn), mfn, order, p2mt, p2ma);
+
     return rc;
 }
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 576b28d..1d062e7 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2036,6 +2036,391 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx)
     return rc;
 }
 
+void p2m_flush_altp2m(struct domain *d)
+{
+    uint16_t i;
+
+    altp2m_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        p2m_flush_table(d->arch.altp2m_p2m[i]);
+        /* Uninit and reinit ept to force TLB shootdown */
+        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
+        ept_p2m_init(d->arch.altp2m_p2m[i]);
+        d->arch.altp2m_eptp[i] = INVALID_MFN;
+    }
+
+    altp2m_unlock(d);
+}
+
+static void p2m_init_altp2m_helper(struct domain *d, uint16_t i)
+{
+    struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
+    struct ept_data *ept;
+
+    p2m->min_remapped_gfn = INVALID_GFN;
+    p2m->max_remapped_gfn = INVALID_GFN;
+    ept = &p2m->ept;
+    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
+    d->arch.altp2m_eptp[i] = ept_get_eptp(ept);
+}
+
+long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx)
+{
+    long rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    altp2m_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] == INVALID_MFN )
+    {
+        p2m_init_altp2m_helper(d, idx);
+        rc = 0;
+    }
+
+    altp2m_unlock(d);
+    return rc;
+}
+
+long p2m_init_next_altp2m(struct domain *d, uint16_t *idx)
+{
+    long rc = -EINVAL;
+    uint16_t i;
+
+    altp2m_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] != INVALID_MFN )
+            continue;
+
+        p2m_init_altp2m_helper(d, i);
+        *idx = i;
+        rc = 0;
+
+        break;
+    }
+
+    altp2m_unlock(d);
+    return rc;
+}
+
+long p2m_destroy_altp2m_by_id(struct domain *d, uint16_t idx)
+{
+    struct p2m_domain *p2m;
+    long rc = -EINVAL;
+
+    if ( !idx || idx >= MAX_ALTP2M )
+        return rc;
+
+    domain_pause_except_self(d);
+
+    altp2m_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        p2m = d->arch.altp2m_p2m[idx];
+
+        if ( !_atomic_read(p2m->active_vcpus) )
+        {
+            p2m_flush_table(d->arch.altp2m_p2m[idx]);
+            /* Uninit and reinit ept to force TLB shootdown */
+            ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
+            ept_p2m_init(d->arch.altp2m_p2m[idx]);
+            d->arch.altp2m_eptp[idx] = INVALID_MFN;
+            rc = 0;
+        }
+    }
+
+    altp2m_unlock(d);
+
+    domain_unpause_except_self(d);
+
+    return rc;
+}
+
+long p2m_switch_domain_altp2m_by_id(struct domain *d, uint16_t idx)
+{
+    struct vcpu *v;
+    long rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    domain_pause_except_self(d);
+
+    altp2m_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        for_each_vcpu( d, v )
+            if ( idx != vcpu_altp2m(v).p2midx )
+            {
+                atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+                vcpu_altp2m(v).p2midx = idx;
+                atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+                ap2m_vcpu_update_eptp(v);
+            }
+
+        rc = 0;
+    }
+
+    altp2m_unlock(d);
+
+    domain_unpause_except_self(d);
+
+    return rc;
+}
+
+long p2m_set_altp2m_mem_access(struct domain *d, uint16_t idx,
+                                 gfn_t gfn, xenmem_access_t access)
+{
+    struct p2m_domain *hp2m, *ap2m;
+    p2m_access_t req_a, old_a;
+    p2m_type_t t;
+    mfn_t mfn;
+    unsigned int page_order;
+    long rc = -EINVAL;
+
+    static const p2m_access_t memaccess[] = {
+#define ACCESS(ac) [XENMEM_access_##ac] = p2m_access_##ac
+        ACCESS(n),
+        ACCESS(r),
+        ACCESS(w),
+        ACCESS(rw),
+        ACCESS(x),
+        ACCESS(rx),
+        ACCESS(wx),
+        ACCESS(rwx),
+#undef ACCESS
+    };
+
+    if ( idx >= MAX_ALTP2M || d->arch.altp2m_eptp[idx] == INVALID_MFN )
+        return rc;
+
+    ap2m = d->arch.altp2m_p2m[idx];
+
+    switch ( access )
+    {
+    case 0 ... ARRAY_SIZE(memaccess) - 1:
+        req_a = memaccess[access];
+        break;
+    case XENMEM_access_default:
+        req_a = ap2m->default_access;
+        break;
+    default:
+        return rc;
+    }
+
+    /* If request to set default access */
+    if ( gfn_x(gfn) == INVALID_GFN )
+    {
+        ap2m->default_access = req_a;
+        return 0;
+    }
+
+    hp2m = p2m_get_hostp2m(d);
+
+    p2m_lock(ap2m);
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(gfn), &t, &old_a, 0, NULL);
+
+    /* Check host p2m if no valid entry in alternate */
+    if ( !mfn_valid(mfn) )
+    {
+        mfn = hp2m->get_entry(hp2m, gfn_x(gfn), &t, &old_a,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order);
+
+        if ( !mfn_valid(mfn) || t != p2m_ram_rw )
+            goto out;
+
+        /* If this is a superpage, copy that first */
+        if ( page_order != PAGE_ORDER_4K )
+        {
+            gfn_t gfn2;
+            unsigned long mask;
+            mfn_t mfn2;
+
+            mask = ~((1UL << page_order) - 1);
+            gfn2 = _gfn(gfn_x(gfn) & mask);
+            mfn2 = _mfn(mfn_x(mfn) & mask);
+
+            if ( ap2m->set_entry(ap2m, gfn_x(gfn2), mfn2, page_order, t, old_a) )
+                goto out;
+        }
+    }
+
+    if ( !ap2m->set_entry_full(ap2m, gfn_x(gfn), mfn, PAGE_ORDER_4K, t, req_a,
+                               (current->domain != d)) )
+        rc = 0;
+
+out:
+    p2m_unlock(ap2m);
+    return rc;
+}
+
+long p2m_change_altp2m_gfn(struct domain *d, uint16_t idx,
+                             gfn_t old_gfn, gfn_t new_gfn)
+{
+    struct p2m_domain *hp2m, *ap2m;
+    p2m_access_t a;
+    p2m_type_t t;
+    mfn_t mfn;
+    unsigned int page_order;
+    long rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M || d->arch.altp2m_eptp[idx] == INVALID_MFN )
+        return rc;
+
+    hp2m = p2m_get_hostp2m(d);
+    ap2m = d->arch.altp2m_p2m[idx];
+
+    p2m_lock(ap2m);
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(old_gfn), &t, &a, 0, NULL);
+
+    if ( gfn_x(new_gfn) == INVALID_GFN )
+    {
+        if ( mfn_valid(mfn) )
+            p2m_remove_page(ap2m, gfn_x(old_gfn), mfn_x(mfn), PAGE_ORDER_4K);
+        rc = 0;
+        goto out;
+    }
+
+    /* Check host p2m if no valid entry in alternate */
+    if ( !mfn_valid(mfn) )
+    {
+        mfn = hp2m->get_entry(hp2m, gfn_x(old_gfn), &t, &a,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order);
+
+        if ( !mfn_valid(mfn) || t != p2m_ram_rw )
+            goto out;
+
+        /* If this is a superpage, copy that first */
+        if ( page_order != PAGE_ORDER_4K )
+        {
+            gfn_t gfn;
+            unsigned long mask;
+
+            mask = ~((1UL << page_order) - 1);
+            gfn = _gfn(gfn_x(old_gfn) & mask);
+            mfn = _mfn(mfn_x(mfn) & mask);
+
+            if ( ap2m->set_entry(ap2m, gfn_x(gfn), mfn, page_order, t, a) )
+                goto out;
+        }
+    }
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(new_gfn), &t, &a, 0, NULL);
+
+    if ( !mfn_valid(mfn) )
+        mfn = hp2m->get_entry(hp2m, gfn_x(new_gfn), &t, &a, 0, NULL);
+
+    if ( !mfn_valid(mfn) || (t != p2m_ram_rw) )
+        goto out;
+
+    if ( !ap2m->set_entry_full(ap2m, gfn_x(old_gfn), mfn, PAGE_ORDER_4K, t, a,
+                               (current->domain != d)) )
+    {
+        rc = 0;
+
+        if ( ap2m->min_remapped_gfn == INVALID_GFN ||
+             gfn_x(new_gfn) < ap2m->min_remapped_gfn )
+            ap2m->min_remapped_gfn = gfn_x(new_gfn);
+        if ( ap2m->max_remapped_gfn == INVALID_GFN ||
+             gfn_x(new_gfn) > ap2m->max_remapped_gfn )
+            ap2m->max_remapped_gfn = gfn_x(new_gfn);
+    }
+
+out:
+    p2m_unlock(ap2m);
+    return rc;
+}
+
+static void p2m_reset_altp2m(struct p2m_domain *p2m)
+{
+    p2m_flush_table(p2m);
+    /* Uninit and reinit ept to force TLB shootdown */
+    ept_p2m_uninit(p2m);
+    ept_p2m_init(p2m);
+    p2m->min_remapped_gfn = INVALID_GFN;
+    p2m->max_remapped_gfn = INVALID_GFN;
+}
+
+void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
+                                 mfn_t mfn, unsigned int page_order,
+                                 p2m_type_t p2mt, p2m_access_t p2ma)
+{
+    struct p2m_domain *p2m;
+    p2m_access_t a;
+    p2m_type_t t;
+    mfn_t m;
+    uint16_t i;
+    bool_t reset_p2m;
+    unsigned int reset_count = 0;
+    uint16_t last_reset_idx = ~0;
+
+    if ( !altp2m_active(d) )
+        return;
+
+    altp2m_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
+            continue;
+
+        p2m = d->arch.altp2m_p2m[i];
+        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
+
+        reset_p2m = 0;
+
+        /* Check for a dropped page that may impact this altp2m */
+        if ( mfn_x(mfn) == INVALID_MFN &&
+             gfn_x(gfn) >= p2m->min_remapped_gfn &&
+             gfn_x(gfn) <= p2m->max_remapped_gfn )
+            reset_p2m = 1;
+
+        if ( reset_p2m )
+        {
+            if ( !reset_count++ )
+            {
+                p2m_reset_altp2m(p2m);
+                last_reset_idx = i;
+            }
+            else
+            {
+                /* At least 2 altp2m's impacted, so reset everything */
+                __put_gfn(p2m, gfn_x(gfn));
+
+                for ( i = 0; i < MAX_ALTP2M; i++ )
+                {
+                    if ( i == last_reset_idx ||
+                         d->arch.altp2m_eptp[i] == INVALID_MFN )
+                        continue;
+
+                    p2m = d->arch.altp2m_p2m[i];
+                    p2m_lock(p2m);
+                    p2m_reset_altp2m(p2m);
+                    p2m_unlock(p2m);
+                }
+
+                goto out;
+            }
+        }
+        else if ( mfn_x(m) != INVALID_MFN )
+           p2m_set_entry(p2m, gfn_x(gfn), mfn, page_order, p2mt, p2ma);
+
+        __put_gfn(p2m, gfn_x(gfn));
+    }
+
+out:
+    altp2m_unlock(d);
+}
+
 /*** Audit ***/
 
 #if P2M_AUDIT
diff --git a/xen/include/asm-x86/hvm/altp2m.h b/xen/include/asm-x86/hvm/altp2m.h
index 1a6f22b..f90b5af 100644
--- a/xen/include/asm-x86/hvm/altp2m.h
+++ b/xen/include/asm-x86/hvm/altp2m.h
@@ -34,5 +34,9 @@ int altp2m_vcpu_initialise(struct vcpu *v);
 void altp2m_vcpu_destroy(struct vcpu *v);
 void altp2m_vcpu_reset(struct vcpu *v);
 
+/* Alternate p2m paging */
+int altp2m_hap_nested_page_fault(struct vcpu *v, paddr_t gpa,
+    unsigned long gla, struct npfec npfec, struct p2m_domain **ap2m);
+
 #endif /* _ALTP2M_H */
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 5689f40..2159c5b 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -279,6 +279,11 @@ struct p2m_domain {
     /* Highest guest frame that's ever been mapped in the p2m */
     unsigned long max_mapped_pfn;
 
+    /* Alternate p2m's only: range of gfn's for which underlying
+     * mfn may have duplicate mappings */
+    unsigned long min_remapped_gfn;
+    unsigned long max_remapped_gfn;
+
     /* When releasing shared gfn's in a preemptible manner, recall where
      * to resume the search */
     unsigned long next_shared_gfn_to_relinquish;
@@ -765,6 +770,34 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx);
 /* Check to see if vcpu should be switched to a different p2m. */
 void p2m_altp2m_check(struct vcpu *v, const vm_event_response_t *rsp);
 
+/* Flush all the alternate p2m's for a domain */
+void p2m_flush_altp2m(struct domain *d);
+
+/* Make a specific alternate p2m valid */
+long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx);
+
+/* Find an available alternate p2m and make it valid */
+long p2m_init_next_altp2m(struct domain *d, uint16_t *idx);
+
+/* Make a specific alternate p2m invalid */
+long p2m_destroy_altp2m_by_id(struct domain *d, uint16_t idx);
+
+/* Switch alternate p2m for entire domain */
+long p2m_switch_domain_altp2m_by_id(struct domain *d, uint16_t idx);
+
+/* Set access type for a gfn */
+long p2m_set_altp2m_mem_access(struct domain *d, uint16_t idx,
+                                 gfn_t gfn, xenmem_access_t access);
+
+/* Change a gfn->mfn mapping */
+long p2m_change_altp2m_gfn(struct domain *d, uint16_t idx,
+                             gfn_t old_gfn, gfn_t new_gfn);
+
+/* Propagate a host p2m change to all alternate p2m's */
+void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
+                                 mfn_t mfn, unsigned int page_order,
+                                 p2m_type_t p2mt, p2m_access_t p2ma);
+
 /*
  * p2m type to IOMMU flags
  */
-- 
1.9.1


* [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-01 18:09 [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (9 preceding siblings ...)
  2015-07-01 18:09 ` [PATCH v3 10/13] x86/altp2m: add remaining support routines Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-06 10:09   ` Andrew Cooper
  2015-07-09 14:34   ` Jan Beulich
  2015-07-01 18:09 ` [PATCH v3 12/13] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
                   ` (3 subsequent siblings)
  14 siblings, 2 replies; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Signed-off-by: Ed White <edmund.h.white@intel.com>
---
 xen/arch/x86/hvm/hvm.c          | 201 ++++++++++++++++++++++++++++++++++++++++
 xen/include/public/hvm/hvm_op.h |  69 ++++++++++++++
 2 files changed, 270 insertions(+)
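
For illustration only (no toolstack or in-guest consumers are part of
this series): an agent running inside the guest could switch the whole
domain to a previously created view roughly as follows, assuming a
HYPERVISOR_hvm_op() hypercall wrapper as found in guest kernels.

    struct xen_hvm_altp2m_view view = {
        .domid = DOMID_SELF,  /* a guest may target itself */
        .view  = 1,           /* from HVMOP_altp2m_create_p2m */
    };

    long rc = HYPERVISOR_hvm_op(HVMOP_altp2m_switch_p2m, &view);
    /* rc == -EINVAL if altp2m is inactive or the view is not valid. */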

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index d2d90c8..0d81050 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6447,6 +6447,207 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     }
 
+    case HVMOP_altp2m_get_domain_state:
+    {
+        struct xen_hvm_altp2m_domain_state a;
+        struct domain *d;
+
+        if ( copy_from_guest(&a, arg, 1) )
+            return -EFAULT;
+
+        d = rcu_lock_domain_by_any_id(a.domid);
+        if ( d == NULL )
+            return -ESRCH;
+
+        rc = -EINVAL;
+        if ( is_hvm_domain(d) && hvm_altp2m_supported() )
+        {
+            a.state = altp2m_active(d);
+            rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+        }
+
+        rcu_unlock_domain(d);
+        break;
+    }
+
+    case HVMOP_altp2m_set_domain_state:
+    {
+        struct xen_hvm_altp2m_domain_state a;
+        struct domain *d;
+        struct vcpu *v;
+        bool_t ostate;
+
+        if ( copy_from_guest(&a, arg, 1) )
+            return -EFAULT;
+
+        d = rcu_lock_domain_by_any_id(a.domid);
+        if ( d == NULL )
+            return -ESRCH;
+
+        rc = -EINVAL;
+        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+             !nestedhvm_enabled(d) )
+        {
+            ostate = d->arch.altp2m_active;
+            d->arch.altp2m_active = !!a.state;
+
+            rc = 0;
+
+            /* If the alternate p2m state has changed, handle appropriately */
+            if ( d->arch.altp2m_active != ostate )
+            {
+                if ( ostate || !(rc = p2m_init_altp2m_by_id(d, 0)) )
+                {
+                    for_each_vcpu( d, v )
+                    {
+                        if ( !ostate )
+                            altp2m_vcpu_initialise(v);
+                        else
+                            altp2m_vcpu_destroy(v);
+                    }
+
+                    if ( ostate )
+                        p2m_flush_altp2m(d);
+                }
+            }
+        }
+
+        rcu_unlock_domain(d);
+        break;
+    }
+
+    case HVMOP_altp2m_vcpu_enable_notify:
+    {
+        struct domain *curr_d = current->domain;
+        struct vcpu *curr = current;
+        struct xen_hvm_altp2m_vcpu_enable_notify a;
+        p2m_type_t p2mt;
+
+        if ( copy_from_guest(&a, arg, 1) )
+            return -EFAULT;
+
+        if ( !is_hvm_domain(curr_d) || !hvm_altp2m_supported() ||
+             !curr_d->arch.altp2m_active ||
+             gfn_x(vcpu_altp2m(curr).veinfo_gfn) != INVALID_GFN )
+            return -EINVAL;
+
+        if ( mfn_x(get_gfn_query_unlocked(curr_d, a.gfn, &p2mt)) ==
+             INVALID_MFN )
+            return -EINVAL;
+
+        vcpu_altp2m(curr).veinfo_gfn = _gfn(a.gfn);
+        ap2m_vcpu_update_vmfunc_ve(curr);
+        rc = 0;
+
+        break;
+    }
+
+    case HVMOP_altp2m_create_p2m:
+    {
+        struct xen_hvm_altp2m_view a;
+        struct domain *d;
+
+        if ( copy_from_guest(&a, arg, 1) )
+            return -EFAULT;
+
+        d = rcu_lock_domain_by_any_id(a.domid);
+        if ( d == NULL )
+            return -ESRCH;
+
+        rc = -EINVAL;
+        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+             d->arch.altp2m_active &&
+             !(rc = p2m_init_next_altp2m(d, &a.view)) )
+            rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+
+        rcu_unlock_domain(d);
+        break;
+    }
+
+    case HVMOP_altp2m_destroy_p2m:
+    {
+        struct xen_hvm_altp2m_view a;
+        struct domain *d;
+
+        if ( copy_from_guest(&a, arg, 1) )
+            return -EFAULT;
+
+        d = rcu_lock_domain_by_any_id(a.domid);
+        if ( d == NULL )
+            return -ESRCH;
+
+        rc = -EINVAL;
+        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+             d->arch.altp2m_active )
+            rc = p2m_destroy_altp2m_by_id(d, a.view);
+
+        rcu_unlock_domain(d);
+        break;
+    }
+
+    case HVMOP_altp2m_switch_p2m:
+    {
+        struct xen_hvm_altp2m_view a;
+        struct domain *d;
+
+        if ( copy_from_guest(&a, arg, 1) )
+            return -EFAULT;
+
+        d = rcu_lock_domain_by_any_id(a.domid);
+        if ( d == NULL )
+            return -ESRCH;
+
+        rc = -EINVAL;
+        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+             d->arch.altp2m_active )
+            rc = p2m_switch_domain_altp2m_by_id(d, a.view);
+
+        rcu_unlock_domain(d);
+        break;
+    }
+
+    case HVMOP_altp2m_set_mem_access:
+    {
+        struct xen_hvm_altp2m_set_mem_access a;
+        struct domain *d;
+
+        if ( copy_from_guest(&a, arg, 1) )
+            return -EFAULT;
+
+        d = rcu_lock_domain_by_any_id(a.domid);
+        if ( d == NULL )
+            return -ESRCH;
+
+        rc = -EINVAL;
+        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+             d->arch.altp2m_active )
+            rc = p2m_set_altp2m_mem_access(d, a.view, _gfn(a.gfn), a.hvmmem_access);
+
+        rcu_unlock_domain(d);
+        break;
+    }
+
+    case HVMOP_altp2m_change_gfn:
+    {
+        struct xen_hvm_altp2m_change_gfn a;
+        struct domain *d;
+
+        if ( copy_from_guest(&a, arg, 1) )
+            return -EFAULT;
+
+        d = rcu_lock_domain_by_any_id(a.domid);
+        if ( d == NULL )
+            return -ESRCH;
+
+        rc = -EINVAL;
+        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+             d->arch.altp2m_active )
+            rc = p2m_change_altp2m_gfn(d, a.view, _gfn(a.old_gfn), _gfn(a.new_gfn));
+
+        rcu_unlock_domain(d);
+        break;
+    }
+
     default:
     {
         gdprintk(XENLOG_DEBUG, "Bad HVM op %ld.\n", op);
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index 9b84e84..3fa7b47 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -396,6 +396,75 @@ DEFINE_XEN_GUEST_HANDLE(xen_hvm_evtchn_upcall_vector_t);
 
 #endif /* defined(__i386__) || defined(__x86_64__) */
 
+/* Set/get the altp2m state for a domain */
+#define HVMOP_altp2m_set_domain_state     24
+#define HVMOP_altp2m_get_domain_state     25
+struct xen_hvm_altp2m_domain_state {
+    /* Domain to be updated or queried */
+    domid_t domid;
+    /* IN or OUT variable on/off */
+    uint8_t state;
+};
+typedef struct xen_hvm_altp2m_domain_state xen_hvm_altp2m_domain_state_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_domain_state_t);
+
+/* Set the current VCPU to receive altp2m event notifications */
+#define HVMOP_altp2m_vcpu_enable_notify   26
+struct xen_hvm_altp2m_vcpu_enable_notify {
+    /* #VE info area gfn */
+    uint64_t gfn;
+};
+typedef struct xen_hvm_altp2m_vcpu_enable_notify xen_hvm_altp2m_vcpu_enable_notify_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_vcpu_enable_notify_t);
+
+/* Create a new view */
+#define HVMOP_altp2m_create_p2m   27
+/* Destroy a view */
+#define HVMOP_altp2m_destroy_p2m  28
+/* Switch view for an entire domain */
+#define HVMOP_altp2m_switch_p2m   29
+struct xen_hvm_altp2m_view {
+    /* Domain to be updated */
+    domid_t domid;
+    /* IN/OUT variable */
+    uint16_t view;
+    /* Create view only: default access type
+     * NOTE: currently ignored */
+    uint16_t hvmmem_default_access; /* xenmem_access_t */
+};
+typedef struct xen_hvm_altp2m_view xen_hvm_altp2m_view_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_view_t);
+
+/* Notify that a page of memory is to have specific access types */
+#define HVMOP_altp2m_set_mem_access 30
+struct xen_hvm_altp2m_set_mem_access {
+    /* Domain to be updated. */
+    domid_t domid;
+    /* view */
+    uint16_t view;
+    /* Memory type */
+    uint16_t hvmmem_access; /* xenmem_access_t */
+    /* gfn */
+    uint64_t gfn;
+};
+typedef struct xen_hvm_altp2m_set_mem_access xen_hvm_altp2m_set_mem_access_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_set_mem_access_t);
+
+/* Change a p2m entry to have a different gfn->mfn mapping */
+#define HVMOP_altp2m_change_gfn 31
+struct xen_hvm_altp2m_change_gfn {
+    /* Domain to be updated. */
+    domid_t domid;
+    /* view */
+    uint16_t view;
+    /* old gfn */
+    uint64_t old_gfn;
+    /* new gfn, INVALID_GFN (~0UL) means revert */
+    uint64_t new_gfn;
+};
+typedef struct xen_hvm_altp2m_change_gfn xen_hvm_altp2m_change_gfn_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_change_gfn_t);
+
 #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
 
 /*
-- 
1.9.1


* [PATCH v3 12/13] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-01 18:09 [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (10 preceding siblings ...)
  2015-07-01 18:09 ` [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-06 10:16   ` Andrew Cooper
  2015-07-06 17:49   ` Wei Liu
  2015-07-01 18:09 ` [PATCH v3 13/13] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
                   ` (2 subsequent siblings)
  14 siblings, 2 replies; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

The altp2mhvm and nestedhvm parameters are mutually
exclusive and cannot be set together.

Signed-off-by: Ed White <edmund.h.white@intel.com>
---
 docs/man/xl.cfg.pod.5           | 12 ++++++++++++
 tools/libxl/libxl_create.c      |  1 +
 tools/libxl/libxl_dom.c         |  2 ++
 tools/libxl/libxl_types.idl     |  1 +
 tools/libxl/xl_cmdimpl.c        |  8 ++++++++
 xen/arch/x86/hvm/hvm.c          | 16 +++++++++++++++-
 xen/include/public/hvm/params.h |  5 ++++-
 7 files changed, 43 insertions(+), 2 deletions(-)
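
For illustration only (not part of this patch): enabling the capability
for an HVM guest in its xl configuration might look like the fragment
below; since the two parameters are mutually exclusive, nestedhvm must
remain disabled.

    builder   = "hvm"
    altp2mhvm = 1     # enable alternate-p2m for this guest
    nestedhvm = 0     # cannot be combined with altp2mhvm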

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index a3e0e2e..18afd46 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1035,6 +1035,18 @@ enabled by default and you should usually omit it. It may be necessary
 to disable the HPET in order to improve compatibility with guest
 Operating Systems (X86 only)
 
+=item B<altp2mhvm=BOOLEAN>
+
+Enables or disables HVM guest access to the alternate-p2m capability.
+Alternate-p2m allows a guest to manage multiple p2m guest physical
+"memory views" (as opposed to a single p2m). This option is
+disabled by default and is available only to HVM domains.
+You may want this option if you need to control or isolate access
+to specific guest physical memory pages accessed by the guest,
+e.g. for HVM domain memory introspection or for isolation and
+access control of memory between components within a single HVM
+guest domain.
+
 =item B<nestedhvm=BOOLEAN>
 
 Enable or disables guest access to hardware virtualisation features,
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 86384d2..35e322e 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -329,6 +329,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
         libxl_defbool_setdefault(&b_info->u.hvm.hpet,               true);
         libxl_defbool_setdefault(&b_info->u.hvm.vpt_align,          true);
         libxl_defbool_setdefault(&b_info->u.hvm.nested_hvm,         false);
+        libxl_defbool_setdefault(&b_info->u.hvm.altp2mhvm,          false);
         libxl_defbool_setdefault(&b_info->u.hvm.usb,                false);
         libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci,   true);
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 600393d..b75f49b 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -300,6 +300,8 @@ static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
                     libxl_defbool_val(info->u.hvm.vpt_align));
     xc_hvm_param_set(handle, domid, HVM_PARAM_NESTEDHVM,
                     libxl_defbool_val(info->u.hvm.nested_hvm));
+    xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2MHVM,
+                    libxl_defbool_val(info->u.hvm.altp2mhvm));
 }
 
 int libxl__build_pre(libxl__gc *gc, uint32_t domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 23f27d4..66a89cf 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -437,6 +437,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                        ("mmio_hole_memkb",  MemKB),
                                        ("timer_mode",       libxl_timer_mode),
                                        ("nested_hvm",       libxl_defbool),
+                                       ("altp2mhvm",        libxl_defbool),
                                        ("smbios_firmware",  string),
                                        ("acpi_firmware",    string),
                                        ("nographic",        libxl_defbool),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index c858068..ccb0de9 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1500,6 +1500,14 @@ static void parse_config_data(const char *config_source,
 
         xlu_cfg_get_defbool(config, "nestedhvm", &b_info->u.hvm.nested_hvm, 0);
 
+        xlu_cfg_get_defbool(config, "altp2mhvm", &b_info->u.hvm.altp2mhvm, 0);
+
+        if (strcmp(libxl_defbool_to_string(b_info->u.hvm.nested_hvm), "True") == 0 &&
+            strcmp(libxl_defbool_to_string(b_info->u.hvm.altp2mhvm), "True") == 0) {
+            fprintf(stderr, "ERROR: nestedhvm and altp2mhvm cannot be used together\n");
+            exit (1);
+        }
+
         xlu_cfg_replace_string(config, "smbios_firmware",
                                &b_info->u.hvm.smbios_firmware, 0);
         xlu_cfg_replace_string(config, "acpi_firmware",
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 0d81050..92c123c 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5754,6 +5754,7 @@ static int hvm_allow_set_param(struct domain *d,
     case HVM_PARAM_VIRIDIAN:
     case HVM_PARAM_IOREQ_SERVER_PFN:
     case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
+    case HVM_PARAM_ALTP2MHVM:
         if ( value != 0 && a->value != value )
             rc = -EEXIST;
         break;
@@ -5876,6 +5877,9 @@ static int hvmop_set_param(
          */
         if ( cpu_has_svm && !paging_mode_hap(d) && a.value )
             rc = -EINVAL;
+        if ( a.value &&
+             d->arch.hvm_domain.params[HVM_PARAM_ALTP2MHVM] )
+            rc = -EINVAL;
         /* Set up NHVM state for any vcpus that are already up. */
         if ( a.value &&
              !d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM] )
@@ -5886,6 +5890,13 @@ static int hvmop_set_param(
             for_each_vcpu(d, v)
                 nestedhvm_vcpu_destroy(v);
         break;
+    case HVM_PARAM_ALTP2MHVM:
+        if ( a.value > 1 )
+            rc = -EINVAL;
+        if ( a.value &&
+             d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM] )
+            rc = -EINVAL;
+        break;
     case HVM_PARAM_BUFIOREQ_EVTCHN:
         rc = -EINVAL;
         break;
@@ -5946,6 +5957,7 @@ static int hvm_allow_get_param(struct domain *d,
     case HVM_PARAM_STORE_EVTCHN:
     case HVM_PARAM_CONSOLE_PFN:
     case HVM_PARAM_CONSOLE_EVTCHN:
+    case HVM_PARAM_ALTP2MHVM:
         break;
     /*
      * The following parameters must not be read by the guest
@@ -6460,7 +6472,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             return -ESRCH;
 
         rc = -EINVAL;
-        if ( is_hvm_domain(d) && hvm_altp2m_supported() )
+        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+             d->arch.hvm_domain.params[HVM_PARAM_ALTP2MHVM] )
         {
             a.state = altp2m_active(d);
             rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
@@ -6486,6 +6499,7 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 
         rc = -EINVAL;
         if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+             d->arch.hvm_domain.params[HVM_PARAM_ALTP2MHVM] &&
              !nestedhvm_enabled(d) )
         {
             ostate = d->arch.altp2m_active;
diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
index 7c73089..1b5f840 100644
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -187,6 +187,9 @@
 /* Location of the VM Generation ID in guest physical address space. */
 #define HVM_PARAM_VM_GENERATION_ID_ADDR 34
 
-#define HVM_NR_PARAMS          35
+/* Boolean: Enable altp2m (hvm only) */
+#define HVM_PARAM_ALTP2MHVM    35
+
+#define HVM_NR_PARAMS          36
 
 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
1.9.1
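
For reference, a minimal guest configuration exercising the new option
might look like the sketch below (the name and disk path are
placeholders; setting both altp2mhvm=1 and nestedhvm=1 would be
rejected by the xl check added above):

    builder = "hvm"
    name = "altp2m-guest"                    # placeholder
    memory = 1024
    disk = [ "phy:/dev/vg/guest,xvda,w" ]    # placeholder
    altp2mhvm = 1
    nestedhvm = 0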


* [PATCH v3 13/13] x86/altp2m: XSM hooks for altp2m HVM ops
  2015-07-01 18:09 [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (11 preceding siblings ...)
  2015-07-01 18:09 ` [PATCH v3 12/13] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
@ 2015-07-01 18:09 ` Ed White
  2015-07-02 19:17   ` Daniel De Graaf
  2015-07-06  9:50 ` [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Jan Beulich
  2015-07-08 18:35 ` Sahita, Ravi
  14 siblings, 1 reply; 91+ messages in thread
From: Ed White @ 2015-07-01 18:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

From: Ravi Sahita <ravi.sahita@intel.com>

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
---
 tools/flask/policy/policy/modules/xen/xen.if |   4 +-
 xen/arch/x86/hvm/hvm.c                       | 118 ++++++++++++++++-----------
 xen/include/xsm/dummy.h                      |  12 +++
 xen/include/xsm/xsm.h                        |  12 +++
 xen/xsm/dummy.c                              |   2 +
 xen/xsm/flask/hooks.c                        |  12 +++
 xen/xsm/flask/policy/access_vectors          |   7 ++
 7 files changed, 119 insertions(+), 48 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index f4cde11..6177fe9 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -8,7 +8,7 @@
 define(`declare_domain_common', `
 	allow $1 $2:grant { query setup };
 	allow $1 $2:mmu { adjust physmap map_read map_write stat pinpage updatemp mmuext_op };
-	allow $1 $2:hvm { getparam setparam };
+	allow $1 $2:hvm { getparam setparam altp2mhvm_op };
 	allow $1 $2:domain2 get_vnumainfo;
 ')
 
@@ -58,7 +58,7 @@ define(`create_domain_common', `
 	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
 	allow $1 $2:grant setup;
 	allow $1 $2:hvm { cacheattr getparam hvmctl irqlevel pciroute sethvmc
-			setparam pcilevel trackdirtyvram nested };
+			setparam pcilevel trackdirtyvram nested altp2mhvm altp2mhvm_op };
 ')
 
 # create_domain(priv, target)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 92c123c..cc0c7b3 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5891,6 +5891,9 @@ static int hvmop_set_param(
                 nestedhvm_vcpu_destroy(v);
         break;
     case HVM_PARAM_ALTP2MHVM:
+        rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
+        if ( rc )
+            break;
         if ( a.value > 1 )
             rc = -EINVAL;
         if ( a.value &&
@@ -6471,12 +6474,15 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( d == NULL )
             return -ESRCH;
 
-        rc = -EINVAL;
-        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
-             d->arch.hvm_domain.params[HVM_PARAM_ALTP2MHVM] )
+        if ( !(rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d)) )
         {
-            a.state = altp2m_active(d);
-            rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+            rc = -EINVAL;
+            if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+                 d->arch.hvm_domain.params[HVM_PARAM_ALTP2MHVM] )
+            {
+                a.state = altp2m_active(d);
+                rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+            }
         }
 
         rcu_unlock_domain(d);
@@ -6497,31 +6503,34 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( d == NULL )
             return -ESRCH;
 
-        rc = -EINVAL;
-        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
-             d->arch.hvm_domain.params[HVM_PARAM_ALTP2MHVM] &&
-             !nestedhvm_enabled(d) )
+        if ( !(rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d)) )
         {
-            ostate = d->arch.altp2m_active;
-            d->arch.altp2m_active = !!a.state;
+            rc = -EINVAL;
+            if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+                 d->arch.hvm_domain.params[HVM_PARAM_ALTP2MHVM] &&
+                 !nestedhvm_enabled(d) )
+            {
+                ostate = d->arch.altp2m_active;
+                d->arch.altp2m_active = !!a.state;
 
-            rc = 0;
+                rc = 0;
 
-            /* If the alternate p2m state has changed, handle appropriately */
-            if ( d->arch.altp2m_active != ostate )
-            {
-                if ( ostate || !(rc = p2m_init_altp2m_by_id(d, 0)) )
+                /* If the alternate p2m state has changed, handle appropriately */
+                if ( d->arch.altp2m_active != ostate )
                 {
-                    for_each_vcpu( d, v )
+                    if ( ostate || !(rc = p2m_init_altp2m_by_id(d, 0)) )
                     {
-                        if ( !ostate )
-                            altp2m_vcpu_initialise(v);
-                        else
-                            altp2m_vcpu_destroy(v);
+                        for_each_vcpu( d, v )
+                        {
+                            if ( !ostate )
+                                altp2m_vcpu_initialise(v);
+                            else
+                                altp2m_vcpu_destroy(v);
+                        }
+
+                        if ( ostate )
+                            p2m_flush_altp2m(d);
                     }
-
-                    if ( ostate )
-                        p2m_flush_altp2m(d);
                 }
             }
         }
@@ -6540,6 +6549,9 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( copy_from_guest(&a, arg, 1) )
             return -EFAULT;
 
+        if ( (rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, curr_d)) )
+            return rc;
+
         if ( !is_hvm_domain(curr_d) || !hvm_altp2m_supported() ||
              !curr_d->arch.altp2m_active ||
              gfn_x(vcpu_altp2m(curr).veinfo_gfn) != INVALID_GFN)
@@ -6568,11 +6580,14 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( d == NULL )
             return -ESRCH;
 
-        rc = -EINVAL;
-        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
-             d->arch.altp2m_active &&
-             !(rc = p2m_init_next_altp2m(d, &a.view)) )
-            rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+        if ( !(rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d)) )
+        {
+            rc = -EINVAL;
+            if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+                 d->arch.altp2m_active &&
+                 !(rc = p2m_init_next_altp2m(d, &a.view)) )
+                rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+        }
 
         rcu_unlock_domain(d);
         break;
@@ -6590,10 +6605,13 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( d == NULL )
             return -ESRCH;
 
-        rc = -EINVAL;
-        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
-             d->arch.altp2m_active )
-            rc = p2m_destroy_altp2m_by_id(d, a.view);
+        if ( !(rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d)) )
+        {
+            rc = -EINVAL;
+            if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+                 d->arch.altp2m_active )
+                rc = p2m_destroy_altp2m_by_id(d, a.view);
+        }
 
         rcu_unlock_domain(d);
         break;
@@ -6611,10 +6629,13 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( d == NULL )
             return -ESRCH;
 
-        rc = -EINVAL;
-        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
-             d->arch.altp2m_active )
-            rc = p2m_switch_domain_altp2m_by_id(d, a.view);
+        if ( !(rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d)) )
+        {
+            rc = -EINVAL;
+            if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+                 d->arch.altp2m_active )
+                rc = p2m_switch_domain_altp2m_by_id(d, a.view);
+        }
 
         rcu_unlock_domain(d);
         break;
@@ -6631,11 +6652,13 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         d = rcu_lock_domain_by_any_id(a.domid);
         if ( d == NULL )
             return -ESRCH;
-
-        rc = -EINVAL;
-        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
-             d->arch.altp2m_active )
-            rc = p2m_set_altp2m_mem_access(d, a.view, _gfn(a.gfn), a.hvmmem_access);
+        if ( !(rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d)) )
+        {
+            rc = -EINVAL;
+            if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+                 d->arch.altp2m_active )
+                rc = p2m_set_altp2m_mem_access(d, a.view, _gfn(a.gfn), a.hvmmem_access);
+        }
 
         rcu_unlock_domain(d);
         break;
@@ -6653,10 +6676,13 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         if ( d == NULL )
             return -ESRCH;
 
-        rc = -EINVAL;
-        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
-             d->arch.altp2m_active )
-            rc = p2m_change_altp2m_gfn(d, a.view, _gfn(a.old_gfn), _gfn(a.new_gfn));
+        if ( !(rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d)) )
+        {
+            rc = -EINVAL;
+            if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
+                 d->arch.altp2m_active )
+                rc = p2m_change_altp2m_gfn(d, a.view, _gfn(a.old_gfn), _gfn(a.new_gfn));
+        }
 
         rcu_unlock_domain(d);
         break;
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index f044c0f..e0b561d 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -548,6 +548,18 @@ static XSM_INLINE int xsm_hvm_param_nested(XSM_DEFAULT_ARG struct domain *d)
     return xsm_default_action(action, current->domain, d);
 }
 
+static XSM_INLINE int xsm_hvm_param_altp2mhvm(XSM_DEFAULT_ARG struct domain *d)
+{
+    XSM_ASSERT_ACTION(XSM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
+
+static XSM_INLINE int xsm_hvm_altp2mhvm_op(XSM_DEFAULT_ARG struct domain *d)
+{
+    XSM_ASSERT_ACTION(XSM_TARGET);
+    return xsm_default_action(action, current->domain, d);
+}
+
 static XSM_INLINE int xsm_vm_event_control(XSM_DEFAULT_ARG struct domain *d, int mode, int op)
 {
     XSM_ASSERT_ACTION(XSM_PRIV);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index c872d44..dc48d23 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -147,6 +147,8 @@ struct xsm_operations {
     int (*hvm_param) (struct domain *d, unsigned long op);
     int (*hvm_control) (struct domain *d, unsigned long op);
     int (*hvm_param_nested) (struct domain *d);
+    int (*hvm_param_altp2mhvm) (struct domain *d);
+    int (*hvm_altp2mhvm_op) (struct domain *d);
     int (*get_vnumainfo) (struct domain *d);
 
     int (*vm_event_control) (struct domain *d, int mode, int op);
@@ -586,6 +588,16 @@ static inline int xsm_hvm_param_nested (xsm_default_t def, struct domain *d)
     return xsm_ops->hvm_param_nested(d);
 }
 
+static inline int xsm_hvm_param_altp2mhvm (xsm_default_t def, struct domain *d)
+{
+    return xsm_ops->hvm_param_altp2mhvm(d);
+}
+
+static inline int xsm_hvm_altp2mhvm_op (xsm_default_t def, struct domain *d)
+{
+    return xsm_ops->hvm_altp2mhvm_op(d);
+}
+
 static inline int xsm_get_vnumainfo (xsm_default_t def, struct domain *d)
 {
     return xsm_ops->get_vnumainfo(d);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index e84b0e4..3461d4f 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -116,6 +116,8 @@ void xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, hvm_param);
     set_to_dummy_if_null(ops, hvm_control);
     set_to_dummy_if_null(ops, hvm_param_nested);
+    set_to_dummy_if_null(ops, hvm_param_altp2mhvm);
+    set_to_dummy_if_null(ops, hvm_altp2mhvm_op);
 
     set_to_dummy_if_null(ops, do_xsm_op);
 #ifdef CONFIG_COMPAT
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 6e37d29..2b998c9 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1170,6 +1170,16 @@ static int flask_hvm_param_nested(struct domain *d)
     return current_has_perm(d, SECCLASS_HVM, HVM__NESTED);
 }
 
+static int flask_hvm_param_altp2mhvm(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_HVM, HVM__ALTP2MHVM);
+}
+
+static int flask_hvm_altp2mhvm_op(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_HVM, HVM__ALTP2MHVM_OP);
+}
+
 static int flask_vm_event_control(struct domain *d, int mode, int op)
 {
     return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__VM_EVENT);
@@ -1654,6 +1664,8 @@ static struct xsm_operations flask_ops = {
     .hvm_param = flask_hvm_param,
     .hvm_control = flask_hvm_param,
     .hvm_param_nested = flask_hvm_param_nested,
+    .hvm_param_altp2mhvm = flask_hvm_param_altp2mhvm,
+    .hvm_altp2mhvm_op = flask_hvm_altp2mhvm_op,
 
     .do_xsm_op = do_flask_op,
     .get_vnumainfo = flask_get_vnumainfo,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 68284d5..d168de2 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -272,6 +272,13 @@ class hvm
     share_mem
 # HVMOP_set_param setting HVM_PARAM_NESTEDHVM
     nested
+# HVMOP_set_param setting HVM_PARAM_ALTP2MHVM
+    altp2mhvm
+# HVMOP_altp2m_set_domain_state HVMOP_altp2m_get_domain_state
+# HVMOP_altp2m_vcpu_enable_notify HVMOP_altp2m_create_p2m
+# HVMOP_altp2m_destroy_p2m HVMOP_altp2m_switch_p2m
+# HVMOP_altp2m_set_mem_access HVMOP_altp2m_change_gfn
+    altp2mhvm_op
 }
 
 # Class event describes event channels.  Interdomain event channels have their
-- 
1.9.1
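
For illustration, a FLASK policy wishing to let one domain drive
another domain's altp2m could grant the new vectors with rules along
these lines (the domain types introspector_t and guest_t are
hypothetical; the syntax follows the xen.if fragments above):

    allow introspector_t guest_t:hvm { getparam altp2mhvm_op };
    allow dom0_t guest_t:hvm { setparam altp2mhvm altp2mhvm_op };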


* Re: [PATCH v3 09/13] x86/altp2m: alternate p2m memory events.
  2015-07-01 18:09 ` [PATCH v3 09/13] x86/altp2m: alternate p2m memory events Ed White
@ 2015-07-01 18:29   ` Lengyel, Tamas
  2015-07-03 16:46   ` Andrew Cooper
  2015-07-07 15:18   ` George Dunlap
  2 siblings, 0 replies; 91+ messages in thread
From: Lengyel, Tamas @ 2015-07-01 18:29 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Xen-devel, Jan Beulich, Andrew Cooper, Daniel De Graaf


> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index 58d4951..576b28d 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1514,6 +1514,13 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
>      }
>  }
>
> +void p2m_altp2m_check(struct vcpu *v, const vm_event_response_t *rsp)
> +{
> +    if ( (rsp->flags & VM_EVENT_FLAG_ALTERNATE_P2M) &&
>

Please keep the check for (rsp->flags & VM_EVENT_FLAG_ALTERNATE_P2M) in
common/vm_event.c. With that you also only have to pass the altp2m_idx here.


> +         altp2m_active(v->domain) )
> +        p2m_switch_vcpu_altp2m_by_id(v, rsp->altp2m_idx);
> +}
> diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
> index 120a78a..57095d8 100644
> --- a/xen/common/vm_event.c
> +++ b/xen/common/vm_event.c
> @@ -399,6 +399,9 @@ void vm_event_resume(struct domain *d, struct vm_event_domain *ved)
>
>          };
>
> +        /* Check for altp2m switch */
> +        p2m_altp2m_check(v, &rsp);
>

See my comment above.

Thanks,
Tamas
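
A sketch of the split being asked for, using the names from the quoted
patch (untested, shown only to make the suggestion concrete):

    /* common/vm_event.c: the flag check stays in common code ... */
    if ( rsp.flags & VM_EVENT_FLAG_ALTERNATE_P2M )
        p2m_altp2m_check(v, rsp.altp2m_idx);

    /* ... so the arch hook in p2m.c only needs the index */
    void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
    {
        if ( altp2m_active(v->domain) )
            p2m_switch_vcpu_altp2m_by_id(v, idx);
    }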


* Re: [PATCH v3 13/13] x86/altp2m: XSM hooks for altp2m HVM ops
  2015-07-01 18:09 ` [PATCH v3 13/13] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
@ 2015-07-02 19:17   ` Daniel De Graaf
  0 siblings, 0 replies; 91+ messages in thread
From: Daniel De Graaf @ 2015-07-02 19:17 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, Andrew Cooper, tlengyel

On 07/01/2015 02:09 PM, Ed White wrote:
> From: Ravi Sahita <ravi.sahita@intel.com>
>
> Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>


* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-01 18:09 ` [PATCH v3 05/13] x86/altp2m: basic data structures and support routines Ed White
@ 2015-07-03 16:22   ` Andrew Cooper
  2015-07-06  9:56     ` Jan Beulich
  2015-07-06 16:40     ` Ed White
  2015-07-07 15:04   ` George Dunlap
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 91+ messages in thread
From: Andrew Cooper @ 2015-07-03 16:22 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, tlengyel, Daniel De Graaf

On 01/07/15 19:09, Ed White wrote:
> Add the basic data structures needed to support alternate p2m's and
> the functions to initialise them and tear them down.
>
> Although Intel hardware can handle 512 EPTP's per hardware thread
> concurrently, only 10 per domain are supported in this patch for
> performance reasons.
>
> The iterator in hap_enable() does need to handle 512, so that is now
> uint16_t.
>
> This change also splits the p2m lock into one lock type for altp2m's
> and another type for all other p2m's. The purpose of this is to place
> the altp2m list lock between the types, so the list lock can be
> acquired whilst holding the host p2m lock.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>

Only some style issues.  Otherwise, Reviewed-by: Andrew Cooper
<andrew.cooper3@citrix.com>

> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 6faf3f4..f21d34d 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -6502,6 +6504,25 @@ enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v)
>      return hvm_funcs.nhvm_intr_blocked(v);
>  }
>  
> +void ap2m_vcpu_update_eptp(struct vcpu *v)
> +{
> +    if (hvm_funcs.ap2m_vcpu_update_eptp)

spaces inside brackets

> +        hvm_funcs.ap2m_vcpu_update_eptp(v);
> +}
> +
> +void ap2m_vcpu_update_vmfunc_ve(struct vcpu *v)
> +{
> +    if (hvm_funcs.ap2m_vcpu_update_vmfunc_ve)

spaces inside brackets

> +        hvm_funcs.ap2m_vcpu_update_vmfunc_ve(v);
> +}
> +
> +bool_t ap2m_vcpu_emulate_ve(struct vcpu *v)
> +{
> +    if (hvm_funcs.ap2m_vcpu_emulate_ve)

spaces inside brackets

> +        return hvm_funcs.ap2m_vcpu_emulate_ve(v);
> +    return 0;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
> index d0d3f1e..c00282c 100644
> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)
>  int hap_enable(struct domain *d, u32 mode)
>  {
>      unsigned int old_pages;
> -    uint8_t i;
> +    uint16_t i;
>      int rv = 0;
>  
>      domain_pause(d);
> @@ -498,6 +498,24 @@ int hap_enable(struct domain *d, u32 mode)
>             goto out;
>      }
>  
> +    /* Init alternate p2m data */
> +    if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
> +    {
> +        rv = -ENOMEM;
> +        goto out;
> +    }
> +
> +    for (i = 0; i < MAX_EPTP; i++)
> +        d->arch.altp2m_eptp[i] = INVALID_MFN;
> +
> +    for (i = 0; i < MAX_ALTP2M; i++) {

braces

> +        rv = p2m_alloc_table(d->arch.altp2m_p2m[i]);
> +        if ( rv != 0 )
> +           goto out;
> +    }
> +
> +    d->arch.altp2m_active = 0;
> +
>      /* Now let other users see the new mode */
>      d->arch.paging.mode = mode | PG_HAP_enable;
>  
> @@ -510,6 +528,17 @@ void hap_final_teardown(struct domain *d)
>  {
>      uint8_t i;
>  
> +    d->arch.altp2m_active = 0;
> +
> +    if ( d->arch.altp2m_eptp ) {

braces

> +        free_xenheap_page(d->arch.altp2m_eptp);
> +        d->arch.altp2m_eptp = NULL;
> +    }
> +
> +    for (i = 0; i < MAX_ALTP2M; i++) {

braces

> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index 1fd1194..58d4951 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -35,6 +35,7 @@
>  #include <asm/hvm/vmx/vmx.h> /* ept_p2m_init() */
>  #include <asm/mem_sharing.h>
>  #include <asm/hvm/nestedhvm.h>
> +#include <asm/hvm/altp2m.h>
>  #include <asm/hvm/svm/amd-iommu-proto.h>
>  #include <xsm/xsm.h>
>  
> @@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct domain *d)
>      }
>  }
>  
> +static void p2m_teardown_altp2m(struct domain *d)
> +{
> +    uint8_t i;

A plain unsigned int here would suffice.  It also looks curious as you
use uint16_t for the same iteration elsewhere.

> +    struct p2m_domain *p2m;
> +
> +    for (i = 0; i < MAX_ALTP2M; i++)

spaces inside brackets

> +    {
> +        if ( !d->arch.altp2m_p2m[i] )
> +            continue;
> +        p2m = d->arch.altp2m_p2m[i];
> +        p2m_free_one(p2m);
> +        d->arch.altp2m_p2m[i] = NULL;
> +    }
> +}
> +
> +static int p2m_init_altp2m(struct domain *d)
> +{
> +    uint8_t i;
> +    struct p2m_domain *p2m;
> +
> +    mm_lock_init(&d->arch.altp2m_lock);
> +    for (i = 0; i < MAX_ALTP2M; i++)

spaces inside brackets


* Re: [PATCH v3 06/13] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-01 18:09 ` [PATCH v3 06/13] VMX/altp2m: add code to support EPTP switching and #VE Ed White
@ 2015-07-03 16:29   ` Andrew Cooper
  2015-07-07 14:28     ` Wei Liu
  2015-07-07 19:02   ` Nakajima, Jun
  1 sibling, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2015-07-03 16:29 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, tlengyel, Daniel De Graaf

On 01/07/15 19:09, Ed White wrote:
> Implement and hook up the code to enable VMX support of VMFUNC and #VE.
>
> VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

(You are also going to need an ack/review from a VMX maintainer for the
entire series)


* Re: [PATCH v3 07/13] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-01 18:09 ` [PATCH v3 07/13] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
@ 2015-07-03 16:40   ` Andrew Cooper
  2015-07-06 19:56     ` Sahita, Ravi
  2015-07-09 14:05   ` Jan Beulich
  1 sibling, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2015-07-03 16:40 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, tlengyel, Daniel De Graaf

On 01/07/15 19:09, Ed White wrote:
> From: Ravi Sahita <ravi.sahita@intel.com>
>
> Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
> ---
>  xen/arch/x86/hvm/emulate.c             | 12 +++++++--
>  xen/arch/x86/hvm/vmx/vmx.c             | 30 +++++++++++++++++++++
>  xen/arch/x86/x86_emulate/x86_emulate.c | 48 +++++++++++++++++++++-------------
>  xen/arch/x86/x86_emulate/x86_emulate.h |  4 +++
>  xen/include/asm-x86/hvm/hvm.h          |  2 ++
>  5 files changed, 76 insertions(+), 20 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index ac9c9d6..157fe78 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1356,6 +1356,12 @@ static int hvmemul_invlpg(
>      return rc;
>  }
>  
> +static int hvmemul_vmfunc(
> +    struct x86_emulate_ctxt *ctxt)
> +{
> +    return hvm_funcs.ap2m_vcpu_emulate_vmfunc(ctxt->regs);
> +}
> +
>  static const struct x86_emulate_ops hvm_emulate_ops = {
>      .read          = hvmemul_read,
>      .insn_fetch    = hvmemul_insn_fetch,
> @@ -1379,7 +1385,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
>      .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
>      .get_fpu       = hvmemul_get_fpu,
>      .put_fpu       = hvmemul_put_fpu,
> -    .invlpg        = hvmemul_invlpg
> +    .invlpg        = hvmemul_invlpg,
> +    .vmfunc        = hvmemul_vmfunc,
>  };
>  
>  static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
> @@ -1405,7 +1412,8 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
>      .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
>      .get_fpu       = hvmemul_get_fpu,
>      .put_fpu       = hvmemul_put_fpu,
> -    .invlpg        = hvmemul_invlpg
> +    .invlpg        = hvmemul_invlpg,
> +    .vmfunc        = hvmemul_vmfunc,
>  };
>  
>  static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 9585aa3..c6feeae 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -82,6 +82,7 @@ static void vmx_fpu_dirty_intercept(void);
>  static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content);
>  static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
>  static void vmx_invlpg_intercept(unsigned long vaddr);
> +static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
>  
>  uint8_t __read_mostly posted_intr_vector;
>  
> @@ -1830,6 +1831,20 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
>      vmx_vmcs_exit(v);
>  }
>  
> +static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
> +{
> +    int rc = X86EMUL_EXCEPTION;
> +    struct vcpu *v = current;
> +
> +    if ( !cpu_has_vmx_vmfunc && altp2m_active(v->domain) &&
> +         regs->eax == 0 &&
> +         p2m_switch_vcpu_altp2m_by_id(v, (uint16_t)regs->ecx) )
> +    {
> +        rc = X86EMUL_OKAY;
> +    }

You need a #UD injection at this point.

> +    return rc;
> +}
> +
>  static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
>  {
>      bool_t rc = 0;
> @@ -1898,6 +1913,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
>      .msr_read_intercept   = vmx_msr_read_intercept,
>      .msr_write_intercept  = vmx_msr_write_intercept,
>      .invlpg_intercept     = vmx_invlpg_intercept,
> +    .vmfunc_intercept     = vmx_vmfunc_intercept,
>      .handle_cd            = vmx_handle_cd,
>      .set_info_guest       = vmx_set_info_guest,
>      .set_rdtsc_exiting    = vmx_set_rdtsc_exiting,
> @@ -1924,6 +1940,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
>      .ap2m_vcpu_update_eptp = vmx_vcpu_update_eptp,
>      .ap2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
>      .ap2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
> +    .ap2m_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
>  };
>  
>  const struct hvm_function_table * __init start_vmx(void)
> @@ -2095,6 +2112,12 @@ static void vmx_invlpg_intercept(unsigned long vaddr)
>          vpid_sync_vcpu_gva(curr, vaddr);
>  }
>  
> +static int vmx_vmfunc_intercept(struct cpu_user_regs *regs)
> +{
> +    gdprintk(XENLOG_ERR, "Failed guest VMFUNC execution\n");
> +    return X86EMUL_EXCEPTION;
> +}
> +
>  static int vmx_cr_access(unsigned long exit_qualification)
>  {
>      struct vcpu *curr = current;
> @@ -3245,6 +3268,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>              update_guest_eip();
>          break;
>  
> +    case EXIT_REASON_VMFUNC:
> +        if ( vmx_vmfunc_intercept(regs) == X86EMUL_OKAY )
> +            update_guest_eip();
> +        else
> +            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
> +        break;
> +
>      case EXIT_REASON_MWAIT_INSTRUCTION:
>      case EXIT_REASON_MONITOR_INSTRUCTION:
>      case EXIT_REASON_GETSEC:
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
> index c017c69..adf64d0 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -3815,28 +3815,40 @@ x86_emulate(
>      case 0x01: /* Grp7 */ {
>          struct segment_register reg;
>          unsigned long base, limit, cr0, cr0w;
> +        uint64_t tsc_aux;

This variable can live inside the rdtscp case, to reduce its scope.

>  
> -        if ( modrm == 0xdf ) /* invlpga */
> +        switch( modrm )
>          {
> -            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
> -            generate_exception_if(!mode_ring0(), EXC_GP, 0);
> -            fail_if(ops->invlpg == NULL);
> -            if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.eax),
> -                                   ctxt)) )
> -                goto done;
> -            break;
> -        }
> -
> -        if ( modrm == 0xf9 ) /* rdtscp */
> -        {
> -            uint64_t tsc_aux;
> -            fail_if(ops->read_msr == NULL);
> -            if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
> -                goto done;
> -            _regs.ecx = (uint32_t)tsc_aux;
> -            goto rdtsc;
> +            case 0xdf: /* invlpga AMD */
> +                generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
> +                generate_exception_if(!mode_ring0(), EXC_GP, 0);
> +                fail_if(ops->invlpg == NULL);
> +                if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.eax),
> +                                       ctxt)) )
> +                    goto done;
> +                break;
> +            case 0xf9: /* rdtscp */
> +                fail_if(ops->read_msr == NULL);
> +                if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
> +                    goto done;
> +                _regs.ecx = (uint32_t)tsc_aux;
> +                goto rdtsc;
> +            case 0xd4: /* vmfunc */
> +                generate_exception_if(
> +                    (lock_prefix |
> +                    rep_prefix() |
> +                    (vex.pfx == vex_66)),
> +                    EXC_UD, -1);

The instruction reference makes no mention of any conditions like this.

The 3 conditions for #UD are: executing in non-root mode, the "enable
VM functions" execution control being clear (which is how we would get
here in the first place), or eax being >= 64.

The first needs a has_hvm_container() check, while the second and third
can be left to ops->vmfunc() to handle.

~Andrew

> +                fail_if(ops->vmfunc == NULL);
> +                if ( (rc = ops->vmfunc(ctxt) != X86EMUL_OKAY) )
> +                    goto done;
> +                break;
> +            default:
> +                goto continue_grp7;
>          }
> +        break;
>  
> +continue_grp7:
>          switch ( modrm_reg & 7 )
>          {
>          case 0: /* sgdt */
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
> index 064b8f4..a4d4ec8 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -397,6 +397,10 @@ struct x86_emulate_ops
>          enum x86_segment seg,
>          unsigned long offset,
>          struct x86_emulate_ctxt *ctxt);
> +
> +    /* vmfunc: Emulate VMFUNC via given set of EAX ECX inputs */
> +    int (*vmfunc)(
> +        struct x86_emulate_ctxt *ctxt);
>  };
>  
>  struct cpu_user_regs;
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index 36f1b74..595b399 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -167,6 +167,7 @@ struct hvm_function_table {
>      int (*msr_read_intercept)(unsigned int msr, uint64_t *msr_content);
>      int (*msr_write_intercept)(unsigned int msr, uint64_t msr_content);
>      void (*invlpg_intercept)(unsigned long vaddr);
> +    int (*vmfunc_intercept)(struct cpu_user_regs *regs);
>      void (*handle_cd)(struct vcpu *v, unsigned long value);
>      void (*set_info_guest)(struct vcpu *v);
>      void (*set_rdtsc_exiting)(struct vcpu *v, bool_t);
> @@ -218,6 +219,7 @@ struct hvm_function_table {
>      void (*ap2m_vcpu_update_eptp)(struct vcpu *v);
>      void (*ap2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
>      bool_t (*ap2m_vcpu_emulate_ve)(struct vcpu *v);
> +    int (*ap2m_vcpu_emulate_vmfunc)(struct cpu_user_regs *regs);
>  };
>  
>  extern struct hvm_function_table hvm_funcs;
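
Taken together, these comments suggest the emulator case would end up
roughly as below. This is a sketch only: whether the non-root-mode
predicate is spelled has_hvm_container() in the emulator context is an
assumption, and note the parentheses around the assignment so that rc
receives the return value rather than the result of the comparison:

    case 0xd4: /* vmfunc */
        /* #UD if not executing in an HVM (non-root) context */
        generate_exception_if(!has_hvm_container(ctxt), EXC_UD, -1);
        fail_if(ops->vmfunc == NULL);
        /* eax >= 64 and a clear VMFUNC control are left to the hook */
        if ( (rc = ops->vmfunc(ctxt)) != X86EMUL_OKAY )
            goto done;
        break;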


* Re: [PATCH v3 08/13] x86/altp2m: add control of suppress_ve.
  2015-07-01 18:09 ` [PATCH v3 08/13] x86/altp2m: add control of suppress_ve Ed White
@ 2015-07-03 16:43   ` Andrew Cooper
  0 siblings, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2015-07-03 16:43 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, tlengyel, Daniel De Graaf

On 01/07/15 19:09, Ed White wrote:
> The existing ept_set_entry() and ept_get_entry() routines are extended
> to optionally set/get suppress_ve and renamed. New ept_set_entry() and
> ept_get_entry() routines are provided as wrappers, where set preserves
> suppress_ve for an existing entry and sets it for a new entry.
>
> Additional function pointers are added to p2m_domain to allow direct
> access to the extended routines.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v3 09/13] x86/altp2m: alternate p2m memory events.
  2015-07-01 18:09 ` [PATCH v3 09/13] x86/altp2m: alternate p2m memory events Ed White
  2015-07-01 18:29   ` Lengyel, Tamas
@ 2015-07-03 16:46   ` Andrew Cooper
  2015-07-07 15:18   ` George Dunlap
  2 siblings, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2015-07-03 16:46 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, tlengyel, Daniel De Graaf

On 01/07/15 19:09, Ed White wrote:
> Add a flag to indicate that a memory event occurred in an alternate p2m
> and a field containing the p2m index. Allow any event response to switch
> to a different alternate p2m using the same flag and field.
>
> Modify p2m_memory_access_check() to handle alternate p2m's.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> for the x86 bits.

The rest looks fine (subject to Tamas' comment), but I shall leave final
review to the new mem-event maintainers.
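
On the monitoring-agent side, switching a vCPU to another view in the
event response would then look roughly like this (a sketch based only
on the flag and field this patch adds; the surrounding vm_event ring
handling and exact struct layout are not shown or verified here):

    vm_event_response_t rsp = {
        .version    = VM_EVENT_INTERFACE_VERSION,
        .vcpu_id    = req.vcpu_id,
        .flags      = VM_EVENT_FLAG_ALTERNATE_P2M,
        .altp2m_idx = 1,    /* view to switch this vCPU to */
    };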


* Re: [PATCH v3 10/13] x86/altp2m: add remaining support routines.
  2015-07-01 18:09 ` [PATCH v3 10/13] x86/altp2m: add remaining support routines Ed White
@ 2015-07-03 16:56   ` Andrew Cooper
  2015-07-09 15:07   ` George Dunlap
  1 sibling, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2015-07-03 16:56 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, tlengyel, Daniel De Graaf

On 01/07/15 19:09, Ed White wrote:
> Add the remaining routines required to support enabling the alternate
> p2m functionality.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m
  2015-07-01 18:09 [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (12 preceding siblings ...)
  2015-07-01 18:09 ` [PATCH v3 13/13] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
@ 2015-07-06  9:50 ` Jan Beulich
  2015-07-06 11:25   ` Tim Deegan
  2015-07-08 18:35 ` Sahita, Ravi
  14 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2015-07-06  9:50 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
> Changes since v2:
> 
> Addressed all v2 feedback *except*:
> 
>     In patch 5, the per-domain EPTP list page is still allocated from the
>     Xen heap. If allocated from the domain heap Xen panics - IIRC on Haswell
>     hardware when walking the EPTP list during exit processing in patch 6.

With this little detail I can't take this as a valid reason not to make
this change. Also - weren't we aiming at getting the page from the
HAP pool of the domain anyway?

>     HVM_ops are not merged. Tamas suggested merging the memory access ops,
>     but in practice they are not as similar as they appear on the surface.
>     Razvan suggested merging the implementation code in p2m.c, but that is
>     also not as common as it appears on the surface.
>     Andrew suggested merging all altp2m ops into one with a subop code in
>     the input stucture. His point that only 255 ops can be defined is well
>     taken, but altp2m uses only 2 more ops than the recently introduced
>     ioreq ops, and <15% of the available ops have been defined. Since we
>     don't know how to implement XSM hooks and policy with the subop model,
>     we have not adopted this suggestion.

Again, not very convincing an argument, but I'll need to take another
look at the patches for a final opinion.

>     The p2m set/get interface is not modified. The altp2m code needs to
>     write suppress_ve in 2 places and read it in 1 place. The original
>     patch series managed this by coupling the state of suppress_ve to the
>     p2m memory type, which Tim disliked. In v2 of the series, special
>     set/get interfaces were added to access suppress_ve only when required.
>     Jan has suggested changing the existing interfaces, but we feel this
>     is inappropriate for this experimental patch series. Changing the
>     existing interfaces would require a design agreement to be reached
>     and would impact a large amount of existing code.

I continue to think the adjustment should be made as suggested.

Jan


* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-03 16:22   ` Andrew Cooper
@ 2015-07-06  9:56     ` Jan Beulich
  2015-07-06 16:52       ` Ed White
  2015-07-06 16:40     ` Ed White
  1 sibling, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2015-07-06  9:56 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 03.07.15 at 18:22, <andrew.cooper3@citrix.com> wrote:
> On 01/07/15 19:09, Ed White wrote:
>> Add the basic data structures needed to support alternate p2m's and
>> the functions to initialise them and tear them down.
>>
>> Although Intel hardware can handle 512 EPTP's per hardware thread
>> concurrently, only 10 per domain are supported in this patch for
>> performance reasons.
>>
>> The iterator in hap_enable() does need to handle 512, so that is now
>> uint16_t.
>>
>> This change also splits the p2m lock into one lock type for altp2m's
>> and another type for all other p2m's. The purpose of this is to place
>> the altp2m list lock between the types, so the list lock can be
>> acquired whilst holding the host p2m lock.
>>
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> Only some style issues.  Otherwise, Reviewed-by: Andrew Cooper
> <andrew.cooper3@citrix.com>

To be honest, with coding style issues having been pointed out
before, leaving them un-addressed in more than just an occasional
instance moves me towards ignoring such a submission altogether.
Please help reviewers and maintainers by addressing _all_ of them even
if only a few (or just one) got pointed out during review. This also
helps you, by avoiding another round just to address these.

Jan


* Re: [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-01 18:09 ` [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
@ 2015-07-06 10:09   ` Andrew Cooper
  2015-07-06 16:49     ` Ed White
  2015-07-07  7:33     ` Jan Beulich
  2015-07-09 14:34   ` Jan Beulich
  1 sibling, 2 replies; 91+ messages in thread
From: Andrew Cooper @ 2015-07-06 10:09 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson,
	Jan Beulich, tlengyel, Daniel De Graaf

On 01/07/15 19:09, Ed White wrote:
> Signed-off-by: Ed White <edmund.h.white@intel.com>

I am still very much unconvinced by the argument against having a single
HVMOP_altp2m and a set of subops.  do_domctl() and do_sysctl() are
examples of subop-style hypercalls with different XSM settings for
different subops.

Furthermore, factoring out a do_altp2m_op() handler would allow things
like the hvm_altp2m_supported() check to be made common.  Factoring
further to having a named common header of a subop and a domid at the
head of every subop structure would allow all the domain rcu locking to
become common outside of the subop switch.
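
As a sketch of what that factoring could look like (every name below
is hypothetical, not an existing interface):

    struct xen_hvm_altp2m_op {
        uint32_t cmd;        /* HVMOP_altp2m_* subop code */
        domid_t  domid;      /* common, so rcu locking can be hoisted */
        uint16_t pad;        /* explicit padding */
        union {
            struct xen_hvm_altp2m_set_mem_access set_mem_access;
            struct xen_hvm_altp2m_change_gfn     change_gfn;
            /* ... one member per subop ... */
        } u;
    };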

In addition,

> +    case HVMOP_altp2m_vcpu_enable_notify:
> +    {
> +        struct domain *curr_d = current->domain;
> +        struct vcpu *curr = current;
> +        struct xen_hvm_altp2m_vcpu_enable_notify a;
> +        p2m_type_t p2mt;
> +
> +        if ( copy_from_guest(&a, arg, 1) )
> +            return -EFAULT;
> +
> +        if ( !is_hvm_domain(curr_d) || !hvm_altp2m_supported() ||
> +             !curr_d->arch.altp2m_active ||
> +             gfn_x(vcpu_altp2m(curr).veinfo_gfn) != INVALID_GFN)

Brackets around the boolean operation on this line, and a space inside
the final bracket.

> +/* Notify that a page of memory is to have specific access types */
> +#define HVMOP_altp2m_set_mem_access 30
> +struct xen_hvm_altp2m_set_mem_access {
> +    /* Domain to be updated. */
> +    domid_t domid;
> +    /* view */
> +    uint16_t view;
> +    /* Memory type */
> +    uint16_t hvmmem_access; /* xenmem_access_t */

Explicit padding bytes here please.

> +    /* gfn */
> +    uint64_t gfn;
> +};
> +typedef struct xen_hvm_altp2m_set_mem_access xen_hvm_altp2m_set_mem_access_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_set_mem_access_t);
> +
> +/* Change a p2m entry to have a different gfn->mfn mapping */
> +#define HVMOP_altp2m_change_gfn 31
> +struct xen_hvm_altp2m_change_gfn {
> +    /* Domain to be updated. */
> +    domid_t domid;
> +    /* view */
> +    uint16_t view;

And here.

~Andrew

> +    /* old gfn */
> +    uint64_t old_gfn;
> +    /* new gfn, INVALID_GFN (~0UL) means revert */
> +    uint64_t new_gfn;
> +};
> +typedef struct xen_hvm_altp2m_change_gfn xen_hvm_altp2m_change_gfn_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_change_gfn_t);
> +
>  #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
>  
>  /*
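
Concretely, the explicit padding being asked for would make the layout
self-describing along these lines (a sketch; the field name is
arbitrary):

    struct xen_hvm_altp2m_set_mem_access {
        domid_t  domid;          /* Domain to be updated. */
        uint16_t view;           /* view */
        uint16_t hvmmem_access;  /* xenmem_access_t */
        uint16_t pad;            /* explicit padding; aligns gfn to 8 */
        uint64_t gfn;
    };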


* Re: [PATCH v3 12/13] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-01 18:09 ` [PATCH v3 12/13] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
@ 2015-07-06 10:16   ` Andrew Cooper
  2015-07-06 17:49   ` Wei Liu
  1 sibling, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2015-07-06 10:16 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, tlengyel, Daniel De Graaf

On 01/07/15 19:09, Ed White wrote:
> --- a/xen/include/public/hvm/params.h
> +++ b/xen/include/public/hvm/params.h
> @@ -187,6 +187,9 @@
>  /* Location of the VM Generation ID in guest physical address space. */
>  #define HVM_PARAM_VM_GENERATION_ID_ADDR 34
>  
> -#define HVM_NR_PARAMS          35
> +/* Boolean: Enable altp2m (hvm only) */

HVM_PARAMS are explicitly hvm only.  No need to say so.

Hypervisor bits Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

> +#define HVM_PARAM_ALTP2MHVM    35
> +
> +#define HVM_NR_PARAMS          36
>  
>  #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */


* Re: [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m
  2015-07-06  9:50 ` [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Jan Beulich
@ 2015-07-06 11:25   ` Tim Deegan
  2015-07-06 11:38     ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Tim Deegan @ 2015-07-06 11:25 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Ed White, xen-devel, tlengyel, Daniel De Graaf

At 10:50 +0100 on 06 Jul (1436179849), Jan Beulich wrote:
> >>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
> > Changes since v2:
> > 
> > Addressed all v2 feedback *except*:
> > 
> >     In patch 5, the per-domain EPTP list page is still allocated from the
> >     Xen heap. If allocated from the domain heap Xen panics - IIRC on Haswell
> >     hardware when walking the EPTP list during exit processing in patch 6.
> 
> With this little detail I can't take this as a valid reason not to make
> this change. Also - weren't we aiming at getting the page from the
> HAP pool of the domain anyway?

This one keeps coming up again, and I think has not been explained
very clearly.  For me, the important detail is that this is basically
an extension of the VMCx structures, and not part of any per-vcpu/per-p2m
state:

http://lists.xenproject.org/archives/html/xen-devel/2015-03/msg03272.html

ISTR map_domain_page_global() slots being about as rare as xenheap
pages, but I guess both of those have changed a fair amount.

Tim.
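
The alternative under discussion would be, in rough outline (a sketch
only; error handling is elided and the exact memflags are an
assumption):

    /* domheap page with a global mapping instead of a xenheap page */
    struct page_info *pg = alloc_domheap_page(d, MEMF_no_owner);
    uint64_t *eptp_list = __map_domain_page_global(pg);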


* Re: [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m
  2015-07-06 11:25   ` Tim Deegan
@ 2015-07-06 11:38     ` Jan Beulich
  0 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2015-07-06 11:38 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Ed White, xen-devel, tlengyel, Daniel De Graaf

>>> On 06.07.15 at 13:25, <tim@xen.org> wrote:
> At 10:50 +0100 on 06 Jul (1436179849), Jan Beulich wrote:
>> >>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>> > Changes since v2:
>> > 
>> > Addressed all v2 feedback *except*:
>> > 
>> >     In patch 5, the per-domain EPTP list page is still allocated from the
>> >     Xen heap. If allocated from the domain heap Xen panics - IIRC on Haswell
>> >     hardware when walking the EPTP list during exit processing in patch 6.
>> 
>> With this little detail I can't take this as a valid reason not to make
>> this change. Also - weren't we aiming at getting the page from the
>> HAP pool of the domain anyway?
> 
> This one keeps coming up again, and I think has not been explained
> very clearly.  For me, the important detail is that this is basically
> an extension of the VMCx structures, and not part of any per-vcpu/per-p2m
> state:
> 
> http://lists.xenproject.org/archives/html/xen-devel/2015-03/msg03272.html 

Okay, I must have forgotten about you saying so. Viewed that
way, it certainly makes sense.

> ISTR map_domain_page_global() slots being about as rare as xenheap
> pages, but I guess both of those have changed a fair amount.

Except that we have some room to grow the vmap() area, but
we can't reasonably grow the Xen heap.

Jan


* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-03 16:22   ` Andrew Cooper
  2015-07-06  9:56     ` Jan Beulich
@ 2015-07-06 16:40     ` Ed White
  2015-07-06 16:50       ` Ian Jackson
  2015-07-07  6:31       ` [PATCH v3 05/13] x86/altp2m: basic data structures and support routines Jan Beulich
  1 sibling, 2 replies; 91+ messages in thread
From: Ed White @ 2015-07-06 16:40 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, tlengyel, Daniel De Graaf

On 07/03/2015 09:22 AM, Andrew Cooper wrote:
> On 01/07/15 19:09, Ed White wrote:
>> Add the basic data structures needed to support alternate p2m's and
>> the functions to initialise them and tear them down.
>>
>> Although Intel hardware can handle 512 EPTP's per hardware thread
>> concurrently, only 10 per domain are supported in this patch for
>> performance reasons.
>>
>> The iterator in hap_enable() does need to handle 512, so that is now
>> uint16_t.
>>
>> This change also splits the p2m lock into one lock type for altp2m's
>> and another type for all other p2m's. The purpose of this is to place
>> the altp2m list lock between the types, so the list lock can be
>> acquired whilst holding the host p2m lock.
>>
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> Only some style issues.  Otherwise, Reviewed-by: Andrew Cooper
> <andrew.cooper3@citrix.com>
> 
>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>> index 6faf3f4..f21d34d 100644
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -6502,6 +6504,25 @@ enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v)
>>      return hvm_funcs.nhvm_intr_blocked(v);
>>  }
>>  
>> +void ap2m_vcpu_update_eptp(struct vcpu *v)
>> +{
>> +    if (hvm_funcs.ap2m_vcpu_update_eptp)
> 
> spaces inside brackets
> 
>> +        hvm_funcs.ap2m_vcpu_update_eptp(v);
>> +}
>> +
>> +void ap2m_vcpu_update_vmfunc_ve(struct vcpu *v)
>> +{
>> +    if (hvm_funcs.ap2m_vcpu_update_vmfunc_ve)
> 
> spaces inside brackets
> 
>> +        hvm_funcs.ap2m_vcpu_update_vmfunc_ve(v);
>> +}
>> +
>> +bool_t ap2m_vcpu_emulate_ve(struct vcpu *v)
>> +{
>> +    if (hvm_funcs.ap2m_vcpu_emulate_ve)
> 
> spaces inside brackets
> 
>> +        return hvm_funcs.ap2m_vcpu_emulate_ve(v);
>> +    return 0;
>> +}
>> +
>>  /*
>>   * Local variables:
>>   * mode: C
>> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
>> index d0d3f1e..c00282c 100644
>> --- a/xen/arch/x86/mm/hap/hap.c
>> +++ b/xen/arch/x86/mm/hap/hap.c
>> @@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)
>>  int hap_enable(struct domain *d, u32 mode)
>>  {
>>      unsigned int old_pages;
>> -    uint8_t i;
>> +    uint16_t i;
>>      int rv = 0;
>>  
>>      domain_pause(d);
>> @@ -498,6 +498,24 @@ int hap_enable(struct domain *d, u32 mode)
>>             goto out;
>>      }
>>  
>> +    /* Init alternate p2m data */
>> +    if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
>> +    {
>> +        rv = -ENOMEM;
>> +        goto out;
>> +    }
>> +
>> +    for (i = 0; i < MAX_EPTP; i++)
>> +        d->arch.altp2m_eptp[i] = INVALID_MFN;
>> +
>> +    for (i = 0; i < MAX_ALTP2M; i++) {
> 
> braces
> 
>> +        rv = p2m_alloc_table(d->arch.altp2m_p2m[i]);
>> +        if ( rv != 0 )
>> +           goto out;
>> +    }
>> +
>> +    d->arch.altp2m_active = 0;
>> +
>>      /* Now let other users see the new mode */
>>      d->arch.paging.mode = mode | PG_HAP_enable;
>>  
>> @@ -510,6 +528,17 @@ void hap_final_teardown(struct domain *d)
>>  {
>>      uint8_t i;
>>  
>> +    d->arch.altp2m_active = 0;
>> +
>> +    if ( d->arch.altp2m_eptp ) {
> 
> braces
> 
>> +        free_xenheap_page(d->arch.altp2m_eptp);
>> +        d->arch.altp2m_eptp = NULL;
>> +    }
>> +
>> +    for (i = 0; i < MAX_ALTP2M; i++) {
> 
> braces
> 
>> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
>> index 1fd1194..58d4951 100644
>> --- a/xen/arch/x86/mm/p2m.c
>> +++ b/xen/arch/x86/mm/p2m.c
>> @@ -35,6 +35,7 @@
>>  #include <asm/hvm/vmx/vmx.h> /* ept_p2m_init() */
>>  #include <asm/mem_sharing.h>
>>  #include <asm/hvm/nestedhvm.h>
>> +#include <asm/hvm/altp2m.h>
>>  #include <asm/hvm/svm/amd-iommu-proto.h>
>>  #include <xsm/xsm.h>
>>  
>> @@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct domain *d)
>>      }
>>  }
>>  
>> +static void p2m_teardown_altp2m(struct domain *d)
>> +{
>> +    uint8_t i;
> 
> A plain unsigned int here would suffice.  It also looks curious as you
> use uint16_t for the same iteration elsewhere.
> 
>> +    struct p2m_domain *p2m;
>> +
>> +    for (i = 0; i < MAX_ALTP2M; i++)
> 
> spaces inside brackets
> 
>> +    {
>> +        if ( !d->arch.altp2m_p2m[i] )
>> +            continue;
>> +        p2m = d->arch.altp2m_p2m[i];
>> +        p2m_free_one(p2m);
>> +        d->arch.altp2m_p2m[i] = NULL;
>> +    }
>> +}
>> +
>> +static int p2m_init_altp2m(struct domain *d)
>> +{
>> +    uint8_t i;
>> +    struct p2m_domain *p2m;
>> +
>> +    mm_lock_init(&d->arch.altp2m_lock);
>> +    for (i = 0; i < MAX_ALTP2M; i++)
> 
> spaces inside brackets
> 

In every case, this is because I wrote the code to conform with the style
of the surrounding code. I'll fix them all, but I think the maintainers
need to be clear about which is more important -- following the coding
style or following the style of the surrounding code.

Ed
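
For reference, the two style points being raised -- blanks inside the
outer parentheses of if()/for(), and braces on their own line, per Xen's
CODING_STYLE -- would make the quoted hunks look roughly like this (a
sketch based on the code above, not the final patch):

    void ap2m_vcpu_update_eptp(struct vcpu *v)
    {
        if ( hvm_funcs.ap2m_vcpu_update_eptp )
            hvm_funcs.ap2m_vcpu_update_eptp(v);
    }

    for ( i = 0; i < MAX_ALTP2M; i++ )
    {
        rv = p2m_alloc_table(d->arch.altp2m_p2m[i]);
        if ( rv != 0 )
            goto out;
    }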


* Re: [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-06 10:09   ` Andrew Cooper
@ 2015-07-06 16:49     ` Ed White
  2015-07-06 17:08       ` Ian Jackson
  2015-07-07  7:39       ` Jan Beulich
  2015-07-07  7:33     ` Jan Beulich
  1 sibling, 2 replies; 91+ messages in thread
From: Ed White @ 2015-07-06 16:49 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson,
	Jan Beulich, tlengyel, Daniel De Graaf

On 07/06/2015 03:09 AM, Andrew Cooper wrote:
> On 01/07/15 19:09, Ed White wrote:
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> I am still very much unconvinced by the argument against having a single
> HVMOP_altp2m and a set of subops.  do_domctl() and do_sysctl() are
> examples of a subop style hypercall with different XSM settings for
> different subops.
> 
> Furthermore, factoring out a do_altp2m_op() handler would allow things
> like the hvm_altp2m_supported() check to be made common.  Factoring
> further to having a named common header of a subop and a domid at the
> head of every subop structure would allow all the domain rcu locking to
> become common outside of the subop switch.
> 

How do we get to a binding decision on whether making this change is
a prerequisite for acceptance or not? Changing the HVMOP encoding
means fairly extensive changes to the code in hvm.c, and the XSM
patch, and the code Tamas has written. It also necessitates significant
changes to all the code we use to test the intra-domain protection
model.

Feature freeze is Friday, and that's a lot to change, test, and get
approved.

Who owns the decision?

Ed
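
For context, the shape Andrew is proposing mirrors do_domctl() and
do_sysctl(): a single hypercall whose input begins with a subop code and
a target domid, so that a common do_altp2m_op() dispatcher can perform
the hvm_altp2m_supported() check and the domain rcu locking once before
switching on the subop. A minimal sketch (all structure and field names
here are illustrative assumptions, not the interface under discussion):

    struct xen_hvm_altp2m_op {
        uint32_t cmd;      /* HVMOP_altp2m_* subop code */
        domid_t  domid;    /* common header: target domain */
        union {
            struct { uint16_t view; }             view_id;
            struct { uint64_t old_gfn, new_gfn; } change_gfn;
            /* ... one member per subop ... */
        } u;
    };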


* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-06 16:40     ` Ed White
@ 2015-07-06 16:50       ` Ian Jackson
  2015-07-07  6:48         ` Coding style (was Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.) Jan Beulich
  2015-07-07  6:31       ` [PATCH v3 05/13] x86/altp2m: basic data structures and support routines Jan Beulich
  1 sibling, 1 reply; 91+ messages in thread
From: Ian Jackson @ 2015-07-06 16:50 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan,
	xen-devel, Jan Beulich, tlengyel, Daniel De Graaf

Ed White writes ("Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines."):
...
> In every case, this is because I wrote the code to conform with the style
> of the surrounding code. I'll fix them all, but I think the maintainers
> need to be clear about which is more important -- following the coding
> style or following the style of the surrounding code.

Sadly there are indeed inconsistent style problems like this in
various bits of the codebase.  I agree with Ed that maintainers need
to be clear about what is more important.

I also think that maintainers should (a) when making style complaints,
be aware of whether the existing code style is inconsistent or wrong,
and (b) where it is, consider whether to grant submitters some leeway.

That doesn't mean that it's not appropriate to ask a submitter to
conform to a particular style; but it is important to remain
respectful.

Thanks,
Ian.


* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-06  9:56     ` Jan Beulich
@ 2015-07-06 16:52       ` Ed White
  0 siblings, 0 replies; 91+ messages in thread
From: Ed White @ 2015-07-06 16:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

On 07/06/2015 02:56 AM, Jan Beulich wrote:
>>>> On 03.07.15 at 18:22, <andrew.cooper3@citrix.com> wrote:
>> On 01/07/15 19:09, Ed White wrote:
>>> Add the basic data structures needed to support alternate p2m's and
>>> the functions to initialise them and tear them down.
>>>
>>> Although Intel hardware can handle 512 EPTP's per hardware thread
>>> concurrently, only 10 per domain are supported in this patch for
>>> performance reasons.
>>>
>>> The iterator in hap_enable() does need to handle 512, so that is now
>>> uint16_t.
>>>
>>> This change also splits the p2m lock into one lock type for altp2m's
>>> and another type for all other p2m's. The purpose of this is to place
>>> the altp2m list lock between the types, so the list lock can be
>>> acquired whilst holding the host p2m lock.
>>>
>>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>
>> Only some style issues.  Otherwise, Reviewed-by: Andrew Cooper
>> <andrew.cooper3@citrix.com>
> 
> To be honest, with coding style issues having been pointed out
> before, leaving them un-addressed in more than just an occasional
> instance moves me towards ignoring such a submission altogether.
> Please help reviewers and maintainers by addressing _all_ of them,
> even if only a few (or just one) got pointed out during review. This
> also helps you, by avoiding another round just to address these.

See my reply to Andrew. If I've written the code so that it conforms
to the style of the existing code in that area, and no-one has
specifically asked me to change it (which they had not in this case),
why would I think it needs changing?

Ed


* Re: [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-06 16:49     ` Ed White
@ 2015-07-06 17:08       ` Ian Jackson
  2015-07-06 18:27         ` Ed White
  2015-07-07  7:41         ` Jan Beulich
  2015-07-07  7:39       ` Jan Beulich
  1 sibling, 2 replies; 91+ messages in thread
From: Ian Jackson @ 2015-07-06 17:08 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan,
	xen-devel, Jan Beulich, tlengyel, Daniel De Graaf

Ed White writes ("Re: [Xen-devel] [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types."):
> On 07/06/2015 03:09 AM, Andrew Cooper wrote:
> > I am still very much unconvinced by the argument against having a single
> > HVMOP_altp2m and a set of subops.  do_domctl() and do_sysctl() are
> > examples of a subop style hypercall with different XSM settings for
> > different subops.
...
> How do we get to a binding decision on whether making this change is
> a prerequisite for acceptance or not? Changing the HVMOP encoding
> means fairly extensive changes to the code in hvm.c, and the XSM
> patch, and the code Tamas has written. It also necessitates significant
> changes to all the code we use to test the intra-domain protection
> model.

I have tried to find the discussons about this and I'm not sure I have
found them all.  I found this:

  Subject: Re: [PATCH v2 09/12] x86/altp2m: add remaining support routines.
  Date: Wed, 24 Jun 2015 11:06:45 -0700
  Message-ID: <558AF1B5.4000801@intel.com>

  On 06/24/2015 09:15 AM, Lengyel, Tamas wrote:
  > This function IMHO should be merged with p2m_set_mem_access and should be
  > triggerable with the same memop (XENMEM_access_op) hypercall instead of
  > introducing a new hvmop one.

  I think we should vote on this. My view is that it makes XENMEM_access_op
  too complicated to use. It also makes using this one specific altp2m
  capability different to using any of the others -- especially if we adopt
  Andrew's suggestion and make all the altp2m ops subops.

and the ensuing subthread, and this thread.  If there are others,
could you please refer me to them ?

If this is the same disagreement, it appears that at least Tamas
(original author) and Andrew Cooper (x86 maintainer) disagree with you.

> Feature freeze is Friday, and that's a lot to change, test, and get
> approved.
> 
> Who owns the decision?

Normally decisions are taken by the maintainers for the relevant area
of code.  See the role of maintainer, as documented here:

  http://www.xenproject.org/governance.html

  Maintainers

  Maintainers own one or several components in the Xen tree. A
  maintainer reviews and approves changes that affect their
  components. It is a maintainer's prime responsibility to review,
  comment on, co-ordinate and accept patches from other community
  member's and to maintain the design cohesion of their
  components. Maintainers are listed in a MAINTAINERS file in the root
  of the source tree.

For the x86 API that would be:

  Keir Fraser <keir@xen.org>
  Jan Beulich <jbeulich@suse.com>
  Andrew Cooper <andrew.cooper3@citrix.com>


In practice, normally a decision by one maintainer would stand unless
another maintainer disagrees.

In the usual course of events, a submitter who disagrees with a
decision of a maintainer can ask another maintainer for a second
opinion.  Usually this results in consensus.

I can see that Jan Beulich (who is the other active x86 maintainer -
Keir is no longer very active) has been CC'd on a lot of this traffic.
I don't see you having asked Jan for an opinion, although you did ask
for a vote.  It would be helpful if Jan were to explicitly state his
opinion.

Jan: what do you think ?

In principle, if the dispute is not resolved, committers could vote.
We have (as a project) not yet needed to do this about a matter of
code.  I don't think a vote to overrule the maintainers is likely
here, although the views of other contributors - especially of
committers and other maintainers will be influential with Jan and
Andrew.

I hope this is helpful.

Thanks,
Ian.


* Re: [PATCH v3 02/13] VMX: VMFUNC and #VE definitions and detection.
  2015-07-01 18:09 ` [PATCH v3 02/13] VMX: VMFUNC and #VE definitions and detection Ed White
@ 2015-07-06 17:16   ` George Dunlap
  2015-07-07 18:58   ` Nakajima, Jun
  1 sibling, 0 replies; 91+ messages in thread
From: George Dunlap @ 2015-07-06 17:16 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Tim Deegan, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Wed, Jul 1, 2015 at 7:09 PM, Ed White <edmund.h.white@intel.com> wrote:
> Currently, neither is enabled globally but may be enabled on a per-VCPU
> basis by the altp2m code.
>
> Remove the check for EPTE bit 63 == zero in ept_split_super_page(), as
> that bit is now hardware-defined.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: George Dunlap <george.dunlap@eu.citrix.com>

> ---
>  xen/arch/x86/hvm/vmx/vmcs.c        | 42 +++++++++++++++++++++++++++++++++++---
>  xen/arch/x86/mm/p2m-ept.c          |  1 -
>  xen/include/asm-x86/hvm/vmx/vmcs.h | 14 +++++++++++--
>  xen/include/asm-x86/hvm/vmx/vmx.h  | 13 +++++++++++-
>  xen/include/asm-x86/msr-index.h    |  1 +
>  5 files changed, 64 insertions(+), 7 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> index 4c5ceb5..bc1cabd 100644
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -101,6 +101,8 @@ u32 vmx_secondary_exec_control __read_mostly;
>  u32 vmx_vmexit_control __read_mostly;
>  u32 vmx_vmentry_control __read_mostly;
>  u64 vmx_ept_vpid_cap __read_mostly;
> +u64 vmx_vmfunc __read_mostly;
> +bool_t vmx_virt_exception __read_mostly;
>
>  const u32 vmx_introspection_force_enabled_msrs[] = {
>      MSR_IA32_SYSENTER_EIP,
> @@ -140,6 +142,8 @@ static void __init vmx_display_features(void)
>      P(cpu_has_vmx_virtual_intr_delivery, "Virtual Interrupt Delivery");
>      P(cpu_has_vmx_posted_intr_processing, "Posted Interrupt Processing");
>      P(cpu_has_vmx_vmcs_shadowing, "VMCS shadowing");
> +    P(cpu_has_vmx_vmfunc, "VM Functions");
> +    P(cpu_has_vmx_virt_exceptions, "Virtualisation Exceptions");
>      P(cpu_has_vmx_pml, "Page Modification Logging");
>  #undef P
>
> @@ -185,6 +189,7 @@ static int vmx_init_vmcs_config(void)
>      u64 _vmx_misc_cap = 0;
>      u32 _vmx_vmexit_control;
>      u32 _vmx_vmentry_control;
> +    u64 _vmx_vmfunc = 0;
>      bool_t mismatch = 0;
>
>      rdmsr(MSR_IA32_VMX_BASIC, vmx_basic_msr_low, vmx_basic_msr_high);
> @@ -230,7 +235,9 @@ static int vmx_init_vmcs_config(void)
>                 SECONDARY_EXEC_ENABLE_EPT |
>                 SECONDARY_EXEC_ENABLE_RDTSCP |
>                 SECONDARY_EXEC_PAUSE_LOOP_EXITING |
> -               SECONDARY_EXEC_ENABLE_INVPCID);
> +               SECONDARY_EXEC_ENABLE_INVPCID |
> +               SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
> +               SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
>          rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
>          if ( _vmx_misc_cap & VMX_MISC_VMWRITE_ALL )
>              opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
> @@ -341,6 +348,24 @@ static int vmx_init_vmcs_config(void)
>            || !(_vmx_vmexit_control & VM_EXIT_ACK_INTR_ON_EXIT) )
>          _vmx_pin_based_exec_control  &= ~ PIN_BASED_POSTED_INTERRUPT;
>
> +    /* The IA32_VMX_VMFUNC MSR exists only when VMFUNC is available */
> +    if ( _vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS )
> +    {
> +        rdmsrl(MSR_IA32_VMX_VMFUNC, _vmx_vmfunc);
> +
> +        /*
> +         * VMFUNC leaf 0 (EPTP switching) must be supported.
> +         *
> +         * Or we just don't use VMFUNC.
> +         */
> +        if ( !(_vmx_vmfunc & VMX_VMFUNC_EPTP_SWITCHING) )
> +            _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
> +    }
> +
> +    /* Virtualization exceptions are only enabled if VMFUNC is enabled */
> +    if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
> +        _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
> +
>      min = 0;
>      opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS;
>      _vmx_vmentry_control = adjust_vmx_controls(
> @@ -361,6 +386,9 @@ static int vmx_init_vmcs_config(void)
>          vmx_vmentry_control        = _vmx_vmentry_control;
>          vmx_basic_msr              = ((u64)vmx_basic_msr_high << 32) |
>                                       vmx_basic_msr_low;
> +        vmx_vmfunc                 = _vmx_vmfunc;
> +        vmx_virt_exception         = !!(_vmx_secondary_exec_control &
> +                                       SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
>          vmx_display_features();
>
>          /* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
> @@ -397,6 +425,9 @@ static int vmx_init_vmcs_config(void)
>          mismatch |= cap_check(
>              "EPT and VPID Capability",
>              vmx_ept_vpid_cap, _vmx_ept_vpid_cap);
> +        mismatch |= cap_check(
> +            "VMFUNC Capability",
> +            vmx_vmfunc, _vmx_vmfunc);
>          if ( cpu_has_vmx_ins_outs_instr_info !=
>               !!(vmx_basic_msr_high & (VMX_BASIC_INS_OUT_INFO >> 32)) )
>          {
> @@ -967,6 +998,11 @@ static int construct_vmcs(struct vcpu *v)
>      /* Do not enable Monitor Trap Flag unless start single step debug */
>      v->arch.hvm_vmx.exec_control &= ~CPU_BASED_MONITOR_TRAP_FLAG;
>
> +    /* Disable VMFUNC and #VE for now: they may be enabled later by altp2m. */
> +    v->arch.hvm_vmx.secondary_exec_control &=
> +        ~(SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
> +          SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
> +
>      if ( is_pvh_domain(d) )
>      {
>          /* Disable virtual apics, TPR */
> @@ -1790,9 +1826,9 @@ void vmcs_dump_vcpu(struct vcpu *v)
>          printk("PLE Gap=%08x Window=%08x\n",
>                 vmr32(PLE_GAP), vmr32(PLE_WINDOW));
>      if ( v->arch.hvm_vmx.secondary_exec_control &
> -         (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VMFUNC) )
> +         (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
>          printk("Virtual processor ID = 0x%04x VMfunc controls = %016lx\n",
> -               vmr16(VIRTUAL_PROCESSOR_ID), vmr(VMFUNC_CONTROL));
> +               vmr16(VIRTUAL_PROCESSOR_ID), vmr(VM_FUNCTION_CONTROL));
>
>      vmx_vmcs_exit(v);
>  }
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index 5133eb6..a6c9adf 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -281,7 +281,6 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
>          epte->sp = (level > 1);
>          epte->mfn += i * trunk;
>          epte->snp = (iommu_enabled && iommu_snoop);
> -        ASSERT(!epte->avail3);
>
>          ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
>
> diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
> index 1104bda..cb0ee6c 100644
> --- a/xen/include/asm-x86/hvm/vmx/vmcs.h
> +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
> @@ -222,9 +222,10 @@ extern u32 vmx_vmentry_control;
>  #define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY    0x00000200
>  #define SECONDARY_EXEC_PAUSE_LOOP_EXITING       0x00000400
>  #define SECONDARY_EXEC_ENABLE_INVPCID           0x00001000
> -#define SECONDARY_EXEC_ENABLE_VMFUNC            0x00002000
> +#define SECONDARY_EXEC_ENABLE_VM_FUNCTIONS      0x00002000
>  #define SECONDARY_EXEC_ENABLE_VMCS_SHADOWING    0x00004000
>  #define SECONDARY_EXEC_ENABLE_PML               0x00020000
> +#define SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS   0x00040000
>  extern u32 vmx_secondary_exec_control;
>
>  #define VMX_EPT_EXEC_ONLY_SUPPORTED             0x00000001
> @@ -285,6 +286,10 @@ extern u32 vmx_secondary_exec_control;
>      (vmx_pin_based_exec_control & PIN_BASED_POSTED_INTERRUPT)
>  #define cpu_has_vmx_vmcs_shadowing \
>      (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VMCS_SHADOWING)
> +#define cpu_has_vmx_vmfunc \
> +    (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS)
> +#define cpu_has_vmx_virt_exceptions \
> +    (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS)
>  #define cpu_has_vmx_pml \
>      (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_PML)
>
> @@ -316,6 +321,9 @@ extern u64 vmx_basic_msr;
>  #define VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK  0x0FF
>  #define VMX_GUEST_INTR_STATUS_SVI_OFFSET        8
>
> +/* VMFUNC leaf definitions */
> +#define VMX_VMFUNC_EPTP_SWITCHING   (1ULL << 0)
> +
>  /* VMCS field encodings. */
>  #define VMCS_HIGH(x) ((x) | 1)
>  enum vmcs_field {
> @@ -350,12 +358,14 @@ enum vmcs_field {
>      VIRTUAL_APIC_PAGE_ADDR          = 0x00002012,
>      APIC_ACCESS_ADDR                = 0x00002014,
>      PI_DESC_ADDR                    = 0x00002016,
> -    VMFUNC_CONTROL                  = 0x00002018,
> +    VM_FUNCTION_CONTROL             = 0x00002018,
>      EPT_POINTER                     = 0x0000201a,
>      EOI_EXIT_BITMAP0                = 0x0000201c,
>  #define EOI_EXIT_BITMAP(n) (EOI_EXIT_BITMAP0 + (n) * 2) /* n = 0...3 */
> +    EPTP_LIST_ADDR                  = 0x00002024,
>      VMREAD_BITMAP                   = 0x00002026,
>      VMWRITE_BITMAP                  = 0x00002028,
> +    VIRT_EXCEPTION_INFO             = 0x0000202a,
>      GUEST_PHYSICAL_ADDRESS          = 0x00002400,
>      VMCS_LINK_POINTER               = 0x00002800,
>      GUEST_IA32_DEBUGCTL             = 0x00002802,
> diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
> index 35f804a..5b59d3c 100644
> --- a/xen/include/asm-x86/hvm/vmx/vmx.h
> +++ b/xen/include/asm-x86/hvm/vmx/vmx.h
> @@ -47,7 +47,7 @@ typedef union {
>          access      :   4,  /* bits 61:58 - p2m_access_t */
>          tm          :   1,  /* bit 62 - VT-d transient-mapping hint in
>                                 shared EPT/VT-d usage */
> -        avail3      :   1;  /* bit 63 - Software available 3 */
> +        suppress_ve :   1;  /* bit 63 - suppress #VE */
>      };
>      u64 epte;
>  } ept_entry_t;
> @@ -186,6 +186,7 @@ static inline unsigned long pi_get_pir(struct pi_desc *pi_desc, int group)
>  #define EXIT_REASON_XSETBV              55
>  #define EXIT_REASON_APIC_WRITE          56
>  #define EXIT_REASON_INVPCID             58
> +#define EXIT_REASON_VMFUNC              59
>  #define EXIT_REASON_PML_FULL            62
>
>  /*
> @@ -554,4 +555,14 @@ void p2m_init_hap_data(struct p2m_domain *p2m);
>  #define EPT_L4_PAGETABLE_SHIFT      39
>  #define EPT_PAGETABLE_ENTRIES       512
>
> +/* #VE information page */
> +typedef struct {
> +    u32 exit_reason;
> +    u32 semaphore;
> +    u64 exit_qualification;
> +    u64 gla;
> +    u64 gpa;
> +    u16 eptp_index;
> +} ve_info_t;
> +
>  #endif /* __ASM_X86_HVM_VMX_VMX_H__ */
> diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
> index 83f2f70..8069d60 100644
> --- a/xen/include/asm-x86/msr-index.h
> +++ b/xen/include/asm-x86/msr-index.h
> @@ -130,6 +130,7 @@
>  #define MSR_IA32_VMX_TRUE_PROCBASED_CTLS        0x48e
>  #define MSR_IA32_VMX_TRUE_EXIT_CTLS             0x48f
>  #define MSR_IA32_VMX_TRUE_ENTRY_CTLS            0x490
> +#define MSR_IA32_VMX_VMFUNC                     0x491
>  #define IA32_FEATURE_CONTROL_MSR                0x3a
>  #define IA32_FEATURE_CONTROL_MSR_LOCK                     0x0001
>  #define IA32_FEATURE_CONTROL_MSR_ENABLE_VMXON_INSIDE_SMX  0x0002
> --
> 1.9.1
>
>


* Re: [PATCH v3 03/13] VMX: implement suppress #VE.
  2015-07-01 18:09 ` [PATCH v3 03/13] VMX: implement suppress #VE Ed White
@ 2015-07-06 17:26   ` George Dunlap
  2015-07-07 18:59   ` Nakajima, Jun
  2015-07-09 13:01   ` Jan Beulich
  2 siblings, 0 replies; 91+ messages in thread
From: George Dunlap @ 2015-07-06 17:26 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Tim Deegan, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Wed, Jul 1, 2015 at 7:09 PM, Ed White <edmund.h.white@intel.com> wrote:
> In preparation for selectively enabling #VE in a later patch, set
> suppress #VE on all EPTE's.
>
> Suppress #VE should always be the default condition for two reasons:
> it is generally not safe to deliver #VE into a guest unless that guest
> has been modified to receive it; and even then for most EPT violations only
> the hypervisor is able to handle the violation.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>

> ---
>  xen/arch/x86/mm/p2m-ept.c | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index a6c9adf..4111795 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -41,7 +41,8 @@
>  #define is_epte_superpage(ept_entry)    ((ept_entry)->sp)
>  static inline bool_t is_epte_valid(ept_entry_t *e)
>  {
> -    return (e->epte != 0 && e->sa_p2mt != p2m_invalid);
> +    /* suppress_ve alone is not considered valid, so mask it off */
> +    return ((e->epte & ~(1ul << 63)) != 0 && e->sa_p2mt != p2m_invalid);
>  }
>
>  /* returns : 0 for success, -errno otherwise */
> @@ -219,6 +220,8 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
>  static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
>  {
>      struct page_info *pg;
> +    ept_entry_t *table;
> +    unsigned int i;
>
>      pg = p2m_alloc_ptp(p2m, 0);
>      if ( pg == NULL )
> @@ -232,6 +235,15 @@ static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
>      /* Manually set A bit to avoid overhead of MMU having to write it later. */
>      ept_entry->a = 1;
>
> +    ept_entry->suppress_ve = 1;
> +
> +    table = __map_domain_page(pg);
> +
> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
> +        table[i].suppress_ve = 1;
> +
> +    unmap_domain_page(table);
> +
>      return 1;
>  }
>
> @@ -281,6 +293,7 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
>          epte->sp = (level > 1);
>          epte->mfn += i * trunk;
>          epte->snp = (iommu_enabled && iommu_snoop);
> +        epte->suppress_ve = 1;
>
>          ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
>
> @@ -790,6 +803,8 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>          ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
>      }
>
> +    new_entry.suppress_ve = 1;
> +
>      rc = atomic_write_ept_entry(ept_entry, new_entry, target);
>      if ( unlikely(rc) )
>          old_entry.epte = 0;
> @@ -1111,6 +1126,8 @@ static void ept_flush_pml_buffers(struct p2m_domain *p2m)
>  int ept_p2m_init(struct p2m_domain *p2m)
>  {
>      struct ept_data *ept = &p2m->ept;
> +    ept_entry_t *table;
> +    unsigned int i;
>
>      p2m->set_entry = ept_set_entry;
>      p2m->get_entry = ept_get_entry;
> @@ -1134,6 +1151,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
>          p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
>      }
>
> +    table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
> +
> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
> +        table[i].suppress_ve = 1;
> +
> +    unmap_domain_page(table);
> +
>      if ( !zalloc_cpumask_var(&ept->synced_mask) )
>          return -ENOMEM;
>
> --
> 1.9.1
>
>


* Re: [PATCH v3 12/13] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-01 18:09 ` [PATCH v3 12/13] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
  2015-07-06 10:16   ` Andrew Cooper
@ 2015-07-06 17:49   ` Wei Liu
  2015-07-06 18:01     ` Ed White
  1 sibling, 1 reply; 91+ messages in thread
From: Wei Liu @ 2015-07-06 17:49 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	xen-devel, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Wed, Jul 01, 2015 at 11:09:36AM -0700, Ed White wrote:
> The altp2mhvm and nestedhvm parameters are mutually
> exclusive and cannot be set together.
> 
> Signed-off-by: Ed White <edmund.h.white@intel.com>
> ---
>  docs/man/xl.cfg.pod.5           | 12 ++++++++++++
>  tools/libxl/libxl_create.c      |  1 +
>  tools/libxl/libxl_dom.c         |  2 ++
>  tools/libxl/libxl_types.idl     |  1 +
>  tools/libxl/xl_cmdimpl.c        |  8 ++++++++
>  xen/arch/x86/hvm/hvm.c          | 16 +++++++++++++++-
>  xen/include/public/hvm/params.h |  5 ++++-
>  7 files changed, 43 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> index a3e0e2e..18afd46 100644
> --- a/docs/man/xl.cfg.pod.5
> +++ b/docs/man/xl.cfg.pod.5
> @@ -1035,6 +1035,18 @@ enabled by default and you should usually omit it. It may be necessary
>  to disable the HPET in order to improve compatibility with guest
>  Operating Systems (X86 only)
>  
> +=item B<altp2mhvm=BOOLEAN>
> +
> +Enables or disables hvm guest access to alternate-p2m capability.
> +Alternate-p2m allows a guest to manage multiple p2m guest physical
> +"memory views" (as opposed to a single p2m). This option is
> +disabled by default and is available only to hvm domains.
> +You may want this option if you want to access-control/isolate
> +access to specific guest physical memory pages accessed by
> +the guest, e.g. for HVM domain memory introspection or
> +for isolation/access-control of memory between components within
> +a single guest hvm domain.
> +
>  =item B<nestedhvm=BOOLEAN>
>  
>  Enable or disables guest access to hardware virtualisation features,
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 86384d2..35e322e 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -329,6 +329,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
>          libxl_defbool_setdefault(&b_info->u.hvm.hpet,               true);
>          libxl_defbool_setdefault(&b_info->u.hvm.vpt_align,          true);
>          libxl_defbool_setdefault(&b_info->u.hvm.nested_hvm,         false);
> +        libxl_defbool_setdefault(&b_info->u.hvm.altp2mhvm,          false);
>          libxl_defbool_setdefault(&b_info->u.hvm.usb,                false);
>          libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci,   true);
>  
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index 600393d..b75f49b 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -300,6 +300,8 @@ static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
>                      libxl_defbool_val(info->u.hvm.vpt_align));
>      xc_hvm_param_set(handle, domid, HVM_PARAM_NESTEDHVM,
>                      libxl_defbool_val(info->u.hvm.nested_hvm));
> +    xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2MHVM,
> +                    libxl_defbool_val(info->u.hvm.altp2mhvm));
>  }
>  
>  int libxl__build_pre(libxl__gc *gc, uint32_t domid,
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 23f27d4..66a89cf 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -437,6 +437,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>                                         ("mmio_hole_memkb",  MemKB),
>                                         ("timer_mode",       libxl_timer_mode),
>                                         ("nested_hvm",       libxl_defbool),
> +                                       ("altp2mhvm",        libxl_defbool),

It's redundant to have "hvm" in the name of this field. Calling it
"altp2m" would be fine IMHO.

>                                         ("smbios_firmware",  string),
>                                         ("acpi_firmware",    string),
>                                         ("nographic",        libxl_defbool),
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index c858068..ccb0de9 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -1500,6 +1500,14 @@ static void parse_config_data(const char *config_source,
>  
>          xlu_cfg_get_defbool(config, "nestedhvm", &b_info->u.hvm.nested_hvm, 0);
>  
> +        xlu_cfg_get_defbool(config, "altp2mhvm", &b_info->u.hvm.altp2mhvm, 0);
> +
> +        if (strcmp(libxl_defbool_to_string(b_info->u.hvm.nested_hvm), "True") == 0 &&
> +            strcmp(libxl_defbool_to_string(b_info->u.hvm.altp2mhvm), "True") == 0) {
> +            fprintf(stderr, "ERROR: nestedhvm and altp2mhvm cannot be used together\n");

You can use libxl_defbool_val. Don't use strcmp.
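
A sketch of the suggested check (libxl_defbool_val() must not be called
on a defbool still in its default state, hence the is_default guards;
whether they are needed at this point in parsing is an assumption):

    if (!libxl_defbool_is_default(b_info->u.hvm.nested_hvm) &&
        libxl_defbool_val(b_info->u.hvm.nested_hvm) &&
        !libxl_defbool_is_default(b_info->u.hvm.altp2mhvm) &&
        libxl_defbool_val(b_info->u.hvm.altp2mhvm)) {
        fprintf(stderr, "ERROR: nestedhvm and altp2mhvm cannot be used together\n");
        exit(1);
    }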

> +            exit (1);

Coding style.

You also need to #define LIBXL_HAVE_XXX in libxl.h. See that file for
examples.

Wei.


* Re: [PATCH v3 12/13] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-06 17:49   ` Wei Liu
@ 2015-07-06 18:01     ` Ed White
  2015-07-06 18:18       ` Wei Liu
  0 siblings, 1 reply; 91+ messages in thread
From: Ed White @ 2015-07-06 18:01 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ravi Sahita, George Dunlap, Ian Jackson, Tim Deegan, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On 07/06/2015 10:49 AM, Wei Liu wrote:
>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>> index 23f27d4..66a89cf 100644
>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -437,6 +437,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>>                                         ("mmio_hole_memkb",  MemKB),
>>                                         ("timer_mode",       libxl_timer_mode),
>>                                         ("nested_hvm",       libxl_defbool),
>> +                                       ("altp2mhvm",        libxl_defbool),
> 
> It's redundant to have "hvm" in the name of this field. Calling it
> "altp2m" would be fine IMHO.
> 

When I originally started writing this code, I modelled the naming
and some of the structure on nestedhvm, which is why so many things
had hvm in the name. I've now removed a lot of those, but in this
instance I wonder if doing so would cause confusion, since we have
a Xen command-line parameter called altp2m.

Ed


* Re: [PATCH v3 12/13] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-06 18:01     ` Ed White
@ 2015-07-06 18:18       ` Wei Liu
  2015-07-06 22:59         ` Ed White
  0 siblings, 1 reply; 91+ messages in thread
From: Wei Liu @ 2015-07-06 18:18 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson,
	xen-devel, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Mon, Jul 06, 2015 at 11:01:27AM -0700, Ed White wrote:
> On 07/06/2015 10:49 AM, Wei Liu wrote:
> >> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> >> index 23f27d4..66a89cf 100644
> >> --- a/tools/libxl/libxl_types.idl
> >> +++ b/tools/libxl/libxl_types.idl
> >> @@ -437,6 +437,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
> >>                                         ("mmio_hole_memkb",  MemKB),
> >>                                         ("timer_mode",       libxl_timer_mode),
> >>                                         ("nested_hvm",       libxl_defbool),
> >> +                                       ("altp2mhvm",        libxl_defbool),
> > 
> > It's redundant to have "hvm" in the name of this field. Calling it
> > "altp2m" would be fine IMHO.
> > 
> 
> When I originally started writing this code, I modelled the naming
> and some of the structure on nestedhvm, which is why so many things
> had hvm in the name. I've now removed a lot of those, but in this
> instance I wonder if doing so would cause confusion, since we have
> a Xen command-line parameter called altp2m.
> 

I don't think it will cause confusion. This is part of the guest
configuration, which has nothing to do with Xen and is in theory a
separate name space.

I asked you to remove hvm because it's redundant -- that field is a
sub-field of u.hvm already.

Wei.

> Ed


* Re: [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-06 17:08       ` Ian Jackson
@ 2015-07-06 18:27         ` Ed White
  2015-07-06 23:40           ` Lengyel, Tamas
  2015-07-07  7:41         ` Jan Beulich
  1 sibling, 1 reply; 91+ messages in thread
From: Ed White @ 2015-07-06 18:27 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan,
	xen-devel, Jan Beulich, tlengyel, Daniel De Graaf

On 07/06/2015 10:08 AM, Ian Jackson wrote:
> Ed White writes ("Re: [Xen-devel] [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types."):
>> On 07/06/2015 03:09 AM, Andrew Cooper wrote:
>>> I am still very much unconvinced by the argument against having a single
>>> HVMOP_altp2m and a set of subops.  do_domctl() and do_sysctl() are
>>> examples of a subop style hypercall with different XSM settings for
>>> different subops.
> ...
>> How do we get to a binding decision on whether making this change is
>> a prerequisite for acceptance or not? Changing the HVMOP encoding
>> means fairly extensive changes to the code in hvm.c, and the XSM
>> patch, and the code Tamas has written. It also necessitates significant
>> changes to all the code we use to test the intra-domain protection
>> model.
> 
> I have tried to find the discussons about this and I'm not sure I have
> found them all.  I found this:
> 
>   Subject: Re: [PATCH v2 09/12] x86/altp2m: add remaining support routines.
>   Date: Wed, 24 Jun 2015 11:06:45 -0700
>   Message-ID: <558AF1B5.4000801@intel.com>
> 
>   On 06/24/2015 09:15 AM, Lengyel, Tamas wrote:
>   > This function IMHO should be merged with p2m_set_mem_access and should be
>   > triggerable with the same memop (XENMEM_access_op) hypercall instead of
>   > introducing a new hvmop one.
> 
>   I think we should vote on this. My view is that it makes XENMEM_access_op
>   too complicated to use. It also makes using this one specific altp2m
>   capability different to using any of the others -- especially if we adopt
>   Andrew's suggestion and make all the altp2m ops subops.
> 
> and the ensuing subthread, and this thread.  If there are others,
> could you please refer me to them ?
>

I believe, unless Tamas says otherwise, that we agreed the
HVMOP's in question and their implementations are sufficiently
different that we should not merge them.

The decision I'm looking for is on the suggestion Andrew made in
http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg03820.html.

That suggestion had not been made prior to that point, even though the
HVMOP's have not changed since the original patch series submitted in
January, but it now appears that it may be a requirement, not a
suggestion.

Our focus has very clearly been on inclusion in Xen 4.6, and changing
the HVMOP's in this way, with the attendant other changes required, puts
us at a substantial risk of not being feature-complete by Friday, which
is why I want to clarify it.

To be clear: this is not like the p2m set/get issue, where we have a
disagreement on design principles; it's just a large amount of work
being suggested late in the development cycle, and no-one has said
definitively whether or not we *have* to do it.

Ed
 
> If this is the same disagreement, it appears that at least Tamas
> (original author), Andrew Cooper (x86 maintainer) disagree with you.
> 
>> Feature freeze is Friday, and that's a lot to change, test, and get
>> approved.
>>
>> Who owns the decision?
> 
> Normally decisions are taken by the maintainers for the relevant area
> of code.  See the role of maintainer, as documented here:
> 
>   http://www.xenproject.org/governance.html
> 
>   Maintainers
> 
>   Maintainers own one or several components in the Xen tree. A
>   maintainer reviews and approves changes that affect their
>   components. It is a maintainer's prime responsibility to review,
>   comment on, co-ordinate and accept patches from other community
>   member's and to maintain the design cohesion of their
>   components. Maintainers are listed in a MAINTAINERS file in the root
>   of the source tree.
> 
> For the x86 API that would be:
> 
>   Keir Fraser <keir@xen.org>
>   Jan Beulich <jbeulich@suse.com>
>   Andrew Cooper <andrew.cooper3@citrix.com>
> 
> 
> In practice, normally a decision by one maintainer would stand unless
> another maintainer disagrees.
> 
> In the usual course of events, a submitter who disagrees with a
> decision of a maintainer can ask another maintainer for a second
> opinion.  Usually this results in consensus.
> 
> I can see that Jan Beulich (who is the other active x86 maintainer -
> Keir is no longer very active) has been CC'd on a lot of this traffic.
> I don't see you having asked Jan for an opinion, although you did ask
> for a vote.  It would be helpful of Jan were to explicitly state his
> opinion.
> 
> Jan: what do you think ?
> 
> In principle, if the dispute is not resolved, committers could vote.
> We have (as a project) not yet needed to do this about a matter of
> code.  I don't think a vote to overrule the maintainers is likely
> here, although the views of other contributors - especially of
> committers and other maintainers will be influential with Jan and
> Andrew.
> 
> I hope this is helpful.
> 
> Thank,
> Ian.
> 


* Re: [PATCH v3 07/13] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-03 16:40   ` Andrew Cooper
@ 2015-07-06 19:56     ` Sahita, Ravi
  2015-07-07  7:31       ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-06 19:56 UTC (permalink / raw)
  To: Andrew Cooper, White, Edmund H, xen-devel
  Cc: Sahita, Ravi, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Jan Beulich, tlengyel, Daniel De Graaf



-----Original Message-----
From: Andrew Cooper [mailto:andrew.cooper3@citrix.com] 
Sent: Friday, July 03, 2015 9:40 AM
To: White, Edmund H; xen-devel@lists.xen.org
Cc: Ian Jackson; Jan Beulich; Tim Deegan; Daniel De Graaf; Sahita, Ravi; Wei Liu; tlengyel@novetta.com; George Dunlap
Subject: Re: [PATCH v3 07/13] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.

On 01/07/15 19:09, Ed White wrote:
> From: Ravi Sahita <ravi.sahita@intel.com>
>
> Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>
> ---
>  xen/arch/x86/hvm/emulate.c             | 12 +++++++--
>  xen/arch/x86/hvm/vmx/vmx.c             | 30 +++++++++++++++++++++
>  xen/arch/x86/x86_emulate/x86_emulate.c | 48 +++++++++++++++++++++-------------
>  xen/arch/x86/x86_emulate/x86_emulate.h |  4 +++
>  xen/include/asm-x86/hvm/hvm.h          |  2 ++
>  5 files changed, 76 insertions(+), 20 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c 
> index ac9c9d6..157fe78 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1356,6 +1356,12 @@ static int hvmemul_invlpg(
>      return rc;
>  }
>  
> +static int hvmemul_vmfunc(
> +    struct x86_emulate_ctxt *ctxt)
> +{
> +    return hvm_funcs.ap2m_vcpu_emulate_vmfunc(ctxt->regs);
> +}
> +
>  static const struct x86_emulate_ops hvm_emulate_ops = {
>      .read          = hvmemul_read,
>      .insn_fetch    = hvmemul_insn_fetch,
> @@ -1379,7 +1385,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
>      .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
>      .get_fpu       = hvmemul_get_fpu,
>      .put_fpu       = hvmemul_put_fpu,
> -    .invlpg        = hvmemul_invlpg
> +    .invlpg        = hvmemul_invlpg,
> +    .vmfunc        = hvmemul_vmfunc,
>  };
>  
>  static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
> @@ -1405,7 +1412,8 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
>      .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
>      .get_fpu       = hvmemul_get_fpu,
>      .put_fpu       = hvmemul_put_fpu,
> -    .invlpg        = hvmemul_invlpg
> +    .invlpg        = hvmemul_invlpg,
> +    .vmfunc        = hvmemul_vmfunc,
>  };
>  
>  static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt, 
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c 
> index 9585aa3..c6feeae 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -82,6 +82,7 @@ static void vmx_fpu_dirty_intercept(void);
>  static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content);
>  static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
>  static void vmx_invlpg_intercept(unsigned long vaddr);
> +static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
>  
>  uint8_t __read_mostly posted_intr_vector;
>  
> @@ -1830,6 +1831,20 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
>      vmx_vmcs_exit(v);
>  }
>  
> +static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs) {
> +    int rc = X86EMUL_EXCEPTION;
> +    struct vcpu *v = current;
> +
> +    if ( !cpu_has_vmx_vmfunc && altp2m_active(v->domain) &&
> +         regs->eax == 0 &&
> +         p2m_switch_vcpu_altp2m_by_id(v, (uint16_t)regs->ecx) )
> +    {
> +        rc = X86EMUL_OKAY;
> +    }

You need a #UD injection at this point.

Ravi> I will keep this function unchanged, i.e. it returns X86EMUL_EXCEPTION on error, which will cause the initiating hvmemul_vmfunc to stage a #UD (this staging was in fact missing, and is now fixed - thanks).
Ravi> The #UD is actually injected by the top-level routine vmx_vmexit_ud_intercept.

> +    return rc;
> +}
> +
>  static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)  {
>      bool_t rc = 0;
> @@ -1898,6 +1913,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
>      .msr_read_intercept   = vmx_msr_read_intercept,
>      .msr_write_intercept  = vmx_msr_write_intercept,
>      .invlpg_intercept     = vmx_invlpg_intercept,
> +    .vmfunc_intercept     = vmx_vmfunc_intercept,
>      .handle_cd            = vmx_handle_cd,
>      .set_info_guest       = vmx_set_info_guest,
>      .set_rdtsc_exiting    = vmx_set_rdtsc_exiting,
> @@ -1924,6 +1940,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
>      .ap2m_vcpu_update_eptp = vmx_vcpu_update_eptp,
>      .ap2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
>      .ap2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
> +    .ap2m_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
>  };
>  
>  const struct hvm_function_table * __init start_vmx(void)
> @@ -2095,6 +2112,12 @@ static void vmx_invlpg_intercept(unsigned long vaddr)
>          vpid_sync_vcpu_gva(curr, vaddr);
>  }
>  
> +static int vmx_vmfunc_intercept(struct cpu_user_regs *regs) {
> +    gdprintk(XENLOG_ERR, "Failed guest VMFUNC execution\n");
> +    return X86EMUL_EXCEPTION;
> +}
> +
>  static int vmx_cr_access(unsigned long exit_qualification)  {
>      struct vcpu *curr = current;
> @@ -3245,6 +3268,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>              update_guest_eip();
>          break;
>  
> +    case EXIT_REASON_VMFUNC:
> +        if ( vmx_vmfunc_intercept(regs) == X86EMUL_OKAY )
> +            update_guest_eip();
> +        else
> +            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
> +        break;
> +
>      case EXIT_REASON_MWAIT_INSTRUCTION:
>      case EXIT_REASON_MONITOR_INSTRUCTION:
>      case EXIT_REASON_GETSEC:
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c 
> b/xen/arch/x86/x86_emulate/x86_emulate.c
> index c017c69..adf64d0 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -3815,28 +3815,40 @@ x86_emulate(
>      case 0x01: /* Grp7 */ {
>          struct segment_register reg;
>          unsigned long base, limit, cr0, cr0w;
> +        uint64_t tsc_aux;

This variable can live inside the rdtscp case, to reduce its scope.

Ravi> Will address.

>  
> -        if ( modrm == 0xdf ) /* invlpga */
> +        switch( modrm )
>          {
> -            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
> -            generate_exception_if(!mode_ring0(), EXC_GP, 0);
> -            fail_if(ops->invlpg == NULL);
> -            if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.eax),
> -                                   ctxt)) )
> -                goto done;
> -            break;
> -        }
> -
> -        if ( modrm == 0xf9 ) /* rdtscp */
> -        {
> -            uint64_t tsc_aux;
> -            fail_if(ops->read_msr == NULL);
> -            if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
> -                goto done;
> -            _regs.ecx = (uint32_t)tsc_aux;
> -            goto rdtsc;
> +            case 0xdf: /* invlpga AMD */
> +                generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
> +                generate_exception_if(!mode_ring0(), EXC_GP, 0);
> +                fail_if(ops->invlpg == NULL);
> +                if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.eax),
> +                                       ctxt)) )
> +                    goto done;
> +                break;
> +            case 0xf9: /* rdtscp */
> +                fail_if(ops->read_msr == NULL);
> +                if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
> +                    goto done;
> +                _regs.ecx = (uint32_t)tsc_aux;
> +                goto rdtsc;
> +            case 0xd4: /* vmfunc */
> +                generate_exception_if(
> +                    (lock_prefix |
> +                    rep_prefix() |
> +                    (vex.pfx == vex_66)),
> +                    EXC_UD, -1);

The instruction reference makes no mention of any conditions like this.

Ravi> Yes, I will note that to be fixed - for now the best documentation I can point to is for an instruction in the same encoding group (see XSETBV or XTEST), which specifies #UD when the LOCK, 66H, F3H or F2H prefixes are used.

The 3 conditions for #UD are being executed in non-root mode, the enable VM functions execution control is clear (which is how we would get here in the first place), or if eax is >= 64.
The first needs an has_hvm_container() check, while the second and third can be left to ops->vmfunc() to handle.

Ravi> Right, the required exec controls and register parameter checks are already done by ops->vmfunc(), and regarding the has_hvm_container check - I don't think that's needed, because ops->vmfunc checks for altp2m being enabled, and altp2m can be enabled for hvm domains only.

Thanks,
Ravi
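
Putting the two replies together, the emulate-side wrapper would stage
the #UD roughly as follows (a sketch only; the helper name and
error-code constant follow the surrounding emulate.c conventions and are
assumptions here, and the actual input validation lives in
ops->vmfunc()):

    static int hvmemul_vmfunc(
        struct x86_emulate_ctxt *ctxt)
    {
        int rc;

        /* The VMX backend validates exec controls and the eax/ecx inputs. */
        rc = hvm_funcs.ap2m_vcpu_emulate_vmfunc(ctxt->regs);
        if ( rc != X86EMUL_OKAY )
            /* Stage a #UD; the emulator core delivers it on unwind. */
            hvmemul_inject_hw_exception(TRAP_invalid_op,
                                        HVM_DELIVER_NO_ERROR_CODE, ctxt);

        return rc;
    }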

~Andrew

> +                fail_if(ops->vmfunc == NULL);
> +                if ( (rc = ops->vmfunc(ctxt) != X86EMUL_OKAY) )
> +                    goto done;
> +                break;
> +            default:
> +                goto continue_grp7;
>          }
> +        break;
>  
> +continue_grp7:
>          switch ( modrm_reg & 7 )
>          {
>          case 0: /* sgdt */
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
> index 064b8f4..a4d4ec8 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -397,6 +397,10 @@ struct x86_emulate_ops
>          enum x86_segment seg,
>          unsigned long offset,
>          struct x86_emulate_ctxt *ctxt);
> +
> +    /* vmfunc: Emulate VMFUNC via given set of EAX ECX inputs */
> +    int (*vmfunc)(
> +        struct x86_emulate_ctxt *ctxt);
>  };
>  
>  struct cpu_user_regs;
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index 36f1b74..595b399 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -167,6 +167,7 @@ struct hvm_function_table {
>      int (*msr_read_intercept)(unsigned int msr, uint64_t *msr_content);
>      int (*msr_write_intercept)(unsigned int msr, uint64_t msr_content);
>      void (*invlpg_intercept)(unsigned long vaddr);
> +    int (*vmfunc_intercept)(struct cpu_user_regs *regs);
>      void (*handle_cd)(struct vcpu *v, unsigned long value);
>      void (*set_info_guest)(struct vcpu *v);
>      void (*set_rdtsc_exiting)(struct vcpu *v, bool_t);
> @@ -218,6 +219,7 @@ struct hvm_function_table {
>      void (*ap2m_vcpu_update_eptp)(struct vcpu *v);
>      void (*ap2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
>      bool_t (*ap2m_vcpu_emulate_ve)(struct vcpu *v);
> +    int (*ap2m_vcpu_emulate_vmfunc)(struct cpu_user_regs *regs);
>  };
>  
>  extern struct hvm_function_table hvm_funcs;


* Re: [PATCH v3 12/13] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-06 18:18       ` Wei Liu
@ 2015-07-06 22:59         ` Ed White
  0 siblings, 0 replies; 91+ messages in thread
From: Ed White @ 2015-07-06 22:59 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ravi Sahita, George Dunlap, Ian Jackson, Tim Deegan, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On 07/06/2015 11:18 AM, Wei Liu wrote:
> On Mon, Jul 06, 2015 at 11:01:27AM -0700, Ed White wrote:
>> On 07/06/2015 10:49 AM, Wei Liu wrote:
>>>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>>>> index 23f27d4..66a89cf 100644
>>>> --- a/tools/libxl/libxl_types.idl
>>>> +++ b/tools/libxl/libxl_types.idl
>>>> @@ -437,6 +437,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>>>>                                         ("mmio_hole_memkb",  MemKB),
>>>>                                         ("timer_mode",       libxl_timer_mode),
>>>>                                         ("nested_hvm",       libxl_defbool),
>>>> +                                       ("altp2mhvm",        libxl_defbool),
>>>
>>> It's redundant to have "hvm" in the name of this field. Calling it
>>> "altp2m" would be fine IMHO.
>>>
>>
>> When I originally started writing this code, I modelled the naming
>> and some of the structure on nestedhvm, which is why so many things
>> had hvm in the name. I've now removed a lot of those, but in this
>> instance I wonder if doing so would cause confusion, since we have
>> a Xen command-line parameter called altp2m.
>>
> 
> I don't think it will cause confusion. This is part of the guest
> configuration, which has nothing to do with Xen and is in theory a
> separate name space.
> 
> I asked you to remove hvm because it's redundant -- that field is a
> sub-field of u.hvm already.
> 

My mistake, I hadn't appreciated that you were only suggesting
renaming the field, not renaming the string in the .cfg.

I've addressed your other feedback for the next version of the
series. I added LIBXL_HAVE_ALTP2M, even though libxl only
supports interpreting the domain parameter in our patches.
Tamas has more libxl support in his patches, which have not
yet been submitted.

Ed


* Re: [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-06 18:27         ` Ed White
@ 2015-07-06 23:40           ` Lengyel, Tamas
  2015-07-07  7:46             ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Lengyel, Tamas @ 2015-07-06 23:40 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Xen-devel, Jan Beulich, Daniel De Graaf



> I believe, unless Tamas says otherwise, that we agreed the
> HVMOP's in question and their implementations are sufficiently
> different that we should not merge them.


I'm still not entirely convinced that this is the case, but considering
altp2m will be an experimental feature, I'm not going to hold this against
it being merged for 4.6. Hopefully things will be ironed out more as we go
forward and start using it for real.

Tamas



* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-06 16:40     ` Ed White
  2015-07-06 16:50       ` Ian Jackson
@ 2015-07-07  6:31       ` Jan Beulich
  1 sibling, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2015-07-07  6:31 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 06.07.15 at 18:40, <edmund.h.white@intel.com> wrote:
> On 07/03/2015 09:22 AM, Andrew Cooper wrote:
>> On 01/07/15 19:09, Ed White wrote:
>>> Add the basic data structures needed to support alternate p2m's and
>>> the functions to initialise them and tear them down.
>>>
>>> Although Intel hardware can handle 512 EPTP's per hardware thread
>>> concurrently, only 10 per domain are supported in this patch for
>>> performance reasons.
>>>
>>> The iterator in hap_enable() does need to handle 512, so that is now
>>> uint16_t.
>>>
>>> This change also splits the p2m lock into one lock type for altp2m's
>>> and another type for all other p2m's. The purpose of this is to place
>>> the altp2m list lock between the types, so the list lock can be
>>> acquired whilst holding the host p2m lock.
>>>
>>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>> 
>> Only some style issues.  Otherwise, Reviewed-by: Andrew Cooper
>> <andrew.cooper3@citrix.com>
>> 
>>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>>> index 6faf3f4..f21d34d 100644
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -6502,6 +6504,25 @@ enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v)
>>>      return hvm_funcs.nhvm_intr_blocked(v);
>>>  }
>>>  
>>> +void ap2m_vcpu_update_eptp(struct vcpu *v)
>>> +{
>>> +    if (hvm_funcs.ap2m_vcpu_update_eptp)
>> 
>> spaces inside brackets
>> 
>>> +        hvm_funcs.ap2m_vcpu_update_eptp(v);
>>> +}
>>> +
>>> +void ap2m_vcpu_update_vmfunc_ve(struct vcpu *v)
>>> +{
>>> +    if (hvm_funcs.ap2m_vcpu_update_vmfunc_ve)
>> 
>> spaces inside brackets
>> 
>>> +        hvm_funcs.ap2m_vcpu_update_vmfunc_ve(v);
>>> +}
>>> +
>>> +bool_t ap2m_vcpu_emulate_ve(struct vcpu *v)
>>> +{
>>> +    if (hvm_funcs.ap2m_vcpu_emulate_ve)
>> 
>> spaces inside brackets
>> 
>>> +        return hvm_funcs.ap2m_vcpu_emulate_ve(v);
>>> +    return 0;
>>> +}
>>> +
>>>  /*
>>>   * Local variables:
>>>   * mode: C
>>> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
>>> index d0d3f1e..c00282c 100644
>>> --- a/xen/arch/x86/mm/hap/hap.c
>>> +++ b/xen/arch/x86/mm/hap/hap.c
>>> @@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)
>>>  int hap_enable(struct domain *d, u32 mode)
>>>  {
>>>      unsigned int old_pages;
>>> -    uint8_t i;
>>> +    uint16_t i;
>>>      int rv = 0;
>>>  
>>>      domain_pause(d);
>>> @@ -498,6 +498,24 @@ int hap_enable(struct domain *d, u32 mode)
>>>             goto out;
>>>      }
>>>  
>>> +    /* Init alternate p2m data */
>>> +    if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
>>> +    {
>>> +        rv = -ENOMEM;
>>> +        goto out;
>>> +    }
>>> +
>>> +    for (i = 0; i < MAX_EPTP; i++)
>>> +        d->arch.altp2m_eptp[i] = INVALID_MFN;
>>> +
>>> +    for (i = 0; i < MAX_ALTP2M; i++) {
>> 
>> braces
>> 
>>> +        rv = p2m_alloc_table(d->arch.altp2m_p2m[i]);
>>> +        if ( rv != 0 )
>>> +           goto out;
>>> +    }
>>> +
>>> +    d->arch.altp2m_active = 0;
>>> +
>>>      /* Now let other users see the new mode */
>>>      d->arch.paging.mode = mode | PG_HAP_enable;
>>>  
>>> @@ -510,6 +528,17 @@ void hap_final_teardown(struct domain *d)
>>>  {
>>>      uint8_t i;
>>>  
>>> +    d->arch.altp2m_active = 0;
>>> +
>>> +    if ( d->arch.altp2m_eptp ) {
>> 
>> braces
>> 
>>> +        free_xenheap_page(d->arch.altp2m_eptp);
>>> +        d->arch.altp2m_eptp = NULL;
>>> +    }
>>> +
>>> +    for (i = 0; i < MAX_ALTP2M; i++) {
>> 
>> braces
>> 
>>> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
>>> index 1fd1194..58d4951 100644
>>> --- a/xen/arch/x86/mm/p2m.c
>>> +++ b/xen/arch/x86/mm/p2m.c
>>> @@ -35,6 +35,7 @@
>>>  #include <asm/hvm/vmx/vmx.h> /* ept_p2m_init() */
>>>  #include <asm/mem_sharing.h>
>>>  #include <asm/hvm/nestedhvm.h>
>>> +#include <asm/hvm/altp2m.h>
>>>  #include <asm/hvm/svm/amd-iommu-proto.h>
>>>  #include <xsm/xsm.h>
>>>  
>>> @@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct domain *d)
>>>      }
>>>  }
>>>  
>>> +static void p2m_teardown_altp2m(struct domain *d)
>>> +{
>>> +    uint8_t i;
>> 
>> A plain unsigned int here would suffice.  It also looks curious as you
>> use uint16 for the same iteration elsewhere.
>> 
>>> +    struct p2m_domain *p2m;
>>> +
>>> +    for (i = 0; i < MAX_ALTP2M; i++)
>> 
>> spaces inside brackets
>> 
>>> +    {
>>> +        if ( !d->arch.altp2m_p2m[i] )
>>> +            continue;
>>> +        p2m = d->arch.altp2m_p2m[i];
>>> +        p2m_free_one(p2m);
>>> +        d->arch.altp2m_p2m[i] = NULL;
>>> +    }
>>> +}
>>> +
>>> +static int p2m_init_altp2m(struct domain *d)
>>> +{
>>> +    uint8_t i;
>>> +    struct p2m_domain *p2m;
>>> +
>>> +    mm_lock_init(&d->arch.altp2m_lock);
>>> +    for (i = 0; i < MAX_ALTP2M; i++)
>> 
>> spaces inside brackets
>> 
> 
> In every case, this is because I wrote the code to conform with the style
> of the surrounding code. I'll fix them all, but I think the maintainers
> need to be clear about which is more important -- following the coding
> style or following the style of the surrounding code.

Okay, I agree that in the areas you change there are bad examples.
Nevertheless, looking at the files you modify as a whole, it is clear
that they're intended to be fully conforming to Xen coding style. In
a number of cases, you even mix styles yourself (e.g. putting blanks
around outer parentheses of if() but not doing so for for()). This
simply makes no sense. Furthermore it is quite intriguing that all the
violations I spotted while checking are due to nested code. I.e.
clearly sloppiness of submitter, reviewer(s) and committer. Bottom
line - this is a clear case of wrongly using malformed code as an
excuse to add further malformed code.

Jan
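
For reference, the conventions the reviewers keep pointing out in this sub-thread (blanks immediately inside the parentheses of both if() and for(), and braces on their own lines) combine as in the contrived fragment below. It is illustrative only, not code from the series; process() is a placeholder.

    for ( i = 0; i < MAX_ALTP2M; i++ )          /* blanks inside for() */
    {                                           /* brace on its own line */
        if ( d->arch.altp2m_p2m[i] == NULL )    /* blanks inside if() */
            continue;

        process(d->arch.altp2m_p2m[i]);
    }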


* Coding style (was Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.)
  2015-07-06 16:50       ` Ian Jackson
@ 2015-07-07  6:48         ` Jan Beulich
  0 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2015-07-07  6:48 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Lars Kurth, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Tim Deegan, Ed White, xen-devel, tlengyel, Daniel De Graaf

>>> On 06.07.15 at 18:50, <Ian.Jackson@eu.citrix.com> wrote:
> Ed White writes ("Re: [PATCH v3 05/13] x86/altp2m: basic data structures and 
> support routines."):
> ...
>> In every case, this is because I wrote the code to conform with the style
>> of the surrounding code. I'll fix them all, but I think the maintainers
>> need to be clear about which is more important -- following the coding
>> style or following the style of the surrounding code.
> 
> Sadly there are indeed inconsistent style problems like this in
> various bits of the codebase.  I agree with Ed that maintainers need
> to be clear about what is more important.
> 
> I also think that maintainers should (a) when making style complaints,
> be aware if the existing code style is inconsistent or wrong and
> (b) where it is, consider whether to grant submitters some leeway.

I don't think there's much to be discussed here - when a single
source file uses mostly one style with a couple of violations, the
mainly used style rules. Admittedly there are (pretty few) files
where the determination isn't that easy (and where I would
grant leeway without making any remarks), but none of those are
involved here afaict.

> That doesn't mean that it's not appropriate to ask a submitter to
> conform to a particular style; but it is important to remain
> respectful.

Respectful - yes. But especially considering the amount of patches
submitted during the last so many weeks/months, it is a waste of
reviewing resources to constantly and repeatedly (i.e. often to the
same submitter) have to point out coding style violations. Since
merely asking them to address this apparently doesn't help, I'm
slowly moving towards viewing the "respectful" the other way
around (i.e. such submissions wasting my and other reviewers'
time), which means that - as said - sooner or later I'll simply start
dropping such patches on the floor. After all, following style
conventions is mostly a matter of discipline - it doesn't require any
significant skills. And being contributor to different projects with
different styles isn't an excuse either (leaving aside occasional
mistakes of course) - being myself (occasional) contributor to
binutils, gcc, and linux, and considering the differences between
hypervisor, tools, qemu-trad, and qemu-upstream I already have
to follow half a dozen different styles, none of which matches what
I use in my own projects.

Jan


* Re: [PATCH v3 07/13] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-06 19:56     ` Sahita, Ravi
@ 2015-07-07  7:31       ` Jan Beulich
  0 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2015-07-07  7:31 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 06.07.15 at 21:56, <ravi.sahita@intel.com> wrote:
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com] 
> Sent: Friday, July 03, 2015 9:40 AM
>> On 01/07/15 19:09, Ed White wrote:
>> @@ -1830,6 +1831,20 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
>>      vmx_vmcs_exit(v);
>>  }
>>  
>> +static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs) {
>> +    int rc = X86EMUL_EXCEPTION;
>> +    struct vcpu *v = current;
>> +
>> +    if ( !cpu_has_vmx_vmfunc && altp2m_active(v->domain) &&
>> +         regs->eax == 0 &&
>> +         p2m_switch_vcpu_altp2m_by_id(v, (uint16_t)regs->ecx) )
>> +    {
>> +        rc = X86EMUL_OKAY;
>> +    }
> 
> You need a #UD injection at this point.
> 
> Ravi> I will keep this function unchanged i.e. returns X86EMUL_EXCEPTION on 
> error, which will cause the initiating hvmemul_vmfunc to stage a #UD (this 
> staging was in fact missing, and is now fixed - thanks).
> Ravi> The #UD is actually injected by the top level routine 
> vmx_vmexit_ud_intercept.

May I please ask you to adjust your reply style to one similar to
what everyone else uses? The Ravi> prefixes aren't indicating a
reply at a first, cursory look, especially since they only precede
paragraph (or sentence?) starts, i.e. aren't replicated on each
line...

Jan
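
A minimal sketch of the staging Ravi describes: the VMX callback keeps returning X86EMUL_EXCEPTION on failure, and the emulator-level caller converts that into a #UD for the guest. The hook name ap2m_vcpu_emulate_vmfunc and the use of hvmemul_inject_hw_exception() are assumptions based on the surrounding discussion, not the literal v4 code.

    static int hvmemul_vmfunc(struct x86_emulate_ctxt *ctxt)
    {
        int rc;

        /* Fails with X86EMUL_EXCEPTION if the leaf or EPTP index is bad. */
        rc = hvm_funcs.ap2m_vcpu_emulate_vmfunc(ctxt->regs);
        if ( rc != X86EMUL_OKAY )
            /* Stage the #UD that was missing in v3. */
            hvmemul_inject_hw_exception(TRAP_invalid_op,
                                        HVM_DELIVER_NO_ERROR_CODE, ctxt);

        return rc;
    }

On hardware with native VMFUNC support the instruction should not reach the emulator at all, which is why the callback in the quoted patch also checks !cpu_has_vmx_vmfunc.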


* Re: [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-06 10:09   ` Andrew Cooper
  2015-07-06 16:49     ` Ed White
@ 2015-07-07  7:33     ` Jan Beulich
  2015-07-07 20:10       ` Sahita, Ravi
  1 sibling, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2015-07-07  7:33 UTC (permalink / raw)
  To: Andrew Cooper, Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson,
	xen-devel, tlengyel, Daniel De Graaf

>>> On 06.07.15 at 12:09, <andrew.cooper3@citrix.com> wrote:
> On 01/07/15 19:09, Ed White wrote:
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> I am still very much unconvinced by the argument against having a single
> HVMOP_altp2m and a set of subops.  do_domctl() and do_sysctl() are
> examples of a subop style hypercall with different XSM settings for
> different subops.

+1

>> +    case HVMOP_altp2m_vcpu_enable_notify:
>> +    {
>> +        struct domain *curr_d = current->domain;
>> +        struct vcpu *curr = current;
>> +        struct xen_hvm_altp2m_vcpu_enable_notify a;
>> +        p2m_type_t p2mt;
>> +
>> +        if ( copy_from_guest(&a, arg, 1) )
>> +            return -EFAULT;
>> +
>> +        if ( !is_hvm_domain(curr_d) || !hvm_altp2m_supported() ||
>> +             !curr_d->arch.altp2m_active ||
>> +             gfn_x(vcpu_altp2m(curr).veinfo_gfn) != INVALID_GFN)
> 
> Brackets around the boolean operation on this line, and a space inside
> the final bracket.

The latter is a requirement, while the former really is optional.

Jan


* Re: [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-06 16:49     ` Ed White
  2015-07-06 17:08       ` Ian Jackson
@ 2015-07-07  7:39       ` Jan Beulich
  1 sibling, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2015-07-07  7:39 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 06.07.15 at 18:49, <edmund.h.white@intel.com> wrote:
> On 07/06/2015 03:09 AM, Andrew Cooper wrote:
>> On 01/07/15 19:09, Ed White wrote:
>>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>> 
>> I am still very much unconvinced by the argument against having a single
>> HVMOP_altp2m and a set of subops.  do_domctl() and do_sysctl() are
>> examples of a subop style hypercall with different XSM settings for
>> different subops.
>> 
>> Furthermore, factoring out a do_altp2m_op() handler would allow things
>> like the hvm_altp2m_supported() check to be made common.  Factoring
>> further to having a named common header of a subop and a domid at the
>> head of every subop structure would allow all the domain rcu locking to
>> become common outside of the subop switch.
>> 
> 
> How do we get to a binding decision on whether making this change is
> a prerequisite for acceptance or not? Changing the HVMOP encoding
> means fairly extensive changes to the code in hvm.c, and the XSM
> patch, and the code Tamas has written. It also necessitates significant
> changes to all the code we use to test the intra-domain protection
> model.

I don't follow this argumentation about XSM at all: Various HVMOPs
have different XSM handling anyway (i.e. by far not all of them go
through e.g. xsm_hvm_control()), and hence I don't even see the
problem of passing the sub-op to a (new) XSM handler.

Jan
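
To make the shape of the suggestion concrete: a single op with subops, a common header, and a per-subop XSM check might look roughly like the sketch below. Every name in it (the struct layout, xsm_hvm_altp2m_op, the handler) is an assumption for illustration, not the committed ABI.

    struct xen_hvm_altp2m_op {
        uint32_t cmd;        /* HVMOP_altp2m_* subop code */
        domid_t  domain;     /* common header: target domain */
        uint16_t pad;
        /* subop-specific payload follows */
    };

    static int do_altp2m_op(XEN_GUEST_HANDLE_PARAM(xen_hvm_altp2m_op_t) arg)
    {
        struct xen_hvm_altp2m_op a;
        struct domain *d;
        int rc;

        if ( copy_from_guest(&a, arg, 1) )
            return -EFAULT;

        /* Domain rcu locking factored out of the individual subops. */
        d = rcu_lock_domain_by_any_id(a.domain);
        if ( d == NULL )
            return -ESRCH;

        /* Per-subop XSM check, as do_domctl()/do_sysctl() do it. */
        rc = xsm_hvm_altp2m_op(XSM_OTHER, d, a.cmd);
        if ( !rc )
            switch ( a.cmd )
            {
            /* ... one case per HVMOP_altp2m_* subop ... */
            default:
                rc = -EOPNOTSUPP;
                break;
            }

        rcu_unlock_domain(d);
        return rc;
    }

The point relevant to the XSM question is that a.cmd is available before any work is done, so a new XSM handler can dispatch on the subop exactly as flask does for domctl subops.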


* Re: [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-06 17:08       ` Ian Jackson
  2015-07-06 18:27         ` Ed White
@ 2015-07-07  7:41         ` Jan Beulich
  1 sibling, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2015-07-07  7:41 UTC (permalink / raw)
  To: Ian Jackson, Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan,
	xen-devel, tlengyel, Daniel De Graaf

>>> On 06.07.15 at 19:08, <Ian.Jackson@eu.citrix.com> wrote:
> I can see that Jan Beulich (who is the other active x86 maintainer -
> Keir is no longer very active) has been CC'd on a lot of this traffic.
> I don't see you having asked Jan for an opinion, although you did ask
> for a vote.  It would be helpful of Jan were to explicitly state his
> opinion.
> 
> Jan: what do you think ?

I thought I had indicated agreement with Andrew already, and I
expressed the same in another reply to the mail this one too is a
reply to, so in any case - unless there are convincing arguments
against what Andrew asked for, I'd ask for his request to be
followed.

Jan


* Re: [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-06 23:40           ` Lengyel, Tamas
@ 2015-07-07  7:46             ` Jan Beulich
  0 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2015-07-07  7:46 UTC (permalink / raw)
  To: Ed White, Tamas Lengyel
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Xen-devel, Daniel De Graaf

>>> On 07.07.15 at 01:40, <tlengyel@novetta.com> wrote:
>>  I believe, unless Tamas says otherwise, that we agreed the
>> HVMOP's in question and their implementations are sufficiently
>> different that we should not merge them.
> 
> 
> I'm still not entirely convinced of this being the case but considering
> altp2m will be an experimental feature I'm not going to hold this against
> it being merged for 4.6. Hopefully things will be ironed out more as we go
> forward and start using it for real.

Experimental or not - as long as this is not self-contained code, but
has modifications scattered around, it imo can't be subject to
relaxation due to being experimental.

Yet your reply reminds me to clarify an earlier reply of mine: While
I'm in agreement with Andrew regarding the single HVMOP aspect,
I have no particular requirements whether to unify certain functions
(i.e. I'd leave the decision to accept the possibly resulting code
duplication to the respective maintainers [vm-event and x86/mm
iirc what code is being affected]).

Jan


* Re: [PATCH v3 06/13] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-03 16:29   ` Andrew Cooper
@ 2015-07-07 14:28     ` Wei Liu
  0 siblings, 0 replies; 91+ messages in thread
From: Wei Liu @ 2015-07-07 14:28 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: jun.nakajima, Ravi Sahita, Wei Liu, eddie.dong, George Dunlap,
	Ian Jackson, Tim Deegan, Ed White, xen-devel, kevin.tian,
	Jan Beulich, tlengyel, Daniel De Graaf

On Fri, Jul 03, 2015 at 05:29:44PM +0100, Andrew Cooper wrote:
> On 01/07/15 19:09, Ed White wrote:
> > Implement and hook up the code to enable VMX support of VMFUNC and #VE.
> >
> > VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.
> >
> > Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> (You are also going to need an ack/review from a VMX maintainer for the
> entire series)

Actually CC VMX maintainers.


* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-01 18:09 ` [PATCH v3 05/13] x86/altp2m: basic data structures and support routines Ed White
  2015-07-03 16:22   ` Andrew Cooper
@ 2015-07-07 15:04   ` George Dunlap
  2015-07-07 15:22     ` Tim Deegan
  2015-07-09 13:29   ` Jan Beulich
  2015-07-09 15:58   ` George Dunlap
  3 siblings, 1 reply; 91+ messages in thread
From: George Dunlap @ 2015-07-07 15:04 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: Ravi Sahita, Wei Liu, Ian Jackson, Tim Deegan, Jan Beulich,
	Andrew Cooper, tlengyel, Daniel De Graaf

On 07/01/2015 07:09 PM, Ed White wrote:
> diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
> index b4f035e..301ca59 100644
> --- a/xen/arch/x86/mm/mm-locks.h
> +++ b/xen/arch/x86/mm/mm-locks.h
> @@ -217,7 +217,7 @@ declare_mm_lock(nestedp2m)
>  #define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)->arch.nested_p2m_lock)
>  #define nestedp2m_unlock(d) mm_unlock(&(d)->arch.nested_p2m_lock)
>  
> -/* P2M lock (per-p2m-table)
> +/* P2M lock (per-non-alt-p2m-table)
>   *
>   * This protects all queries and updates to the p2m table.
>   * Queries may be made under the read lock but all modifications
> @@ -225,10 +225,44 @@ declare_mm_lock(nestedp2m)
>   *
>   * The write lock is recursive as it is common for a code path to look
>   * up a gfn and later mutate it.
> + *
> + * Note that this lock shares its implementation with the altp2m
> + * lock (not the altp2m list lock), so the implementation
> + * is found there.
>   */
>  
>  declare_mm_rwlock(p2m);
> -#define p2m_lock(p)           mm_write_lock(p2m, &(p)->lock);
> +
> +/* Alternate P2M list lock (per-domain)
> + *
> + * A per-domain lock that protects the list of alternate p2m's.
> + * Any operation that walks the list needs to acquire this lock.
> + * Additionally, before destroying an alternate p2m all VCPU's
> + * in the target domain must be paused.
> + */
> +
> +declare_mm_lock(altp2mlist)
> +#define altp2m_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_lock)
> +#define altp2m_unlock(d) mm_unlock(&(d)->arch.altp2m_lock)
> +
> +/* P2M lock (per-altp2m-table)
> + *
> + * This protects all queries and updates to the p2m table.
> + * Queries may be made under the read lock but all modifications
> + * need the main (write) lock.
> + *
> + * The write lock is recursive as it is common for a code path to look
> + * up a gfn and later mutate it.
> + */
> +
> +declare_mm_rwlock(altp2m);
> +#define p2m_lock(p)                         \
> +{                                           \
> +    if ( p2m_is_altp2m(p) )                 \
> +        mm_write_lock(altp2m, &(p)->lock);  \
> +    else                                    \
> +        mm_write_lock(p2m, &(p)->lock);     \
> +}
>  #define p2m_unlock(p)         mm_write_unlock(&(p)->lock);
>  #define gfn_lock(p,g,o)       p2m_lock(p)
>  #define gfn_unlock(p,g,o)     p2m_unlock(p)

I've just taken on the role of mm maintainer, and so this late in a
series, having Tim's approval and also having Andy's reviewed-by, I'd
normally just skim through and Ack it as a matter of course.  But I just
wouldn't feel right giving this my maintainer's ack without
understanding the locking a bit better.  So please bear with me here.

I see in the cover letter that you "sandwiched" the altp2mlist lock
between p2m and altp2m at Tim's suggestion.  But I can't find the
discussion where that was suggested (it doesn't seem to be in response
to v1 patch 5), and it's not immediately obvious to me, either from the
code or your comments, what that's for.  Can you point me to the
discussion, and/or give me a better description as to why it's important
to be able to grab the p2m lock before the altp2mlist lock, but the
altp2m lock afterwards?

Thanks,
 -George
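
For what it's worth, the ordering can be made concrete. With the lock classes declared in the order p2m, altp2mlist, altp2m, a path already holding the host p2m lock may still take the list lock and then an individual altp2m's lock, as in this contrived sketch (field names as in the patch; propagate_change() is a placeholder):

    static void propagate_to_altp2ms(struct domain *d)
    {
        struct p2m_domain *hp2m = p2m_get_hostp2m(d);
        unsigned int i;

        p2m_lock(hp2m);     /* "p2m" class: the host p2m, ordered first */
        altp2m_lock(d);     /* list lock: nests inside the host p2m lock */

        for ( i = 0; i < MAX_ALTP2M; i++ )
        {
            struct p2m_domain *ap2m = d->arch.altp2m_p2m[i];

            if ( ap2m == NULL )
                continue;

            p2m_lock(ap2m); /* "altp2m" class: ordered after the list lock */
            propagate_change(ap2m);
            p2m_unlock(ap2m);
        }

        altp2m_unlock(d);
        p2m_unlock(hp2m);
    }

With a single lock class for every p2m, the inner p2m_lock() would trip mm-locks.h's ordering check against the outer one; splitting the class and slotting the list lock between the two is what makes this nesting legal.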


* Re: [PATCH v3 09/13] x86/altp2m: alternate p2m memory events.
  2015-07-01 18:09 ` [PATCH v3 09/13] x86/altp2m: alternate p2m memory events Ed White
  2015-07-01 18:29   ` Lengyel, Tamas
  2015-07-03 16:46   ` Andrew Cooper
@ 2015-07-07 15:18   ` George Dunlap
  2 siblings, 0 replies; 91+ messages in thread
From: George Dunlap @ 2015-07-07 15:18 UTC (permalink / raw)
  To: Ed White, xen-devel
  Cc: Ravi Sahita, Wei Liu, Ian Jackson, Tim Deegan, Jan Beulich,
	Andrew Cooper, tlengyel, Daniel De Graaf

On 07/01/2015 07:09 PM, Ed White wrote:
> Add a flag to indicate that a memory event occurred in an alternate p2m
> and a field containing the p2m index. Allow any event response to switch
> to a different alternate p2m using the same flag and field.
> 
> Modify p2m_memory_access_check() to handle alternate p2m's.
> 
> Signed-off-by: Ed White <edmund.h.white@intel.com>

With Tamas' comments addressed:

Acked-by: George Dunlap <george.dunlap@eu.citrix.com>


* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-07 15:04   ` George Dunlap
@ 2015-07-07 15:22     ` Tim Deegan
  2015-07-07 16:19       ` Ed White
  0 siblings, 1 reply; 91+ messages in thread
From: Tim Deegan @ 2015-07-07 15:22 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ravi Sahita, Wei Liu, Andrew Cooper, Ian Jackson, Ed White,
	xen-devel, Jan Beulich, tlengyel, Daniel De Graaf

At 16:04 +0100 on 07 Jul (1436285059), George Dunlap wrote:
> On 07/01/2015 07:09 PM, Ed White wrote:
> > diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
> > index b4f035e..301ca59 100644
> > --- a/xen/arch/x86/mm/mm-locks.h
> > +++ b/xen/arch/x86/mm/mm-locks.h
> > @@ -217,7 +217,7 @@ declare_mm_lock(nestedp2m)
> >  #define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)->arch.nested_p2m_lock)
> >  #define nestedp2m_unlock(d) mm_unlock(&(d)->arch.nested_p2m_lock)
> >  
> > -/* P2M lock (per-p2m-table)
> > +/* P2M lock (per-non-alt-p2m-table)
> >   *
> >   * This protects all queries and updates to the p2m table.
> >   * Queries may be made under the read lock but all modifications
> > @@ -225,10 +225,44 @@ declare_mm_lock(nestedp2m)
> >   *
> >   * The write lock is recursive as it is common for a code path to look
> >   * up a gfn and later mutate it.
> > + *
> > + * Note that this lock shares its implementation with the altp2m
> > + * lock (not the altp2m list lock), so the implementation
> > + * is found there.
> >   */
> >  
> >  declare_mm_rwlock(p2m);
> > -#define p2m_lock(p)           mm_write_lock(p2m, &(p)->lock);
> > +
> > +/* Alternate P2M list lock (per-domain)
> > + *
> > + * A per-domain lock that protects the list of alternate p2m's.
> > + * Any operation that walks the list needs to acquire this lock.
> > + * Additionally, before destroying an alternate p2m all VCPU's
> > + * in the target domain must be paused.
> > + */
> > +
> > +declare_mm_lock(altp2mlist)
> > +#define altp2m_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_lock)
> > +#define altp2m_unlock(d) mm_unlock(&(d)->arch.altp2m_lock)
> > +
> > +/* P2M lock (per-altp2m-table)
> > + *
> > + * This protects all queries and updates to the p2m table.
> > + * Queries may be made under the read lock but all modifications
> > + * need the main (write) lock.
> > + *
> > + * The write lock is recursive as it is common for a code path to look
> > + * up a gfn and later mutate it.
> > + */
> > +
> > +declare_mm_rwlock(altp2m);
> > +#define p2m_lock(p)                         \
> > +{                                           \
> > +    if ( p2m_is_altp2m(p) )                 \
> > +        mm_write_lock(altp2m, &(p)->lock);  \
> > +    else                                    \
> > +        mm_write_lock(p2m, &(p)->lock);     \
> > +}
> >  #define p2m_unlock(p)         mm_write_unlock(&(p)->lock);
> >  #define gfn_lock(p,g,o)       p2m_lock(p)
> >  #define gfn_unlock(p,g,o)     p2m_unlock(p)
> 
> I've just taken on the role of mm maintainer, and so this late in a
> series, having Tim's approval and also having Andy's reviewed-by, I'd
> normally just skim through and Ack it as a matter of course.  But I just
> wouldn't feel right giving this my maintainer's ack without
> understanding the locking a bit better.  So please bear with me here.
> 
> I see in the cover letter that you "sandwiched" the altp2mlist lock
> between p2m and altp2m at Tim's suggestion.  But I can't find the
> discussion where that was suggested (it doesn't seem to be in response
> to v1 patch 5),

I suggested changing the locking order here:
http://lists.xenproject.org/archives/html/xen-devel/2015-01/msg01824.html

Cheers,

Tim.


* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-07 15:22     ` Tim Deegan
@ 2015-07-07 16:19       ` Ed White
  2015-07-08 13:52         ` George Dunlap
  2015-07-09 17:05         ` Sahita, Ravi
  0 siblings, 2 replies; 91+ messages in thread
From: Ed White @ 2015-07-07 16:19 UTC (permalink / raw)
  To: Tim Deegan, George Dunlap
  Cc: Ravi Sahita, Wei Liu, Andrew Cooper, Ian Jackson, xen-devel,
	Jan Beulich, tlengyel, Daniel De Graaf

On 07/07/2015 08:22 AM, Tim Deegan wrote:
> At 16:04 +0100 on 07 Jul (1436285059), George Dunlap wrote:
>> On 07/01/2015 07:09 PM, Ed White wrote:
>>> diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
>>> index b4f035e..301ca59 100644
>>> --- a/xen/arch/x86/mm/mm-locks.h
>>> +++ b/xen/arch/x86/mm/mm-locks.h
>>> @@ -217,7 +217,7 @@ declare_mm_lock(nestedp2m)
>>>  #define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)->arch.nested_p2m_lock)
>>>  #define nestedp2m_unlock(d) mm_unlock(&(d)->arch.nested_p2m_lock)
>>>  
>>> -/* P2M lock (per-p2m-table)
>>> +/* P2M lock (per-non-alt-p2m-table)
>>>   *
>>>   * This protects all queries and updates to the p2m table.
>>>   * Queries may be made under the read lock but all modifications
>>> @@ -225,10 +225,44 @@ declare_mm_lock(nestedp2m)
>>>   *
>>>   * The write lock is recursive as it is common for a code path to look
>>>   * up a gfn and later mutate it.
>>> + *
>>> + * Note that this lock shares its implementation with the altp2m
>>> + * lock (not the altp2m list lock), so the implementation
>>> + * is found there.
>>>   */
>>>  
>>>  declare_mm_rwlock(p2m);
>>> -#define p2m_lock(p)           mm_write_lock(p2m, &(p)->lock);
>>> +
>>> +/* Alternate P2M list lock (per-domain)
>>> + *
>>> + * A per-domain lock that protects the list of alternate p2m's.
>>> + * Any operation that walks the list needs to acquire this lock.
>>> + * Additionally, before destroying an alternate p2m all VCPU's
>>> + * in the target domain must be paused.
>>> + */
>>> +
>>> +declare_mm_lock(altp2mlist)
>>> +#define altp2m_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_lock)
>>> +#define altp2m_unlock(d) mm_unlock(&(d)->arch.altp2m_lock)
>>> +
>>> +/* P2M lock (per-altp2m-table)
>>> + *
>>> + * This protects all queries and updates to the p2m table.
>>> + * Queries may be made under the read lock but all modifications
>>> + * need the main (write) lock.
>>> + *
>>> + * The write lock is recursive as it is common for a code path to look
>>> + * up a gfn and later mutate it.
>>> + */
>>> +
>>> +declare_mm_rwlock(altp2m);
>>> +#define p2m_lock(p)                         \
>>> +{                                           \
>>> +    if ( p2m_is_altp2m(p) )                 \
>>> +        mm_write_lock(altp2m, &(p)->lock);  \
>>> +    else                                    \
>>> +        mm_write_lock(p2m, &(p)->lock);     \
>>> +}
>>>  #define p2m_unlock(p)         mm_write_unlock(&(p)->lock);
>>>  #define gfn_lock(p,g,o)       p2m_lock(p)
>>>  #define gfn_unlock(p,g,o)     p2m_unlock(p)
>>
>> I've just taken on the role of mm maintainer, and so this late in a
>> series, having Tim's approval and also having Andy's reviewed-by, I'd
>> normally just skim through and Ack it as a matter of course.  But I just
>> wouldn't feel right giving this my maintainer's ack without
>> understanding the locking a bit better.  So please bear with me here.
>>
>> I see in the cover letter that you "sandwiched" the altp2mlist lock
>> between p2m and altp2m at Tim's suggestion.  But I can't find the
>> discussion where that was suggested (it doesn't seem to be in response
>> to v1 patch 5),
> 
> I suggested changing the locking order here:
> http://lists.xenproject.org/archives/html/xen-devel/2015-01/msg01824.html
> 
> Cheers,
> 
> Tim.
> 

And Tim, Andrew and I subsequently discussed this specific approach
in a phone meeting.

Ed


* Re: [PATCH v3 02/13] VMX: VMFUNC and #VE definitions and detection.
  2015-07-01 18:09 ` [PATCH v3 02/13] VMX: VMFUNC and #VE definitions and detection Ed White
  2015-07-06 17:16   ` George Dunlap
@ 2015-07-07 18:58   ` Nakajima, Jun
  1 sibling, 0 replies; 91+ messages in thread
From: Nakajima, Jun @ 2015-07-07 18:58 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson,
	xen-devel, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Wed, Jul 1, 2015 at 11:09 AM, Ed White <edmund.h.white@intel.com> wrote:
> Currently, neither is enabled globally but may be enabled on a per-VCPU
> basis by the altp2m code.
>
> Remove the check for EPTE bit 63 == zero in ept_split_super_page(), as
> that bit is now hardware-defined.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Jun Nakajima <jun.nakajima@intel.com>

-- 
Jun
Intel Open Source Technology Center


* Re: [PATCH v3 03/13] VMX: implement suppress #VE.
  2015-07-01 18:09 ` [PATCH v3 03/13] VMX: implement suppress #VE Ed White
  2015-07-06 17:26   ` George Dunlap
@ 2015-07-07 18:59   ` Nakajima, Jun
  2015-07-09 13:01   ` Jan Beulich
  2 siblings, 0 replies; 91+ messages in thread
From: Nakajima, Jun @ 2015-07-07 18:59 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson,
	xen-devel, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Wed, Jul 1, 2015 at 11:09 AM, Ed White <edmund.h.white@intel.com> wrote:
> In preparation for selectively enabling #VE in a later patch, set
> suppress #VE on all EPTE's.
>
> Suppress #VE should always be the default condition for two reasons:
> it is generally not safe to deliver #VE into a guest unless that guest
> has been modified to receive it; and even then for most EPT violations only
> the hypervisor is able to handle the violation.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>
>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Jun Nakajima <jun.nakajima@intel.com>

-- 
Jun
Intel Open Source Technology Center


* Re: [PATCH v3 06/13] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-01 18:09 ` [PATCH v3 06/13] VMX/altp2m: add code to support EPTP switching and #VE Ed White
  2015-07-03 16:29   ` Andrew Cooper
@ 2015-07-07 19:02   ` Nakajima, Jun
  1 sibling, 0 replies; 91+ messages in thread
From: Nakajima, Jun @ 2015-07-07 19:02 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Tim Deegan, Ian Jackson,
	xen-devel, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Wed, Jul 1, 2015 at 11:09 AM, Ed White <edmund.h.white@intel.com> wrote:
> Implement and hook up the code to enable VMX support of VMFUNC and #VE.
>
> VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>

Acked-by: Jun Nakajima <jun.nakajima@intel.com>

-- 
Jun
Intel Open Source Technology Center


* Re: [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-07  7:33     ` Jan Beulich
@ 2015-07-07 20:10       ` Sahita, Ravi
  2015-07-07 20:25         ` Andrew Cooper
  0 siblings, 1 reply; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-07 20:10 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper, White, Edmund H
  Cc: Wei Liu, George Dunlap, Tim Deegan, Ian Jackson, xen-devel,
	tlengyel, Daniel De Graaf


>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Tuesday, July 07, 2015 12:34 AM
>
>>>> On 06.07.15 at 12:09, <andrew.cooper3@citrix.com> wrote:
>> On 01/07/15 19:09, Ed White wrote:
>>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>
>> I am still very much unconvinced by the argument against having a
>> single HVMOP_altp2m and a set of subops.  do_domctl() and do_sysctl()
>> are examples of a subop style hypercall with different XSM settings
>> for different subops.
>
>+1


Thanks Andrew and Jan for providing feedback on what the maintainers want to see for the HVMOP_altp2m.

Just wanted some clarity from a timing perspective on this one so we know how to proceed - is creating a single HVMOP_altp2m and a set of associated subops a requirement to be completed for 4.6 or is that something that can be addressed in a subsequent change?

Thanks,
Ravi


* Re: [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-07 20:10       ` Sahita, Ravi
@ 2015-07-07 20:25         ` Andrew Cooper
  0 siblings, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2015-07-07 20:25 UTC (permalink / raw)
  To: Sahita, Ravi, Jan Beulich, White, Edmund H
  Cc: Wei Liu, George Dunlap, Tim Deegan, Ian Jackson, xen-devel,
	tlengyel, Daniel De Graaf

On 07/07/15 21:10, Sahita, Ravi wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Tuesday, July 07, 2015 12:34 AM
>>
>>>>> On 06.07.15 at 12:09, <andrew.cooper3@citrix.com> wrote:
>>> On 01/07/15 19:09, Ed White wrote:
>>>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>> I am still very much unconvinced by the argument against having a
>>> single HVMOP_altp2m and a set of subops.  do_domctl() and do_sysctl()
>>> are examples of a subop style hypercall with different XSM settings
>>> for different subops.
>> +1
>
> Thanks Andrew and Jan for providing feedback on what the maintainers want to see for the HVMOP_altp2m.
>
> Just wanted some clarity from a timing perspective on this one so we know how to proceed - is creating a single HVMOP_altp2m and a set of associated subops a requirement to be completed for 4.6 or is that something that can be addressed in a subsequent change?

This, unlike most other bits of the series, is an ABI matter.  Barring
exceptional circumstances, no incompatible changes may be made to the ABI.

This, being a guest visible interface, is critical to get right as it
cannot be changed at a later point.

~Andrew


* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-07 16:19       ` Ed White
@ 2015-07-08 13:52         ` George Dunlap
  2015-07-09 17:05         ` Sahita, Ravi
  1 sibling, 0 replies; 91+ messages in thread
From: George Dunlap @ 2015-07-08 13:52 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Ian Jackson, Tim Deegan, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Tue, Jul 7, 2015 at 5:19 PM, Ed White <edmund.h.white@intel.com> wrote:
> On 07/07/2015 08:22 AM, Tim Deegan wrote:
>> At 16:04 +0100 on 07 Jul (1436285059), George Dunlap wrote:
>>> On 07/01/2015 07:09 PM, Ed White wrote:
>>>> diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
>>>> index b4f035e..301ca59 100644
>>>> --- a/xen/arch/x86/mm/mm-locks.h
>>>> +++ b/xen/arch/x86/mm/mm-locks.h
>>>> @@ -217,7 +217,7 @@ declare_mm_lock(nestedp2m)
>>>>  #define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)->arch.nested_p2m_lock)
>>>>  #define nestedp2m_unlock(d) mm_unlock(&(d)->arch.nested_p2m_lock)
>>>>
>>>> -/* P2M lock (per-p2m-table)
>>>> +/* P2M lock (per-non-alt-p2m-table)
>>>>   *
>>>>   * This protects all queries and updates to the p2m table.
>>>>   * Queries may be made under the read lock but all modifications
>>>> @@ -225,10 +225,44 @@ declare_mm_lock(nestedp2m)
>>>>   *
>>>>   * The write lock is recursive as it is common for a code path to look
>>>>   * up a gfn and later mutate it.
>>>> + *
>>>> + * Note that this lock shares its implementation with the altp2m
>>>> + * lock (not the altp2m list lock), so the implementation
>>>> + * is found there.
>>>>   */
>>>>
>>>>  declare_mm_rwlock(p2m);
>>>> -#define p2m_lock(p)           mm_write_lock(p2m, &(p)->lock);
>>>> +
>>>> +/* Alternate P2M list lock (per-domain)
>>>> + *
>>>> + * A per-domain lock that protects the list of alternate p2m's.
>>>> + * Any operation that walks the list needs to acquire this lock.
>>>> + * Additionally, before destroying an alternate p2m all VCPU's
>>>> + * in the target domain must be paused.
>>>> + */
>>>> +
>>>> +declare_mm_lock(altp2mlist)
>>>> +#define altp2m_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_lock)
>>>> +#define altp2m_unlock(d) mm_unlock(&(d)->arch.altp2m_lock)
>>>> +
>>>> +/* P2M lock (per-altp2m-table)
>>>> + *
>>>> + * This protects all queries and updates to the p2m table.
>>>> + * Queries may be made under the read lock but all modifications
>>>> + * need the main (write) lock.
>>>> + *
>>>> + * The write lock is recursive as it is common for a code path to look
>>>> + * up a gfn and later mutate it.
>>>> + */
>>>> +
>>>> +declare_mm_rwlock(altp2m);
>>>> +#define p2m_lock(p)                         \
>>>> +{                                           \
>>>> +    if ( p2m_is_altp2m(p) )                 \
>>>> +        mm_write_lock(altp2m, &(p)->lock);  \
>>>> +    else                                    \
>>>> +        mm_write_lock(p2m, &(p)->lock);     \
>>>> +}
>>>>  #define p2m_unlock(p)         mm_write_unlock(&(p)->lock);
>>>>  #define gfn_lock(p,g,o)       p2m_lock(p)
>>>>  #define gfn_unlock(p,g,o)     p2m_unlock(p)
>>>
>>> I've just taken on the role of mm maintainer, and so this late in a
>>> series, having Tim's approval and also having Andy's reviewed-by, I'd
>>> normally just skim through and Ack it as a matter of course.  But I just
>>> wouldn't feel right giving this my maintainer's ack without
>>> understanding the locking a bit better.  So please bear with me here.
>>>
>>> I see in the cover letter that you "sandwiched" the altp2mlist lock
>>> between p2m and altp2m at Tim's suggestion.  But I can't find the
>>> discussion where that was suggested (it doesn't seem to be in response
>>> to v1 patch 5),
>>
>> I suggested changing the locking order here:
>> http://lists.xenproject.org/archives/html/xen-devel/2015-01/msg01824.html
>>
>> Cheers,
>>
>> Tim.
>>
>
> And Tim, Andrew and I subsequently discussed this specific approach
> in a phone meeting.

Right; but it's still important to have all information necessary to
understand the patch series in the code, comments, and changelogs.  I
can't go back and listen to your conversation, and although a
reasonable amount of inference from the code can be expected, anything
complicated should be made explicit.

(Perhaps it is explained further on in the series, in which case the
above comment may not require any action.)

 -George


* Re: [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m
  2015-07-01 18:09 [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (13 preceding siblings ...)
  2015-07-06  9:50 ` [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Jan Beulich
@ 2015-07-08 18:35 ` Sahita, Ravi
  2015-07-09 11:49   ` Wei Liu
  14 siblings, 1 reply; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-08 18:35 UTC (permalink / raw)
  To: Wei Liu, xen-devel
  Cc: George Dunlap, Ian Jackson, Tim Deegan, White, Edmund H,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

Hi Wei,

Given where we stand (close) on the altp2m patch series, we would like to request an extension of about a week to complete the last couple of patches in the series for inclusion in 4.6.
Some of the suggestions may have implications for other patches and for our tests, hence the request for an extension (we know what we need to change).

Hope that is acceptable. This is a quick snapshot of the current status of v3:
(this doesn't include a couple of tools patches that Tamas has contributed, but those will be included in rev4)

altp2m series patch v3 status

 ##  Patch (v3, NN/13)                                            Sign-off       Reviewed          Acked            Status
 --  -----------------------------------------------------------  -------------  ----------------  ---------------  ------
 01  common/domain: Helpers to pause a domain while in context    Andrew Cooper  -                 -                good?
 02  VMX: VMFUNC and #VE definitions and detection                Ed White       Andrew Cooper     Jun Nakajima     good
 03  VMX: implement suppress #VE                                  Ed White       Andrew Cooper     Jun Nakajima     good
 04  x86/HVM: Hardware alternate p2m support detection            Ed White       Andrew Cooper     -                good?
 05  x86/altp2m: basic data structures and support routines       Ed White       Andrew Cooper     -                locking being reviewed by George Dunlap
 06  VMX/altp2m: add code to support EPTP switching and #VE       Ed White       Andrew Cooper     Jun Nakajima     good
 07  VMX: add VMFUNC leaf 0 (EPTP switching) to emulator          Ravi Sahita    ***               -                *** proposed changes ready, staged for v4
 08  x86/altp2m: add control of suppress_ve                       Ed White       *** Andrew Cooper -                *** George Dunlap has an alt patch based on v2 7 of 12
 09  x86/altp2m: alternate p2m memory events                      Ed White       Andrew Cooper     George Dunlap    good
 10  x86/altp2m: add remaining support routines                   Ed White       Andrew Cooper     -                good?
 11  x86/altp2m: define and implement alternate p2m HVMOP types   Ed White       ***               -                *** need to create HVMOP sub-types for altp2m
 12  x86/altp2m: Add altp2mhvm HVM domain parameter               Ed White       Andrew Cooper     -                Wei's comments addressed and staged for v4
 13  x86/altp2m: XSM hooks for altp2m HVM ops                     Ravi Sahita    Daniel De Graaf   Daniel De Graaf  *** will be impacted by HVMOP changes

Thanks,
Ravi



* Re: [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m
  2015-07-08 18:35 ` Sahita, Ravi
@ 2015-07-09 11:49   ` Wei Liu
  2015-07-09 14:14     ` Jan Beulich
                       ` (2 more replies)
  0 siblings, 3 replies; 91+ messages in thread
From: Wei Liu @ 2015-07-09 11:49 UTC (permalink / raw)
  To: Sahita, Ravi
  Cc: Wei Liu, George Dunlap, Tim Deegan, Ian Jackson, White, Edmund H,
	xen-devel, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Wed, Jul 08, 2015 at 06:35:33PM +0000, Sahita, Ravi wrote:
> Hi Wei,
> 
> Given where we stand (close) on the altp2m patch series, we would like to request an extension of about a week to complete the last couple of patches in the series for inclusion in 4.6.
> Some of the suggestions may have implications for other patches and for our tests, hence the request for an extension (we know what we need to change).
> 
> Hope that is acceptable. This is a quick snapshot of the current status of v3:
> (this doesn't include a couple of tools patches that Tamas has contributed, but those will be included in rev4)
> 
> altp2m series patch v3 status
> 
>  ##  Patch (v3, NN/13)                                            Sign-off       Reviewed          Acked            Status
>  --  -----------------------------------------------------------  -------------  ----------------  ---------------  ------
>  01  common/domain: Helpers to pause a domain while in context    Andrew Cooper  -                 -                good?
>  02  VMX: VMFUNC and #VE definitions and detection                Ed White       Andrew Cooper     Jun Nakajima     good
>  03  VMX: implement suppress #VE                                  Ed White       Andrew Cooper     Jun Nakajima     good
>  04  x86/HVM: Hardware alternate p2m support detection            Ed White       Andrew Cooper     -                good?
>  05  x86/altp2m: basic data structures and support routines       Ed White       Andrew Cooper     -                locking being reviewed by George Dunlap
>  06  VMX/altp2m: add code to support EPTP switching and #VE       Ed White       Andrew Cooper     Jun Nakajima     good
>  07  VMX: add VMFUNC leaf 0 (EPTP switching) to emulator          Ravi Sahita    ***               -                *** proposed changes ready, staged for v4
>  08  x86/altp2m: add control of suppress_ve                       Ed White       *** Andrew Cooper -                *** George Dunlap has an alt patch based on v2 7 of 12
>  09  x86/altp2m: alternate p2m memory events                      Ed White       Andrew Cooper     George Dunlap    good
>  10  x86/altp2m: add remaining support routines                   Ed White       Andrew Cooper     -                good?
>  11  x86/altp2m: define and implement alternate p2m HVMOP types   Ed White       ***               -                *** need to create HVMOP sub-types for altp2m
>  12  x86/altp2m: Add altp2mhvm HVM domain parameter               Ed White       Andrew Cooper     -                Wei's comments addressed and staged for v4
>  13  x86/altp2m: XSM hooks for altp2m HVM ops                     Ravi Sahita    Daniel De Graaf   Daniel De Graaf  *** will be impacted by HVMOP changes
> 

Hi Ravi

Thanks for the email. Let me elaborate on this.

First thing, you asked for a freeze extension, but as I understand it
you meant a freeze exception. I will reply on this basis.

Overall the series is of very high quality and implements a very useful feature;
personally I would very much like this feature to be accepted in 4.6,
but I did set out criteria for granting a freeze exception [0]. The
series in question needs to be in a state that's only pending a few
tweaks and publicly endorsed by maintainers, plus other case by case
consideration.

With my limited knowledge of hypervisor side code I can't tell if the
series is very close. But some very long sub-threads prompt me into
thinking there are still issues (big or small).

As far as I can tell, there are several issues here with regard to this
patch series on the hypervisor side:

1. Patch #1 is neither acked nor reviewed.

2. Patch #5, George is looking at locking scheme. Although that patch
   is already reviewed by Andrew and locking scheme backed by Tim, I
   would like a formal ack from George as current P2M maintainer.

3. Patch #7 has issue with regard to instruction emulator.

4. Patch #8 in this version (#7 in previous version, v2) is reviewed,
   but I'm not sure if the concern raised in v2 went away. Judging
   from the date of v3 submission and date of issue raised on v2, the
   issue is still there.

5. Patch #11 is not acked, and because it's ABI matter we need to be
   very careful about it.

6. Patch #13 will be affected by HVM op changes. A new ack is required
   in version 4.

I would need all above issues addressed to consider granting a freeze
exception. I can't speak for hypervisor maintainers. They could ack
your next version, or they could raise more issues.

With my tools maintainer hat on, I think the tools side changes in v3
are trivial and in theory I should be able to just ack them in version
4. I don't think having more patches on the tools side is a good idea;
the chance of patches being accepted in the first round is rather
low. I would suggest that if you post those patches you arrange them at
the end of your series, so that we can apply all the acked patches if
a freeze exception is granted.

Question for you and Ed: how much testing have you done on this
series? I assume you've tested with it turned on and off, to the
point of getting a functioning guest. Have you run any test suites and
got positive results?

Question for hypervisor maintainers: how much risk do you think this
series imposes?  If this feature is off, is the code path more or less
the same as before?  It seems to touch core functionality (P2M). Note
that any bug in that area would block the release.

I do have a hard line in mind -- a freeze exception would have to be
granted in the first week of the freeze (this seems to match your
estimate). I won't consider granting any after that time frame.

All in all, I can't grant this series a freeze exception now, but I
also don't preclude the possibility of doing so.

Wei.

[0]: <E1Z8Rcu-0003v6-7l@ukmail1.uk.xensource.com>

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 03/13] VMX: implement suppress #VE.
  2015-07-01 18:09 ` [PATCH v3 03/13] VMX: implement suppress #VE Ed White
  2015-07-06 17:26   ` George Dunlap
  2015-07-07 18:59   ` Nakajima, Jun
@ 2015-07-09 13:01   ` Jan Beulich
  2015-07-10 19:30     ` Sahita, Ravi
  2 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2015-07-09 13:01 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
> @@ -232,6 +235,15 @@ static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
>      /* Manually set A bit to avoid overhead of MMU having to write it later. */
>      ept_entry->a = 1;
>  
> +    ept_entry->suppress_ve = 1;
> +
> +    table = __map_domain_page(pg);
> +
> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
> +        table[i].suppress_ve = 1;
> +
> +    unmap_domain_page(table);

For the moment I can certainly agree to it being done this way, but
it's inefficient and should be cleaned up: There shouldn't be two
mappings of the page being allocated (one in hap_alloc() and the
other being added here). I suppose the easiest would be to pass
an optional callback pointer to p2m_alloc_ptp(). Or, to also cover
the case below in ept_p2m_init() (i.e. p2m_alloc_table()) a new
optional hook in struct p2m_domain could be added for that purpose.
Albeit ...

> @@ -1134,6 +1151,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
>          p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
>      }
>  
> +    table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
> +
> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
> +        table[i].suppress_ve = 1;
> +
> +    unmap_domain_page(table);

... why is this needed? Bit 63 is documented to be ignored in PML4Es
(just like in all other intermediate page tables).

Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-01 18:09 ` [PATCH v3 05/13] x86/altp2m: basic data structures and support routines Ed White
  2015-07-03 16:22   ` Andrew Cooper
  2015-07-07 15:04   ` George Dunlap
@ 2015-07-09 13:29   ` Jan Beulich
  2015-07-10 21:48     ` Sahita, Ravi
  2015-07-09 15:58   ` George Dunlap
  3 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2015-07-09 13:29 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
> ---
>  xen/arch/x86/hvm/Makefile        |  1 +
>  xen/arch/x86/hvm/altp2m.c        | 92 +++++++++++++++++++++++++++++++++++++

Wouldn't this better go into xen/arch/x86/mm/?

> +int
> +altp2m_vcpu_initialise(struct vcpu *v)
> +{
> +    int rc = -EOPNOTSUPP;
> +
> +    if ( v != current )
> +        vcpu_pause(v);
> +
> +    if ( !hvm_funcs.ap2m_vcpu_initialise ||
> +         (hvm_funcs.ap2m_vcpu_initialise(v) == 0) )
> +    {
> +        rc = 0;

I think you had better honor the error code returned by
hvm_funcs.ap2m_vcpu_initialise() and enter this block only if it is
zero.
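
E.g. (just a sketch, keeping the body below unchanged):

    int rc = 0;

    if ( v != current )
        vcpu_pause(v);

    if ( hvm_funcs.ap2m_vcpu_initialise )
        rc = hvm_funcs.ap2m_vcpu_initialise(v);

    if ( rc == 0 )
    {
        /* ... body as is ... */
    }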

> +        altp2m_vcpu_reset(v);
> +        vcpu_altp2m(v).p2midx = 0;
> +        atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
> +
> +        ap2m_vcpu_update_eptp(v);

We're in vendor independent code here - either the function is
misnamed, or it shouldn't be called directly from here.
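
E.g. it could go through a vendor hook (hypothetical hook name,
mirroring the other ap2m_ hooks in struct hvm_function_table):

    if ( hvm_funcs.ap2m_vcpu_update_eptp )
        hvm_funcs.ap2m_vcpu_update_eptp(v);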

> +void
> +altp2m_vcpu_destroy(struct vcpu *v)
> +{
> +    struct p2m_domain *p2m;
> +
> +    if ( v != current )
> +        vcpu_pause(v);
> +
> +    if ( hvm_funcs.ap2m_vcpu_destroy )
> +        hvm_funcs.ap2m_vcpu_destroy(v);
> +
> +    if ( (p2m = p2m_get_altp2m(v)) )
> +        atomic_dec(&p2m->active_vcpus);

The ordering looks odd - from an abstract perspective I'd expect
p2m_get_altp2m() to not return the p2m anymore that was just
destroyed via hvm_funcs.ap2m_vcpu_destroy().
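I.e. (sketch only) I'd rather expect

    if ( (p2m = p2m_get_altp2m(v)) )
        atomic_dec(&p2m->active_vcpus);

    if ( hvm_funcs.ap2m_vcpu_destroy )
        hvm_funcs.ap2m_vcpu_destroy(v);

or for p2m_get_altp2m() to no longer return the p2m once the vendor
hook has run.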

> +void ap2m_vcpu_update_eptp(struct vcpu *v)
> +{

As I think I said before, I consider these ap2m_ prefixes ambiguous -
the 'a' could also stand for accelerated, advanced, ... Consistently
staying with altp2m_ would seem better.

> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)
>  int hap_enable(struct domain *d, u32 mode)
>  {
>      unsigned int old_pages;
> -    uint8_t i;
> +    uint16_t i;

unsigned int (also elsewhere, including uint8_t-s)

> @@ -498,6 +498,24 @@ int hap_enable(struct domain *d, u32 mode)
>             goto out;
>      }
>  
> +    /* Init alternate p2m data */
> +    if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
> +    {
> +        rv = -ENOMEM;
> +        goto out;
> +    }
> +
> +    for (i = 0; i < MAX_EPTP; i++)
> +        d->arch.altp2m_eptp[i] = INVALID_MFN;

The above again seems EPT-specific in a vendor independent file.

> @@ -196,7 +234,14 @@ int p2m_init(struct domain *d)
>       * (p2m_init runs too early for HVM_PARAM_* options) */
>      rc = p2m_init_nestedp2m(d);
>      if ( rc )
> +    {
>          p2m_teardown_hostp2m(d);
> +        return rc;
> +    }
> +
> +    rc = p2m_init_altp2m(d);
> +    if ( rc )
> +        p2m_teardown_altp2m(d);
>  
>      return rc;
>  }

And why not also p2m_teardown_hostp2m() in this last error case?
And doesn't p2m_init_nestedp2m() need undoing too? (Overall this
suggests the error cleanup structuring here should be changed.)
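E.g. something along the lines of (assuming a p2m_teardown_nestedp2m()
counterpart exists or gets added):

    rc = p2m_init_nestedp2m(d);
    if ( rc )
        goto teardown_host;

    rc = p2m_init_altp2m(d);
    if ( rc )
        goto teardown_nested;

    return 0;

 teardown_nested:
    p2m_teardown_altp2m(d);
    p2m_teardown_nestedp2m(d);
 teardown_host:
    p2m_teardown_hostp2m(d);
    return rc;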

> @@ -294,6 +298,12 @@ struct arch_domain
>      struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
>      mm_lock_t nested_p2m_lock;
>  
> +    /* altp2m: allow multiple copies of host p2m */
> +    bool_t altp2m_active;
> +    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
> +    mm_lock_t altp2m_lock;
> +    uint64_t *altp2m_eptp;

This is a non-insignificant increase of the structure size - perhaps all
of these should hang off of struct arch_domain via a single,
separately allocated pointer?
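E.g. (shape only):

    struct altp2m_domain {
        bool_t active;
        struct p2m_domain *p2m[MAX_ALTP2M];
        mm_lock_t lock;
        uint64_t *eptp;
    };

leaving struct arch_domain with just a single
"struct altp2m_domain *altp2m;" member.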

> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -210,6 +210,14 @@ struct hvm_function_table {
>                                    uint32_t *ecx, uint32_t *edx);
>  
>      void (*enable_msr_exit_interception)(struct domain *d);
> +
> +    /* Alternate p2m */
> +    int (*ap2m_vcpu_initialise)(struct vcpu *v);
> +    void (*ap2m_vcpu_destroy)(struct vcpu *v);
> +    int (*ap2m_vcpu_reset)(struct vcpu *v);

Why is this returning int when altp2m_vcpu_reset() returns void?

> +static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    uint16_t index = vcpu_altp2m(v).p2midx;
> +
> +    return (index == INVALID_ALTP2M) ? NULL : d->arch.altp2m_p2m[index];

It would feel better if you checked against MAX_ALTP2M here.
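I.e. e.g.

    return (index < MAX_ALTP2M) ? d->arch.altp2m_p2m[index] : NULL;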

Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 07/13] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-01 18:09 ` [PATCH v3 07/13] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
  2015-07-03 16:40   ` Andrew Cooper
@ 2015-07-09 14:05   ` Jan Beulich
  1 sibling, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2015-07-09 14:05 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
> @@ -1830,6 +1831,20 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
>      vmx_vmcs_exit(v);
>  }
>  
> +static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
> +{
> +    int rc = X86EMUL_EXCEPTION;
> +    struct vcpu *v = current;

curr

> +    if ( !cpu_has_vmx_vmfunc && altp2m_active(v->domain) &&
> +         regs->eax == 0 &&
> +         p2m_switch_vcpu_altp2m_by_id(v, (uint16_t)regs->ecx) )
> +    {
> +        rc = X86EMUL_OKAY;
> +    }

Pointless braces.

> @@ -2095,6 +2112,12 @@ static void vmx_invlpg_intercept(unsigned long vaddr)
>          vpid_sync_vcpu_gva(curr, vaddr);
>  }
>  
> +static int vmx_vmfunc_intercept(struct cpu_user_regs *regs)
> +{
> +    gdprintk(XENLOG_ERR, "Failed guest VMFUNC execution\n");
> +    return X86EMUL_EXCEPTION;
> +}
> +
>  static int vmx_cr_access(unsigned long exit_qualification)
>  {
>      struct vcpu *curr = current;
> @@ -3245,6 +3268,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>              update_guest_eip();
>          break;
>  
> +    case EXIT_REASON_VMFUNC:
> +        if ( vmx_vmfunc_intercept(regs) == X86EMUL_OKAY )
> +            update_guest_eip();
> +        else
> +            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
> +        break;

The two changes don't fit together (and continue to look pointless
considering that the helper unconditionally returns
X86EMUL_EXCEPTION): != X86EMUL_OKAY doesn't mean
== X86EMUL_EXCEPTION.
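One way to make the two ends match up (sketch only, error handling to
taste):

    case EXIT_REASON_VMFUNC:
        switch ( vmx_vmfunc_intercept(regs) )
        {
        case X86EMUL_OKAY:
            update_guest_eip();
            break;
        case X86EMUL_EXCEPTION:
            hvm_inject_hw_exception(TRAP_invalid_op,
                                    HVM_DELIVER_NO_ERROR_CODE);
            break;
        default:
            domain_crash(current->domain);
            break;
        }
        break;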

> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -3815,28 +3815,40 @@ x86_emulate(
>      case 0x01: /* Grp7 */ {
>          struct segment_register reg;
>          unsigned long base, limit, cr0, cr0w;
> +        uint64_t tsc_aux;
>  
> -        if ( modrm == 0xdf ) /* invlpga */
> +        switch( modrm )
>          {
> -            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
> -            generate_exception_if(!mode_ring0(), EXC_GP, 0);
> -            fail_if(ops->invlpg == NULL);
> -            if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.eax),
> -                                   ctxt)) )
> -                goto done;
> -            break;
> -        }
> -
> -        if ( modrm == 0xf9 ) /* rdtscp */
> -        {
> -            uint64_t tsc_aux;
> -            fail_if(ops->read_msr == NULL);
> -            if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
> -                goto done;
> -            _regs.ecx = (uint32_t)tsc_aux;
> -            goto rdtsc;
> +            case 0xdf: /* invlpga AMD */

Case labels indented the same as the containing switch() please.

> +            case 0xd4: /* vmfunc */
> +                generate_exception_if(
> +                    (lock_prefix |
> +                    rep_prefix() |
> +                    (vex.pfx == vex_66)),
> +                    EXC_UD, -1);

FWIW, while Andrew pointed out that this doesn't match the doc, I
suppose it's the doc that's wrong here, so I would be inclined to
suggest keeping it as is. What I don't like though is the formatting -
why does this need to span across 5 lines?
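E.g.

    generate_exception_if(lock_prefix | rep_prefix() | (vex.pfx == vex_66),
                          EXC_UD, -1);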

> +                fail_if(ops->vmfunc == NULL);
> +                if ( (rc = ops->vmfunc(ctxt) != X86EMUL_OKAY) )
> +                    goto done;
> +                break;
> +            default:
> +                goto continue_grp7;
>          }
> +        break;
>  
> +continue_grp7:

Labels indented by at least one space please.

Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m
  2015-07-09 11:49   ` Wei Liu
@ 2015-07-09 14:14     ` Jan Beulich
  2015-07-09 16:13     ` Sahita, Ravi
  2015-07-09 16:42     ` George Dunlap
  2 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2015-07-09 14:14 UTC (permalink / raw)
  To: Wei Liu
  Cc: Tim Deegan, Ravi Sahita, George Dunlap, Andrew Cooper,
	Ian Jackson, Edmund H White, xen-devel, tlengyel,
	Daniel De Graaf

>>> On 09.07.15 at 13:49, <wei.liu2@citrix.com> wrote:
> As far as I can tell, there are several outstanding issues with this
> patch series on the hypervisor side:
> 
> 1. Patch #1 is neither acked nor reviewed.

The patch is obvious and straightforward - the only reason I didn't
commit it yet is that it would be dead code right now.

> 3. Patch #7 has issues with regard to the instruction emulator.

Those would seem to be easily fixed.

> 4. Patch #8 in this version (#7 in the previous version, v2) is reviewed,
>    but I'm not sure the concern raised against v2 has gone away. Judging
>    from the date of the v3 submission and the date the issue was raised
>    on v2, the issue is still there.

No, the concern did not go away, but George has actually provided a
patch to address it.

> 5. Patch #11 is not acked, and because it's an ABI matter we need to be
>    very careful about it.

Indeed. But directions were given, so it should be mostly a
mechanical thing.

> Question for the hypervisor maintainers: how much risk do you think
> this series imposes?  If the feature is off, is the code path more or
> less the same as before?

It looks like it mostly is, so I don't view the risk as too high.

Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-01 18:09 ` [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
  2015-07-06 10:09   ` Andrew Cooper
@ 2015-07-09 14:34   ` Jan Beulich
  1 sibling, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2015-07-09 14:34 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
> +    case HVMOP_altp2m_set_domain_state:
> +    {
> +        struct xen_hvm_altp2m_domain_state a;
> +        struct domain *d;
> +        struct vcpu *v;
> +        bool_t ostate;
> +
> +        if ( copy_from_guest(&a, arg, 1) )
> +            return -EFAULT;
> +
> +        d = rcu_lock_domain_by_any_id(a.domid);
> +        if ( d == NULL )
> +            return -ESRCH;

Is it indeed intended for a guest to enable this on itself?

> +        rc = -EINVAL;
> +        if ( is_hvm_domain(d) && hvm_altp2m_supported() &&
> +             !nestedhvm_enabled(d) )
> +        {
> +            ostate = d->arch.altp2m_active;
> +            d->arch.altp2m_active = !!a.state;
> +
> +            rc = 0;
> +
> +            /* If the alternate p2m state has changed, handle appropriately */
> +            if ( d->arch.altp2m_active != ostate )
> +            {
> +                if ( ostate || !(rc = p2m_init_altp2m_by_id(d, 0)) )
> +                {

Two if()-s like these should be combined into one.
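I.e. something like

    if ( d->arch.altp2m_active != ostate &&
         (ostate || !(rc = p2m_init_altp2m_by_id(d, 0))) )
    {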

> +    case HVMOP_altp2m_vcpu_enable_notify:
> +    {
> +        struct domain *curr_d = current->domain;
> +        struct vcpu *curr = current;

The other way around please.
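I.e. derive the domain from the already latched vCPU:

    struct vcpu *curr = current;
    struct domain *curr_d = curr->domain;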

> +    case HVMOP_altp2m_set_mem_access:
> +    {
> +        struct xen_hvm_altp2m_set_mem_access a;
> +        struct domain *d;
> +
> +        if ( copy_from_guest(&a, arg, 1) )
> +            return -EFAULT;
> +
> +        d = rcu_lock_domain_by_any_id(a.domid);
> +        if ( d == NULL )
> +            return -ESRCH;

Again - is it intended for this to be invokable by the guest for itself?
If so, is it being made certain that it can't increase access permissions
beyond what its controlling domain dictates?

> --- a/xen/include/public/hvm/hvm_op.h
> +++ b/xen/include/public/hvm/hvm_op.h
> @@ -396,6 +396,75 @@ DEFINE_XEN_GUEST_HANDLE(xen_hvm_evtchn_upcall_vector_t);
>  
>  #endif /* defined(__i386__) || defined(__x86_64__) */
>  
> +/* Set/get the altp2m state for a domain */
> +#define HVMOP_altp2m_set_domain_state     24
> +#define HVMOP_altp2m_get_domain_state     25
> +struct xen_hvm_altp2m_domain_state {
> +    /* Domain to be updated or queried */
> +    domid_t domid;
> +    /* IN or OUT variable on/off */
> +    uint8_t state;
> +};

Not sure if it was said before - explicit padding please (with padding
fields verified to be zero for future extensibility).
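E.g. (illustrative only):

    struct xen_hvm_altp2m_domain_state {
        /* Domain to be updated or queried */
        domid_t domid;
        /* IN or OUT variable on/off */
        uint8_t state;
        /* IN: must be zero */
        uint8_t pad;
    };

with the handler returning -EINVAL when a.pad is non-zero.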

> +struct xen_hvm_altp2m_view {
> +    /* Domain to be updated */
> +    domid_t domid;
> +    /* IN/OUT variable */
> +    uint16_t view;

Is it certain that the number of views will never exceed 64k?

Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 10/13] x86/altp2m: add remaining support routines.
  2015-07-01 18:09 ` [PATCH v3 10/13] x86/altp2m: add remaining support routines Ed White
  2015-07-03 16:56   ` Andrew Cooper
@ 2015-07-09 15:07   ` George Dunlap
  1 sibling, 0 replies; 91+ messages in thread
From: George Dunlap @ 2015-07-09 15:07 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Tim Deegan, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Wed, Jul 1, 2015 at 7:09 PM, Ed White <edmund.h.white@intel.com> wrote:
> Add the remaining routines required to support enabling the alternate
> p2m functionality.
>
> Signed-off-by: Ed White <edmund.h.white@intel.com>
> ---
>  xen/arch/x86/hvm/hvm.c           |  58 +++++-
>  xen/arch/x86/mm/hap/Makefile     |   1 +
>  xen/arch/x86/mm/hap/altp2m_hap.c |  98 ++++++++++
>  xen/arch/x86/mm/p2m-ept.c        |   3 +
>  xen/arch/x86/mm/p2m.c            | 385 +++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/hvm/altp2m.h |   4 +
>  xen/include/asm-x86/p2m.h        |  33 ++++
>  7 files changed, 576 insertions(+), 6 deletions(-)
>  create mode 100644 xen/arch/x86/mm/hap/altp2m_hap.c
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index f21d34d..d2d90c8 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2806,10 +2806,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>      mfn_t mfn;
>      struct vcpu *curr = current;
>      struct domain *currd = curr->domain;
> -    struct p2m_domain *p2m;
> +    struct p2m_domain *p2m, *hostp2m;
>      int rc, fall_through = 0, paged = 0;
>      int sharing_enomem = 0;
>      vm_event_request_t *req_ptr = NULL;
> +    bool_t ap2m_active = 0;
>
>      /* On Nested Virtualization, walk the guest page table.
>       * If this succeeds, all is fine.
> @@ -2869,11 +2870,31 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>          goto out;
>      }
>
> -    p2m = p2m_get_hostp2m(currd);
> -    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma,
> +    ap2m_active = altp2m_active(currd);
> +
> +    /* Take a lock on the host p2m speculatively, to avoid potential
> +     * locking order problems later and to handle unshare etc.
> +     */
> +    hostp2m = p2m_get_hostp2m(currd);
> +    mfn = get_gfn_type_access(hostp2m, gfn, &p2mt, &p2ma,
>                                P2M_ALLOC | (npfec.write_access ? P2M_UNSHARE : 0),
>                                NULL);
>
> +    if ( ap2m_active )
> +    {
> +        if ( altp2m_hap_nested_page_fault(curr, gpa, gla, npfec, &p2m) == 1 )
> +        {
> +            /* entry was lazily copied from host -- retry */
> +            __put_gfn(hostp2m, gfn);
> +            rc = 1;
> +            goto out;
> +        }
> +
> +        mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 0, NULL);
> +    }
> +    else
> +        p2m = hostp2m;
> +
>      /* Check access permissions first, then handle faults */
>      if ( mfn_x(mfn) != INVALID_MFN )
>      {
> @@ -2913,6 +2934,20 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>
>          if ( violation )
>          {
> +            /* Should #VE be emulated for this fault? */
> +            if ( p2m_is_altp2m(p2m) && !cpu_has_vmx_virt_exceptions )
> +            {
> +                bool_t sve;
> +
> +                p2m->get_entry_full(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
> +
> +                if ( !sve && ap2m_vcpu_emulate_ve(curr) )
> +                {
> +                    rc = 1;
> +                    goto out_put_gfn;
> +                }
> +            }
> +
>              if ( p2m_mem_access_check(gpa, gla, npfec, &req_ptr) )
>              {
>                  fall_through = 1;
> @@ -2932,7 +2967,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>           (npfec.write_access &&
>            (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
>      {
> -        put_gfn(currd, gfn);
> +        __put_gfn(p2m, gfn);
> +        if ( ap2m_active )
> +            __put_gfn(hostp2m, gfn);
>
>          rc = 0;
>          if ( unlikely(is_pvh_domain(currd)) )
> @@ -2961,6 +2998,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>      /* Spurious fault? PoD and log-dirty also take this path. */
>      if ( p2m_is_ram(p2mt) )
>      {
> +        rc = 1;
>          /*
>           * Page log dirty is always done with order 0. If this mfn resides in
>           * a large page, we do not change other pages type within that large
> @@ -2969,9 +3007,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>          if ( npfec.write_access )
>          {
>              paging_mark_dirty(currd, mfn_x(mfn));
> +            /* If p2m is really an altp2m, unlock here to avoid lock ordering
> +             * violation when the change below is propagated from host p2m */
> +            if ( ap2m_active )
> +                __put_gfn(p2m, gfn);
>              p2m_change_type_one(currd, gfn, p2m_ram_logdirty, p2m_ram_rw);
> +            __put_gfn(ap2m_active ? hostp2m : p2m, gfn);
> +
> +            goto out;
>          }
> -        rc = 1;
>          goto out_put_gfn;
>      }
>
> @@ -2981,7 +3025,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>      rc = fall_through;
>
>  out_put_gfn:
> -    put_gfn(currd, gfn);
> +    __put_gfn(p2m, gfn);
> +    if ( ap2m_active )
> +        __put_gfn(hostp2m, gfn);
>  out:
>      /* All of these are delayed until we exit, since we might
>       * sleep on event ring wait queues, and we must not hold
> diff --git a/xen/arch/x86/mm/hap/Makefile b/xen/arch/x86/mm/hap/Makefile
> index 68f2bb5..216cd90 100644
> --- a/xen/arch/x86/mm/hap/Makefile
> +++ b/xen/arch/x86/mm/hap/Makefile
> @@ -4,6 +4,7 @@ obj-y += guest_walk_3level.o
>  obj-$(x86_64) += guest_walk_4level.o
>  obj-y += nested_hap.o
>  obj-y += nested_ept.o
> +obj-y += altp2m_hap.o
>
>  guest_walk_%level.o: guest_walk.c Makefile
>         $(CC) $(CFLAGS) -DGUEST_PAGING_LEVELS=$* -c $< -o $@
> diff --git a/xen/arch/x86/mm/hap/altp2m_hap.c b/xen/arch/x86/mm/hap/altp2m_hap.c
> new file mode 100644
> index 0000000..52c7877
> --- /dev/null
> +++ b/xen/arch/x86/mm/hap/altp2m_hap.c
> @@ -0,0 +1,98 @@
> +/******************************************************************************
> + * arch/x86/mm/hap/altp2m_hap.c
> + *
> + * Copyright (c) 2014 Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
> + */
> +
> +#include <asm/domain.h>
> +#include <asm/page.h>
> +#include <asm/paging.h>
> +#include <asm/p2m.h>
> +#include <asm/hap.h>
> +#include <asm/hvm/altp2m.h>
> +
> +#include "private.h"
> +
> +/*
> + * If the fault is for a not present entry:
> + *     if the entry in the host p2m has a valid mfn, copy it and retry
> + *     else indicate that outer handler should handle fault
> + *
> + * If the fault is for a present entry:
> + *     indicate that outer handler should handle fault
> + */
> +
> +int
> +altp2m_hap_nested_page_fault(struct vcpu *v, paddr_t gpa,
> +                                unsigned long gla, struct npfec npfec,
> +                                struct p2m_domain **ap2m)

It looks like even at the end of this series, all that this function
ever does is propagate changes from the host p2m to the guest p2m.

In that case, wouldn't it be better to rename this function something
like "altp2m_propagate_from_hostp2m()" or something like that, so that
people reading the nested page fault handler know there's nothing else
going on?

Also, it appears that this is the only function in this file, even at
the end of the series.  Would it make more sense to include this
function in another file?


> +{
> +    struct p2m_domain *hp2m = p2m_get_hostp2m(v->domain);
> +    p2m_type_t p2mt;
> +    p2m_access_t p2ma;
> +    unsigned int page_order;
> +    gfn_t gfn = _gfn(paddr_to_pfn(gpa));
> +    unsigned long mask;
> +    mfn_t mfn;
> +    int rv;
> +
> +    *ap2m = p2m_get_altp2m(v);
> +
> +    mfn = get_gfn_type_access(*ap2m, gfn_x(gfn), &p2mt, &p2ma,
> +                              0, &page_order);
> +    __put_gfn(*ap2m, gfn_x(gfn));
> +
> +    if ( mfn_x(mfn) != INVALID_MFN )
> +        return 0;
> +
> +    mfn = get_gfn_type_access(hp2m, gfn_x(gfn), &p2mt, &p2ma,
> +                              P2M_ALLOC | P2M_UNSHARE, &page_order);
> +    put_gfn(hp2m->domain, gfn_x(gfn));

Why are we calling put_gfn here instead of __put_gfn (as above)?
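
I.e. presumably this could just be

    __put_gfn(hp2m, gfn_x(gfn));

to match the altp2m case above.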

--

These aren't blockers, I don't think, but it would be nice to have
them addressed if possible.

Also, just some feedback on the patch organization here, for future
reference: introducing all these "support routines" that are not yet
called anywhere makes the series difficult to review.  If I don't know
where they're going to be called from, how can I know whether they're
doing what they're supposed to do?  It seems like introducing them as
needed would make review easier, as the reviewer can then go through
the whole path in which they're used and decide (for instance) whether
the boundaries of the functions make sense.

Everything that is functional as of this patch looks good from a
functional perspective; just the naming / file comments above.  I may
have comments on the non-functional bits when I find out how they're
used later.

 -George

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-01 18:09 ` [PATCH v3 05/13] x86/altp2m: basic data structures and support routines Ed White
                     ` (2 preceding siblings ...)
  2015-07-09 13:29   ` Jan Beulich
@ 2015-07-09 15:58   ` George Dunlap
  3 siblings, 0 replies; 91+ messages in thread
From: George Dunlap @ 2015-07-09 15:58 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Tim Deegan, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Wed, Jul 1, 2015 at 7:09 PM, Ed White <edmund.h.white@intel.com> wrote:
> diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
> index b4f035e..301ca59 100644
> --- a/xen/arch/x86/mm/mm-locks.h
> +++ b/xen/arch/x86/mm/mm-locks.h
> @@ -217,7 +217,7 @@ declare_mm_lock(nestedp2m)
>  #define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)->arch.nested_p2m_lock)
>  #define nestedp2m_unlock(d) mm_unlock(&(d)->arch.nested_p2m_lock)
>
> -/* P2M lock (per-p2m-table)
> +/* P2M lock (per-non-alt-p2m-table)
>   *
>   * This protects all queries and updates to the p2m table.
>   * Queries may be made under the read lock but all modifications
> @@ -225,10 +225,44 @@ declare_mm_lock(nestedp2m)
>   *
>   * The write lock is recursive as it is common for a code path to look
>   * up a gfn and later mutate it.
> + *
> + * Note that this lock shares its implementation with the altp2m
> + * lock (not the altp2m list lock), so the implementation
> + * is found there.
>   */
>
>  declare_mm_rwlock(p2m);
> -#define p2m_lock(p)           mm_write_lock(p2m, &(p)->lock);
> +
> +/* Alternate P2M list lock (per-domain)
> + *
> + * A per-domain lock that protects the list of alternate p2m's.
> + * Any operation that walks the list needs to acquire this lock.
> + * Additionally, before destroying an alternate p2m all VCPU's
> + * in the target domain must be paused.
> + */
> +
> +declare_mm_lock(altp2mlist)
> +#define altp2m_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_lock)
> +#define altp2m_unlock(d) mm_unlock(&(d)->arch.altp2m_lock)

This is unlocking an altp2mlist lock type; and the lock above is
protecting an altp2m list, not an altp2m.  Please use
"altp2m_list_lock", both for the lock itself and the macro that locks
it.

Also, even after reading Tim's comment and going through the whole
series, I still don't have a clear idea of the circumstances in which
the various locks (p2m, altp2mlist, and altp2m) will be acquired after
one another, such that this "sandwiched" lock structure is required.

Please include a brief description in a comment of what codepaths might
cause different pairs of locks to be acquired, so that someone coming
and looking at this code doesn't have to try to figure it out
themselves.

Everything else looks OK to me.  Thanks.

 -George

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m
  2015-07-09 11:49   ` Wei Liu
  2015-07-09 14:14     ` Jan Beulich
@ 2015-07-09 16:13     ` Sahita, Ravi
  2015-07-09 16:20       ` Ian Campbell
  2015-07-09 16:21       ` Wei Liu
  2015-07-09 16:42     ` George Dunlap
  2 siblings, 2 replies; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-09 16:13 UTC (permalink / raw)
  To: Wei Liu
  Cc: George Dunlap, Andrew Cooper, Tim Deegan, White, Edmund H,
	xen-devel, Jan Beulich, tlengyel, Daniel De Graaf, Ian Jackson

>From: Wei Liu [mailto:wei.liu2@citrix.com]
>Sent: Thursday, July 09, 2015 4:49 AM
>
>Question for you and Ed: how much testing have you done on this series? I
>assume you've tested with it turned on and off, to the point of getting a
>functioning guest. Have you run any test suites and gotten positive results?

Yes, we have tested with Windows HVM guests and got positive results,
and Tamas' tests based on his tools patches have passed as well.

>
>[0]: <E1Z8Rcu-0003v6-7l@ukmail1.uk.xensource.com>

This ref link was messed up

Thanks,
Ravi

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m
  2015-07-09 16:13     ` Sahita, Ravi
@ 2015-07-09 16:20       ` Ian Campbell
  2015-07-09 16:21       ` Wei Liu
  1 sibling, 0 replies; 91+ messages in thread
From: Ian Campbell @ 2015-07-09 16:20 UTC (permalink / raw)
  To: Sahita, Ravi
  Cc: Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan, White,
	Edmund H, xen-devel, Jan Beulich, tlengyel, Daniel De Graaf,
	Ian Jackson

On Thu, 2015-07-09 at 16:13 +0000, Sahita, Ravi wrote:
> >[0]: <E1Z8Rcu-0003v6-7l@ukmail1.uk.xensource.com>
> 
> This ref link was messed up

It was a message-id; you can paste it into the URL of your favourite
ML archive that supports such things, e.g.
http://mid.gmane.org/<E1Z8Rcu-0003v6-7l@ukmail1.uk.xensource.com>

Ian.

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m
  2015-07-09 16:13     ` Sahita, Ravi
  2015-07-09 16:20       ` Ian Campbell
@ 2015-07-09 16:21       ` Wei Liu
  1 sibling, 0 replies; 91+ messages in thread
From: Wei Liu @ 2015-07-09 16:21 UTC (permalink / raw)
  To: Sahita, Ravi
  Cc: Wei Liu, George Dunlap, Tim Deegan, Ian Jackson, White, Edmund H,
	xen-devel, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

On Thu, Jul 09, 2015 at 04:13:20PM +0000, Sahita, Ravi wrote:
> >From: Wei Liu [mailto:wei.liu2@citrix.com]
> >Sent: Thursday, July 09, 2015 4:49 AM
> >
> >Question for you and Ed: how much testing have you done on this series? I
> >assume you've tested with it turned on and off, to the point of getting a
> >functioning guest. Have you run any test suites and gotten positive results?
> 
> Yes, we have tested with Windows HVM guests and got positive results,
> and Tamas' tests based on his tools patches have passed as well.
> 
> >
> >[0]: <E1Z8Rcu-0003v6-7l@ukmail1.uk.xensource.com>
> 
> This ref link was messed up
> 

That's the message-id of my email.

Here is a link to the archive:

http://www.gossamer-threads.com/lists/xen/devel/386386

It's just my two-week reminder email, which has information on how to
ask for a freeze exception. I believe you saw it already.

Wei.

> Thanks,
> Ravi

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m
  2015-07-09 11:49   ` Wei Liu
  2015-07-09 14:14     ` Jan Beulich
  2015-07-09 16:13     ` Sahita, Ravi
@ 2015-07-09 16:42     ` George Dunlap
  2 siblings, 0 replies; 91+ messages in thread
From: George Dunlap @ 2015-07-09 16:42 UTC (permalink / raw)
  To: Wei Liu, Sahita, Ravi
  Cc: Andrew Cooper, Tim Deegan, White, Edmund H, xen-devel,
	Jan Beulich, tlengyel, Daniel De Graaf, Ian Jackson

On 07/09/2015 12:49 PM, Wei Liu wrote:
> On Wed, Jul 08, 2015 at 06:35:33PM +0000, Sahita, Ravi wrote:
>> Hi Wei,
>>
>> Given where we stand (close) on the altp2m patch series - we would like to request an extension of about a week to complete the last couple of patches in the series for inclusion in 4.6. 
>> Some of the suggestions may have implications on other patches and on our tests, hence asking for the extension (we know what we need to change). 
>>
>> Hope that is acceptable. This is a quick current status snapshot of v3: 
>> (this doesn't have a couple tools patches that Tamas has contributed but those will be included in rev4)
>>
>> altp2m series patch v3 status
>> 0	[PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m				Sign-off	Reviewed		Acked			Status
>> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> 1	[Xen-devel] [PATCH v3 01/13] common/domain: Helpers to pause a domain while in context	Andrew Cooper						good?
>> 2	[Xen-devel] [PATCH v3 02/13] VMX: VMFUNC and #VE definitions and detection.		Ed White	Andrew Cooper	Jun Nakajima		good
>> 3	[Xen-devel] [PATCH v3 03/13] VMX: implement suppress #VE.					Ed White	Andrew Cooper	Jun Nakajima		good
>> 4	[Xen-devel] [PATCH v3 04/13] x86/HVM: Hardware alternate p2m support detection		Ed White	Andrew Cooper				good?
>> 5	[Xen-devel] [PATCH v3 05/13] x86/altp2m: basic data structures and support routines		Ed White	Andrew Cooper				Locking being reviewed by George Dunlap
>> 6	[Xen-devel] [PATCH v3 06/13] VMX/altp2m: add code to support EPTP switching and #VE	Ed White	Andrew Cooper	Jun Nakajima		good
>> 7	[Xen-devel] [PATCH v3 07/13] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator		Ravi Sahita	***						***Proposed Changes ready staged for v4
>> 8	[Xen-devel] [PATCH v3 08/13] x86/altp2m: add control of suppress_ve				Ed White	***Andrew Cooper				*** George Dunlap has an alt patch based on v2 7 of 12
>> 9	[Xen-devel] [PATCH v3 09/13] x86/altp2m: alternate p2m memory events			Ed White	Andrew Cooper	George Dunlap		good
>> 10	[Xen-devel] [PATCH v3 10/13] x86/altp2m: add remaining support routines			Ed White	Andrew Cooper					good?
>> 11	[Xen-devel] [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP type 	Ed White	***						*** Need to create HVMOP sub-types for altp2m
>> 12	[Xen-devel] [PATCH v3 12/13] x86/altp2m: Add altp2mhvm HVM domain parameter		Ed White	Andrew Cooper					Wei's comments addressed and staged for v4
>> 13	[Xen-devel] [PATCH v3 13/13] x86/altp2m: XSM hooks for altp2m HVM ops			Ravi Sahita	Daniel De Graaf		Daniel De Graaf		*** Will be impacted by HVMOP changes
>>
> 
> Hi Ravi
> 
> Thanks for the email. Let me elaborate on this.
> 
> First thing: you asked for a freeze extension, but as I understand it
> you meant a freeze exception. I will reply on that basis.
> 
> Overall the series is of very high quality and implements a very
> useful feature; personally I would very much like to see it accepted
> in 4.6. But I did set out criteria for granting a freeze exception
> [0]: the series in question needs to be in a state where it's only
> pending a few tweaks and is publicly endorsed by maintainers, plus
> other case-by-case considerations.
> 
> With my limited knowledge of the hypervisor side code I can't tell
> whether the series is very close. But some very long sub-threads lead
> me to think there are still issues (big or small).
> 
> As far as I can tell, there are several outstanding issues with this
> patch series on the hypervisor side:
> 
> 1. Patch #1 is neither acked nor reviewed.
> 
> 2. Patch #5: George is looking at the locking scheme. Although that
>    patch has already been reviewed by Andrew and the locking scheme is
>    backed by Tim, I would like a formal ack from George as the current
>    P2M maintainer.

I expect that the locking is probably fine, but there definitely needs
to be a comment in the code explaining the codepaths where the order
becomes important, and why we need this 3-level "sandwiched" locking
scheme.

> Question for the hypervisor maintainers: how much risk do you think
> this series imposes?  If the feature is off, is the code path more or
> less the same as before?  It seems to touch core functionality (P2M).
> Note that any bug in that area would block the release.

So there's no reason that it should be high risk; in theory the feature
is off by default, and if it's off, the only changes to existing paths
should consist of
1. Checking to see if it's enabled, and when it finds it's not, skipping
all the new code
2. Going about setting bit 63 in all the ept entries, and dealing with
the fact that it's probably going to be set now.

Neither of those are very high risk; and the second path will be tested
millions of times in a single vm-boot-shutdown smoke test.

However, the series as it stands has more impact on existing paths than
it needs to.  It seems that a bunch of structures are pro-actively
initialized and allocated for each domain created, even if the entire
feature is disabled at boot.  Additionally, a lot of the set-up and
tear-down operations could be skipped completely if
d->arch.altp2m_active == 0.

In the interest of making this easier to accept during a code freeze, I
suggest going through the code again with an eye to lowering the impact.
I don't think you need to spend a great deal of effort or do any major
refactoring; there should be a lot of "low-hanging fruit" where you can
just say "if (!d->arch.altp2m_active) return;".

Additionally, at the very least you should gate allocating and
initializing the altp2m tables and eptp on hvm_funcs.altp2m_supported;
better yet, gate it on the feature having been enabled for the domain
at creation.

The second would mean, I guess, calling the initialization code when
the parameter is set; that looks to be in line with what nestedhvm does.

Use your best judgement to find the balance between limiting the impact
and spending more time changing the code.

Note also that I'll be working tomorrow, doing code review, but that
I'll be out of the office Monday and Friday next week.

 -George

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-07 16:19       ` Ed White
  2015-07-08 13:52         ` George Dunlap
@ 2015-07-09 17:05         ` Sahita, Ravi
  2015-07-10 16:35           ` George Dunlap
  1 sibling, 1 reply; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-09 17:05 UTC (permalink / raw)
  To: White, Edmund H, Tim Deegan, George Dunlap
  Cc: Wei Liu, Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	tlengyel, Daniel De Graaf


>From: White, Edmund H
>Sent: Tuesday, July 07, 2015 9:20 AM
>
>On 07/07/2015 08:22 AM, Tim Deegan wrote:
>> At 16:04 +0100 on 07 Jul (1436285059), George Dunlap wrote:
>>> On 07/01/2015 07:09 PM, Ed White wrote:
>>>> diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-
>locks.h
>>>> index b4f035e..301ca59 100644
>>>> --- a/xen/arch/x86/mm/mm-locks.h
>>>> +++ b/xen/arch/x86/mm/mm-locks.h
>>>> @@ -217,7 +217,7 @@ declare_mm_lock(nestedp2m)
>>>>  #define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)-
>>arch.nested_p2m_lock)
>>>>  #define nestedp2m_unlock(d) mm_unlock(&(d)-
>>arch.nested_p2m_lock)
>>>>
>>>> -/* P2M lock (per-p2m-table)
>>>> +/* P2M lock (per-non-alt-p2m-table)
>>>>   *
>>>>   * This protects all queries and updates to the p2m table.
>>>>   * Queries may be made under the read lock but all modifications @@
>>>> -225,10 +225,44 @@ declare_mm_lock(nestedp2m)
>>>>   *
>>>>   * The write lock is recursive as it is common for a code path to look
>>>>   * up a gfn and later mutate it.
>>>> + *
>>>> + * Note that this lock shares its implementation with the altp2m
>>>> + * lock (not the altp2m list lock), so the implementation
>>>> + * is found there.
>>>>   */
>>>>
>>>>  declare_mm_rwlock(p2m);
>>>> -#define p2m_lock(p)           mm_write_lock(p2m, &(p)->lock);
>>>> +
>>>> +/* Alternate P2M list lock (per-domain)
>>>> + *
>>>> + * A per-domain lock that protects the list of alternate p2m's.
>>>> + * Any operation that walks the list needs to acquire this lock.
>>>> + * Additionally, before destroying an alternate p2m all VCPU's
>>>> + * in the target domain must be paused.
>>>> + */
>>>> +
>>>> +declare_mm_lock(altp2mlist)
>>>> +#define altp2m_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_lock)
>>>> +#define altp2m_unlock(d) mm_unlock(&(d)->arch.altp2m_lock)
>>>> +
>>>> +/* P2M lock (per-altp2m-table)
>>>> + *
>>>> + * This protects all queries and updates to the p2m table.
>>>> + * Queries may be made under the read lock but all modifications
>>>> + * need the main (write) lock.
>>>> + *
>>>> + * The write lock is recursive as it is common for a code path to
>>>> +look
>>>> + * up a gfn and later mutate it.
>>>> + */
>>>> +
>>>> +declare_mm_rwlock(altp2m);
>>>> +#define p2m_lock(p)                         \
>>>> +{                                           \
>>>> +    if ( p2m_is_altp2m(p) )                 \
>>>> +        mm_write_lock(altp2m, &(p)->lock);  \
>>>> +    else                                    \
>>>> +        mm_write_lock(p2m, &(p)->lock);     \
>>>> +}
>>>>  #define p2m_unlock(p)         mm_write_unlock(&(p)->lock);
>>>>  #define gfn_lock(p,g,o)       p2m_lock(p)
>>>>  #define gfn_unlock(p,g,o)     p2m_unlock(p)
>>>
>>> I've just taken on the role of mm maintainer, and so this late in a
>>> series, having Tim's approval and also having Andy's reviewed-by, I'd
>>> normally just skim through and Ack it as a matter of course.  But I
>>> just wouldn't feel right giving this my maintainer's ack without
>>> understanding the locking a bit better.  So please bear with me here.
>>>
>>> I see in the cover letter that you "sandwiched" the altp2mlist lock
>>> between p2m and altp2m at Tim's suggestion.  But I can't find the
>>> discussion where that was suggested (it doesn't seem to be in
>>> response to v1 patch 5),
>>
>> I suggested changing the locking order here:
>> http://lists.xenproject.org/archives/html/xen-devel/2015-01/msg01824.h
>> tml
>>
>> Cheers,
>>
>> Tim.
>>
>
>And Tim, Andrew and I subsequently discussed this specific approach in a
>phone meeting.
>
>Ed

Here is a brief description of the approach that was taken -

The altp2m design implements a set of p2m's that are derived from the
host p2m, by lazy copy, and can then have specific modifications
applied. If a change is made to the host p2m when the domain is in
altp2m mode, that change must be propagated to any altp2m that
contains a valid copy of the old host p2m entry, primarily to
guarantee that no altp2m maps a mfn that is not mapped by the host
p2m, as that would compromise Xen.

Tim suggested that the only safe way to perform that propagation would
be in the code that writes the new EPTE, but by that point there is
invariably a lock on the host p2m, and that lock was frequently
acquired some way higher in the call stack. It is not safe to walk the
list of altp2m's without holding the list lock, so in that situation
the acquisition order is hostp2m->altp2mlist->altp2m.

Although Ed did the implementation, the idea to split the p2m lock
type using the macros in mm-locks.h was suggested by Tim and Andrew in
a phone meeting attended by Tim, Andrew, Ravi, and Ed in early May.
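
In code terms the propagation path looks roughly like this (simplified
sketch with made-up locals - not the literal patch):

    /* ept_set_entry(): runs with the host p2m (write) lock held */
    struct domain *d = p2m->domain;
    unsigned int i;

    if ( entry_written && p2m_is_hostp2m(p2m) && altp2m_active(d) )
    {
        altp2m_lock(d);               /* 2nd: the altp2m list lock */
        for ( i = 0; i < MAX_ALTP2M; i++ )
        {
            struct p2m_domain *ap2m = d->arch.altp2m_p2m[i];

            if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
                continue;

            p2m_lock(ap2m);           /* 3rd: the individual altp2m */
            /* ... update or invalidate the stale altp2m entry ... */
            p2m_unlock(ap2m);
        }
        altp2m_unlock(d);
    }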

Cheers,
Ravi

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-09 17:05         ` Sahita, Ravi
@ 2015-07-10 16:35           ` George Dunlap
  2015-07-10 22:11             ` Sahita, Ravi
  0 siblings, 1 reply; 91+ messages in thread
From: George Dunlap @ 2015-07-10 16:35 UTC (permalink / raw)
  To: Sahita, Ravi, White, Edmund H, Tim Deegan
  Cc: Wei Liu, Andrew Cooper, Ian Jackson, xen-devel, Jan Beulich,
	tlengyel, Daniel De Graaf

On 07/09/2015 06:05 PM, Sahita, Ravi wrote:
>> And Tim, Andrew and I subsequently discussed this specific approach in a
>> phone meeting.
>>
>> Ed
> 
> Here is a brief description of the approach that was taken - 
> 
> The altp2m design implements a set of p2m's that are derived from the
> host p2m, by lazy copy, and can then have specific modifications
> applied. If a change is made to the host p2m when the domain is in
> altp2m mode, that change must be propagated to any altp2m that
> contains a valid copy of the old host p2m entry, primarily to
> guarantee that no altp2m maps a mfn that is not mapped by the host
> p2m, as that would compromise Xen. Tim suggested that the only safe
> way to perform that propagation would be in the code that writes the
> new EPTE, but by that point there is invariably a lock on the host
> p2m, and that lock was frequently acquired some way higher in the
> call stack. It is not safe to walk the list of altp2m's without
> holding the list lock, so in that situation the acquisition order is
> hostp2m->altp2mlist->altp2m. Although Ed did the implementation, the
> idea to split the p2m lock type using the macros in mm-locks.h was
> suggested by Tim and Andrew in a phone meeting attended by Tim,
> Andrew, Ravi, and Ed in early May.

Thanks for the description.  I'd summarize it this way:

Changes made to the host p2m when in altp2m mode are propagated to the
altp2ms synchronously in ept_set_entry().  At that point, we will hold
the host p2m lock; propagating this change involves grabbing the
altp2m_list lock, and the locks of the individual alternate p2ms.  In
order to allow us to maintain locking order discipline, we split the p2m
lock into p2m (for host p2ms) and altp2m (for alternate p2ms), putting
the altp2mlist lock in the middle.

Is that about right?

If so, please add the above paragraph, or something with the same basic
information, in the comments in mm-locks.h.
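
Something along these lines, perhaps:

    /*
     * Changes made to the host p2m when in altp2m mode are propagated
     * to the altp2ms synchronously in ept_set_entry().  At that point
     * we hold the host p2m lock; propagating this change involves
     * grabbing the altp2m_list lock, and the locks of the individual
     * alternate p2ms.  In order to allow us to maintain locking order
     * discipline, we split the p2m lock into p2m (for host p2ms) and
     * altp2m (for alternate p2ms), putting the altp2mlist lock in the
     * middle.
     */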

Thanks,
 -George

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 03/13] VMX: implement suppress #VE.
  2015-07-09 13:01   ` Jan Beulich
@ 2015-07-10 19:30     ` Sahita, Ravi
  2015-07-13  7:40       ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-10 19:30 UTC (permalink / raw)
  To: Jan Beulich, White, Edmund H
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	xen-devel, tlengyel, Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 09, 2015 6:01 AM
>
>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>> @@ -232,6 +235,15 @@ static int ept_set_middle_entry(struct p2m_domain
>*p2m, ept_entry_t *ept_entry)
>>      /* Manually set A bit to avoid overhead of MMU having to write it later.
>*/
>>      ept_entry->a = 1;
>>
>> +    ept_entry->suppress_ve = 1;
>> +
>> +    table = __map_domain_page(pg);
>> +
>> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
>> +        table[i].suppress_ve = 1;
>> +
>> +    unmap_domain_page(table);
>
>For the moment I can certainly agree to it being done this way, but it's

On this one I'm hoping you are ok with the way the code is structured now.

>inefficient and should be cleaned up: There shouldn't be two mappings of the
>page being allocated (one in hap_alloc() and the other being added here). I
>suppose the easiest would be to pass an optional callback pointer to
>p2m_alloc_ptp(). Or, to also cover the case below in ept_p2m_init() (i.e.
>p2m_alloc_table()) a new optional hook in struct p2m_domain could be added
>for that purpose.
>Albeit ...
>
>> @@ -1134,6 +1151,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
>>          p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
>>      }
>>
>> +    table =
>> + map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
>> +
>> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
>> +        table[i].suppress_ve = 1;
>> +
>> +    unmap_domain_page(table);
>
>... why is this needed? Bit 63 is documented to be ignored in PML4Es (just like
>in all other intermediate page tables).

Valid point - this has no negative side-effects per se so we didn't change this.

Ravi

>
>Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-09 13:29   ` Jan Beulich
@ 2015-07-10 21:48     ` Sahita, Ravi
  2015-07-13  8:01       ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-10 21:48 UTC (permalink / raw)
  To: Jan Beulich, White, Edmund H
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, tlengyel, Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 09, 2015 6:30 AM
>
>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>> ---
>>  xen/arch/x86/hvm/Makefile        |  1 +
>>  xen/arch/x86/hvm/altp2m.c        | 92
>+++++++++++++++++++++++++++++++++++++
>
>Wouldn't this better go into xen/arch/x86/mm/?

In this case we followed the pattern of nestedhvm - hope that's ok.

>
>> +int
>> +altp2m_vcpu_initialise(struct vcpu *v) {
>> +    int rc = -EOPNOTSUPP;
>> +
>> +    if ( v != current )
>> +        vcpu_pause(v);
>> +
>> +    if ( !hvm_funcs.ap2m_vcpu_initialise ||
>> +         (hvm_funcs.ap2m_vcpu_initialise(v) == 0) )
>> +    {
>> +        rc = 0;
>
>I think you would better honor the error code returned by
>hvm_funcs.ap2m_vcpu_initialise() and enter this block only if it was zero.

The code is checking that condition - did I misinterpret?

>
>> +        altp2m_vcpu_reset(v);
>> +        vcpu_altp2m(v).p2midx = 0;
>> +        atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
>> +
>> +        ap2m_vcpu_update_eptp(v);
>
>We're in vendor independent code here - either the function is misnamed, or
>it shouldn't be called directly from here.

Would it be reasonable to add hap_enabled() and cpu_has_vmx checks,
like the other code in this file that invokes EPT-specific ops?
Otherwise it seems OK for the function to be called from here for
p2m/altp2m interactions such as switching the altp2m by id, etc.
We're open to any other suggestions from you; otherwise we would like
to leave it as it is.
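
For illustration, the check we have in mind would be something like
(sketch only):

    if ( hap_enabled(d) && cpu_has_vmx )
        ap2m_vcpu_update_eptp(v);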

>
>> +void
>> +altp2m_vcpu_destroy(struct vcpu *v)
>> +{
>> +    struct p2m_domain *p2m;
>> +
>> +    if ( v != current )
>> +        vcpu_pause(v);
>> +
>> +    if ( hvm_funcs.ap2m_vcpu_destroy )
>> +        hvm_funcs.ap2m_vcpu_destroy(v);
>> +
>> +    if ( (p2m = p2m_get_altp2m(v)) )
>> +        atomic_dec(&p2m->active_vcpus);
>
>The ordering looks odd - from an abstract perspective I'd expect
>p2m_get_altp2m() to not return the p2m anymore that was just destroyed via
>hvm_funcs.ap2m_vcpu_destroy().
>

ap2m_vcpu_destroy is for destroying vCPU context related to altp2m -
note this is not implemented, since it's not needed for the Intel
implementation. The idea is that if something needs to be done
specifically for AMD, it could be done here.

>> +void ap2m_vcpu_update_eptp(struct vcpu *v) {
>
>As I think I said before, I consider these ap2m_ prefixes ambiguous - the 'a'
>could also stand for accelerated, advanced, ... Consistently staying with
>altp2m_ would seem better.
>

We have a comment above the list of these ap2m_ functions in hvm.h
stating that they are for alternate p2m - do you feel strongly about
us changing this naming? Also, this is the interface naming, and if we
renamed them altp2m_xxx it would cause confusion with the actual
altp2m_xxx functionality - so we would like to leave it as proposed.

>> --- a/xen/arch/x86/mm/hap/hap.c
>> +++ b/xen/arch/x86/mm/hap/hap.c
>> @@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)  int
>> hap_enable(struct domain *d, u32 mode)  {
>>      unsigned int old_pages;
>> -    uint8_t i;
>> +    uint16_t i;
>
>unsigned int (also elsewhere, including uint8_t-s)

We used the iterator types that were already in use (uint8_t was being
used in hap_final_teardown). If you feel strongly we could change it,
but that would mean touching code we didn't otherwise need to touch
for this patch.

>
>> @@ -498,6 +498,24 @@ int hap_enable(struct domain *d, u32 mode)
>>             goto out;
>>      }
>>
>> +    /* Init alternate p2m data */
>> +    if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
>> +    {
>> +        rv = -ENOMEM;
>> +        goto out;
>> +    }
>> +
>> +    for (i = 0; i < MAX_EPTP; i++)
>> +        d->arch.altp2m_eptp[i] = INVALID_MFN;
>
>The above again seems EPT-specific in a vendor independent file.

See above comment.

>
>> @@ -196,7 +234,14 @@ int p2m_init(struct domain *d)
>>       * (p2m_init runs too early for HVM_PARAM_* options) */
>>      rc = p2m_init_nestedp2m(d);
>>      if ( rc )
>> +    {
>>          p2m_teardown_hostp2m(d);
>> +        return rc;
>> +    }
>> +
>> +    rc = p2m_init_altp2m(d);
>> +    if ( rc )
>> +        p2m_teardown_altp2m(d);
>>
>>      return rc;
>>  }
>
>And why not also p2m_teardown_hostp2m() in this last error case?
>And doesn't p2m_init_nestedp2m() need undoing too? (Overall this
>suggests the error cleanup structuring here should be changed.)

Sounds right - we will make this change.

>
>> @@ -294,6 +298,12 @@ struct arch_domain
>>      struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
>>      mm_lock_t nested_p2m_lock;
>>
>> +    /* altp2m: allow multiple copies of host p2m */
>> +    bool_t altp2m_active;
>> +    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
>> +    mm_lock_t altp2m_lock;
>> +    uint64_t *altp2m_eptp;
>
>This is a non-insignificant increase of the structure size - perhaps all
>of these should hang off of struct arch_domain via a single,
>separately allocated pointer?

Is this a nice-to-have? Again, we modelled this after the nestedhvm
extensions to the struct. It would affect a lot of our patch without
really changing how much memory is allocated.

>
>> --- a/xen/include/asm-x86/hvm/hvm.h
>> +++ b/xen/include/asm-x86/hvm/hvm.h
>> @@ -210,6 +210,14 @@ struct hvm_function_table {
>>                                    uint32_t *ecx, uint32_t *edx);
>>
>>      void (*enable_msr_exit_interception)(struct domain *d);
>> +
>> +    /* Alternate p2m */
>> +    int (*ap2m_vcpu_initialise)(struct vcpu *v);
>> +    void (*ap2m_vcpu_destroy)(struct vcpu *v);
>> +    int (*ap2m_vcpu_reset)(struct vcpu *v);
>
>Why is this returning int when altp2m_vcpu_reset() returns void?

Currently altp2m_vcpu_reset() cannot fail, but at the HVM interface
level we wanted to allow someone to change that in the future, so the
interface allows for a return code.

>
>> +static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
>> +{
>> +    struct domain *d = v->domain;
>> +    uint16_t index = vcpu_altp2m(v).p2midx;
>> +
>> +    return (index == INVALID_ALTP2M) ? NULL : d-
>>arch.altp2m_p2m[index];
>
>It would feel better if you checked against MAX_ALTP2M here.

There is no way for p2midx to be >= MAX_ALTP2M without being INVALID_ALTP2M.

Thanks,
Ravi

>
>Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-10 16:35           ` George Dunlap
@ 2015-07-10 22:11             ` Sahita, Ravi
  0 siblings, 0 replies; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-10 22:11 UTC (permalink / raw)
  To: George Dunlap, White, Edmund H, Tim Deegan
  Cc: Sahita, Ravi, Wei Liu, Andrew Cooper, Ian Jackson, xen-devel,
	Jan Beulich, tlengyel, Daniel De Graaf

>From: George Dunlap [mailto:george.dunlap@eu.citrix.com]
>Sent: Friday, July 10, 2015 9:35 AM
>
>On 07/09/2015 06:05 PM, Sahita, Ravi wrote:
>>> And Tim, Andrew and I subsequently discussed this specific approach
>>> in a phone meeting.
>>>
>>> Ed
>>
>> Here is a brief description of the approach that was taken -
>>
>> The altp2m design implements a set of p2m's that are derived from the
>> host p2m, by lazy copy, and can then have specific modifications
>> applied. If a change is made to the host p2m when the domain is in
>> altp2m mode, that change must be propagated to any altp2m that
>> contains a valid copy of the old host p2m entry, primarily to
>> guarantee that no altp2m maps a mfn that is not mapped by the host
>> p2m, as that would compromise Xen. Tim suggested that the only safe
>> way to perform that propagation would be in the code that writes the
>> new EPTE, but by that point there is invariably a lock on the host
>> p2m, and that lock was frequently acquired some way higher in the call
>> stack. It is not safe to walk the list of altp2m's without holding the
>> list lock, so in that situation the acquisition order is
>> hostp2m->altp2mlist->altp2m. Although Ed did the implementation, the
>> idea to split the p2m lock type using the macros in mm-locks.h was
>> suggested by Tim and Andrew in a phone meeting attended by Tim,
>Andrew, Ravi, and Ed in early May.
>
>Thanks for the description.  I'd summarize it this way:
>
>Changes made to the host p2m when in altp2m mode are propagated to the
>altp2ms synchronously in ept_set_entry().  At that point, we will hold the host
>p2m lock; propagating this change involves grabbing the altp2m_list lock, and
>the locks of the individual alternate p2ms.  In order to allow us to maintain
>locking order discipline, we split the p2m lock into p2m (for host p2ms) and
>altp2m (for alternate p2ms), putting the altp2mlist lock in the middle.
>
>Is that about right?

Yes that's fine.

>
>If so, please add the above paragraph, or something with the same basic
>information, in the comments in mm-locks.h.

Ok.
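
[Editor's note: roughly, the comment added to mm-locks.h might then read as
below - a sketch based on George's paragraph above, not the final wording:]

    /*
     * Changes made to the host p2m when in altp2m mode are propagated
     * to the altp2ms synchronously in ept_set_entry().  At that point
     * we hold the host p2m lock; propagating the change involves
     * grabbing the altp2m_list lock and the locks of the individual
     * alternate p2ms.  To allow us to maintain locking order
     * discipline, the p2m lock is split into p2m (for host p2ms) and
     * altp2m (for alternate p2ms), with the altp2mlist lock sitting
     * in the middle, i.e. the order is:
     *
     *     hostp2m -> altp2mlist -> altp2m
     */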

>
>Thanks,
> -George

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 03/13] VMX: implement suppress #VE.
  2015-07-10 19:30     ` Sahita, Ravi
@ 2015-07-13  7:40       ` Jan Beulich
  2015-07-13 23:39         ` Sahita, Ravi
  2015-07-14 11:18         ` George Dunlap
  0 siblings, 2 replies; 91+ messages in thread
From: Jan Beulich @ 2015-07-13  7:40 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 10.07.15 at 21:30, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 09, 2015 6:01 AM
>>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>>> @@ -232,6 +235,15 @@ static int ept_set_middle_entry(struct p2m_domain
>>> @@ -1134,6 +1151,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
>>>          p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
>>>      }
>>>
>>> +    table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
>>> +
>>> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
>>> +        table[i].suppress_ve = 1;
>>> +
>>> +    unmap_domain_page(table);
>>
>>... why is this needed? Bit 63 is documented to be ignored in PML4Es (just
>>like in all other intermediate page tables).
> 
> Valid point - this has no negative side-effects per se so we didn't change 
> this.

Taking "we didn't change this" to refer to v3 -> v4, I still think this
should be dropped if it isn't needed. There can only be confusion
arising from code having no purpose.

Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-10 21:48     ` Sahita, Ravi
@ 2015-07-13  8:01       ` Jan Beulich
  2015-07-14  0:01         ` Sahita, Ravi
  0 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2015-07-13  8:01 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 10.07.15 at 23:48, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 09, 2015 6:30 AM
>>
>>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>>> ---
>>>  xen/arch/x86/hvm/Makefile        |  1 +
>>>  xen/arch/x86/hvm/altp2m.c        | 92
>>+++++++++++++++++++++++++++++++++++++
>>
>>Wouldn't this better go into xen/arch/x86/mm/?
> 
> In this case we followed the pattern of nestedhvm - hope that's ok.

Not really imo: Nested HVM obviously belongs in hvm/; alt-P2m
is more of a mm extension than a HVM one afaict, and hence
would rather belong in mm/.

>>> +int
>>> +altp2m_vcpu_initialise(struct vcpu *v) {
>>> +    int rc = -EOPNOTSUPP;
>>> +
>>> +    if ( v != current )
>>> +        vcpu_pause(v);
>>> +
>>> +    if ( !hvm_funcs.ap2m_vcpu_initialise ||
>>> +         (hvm_funcs.ap2m_vcpu_initialise(v) == 0) )
>>> +    {
>>> +        rc = 0;
>>
>>I think you would better honor the error code returned by
>>hvm_funcs.ap2m_vcpu_initialise() and enter this block only if it was zero.
> 
> The code is checking that condition - did I misinterpret?

It is checking the condition, yes, but not propagating the error
code.
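
[Editor's note: one way to propagate it, sketched under the assumption that
the function ends by unpausing a paused vcpu and returning rc; the wrapper
itself was dropped in later versions, as noted further down:]

    int altp2m_vcpu_initialise(struct vcpu *v)
    {
        int rc = 0;

        if ( v != current )
            vcpu_pause(v);

        /* Honor the vendor hook's error code rather than folding any
         * failure into a generic one; a missing hook still succeeds. */
        if ( hvm_funcs.ap2m_vcpu_initialise )
            rc = hvm_funcs.ap2m_vcpu_initialise(v);

        if ( rc == 0 )
        {
            altp2m_vcpu_reset(v);
            vcpu_altp2m(v).p2midx = 0;
            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);

            ap2m_vcpu_update_eptp(v);
        }

        if ( v != current )
            vcpu_unpause(v);

        return rc;
    }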

>>> +        altp2m_vcpu_reset(v);
>>> +        vcpu_altp2m(v).p2midx = 0;
>>> +        atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
>>> +
>>> +        ap2m_vcpu_update_eptp(v);
>>
>>We're in vendor independent code here - either the function is misnamed, or
>>it shouldn't be called directly from here.
> 
> Would it be reasonable to add if hap_enabled and cpu_has_vmx checks like 
> other code in this file that invokes ept specific ops?
> Otherwise it seems ok that the function would be called from here for 
> p2m_altp2m interactions such as switching altp2m by id etc.
> Open to any other suggestions from you, or we would like to leave it as it 
> is.

Imo such should be abstracted out properly (if it's indeed EPT-specific),
or the function be renamed.

>>> +void
>>> +altp2m_vcpu_destroy(struct vcpu *v)
>>> +{
>>> +    struct p2m_domain *p2m;
>>> +
>>> +    if ( v != current )
>>> +        vcpu_pause(v);
>>> +
>>> +    if ( hvm_funcs.ap2m_vcpu_destroy )
>>> +        hvm_funcs.ap2m_vcpu_destroy(v);
>>> +
>>> +    if ( (p2m = p2m_get_altp2m(v)) )
>>> +        atomic_dec(&p2m->active_vcpus);
>>
>>The ordering looks odd - from an abstract perspective I'd expect
>>p2m_get_altp2m() to not return the p2m anymore that was just destroyed via
>>hvm_funcs.ap2m_vcpu_destroy().
>>
> 
> ap2m_vcpu_destroy is for destroying vcpu context related to altp2m - note 
> this is not implemented since it's not needed for the Intel implementation.  The 
> idea is that if something needs to be done specifically for AMD then that 
> could be done here. 

First of all this doesn't invalidate or address the concern raised.
And then - if you don't need the hook, why don't you leave it out
altogether, eliminating the need to decide about its caller's proper
placement?

>>> +void ap2m_vcpu_update_eptp(struct vcpu *v) {
>>
>>As I think I said before, I consider these ap2m_ prefixes ambiguous - the 'a'
>>could also stand for accelerated, advanced, ... Consistently staying with
>>altp2m_ would seem better.
>>
> 
> We have a comment above the list of these ap2m_ functions in hvm.h stating 
> these are for Alternate p2m - do you feel strongly about us changing this 
> naming? Also this is the interface naming, and if we renamed it altp2m_xxx it 
> would cause confusion with the actual altp2m_xx functionality - so we would 
> like to leave it as proposed.

I don't think there would be much confusion - structure member
names and function names live in different name spaces anyway.
So yes, I continue to think ap2m is a bad prefix...

>>> --- a/xen/arch/x86/mm/hap/hap.c
>>> +++ b/xen/arch/x86/mm/hap/hap.c
>>> @@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)  int
>>> hap_enable(struct domain *d, u32 mode)  {
>>>      unsigned int old_pages;
>>> -    uint8_t i;
>>> +    uint16_t i;
>>
>>unsigned int (also elsewhere, including uint8_t-s)
> 
> We used existing iterator types that were being used (uint8_t was being used 
> in hap_final_teardown).
> If you feel strongly we could change it but we would change code that we 
> didn't need to touch for this patch.

I didn't say you should change code you otherwise don't need to
touch. But both new code as well as code being changed anyway
shouldn't repeat/continue pre-existing mistakes (or however you'd
want to call such).
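
[Editor's note: i.e., for the hunk quoted above, something like this sketch:]

    --- a/xen/arch/x86/mm/hap/hap.c
    +++ b/xen/arch/x86/mm/hap/hap.c
    @@ ... @@ int hap_enable(struct domain *d, u32 mode)
    -    uint8_t i;
    +    unsigned int i;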

>>> @@ -294,6 +298,12 @@ struct arch_domain
>>>      struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
>>>      mm_lock_t nested_p2m_lock;
>>>
>>> +    /* altp2m: allow multiple copies of host p2m */
>>> +    bool_t altp2m_active;
>>> +    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
>>> +    mm_lock_t altp2m_lock;
>>> +    uint64_t *altp2m_eptp;
>>
>>This is a non-insignificant increase of the structure size - perhaps all
>>of these should hang off of struct arch_domain via a single,
>>separately allocated pointer?
> 
> Is this a nice-to-have - again we modelled after the nestedhvm extensions to 
> the struct.
> This will affect a lot of our patch without really changing how much memory 
> is allocated.

I understand that. To a certain point I can agree to limit changes to
what is there at this stage. But you wanting to avoid addressing
concerns basically everywhere it's not a bug overextends this. Just
because the series was submitted late doesn't mean you should now
expect us to give in on any controversy regarding aspects we would
normally expect to be changed. This would basically encourage
others to submit their stuff late too in the future, hoping for relaxed
review.

>>> --- a/xen/include/asm-x86/hvm/hvm.h
>>> +++ b/xen/include/asm-x86/hvm/hvm.h
>>> @@ -210,6 +210,14 @@ struct hvm_function_table {
>>>                                    uint32_t *ecx, uint32_t *edx);
>>>
>>>      void (*enable_msr_exit_interception)(struct domain *d);
>>> +
>>> +    /* Alternate p2m */
>>> +    int (*ap2m_vcpu_initialise)(struct vcpu *v);
>>> +    void (*ap2m_vcpu_destroy)(struct vcpu *v);
>>> +    int (*ap2m_vcpu_reset)(struct vcpu *v);
>>
>>Why is this returning int when altp2m_vcpu_reset() returns void?
> 
> Currently altp2m_vcpu_reset cannot fail - but at the interface level from 
> hvm, we wanted to allow someone to change that in the future, so the 
> interface allows for a return code.

That's precisely what I assumed, and precisely what I want to see
avoided: Either the operation can (theoretically) fail - then this
should be catered for at all levels. Or it can't - then there's no
point for the non-void return type here.

>>> +static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
>>> +{
>>> +    struct domain *d = v->domain;
>>> +    uint16_t index = vcpu_altp2m(v).p2midx;
>>> +
>>> +    return (index == INVALID_ALTP2M) ? NULL : d->arch.altp2m_p2m[index];
>>
>>It would feel better if you checked against MAX_ALTP2M here.
> 
> There is no way for p2midx to be >= MAX_ALTP2M without being INVALID_ALTP2M.

If there was an ASSERT() to that effect I'd be fine. Yet you have to
accept that bugs may exist somewhere, and being tight with checks
like those for valid array indexes lowers the risk / impact of security
issues.

Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 03/13] VMX: implement suppress #VE.
  2015-07-13  7:40       ` Jan Beulich
@ 2015-07-13 23:39         ` Sahita, Ravi
  2015-07-14 11:18         ` George Dunlap
  1 sibling, 0 replies; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-13 23:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf



>-----Original Message-----
>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Monday, July 13, 2015 12:40 AM
>
>>>> On 10.07.15 at 21:30, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Thursday, July 09, 2015 6:01 AM
>>>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>>>> @@ -232,6 +235,15 @@ static int ept_set_middle_entry(struct p2m_domain
>>>> @@ -1134,6 +1151,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
>>>>          p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
>>>>      }
>>>>
>>>> +    table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
>>>> +
>>>> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
>>>> +        table[i].suppress_ve = 1;
>>>> +
>>>> +    unmap_domain_page(table);
>>>
>>>... why is this needed? Bit 63 is documented to be ignored in PML4Es (just
>>>like in all other intermediate page tables).
>>
>> Valid point - this has no negative side-effects per se so we didn't
>> change this.
>
>Taking "we didn't change this" to refer to v3 -> v4, I still think this should be
>dropped if it isn't needed. There can only be confusion arising from code
>having no purpose.
>
>Jan

Done.

Ravi

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-13  8:01       ` Jan Beulich
@ 2015-07-14  0:01         ` Sahita, Ravi
  2015-07-14  8:53           ` Jan Beulich
  2015-07-14 11:34           ` George Dunlap
  0 siblings, 2 replies; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-14  0:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf



>-----Original Message-----
>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Monday, July 13, 2015 1:01 AM
>
>>>> On 10.07.15 at 23:48, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Thursday, July 09, 2015 6:30 AM
>>>
>>>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>>>> ---
>>>>  xen/arch/x86/hvm/Makefile        |  1 +
>>>>  xen/arch/x86/hvm/altp2m.c        | 92
>>>+++++++++++++++++++++++++++++++++++++
>>>
>>>Wouldn't this better go into xen/arch/x86/mm/?
>>
>> In this case we followed the pattern of nestedhvm - hope that's ok.
>
>Not really imo: Nested HVM obviously belongs in hvm/; alt-P2m is more of a
>mm extension than a HVM one afaict, and hence would rather belong in mm/.
>

Would like George's opinion also on this before we make this change (again want to avoid thrashing on things like this).

>>>> +int
>>>> +altp2m_vcpu_initialise(struct vcpu *v) {
>>>> +    int rc = -EOPNOTSUPP;
>>>> +
>>>> +    if ( v != current )
>>>> +        vcpu_pause(v);
>>>> +
>>>> +    if ( !hvm_funcs.ap2m_vcpu_initialise ||
>>>> +         (hvm_funcs.ap2m_vcpu_initialise(v) == 0) )
>>>> +    {
>>>> +        rc = 0;
>>>
>>>I think you would better honor the error code returned by
>>>hvm_funcs.ap2m_vcpu_initialise() and enter this block only if it was zero.
>>
>> The code is checking that condition - did I misinterpret?
>
>It is checking the condition, yes, but not propagating the error code.
>

We removed the unused wrapper - so this one is moot.

>>>> +        altp2m_vcpu_reset(v);
>>>> +        vcpu_altp2m(v).p2midx = 0;
>>>> +        atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
>>>> +
>>>> +        ap2m_vcpu_update_eptp(v);
>>>
>>>We're in vendor independent code here - either the function is
>>>misnamed, or it shouldn't be called directly from here.
>>
>> Would it be reasonable to add if hap_enabled and cpu_has_vmx checks
>> like other code in this file that invokes ept specific ops?
>> Otherwise it seems ok that the function would be called from here for
>> p2m_altp2m interactions such as switching altp2m by id etc.
>> Open to any other suggestions from you, or we would like to leave it
>> as it is.
>
>Imo such should be abstracted out properly (if it's indeed EPT-specific), or the
>function be renamed.
>

Renaming may make sense - checking first - Would a name like altp2m_vcpu_update_p2m() make sense?
(note - the ap2m_ prefix is now the altp2m_ prefix, per other review feedback).


>>>> +void
>>>> +altp2m_vcpu_destroy(struct vcpu *v) {
>>>> +    struct p2m_domain *p2m;
>>>> +
>>>> +    if ( v != current )
>>>> +        vcpu_pause(v);
>>>> +
>>>> +    if ( hvm_funcs.ap2m_vcpu_destroy )
>>>> +        hvm_funcs.ap2m_vcpu_destroy(v);
>>>> +
>>>> +    if ( (p2m = p2m_get_altp2m(v)) )
>>>> +        atomic_dec(&p2m->active_vcpus);
>>>
>>>The ordering looks odd - from an abstract perspective I'd expect
>>>p2m_get_altp2m() to not return the p2m anymore that was just destroyed
>>>via hvm_funcs.ap2m_vcpu_destroy().
>>>
>>
>> ap2m_vcpu_destroy is for destroying vcpu context related to altp2m -
>> note this is not implemented since it's not needed for the Intel
>> implementation.  The idea is that if something needs to be done
>> specifically for AMD then that could be done here.
>
>First of all this doesn't invalidate or address the concern raised.
>And then - if you don't need the hook, why don't you leave it out altogether,
>eliminating the need to decide about its caller's proper placement?
>

This unimplemented interface function is removed - so this is moot.

>>>> +void ap2m_vcpu_update_eptp(struct vcpu *v) {
>>>
>>>As I think I said before, I consider these ap2m_ prefixes ambiguous - the 'a'
>>>could also stand for accelerated, advanced, ... Consistently staying
>>>with altp2m_ would seem better.
>>>
>>
>> We have a comment above the list of these ap2m_ functions in hvm.h
>> stating these are for Alternate p2m - do you feel strongly about us
>> changing this naming? Also this is the interface naming, and if we
>> renamed it altp2m_xxx it would cause confusion with the actual
>> altp2m_xx functionality - so we would like to leave it as proposed.
>
>I don't think there would be much confusion - structure member names and
>function names live in different name spaces anyway.
>So yes, I continue to think ap2m is a bad prefix...
>

Fixed - ap2m prefix is now altp2m prefix.

>>>> --- a/xen/arch/x86/mm/hap/hap.c
>>>> +++ b/xen/arch/x86/mm/hap/hap.c
>>>> @@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)  int
>>>> hap_enable(struct domain *d, u32 mode)  {
>>>>      unsigned int old_pages;
>>>> -    uint8_t i;
>>>> +    uint16_t i;
>>>
>>>unsigned int (also elsewhere, including uint8_t-s)
>>
>> We used existing iterator types that were being used (uint8_t was
>> being used in hap_final_teardown).
>> If you feel strongly we could change it but we would change code that
>> we didn't need to touch for this patch.
>
>I didn't say you should change code you otherwise don't need to touch. But
>both new code as well as code being changed anyway shouldn't
>repeat/continue pre-existing mistakes (or however you'd want to call such).
>

We will change this (we didn't get to it in v5 - we saw it late and wanted to submit v5, since there is other feedback we need anyway).

>>>> @@ -294,6 +298,12 @@ struct arch_domain
>>>>      struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
>>>>      mm_lock_t nested_p2m_lock;
>>>>
>>>> +    /* altp2m: allow multiple copies of host p2m */
>>>> +    bool_t altp2m_active;
>>>> +    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
>>>> +    mm_lock_t altp2m_lock;
>>>> +    uint64_t *altp2m_eptp;
>>>
>>>This is a non-insignificant increase of the structure size - perhaps
>>>all of these should hang off of struct arch_domain via a single,
>>>separately allocated pointer?
>>
>> Is this a nice-to-have - again we modelled after the nestedhvm
>> extensions to the struct.
>> This will affect a lot of our patch without really changing how much
>> memory is allocated.
>
>I understand that. To a certain point I can agree to limit changes to what is
>there at this stage. But you wanting to avoid addressing concerns basically
>everywhere it's not a bug overextends this. Just because the series was
>submitted late doesn't mean you should now expect us to give in on any
>controversy regarding aspects we would normally expect to be changed. This
>would basically encourage others to submit their stuff late too in the future,
>hoping for relaxed review.
>

Couple things - first, we have absorbed a lot of (good) feedback - thanks for that.
Second, I don't think the series can be characterized as late (feedback from others welcome). 
V1 had almost the same structure and was submitted in January.
On this change - this will have a lot of effects on the code and we would like to avoid this one.

>>>> --- a/xen/include/asm-x86/hvm/hvm.h
>>>> +++ b/xen/include/asm-x86/hvm/hvm.h
>>>> @@ -210,6 +210,14 @@ struct hvm_function_table {
>>>>                                    uint32_t *ecx, uint32_t *edx);
>>>>
>>>>      void (*enable_msr_exit_interception)(struct domain *d);
>>>> +
>>>> +    /* Alternate p2m */
>>>> +    int (*ap2m_vcpu_initialise)(struct vcpu *v);
>>>> +    void (*ap2m_vcpu_destroy)(struct vcpu *v);
>>>> +    int (*ap2m_vcpu_reset)(struct vcpu *v);
>>>
>>>Why is this returning int when altp2m_vcpu_reset() returns void?
>>
>> Currently altp2m_vcpu_reset cannot fail - but at the interface level
>> from hvm, we wanted to allow someone to change that in the future, so
>> the interface allows for a return code.
>
>That's precisely what I assumed, and precisely what I want to see
>avoided: Either the operation can (theoretically) fail - then this should be
>catered for at all levels. Or it can't - then there's no point for the non-void
>return type here.
>

There is no wrapper any more - this comment also doesn't apply any more.

>>>> +static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v) {
>>>> +    struct domain *d = v->domain;
>>>> +    uint16_t index = vcpu_altp2m(v).p2midx;
>>>> +
>>>> +    return (index == INVALID_ALTP2M) ? NULL : d->arch.altp2m_p2m[index];
>>>
>>>It would feel better if you checked against MAX_ALTP2M here.
>>
>> There is no way for p2midx to be >= MAX_ALTP2M without being INVALID_ALTP2M.
>
>If there was an ASSERT() to that effect I'd be fine. Yet you have to accept that
>bugs may exist somewhere, and being tight with checks like those for valid
>array indexes lowers the risk / impact of security issues.
>
>Jan

There is an ASSERT now.
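
[Editor's note: for reference, the guarded accessor would then look roughly
like this - a sketch; the exact form of the check in the posted series may
differ:]

    static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
    {
        unsigned int index = vcpu_altp2m(v).p2midx;

        if ( index == INVALID_ALTP2M )
            return NULL;

        /* p2midx can legitimately only be INVALID_ALTP2M or a valid index. */
        ASSERT(index < MAX_ALTP2M);

        return v->domain->arch.altp2m_p2m[index];
    }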

Ravi

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-14  0:01         ` Sahita, Ravi
@ 2015-07-14  8:53           ` Jan Beulich
  2015-07-16  8:48             ` Sahita, Ravi
  2015-07-14 11:34           ` George Dunlap
  1 sibling, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2015-07-14  8:53 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 14.07.15 at 02:01, <ravi.sahita@intel.com> wrote:
>>From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Monday, July 13, 2015 1:01 AM
>>>>> On 10.07.15 at 23:48, <ravi.sahita@intel.com> wrote:
>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>Sent: Thursday, July 09, 2015 6:30 AM
>>>>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>>>>> +        altp2m_vcpu_reset(v);
>>>>> +        vcpu_altp2m(v).p2midx = 0;
>>>>> +        atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
>>>>> +
>>>>> +        ap2m_vcpu_update_eptp(v);
>>>>
>>>>We're in vendor independent code here - either the function is
>>>>misnamed, or it shouldn't be called directly from here.
>>>
>>> Would it be reasonable to add if hap_enabled and cpu_has_vmx checks
>>> like other code in this file that invokes ept specific ops?
>>> Otherwise it seems ok that the function would be called from here for
>>> p2m_altp2m interactions such as switching altp2m by id etc.
>>> Open to any other suggestions from you, or we would like to leave it
>>> as it is.
>>
>>Imo such should be abstracted out properly (if it's indeed EPT-specific), or
>>the function be renamed.
>>
> 
> Renaming may make sense - checking first - Would a name like 
> altp2m_vcpu_update_p2m() make sense?

Sounds fine to me.

>>>>> @@ -294,6 +298,12 @@ struct arch_domain
>>>>>      struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
>>>>>      mm_lock_t nested_p2m_lock;
>>>>>
>>>>> +    /* altp2m: allow multiple copies of host p2m */
>>>>> +    bool_t altp2m_active;
>>>>> +    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
>>>>> +    mm_lock_t altp2m_lock;
>>>>> +    uint64_t *altp2m_eptp;
>>>>
>>>>This is a non-insignificant increase of the structure size - perhaps
>>>>all of these should hang off of struct arch_domain via a single,
>>>>separately allocated pointer?
>>>
>>> Is this a nice-to-have - again we modelled after the nestedhvm
>>> extensions to the struct.
>>> This will affect a lot of our patch without really changing how much
>>> memory is allocated.
>>
>>I understand that. To a certain point I can agree to limit changes to what is
>>there at this stage. But you wanting to avoid addressing concerns basically
>>everywhere it's not a bug overextends this. Just because the series was
>>submitted late doesn't mean you should now expect us to give in on any
>>controversy regarding aspects we would normally expect to be changed. This
>>would basically encourage others to submit their stuff late too in the
>>future, hoping for relaxed review.
>>
> 
> Couple things - first, we have absorbed a lot of (good) feedback - thanks for 
> that.
> Second, I don't think the series can be characterized as late (feedback from 
> others welcome). 
> V1 had almost the same structure and was submitted in January.

Still we're at v3 only here, not v10 or beyond.

> On this change - this will have a lot of effects on the code and we would like 
> to avoid this one.

While this may be a lot of mechanical change, I don't see this presenting
any major risk of breaking the code.

Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 03/13] VMX: implement suppress #VE.
  2015-07-13  7:40       ` Jan Beulich
  2015-07-13 23:39         ` Sahita, Ravi
@ 2015-07-14 11:18         ` George Dunlap
  1 sibling, 0 replies; 91+ messages in thread
From: George Dunlap @ 2015-07-14 11:18 UTC (permalink / raw)
  To: Jan Beulich, Ravi Sahita
  Cc: Tim Deegan, Wei Liu, Andrew Cooper, Ian Jackson, Edmund H White,
	xen-devel, tlengyel, Daniel De Graaf

On 07/13/2015 08:40 AM, Jan Beulich wrote:
>>>> On 10.07.15 at 21:30, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>> Sent: Thursday, July 09, 2015 6:01 AM
>>>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>>>> @@ -232,6 +235,15 @@ static int ept_set_middle_entry(struct p2m_domain
>>>> @@ -1134,6 +1151,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
>>>>          p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
>>>>      }
>>>>
>>>> +    table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
>>>> +
>>>> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
>>>> +        table[i].suppress_ve = 1;
>>>> +
>>>> +    unmap_domain_page(table);
>>>
>>> ... why is this needed? Bit 63 is documented to be ignored in PML4Es (just
>>> like in all other intermediate page tables).
>>
>> Valid point - this has no negative side-effects per se so we didn't change 
>> this.
> 
> Taking "we didn't change this" to refer to v3 -> v4, I still think this
> should be dropped if it isn't needed. There can only be confusion
> arising from code having no purpose.

Just want to call out the general principle Jan refers to here: Nobody
can remember everything about all the details of how the code and the
hardware works; there's just too much to keep in your head all at one
time.  And in any case, people who are not maintainers need to be able
to understand the code to modify it.

The result is that we naturally use the code itself to remind us, or
give us a hint, what the rest of the code or what the hardware does; if
we see a check for NULL, we tend to assume that the pointer may actually
be NULL; if we see a flag being passed, we tend to assume that the flag
will have an effect.

The general principle for making the code easier to grasp, and for
reducing the cognitive load on people trying to understand and modify
it, is to enhance this.  Write code which implies the truth about other
bits of the codebase or the hardware; avoid writing code which will
mislead someone into thinking something false about the other bits of
the codebase or the hardware.

 -George

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-14  0:01         ` Sahita, Ravi
  2015-07-14  8:53           ` Jan Beulich
@ 2015-07-14 11:34           ` George Dunlap
  1 sibling, 0 replies; 91+ messages in thread
From: George Dunlap @ 2015-07-14 11:34 UTC (permalink / raw)
  To: Sahita, Ravi, Jan Beulich
  Cc: Tim Deegan, Wei Liu, Andrew Cooper, Ian Jackson, White, Edmund H,
	xen-devel, tlengyel, Daniel De Graaf

On 07/14/2015 01:01 AM, Sahita, Ravi wrote:
> 
> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Monday, July 13, 2015 1:01 AM
>>
>>>>> On 10.07.15 at 23:48, <ravi.sahita@intel.com> wrote:
>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>> Sent: Thursday, July 09, 2015 6:30 AM
>>>>
>>>>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>>>>> ---
>>>>>  xen/arch/x86/hvm/Makefile        |  1 +
>>>>>  xen/arch/x86/hvm/altp2m.c        | 92
>>>> +++++++++++++++++++++++++++++++++++++
>>>>
>>>> Wouldn't this better go into xen/arch/x86/mm/?
>>>
>>> In this case we followed the pattern of nestedhvm - hope that's ok.
>>
>> Not really imo: Nested HVM obviously belongs in hvm/; alt-P2m is more of a
>> mm extension than a HVM one afaict, and hence would rather belong in mm/.
>>
> 
> Would like George's opinion also on this before we make this change (again want to avoid thrashing on things like this).

This is a bit of a murky one, since the whole reason you're doing this
is that you actually do use hardware to do the VMFUNC and p2m switching.

Let me take a look at v5 and see what I think (as it sounds like some of
the hvm_function hooks have disappeared).

 -George

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-14  8:53           ` Jan Beulich
@ 2015-07-16  8:48             ` Sahita, Ravi
  2015-07-16  9:02               ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-16  8:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Tuesday, July 14, 2015 1:53 AM
>
>>>> On 14.07.15 at 02:01, <ravi.sahita@intel.com> wrote:
>>>From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Monday, July 13, 2015 1:01 AM
>>>>>> On 10.07.15 at 23:48, <ravi.sahita@intel.com> wrote:
>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>Sent: Thursday, July 09, 2015 6:30 AM
>>>>>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>>>>>> +        altp2m_vcpu_reset(v);
>>>>>> +        vcpu_altp2m(v).p2midx = 0;
>>>>>> +        atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
>>>>>> +
>>>>>> +        ap2m_vcpu_update_eptp(v);
>>>>>
>>>>>We're in vendor independent code here - either the function is
>>>>>misnamed, or it shouldn't be called directly from here.
>>>>
>>>> Would it be reasonable to add if hap_enabled and cpu_has_vmx checks
>>>> like other code in this file that invokes ept specific ops?
>>>> Otherwise it seems ok that the function would be called from here
>>>> for p2m_altp2m interactions such as switching altp2m by id etc.
>>>> Open to any other suggestions from you, or we would like to leave it
>>>> as it is.
>>>
>>>Imo such should be abstracted out properly (if it's indeed
>>>EPT-specific), or
>> the
>>>function be renamed.
>>>
>>
>> Renaming may make sense - checking first - Would a name like
>> altp2m_vcpu_update_p2m() make sense?
>
>Sounds fine to me.
>

Thanks


>>>>>> @@ -294,6 +298,12 @@ struct arch_domain
>>>>>>      struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
>>>>>>      mm_lock_t nested_p2m_lock;
>>>>>>
>>>>>> +    /* altp2m: allow multiple copies of host p2m */
>>>>>> +    bool_t altp2m_active;
>>>>>> +    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
>>>>>> +    mm_lock_t altp2m_lock;
>>>>>> +    uint64_t *altp2m_eptp;
>>>>>
>>>>>This is a non-insignificant increase of the structure size - perhaps
>>>>>all of these should hang off of struct arch_domain via a single,
>>>>>separately allocated pointer?
>>>>
>>>> Is this a nice-to-have - again we modelled after the nestedhvm
>>>> extensions to the struct.
>>>> This will affect a lot of our patch without really changing how much
>>>> memory is allocated.
>>>
>>>I understand that. To a certain point I can agree to limit changes to
>>>what is there at this stage. But you wanting to avoid addressing
>>>concerns basically everywhere it's not a bug overextends this. Just
>>>because the series was submitted late doesn't mean you should now
>>>expect us to give in on any controversy regarding aspects we would
>>>normally expect to be changed. This would basically encourage others
>>>to submit their stuff late too in the
>>>future, hoping for relaxed review.
>>>
>>
>> Couple things - first, we have absorbed a lot of (good) feedback -
>> thanks for that.
>> Second, I don't think the series can be characterized as late
>> (feedback from others welcome).
>> V1 had almost the same structure and was submitted in January.
>
>Still we're at v3 only here, not v10 or beyond.
>
>> On this change - this will have a lot of effects on the code and we
>> would like to avoid this one.
>
>While this may be a lot of mechanical change, I don't see this presenting any
>major risk of breaking the code.

On this one, specific advice on how and where to implement such a change would be great, just so that we don't thrash on this change.


>
>Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-16  8:48             ` Sahita, Ravi
@ 2015-07-16  9:02               ` Jan Beulich
  2015-07-17 22:39                 ` Sahita, Ravi
  0 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2015-07-16  9:02 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 16.07.15 at 10:48, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Tuesday, July 14, 2015 1:53 AM
>>>>> On 14.07.15 at 02:01, <ravi.sahita@intel.com> wrote:
>>>>From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>Sent: Monday, July 13, 2015 1:01 AM
>>>>>>> On 10.07.15 at 23:48, <ravi.sahita@intel.com> wrote:
>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>Sent: Thursday, July 09, 2015 6:30 AM
>>>>>>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>>>>>>> @@ -294,6 +298,12 @@ struct arch_domain
>>>>>>>      struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
>>>>>>>      mm_lock_t nested_p2m_lock;
>>>>>>>
>>>>>>> +    /* altp2m: allow multiple copies of host p2m */
>>>>>>> +    bool_t altp2m_active;
>>>>>>> +    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
>>>>>>> +    mm_lock_t altp2m_lock;
>>>>>>> +    uint64_t *altp2m_eptp;
>>>>>>
>>>>>>This is a non-insignificant increase of the structure size - perhaps
>>>>>>all of these should hang off of struct arch_domain via a single,
>>>>>>separately allocated pointer?
>>>>>
>>>>> Is this a nice-to-have - again we modelled after the nestedhvm
>>>>> extensions to the struct.
>>>>> This will affect a lot of our patch without really changing how much
>>>>> memory is allocated.
>>>>
>>>>I understand that. To a certain point I can agree to limit changes to
>>>>what is there at this stage. But you wanting to avoid addressing
>>>>concerns basically everywhere it's not a bug overextends this. Just
>>>>because the series was submitted late doesn't mean you should now
>>>>expect us to give in on any controversy regarding aspects we would
>>>>normally expect to be changed. This would basically encourage others
>>>>to submit their stuff late too in the
>>>>future, hoping for relaxed review.
>>>>
>>>
>>> Couple things - first, we have absorbed a lot of (good) feedback -
>>> thanks for that.
>>> Second, I don't think the series can be characterized as late
>>> (feedback from others welcome).
>>> V1 had almost the same structure and was submitted in January.
>>
>>Still we're at v3 only here, not v10 or beyond.
>>
>>> On this change - this will have a lot of effects on the code and we
>>> would like to avoid this one.
>>
>>While this may be a lot of mechanical change, I don't see this presenting any
>>major risk of breaking the code.
> 
> On this one, specific advice on how and where to implement such a change 
> would be great, just so that we don't thrash on this change.

I don't follow - what to do here was said quite explicitly (still visible
in the context above). I.e. I have no idea what additional advice
you seek.

Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-16  9:02               ` Jan Beulich
@ 2015-07-17 22:39                 ` Sahita, Ravi
  2015-07-20  6:18                   ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-17 22:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 16, 2015 2:02 AM
>
>>>> On 16.07.15 at 10:48, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Tuesday, July 14, 2015 1:53 AM
>>>>>> On 14.07.15 at 02:01, <ravi.sahita@intel.com> wrote:
>>>>>From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>Sent: Monday, July 13, 2015 1:01 AM
>>>>>>>> On 10.07.15 at 23:48, <ravi.sahita@intel.com> wrote:
>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>>Sent: Thursday, July 09, 2015 6:30 AM
>>>>>>>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>>>>>>>> @@ -294,6 +298,12 @@ struct arch_domain
>>>>>>>>      struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
>>>>>>>>      mm_lock_t nested_p2m_lock;
>>>>>>>>
>>>>>>>> +    /* altp2m: allow multiple copies of host p2m */
>>>>>>>> +    bool_t altp2m_active;
>>>>>>>> +    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
>>>>>>>> +    mm_lock_t altp2m_lock;
>>>>>>>> +    uint64_t *altp2m_eptp;
>>>>>>>
>>>>>>>This is a non-insignificant increase of the structure size -
>>>>>>>perhaps all of these should hang off of struct arch_domain via a
>>>>>>>single, separately allocated pointer?
>>>>>>
>>>>>> Is this a nice-to-have - again we modelled after the nestedhvm
>>>>>> extensions to the struct.
>>>>>> This will affect a lot of our patch without really changing how
>>>>>> much memory is allocated.
>>>>>
>>>>>I understand that. To a certain point I can agree to limit changes
>>>>>to what is there at this stage. But you wanting to avoid addressing
>>>>>concerns basically everywhere it's not a bug overextends this. Just
>>>>>because the series was submitted late doesn't mean you should now
>>>>>expect us to give in on any controversy regarding aspects we would
>>>>>normally expect to be changed. This would basically encourage others
>>>>>to submit their stuff late too in the
>>>>>future, hoping for relaxed review.
>>>>>
>>>>
>>>> Couple things - first, we have absorbed a lot of (good) feedback -
>>>> thanks for that.
>>>> Second, I don't think the series can be characterized as late
>>>> (feedback from others welcome).
>>>> V1 had almost the same structure and was submitted in January.
>>>
>>>Still we're at v3 only here, not v10 or beyond.
>>>
>>>> On this change - this will have a lot of effects on the code and we
>>>> would like to avoid this one.
>>>
>>>While this may be a lot of mechanical change, I don't see this presenting
>>>any major risk of breaking the code.
>>
>> On this one, specific advice on how and where to implement such a
>> change would be great, just so that we don't thrash on this change.
>
>I don't follow - what to do here was said quite explicitly (still visible in the
>context above). I.e. I have no idea what additional advice you seek.

Ok, that's fine - sorry if this was unclear - I was asking whether you had some specific feedback on how to allocate and manage the dynamic altp2m struct etc. (if you had an opinion, it would be good to hear it). We will treat this as a Category 2 with lower priority than some of the other Category 2's in the other reviews (there's one from you and one from George). 

Thanks,
Ravi

>
>Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-17 22:39                 ` Sahita, Ravi
@ 2015-07-20  6:18                   ` Jan Beulich
  2015-07-21  5:04                     ` Sahita, Ravi
  0 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2015-07-20  6:18 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 18.07.15 at 00:39, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 16, 2015 2:02 AM
>>
>>>>> On 16.07.15 at 10:48, <ravi.sahita@intel.com> wrote:
>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>Sent: Tuesday, July 14, 2015 1:53 AM
>>>>>>> On 14.07.15 at 02:01, <ravi.sahita@intel.com> wrote:
>>>>>>From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>Sent: Monday, July 13, 2015 1:01 AM
>>>>>>>>> On 10.07.15 at 23:48, <ravi.sahita@intel.com> wrote:
>>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>>>Sent: Thursday, July 09, 2015 6:30 AM
>>>>>>>>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>>>>>>>>> @@ -294,6 +298,12 @@ struct arch_domain
>>>>>>>>>      struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
>>>>>>>>>      mm_lock_t nested_p2m_lock;
>>>>>>>>>
>>>>>>>>> +    /* altp2m: allow multiple copies of host p2m */
>>>>>>>>> +    bool_t altp2m_active;
>>>>>>>>> +    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
>>>>>>>>> +    mm_lock_t altp2m_lock;
>>>>>>>>> +    uint64_t *altp2m_eptp;
>>>>>>>>
>>>>>>>>This is a non-insignificant increase of the structure size -
>>>>>>>>perhaps all of these should hang off of struct arch_domain via a
>>>>>>>>single, separately allocated pointer?
>>>>>>>
>>>>>>> Is this a nice-to-have - again we modelled after the nestedhvm
>>>>>>> extensions to the struct.
>>>>>>> This will affect a lot of our patch without really changing how
>>>>>>> much memory is allocated.
>>>>>>
>>>>>>I understand that. To a certain point I can agree to limit changes
>>>>>>to what is there at this stage. But you wanting to avoid addressing
>>>>>>concerns basically everywhere it's not a bug overextends this. Just
>>>>>>because the series was submitted late doesn't mean you should now
>>>>>>expect us to give in on any controversy regarding aspects we would
>>>>>>normally expect to be changed. This would basically encourage others
>>>>>>to submit their stuff late too in the
>>>>>>future, hoping for relaxed review.
>>>>>>
>>>>>
>>>>> Couple things - first, we have absorbed a lot of (good) feedback -
>>>>> thanks for that.
>>>>> Second, I don't think the series can be characterized as late
>>>>> (feedback from others welcome).
>>>>> V1 had almost the same structure and was submitted in January.
>>>>
>>>>Still we're at v3 only here, not v10 or beyond.
>>>>
>>>>> On this change - this will have a lot of effects on the code and we
>>>>> would like to avoid this one.
>>>>
>>>>While this may be a lot of mechanical change, I don't see this presenting
>>>>any major risk of breaking the code.
>>>
>>> On this one, specific advice on how and where to implement such a
>>> change would be great, just so that we don't thrash on this change.
>>
>>I don't follow - what to do here was said quite explicitly (still visible in
>>the context above). I.e. I have no idea what additional advice you seek.
> 
> Ok, that's fine - sorry if this was unclear - I was asking whether you had some 
> specific feedback on how to allocate and manage the dynamic altp2m struct etc. 
> (if you had an opinion, it would be good to hear it).

xmalloc() / xzalloc(). What other alternatives would you see?

Jan
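
[Editor's note: for reference, the separately-allocated variant under
discussion might look roughly like this - a sketch; all names here are
illustrative, and as the follow-ups below note, the change did not make the
4.6 series:]

    /* All altp2m state hangs off struct arch_domain via one pointer. */
    struct altp2m_domain {
        bool_t active;
        struct p2m_domain *p2m[MAX_ALTP2M];
        mm_lock_t list_lock;
        uint64_t *eptp;
    };

    /* ... in struct arch_domain:  struct altp2m_domain *altp2m;  */

    static int altp2m_domain_init(struct domain *d)
    {
        struct altp2m_domain *ap = xzalloc(struct altp2m_domain);

        if ( !ap )
            return -ENOMEM;

        d->arch.altp2m = ap;
        return 0;
    }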

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-20  6:18                   ` Jan Beulich
@ 2015-07-21  5:04                     ` Sahita, Ravi
  2015-07-21  6:24                       ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Sahita, Ravi @ 2015-07-21  5:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Sunday, July 19, 2015 11:18 PM
>
>>>> On 18.07.15 at 00:39, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Thursday, July 16, 2015 2:02 AM
>>>
>>>>>> On 16.07.15 at 10:48, <ravi.sahita@intel.com> wrote:
>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>Sent: Tuesday, July 14, 2015 1:53 AM
>>>>>>>> On 14.07.15 at 02:01, <ravi.sahita@intel.com> wrote:
>>>>>>>From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>>Sent: Monday, July 13, 2015 1:01 AM
>>>>>>>>>> On 10.07.15 at 23:48, <ravi.sahita@intel.com> wrote:
>>>>>>>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>>>>>>>Sent: Thursday, July 09, 2015 6:30 AM
>>>>>>>>>>>> On 01.07.15 at 20:09, <edmund.h.white@intel.com> wrote:
>>>>>>>>>> @@ -294,6 +298,12 @@ struct arch_domain
>>>>>>>>>>      struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
>>>>>>>>>>      mm_lock_t nested_p2m_lock;
>>>>>>>>>>
>>>>>>>>>> +    /* altp2m: allow multiple copies of host p2m */
>>>>>>>>>> +    bool_t altp2m_active;
>>>>>>>>>> +    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
>>>>>>>>>> +    mm_lock_t altp2m_lock;
>>>>>>>>>> +    uint64_t *altp2m_eptp;
>>>>>>>>>
>>>>>>>>>This is a non-insignificant increase of the structure size -
>>>>>>>>>perhaps all of these should hang off of struct arch_domain via a
>>>>>>>>>single, separately allocated pointer?
>>>>>>>>
>>>>>>>> Is this a nice-to-have - again we modelled after the nestedhvm
>>>>>>>> extensions to the struct.
>>>>>>>> This will affect a lot of our patch without really changing how
>>>>>>>> much memory is allocated.
>>>>>>>
>>>>>>>I understand that. To a certain point I can agree to limit changes
>>>>>>>to what is there at this stage. But you wanting to avoid
>>>>>>>addressing concerns basically everywhere it's not a bug
>>>>>>>overextends this. Just because the series was submitted late
>>>>>>>doesn't mean you should now expect us to give in on any
>>>>>>>controversy regarding aspects we would normally expect to be
>>>>>>>changed. This would basically encourage others to submit their
>>>>>>>stuff late too in the
>>>>>>>future, hoping for relaxed review.
>>>>>>>
>>>>>>
>>>>>> Couple things - first, we have absorbed a lot of (good) feedback -
>>>>>> thanks for that.
>>>>>> Second, I don't think the series can be characterized as late
>>>>>> (feedback from others welcome).
>>>>>> V1 had almost the same structure and was submitted in January.
>>>>>
>>>>>Still we're at v3 only here, not v10 or beyond.
>>>>>
>>>>>> On this change - this will have a lot of effects on the code and we
>>>>>> would like to avoid this one.
>>>>>
>>>>>While this may be a lot of mechanical change, I don't see this
>>>>>presenting any major risk of breaking the code.
>>>>
>>>> On this one, specific advice on how and where to implement such a
>>>> change would be great, just so that we don't thrash on this change.
>>>
>>>I don't follow - what to do here was said quite explicitly (still visible
>>>in the context above). I.e. I have no idea what additional advice you seek.
>>
>> Ok, that's fine - sorry if this was unclear - I was asking whether you had
>> some specific feedback on how to allocate and manage the dynamic
>> altp2m struct etc. (if you had an opinion, it would be good to hear it).
>
>xmalloc() / xzalloc(). What other alternatives would you see?
>
>Jan

Ok - understood. As you must have seen, this change is not in our v6 - per the work plan stated in our preface, we may not be able to get this change into the series proposed for 4.6.
That said, to assure you the change can be made subsequently, and to address your previous point, tonight I have already prepared a delta patch for this change; we still need to test it, though, and that takes up a decent chunk of time. 
Are you ok if this mechanical change doesn't go into our 4.6 series? 

Thanks,
Ravi

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.
  2015-07-21  5:04                     ` Sahita, Ravi
@ 2015-07-21  6:24                       ` Jan Beulich
  0 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2015-07-21  6:24 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, tlengyel, Daniel De Graaf

>>> On 21.07.15 at 07:04, <ravi.sahita@intel.com> wrote:
> Are you ok if this mechanical change doesn't go into our 4.6 series? 

Reluctantly - yes.

Jan

^ permalink raw reply	[flat|nested] 91+ messages in thread

end of thread, other threads:[~2015-07-21  6:24 UTC | newest]

Thread overview: 91+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-01 18:09 [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Ed White
2015-07-01 18:09 ` [PATCH v3 01/13] common/domain: Helpers to pause a domain while in context Ed White
2015-07-01 18:09 ` [PATCH v3 02/13] VMX: VMFUNC and #VE definitions and detection Ed White
2015-07-06 17:16   ` George Dunlap
2015-07-07 18:58   ` Nakajima, Jun
2015-07-01 18:09 ` [PATCH v3 03/13] VMX: implement suppress #VE Ed White
2015-07-06 17:26   ` George Dunlap
2015-07-07 18:59   ` Nakajima, Jun
2015-07-09 13:01   ` Jan Beulich
2015-07-10 19:30     ` Sahita, Ravi
2015-07-13  7:40       ` Jan Beulich
2015-07-13 23:39         ` Sahita, Ravi
2015-07-14 11:18         ` George Dunlap
2015-07-01 18:09 ` [PATCH v3 04/13] x86/HVM: Hardware alternate p2m support detection Ed White
2015-07-01 18:09 ` [PATCH v3 05/13] x86/altp2m: basic data structures and support routines Ed White
2015-07-03 16:22   ` Andrew Cooper
2015-07-06  9:56     ` Jan Beulich
2015-07-06 16:52       ` Ed White
2015-07-06 16:40     ` Ed White
2015-07-06 16:50       ` Ian Jackson
2015-07-07  6:48         ` Coding style (was Re: [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.) Jan Beulich
2015-07-07  6:31       ` [PATCH v3 05/13] x86/altp2m: basic data structures and support routines Jan Beulich
2015-07-07 15:04   ` George Dunlap
2015-07-07 15:22     ` Tim Deegan
2015-07-07 16:19       ` Ed White
2015-07-08 13:52         ` George Dunlap
2015-07-09 17:05         ` Sahita, Ravi
2015-07-10 16:35           ` George Dunlap
2015-07-10 22:11             ` Sahita, Ravi
2015-07-09 13:29   ` Jan Beulich
2015-07-10 21:48     ` Sahita, Ravi
2015-07-13  8:01       ` Jan Beulich
2015-07-14  0:01         ` Sahita, Ravi
2015-07-14  8:53           ` Jan Beulich
2015-07-16  8:48             ` Sahita, Ravi
2015-07-16  9:02               ` Jan Beulich
2015-07-17 22:39                 ` Sahita, Ravi
2015-07-20  6:18                   ` Jan Beulich
2015-07-21  5:04                     ` Sahita, Ravi
2015-07-21  6:24                       ` Jan Beulich
2015-07-14 11:34           ` George Dunlap
2015-07-09 15:58   ` George Dunlap
2015-07-01 18:09 ` [PATCH v3 06/13] VMX/altp2m: add code to support EPTP switching and #VE Ed White
2015-07-03 16:29   ` Andrew Cooper
2015-07-07 14:28     ` Wei Liu
2015-07-07 19:02   ` Nakajima, Jun
2015-07-01 18:09 ` [PATCH v3 07/13] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
2015-07-03 16:40   ` Andrew Cooper
2015-07-06 19:56     ` Sahita, Ravi
2015-07-07  7:31       ` Jan Beulich
2015-07-09 14:05   ` Jan Beulich
2015-07-01 18:09 ` [PATCH v3 08/13] x86/altp2m: add control of suppress_ve Ed White
2015-07-03 16:43   ` Andrew Cooper
2015-07-01 18:09 ` [PATCH v3 09/13] x86/altp2m: alternate p2m memory events Ed White
2015-07-01 18:29   ` Lengyel, Tamas
2015-07-03 16:46   ` Andrew Cooper
2015-07-07 15:18   ` George Dunlap
2015-07-01 18:09 ` [PATCH v3 10/13] x86/altp2m: add remaining support routines Ed White
2015-07-03 16:56   ` Andrew Cooper
2015-07-09 15:07   ` George Dunlap
2015-07-01 18:09 ` [PATCH v3 11/13] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
2015-07-06 10:09   ` Andrew Cooper
2015-07-06 16:49     ` Ed White
2015-07-06 17:08       ` Ian Jackson
2015-07-06 18:27         ` Ed White
2015-07-06 23:40           ` Lengyel, Tamas
2015-07-07  7:46             ` Jan Beulich
2015-07-07  7:41         ` Jan Beulich
2015-07-07  7:39       ` Jan Beulich
2015-07-07  7:33     ` Jan Beulich
2015-07-07 20:10       ` Sahita, Ravi
2015-07-07 20:25         ` Andrew Cooper
2015-07-09 14:34   ` Jan Beulich
2015-07-01 18:09 ` [PATCH v3 12/13] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
2015-07-06 10:16   ` Andrew Cooper
2015-07-06 17:49   ` Wei Liu
2015-07-06 18:01     ` Ed White
2015-07-06 18:18       ` Wei Liu
2015-07-06 22:59         ` Ed White
2015-07-01 18:09 ` [PATCH v3 13/13] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
2015-07-02 19:17   ` Daniel De Graaf
2015-07-06  9:50 ` [PATCH v3 00/12] Alternate p2m: support multiple copies of host p2m Jan Beulich
2015-07-06 11:25   ` Tim Deegan
2015-07-06 11:38     ` Jan Beulich
2015-07-08 18:35 ` Sahita, Ravi
2015-07-09 11:49   ` Wei Liu
2015-07-09 14:14     ` Jan Beulich
2015-07-09 16:13     ` Sahita, Ravi
2015-07-09 16:20       ` Ian Campbell
2015-07-09 16:21       ` Wei Liu
2015-07-09 16:42     ` George Dunlap
