[PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m

* [PATCH v4 00/15] Alternate p2m: support multiple copies of host p2m
@ 2015-07-10  0:52 Ed White
  2015-07-10  0:52 ` [PATCH v4 01/15] common/domain: Helpers to pause a domain while in context Ed White
                   ` (14 more replies)
  0 siblings, 15 replies; 51+ messages in thread
From: Ed White @ 2015-07-10  0:52 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, George Dunlap, Ian Jackson, Tim Deegan,
	Ed White, Jan Beulich, Andrew Cooper, tlengyel, Daniel De Graaf

This set of patches adds support to hvm domains for EPTP switching by creating
multiple copies of the host p2m (currently limited to 10 copies).

The primary use of this capability is expected to be in scenarios where access
to memory needs to be monitored and/or restricted below the level at which the
guest OS page tables operate. Two examples that were discussed at the 2014 Xen
developer summit are:

    VM introspection: 
        http://www.slideshare.net/xen_com_mgr/
        zero-footprint-guest-memory-introspection-from-xen

    Secure inter-VM communication:
        http://www.slideshare.net/xen_com_mgr/nakajima-nvf

A more detailed design specification can be found at:
    http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01319.html

Each p2m copy is populated lazily on EPT violations.
Permissions for pages in alternate p2m's can be changed in a similar
way to the existing memory access interface, and gfn->mfn mappings can be changed.

All this is done through extra HVMOP types.

The cross-domain HVMOP code has been compile-tested only. Also, the cross-domain
code is hypervisor-only, the toolstack has not been modified.

The intra-domain code has been tested. Violation notifications can only be received
for pages that have been modified (access permissions and/or gfn->mfn mapping) 
intra-domain, and only on VCPU's that have enabled notification.

VMFUNC and #VE will both be emulated on hardware without native support.

This code is not compatible with nested hvm functionality and will refuse to work
with nested hvm active. It is also not compatible with migration. It should be
considered experimental.

Changes since v3:

Major changes are:

    Replaced patch 8.

    Refactored patch 11 to use a single HVMOP with subcodes.

    Addressed feedback in patch 7, and some other patches.

    Added two tools/test patches from Tamas. Both are optional.

    Added various ack's and reviewed-by's.

    Rebased.

Ravi Sahita will now be the point of contact for this series.

Changes since v2:

Addressed all v2 feedback *except*:

    In patch 5, the per-domain EPTP list page is still allocated from the
    Xen heap. If allocated from the domain heap Xen panics - IIRC on Haswell
    hardware when walking the EPTP list during exit processing in patch 6.

    HVM_ops are not merged. Tamas suggested merging the memory access ops,
    but in practice they are not as similar as they appear on the surface.
    Razvan suggested merging the implementation code in p2m.c, but that is
    also not as common as it appears on the surface.
    Andrew suggested merging all altp2m ops into one with a subop code in
    the input stucture. His point that only 255 ops can be defined is well
    taken, but altp2m uses only 2 more ops than the recently introduced
    ioreq ops, and <15% of the available ops have been defined. Since we
    don't know how to implement XSM hooks and policy with the subop model,
    we have not adopted this suggestion.

    The p2m set/get interface is not modified. The altp2m code needs to
    write suppress_ve in 2 places and read it in 1 place. The original
    patch series managed this by coupling the state of suppress_ve to the
    p2m memory type, which Tim disliked. In v2 of the series, special
    set/get interaces were added to access suppress_ve only when required.
    Jan has suggested changing the existing interfaces, but we feel this
    is inappropriate for this experimental patch series. Changing the
    existing interfaces would require a design agreement to be reached
    and would impact a large amount of existing code.

    Andrew kindly added some reviewed-by's to v2. I have not carried
    his reviewed-by of the memory event patch forward because Tamas
    requested significant changes to the patch.

Changes since v1:

Many changes since v1 in response to maintainer feedback, including:

    Suppress_ve state is now decoupled from memory type
    VMFUNC emulation handled in x86 emulator
    Lazy-copy algorithm copies any page where mfn != INVALID_MFN
    All nested page fault handling except lazy-copy is now in
        top-level (hvm.c) nested page fault handler
    Split p2m lock type (as suggested by Tim) to avoid lock order violations
    XSM hooks
    Xen parameter to globally enable altp2m (default disabled) and HVM parameter
    Altp2m reference counting no longer uses dirty_cpu bitmap
    Remapped page tracking to invalidate altp2m's where needed to protect Xen
    Many other minor changes

The altp2m invalidation is implemented to a level that I believe satisifies
the requirements of protecting Xen. Invalidation notification is not yet
implemented, and there may be other cases where invalidation is warranted to
protect the integrity of the restrictions placed through altp2m. We may add
further patches in this area.

Testability is still a potential issue. We have offered to make our internal
Windows test binaries available for intra-domain testing. Tamas has
been working on toolstack support for cross-domain testing with a slightly
earlier patch series, and we hope he will submit that support.

Not all of the patches will be of interest to everyone copied here. I've
copied everyone on this initial mailing to give context.

Andrew Cooper (1):
  common/domain: Helpers to pause a domain while in context

Ed White (9):
  VMX: VMFUNC and #VE definitions and detection.
  VMX: implement suppress #VE.
  x86/HVM: Hardware alternate p2m support detection.
  x86/altp2m: basic data structures and support routines.
  VMX/altp2m: add code to support EPTP switching and #VE.
  x86/altp2m: alternate p2m memory events.
  x86/altp2m: add remaining support routines.
  x86/altp2m: define and implement alternate p2m HVMOP types.
  x86/altp2m: Add altp2mhvm HVM domain parameter.

George Dunlap (1):
  x86/altp2m: add control of suppress_ve.

Ravi Sahita (2):
  VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  x86/altp2m: XSM hooks for altp2m HVM ops

Tamas K Lengyel (2):
  tools/libxc: add support to altp2m hvmops
  tools/xen-access: altp2m testcases

 docs/man/xl.cfg.pod.5                        |  12 +
 docs/misc/xen-command-line.markdown          |   7 +
 tools/flask/policy/policy/modules/xen/xen.if |   4 +-
 tools/libxc/Makefile                         |   1 +
 tools/libxc/include/xenctrl.h                |  21 +
 tools/libxc/xc_altp2m.c                      | 237 ++++++++++++
 tools/libxl/libxl.h                          |   6 +
 tools/libxl/libxl_create.c                   |   1 +
 tools/libxl/libxl_dom.c                      |   2 +
 tools/libxl/libxl_types.idl                  |   1 +
 tools/libxl/xl_cmdimpl.c                     |  10 +
 tools/tests/xen-access/xen-access.c          | 173 +++++++--
 xen/arch/x86/hvm/Makefile                    |   1 +
 xen/arch/x86/hvm/altp2m.c                    |  92 +++++
 xen/arch/x86/hvm/emulate.c                   |  19 +-
 xen/arch/x86/hvm/hvm.c                       | 249 +++++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c                  |  42 +-
 xen/arch/x86/hvm/vmx/vmx.c                   | 168 ++++++++
 xen/arch/x86/mm/hap/Makefile                 |   1 +
 xen/arch/x86/mm/hap/altp2m_hap.c             |  98 +++++
 xen/arch/x86/mm/hap/hap.c                    |  32 +-
 xen/arch/x86/mm/mem_sharing.c                |   5 +-
 xen/arch/x86/mm/mm-locks.h                   |  38 +-
 xen/arch/x86/mm/p2m-ept.c                    |  46 ++-
 xen/arch/x86/mm/p2m-pod.c                    |  12 +-
 xen/arch/x86/mm/p2m-pt.c                     |  10 +-
 xen/arch/x86/mm/p2m.c                        | 552 +++++++++++++++++++++++++--
 xen/arch/x86/x86_emulate/x86_emulate.c       |  20 +-
 xen/arch/x86/x86_emulate/x86_emulate.h       |   4 +
 xen/common/domain.c                          |  28 ++
 xen/common/vm_event.c                        |   4 +
 xen/include/asm-arm/p2m.h                    |   6 +
 xen/include/asm-x86/domain.h                 |  10 +
 xen/include/asm-x86/hvm/altp2m.h             |  42 ++
 xen/include/asm-x86/hvm/hvm.h                |  28 ++
 xen/include/asm-x86/hvm/vcpu.h               |   9 +
 xen/include/asm-x86/hvm/vmx/vmcs.h           |  14 +-
 xen/include/asm-x86/hvm/vmx/vmx.h            |  13 +-
 xen/include/asm-x86/msr-index.h              |   1 +
 xen/include/asm-x86/p2m.h                    |  90 ++++-
 xen/include/public/hvm/hvm_op.h              |  82 ++++
 xen/include/public/hvm/params.h              |   5 +-
 xen/include/public/vm_event.h                |  11 +
 xen/include/xen/sched.h                      |   5 +
 xen/include/xsm/dummy.h                      |  12 +
 xen/include/xsm/xsm.h                        |  12 +
 xen/xsm/dummy.c                              |   2 +
 xen/xsm/flask/hooks.c                        |  12 +
 xen/xsm/flask/policy/access_vectors          |   7 +
 49 files changed, 2155 insertions(+), 102 deletions(-)
 create mode 100644 tools/libxc/xc_altp2m.c
 create mode 100644 xen/arch/x86/hvm/altp2m.c
 create mode 100644 xen/arch/x86/mm/hap/altp2m_hap.c
 create mode 100644 xen/include/asm-x86/hvm/altp2m.h

-- 
1.9.1

^ permalink raw reply	[flat|nested] 51+ messages in thread