xen-devel.lists.xenproject.org archive mirror
* [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m
@ 2015-07-22 23:01 Ed White
  2015-07-22 23:01 ` [PATCH v7 01/15] common/domain: Helpers to pause a domain while in context Ed White
                   ` (16 more replies)
  0 siblings, 17 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

This set of patches adds support to hvm domains for EPTP switching by creating
multiple copies of the host p2m (currently limited to 10 copies).

The primary use of this capability is expected to be in scenarios where access
to memory needs to be monitored and/or restricted below the level at which the
guest OS page tables operate. Two examples that were discussed at the 2014 Xen
developer summit are:

    VM introspection: 
        http://www.slideshare.net/xen_com_mgr/zero-footprint-guest-memory-introspection-from-xen

    Secure inter-VM communication:
        http://www.slideshare.net/xen_com_mgr/nakajima-nvf

A more detailed design specification can be found at:
    http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01319.html

Each p2m copy is populated lazily on EPT violations.
Permissions for pages in alternate p2m's can be changed in a similar
way to the existing memory access interface, and gfn->mfn mappings can be
remapped.

All this is done through extra HVMOP types.
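
As an illustration, a privileged agent could drive these ops through the libxc
wrappers added later in this series (a sketch only; the wrapper names and
signatures are assumed from the tools/libxc patch in this set, and error
handling is elided):

    uint16_t view;

    xc_altp2m_set_domain_state(xch, domid, 1);           /* enable altp2m */
    xc_altp2m_create_view(xch, domid, XENMEM_access_rwx, /* default access */
                          &view);                        /* new p2m copy */
    xc_altp2m_set_mem_access(xch, domid, view, gfn,
                             XENMEM_access_rx);          /* restrict one gfn */
    xc_altp2m_switch_to_view(xch, domid, view);          /* make it active */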

The cross-domain HVMOP code has been compile-tested only. Also, the cross-domain
code is hypervisor-only; the toolstack has not been modified.

The intra-domain code has been tested. Violation notifications can only be received
for pages that have been modified (access permissions and/or gfn->mfn mapping) 
intra-domain, and only on VCPU's that have enabled notification.

VMFUNC and #VE will both be emulated on hardware without native support.

This code is not compatible with nested hvm functionality and will refuse to work
with nested hvm active. It is also not compatible with migration. It should be
considered experimental.

Changes since v6:

    Rebased again.

	The patch list below specifies the changes in v7 of the altp2m series. 
	We intend for this v7 to be the last version of this patch series that we 
	are submitting during this freeze exception week. This is taking into account 
	the time-zone differences and resource availability. We believe we have 
	addressed the majority of review comments in this series, with the exception 
	of a couple. We have noted the ones that remain open in the two patches 
	listed below, with the intent that they be addressed in follow-up patches 
	post-4.6. Hopefully, this series satisfies the criteria for inclusion in 
	4.6. We thank 
	the maintainers for their perseverance with this patch series and their 
	continuous feedback during this freeze exception week. 

    Patch 1:
        add George's R-b

    Patch 2:
        no changes

    Patch 3:
        add Jun's ack

    Patch 4:
        no changes

    Patch 5:
        move altp2m.c from hvm to mm
        move altp2m.h from x86/mm to x86
        change iterator types in hap_enable/hap_final_teardown
        fix altp2m init failure path in p2m_init
        rename altp2m_vcpu_update_eptp to altp2m_vcpu_update_p2m
        change uint16_t's/uint8_t's to unsigned int's
        rework p2m_get_altp2m
		
		not done - mechanical change of moving the domain struct's 
		    new altp2m members to a separately alloc'ed struct 
			(an initial stab at this patch has been written)

    Patch 6:
        remove casts around p2midx handling
        fix veinfo semaphore initialization
        mechanical changes due to patch 5 changes

    Patch 7:
        remove incorrect cast
        add Jan's ack

    Patch 8:
        formatting changes requested by Andrew

    Patch 9:
        no changes

    Patch 10:
        rename altp2m lazy copier, make bool_t, use __put_gfn throughout,
          and move to p2m.c, eliminating altp2m_hap.c
        change various p2m_altp2m... routines from long to int
        change uint16_t's/uint8_t's to unsigned int's
        optimize remapped gfn tracking and altp2m invalidation check
        mechanical changes due to patch 5 changes
		
		not done - abstracting some ept functionality from p2m

    Patch 11:
        fix cmd range check
        rework domain locking
        add Jan's ack

    Patch 12:
        no changes

    Patch 13:
        no changes

    Patch 14:
        no changes

    Patch 15:
        no changes


Changes since v5:

    Rebased on staging.

    We believe v6 addresses all ABI issues and actual bugs; it does
    not address all outstanding maintainer issues.

    Patch 1:
        no changes

    Patch 2:
        no changes

    Patch 3:
        no changes
        removed ack's etc

    Patch 4:
        fixed a markdown formatting error

    Patch 5:
        removed a buggy assert
        removed Andrew's R-b

    Patch 6:
        fixed a bug when disabling #VE due to bad veinfo gfn

    Patch 7:
        addressed Jan's most recent comments

    Patch 8:
        no changes

    Patch 9:
        Added padding to vm_event_t header (per Andrew)

    Patch 10:
        No changes

    Patch 11:
        Reworked structure padding
        Added altp2m_op interface version
        Reworked altp2m_op handling again

    Patch 12:
        Mechanical changes due to patch 11 changes

    Patch 13:
        Mechanical changes due to patch 11 changes

    Patch 14:
        Mechanical changes due to patch 11 changes

    Patch 15:
        Mechanical changes due to an upstream change


Changes since v4:

    Patch 3:  don't set bit 63 of top-level entries.

    Patch 5:  extra locking order description in mm-locks.h
              don't initialise altp2m data unless altp2m is enabled globally
               and hardware supports it
              removed some hardware-specific wrappers that were not being used
              renamed ap2m... interfaces to altp2m...
              fixed error path in p2m_init_altp2m

    Patch 7:  addressed remaining feedback

    Patch 8:  made suppress_ve preservation consistent

    Patch 9:  changed flag bit to avoid collision with recently applied series

    Patch 10: check pad fields for zero
              minor formatting changes

    Patch 11: renamed HVM parameter

    Patch 15: removed v3 workaround


Changes since v3:

Major changes are:

    Replaced patch 8.

    Refactored patch 11 to use a single HVMOP with subcodes.

    Addressed feedback in patch 7, and some other patches.

    Added two tools/test patches from Tamas. Both are optional.

    Added various ack's and reviewed-by's.

    Rebased.

Ravi Sahita will now be the point of contact for this series.


Changes since v2:

Addressed all v2 feedback *except*:

    In patch 5, the per-domain EPTP list page is still allocated from the
    Xen heap. If allocated from the domain heap Xen panics - IIRC on Haswell
    hardware when walking the EPTP list during exit processing in patch 6.

    HVM_ops are not merged. Tamas suggested merging the memory access ops,
    but in practice they are not as similar as they appear on the surface.
    Razvan suggested merging the implementation code in p2m.c, but that is
    also not as common as it appears on the surface.
    Andrew suggested merging all altp2m ops into one with a subop code in
    the input structure. His point that only 255 ops can be defined is well
    taken, but altp2m uses only 2 more ops than the recently introduced
    ioreq ops, and <15% of the available ops have been defined. Since we
    don't know how to implement XSM hooks and policy with the subop model,
    we have not adopted this suggestion.

    The p2m set/get interface is not modified. The altp2m code needs to
    write suppress_ve in 2 places and read it in 1 place. The original
    patch series managed this by coupling the state of suppress_ve to the
    p2m memory type, which Tim disliked. In v2 of the series, special
    set/get interfaces were added to access suppress_ve only when required.
    Jan has suggested changing the existing interfaces, but we feel this
    is inappropriate for this experimental patch series. Changing the
    existing interfaces would require a design agreement to be reached
    and would impact a large amount of existing code.

    Andrew kindly added some reviewed-by's to v2. I have not carried
    his reviewed-by of the memory event patch forward because Tamas
    requested significant changes to the patch.


Changes since v1:

Many changes since v1 in response to maintainer feedback, including:

    Suppress_ve state is now decoupled from memory type
    VMFUNC emulation handled in x86 emulator
    Lazy-copy algorithm copies any page where mfn != INVALID_MFN
    All nested page fault handling except lazy-copy is now in
        top-level (hvm.c) nested page fault handler
    Split p2m lock type (as suggested by Tim) to avoid lock order violations
    XSM hooks
    Xen parameter to globally enable altp2m (default disabled) and HVM parameter
    Altp2m reference counting no longer uses dirty_cpu bitmap
    Remapped page tracking to invalidate altp2m's where needed to protect Xen
    Many other minor changes

The altp2m invalidation is implemented to a level that I believe satisfies
the requirements of protecting Xen. Invalidation notification is not yet
implemented, and there may be other cases where invalidation is warranted to
protect the integrity of the restrictions placed through altp2m. We may add
further patches in this area.

Testability is still a potential issue. We have offered to make our internal
Windows test binaries available for intra-domain testing. Tamas has
been working on toolstack support for cross-domain testing with a slightly
earlier patch series, and we hope he will submit that support.

Not all of the patches will be of interest to everyone copied here. I've
copied everyone on this initial mailing to give context.

Andrew Cooper (1):
  common/domain: Helpers to pause a domain while in context

Ed White (9):
  VMX: VMFUNC and #VE definitions and detection.
  VMX: implement suppress #VE.
  x86/HVM: Hardware alternate p2m support detection.
  x86/altp2m: basic data structures and support routines.
  VMX/altp2m: add code to support EPTP switching and #VE.
  x86/altp2m: alternate p2m memory events.
  x86/altp2m: add remaining support routines.
  x86/altp2m: define and implement alternate p2m HVMOP types.
  x86/altp2m: Add altp2mhvm HVM domain parameter.

George Dunlap (1):
  x86/altp2m: add control of suppress_ve.

Ravi Sahita (2):
  VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  x86/altp2m: XSM hooks for altp2m HVM ops

Tamas K Lengyel (2):
  tools/libxc: add support to altp2m hvmops
  tools/xen-access: altp2m testcases

 docs/man/xl.cfg.pod.5                        |  12 +
 docs/misc/xen-command-line.markdown          |   7 +
 tools/flask/policy/policy/modules/xen/xen.if |   4 +-
 tools/libxc/Makefile                         |   1 +
 tools/libxc/include/xenctrl.h                |  22 +
 tools/libxc/xc_altp2m.c                      | 248 +++++++++++
 tools/libxl/libxl.h                          |   6 +
 tools/libxl/libxl_create.c                   |   1 +
 tools/libxl/libxl_dom.c                      |   2 +
 tools/libxl/libxl_types.idl                  |   1 +
 tools/libxl/xl_cmdimpl.c                     |  10 +
 tools/tests/xen-access/xen-access.c          | 173 ++++++--
 xen/arch/x86/hvm/emulate.c                   |  18 +-
 xen/arch/x86/hvm/hvm.c                       | 252 ++++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c                  |  42 +-
 xen/arch/x86/hvm/vmx/vmx.c                   | 176 ++++++++
 xen/arch/x86/mm/Makefile                     |   1 +
 xen/arch/x86/mm/altp2m.c                     |  77 ++++
 xen/arch/x86/mm/hap/hap.c                    |  40 +-
 xen/arch/x86/mm/mem_sharing.c                |   4 +-
 xen/arch/x86/mm/mm-locks.h                   |  46 +-
 xen/arch/x86/mm/p2m-ept.c                    |  35 +-
 xen/arch/x86/mm/p2m-pod.c                    |  12 +-
 xen/arch/x86/mm/p2m-pt.c                     |  10 +-
 xen/arch/x86/mm/p2m.c                        | 612 +++++++++++++++++++++++++--
 xen/arch/x86/x86_emulate/x86_emulate.c       |  19 +-
 xen/arch/x86/x86_emulate/x86_emulate.h       |   4 +
 xen/common/domain.c                          |  28 ++
 xen/common/vm_event.c                        |   4 +
 xen/include/asm-arm/p2m.h                    |   6 +
 xen/include/asm-x86/altp2m.h                 |  38 ++
 xen/include/asm-x86/domain.h                 |  10 +
 xen/include/asm-x86/hvm/hvm.h                |  25 ++
 xen/include/asm-x86/hvm/vcpu.h               |   9 +
 xen/include/asm-x86/hvm/vmx/vmcs.h           |  14 +-
 xen/include/asm-x86/hvm/vmx/vmx.h            |  13 +-
 xen/include/asm-x86/msr-index.h              |   1 +
 xen/include/asm-x86/p2m.h                    | 100 ++++-
 xen/include/public/hvm/hvm_op.h              |  89 ++++
 xen/include/public/hvm/params.h              |   5 +-
 xen/include/public/vm_event.h                |  12 +
 xen/include/xen/sched.h                      |   5 +
 xen/include/xsm/dummy.h                      |  12 +
 xen/include/xsm/xsm.h                        |  12 +
 xen/xsm/dummy.c                              |   2 +
 xen/xsm/flask/hooks.c                        |  12 +
 xen/xsm/flask/policy/access_vectors          |   7 +
 47 files changed, 2136 insertions(+), 103 deletions(-)
 create mode 100644 tools/libxc/xc_altp2m.c
 create mode 100644 xen/arch/x86/mm/altp2m.c
 create mode 100644 xen/include/asm-x86/altp2m.h

-- 
1.9.1


* [PATCH v7 01/15] common/domain: Helpers to pause a domain while in context
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 02/15] VMX: VMFUNC and #VE definitions and detection Ed White
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

From: Andrew Cooper <andrew.cooper3@citrix.com>

For use on codepaths which would need to use domain_pause() but might be in
the target domain's context.  In the case that the target domain is in
context, all other vcpus are paused.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
---
Changes since v6:
        add George's R-b

 xen/common/domain.c     | 28 ++++++++++++++++++++++++++++
 xen/include/xen/sched.h |  5 +++++
 2 files changed, 33 insertions(+)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 791166b..1b9fcfc 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1010,6 +1010,34 @@ int domain_unpause_by_systemcontroller(struct domain *d)
     return 0;
 }
 
+void domain_pause_except_self(struct domain *d)
+{
+    struct vcpu *v, *curr = current;
+
+    if ( curr->domain == d )
+    {
+        for_each_vcpu( d, v )
+            if ( likely(v != curr) )
+                vcpu_pause(v);
+    }
+    else
+        domain_pause(d);
+}
+
+void domain_unpause_except_self(struct domain *d)
+{
+    struct vcpu *v, *curr = current;
+
+    if ( curr->domain == d )
+    {
+        for_each_vcpu( d, v )
+            if ( likely(v != curr) )
+                vcpu_unpause(v);
+    }
+    else
+        domain_unpause(d);
+}
+
 int vcpu_reset(struct vcpu *v)
 {
     struct domain *d = v->domain;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b29d9e7..73d3bc8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -804,6 +804,11 @@ static inline int domain_pause_by_systemcontroller_nosync(struct domain *d)
 {
     return __domain_pause_by_systemcontroller(d, domain_pause_nosync);
 }
+
+/* domain_pause() but safe against trying to pause current. */
+void domain_pause_except_self(struct domain *d);
+void domain_unpause_except_self(struct domain *d);
+
 void cpu_init(void);
 
 struct scheduler;
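
A minimal usage sketch for the new helpers (hypothetical caller, not part of
this patch), bracketing work that must see every other vcpu of d quiesced even
if the caller is running on one of d's vcpus:

    void operate_on_domain(struct domain *d)
    {
        domain_pause_except_self(d);

        /* Every vcpu of d other than (possibly) the current one is
         * paused here, so their state can be inspected or modified. */

        domain_unpause_except_self(d);
    }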
-- 
1.9.1


* [PATCH v7 02/15] VMX: VMFUNC and #VE definitions and detection.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
  2015-07-22 23:01 ` [PATCH v7 01/15] common/domain: Helpers to pause a domain while in context Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 03/15] VMX: implement suppress #VE Ed White
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Currently, neither VMFUNC nor #VE is enabled globally; both may be enabled
on a per-VCPU basis by the altp2m code.

Remove the check for EPTE bit 63 == zero in ept_split_super_page(), as
that bit is now hardware-defined.
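
For reference, once these controls are enabled a guest agent invokes VMFUNC
leaf 0 directly; a hypothetical in-guest helper (illustrative, not part of
this patch) would look like:

    static inline void vmfunc_switch_eptp(unsigned int eptp_index)
    {
        /* VMFUNC is encoded as 0f 01 d4; %eax selects the leaf
         * (0 = EPTP switching), %ecx the index into the EPTP list. */
        asm volatile ( ".byte 0x0f,0x01,0xd4"
                       :: "a" (0), "c" (eptp_index)
                       : "memory" );
    }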

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
---
Changes since v6:
        no changes

 xen/arch/x86/hvm/vmx/vmcs.c        | 42 +++++++++++++++++++++++++++++++++++---
 xen/arch/x86/mm/p2m-ept.c          |  1 -
 xen/include/asm-x86/hvm/vmx/vmcs.h | 14 +++++++++++--
 xen/include/asm-x86/hvm/vmx/vmx.h  | 13 +++++++++++-
 xen/include/asm-x86/msr-index.h    |  1 +
 5 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 4c5ceb5..bc1cabd 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -101,6 +101,8 @@ u32 vmx_secondary_exec_control __read_mostly;
 u32 vmx_vmexit_control __read_mostly;
 u32 vmx_vmentry_control __read_mostly;
 u64 vmx_ept_vpid_cap __read_mostly;
+u64 vmx_vmfunc __read_mostly;
+bool_t vmx_virt_exception __read_mostly;
 
 const u32 vmx_introspection_force_enabled_msrs[] = {
     MSR_IA32_SYSENTER_EIP,
@@ -140,6 +142,8 @@ static void __init vmx_display_features(void)
     P(cpu_has_vmx_virtual_intr_delivery, "Virtual Interrupt Delivery");
     P(cpu_has_vmx_posted_intr_processing, "Posted Interrupt Processing");
     P(cpu_has_vmx_vmcs_shadowing, "VMCS shadowing");
+    P(cpu_has_vmx_vmfunc, "VM Functions");
+    P(cpu_has_vmx_virt_exceptions, "Virtualisation Exceptions");
     P(cpu_has_vmx_pml, "Page Modification Logging");
 #undef P
 
@@ -185,6 +189,7 @@ static int vmx_init_vmcs_config(void)
     u64 _vmx_misc_cap = 0;
     u32 _vmx_vmexit_control;
     u32 _vmx_vmentry_control;
+    u64 _vmx_vmfunc = 0;
     bool_t mismatch = 0;
 
     rdmsr(MSR_IA32_VMX_BASIC, vmx_basic_msr_low, vmx_basic_msr_high);
@@ -230,7 +235,9 @@ static int vmx_init_vmcs_config(void)
                SECONDARY_EXEC_ENABLE_EPT |
                SECONDARY_EXEC_ENABLE_RDTSCP |
                SECONDARY_EXEC_PAUSE_LOOP_EXITING |
-               SECONDARY_EXEC_ENABLE_INVPCID);
+               SECONDARY_EXEC_ENABLE_INVPCID |
+               SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
+               SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
         rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
         if ( _vmx_misc_cap & VMX_MISC_VMWRITE_ALL )
             opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
@@ -341,6 +348,24 @@ static int vmx_init_vmcs_config(void)
           || !(_vmx_vmexit_control & VM_EXIT_ACK_INTR_ON_EXIT) )
         _vmx_pin_based_exec_control  &= ~ PIN_BASED_POSTED_INTERRUPT;
 
+    /* The IA32_VMX_VMFUNC MSR exists only when VMFUNC is available */
+    if ( _vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS )
+    {
+        rdmsrl(MSR_IA32_VMX_VMFUNC, _vmx_vmfunc);
+
+        /*
+         * VMFUNC leaf 0 (EPTP switching) must be supported.
+         *
+         * Or we just don't use VMFUNC.
+         */
+        if ( !(_vmx_vmfunc & VMX_VMFUNC_EPTP_SWITCHING) )
+            _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
+    }
+
+    /* Virtualization exceptions are only enabled if VMFUNC is enabled */
+    if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
+        _vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+
     min = 0;
     opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS;
     _vmx_vmentry_control = adjust_vmx_controls(
@@ -361,6 +386,9 @@ static int vmx_init_vmcs_config(void)
         vmx_vmentry_control        = _vmx_vmentry_control;
         vmx_basic_msr              = ((u64)vmx_basic_msr_high << 32) |
                                      vmx_basic_msr_low;
+        vmx_vmfunc                 = _vmx_vmfunc;
+        vmx_virt_exception         = !!(_vmx_secondary_exec_control &
+                                       SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
         vmx_display_features();
 
         /* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -397,6 +425,9 @@ static int vmx_init_vmcs_config(void)
         mismatch |= cap_check(
             "EPT and VPID Capability",
             vmx_ept_vpid_cap, _vmx_ept_vpid_cap);
+        mismatch |= cap_check(
+            "VMFUNC Capability",
+            vmx_vmfunc, _vmx_vmfunc);
         if ( cpu_has_vmx_ins_outs_instr_info !=
              !!(vmx_basic_msr_high & (VMX_BASIC_INS_OUT_INFO >> 32)) )
         {
@@ -967,6 +998,11 @@ static int construct_vmcs(struct vcpu *v)
     /* Do not enable Monitor Trap Flag unless start single step debug */
     v->arch.hvm_vmx.exec_control &= ~CPU_BASED_MONITOR_TRAP_FLAG;
 
+    /* Disable VMFUNC and #VE for now: they may be enabled later by altp2m. */
+    v->arch.hvm_vmx.secondary_exec_control &=
+        ~(SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
+          SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
+
     if ( is_pvh_domain(d) )
     {
         /* Disable virtual apics, TPR */
@@ -1790,9 +1826,9 @@ void vmcs_dump_vcpu(struct vcpu *v)
         printk("PLE Gap=%08x Window=%08x\n",
                vmr32(PLE_GAP), vmr32(PLE_WINDOW));
     if ( v->arch.hvm_vmx.secondary_exec_control &
-         (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VMFUNC) )
+         (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
         printk("Virtual processor ID = 0x%04x VMfunc controls = %016lx\n",
-               vmr16(VIRTUAL_PROCESSOR_ID), vmr(VMFUNC_CONTROL));
+               vmr16(VIRTUAL_PROCESSOR_ID), vmr(VM_FUNCTION_CONTROL));
 
     vmx_vmcs_exit(v);
 }
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index e7ff739..9a3b65a 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -282,7 +282,6 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
         epte->sp = (level > 1);
         epte->mfn += i * trunk;
         epte->snp = (iommu_enabled && iommu_snoop);
-        ASSERT(!epte->avail3);
 
         ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
 
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 3132644..25ac8f7 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -222,9 +222,10 @@ extern u32 vmx_vmentry_control;
 #define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY    0x00000200
 #define SECONDARY_EXEC_PAUSE_LOOP_EXITING       0x00000400
 #define SECONDARY_EXEC_ENABLE_INVPCID           0x00001000
-#define SECONDARY_EXEC_ENABLE_VMFUNC            0x00002000
+#define SECONDARY_EXEC_ENABLE_VM_FUNCTIONS      0x00002000
 #define SECONDARY_EXEC_ENABLE_VMCS_SHADOWING    0x00004000
 #define SECONDARY_EXEC_ENABLE_PML               0x00020000
+#define SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS   0x00040000
 extern u32 vmx_secondary_exec_control;
 
 #define VMX_EPT_EXEC_ONLY_SUPPORTED             0x00000001
@@ -285,6 +286,10 @@ extern u32 vmx_secondary_exec_control;
     (vmx_pin_based_exec_control & PIN_BASED_POSTED_INTERRUPT)
 #define cpu_has_vmx_vmcs_shadowing \
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VMCS_SHADOWING)
+#define cpu_has_vmx_vmfunc \
+    (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS)
+#define cpu_has_vmx_virt_exceptions \
+    (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS)
 #define cpu_has_vmx_pml \
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_PML)
 
@@ -316,6 +321,9 @@ extern u64 vmx_basic_msr;
 #define VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK  0x0FF
 #define VMX_GUEST_INTR_STATUS_SVI_OFFSET        8
 
+/* VMFUNC leaf definitions */
+#define VMX_VMFUNC_EPTP_SWITCHING   (1ULL << 0)
+
 /* VMCS field encodings. */
 #define VMCS_HIGH(x) ((x) | 1)
 enum vmcs_field {
@@ -350,12 +358,14 @@ enum vmcs_field {
     VIRTUAL_APIC_PAGE_ADDR          = 0x00002012,
     APIC_ACCESS_ADDR                = 0x00002014,
     PI_DESC_ADDR                    = 0x00002016,
-    VMFUNC_CONTROL                  = 0x00002018,
+    VM_FUNCTION_CONTROL             = 0x00002018,
     EPT_POINTER                     = 0x0000201a,
     EOI_EXIT_BITMAP0                = 0x0000201c,
 #define EOI_EXIT_BITMAP(n) (EOI_EXIT_BITMAP0 + (n) * 2) /* n = 0...3 */
+    EPTP_LIST_ADDR                  = 0x00002024,
     VMREAD_BITMAP                   = 0x00002026,
     VMWRITE_BITMAP                  = 0x00002028,
+    VIRT_EXCEPTION_INFO             = 0x0000202a,
     GUEST_PHYSICAL_ADDRESS          = 0x00002400,
     VMCS_LINK_POINTER               = 0x00002800,
     GUEST_IA32_DEBUGCTL             = 0x00002802,
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index c5f3d24..ee1cac7 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -47,7 +47,7 @@ typedef union {
         access      :   4,  /* bits 61:58 - p2m_access_t */
         tm          :   1,  /* bit 62 - VT-d transient-mapping hint in
                                shared EPT/VT-d usage */
-        avail3      :   1;  /* bit 63 - Software available 3 */
+        suppress_ve :   1;  /* bit 63 - suppress #VE */
     };
     u64 epte;
 } ept_entry_t;
@@ -187,6 +187,7 @@ static inline unsigned long pi_get_pir(struct pi_desc *pi_desc, int group)
 #define EXIT_REASON_XSETBV              55
 #define EXIT_REASON_APIC_WRITE          56
 #define EXIT_REASON_INVPCID             58
+#define EXIT_REASON_VMFUNC              59
 #define EXIT_REASON_PML_FULL            62
 
 /*
@@ -555,4 +556,14 @@ void p2m_init_hap_data(struct p2m_domain *p2m);
 #define EPT_L4_PAGETABLE_SHIFT      39
 #define EPT_PAGETABLE_ENTRIES       512
 
+/* #VE information page */
+typedef struct {
+    u32 exit_reason;
+    u32 semaphore;
+    u64 exit_qualification;
+    u64 gla;
+    u64 gpa;
+    u16 eptp_index;
+} ve_info_t;
+
 #endif /* __ASM_X86_HVM_VMX_VMX_H__ */
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 5425f77..e9c4723 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -130,6 +130,7 @@
 #define MSR_IA32_VMX_TRUE_PROCBASED_CTLS        0x48e
 #define MSR_IA32_VMX_TRUE_EXIT_CTLS             0x48f
 #define MSR_IA32_VMX_TRUE_ENTRY_CTLS            0x490
+#define MSR_IA32_VMX_VMFUNC                     0x491
 #define IA32_FEATURE_CONTROL_MSR                0x3a
 #define IA32_FEATURE_CONTROL_MSR_LOCK                     0x0001
 #define IA32_FEATURE_CONTROL_MSR_ENABLE_VMXON_INSIDE_SMX  0x0002
-- 
1.9.1


* [PATCH v7 03/15] VMX: implement suppress #VE.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
  2015-07-22 23:01 ` [PATCH v7 01/15] common/domain: Helpers to pause a domain while in context Ed White
  2015-07-22 23:01 ` [PATCH v7 02/15] VMX: VMFUNC and #VE definitions and detection Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 04/15] x86/HVM: Hardware alternate p2m support detection Ed White
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

In preparation for selectively enabling #VE in a later patch, set
suppress #VE on all EPTE's.

Suppress #VE should always be the default condition for two reasons:
it is generally not safe to deliver #VE into a guest unless that guest
has been modified to receive it; and even then, for most EPT violations, only
the hypervisor is able to handle the violation.

Signed-off-by: Ed White <edmund.h.white@intel.com>

Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
---
Changes since v6:
        add Jun's ack

 xen/arch/x86/mm/p2m-ept.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 9a3b65a..b532811 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -42,7 +42,8 @@
 #define is_epte_superpage(ept_entry)    ((ept_entry)->sp)
 static inline bool_t is_epte_valid(ept_entry_t *e)
 {
-    return (e->epte != 0 && e->sa_p2mt != p2m_invalid);
+    /* suppress_ve alone is not considered valid, so mask it off */
+    return ((e->epte & ~(1ul << 63)) != 0 && e->sa_p2mt != p2m_invalid);
 }
 
 /* returns : 0 for success, -errno otherwise */
@@ -220,6 +221,8 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
 static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
 {
     struct page_info *pg;
+    ept_entry_t *table;
+    unsigned int i;
 
     pg = p2m_alloc_ptp(p2m, 0);
     if ( pg == NULL )
@@ -233,6 +236,15 @@ static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
     /* Manually set A bit to avoid overhead of MMU having to write it later. */
     ept_entry->a = 1;
 
+    ept_entry->suppress_ve = 1;
+
+    table = __map_domain_page(pg);
+
+    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
+        table[i].suppress_ve = 1;
+
+    unmap_domain_page(table);
+
     return 1;
 }
 
@@ -282,6 +294,7 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
         epte->sp = (level > 1);
         epte->mfn += i * trunk;
         epte->snp = (iommu_enabled && iommu_snoop);
+        epte->suppress_ve = 1;
 
         ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
 
@@ -791,6 +804,8 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
     }
 
+    new_entry.suppress_ve = 1;
+
     rc = atomic_write_ept_entry(ept_entry, new_entry, target);
     if ( unlikely(rc) )
         old_entry.epte = 0;
-- 
1.9.1


* [PATCH v7 04/15] x86/HVM: Hardware alternate p2m support detection.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (2 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 03/15] VMX: implement suppress #VE Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 05/15] x86/altp2m: basic data structures and support routines Ed White
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

As implemented here, only supported on platforms with VMX HAP.

By default this functionality is force-disabled; it can be enabled
by specifying altp2m=1 on the Xen command line.
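
For example, a hypothetical GRUB2 boot entry enabling it (paths and other
options illustrative):

    multiboot2 /boot/xen.gz altp2m=1 ...
    module2 /boot/vmlinuz ...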

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v6:
        no changes

 docs/misc/xen-command-line.markdown | 7 +++++++
 xen/arch/x86/hvm/hvm.c              | 7 +++++++
 xen/arch/x86/hvm/vmx/vmx.c          | 1 +
 xen/include/asm-x86/hvm/hvm.h       | 9 +++++++++
 4 files changed, 24 insertions(+)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 70d7ab8..7bdcfff 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -140,6 +140,13 @@ mode during S3 resume.
 
 Permit Xen to use superpages when performing memory management.
 
+### altp2m (Intel)
+> `= <boolean>`
+
+> Default: `false`
+
+Permit multiple copies of host p2m.
+
 ### apic
 > `= bigsmp | default`
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c07e3ef..eafaf9d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -96,6 +96,10 @@ bool_t opt_hvm_fep;
 boolean_param("hvm_fep", opt_hvm_fep);
 #endif
 
+/* Xen command-line option to enable altp2m */
+static bool_t __initdata opt_altp2m_enabled = 0;
+boolean_param("altp2m", opt_altp2m_enabled);
+
 static int cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -162,6 +166,9 @@ static int __init hvm_enable(void)
     if ( !fns->pvh_supported )
         printk(XENLOG_INFO "HVM: PVH mode not supported on this platform\n");
 
+    if ( !opt_altp2m_enabled )
+        hvm_funcs.altp2m_supported = 0;
+
     /*
      * Allow direct access to the PC debug ports 0x80 and 0xed (they are
      * often used for I/O delays, but the vmexits simply slow things down).
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index d3183a8..4f8b0e0 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1847,6 +1847,7 @@ const struct hvm_function_table * __init start_vmx(void)
     if ( cpu_has_vmx_ept && (cpu_has_vmx_pat || opt_force_ept) )
     {
         vmx_function_table.hap_supported = 1;
+        vmx_function_table.altp2m_supported = 1;
 
         vmx_function_table.hap_capabilities = 0;
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 82f1b32..3a94f8c 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -94,6 +94,9 @@ struct hvm_function_table {
     /* Necessary hardware support for PVH mode? */
     int pvh_supported;
 
+    /* Necessary hardware support for alternate p2m's? */
+    bool_t altp2m_supported;
+
     /* Indicate HAP capabilities. */
     int hap_capabilities;
 
@@ -530,6 +533,12 @@ static inline bool_t hvm_is_singlestep_supported(void)
             hvm_funcs.is_singlestep_supported());
 }
 
+/* returns true if hardware supports alternate p2m's */
+static inline bool_t hvm_altp2m_supported(void)
+{
+    return hvm_funcs.altp2m_supported;
+}
+
 #ifndef NDEBUG
 /* Permit use of the Forced Emulation Prefix in HVM guests */
 extern bool_t opt_hvm_fep;
-- 
1.9.1


* [PATCH v7 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (3 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 04/15] x86/HVM: Hardware alternate p2m support detection Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-23  9:22   ` Jan Beulich
  2015-07-22 23:01 ` [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Add the basic data structures needed to support alternate p2m's and
the functions to initialise them and tear them down.

Although Intel hardware can handle 512 EPTP's per hardware thread
concurrently, only 10 per domain are supported in this patch for
performance reasons.

This change also splits the p2m lock into one lock type for altp2m's
and another type for all other p2m's. The purpose of this is to place
the altp2m list lock between the types, so the list lock can be
acquired whilst holding the host p2m lock.
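
Sketched as an illustrative (not literal) code path that propagates a host
p2m change into the alternate p2m's, the resulting nesting discipline is:

    /* Lock order: p2m (host) -> altp2m list -> altp2m (per-table). */
    p2m_lock(hostp2m);           /* "p2m" lock class */
    altp2m_list_lock(d);         /* per-domain altp2m list lock */
    p2m_lock(altp2m);            /* "altp2m" lock class */
    /* ... propagate the change into the alternate view ... */
    p2m_unlock(altp2m);
    altp2m_list_unlock(d);
    p2m_unlock(hostp2m);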

Signed-off-by: Ed White <edmund.h.white@intel.com>
---
Changes since v6:
        move altp2m.c from hvm to mm
        move altp2m.h from x86/mm to x86
        change iterator types in hap_enable/hap_final_teardown
        fix altp2m init failure path in p2m_init
        rename altp2m_vcpu_update_eptp to altp2m_vcpu_update_p2m
        change uint16_t's/uint8_t's to unsigned int's
        rework p2m_get_altp2m
		
		not done - mechanical change of moving the domain struct's 
		    new altp2m members to a separately alloc'ed struct 
			(an initial stab at this patch has been written)

 xen/arch/x86/hvm/hvm.c         |  21 +++++++++
 xen/arch/x86/mm/Makefile       |   1 +
 xen/arch/x86/mm/altp2m.c       |  77 +++++++++++++++++++++++++++++++
 xen/arch/x86/mm/hap/hap.c      |  40 +++++++++++++++-
 xen/arch/x86/mm/mm-locks.h     |  46 ++++++++++++++++++-
 xen/arch/x86/mm/p2m.c          | 101 +++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/altp2m.h   |  38 ++++++++++++++++
 xen/include/asm-x86/domain.h   |  10 ++++
 xen/include/asm-x86/hvm/hvm.h  |  14 ++++++
 xen/include/asm-x86/hvm/vcpu.h |   9 ++++
 xen/include/asm-x86/p2m.h      |  34 +++++++++++++-
 11 files changed, 386 insertions(+), 5 deletions(-)
 create mode 100644 xen/arch/x86/mm/altp2m.c
 create mode 100644 xen/include/asm-x86/altp2m.h

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index eafaf9d..e8d2ac3 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -61,6 +61,7 @@
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/event.h>
 #include <asm/hvm/vmx/vmx.h>
+#include <asm/altp2m.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
 #include <public/sched.h>
@@ -2462,6 +2463,7 @@ void hvm_vcpu_destroy(struct vcpu *v)
 {
     hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
 
+    altp2m_vcpu_destroy(v);
     nestedhvm_vcpu_destroy(v);
 
     free_compat_arg_xlat(v);
@@ -6569,6 +6571,25 @@ void hvm_toggle_singlestep(struct vcpu *v)
     v->arch.hvm_vcpu.single_step = !v->arch.hvm_vcpu.single_step;
 }
 
+void altp2m_vcpu_update_p2m(struct vcpu *v)
+{
+    if ( hvm_funcs.altp2m_vcpu_update_p2m )
+        hvm_funcs.altp2m_vcpu_update_p2m(v);
+}
+
+void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v)
+{
+    if ( hvm_funcs.altp2m_vcpu_update_vmfunc_ve )
+        hvm_funcs.altp2m_vcpu_update_vmfunc_ve(v);
+}
+
+bool_t altp2m_vcpu_emulate_ve(struct vcpu *v)
+{
+    if ( hvm_funcs.altp2m_vcpu_emulate_ve )
+        return hvm_funcs.altp2m_vcpu_emulate_ve(v);
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/mm/Makefile b/xen/arch/x86/mm/Makefile
index ed4b1f8..aeccdfc 100644
--- a/xen/arch/x86/mm/Makefile
+++ b/xen/arch/x86/mm/Makefile
@@ -3,6 +3,7 @@ subdir-y += hap
 
 obj-y += paging.o
 obj-y += p2m.o p2m-pt.o p2m-ept.o p2m-pod.o
+obj-y += altp2m.o
 obj-y += guest_walk_2.o
 obj-y += guest_walk_3.o
 obj-$(x86_64) += guest_walk_4.o
diff --git a/xen/arch/x86/mm/altp2m.c b/xen/arch/x86/mm/altp2m.c
new file mode 100644
index 0000000..17b227c
--- /dev/null
+++ b/xen/arch/x86/mm/altp2m.c
@@ -0,0 +1,77 @@
+/*
+ * Alternate p2m HVM
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#include <asm/hvm/support.h>
+#include <asm/hvm/hvm.h>
+#include <asm/p2m.h>
+#include <asm/altp2m.h>
+
+void
+altp2m_vcpu_reset(struct vcpu *v)
+{
+    struct altp2mvcpu *av = &vcpu_altp2m(v);
+
+    av->p2midx = INVALID_ALTP2M;
+    av->veinfo_gfn = _gfn(INVALID_GFN);
+}
+
+void
+altp2m_vcpu_initialise(struct vcpu *v)
+{
+    if ( v != current )
+        vcpu_pause(v);
+
+    altp2m_vcpu_reset(v);
+    vcpu_altp2m(v).p2midx = 0;
+    atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+
+    altp2m_vcpu_update_p2m(v);
+
+    if ( v != current )
+        vcpu_unpause(v);
+}
+
+void
+altp2m_vcpu_destroy(struct vcpu *v)
+{
+    struct p2m_domain *p2m;
+
+    if ( v != current )
+        vcpu_pause(v);
+
+    if ( (p2m = p2m_get_altp2m(v)) )
+        atomic_dec(&p2m->active_vcpus);
+
+    altp2m_vcpu_reset(v);
+
+    altp2m_vcpu_update_p2m(v);
+    altp2m_vcpu_update_vmfunc_ve(v);
+
+    if ( v != current )
+        vcpu_unpause(v);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 63980af..2d5f6b3 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)
 int hap_enable(struct domain *d, u32 mode)
 {
     unsigned int old_pages;
-    uint8_t i;
+    unsigned int i;
     int rv = 0;
 
     domain_pause(d);
@@ -498,6 +498,28 @@ int hap_enable(struct domain *d, u32 mode)
            goto out;
     }
 
+    if ( hvm_altp2m_supported() )
+    {
+        /* Init alternate p2m data */
+        if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
+        {
+            rv = -ENOMEM;
+            goto out;
+        }
+
+        for ( i = 0; i < MAX_EPTP; i++ )
+            d->arch.altp2m_eptp[i] = INVALID_MFN;
+
+        for ( i = 0; i < MAX_ALTP2M; i++ )
+        {
+            rv = p2m_alloc_table(d->arch.altp2m_p2m[i]);
+            if ( rv != 0 )
+               goto out;
+        }
+
+        d->arch.altp2m_active = 0;
+    }
+
     /* Now let other users see the new mode */
     d->arch.paging.mode = mode | PG_HAP_enable;
 
@@ -508,7 +530,21 @@ int hap_enable(struct domain *d, u32 mode)
 
 void hap_final_teardown(struct domain *d)
 {
-    uint8_t i;
+    unsigned int i;
+
+    if ( hvm_altp2m_supported() )
+    {
+        d->arch.altp2m_active = 0;
+
+        if ( d->arch.altp2m_eptp )
+        {
+            free_xenheap_page(d->arch.altp2m_eptp);
+            d->arch.altp2m_eptp = NULL;
+        }
+
+        for ( i = 0; i < MAX_ALTP2M; i++ )
+            p2m_teardown(d->arch.altp2m_p2m[i]);
+    }
 
     /* Destroy nestedp2m's first */
     for (i = 0; i < MAX_NESTEDP2M; i++) {
diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index b4f035e..c66f105 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -217,7 +217,7 @@ declare_mm_lock(nestedp2m)
 #define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)->arch.nested_p2m_lock)
 #define nestedp2m_unlock(d) mm_unlock(&(d)->arch.nested_p2m_lock)
 
-/* P2M lock (per-p2m-table)
+/* P2M lock (per-non-alt-p2m-table)
  *
  * This protects all queries and updates to the p2m table.
  * Queries may be made under the read lock but all modifications
@@ -225,10 +225,52 @@ declare_mm_lock(nestedp2m)
  *
  * The write lock is recursive as it is common for a code path to look
  * up a gfn and later mutate it.
+ *
+ * Note that this lock shares its implementation with the altp2m
+ * lock (not the altp2m list lock), so the implementation
+ * is found there.
+ *
+ * Changes made to the host p2m when in altp2m mode are propagated to the
+ * altp2ms synchronously in ept_set_entry().  At that point, we will hold
+ * the host p2m lock; propagating this change involves grabbing the
+ * altp2m_list lock, and the locks of the individual alternate p2ms.  In
+ * order to allow us to maintain locking order discipline, we split the p2m
+ * lock into p2m (for host p2ms) and altp2m (for alternate p2ms), putting
+ * the altp2mlist lock in the middle.
  */
 
 declare_mm_rwlock(p2m);
-#define p2m_lock(p)           mm_write_lock(p2m, &(p)->lock);
+
+/* Alternate P2M list lock (per-domain)
+ *
+ * A per-domain lock that protects the list of alternate p2m's.
+ * Any operation that walks the list needs to acquire this lock.
+ * Additionally, before destroying an alternate p2m all VCPU's
+ * in the target domain must be paused.
+ */
+
+declare_mm_lock(altp2mlist)
+#define altp2m_list_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_list_lock)
+#define altp2m_list_unlock(d) mm_unlock(&(d)->arch.altp2m_list_lock)
+
+/* P2M lock (per-altp2m-table)
+ *
+ * This protects all queries and updates to the p2m table.
+ * Queries may be made under the read lock but all modifications
+ * need the main (write) lock.
+ *
+ * The write lock is recursive as it is common for a code path to look
+ * up a gfn and later mutate it.
+ */
+
+declare_mm_rwlock(altp2m);
+#define p2m_lock(p)                         \
+{                                           \
+    if ( p2m_is_altp2m(p) )                 \
+        mm_write_lock(altp2m, &(p)->lock);  \
+    else                                    \
+        mm_write_lock(p2m, &(p)->lock);     \
+}
 #define p2m_unlock(p)         mm_write_unlock(&(p)->lock);
 #define gfn_lock(p,g,o)       p2m_lock(p)
 #define gfn_unlock(p,g,o)     p2m_unlock(p)
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6fe6387..536f69c 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -35,6 +35,7 @@
 #include <asm/hvm/vmx/vmx.h> /* ept_p2m_init() */
 #include <asm/mem_sharing.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/altp2m.h>
 #include <asm/hvm/svm/amd-iommu-proto.h>
 #include <xsm/xsm.h>
 
@@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct domain *d)
     }
 }
 
+static void p2m_teardown_altp2m(struct domain *d)
+{
+    unsigned int i;
+    struct p2m_domain *p2m;
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( !d->arch.altp2m_p2m[i] )
+            continue;
+        p2m = d->arch.altp2m_p2m[i];
+        p2m_free_one(p2m);
+        d->arch.altp2m_p2m[i] = NULL;
+    }
+}
+
+static int p2m_init_altp2m(struct domain *d)
+{
+    unsigned int i;
+    struct p2m_domain *p2m;
+
+    mm_lock_init(&d->arch.altp2m_list_lock);
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        d->arch.altp2m_p2m[i] = p2m = p2m_init_one(d);
+        if ( p2m == NULL )
+        {
+            p2m_teardown_altp2m(d);
+            return -ENOMEM;
+        }
+        p2m->p2m_class = p2m_alternate;
+        p2m->access_required = 1;
+        _atomic_set(&p2m->active_vcpus, 0);
+    }
+
+    return 0;
+}
+
 int p2m_init(struct domain *d)
 {
     int rc;
@@ -196,7 +234,17 @@ int p2m_init(struct domain *d)
      * (p2m_init runs too early for HVM_PARAM_* options) */
     rc = p2m_init_nestedp2m(d);
     if ( rc )
+    {
+        p2m_teardown_hostp2m(d);
+        return rc;
+    }
+
+    rc = p2m_init_altp2m(d);
+    if ( rc )
+    {
         p2m_teardown_hostp2m(d);
+        p2m_teardown_nestedp2m(d);
+    }
 
     return rc;
 }
@@ -1940,6 +1988,59 @@ int unmap_mmio_regions(struct domain *d,
     return err;
 }
 
+unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
+{
+    struct p2m_domain *p2m;
+    struct ept_data *ept;
+    unsigned int i;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
+            continue;
+
+        p2m = d->arch.altp2m_p2m[i];
+        ept = &p2m->ept;
+
+        if ( eptp == ept_get_eptp(ept) )
+            goto out;
+    }
+
+    i = INVALID_ALTP2M;
+
+out:
+    altp2m_list_unlock(d);
+    return i;
+}
+
+bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx)
+{
+    struct domain *d = v->domain;
+    bool_t rc = 0;
+
+    if ( idx > MAX_ALTP2M )
+        return rc;
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        if ( idx != vcpu_altp2m(v).p2midx )
+        {
+            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+            vcpu_altp2m(v).p2midx = idx;
+            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+            altp2m_vcpu_update_p2m(v);
+        }
+        rc = 1;
+    }
+
+    altp2m_list_unlock(d);
+    return rc;
+}
+
 /*** Audit ***/
 
 #if P2M_AUDIT
diff --git a/xen/include/asm-x86/altp2m.h b/xen/include/asm-x86/altp2m.h
new file mode 100644
index 0000000..38de494
--- /dev/null
+++ b/xen/include/asm-x86/altp2m.h
@@ -0,0 +1,38 @@
+/*
+ * Alternate p2m HVM
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#ifndef _ALTP2M_H
+#define _ALTP2M_H
+
+#include <xen/types.h>
+#include <xen/sched.h>         /* for struct vcpu, struct domain */
+#include <asm/hvm/vcpu.h>      /* for vcpu_altp2m */
+
+/* Alternate p2m HVM on/off per domain */
+static inline bool_t altp2m_active(const struct domain *d)
+{
+    return d->arch.altp2m_active;
+}
+
+/* Alternate p2m VCPU */
+void altp2m_vcpu_initialise(struct vcpu *v);
+void altp2m_vcpu_destroy(struct vcpu *v);
+void altp2m_vcpu_reset(struct vcpu *v);
+
+#endif /* _ALTP2M_H */
+
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 7a9e96f..752b284 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -233,6 +233,10 @@ struct paging_vcpu {
 typedef xen_domctl_cpuid_t cpuid_input_t;
 
 #define MAX_NESTEDP2M 10
+
+#define MAX_ALTP2M      ((uint16_t)10)
+#define INVALID_ALTP2M  ((uint16_t)~0)
+#define MAX_EPTP        (PAGE_SIZE / sizeof(uint64_t))
 struct p2m_domain;
 struct time_scale {
     int shift;
@@ -307,6 +311,12 @@ struct arch_domain
     struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
     mm_lock_t nested_p2m_lock;
 
+    /* altp2m: allow multiple copies of host p2m */
+    bool_t altp2m_active;
+    struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
+    mm_lock_t altp2m_list_lock;
+    uint64_t *altp2m_eptp;
+
     /* NB. protected by d->event_lock and by irq_desc[irq].lock */
     struct radix_tree_root irq_pirq;
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 3a94f8c..0de061a 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -206,6 +206,11 @@ struct hvm_function_table {
 
     void (*enable_msr_exit_interception)(struct domain *d);
     bool_t (*is_singlestep_supported)(void);
+
+    /* Alternate p2m */
+    void (*altp2m_vcpu_update_p2m)(struct vcpu *v);
+    void (*altp2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
+    bool_t (*altp2m_vcpu_emulate_ve)(struct vcpu *v);
 };
 
 extern struct hvm_function_table hvm_funcs;
@@ -546,6 +551,15 @@ extern bool_t opt_hvm_fep;
 #define opt_hvm_fep 0
 #endif
 
+/* updates the current hardware p2m */
+void altp2m_vcpu_update_p2m(struct vcpu *v);
+
+/* updates VMCS fields related to VMFUNC and #VE */
+void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v);
+
+/* emulates #VE */
+bool_t altp2m_vcpu_emulate_ve(struct vcpu *v);
+
 #endif /* __ASM_X86_HVM_HVM_H__ */
 
 /*
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 0df4524..c033c8c 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -135,6 +135,13 @@ struct nestedvcpu {
 
 #define vcpu_nestedhvm(v) ((v)->arch.hvm_vcpu.nvcpu)
 
+struct altp2mvcpu {
+    uint16_t    p2midx;         /* alternate p2m index */
+    gfn_t       veinfo_gfn;     /* #VE information page gfn */
+};
+
+#define vcpu_altp2m(v) ((v)->arch.hvm_vcpu.avcpu)
+
 struct hvm_vcpu {
     /* Guest control-register and EFER values, just as the guest sees them. */
     unsigned long       guest_cr[5];
@@ -177,6 +184,8 @@ struct hvm_vcpu {
 
     struct nestedvcpu   nvcpu;
 
+    struct altp2mvcpu   avcpu;
+
     struct mtrr_state   mtrr;
     u64                 pat_cr;
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index b49c09b..2d56aad 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -175,6 +175,7 @@ typedef unsigned int p2m_query_t;
 typedef enum {
     p2m_host,
     p2m_nested,
+    p2m_alternate,
 } p2m_class_t;
 
 /* Per-p2m-table state */
@@ -193,7 +194,7 @@ struct p2m_domain {
 
     struct domain     *domain;   /* back pointer to domain */
 
-    p2m_class_t       p2m_class; /* host/nested/? */
+    p2m_class_t       p2m_class; /* host/nested/alternate */
 
     /* Nested p2ms only: nested p2m base value that this p2m shadows.
      * This can be cleared to P2M_BASE_EADDR under the per-p2m lock but
@@ -219,6 +220,9 @@ struct p2m_domain {
      * host p2m's lock. */
     int                defer_nested_flush;
 
+    /* Alternate p2m: count of vcpu's currently using this p2m. */
+    atomic_t           active_vcpus;
+
     /* Pages used to construct the p2m */
     struct page_list_head pages;
 
@@ -317,6 +321,11 @@ static inline bool_t p2m_is_nestedp2m(const struct p2m_domain *p2m)
     return p2m->p2m_class == p2m_nested;
 }
 
+static inline bool_t p2m_is_altp2m(const struct p2m_domain *p2m)
+{
+    return p2m->p2m_class == p2m_alternate;
+}
+
 #define p2m_get_pagetable(p2m)  ((p2m)->phys_table)
 
 /**** p2m query accessors. They lock p2m_lock, and thus serialize
@@ -722,6 +731,29 @@ void nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
     l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
 
 /*
+ * Alternate p2m: shadow p2m tables used for alternate memory views
+ */
+
+/* get current alternate p2m table */
+static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
+{
+    unsigned int index = vcpu_altp2m(v).p2midx;
+
+    if ( index == INVALID_ALTP2M )
+        return NULL;
+
+    BUG_ON(index >= MAX_ALTP2M);
+
+    return v->domain->arch.altp2m_p2m[index];
+}
+
+/* Locate an alternate p2m by its EPTP */
+unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp);
+
+/* Switch alternate p2m for a single vcpu */
+bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx);
+
+/*
  * p2m type to IOMMU flags
  */
 static inline unsigned int p2m_get_iommu_flags(p2m_type_t p2mt)
-- 
1.9.1


* [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (4 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 05/15] x86/altp2m: basic data structures and support routines Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-23  9:43   ` Jan Beulich
  2015-07-22 23:01 ` [PATCH v7 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Implement and hook up the code to enable VMX support of VMFUNC and #VE.

VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.
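
For context, a sketch of the guest-side consumer of the ve_info_t page defined
in patch 02 (hypothetical handler, not part of this patch): hardware delivers
#VE only while the semaphore word is zero, so the handler must re-arm it after
reading out the data.

    void guest_handle_ve(volatile ve_info_t *ve)
    {
        uint64_t gpa = ve->gpa;         /* faulting guest-physical address */
        uint16_t idx = ve->eptp_index;  /* altp2m view active at the fault */

        /* ... react, e.g. VMFUNC to a more permissive view ... */

        ve->semaphore = 0;              /* permit delivery of the next #VE */
    }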

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
---
Changes since v6:
        remove casts around p2midx handling
        fix veinfo semaphore initialization
        mechanical changes due to patch 5 changes

 xen/arch/x86/hvm/vmx/vmx.c | 139 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 139 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 4f8b0e0..269d160 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -56,6 +56,7 @@
 #include <asm/debugger.h>
 #include <asm/apic.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/altp2m.h>
 #include <asm/event.h>
 #include <asm/monitor.h>
 #include <public/arch-x86/cpuid.h>
@@ -1770,6 +1771,105 @@ static bool_t vmx_is_singlestep_supported(void)
     return cpu_has_monitor_trap_flag;
 }
 
+static void vmx_vcpu_update_eptp(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    struct p2m_domain *p2m = NULL;
+    struct ept_data *ept;
+
+    if ( altp2m_active(d) )
+        p2m = p2m_get_altp2m(v);
+    if ( !p2m )
+        p2m = p2m_get_hostp2m(d);
+
+    ept = &p2m->ept;
+    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
+
+    vmx_vmcs_enter(v);
+
+    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
+
+    if ( v->arch.hvm_vmx.secondary_exec_control &
+        SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
+        __vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);
+
+    vmx_vmcs_exit(v);
+}
+
+static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
+
+    if ( !cpu_has_vmx_vmfunc )
+        return;
+
+    if ( cpu_has_vmx_virt_exceptions )
+        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+
+    vmx_vmcs_enter(v);
+
+    if ( !d->is_dying && altp2m_active(d) )
+    {
+        v->arch.hvm_vmx.secondary_exec_control |= mask;
+        __vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
+        __vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
+
+        if ( cpu_has_vmx_virt_exceptions )
+        {
+            p2m_type_t t;
+            mfn_t mfn;
+
+            mfn = get_gfn_query_unlocked(d, gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
+
+            if ( mfn_x(mfn) != INVALID_MFN )
+                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
+            else
+                v->arch.hvm_vmx.secondary_exec_control &=
+                    ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+        }
+    }
+    else
+        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
+
+    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
+        v->arch.hvm_vmx.secondary_exec_control);
+
+    vmx_vmcs_exit(v);
+}
+
+static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
+{
+    bool_t rc = 0;
+    ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) != INVALID_GFN ?
+        hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn), 0) : NULL;
+
+    if ( !veinfo )
+        return 0;
+
+    if ( veinfo->semaphore != 0 )
+        goto out;
+
+    rc = 1;
+
+    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
+    veinfo->semaphore = ~0;
+    veinfo->eptp_index = vcpu_altp2m(v).p2midx;
+
+    vmx_vmcs_enter(v);
+    __vmread(EXIT_QUALIFICATION, &veinfo->exit_qualification);
+    __vmread(GUEST_LINEAR_ADDRESS, &veinfo->gla);
+    __vmread(GUEST_PHYSICAL_ADDRESS, &veinfo->gpa);
+    vmx_vmcs_exit(v);
+
+    hvm_inject_hw_exception(TRAP_virtualisation,
+                            HVM_DELIVER_NO_ERROR_CODE);
+
+out:
+    hvm_unmap_guest_frame(veinfo, 0);
+    return rc;
+}
+
 static struct hvm_function_table __initdata vmx_function_table = {
     .name                 = "VMX",
     .cpu_up_prepare       = vmx_cpu_up_prepare,
@@ -1828,6 +1928,9 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .hypervisor_cpuid_leaf = vmx_hypervisor_cpuid_leaf,
     .enable_msr_exit_interception = vmx_enable_msr_exit_interception,
     .is_singlestep_supported = vmx_is_singlestep_supported,
+    .altp2m_vcpu_update_p2m = vmx_vcpu_update_eptp,
+    .altp2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
+    .altp2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2769,6 +2872,42 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
     /* Now enable interrupts so it's safe to take locks. */
     local_irq_enable();
 
+    /*
+     * If the guest has the ability to switch EPTP without an exit,
+     * figure out whether it has done so and update the altp2m data.
+     */
+    if ( altp2m_active(v->domain) &&
+        (v->arch.hvm_vmx.secondary_exec_control &
+        SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
+    {
+        unsigned long idx;
+
+        if ( v->arch.hvm_vmx.secondary_exec_control &
+            SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
+            __vmread(EPTP_INDEX, &idx);
+        else
+        {
+            unsigned long eptp;
+
+            __vmread(EPT_POINTER, &eptp);
+
+            if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
+                 INVALID_ALTP2M )
+            {
+                gdprintk(XENLOG_ERR, "EPTP not found in alternate p2m list\n");
+                domain_crash(v->domain);
+            }
+        }
+
+        if ( idx != vcpu_altp2m(v).p2midx )
+        {
+            BUG_ON(idx >= MAX_ALTP2M);
+            atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+            vcpu_altp2m(v).p2midx = idx;
+            atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+        }
+    }
+
     /* XXX: This looks ugly, but we need a mechanism to ensure
      * any pending vmresume has really happened
      */
-- 
1.9.1


* [PATCH v7 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (5 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 08/15] x86/altp2m: add control of suppress_ve Ed White
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

From: Ravi Sahita <ravi.sahita@intel.com>

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v6:
        remove incorrect cast
        add Jan's ack

 xen/arch/x86/hvm/emulate.c             | 18 +++++++++++++++--
 xen/arch/x86/hvm/vmx/vmx.c             | 36 ++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_emulate/x86_emulate.c | 19 ++++++++++++------
 xen/arch/x86/x86_emulate/x86_emulate.h |  4 ++++
 xen/include/asm-x86/hvm/hvm.h          |  2 ++
 5 files changed, 71 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 15c2496..30acb78 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1593,6 +1593,18 @@ static int hvmemul_invlpg(
     return rc;
 }
 
+static int hvmemul_vmfunc(
+    struct x86_emulate_ctxt *ctxt)
+{
+    int rc;
+
+    rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
+    if ( rc != X86EMUL_OKAY )
+        hvmemul_inject_hw_exception(TRAP_invalid_op, 0, ctxt);
+
+    return rc;
+}
+
 static const struct x86_emulate_ops hvm_emulate_ops = {
     .read          = hvmemul_read,
     .insn_fetch    = hvmemul_insn_fetch,
@@ -1616,7 +1628,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
     .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
-    .invlpg        = hvmemul_invlpg
+    .invlpg        = hvmemul_invlpg,
+    .vmfunc        = hvmemul_vmfunc,
 };
 
 static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
@@ -1642,7 +1655,8 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
     .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
-    .invlpg        = hvmemul_invlpg
+    .invlpg        = hvmemul_invlpg,
+    .vmfunc        = hvmemul_vmfunc,
 };
 
 static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 269d160..4eaa97e 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -82,6 +82,7 @@ static void vmx_fpu_dirty_intercept(void);
 static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content);
 static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
 static void vmx_invlpg_intercept(unsigned long vaddr);
+static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
 
 uint8_t __read_mostly posted_intr_vector;
 
@@ -1838,6 +1839,19 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
     vmx_vmcs_exit(v);
 }
 
+static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
+{
+    int rc = X86EMUL_EXCEPTION;
+    struct vcpu *curr = current;
+
+    if ( !cpu_has_vmx_vmfunc && altp2m_active(curr->domain) &&
+         regs->_eax == 0 &&
+         p2m_switch_vcpu_altp2m_by_id(curr, regs->_ecx) )
+        rc = X86EMUL_OKAY;
+
+    return rc;
+}
+
 static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
 {
     bool_t rc = 0;
@@ -1906,6 +1920,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .msr_read_intercept   = vmx_msr_read_intercept,
     .msr_write_intercept  = vmx_msr_write_intercept,
     .invlpg_intercept     = vmx_invlpg_intercept,
+    .vmfunc_intercept     = vmx_vmfunc_intercept,
     .handle_cd            = vmx_handle_cd,
     .set_info_guest       = vmx_set_info_guest,
     .set_rdtsc_exiting    = vmx_set_rdtsc_exiting,
@@ -1931,6 +1946,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .altp2m_vcpu_update_p2m = vmx_vcpu_update_eptp,
     .altp2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
     .altp2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
+    .altp2m_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2102,6 +2118,19 @@ static void vmx_invlpg_intercept(unsigned long vaddr)
         vpid_sync_vcpu_gva(curr, vaddr);
 }
 
+static int vmx_vmfunc_intercept(struct cpu_user_regs *regs)
+{
+    /*
+     * This handler is a placeholder for a future where Xen may
+     * want to handle VMFUNC exits and resume a domain normally without
+     * injecting a #UD into the guest - for example, in a VT-nested
+     * scenario where Xen may want to lazily shadow the alternate
+     * EPTP list.
+     */
+    gdprintk(XENLOG_ERR, "Failed guest VMFUNC execution\n");
+    return X86EMUL_EXCEPTION;
+}
+
 static int vmx_cr_access(unsigned long exit_qualification)
 {
     struct vcpu *curr = current;
@@ -3260,6 +3289,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             update_guest_eip();
         break;
 
+    case EXIT_REASON_VMFUNC:
+        if ( vmx_vmfunc_intercept(regs) != X86EMUL_OKAY )
+            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        else
+            update_guest_eip();
+        break;
+
     case EXIT_REASON_MWAIT_INSTRUCTION:
     case EXIT_REASON_MONITOR_INSTRUCTION:
     case EXIT_REASON_GETSEC:
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index c017c69..e596131 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -3786,6 +3786,7 @@ x86_emulate(
         break;
     }
 
+ no_writeback:
     /* Inject #DB if single-step tracing was enabled at instruction start. */
     if ( (ctxt->regs->eflags & EFLG_TF) && (rc == X86EMUL_OKAY) &&
          (ops->inject_hw_exception != NULL) )
@@ -3816,19 +3817,17 @@ x86_emulate(
         struct segment_register reg;
         unsigned long base, limit, cr0, cr0w;
 
-        if ( modrm == 0xdf ) /* invlpga */
+        switch( modrm )
         {
+        case 0xdf: /* invlpga */
             generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
             generate_exception_if(!mode_ring0(), EXC_GP, 0);
             fail_if(ops->invlpg == NULL);
             if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.eax),
                                    ctxt)) )
                 goto done;
-            break;
-        }
-
-        if ( modrm == 0xf9 ) /* rdtscp */
-        {
+            goto no_writeback;
+        case 0xf9: /* rdtscp */ {
             uint64_t tsc_aux;
             fail_if(ops->read_msr == NULL);
             if ( (rc = ops->read_msr(MSR_TSC_AUX, &tsc_aux, ctxt)) != 0 )
@@ -3836,6 +3835,14 @@ x86_emulate(
             _regs.ecx = (uint32_t)tsc_aux;
             goto rdtsc;
         }
+        case 0xd4: /* vmfunc */
+            generate_exception_if(lock_prefix | rep_prefix() | (vex.pfx == vex_66),
+                                  EXC_UD, -1);
+            fail_if(ops->vmfunc == NULL);
+            if ( (rc = ops->vmfunc(ctxt)) != X86EMUL_OKAY )
+                goto done;
+            goto no_writeback;
+        }
 
         switch ( modrm_reg & 7 )
         {
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 064b8f4..a4d4ec8 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -397,6 +397,10 @@ struct x86_emulate_ops
         enum x86_segment seg,
         unsigned long offset,
         struct x86_emulate_ctxt *ctxt);
+
+    /* vmfunc: Emulate VMFUNC via the given EAX/ECX inputs */
+    int (*vmfunc)(
+        struct x86_emulate_ctxt *ctxt);
 };
 
 struct cpu_user_regs;
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 0de061a..425327a 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -167,6 +167,7 @@ struct hvm_function_table {
     int (*msr_read_intercept)(unsigned int msr, uint64_t *msr_content);
     int (*msr_write_intercept)(unsigned int msr, uint64_t msr_content);
     void (*invlpg_intercept)(unsigned long vaddr);
+    int (*vmfunc_intercept)(struct cpu_user_regs *regs);
     void (*handle_cd)(struct vcpu *v, unsigned long value);
     void (*set_info_guest)(struct vcpu *v);
     void (*set_rdtsc_exiting)(struct vcpu *v, bool_t);
@@ -211,6 +212,7 @@ struct hvm_function_table {
     void (*altp2m_vcpu_update_p2m)(struct vcpu *v);
     void (*altp2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
     bool_t (*altp2m_vcpu_emulate_ve)(struct vcpu *v);
+    int (*altp2m_vcpu_emulate_vmfunc)(struct cpu_user_regs *regs);
 };
 
 extern struct hvm_function_table hvm_funcs;
-- 
1.9.1


* [PATCH v7 08/15] x86/altp2m: add control of suppress_ve.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (6 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 09/15] x86/altp2m: alternate p2m memory events Ed White
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

From: George Dunlap <george.dunlap@eu.citrix.com>

The existing ept_set_entry() and ept_get_entry() routines are extended
to optionally set/get suppress_ve.  Passing -1 sets suppress_ve on new
p2m entries and retains the suppress_ve flag on existing entries.
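
For illustration, the new tristate resolves as in this stand-alone sketch
(not part of the patch; it merely models the logic added to ept_set_entry()
below):

    #include <assert.h>

    /* sve == -1: keep the old flag, defaulting to 1 for new entries;
     * sve == 0/1: force the flag (clearing it enables #VE delivery). */
    static int resolve_sve(int sve, int entry_valid, int old_sve)
    {
        if ( sve != -1 )
            return !!sve;
        return entry_valid ? old_sve : 1;
    }

    int main(void)
    {
        assert(resolve_sve(-1, 0, 0) == 1); /* new entry: default on */
        assert(resolve_sve(-1, 1, 0) == 0); /* existing entry: preserved */
        assert(resolve_sve(0, 1, 1) == 0);  /* explicit clear */
        return 0;
    }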

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
---
Changes since v6:
        formatting changes requested by Andrew

 xen/arch/x86/mm/mem_sharing.c |  4 ++--
 xen/arch/x86/mm/p2m-ept.c     | 16 ++++++++++++---
 xen/arch/x86/mm/p2m-pod.c     | 12 +++++------
 xen/arch/x86/mm/p2m-pt.c      | 10 +++++++--
 xen/arch/x86/mm/p2m.c         | 48 +++++++++++++++++++++----------------------
 xen/include/asm-x86/p2m.h     | 24 ++++++++++++----------
 6 files changed, 66 insertions(+), 48 deletions(-)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 1a01e45..d2e3786 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1260,7 +1260,7 @@ int relinquish_shared_pages(struct domain *d)
 
         if ( atomic_read(&d->shr_pages) == 0 )
             break;
-        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
         if ( mfn_valid(mfn) && (t == p2m_ram_shared) )
         {
             /* Does not fail with ENOMEM given the DESTROY flag */
@@ -1270,7 +1270,7 @@ int relinquish_shared_pages(struct domain *d)
              * unshare.  Must succeed: we just read the old entry and
              * we hold the p2m lock. */
             set_rc = p2m->set_entry(p2m, gfn, _mfn(0), PAGE_ORDER_4K,
-                                    p2m_invalid, p2m_access_rwx);
+                                    p2m_invalid, p2m_access_rwx, -1);
             ASSERT(set_rc == 0);
             count += 0x10;
         }
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index b532811..b4c65f9 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -658,7 +658,8 @@ bool_t ept_handle_misconfig(uint64_t gpa)
  */
 static int
 ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn, 
-              unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma)
+              unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma,
+              int sve)
 {
     ept_entry_t *table, *ept_entry = NULL;
     unsigned long gfn_remainder = gfn;
@@ -804,7 +805,11 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
     }
 
-    new_entry.suppress_ve = 1;
+    if ( sve != -1 )
+        new_entry.suppress_ve = !!sve;
+    else
+        new_entry.suppress_ve = is_epte_valid(&old_entry) ?
+                                    old_entry.suppress_ve : 1;
 
     rc = atomic_write_ept_entry(ept_entry, new_entry, target);
     if ( unlikely(rc) )
@@ -852,7 +857,8 @@ out:
 /* Read ept p2m entries */
 static mfn_t ept_get_entry(struct p2m_domain *p2m,
                            unsigned long gfn, p2m_type_t *t, p2m_access_t* a,
-                           p2m_query_t q, unsigned int *page_order)
+                           p2m_query_t q, unsigned int *page_order,
+                           bool_t *sve)
 {
     ept_entry_t *table = map_domain_page(_mfn(pagetable_get_pfn(p2m_get_pagetable(p2m))));
     unsigned long gfn_remainder = gfn;
@@ -866,6 +872,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
 
     *t = p2m_mmio_dm;
     *a = p2m_access_n;
+    if ( sve )
+        *sve = 1;
 
     /* This pfn is higher than the highest the p2m map currently holds */
     if ( gfn > p2m->max_mapped_pfn )
@@ -931,6 +939,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
         else
             *t = ept_entry->sa_p2mt;
         *a = ept_entry->access;
+        if ( sve )
+            *sve = ept_entry->suppress_ve;
 
         mfn = _mfn(ept_entry->mfn);
         if ( i )
diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 6e27bcd..6aee85a 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -536,7 +536,7 @@ recount:
         p2m_access_t a;
         p2m_type_t t;
 
-        (void)p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL);
+        (void)p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL, NULL);
 
         if ( t == p2m_populate_on_demand )
             pod++;
@@ -587,7 +587,7 @@ recount:
         p2m_type_t t;
         p2m_access_t a;
 
-        mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL, NULL);
         if ( t == p2m_populate_on_demand )
         {
             p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0, p2m_invalid,
@@ -676,7 +676,7 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
     for ( i=0; i<SUPERPAGE_PAGES; i++ )
     {
         p2m_access_t a; 
-        mfn = p2m->get_entry(p2m, gfn + i, &type, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gfn + i, &type, &a, 0, NULL, NULL);
 
         if ( i == 0 )
         {
@@ -808,7 +808,7 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
     for ( i=0; i<count; i++ )
     {
         p2m_access_t a;
-        mfns[i] = p2m->get_entry(p2m, gfns[i], types + i, &a, 0, NULL);
+        mfns[i] = p2m->get_entry(p2m, gfns[i], types + i, &a, 0, NULL, NULL);
         /* If this is ram, and not a pagetable or from the xen heap, and probably not mapped
            elsewhere, map it; otherwise, skip. */
         if ( p2m_is_ram(types[i])
@@ -947,7 +947,7 @@ p2m_pod_emergency_sweep(struct p2m_domain *p2m)
     for ( i=p2m->pod.reclaim_single; i > 0 ; i-- )
     {
         p2m_access_t a;
-        (void)p2m->get_entry(p2m, i, &t, &a, 0, NULL);
+        (void)p2m->get_entry(p2m, i, &t, &a, 0, NULL, NULL);
         if ( p2m_is_ram(t) )
         {
             gfns[j] = i;
@@ -1135,7 +1135,7 @@ guest_physmap_mark_populate_on_demand(struct domain *d, unsigned long gfn,
     for ( i = 0; i < (1UL << order); i++ )
     {
         p2m_access_t a;
-        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL);
+        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL, NULL);
         if ( p2m_is_ram(ot) )
         {
             P2M_DEBUG("gfn_to_mfn returned type %d!\n", ot);
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index a6dd464..926de0b 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -482,7 +482,8 @@ int p2m_pt_handle_deferred_changes(uint64_t gpa)
 /* Returns: 0 for success, -errno for failure */
 static int
 p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
-                 unsigned int page_order, p2m_type_t p2mt, p2m_access_t p2ma)
+                 unsigned int page_order, p2m_type_t p2mt, p2m_access_t p2ma,
+                 int sve)
 {
     /* XXX -- this might be able to be faster iff current->domain == d */
     void *table;
@@ -495,6 +496,8 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
     unsigned int iommu_pte_flags = p2m_get_iommu_flags(p2mt);
     unsigned long old_mfn = 0;
 
+    ASSERT(sve != 0);
+
     if ( tb_init_done )
     {
         struct {
@@ -689,7 +692,7 @@ static inline p2m_type_t recalc_type(bool_t recalc, p2m_type_t t,
 static mfn_t
 p2m_pt_get_entry(struct p2m_domain *p2m, unsigned long gfn,
                  p2m_type_t *t, p2m_access_t *a, p2m_query_t q,
-                 unsigned int *page_order)
+                 unsigned int *page_order, bool_t *sve)
 {
     mfn_t mfn;
     paddr_t addr = ((paddr_t)gfn) << PAGE_SHIFT;
@@ -701,6 +704,9 @@ p2m_pt_get_entry(struct p2m_domain *p2m, unsigned long gfn,
 
     ASSERT(paging_mode_translate(p2m->domain));
 
+    if ( sve )
+        *sve = 1;
+
     /* XXX This is for compatibility with the old model, where anything not 
      * XXX marked as RAM was considered to be emulated MMIO space.
      * XXX Once we start explicitly registering MMIO regions in the p2m 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 536f69c..6143d4e 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -345,7 +345,7 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
         /* Grab the lock here, don't release until put_gfn */
         gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order);
+    mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL);
 
     if ( (q & P2M_UNSHARE) && p2m_is_shared(*t) )
     {
@@ -354,7 +354,7 @@ mfn_t __get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
          * sleeping. */
         if ( mem_sharing_unshare_page(p2m->domain, gfn, 0) < 0 )
             (void)mem_sharing_notify_enomem(p2m->domain, gfn, 0);
-        mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order);
+        mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL);
     }
 
     if (unlikely((p2m_is_broken(*t))))
@@ -458,7 +458,7 @@ int p2m_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         else
             order = 0;
 
-        set_rc = p2m->set_entry(p2m, gfn, mfn, order, p2mt, p2ma);
+        set_rc = p2m->set_entry(p2m, gfn, mfn, order, p2mt, p2ma, -1);
         if ( set_rc )
             rc = set_rc;
 
@@ -622,7 +622,7 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn, unsigned long mfn,
     {
         for ( i = 0; i < (1UL << page_order); i++ )
         {
-            mfn_return = p2m->get_entry(p2m, gfn + i, &t, &a, 0, NULL);
+            mfn_return = p2m->get_entry(p2m, gfn + i, &t, &a, 0, NULL, NULL);
             if ( !p2m_is_grant(t) && !p2m_is_shared(t) && !p2m_is_foreign(t) )
                 set_gpfn_from_mfn(mfn+i, INVALID_M2P_ENTRY);
             ASSERT( !p2m_is_valid(t) || mfn + i == mfn_x(mfn_return) );
@@ -685,7 +685,7 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
     /* First, remove m->p mappings for existing p->m mappings */
     for ( i = 0; i < (1UL << page_order); i++ )
     {
-        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL);
+        omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL, NULL);
         if ( p2m_is_shared(ot) )
         {
             /* Do an unshare to cleanly take care of all corner 
@@ -709,7 +709,7 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
                 (void)mem_sharing_notify_enomem(p2m->domain, gfn + i, 0);
                 return rc;
             }
-            omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL);
+            omfn = p2m->get_entry(p2m, gfn + i, &ot, &a, 0, NULL, NULL);
             ASSERT(!p2m_is_shared(ot));
         }
         if ( p2m_is_grant(ot) || p2m_is_foreign(ot) )
@@ -757,7 +757,7 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
              * address */
             P2M_DEBUG("aliased! mfn=%#lx, old gfn=%#lx, new gfn=%#lx\n",
                       mfn + i, ogfn, gfn + i);
-            omfn = p2m->get_entry(p2m, ogfn, &ot, &a, 0, NULL);
+            omfn = p2m->get_entry(p2m, ogfn, &ot, &a, 0, NULL, NULL);
             if ( p2m_is_ram(ot) && !p2m_is_paged(ot) )
             {
                 ASSERT(mfn_valid(omfn));
@@ -824,7 +824,7 @@ int p2m_change_type_one(struct domain *d, unsigned long gfn,
 
     gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, &pt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &pt, &a, 0, NULL, NULL);
     rc = likely(pt == ot)
          ? p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, nt,
                          p2m->default_access)
@@ -908,7 +908,7 @@ static int set_typed_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
         return -EIO;
 
     gfn_lock(p2m, gfn, 0);
-    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL);
+    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL, NULL);
     if ( p2m_is_grant(ot) || p2m_is_foreign(ot) )
     {
         p2m_unlock(p2m);
@@ -959,7 +959,7 @@ int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
         return -EIO;
 
     gfn_lock(p2m, gfn, 0);
-    actual_mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL);
+    actual_mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
 
     /* Do not use mfn_valid() here as it will usually fail for MMIO pages. */
     if ( (INVALID_MFN == mfn_x(actual_mfn)) || (t != p2m_mmio_direct) )
@@ -995,7 +995,7 @@ int set_shared_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
         return -EIO;
 
     gfn_lock(p2m, gfn, 0);
-    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL);
+    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL, NULL);
     /* At the moment we only allow p2m change if gfn has already been made
      * sharable first */
     ASSERT(p2m_is_shared(ot));
@@ -1047,7 +1047,7 @@ int p2m_mem_paging_nominate(struct domain *d, unsigned long gfn)
 
     gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
 
     /* Check if mfn is valid */
     if ( !mfn_valid(mfn) )
@@ -1109,7 +1109,7 @@ int p2m_mem_paging_evict(struct domain *d, unsigned long gfn)
     gfn_lock(p2m, gfn, 0);
 
     /* Get mfn */
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
     if ( unlikely(!mfn_valid(mfn)) )
         goto out;
 
@@ -1241,7 +1241,7 @@ void p2m_mem_paging_populate(struct domain *d, unsigned long gfn)
 
     /* Fix p2m mapping */
     gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
     /* Allow only nominated or evicted pages to enter page-in path */
     if ( p2mt == p2m_ram_paging_out || p2mt == p2m_ram_paged )
     {
@@ -1303,7 +1303,7 @@ int p2m_mem_paging_prep(struct domain *d, unsigned long gfn, uint64_t buffer)
 
     gfn_lock(p2m, gfn, 0);
 
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
 
     ret = -ENOENT;
     /* Allow missing pages */
@@ -1391,7 +1391,7 @@ void p2m_mem_paging_resume(struct domain *d, vm_event_response_t *rsp)
         unsigned long gfn = rsp->u.mem_access.gfn;
 
         gfn_lock(p2m, gfn, 0);
-        mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+        mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
         /*
          * Allow only pages which were prepared properly, or pages which
          * were nominated but not evicted.
@@ -1540,11 +1540,11 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
      * These calls to p2m->set_entry() must succeed: we have the gfn
      * locked and just did a successful get_entry(). */
     gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, NULL);
 
     if ( npfec.write_access && p2ma == p2m_access_rx2rw ) 
     {
-        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw);
+        rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2mt, p2m_access_rw, -1);
         ASSERT(rc == 0);
         gfn_unlock(p2m, gfn, 0);
         return 1;
@@ -1553,7 +1553,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     {
         ASSERT(npfec.write_access || npfec.read_access || npfec.insn_fetch);
         rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
-                            p2mt, p2m_access_rwx);
+                            p2mt, p2m_access_rwx, -1);
         ASSERT(rc == 0);
     }
     gfn_unlock(p2m, gfn, 0);
@@ -1573,14 +1573,14 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
         else
         {
             gfn_lock(p2m, gfn, 0);
-            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+            mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, NULL);
             if ( p2ma != p2m_access_n2rwx )
             {
                 /* A listener is not required, so clear the access
                  * restrictions.  This set must succeed: we have the
                  * gfn locked and just did a successful get_entry(). */
                 rc = p2m->set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
-                                    p2mt, p2m_access_rwx);
+                                    p2mt, p2m_access_rwx, -1);
                 ASSERT(rc == 0);
             }
             gfn_unlock(p2m, gfn, 0);
@@ -1711,8 +1711,8 @@ long p2m_set_mem_access(struct domain *d, gfn_t gfn, uint32_t nr,
     p2m_lock(p2m);
     for ( gfn_l = gfn_x(gfn) + start; nr > start; ++gfn_l )
     {
-        mfn = p2m->get_entry(p2m, gfn_l, &t, &_a, 0, NULL);
-        rc = p2m->set_entry(p2m, gfn_l, mfn, PAGE_ORDER_4K, t, a);
+        mfn = p2m->get_entry(p2m, gfn_l, &t, &_a, 0, NULL, NULL);
+        rc = p2m->set_entry(p2m, gfn_l, mfn, PAGE_ORDER_4K, t, a, -1);
         if ( rc )
             break;
 
@@ -1761,7 +1761,7 @@ int p2m_get_mem_access(struct domain *d, gfn_t gfn, xenmem_access_t *access)
     }
 
     gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, gfn_x(gfn), &t, &a, 0, NULL);
+    mfn = p2m->get_entry(p2m, gfn_x(gfn), &t, &a, 0, NULL, NULL);
     gfn_unlock(p2m, gfn, 0);
 
     if ( mfn_x(mfn) == INVALID_MFN )
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 2d56aad..e7ee135 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -226,17 +226,19 @@ struct p2m_domain {
     /* Pages used to construct the p2m */
     struct page_list_head pages;
 
-    int                (*set_entry   )(struct p2m_domain *p2m,
-                                       unsigned long gfn,
-                                       mfn_t mfn, unsigned int page_order,
-                                       p2m_type_t p2mt,
-                                       p2m_access_t p2ma);
-    mfn_t              (*get_entry   )(struct p2m_domain *p2m,
-                                       unsigned long gfn,
-                                       p2m_type_t *p2mt,
-                                       p2m_access_t *p2ma,
-                                       p2m_query_t q,
-                                       unsigned int *page_order);
+    int                (*set_entry)(struct p2m_domain *p2m,
+                                    unsigned long gfn,
+                                    mfn_t mfn, unsigned int page_order,
+                                    p2m_type_t p2mt,
+                                    p2m_access_t p2ma,
+                                    int sve);
+    mfn_t              (*get_entry)(struct p2m_domain *p2m,
+                                    unsigned long gfn,
+                                    p2m_type_t *p2mt,
+                                    p2m_access_t *p2ma,
+                                    p2m_query_t q,
+                                    unsigned int *page_order,
+                                    bool_t *sve);
     void               (*enable_hardware_log_dirty)(struct p2m_domain *p2m);
     void               (*disable_hardware_log_dirty)(struct p2m_domain *p2m);
     void               (*flush_hardware_cached_dirty)(struct p2m_domain *p2m);
-- 
1.9.1


* [PATCH v7 09/15] x86/altp2m: alternate p2m memory events.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (7 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 08/15] x86/altp2m: add control of suppress_ve Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 10/15] x86/altp2m: add remaining support routines Ed White
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Add a flag to indicate that a memory event occurred in an alternate p2m
and a field containing the p2m index. Allow any event response to switch
to a different alternate p2m using the same flag and field.

Modify p2m_mem_access_check() to handle alternate p2m's.
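
From the listener side, the new flag/field pair would be used along these
lines (a sketch only; ring setup and the remaining response fields follow
the usual vm_event pattern and are elided):

    #include <stdint.h>
    #include <string.h>
    #include <xen/vm_event.h>

    /* Hypothetical helper: ask Xen to resume the faulting vcpu in a
     * different (e.g. more restrictive) altp2m view. */
    static void fill_altp2m_response(const vm_event_request_t *req,
                                     vm_event_response_t *rsp,
                                     uint16_t view_idx)
    {
        memset(rsp, 0, sizeof(*rsp));
        rsp->vcpu_id = req->vcpu_id;
        /* Keep the pause flag so the vcpu is unpaused on resume. */
        rsp->flags = req->flags & VM_EVENT_FLAG_VCPU_PAUSED;
        /* New in this patch: request a p2m view switch. */
        rsp->flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
        rsp->altp2m_idx = view_idx;
    }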

Signed-off-by: Ed White <edmund.h.white@intel.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> for the x86 bits.
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Tamas K Lengyel <tlengyel@novetta.com>
---
Changes since v6:
        no changes

 xen/arch/x86/mm/p2m.c         | 19 ++++++++++++++++++-
 xen/common/vm_event.c         |  4 ++++
 xen/include/asm-arm/p2m.h     |  6 ++++++
 xen/include/asm-x86/p2m.h     |  3 +++
 xen/include/public/vm_event.h | 12 ++++++++++++
 5 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6143d4e..ac901ac 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1521,6 +1521,12 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
     }
 }
 
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
+{
+    if ( altp2m_active(v->domain) )
+        p2m_switch_vcpu_altp2m_by_id(v, idx);
+}
+
 bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
                             struct npfec npfec,
                             vm_event_request_t **req_ptr)
@@ -1528,7 +1534,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     struct vcpu *v = current;
     unsigned long gfn = gpa >> PAGE_SHIFT;
     struct domain *d = v->domain;    
-    struct p2m_domain* p2m = p2m_get_hostp2m(d);
+    struct p2m_domain *p2m = NULL;
     mfn_t mfn;
     p2m_type_t p2mt;
     p2m_access_t p2ma;
@@ -1536,6 +1542,11 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
     int rc;
     unsigned long eip = guest_cpu_user_regs()->eip;
 
+    if ( altp2m_active(d) )
+        p2m = p2m_get_altp2m(v);
+    if ( !p2m )
+        p2m = p2m_get_hostp2m(d);
+
     /* First, handle rx2rw conversion automatically.
      * These calls to p2m->set_entry() must succeed: we have the gfn
      * locked and just did a successful get_entry(). */
@@ -1650,6 +1661,12 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
         req->vcpu_id = v->vcpu_id;
 
         p2m_vm_event_fill_regs(req);
+
+        if ( altp2m_active(v->domain) )
+        {
+            req->flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
+            req->altp2m_idx = vcpu_altp2m(v).p2midx;
+        }
     }
 
     /* Pause the current VCPU */
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 4c6bf98..f3a8736 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -412,6 +412,10 @@ void vm_event_resume(struct domain *d, struct vm_event_domain *ved)
 
         };
 
+        /* Check for altp2m switch */
+        if ( rsp.flags & VM_EVENT_FLAG_ALTERNATE_P2M )
+            p2m_altp2m_check(v, rsp.altp2m_idx);
+
         if ( rsp.flags & VM_EVENT_FLAG_VCPU_PAUSED )
         {
             if ( rsp.flags & VM_EVENT_FLAG_TOGGLE_SINGLESTEP )
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 63748ef..08bdce3 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -109,6 +109,12 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
     /* Not supported on ARM. */
 }
 
+static inline
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
+{
+    /* Not supported on ARM. */
+}
+
 #define p2m_is_foreign(_t)  ((_t) == p2m_map_foreign)
 #define p2m_is_ram(_t)      ((_t) == p2m_ram_rw || (_t) == p2m_ram_ro)
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index e7ee135..98aa12d 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -755,6 +755,9 @@ unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp);
 /* Switch alternate p2m for a single vcpu */
 bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx);
 
+/* Check to see if vcpu should be switched to a different p2m. */
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
+
 /*
  * p2m type to IOMMU flags
  */
diff --git a/xen/include/public/vm_event.h b/xen/include/public/vm_event.h
index fbc76b2..ff2f217 100644
--- a/xen/include/public/vm_event.h
+++ b/xen/include/public/vm_event.h
@@ -79,6 +79,16 @@
   * Currently only useful for MSR, CR0, CR3 and CR4 write events.
   */
 #define VM_EVENT_FLAG_DENY               (1 << 6)
+/*
+ * This flag can be set in a request or a response
+ *
+ * On a request, indicates that the event occurred in the alternate p2m
+ * specified by the altp2m_idx request field.
+ *
+ * On a response, indicates that the VCPU should resume in the alternate
+ * p2m specified by the altp2m_idx response field if possible.
+ */
+#define VM_EVENT_FLAG_ALTERNATE_P2M      (1 << 7)
 
 /*
  * Reasons for the vm event request
@@ -221,6 +231,8 @@ typedef struct vm_event_st {
     uint32_t flags;     /* VM_EVENT_FLAG_* */
     uint32_t reason;    /* VM_EVENT_REASON_* */
     uint32_t vcpu_id;
+    uint16_t altp2m_idx; /* may be used during request and response */
+    uint16_t _pad[3];
 
     union {
         struct vm_event_paging                mem_paging;
-- 
1.9.1


* [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (8 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 09/15] x86/altp2m: alternate p2m memory events Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-23 10:05   ` Jan Beulich
  2015-07-23 19:10   ` George Dunlap
  2015-07-22 23:01 ` [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
                   ` (6 subsequent siblings)
  16 siblings, 2 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Add the remaining routines required to support enabling the alternate
p2m functionality.
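
One detail worth calling out: the remap-range tracking added here lets
p2m_altp2m_propagate_change() avoid resetting every altp2m view on each
host p2m change.  Conceptually (a stand-alone model only, with field names
taken from the patch):

    #include <stdbool.h>

    struct altp2m_view {
        unsigned long min_remapped_gfn;  /* INVALID_GFN while no remaps */
        unsigned long max_remapped_gfn;
    };

    /* Only a host entry dropped inside the remap range can have left a
     * stale duplicate gfn->mfn mapping in this view, forcing a full
     * reset; any other change is simply propagated in place. */
    static bool view_needs_reset(const struct altp2m_view *v,
                                 unsigned long gfn, bool host_entry_dropped)
    {
        return host_entry_dropped &&
               gfn >= v->min_remapped_gfn && gfn <= v->max_remapped_gfn;
    }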

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v6:
        rename altp2m lazy copier, make bool_t, use __put_gfn throughout,
          and move to p2m.c, eliminating altp2m_hap.c
        change various p2m_altp2m... routines from long to int
        change uint16_t's/uint8_t's to unsigned int's
        optimize remapped gfn tracking and altp2m invalidation check
        mechanical changes due to patch 5 changes
		
		not done - abstracting some ept functionality from p2m

 xen/arch/x86/hvm/hvm.c    |  61 ++++++-
 xen/arch/x86/mm/p2m-ept.c |   3 +
 xen/arch/x86/mm/p2m.c     | 444 ++++++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/p2m.h |  39 ++++
 4 files changed, 541 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e8d2ac3..ba771c3 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2856,10 +2856,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     mfn_t mfn;
     struct vcpu *curr = current;
     struct domain *currd = curr->domain;
-    struct p2m_domain *p2m;
+    struct p2m_domain *p2m, *hostp2m;
     int rc, fall_through = 0, paged = 0;
     int sharing_enomem = 0;
     vm_event_request_t *req_ptr = NULL;
+    bool_t ap2m_active;
 
     /* On Nested Virtualization, walk the guest page table.
      * If this succeeds, all is fine.
@@ -2919,11 +2920,32 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
         goto out;
     }
 
-    p2m = p2m_get_hostp2m(currd);
-    mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 
+    ap2m_active = altp2m_active(currd);
+
+    /*
+     * Take a lock on the host p2m speculatively, to avoid potential
+     * locking order problems later and to handle unshare etc.
+     */
+    hostp2m = p2m_get_hostp2m(currd);
+    mfn = get_gfn_type_access(hostp2m, gfn, &p2mt, &p2ma,
                               P2M_ALLOC | (npfec.write_access ? P2M_UNSHARE : 0),
                               NULL);
 
+    if ( ap2m_active )
+    {
+        if ( p2m_altp2m_lazy_copy(curr, gpa, gla, npfec, &p2m) )
+        {
+            /* entry was lazily copied from host -- retry */
+            __put_gfn(hostp2m, gfn);
+            rc = 1;
+            goto out;
+        }
+
+        mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+    }
+    else
+        p2m = hostp2m;
+
     /* Check access permissions first, then handle faults */
     if ( mfn_x(mfn) != INVALID_MFN )
     {
@@ -2963,6 +2985,20 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
 
         if ( violation )
         {
+            /* Should #VE be emulated for this fault? */
+            if ( p2m_is_altp2m(p2m) && !cpu_has_vmx_virt_exceptions )
+            {
+                bool_t sve;
+
+                p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
+
+                if ( !sve && altp2m_vcpu_emulate_ve(curr) )
+                {
+                    rc = 1;
+                    goto out_put_gfn;
+                }
+            }
+
             if ( p2m_mem_access_check(gpa, gla, npfec, &req_ptr) )
             {
                 fall_through = 1;
@@ -2982,7 +3018,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
          (npfec.write_access &&
           (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
     {
-        put_gfn(currd, gfn);
+        __put_gfn(p2m, gfn);
+        if ( ap2m_active )
+            __put_gfn(hostp2m, gfn);
 
         rc = 0;
         if ( unlikely(is_pvh_domain(currd)) )
@@ -3011,6 +3049,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     /* Spurious fault? PoD and log-dirty also take this path. */
     if ( p2m_is_ram(p2mt) )
     {
+        rc = 1;
         /*
          * Page log dirty is always done with order 0. If this mfn resides in
          * a large page, we do not change other pages type within that large
@@ -3019,9 +3058,17 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
         if ( npfec.write_access )
         {
             paging_mark_dirty(currd, mfn_x(mfn));
+            /*
+             * If p2m is really an altp2m, unlock here to avoid lock ordering
+             * violation when the change below is propagated from host p2m.
+             */
+            if ( ap2m_active )
+                __put_gfn(p2m, gfn);
             p2m_change_type_one(currd, gfn, p2m_ram_logdirty, p2m_ram_rw);
+            __put_gfn(ap2m_active ? hostp2m : p2m, gfn);
+
+            goto out;
         }
-        rc = 1;
         goto out_put_gfn;
     }
 
@@ -3031,7 +3078,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
     rc = fall_through;
 
 out_put_gfn:
-    put_gfn(currd, gfn);
+    __put_gfn(p2m, gfn);
+    if ( ap2m_active )
+        __put_gfn(hostp2m, gfn);
 out:
     /* All of these are delayed until we exit, since we might 
      * sleep on event ring wait queues, and we must not hold
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index b4c65f9..dc82518 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -851,6 +851,9 @@ out:
     if ( is_epte_present(&old_entry) )
         ept_free_entry(p2m, &old_entry, target);
 
+    if ( rc == 0 && p2m_is_hostp2m(p2m) )
+        p2m_altp2m_propagate_change(d, _gfn(gfn), mfn, order, p2mt, p2ma);
+
     return rc;
 }
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index ac901ac..66b025c 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2058,6 +2058,450 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx)
     return rc;
 }
 
+/*
+ * If the fault is for a not-present entry:
+ *     if the entry in the host p2m has a valid mfn, copy it and retry
+ *     else indicate that the outer handler should handle the fault
+ *
+ * If the fault is for a present entry:
+ *     indicate that the outer handler should handle the fault
+ */
+
+bool_t p2m_altp2m_lazy_copy(struct vcpu *v, paddr_t gpa,
+                            unsigned long gla, struct npfec npfec,
+                            struct p2m_domain **ap2m)
+{
+    struct p2m_domain *hp2m = p2m_get_hostp2m(v->domain);
+    p2m_type_t p2mt;
+    p2m_access_t p2ma;
+    unsigned int page_order;
+    gfn_t gfn = _gfn(paddr_to_pfn(gpa));
+    unsigned long mask;
+    mfn_t mfn;
+    int rv;
+
+    *ap2m = p2m_get_altp2m(v);
+
+    mfn = get_gfn_type_access(*ap2m, gfn_x(gfn), &p2mt, &p2ma,
+                              0, &page_order);
+    __put_gfn(*ap2m, gfn_x(gfn));
+
+    if ( mfn_x(mfn) != INVALID_MFN )
+        return 0;
+
+    mfn = get_gfn_type_access(hp2m, gfn_x(gfn), &p2mt, &p2ma,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order);
+    __put_gfn(hp2m, gfn_x(gfn));
+
+    if ( mfn_x(mfn) == INVALID_MFN )
+        return 0;
+
+    p2m_lock(*ap2m);
+
+    /*
+     * If this is a superpage mapping, round down both frame numbers
+     * to the start of the superpage.
+     */
+    mask = ~((1UL << page_order) - 1);
+    mfn = _mfn(mfn_x(mfn) & mask);
+
+    rv = p2m_set_entry(*ap2m, gfn_x(gfn) & mask, mfn, page_order, p2mt, p2ma);
+    p2m_unlock(*ap2m);
+
+    if ( rv )
+    {
+        gdprintk(XENLOG_ERR,
+	    "failed to set entry for %#"PRIx64" -> %#"PRIx64" p2m %#"PRIx64"\n",
+	    gfn_x(gfn), mfn_x(mfn), (unsigned long)*ap2m);
+        domain_crash(hp2m->domain);
+    }
+
+    return 1;
+}
+
+void p2m_flush_altp2m(struct domain *d)
+{
+    unsigned int i;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        p2m_flush_table(d->arch.altp2m_p2m[i]);
+        /* Uninit and reinit ept to force TLB shootdown */
+        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
+        ept_p2m_init(d->arch.altp2m_p2m[i]);
+        d->arch.altp2m_eptp[i] = INVALID_MFN;
+    }
+
+    altp2m_list_unlock(d);
+}
+
+static void p2m_init_altp2m_helper(struct domain *d, unsigned int i)
+{
+    struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
+    struct ept_data *ept;
+
+    p2m->min_remapped_gfn = INVALID_GFN;
+    p2m->max_remapped_gfn = 0;
+    ept = &p2m->ept;
+    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
+    d->arch.altp2m_eptp[i] = ept_get_eptp(ept);
+}
+
+int p2m_init_altp2m_by_id(struct domain *d, unsigned int idx)
+{
+    int rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] == INVALID_MFN )
+    {
+        p2m_init_altp2m_helper(d, idx);
+        rc = 0;
+    }
+
+    altp2m_list_unlock(d);
+    return rc;
+}
+
+int p2m_init_next_altp2m(struct domain *d, uint16_t *idx)
+{
+    int rc = -EINVAL;
+    unsigned int i;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] != INVALID_MFN )
+            continue;
+
+        p2m_init_altp2m_helper(d, i);
+        *idx = i;
+        rc = 0;
+
+        break;
+    }
+
+    altp2m_list_unlock(d);
+    return rc;
+}
+
+int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx)
+{
+    struct p2m_domain *p2m;
+    int rc = -EINVAL;
+
+    if ( !idx || idx >= MAX_ALTP2M )
+        return rc;
+
+    domain_pause_except_self(d);
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        p2m = d->arch.altp2m_p2m[idx];
+
+        if ( !_atomic_read(p2m->active_vcpus) )
+        {
+            p2m_flush_table(d->arch.altp2m_p2m[idx]);
+            /* Uninit and reinit ept to force TLB shootdown */
+            ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
+            ept_p2m_init(d->arch.altp2m_p2m[idx]);
+            d->arch.altp2m_eptp[idx] = INVALID_MFN;
+            rc = 0;
+        }
+    }
+
+    altp2m_list_unlock(d);
+
+    domain_unpause_except_self(d);
+
+    return rc;
+}
+
+int p2m_switch_domain_altp2m_by_id(struct domain *d, unsigned int idx)
+{
+    struct vcpu *v;
+    int rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M )
+        return rc;
+
+    domain_pause_except_self(d);
+
+    altp2m_list_lock(d);
+
+    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
+    {
+        for_each_vcpu( d, v )
+            if ( idx != vcpu_altp2m(v).p2midx )
+            {
+                atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
+                vcpu_altp2m(v).p2midx = idx;
+                atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+                altp2m_vcpu_update_p2m(v);
+            }
+
+        rc = 0;
+    }
+
+    altp2m_list_unlock(d);
+
+    domain_unpause_except_self(d);
+
+    return rc;
+}
+
+int p2m_set_altp2m_mem_access(struct domain *d, unsigned int idx,
+                              gfn_t gfn, xenmem_access_t access)
+{
+    struct p2m_domain *hp2m, *ap2m;
+    p2m_access_t req_a, old_a;
+    p2m_type_t t;
+    mfn_t mfn;
+    unsigned int page_order;
+    int rc = -EINVAL;
+
+    static const p2m_access_t memaccess[] = {
+#define ACCESS(ac) [XENMEM_access_##ac] = p2m_access_##ac
+        ACCESS(n),
+        ACCESS(r),
+        ACCESS(w),
+        ACCESS(rw),
+        ACCESS(x),
+        ACCESS(rx),
+        ACCESS(wx),
+        ACCESS(rwx),
+#undef ACCESS
+    };
+
+    if ( idx >= MAX_ALTP2M || d->arch.altp2m_eptp[idx] == INVALID_MFN )
+        return rc;
+
+    ap2m = d->arch.altp2m_p2m[idx];
+
+    switch ( access )
+    {
+    case 0 ... ARRAY_SIZE(memaccess) - 1:
+        req_a = memaccess[access];
+        break;
+    case XENMEM_access_default:
+        req_a = ap2m->default_access;
+        break;
+    default:
+        return rc;
+    }
+
+    /* If request to set default access */
+    if ( gfn_x(gfn) == INVALID_GFN )
+    {
+        ap2m->default_access = req_a;
+        return 0;
+    }
+
+    hp2m = p2m_get_hostp2m(d);
+
+    p2m_lock(ap2m);
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(gfn), &t, &old_a, 0, NULL, NULL);
+
+    /* Check host p2m if no valid entry in alternate */
+    if ( !mfn_valid(mfn) )
+    {
+        mfn = hp2m->get_entry(hp2m, gfn_x(gfn), &t, &old_a,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order, NULL);
+
+        if ( !mfn_valid(mfn) || t != p2m_ram_rw )
+            goto out;
+
+        /* If this is a superpage, copy that first */
+        if ( page_order != PAGE_ORDER_4K )
+        {
+            gfn_t gfn2;
+            unsigned long mask;
+            mfn_t mfn2;
+
+            mask = ~((1UL << page_order) - 1);
+            gfn2 = _gfn(gfn_x(gfn) & mask);
+            mfn2 = _mfn(mfn_x(mfn) & mask);
+
+            if ( ap2m->set_entry(ap2m, gfn_x(gfn2), mfn2, page_order, t, old_a, 1) )
+                goto out;
+        }
+    }
+
+    if ( !ap2m->set_entry(ap2m, gfn_x(gfn), mfn, PAGE_ORDER_4K, t, req_a,
+                          (current->domain != d)) )
+        rc = 0;
+
+out:
+    p2m_unlock(ap2m);
+    return rc;
+}
+
+int p2m_change_altp2m_gfn(struct domain *d, unsigned int idx,
+                          gfn_t old_gfn, gfn_t new_gfn)
+{
+    struct p2m_domain *hp2m, *ap2m;
+    p2m_access_t a;
+    p2m_type_t t;
+    mfn_t mfn;
+    unsigned int page_order;
+    int rc = -EINVAL;
+
+    if ( idx >= MAX_ALTP2M || d->arch.altp2m_eptp[idx] == INVALID_MFN )
+        return rc;
+
+    hp2m = p2m_get_hostp2m(d);
+    ap2m = d->arch.altp2m_p2m[idx];
+
+    p2m_lock(ap2m);
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(old_gfn), &t, &a, 0, NULL, NULL);
+
+    if ( gfn_x(new_gfn) == INVALID_GFN )
+    {
+        if ( mfn_valid(mfn) )
+            p2m_remove_page(ap2m, gfn_x(old_gfn), mfn_x(mfn), PAGE_ORDER_4K);
+        rc = 0;
+        goto out;
+    }
+
+    /* Check host p2m if no valid entry in alternate */
+    if ( !mfn_valid(mfn) )
+    {
+        mfn = hp2m->get_entry(hp2m, gfn_x(old_gfn), &t, &a,
+                              P2M_ALLOC | P2M_UNSHARE, &page_order, NULL);
+
+        if ( !mfn_valid(mfn) || t != p2m_ram_rw )
+            goto out;
+
+        /* If this is a superpage, copy that first */
+        if ( page_order != PAGE_ORDER_4K )
+        {
+            gfn_t gfn;
+            unsigned long mask;
+
+            mask = ~((1UL << page_order) - 1);
+            gfn = _gfn(gfn_x(old_gfn) & mask);
+            mfn = _mfn(mfn_x(mfn) & mask);
+
+            if ( ap2m->set_entry(ap2m, gfn_x(gfn), mfn, page_order, t, a, 1) )
+                goto out;
+        }
+    }
+
+    mfn = ap2m->get_entry(ap2m, gfn_x(new_gfn), &t, &a, 0, NULL, NULL);
+
+    if ( !mfn_valid(mfn) )
+        mfn = hp2m->get_entry(hp2m, gfn_x(new_gfn), &t, &a, 0, NULL, NULL);
+
+    if ( !mfn_valid(mfn) || (t != p2m_ram_rw) )
+        goto out;
+
+    if ( !ap2m->set_entry(ap2m, gfn_x(old_gfn), mfn, PAGE_ORDER_4K, t, a,
+                          (current->domain != d)) )
+    {
+        rc = 0;
+
+        if ( gfn_x(new_gfn) < ap2m->min_remapped_gfn )
+            ap2m->min_remapped_gfn = gfn_x(new_gfn);
+        if ( gfn_x(new_gfn) > ap2m->max_remapped_gfn )
+            ap2m->max_remapped_gfn = gfn_x(new_gfn);
+    }
+
+out:
+    p2m_unlock(ap2m);
+    return rc;
+}
+
+static void p2m_reset_altp2m(struct p2m_domain *p2m)
+{
+    p2m_flush_table(p2m);
+    /* Uninit and reinit ept to force TLB shootdown */
+    ept_p2m_uninit(p2m);
+    ept_p2m_init(p2m);
+    p2m->min_remapped_gfn = INVALID_GFN;
+    p2m->max_remapped_gfn = 0;
+}
+
+void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
+                                 mfn_t mfn, unsigned int page_order,
+                                 p2m_type_t p2mt, p2m_access_t p2ma)
+{
+    struct p2m_domain *p2m;
+    p2m_access_t a;
+    p2m_type_t t;
+    mfn_t m;
+    unsigned int i;
+    bool_t reset_p2m;
+    unsigned int reset_count = 0;
+    unsigned int last_reset_idx = ~0;
+
+    if ( !altp2m_active(d) )
+        return;
+
+    altp2m_list_lock(d);
+
+    for ( i = 0; i < MAX_ALTP2M; i++ )
+    {
+        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
+            continue;
+
+        p2m = d->arch.altp2m_p2m[i];
+        m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
+
+        reset_p2m = 0;
+
+        /* Check for a dropped page that may impact this altp2m */
+        if ( mfn_x(mfn) == INVALID_MFN &&
+             gfn_x(gfn) >= p2m->min_remapped_gfn &&
+             gfn_x(gfn) <= p2m->max_remapped_gfn )
+            reset_p2m = 1;
+
+        if ( reset_p2m )
+        {
+            if ( !reset_count++ )
+            {
+                p2m_reset_altp2m(p2m);
+                last_reset_idx = i;
+            }
+            else
+            {
+                /* At least 2 altp2m's impacted, so reset everything */
+                __put_gfn(p2m, gfn_x(gfn));
+
+                for ( i = 0; i < MAX_ALTP2M; i++ )
+                {
+                    if ( i == last_reset_idx ||
+                         d->arch.altp2m_eptp[i] == INVALID_MFN )
+                        continue;
+
+                    p2m = d->arch.altp2m_p2m[i];
+                    p2m_lock(p2m);
+                    p2m_reset_altp2m(p2m);
+                    p2m_unlock(p2m);
+                }
+
+                goto out;
+            }
+        }
+        else if ( mfn_x(m) != INVALID_MFN )
+           p2m_set_entry(p2m, gfn_x(gfn), mfn, page_order, p2mt, p2ma);
+
+        __put_gfn(p2m, gfn_x(gfn));
+    }
+
+ out:
+    altp2m_list_unlock(d);
+}
+
 /*** Audit ***/
 
 #if P2M_AUDIT
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 98aa12d..ab9485e 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -268,6 +268,13 @@ struct p2m_domain {
     /* Highest guest frame that's ever been mapped in the p2m */
     unsigned long max_mapped_pfn;
 
+    /*
+     * Alternate p2m's only: range of gfn's for which underlying
+     * mfn may have duplicate mappings
+     */
+    unsigned long min_remapped_gfn;
+    unsigned long max_remapped_gfn;
+
     /* When releasing shared gfn's in a preemptible manner, recall where
      * to resume the search */
     unsigned long next_shared_gfn_to_relinquish;
@@ -758,6 +765,38 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx);
 /* Check to see if vcpu should be switched to a different p2m. */
 void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
 
+/* Flush all the alternate p2m's for a domain */
+void p2m_flush_altp2m(struct domain *d);
+
+/* Alternate p2m paging */
+bool_t p2m_altp2m_lazy_copy(struct vcpu *v, paddr_t gpa,
+    unsigned long gla, struct npfec npfec, struct p2m_domain **ap2m);
+
+/* Make a specific alternate p2m valid */
+int p2m_init_altp2m_by_id(struct domain *d, unsigned int idx);
+
+/* Find an available alternate p2m and make it valid */
+int p2m_init_next_altp2m(struct domain *d, uint16_t *idx);
+
+/* Make a specific alternate p2m invalid */
+int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx);
+
+/* Switch alternate p2m for entire domain */
+int p2m_switch_domain_altp2m_by_id(struct domain *d, unsigned int idx);
+
+/* Set access type for a gfn */
+int p2m_set_altp2m_mem_access(struct domain *d, unsigned int idx,
+                              gfn_t gfn, xenmem_access_t access);
+
+/* Change a gfn->mfn mapping */
+int p2m_change_altp2m_gfn(struct domain *d, unsigned int idx,
+                          gfn_t old_gfn, gfn_t new_gfn);
+
+/* Propagate a host p2m change to all alternate p2m's */
+void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
+                                 mfn_t mfn, unsigned int page_order,
+                                 p2m_type_t p2mt, p2m_access_t p2ma);
+
 /*
  * p2m type to IOMMU flags
  */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
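
The invalidation scheme above hinges on the remapped-gfn window:
p2m_change_altp2m_gfn() widens each view's [min,max]_remapped_gfn range
as gfns are remapped, and p2m_altp2m_propagate_change() resets any view
whose range may cover a gfn the host p2m has just dropped. The core
predicate, reduced to a minimal sketch with the locking and the
multi-view reset path elided (illustrative only, not code from the
series):

    /* Sketch: must this altp2m view be flushed when the host p2m drops
     * the mapping for @gfn?  A dropped mapping (INVALID_MFN) inside the
     * remapped window may still be reachable via a remapped gfn. */
    static bool_t altp2m_may_hold_stale_mapping(const struct p2m_domain *ap2m,
                                                unsigned long gfn, mfn_t mfn)
    {
        return mfn_x(mfn) == INVALID_MFN &&
               gfn >= ap2m->min_remapped_gfn &&
               gfn <= ap2m->max_remapped_gfn;
    }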

* [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (9 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 10/15] x86/altp2m: add remaining support routines Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-23 10:22   ` Jan Beulich
  2015-07-22 23:01 ` [PATCH v7 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Signed-off-by: Ed White <edmund.h.white@intel.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v6:
        fix cmd range check
        rework domain locking
        add Jan's ack

 xen/arch/x86/hvm/hvm.c          | 138 ++++++++++++++++++++++++++++++++++++++++
 xen/include/public/hvm/hvm_op.h |  89 ++++++++++++++++++++++++++
 2 files changed, 227 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ba771c3..4f4cccb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6138,6 +6138,140 @@ static int hvmop_get_param(
     return rc;
 }
 
+static int do_altp2m_op(
+    XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+    struct xen_hvm_altp2m_op a;
+    struct domain *d = NULL;
+    int rc = 0;
+
+    if ( !hvm_altp2m_supported() )
+        return -EOPNOTSUPP;
+
+    if ( copy_from_guest(&a, arg, 1) )
+        return -EFAULT;
+
+    if ( a.pad1 || a.pad2 ||
+         (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
+         (a.cmd < HVMOP_altp2m_get_domain_state) ||
+         (a.cmd > HVMOP_altp2m_change_gfn) )
+        return -EINVAL;
+
+    d = (a.cmd != HVMOP_altp2m_vcpu_enable_notify) ?
+        rcu_lock_domain_by_any_id(a.domain) : rcu_lock_current_domain();
+
+    if ( d == NULL )
+        return -ESRCH;
+
+    if ( !is_hvm_domain(d) )
+    {
+        rc = -EOPNOTSUPP;
+        goto out;
+    }
+
+    if ( (a.cmd != HVMOP_altp2m_get_domain_state) &&
+         (a.cmd != HVMOP_altp2m_set_domain_state) &&
+         !d->arch.altp2m_active )
+    {
+        rc = -EOPNOTSUPP;
+        goto out;
+    }
+
+    switch ( a.cmd )
+    {
+    case HVMOP_altp2m_get_domain_state:
+        a.u.domain_state.state = altp2m_active(d);
+        rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+        break;
+
+    case HVMOP_altp2m_set_domain_state:
+    {
+        struct vcpu *v;
+        bool_t ostate;
+
+        if ( nestedhvm_enabled(d) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        ostate = d->arch.altp2m_active;
+        d->arch.altp2m_active = !!a.u.domain_state.state;
+
+        /* If the alternate p2m state has changed, handle appropriately */
+        if ( d->arch.altp2m_active != ostate &&
+             (ostate || !(rc = p2m_init_altp2m_by_id(d, 0))) )
+        {
+            for_each_vcpu( d, v )
+            {
+                if ( !ostate )
+                    altp2m_vcpu_initialise(v);
+                else
+                    altp2m_vcpu_destroy(v);
+            }
+
+            if ( ostate )
+                p2m_flush_altp2m(d);
+        }
+        break;
+    }
+
+    case HVMOP_altp2m_vcpu_enable_notify:
+    {
+        struct vcpu *curr = current;
+        p2m_type_t p2mt;
+
+        if ( a.u.enable_notify.pad || a.domain != DOMID_SELF ||
+             a.u.enable_notify.vcpu_id != curr->vcpu_id ||
+             (gfn_x(vcpu_altp2m(curr).veinfo_gfn) != INVALID_GFN) ||
+             (mfn_x(get_gfn_query_unlocked(curr->domain,
+                    a.u.enable_notify.gfn, &p2mt)) == INVALID_MFN) )
+        {
+            /* Don't return directly: the domain is still RCU-locked. */
+            rc = -EINVAL;
+            break;
+        }
+
+        vcpu_altp2m(curr).veinfo_gfn = _gfn(a.u.enable_notify.gfn);
+        altp2m_vcpu_update_vmfunc_ve(curr);
+        break;
+    }
+
+    case HVMOP_altp2m_create_p2m:
+        if ( !(rc = p2m_init_next_altp2m(d, &a.u.view.view)) )
+            rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+        break;
+
+    case HVMOP_altp2m_destroy_p2m:
+        rc = p2m_destroy_altp2m_by_id(d, a.u.view.view);
+        break;
+
+    case HVMOP_altp2m_switch_p2m:
+        rc = p2m_switch_domain_altp2m_by_id(d, a.u.view.view);
+        break;
+
+    case HVMOP_altp2m_set_mem_access:
+        if ( a.u.set_mem_access.pad )
+            rc = -EINVAL;
+        else
+            rc = p2m_set_altp2m_mem_access(d, a.u.set_mem_access.view,
+                    _gfn(a.u.set_mem_access.gfn),
+                    a.u.set_mem_access.hvmmem_access);
+        break;
+
+    case HVMOP_altp2m_change_gfn:
+        if ( a.u.change_gfn.pad1 || a.u.change_gfn.pad2 )
+            rc = -EINVAL;
+        else
+            rc = p2m_change_altp2m_gfn(d, a.u.change_gfn.view,
+                    _gfn(a.u.change_gfn.old_gfn),
+                    _gfn(a.u.change_gfn.new_gfn));
+    }
+
+ out:
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
 /*
  * Note that this value is effectively part of the ABI, even if we don't need
  * to make it a formal part of it: A guest suspended for migration in the
@@ -6567,6 +6701,10 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
             rc = -EINVAL;
         break;
 
+    case HVMOP_altp2m:
+        rc = do_altp2m_op(arg);
+        break;
+
     default:
     {
         gdprintk(XENLOG_DEBUG, "Bad HVM op %ld.\n", op);
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index d053db9..014546a 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -398,6 +398,95 @@ DEFINE_XEN_GUEST_HANDLE(xen_hvm_evtchn_upcall_vector_t);
 
 #define HVMOP_guest_request_vm_event 24
 
+/* HVMOP_altp2m: perform altp2m state operations */
+#define HVMOP_altp2m 25
+
+#define HVMOP_ALTP2M_INTERFACE_VERSION 0x00000001
+
+struct xen_hvm_altp2m_domain_state {
+    /* IN or OUT variable on/off */
+    uint8_t state;
+};
+typedef struct xen_hvm_altp2m_domain_state xen_hvm_altp2m_domain_state_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_domain_state_t);
+
+struct xen_hvm_altp2m_vcpu_enable_notify {
+    uint32_t vcpu_id;
+    uint32_t pad;
+    /* #VE info area gfn */
+    uint64_t gfn;
+};
+typedef struct xen_hvm_altp2m_vcpu_enable_notify xen_hvm_altp2m_vcpu_enable_notify_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_vcpu_enable_notify_t);
+
+struct xen_hvm_altp2m_view {
+    /* IN/OUT variable */
+    uint16_t view;
+    /* Create view only: default access type
+     * NOTE: currently ignored */
+    uint16_t hvmmem_default_access; /* xenmem_access_t */
+};
+typedef struct xen_hvm_altp2m_view xen_hvm_altp2m_view_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_view_t);
+
+struct xen_hvm_altp2m_set_mem_access {
+    /* view */
+    uint16_t view;
+    /* Memory type */
+    uint16_t hvmmem_access; /* xenmem_access_t */
+    uint32_t pad;
+    /* gfn */
+    uint64_t gfn;
+};
+typedef struct xen_hvm_altp2m_set_mem_access xen_hvm_altp2m_set_mem_access_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_set_mem_access_t);
+
+struct xen_hvm_altp2m_change_gfn {
+    /* view */
+    uint16_t view;
+    uint16_t pad1;
+    uint32_t pad2;
+    /* old gfn */
+    uint64_t old_gfn;
+    /* new gfn, INVALID_GFN (~0UL) means revert */
+    uint64_t new_gfn;
+};
+typedef struct xen_hvm_altp2m_change_gfn xen_hvm_altp2m_change_gfn_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_change_gfn_t);
+
+struct xen_hvm_altp2m_op {
+    uint32_t version;   /* HVMOP_ALTP2M_INTERFACE_VERSION */
+    uint32_t cmd;
+/* Get/set the altp2m state for a domain */
+#define HVMOP_altp2m_get_domain_state     1
+#define HVMOP_altp2m_set_domain_state     2
+/* Set the current VCPU to receive altp2m event notifications */
+#define HVMOP_altp2m_vcpu_enable_notify   3
+/* Create a new view */
+#define HVMOP_altp2m_create_p2m           4
+/* Destroy a view */
+#define HVMOP_altp2m_destroy_p2m          5
+/* Switch view for an entire domain */
+#define HVMOP_altp2m_switch_p2m           6
+/* Notify that a page of memory is to have specific access types */
+#define HVMOP_altp2m_set_mem_access       7
+/* Change a p2m entry to have a different gfn->mfn mapping */
+#define HVMOP_altp2m_change_gfn           8
+    domid_t domain;
+    uint16_t pad1;
+    uint32_t pad2;
+    union {
+        struct xen_hvm_altp2m_domain_state       domain_state;
+        struct xen_hvm_altp2m_vcpu_enable_notify enable_notify;
+        struct xen_hvm_altp2m_view               view;
+        struct xen_hvm_altp2m_set_mem_access     set_mem_access;
+        struct xen_hvm_altp2m_change_gfn         change_gfn;
+        uint8_t pad[64];
+    } u;
+};
+typedef struct xen_hvm_altp2m_op xen_hvm_altp2m_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_op_t);
+
 #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
 
 /*
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
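
Of the commands above, HVMOP_altp2m_vcpu_enable_notify is the only one
meant to be issued from inside the guest, which is why do_altp2m_op()
uses rcu_lock_current_domain() for it and insists on DOMID_SELF. A
sketch of how an in-guest agent might register its #VE info page —
hypercall(), this_vcpu_id and veinfo_gfn are stand-ins for whatever the
guest kernel provides, not names from the series:

    /* Sketch: register a #VE information page for the calling vCPU.
     * Unset fields (including the pad words checked by Xen) stay zero. */
    struct xen_hvm_altp2m_op op = {
        .version = HVMOP_ALTP2M_INTERFACE_VERSION,
        .cmd     = HVMOP_altp2m_vcpu_enable_notify,
        .domain  = DOMID_SELF,
    };

    op.u.enable_notify.vcpu_id = this_vcpu_id;  /* must match 'current' */
    op.u.enable_notify.gfn     = veinfo_gfn;    /* guest-owned, mapped page */

    rc = hypercall(__HYPERVISOR_hvm_op, HVMOP_altp2m, &op);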

* [PATCH v7 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter.
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (10 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

The altp2mhvm and nestedhvm parameters are mutually exclusive;
enabling both for the same domain is rejected.

Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
Changes since v6:
        no changes

 docs/man/xl.cfg.pod.5           | 12 ++++++++++++
 tools/libxl/libxl.h             |  6 ++++++
 tools/libxl/libxl_create.c      |  1 +
 tools/libxl/libxl_dom.c         |  2 ++
 tools/libxl/libxl_types.idl     |  1 +
 tools/libxl/xl_cmdimpl.c        | 10 ++++++++++
 xen/arch/x86/hvm/hvm.c          | 21 ++++++++++++++++++++-
 xen/include/public/hvm/params.h |  5 ++++-
 8 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 382f30b..e53fd45 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1027,6 +1027,18 @@ enabled by default and you should usually omit it. It may be necessary
 to disable the HPET in order to improve compatibility with guest
 Operating Systems (X86 only)
 
+=item B<altp2mhvm=BOOLEAN>
+
+Enables or disables HVM guest access to the alternate-p2m capability.
+Alternate-p2m allows a guest to manage multiple p2m guest physical
+"memory views" (as opposed to a single p2m). The option is
+disabled by default and is available only to HVM domains.
+Enable it to control or isolate access to specific guest physical
+memory pages, e.g. for HVM domain memory introspection or for
+isolating and access-controlling memory between components within
+a single guest HVM domain.
+
 =item B<nestedhvm=BOOLEAN>
 
 Enables or disables guest access to hardware virtualisation features,
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 5a7308d..6f86b21 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -758,6 +758,12 @@ typedef struct libxl__ctx libxl_ctx;
 #define LIBXL_HAVE_BUILDINFO_SERIAL_LIST 1
 
 /*
+ * LIBXL_HAVE_ALTP2M
+ * If this is defined, then libxl supports alternate p2m functionality.
+ */
+#define LIBXL_HAVE_ALTP2M 1
+
+/*
  * LIBXL_HAVE_REMUS
  * If this is defined, then libxl supports remus.
  */
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index a32e3df..b1614b2 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -277,6 +277,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
         libxl_defbool_setdefault(&b_info->u.hvm.hpet,               true);
         libxl_defbool_setdefault(&b_info->u.hvm.vpt_align,          true);
         libxl_defbool_setdefault(&b_info->u.hvm.nested_hvm,         false);
+        libxl_defbool_setdefault(&b_info->u.hvm.altp2m,             false);
         libxl_defbool_setdefault(&b_info->u.hvm.usb,                false);
         libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci,   true);
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index edd7f3f..813c4a7 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -301,6 +301,8 @@ static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
                     libxl_defbool_val(info->u.hvm.vpt_align));
     xc_hvm_param_set(handle, domid, HVM_PARAM_NESTEDHVM,
                     libxl_defbool_val(info->u.hvm.nested_hvm));
+    xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
+                    libxl_defbool_val(info->u.hvm.altp2m));
 }
 
 int libxl__build_pre(libxl__gc *gc, uint32_t domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index bc0c4ef..b9dab54 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -458,6 +458,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                        ("mmio_hole_memkb",  MemKB),
                                        ("timer_mode",       libxl_timer_mode),
                                        ("nested_hvm",       libxl_defbool),
+                                       ("altp2m",           libxl_defbool),
                                        ("smbios_firmware",  string),
                                        ("acpi_firmware",    string),
                                        ("hdtype",           libxl_hdtype),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 1d45dd5..24b9808 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1564,6 +1564,16 @@ static void parse_config_data(const char *config_source,
 
         xlu_cfg_get_defbool(config, "nestedhvm", &b_info->u.hvm.nested_hvm, 0);
 
+        xlu_cfg_get_defbool(config, "altp2mhvm", &b_info->u.hvm.altp2m, 0);
+
+        if (!libxl_defbool_is_default(b_info->u.hvm.nested_hvm) &&
+            libxl_defbool_val(b_info->u.hvm.nested_hvm) &&
+            !libxl_defbool_is_default(b_info->u.hvm.altp2m) &&
+            libxl_defbool_val(b_info->u.hvm.altp2m)) {
+            fprintf(stderr, "ERROR: nestedhvm and altp2mhvm cannot be used together\n");
+            exit(1);
+        }
+
         xlu_cfg_replace_string(config, "smbios_firmware",
                                &b_info->u.hvm.smbios_firmware, 0);
         xlu_cfg_replace_string(config, "acpi_firmware",
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4f4cccb..55e70f0 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5868,6 +5868,7 @@ static int hvm_allow_set_param(struct domain *d,
     case HVM_PARAM_VIRIDIAN:
     case HVM_PARAM_IOREQ_SERVER_PFN:
     case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
+    case HVM_PARAM_ALTP2M:
         if ( value != 0 && a->value != value )
             rc = -EEXIST;
         break;
@@ -5990,6 +5991,9 @@ static int hvmop_set_param(
          */
         if ( cpu_has_svm && !paging_mode_hap(d) && a.value )
             rc = -EINVAL;
+        if ( a.value &&
+             d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] )
+            rc = -EINVAL;
         /* Set up NHVM state for any vcpus that are already up. */
         if ( a.value &&
              !d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM] )
@@ -6000,6 +6004,13 @@ static int hvmop_set_param(
             for_each_vcpu(d, v)
                 nestedhvm_vcpu_destroy(v);
         break;
+    case HVM_PARAM_ALTP2M:
+        if ( a.value > 1 )
+            rc = -EINVAL;
+        if ( a.value &&
+             d->arch.hvm_domain.params[HVM_PARAM_NESTEDHVM] )
+            rc = -EINVAL;
+        break;
     case HVM_PARAM_BUFIOREQ_EVTCHN:
         rc = -EINVAL;
         break;
@@ -6060,6 +6071,7 @@ static int hvm_allow_get_param(struct domain *d,
     case HVM_PARAM_STORE_EVTCHN:
     case HVM_PARAM_CONSOLE_PFN:
     case HVM_PARAM_CONSOLE_EVTCHN:
+    case HVM_PARAM_ALTP2M:
         break;
     /*
      * The following parameters must not be read by the guest
@@ -6180,6 +6192,12 @@ static int do_altp2m_op(
     switch ( a.cmd )
     {
     case HVMOP_altp2m_get_domain_state:
+        if ( !d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
         a.u.domain_state.state = altp2m_active(d);
         rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
         break;
@@ -6189,7 +6207,8 @@ static int do_altp2m_op(
         struct vcpu *v;
         bool_t ostate;
 
-        if ( nestedhvm_enabled(d) )
+        if ( !d->arch.hvm_domain.params[HVM_PARAM_ALTP2M] ||
+             nestedhvm_enabled(d) )
         {
             rc = -EINVAL;
             break;
diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
index 7c73089..147d9b8 100644
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -187,6 +187,9 @@
 /* Location of the VM Generation ID in guest physical address space. */
 #define HVM_PARAM_VM_GENERATION_ID_ADDR 34
 
-#define HVM_NR_PARAMS          35
+/* Boolean: Enable altp2m */
+#define HVM_PARAM_ALTP2M       35
+
+#define HVM_NR_PARAMS          36
 
 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
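
In an xl domain configuration the new parameter is a one-line change;
per the checks added above, combining it with nestedhvm is rejected
both at xl parse time and when the HVM param is set. A minimal config
fragment (sketch):

    # HVM guest with altp2m exposed; nestedhvm must stay off.
    builder   = "hvm"
    altp2mhvm = 1
    nestedhvm = 0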

* [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (11 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-23 16:08   ` Jan Beulich
  2015-07-22 23:01 ` [PATCH v7 14/15] tools/libxc: add support to altp2m hvmops Ed White
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

From: Ravi Sahita <ravi.sahita@intel.com>

Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
Changes since v6:
        no changes

 tools/flask/policy/policy/modules/xen/xen.if |  4 ++--
 xen/arch/x86/hvm/hvm.c                       |  6 ++++++
 xen/include/xsm/dummy.h                      | 12 ++++++++++++
 xen/include/xsm/xsm.h                        | 12 ++++++++++++
 xen/xsm/dummy.c                              |  2 ++
 xen/xsm/flask/hooks.c                        | 12 ++++++++++++
 xen/xsm/flask/policy/access_vectors          |  7 +++++++
 7 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index da4c95b..a2f25e1 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -8,7 +8,7 @@
 define(`declare_domain_common', `
 	allow $1 $2:grant { query setup };
 	allow $1 $2:mmu { adjust physmap map_read map_write stat pinpage updatemp mmuext_op };
-	allow $1 $2:hvm { getparam setparam };
+	allow $1 $2:hvm { getparam setparam altp2mhvm_op };
 	allow $1 $2:domain2 get_vnumainfo;
 ')
 
@@ -58,7 +58,7 @@ define(`create_domain_common', `
 	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
 	allow $1 $2:grant setup;
 	allow $1 $2:hvm { cacheattr getparam hvmctl irqlevel pciroute sethvmc
-			setparam pcilevel trackdirtyvram nested };
+			setparam pcilevel trackdirtyvram nested altp2mhvm altp2mhvm_op };
 ')
 
 # create_domain(priv, target)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 55e70f0..cd5d8d5 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6005,6 +6005,9 @@ static int hvmop_set_param(
                 nestedhvm_vcpu_destroy(v);
         break;
     case HVM_PARAM_ALTP2M:
+        rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
+        if ( rc )
+            break;
         if ( a.value > 1 )
             rc = -EINVAL;
         if ( a.value &&
@@ -6189,6 +6192,9 @@ static int do_altp2m_op(
         goto out;
     }
 
+    if ( (rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d ? d : current->domain)) )
+        goto out;
+
     switch ( a.cmd )
     {
     case HVMOP_altp2m_get_domain_state:
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index adb02bc..bbbfce7 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -548,6 +548,18 @@ static XSM_INLINE int xsm_hvm_param_nested(XSM_DEFAULT_ARG struct domain *d)
     return xsm_default_action(action, current->domain, d);
 }
 
+static XSM_INLINE int xsm_hvm_param_altp2mhvm(XSM_DEFAULT_ARG struct domain *d)
+{
+    XSM_ASSERT_ACTION(XSM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
+
+static XSM_INLINE int xsm_hvm_altp2mhvm_op(XSM_DEFAULT_ARG struct domain *d)
+{
+    XSM_ASSERT_ACTION(XSM_TARGET);
+    return xsm_default_action(action, current->domain, d);
+}
+
 static XSM_INLINE int xsm_vm_event_control(XSM_DEFAULT_ARG struct domain *d, int mode, int op)
 {
     XSM_ASSERT_ACTION(XSM_PRIV);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 7886574..3678a93 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -147,6 +147,8 @@ struct xsm_operations {
     int (*hvm_param) (struct domain *d, unsigned long op);
     int (*hvm_control) (struct domain *d, unsigned long op);
     int (*hvm_param_nested) (struct domain *d);
+    int (*hvm_param_altp2mhvm) (struct domain *d);
+    int (*hvm_altp2mhvm_op) (struct domain *d);
     int (*get_vnumainfo) (struct domain *d);
 
     int (*vm_event_control) (struct domain *d, int mode, int op);
@@ -587,6 +589,16 @@ static inline int xsm_hvm_param_nested (xsm_default_t def, struct domain *d)
     return xsm_ops->hvm_param_nested(d);
 }
 
+static inline int xsm_hvm_param_altp2mhvm (xsm_default_t def, struct domain *d)
+{
+    return xsm_ops->hvm_param_altp2mhvm(d);
+}
+
+static inline int xsm_hvm_altp2mhvm_op (xsm_default_t def, struct domain *d)
+{
+    return xsm_ops->hvm_altp2mhvm_op(d);
+}
+
 static inline int xsm_get_vnumainfo (xsm_default_t def, struct domain *d)
 {
     return xsm_ops->get_vnumainfo(d);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 06ac911..21b1bf8 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -116,6 +116,8 @@ void xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, hvm_param);
     set_to_dummy_if_null(ops, hvm_control);
     set_to_dummy_if_null(ops, hvm_param_nested);
+    set_to_dummy_if_null(ops, hvm_param_altp2mhvm);
+    set_to_dummy_if_null(ops, hvm_altp2mhvm_op);
 
     set_to_dummy_if_null(ops, do_xsm_op);
 #ifdef CONFIG_COMPAT
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 882681f..7a4522e 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1176,6 +1176,16 @@ static int flask_hvm_param_nested(struct domain *d)
     return current_has_perm(d, SECCLASS_HVM, HVM__NESTED);
 }
 
+static int flask_hvm_param_altp2mhvm(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_HVM, HVM__ALTP2MHVM);
+}
+
+static int flask_hvm_altp2mhvm_op(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_HVM, HVM__ALTP2MHVM_OP);
+}
+
 static int flask_vm_event_control(struct domain *d, int mode, int op)
 {
     return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__VM_EVENT);
@@ -1687,6 +1697,8 @@ static struct xsm_operations flask_ops = {
     .hvm_param = flask_hvm_param,
     .hvm_control = flask_hvm_param,
     .hvm_param_nested = flask_hvm_param_nested,
+    .hvm_param_altp2mhvm = flask_hvm_param_altp2mhvm,
+    .hvm_altp2mhvm_op = flask_hvm_altp2mhvm_op,
 
     .do_xsm_op = do_flask_op,
     .get_vnumainfo = flask_get_vnumainfo,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index b2a20c3..71495fd 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -282,6 +282,13 @@ class hvm
     share_mem
 # HVMOP_set_param setting HVM_PARAM_NESTEDHVM
     nested
+# HVMOP_set_param setting HVM_PARAM_ALTP2M
+    altp2mhvm
+# HVMOP_altp2m_set_domain_state HVMOP_altp2m_get_domain_state
+# HVMOP_altp2m_vcpu_enable_notify HVMOP_altp2m_create_p2m
+# HVMOP_altp2m_destroy_p2m HVMOP_altp2m_switch_p2m
+# HVMOP_altp2m_set_mem_access HVMOP_altp2m_change_gfn
+    altp2mhvm_op
 }
 
 # Class event describes event channels.  Interdomain event channels have their
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
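
With FLASK enforcing, a domain that is to drive another domain's altp2m
views needs both new vectors; the xen.if changes above grant them along
the usual declare/create paths, but a standalone policy rule would look
like this (dom_t and target_t are placeholder types, not from the
series):

    # Sketch: let dom_t set HVM_PARAM_ALTP2M on and issue altp2m ops
    # against target_t domains.
    allow dom_t target_t:hvm { altp2mhvm altp2mhvm_op };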

* [PATCH v7 14/15] tools/libxc: add support to altp2m hvmops
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (12 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-22 23:01 ` [PATCH v7 15/15] tools/xen-access: altp2m testcases Ed White
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

From: Tamas K Lengyel <tlengyel@novetta.com>

Wrappers to issue altp2m hvmops.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Signed-off-by: Ravi Sahita <ravi.sahita@intel.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes since v6:
        no changes

 tools/libxc/Makefile          |   1 +
 tools/libxc/include/xenctrl.h |  22 ++++
 tools/libxc/xc_altp2m.c       | 248 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 271 insertions(+)
 create mode 100644 tools/libxc/xc_altp2m.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 1aec848..b0a3e05 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -10,6 +10,7 @@ override CONFIG_MIGRATE := n
 endif
 
 CTRL_SRCS-y       :=
+CTRL_SRCS-y       += xc_altp2m.c
 CTRL_SRCS-y       += xc_core.c
 CTRL_SRCS-$(CONFIG_X86) += xc_core_x86.c
 CTRL_SRCS-$(CONFIG_ARM) += xc_core_arm.c
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index ce9029c..f869390 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2304,6 +2304,28 @@ void xc_tmem_save_done(xc_interface *xch, int dom);
 int xc_tmem_restore(xc_interface *xch, int dom, int fd);
 int xc_tmem_restore_extra(xc_interface *xch, int dom, int fd);
 
+/**
+ * altp2m operations
+ */
+
+int xc_altp2m_get_domain_state(xc_interface *handle, domid_t dom, bool *state);
+int xc_altp2m_set_domain_state(xc_interface *handle, domid_t dom, bool state);
+int xc_altp2m_set_vcpu_enable_notify(xc_interface *handle, domid_t domid,
+                                     uint32_t vcpuid, xen_pfn_t gfn);
+int xc_altp2m_create_view(xc_interface *handle, domid_t domid,
+                          xenmem_access_t default_access, uint16_t *view_id);
+int xc_altp2m_destroy_view(xc_interface *handle, domid_t domid,
+                           uint16_t view_id);
+/* Switch all vCPUs of the domain to the specified altp2m view */
+int xc_altp2m_switch_to_view(xc_interface *handle, domid_t domid,
+                             uint16_t view_id);
+int xc_altp2m_set_mem_access(xc_interface *handle, domid_t domid,
+                             uint16_t view_id, xen_pfn_t gfn,
+                             xenmem_access_t access);
+int xc_altp2m_change_gfn(xc_interface *handle, domid_t domid,
+                         uint16_t view_id, xen_pfn_t old_gfn,
+                         xen_pfn_t new_gfn);
+
 /** 
  * Mem paging operations.
  * Paging is supported only on the x86 architecture in 64 bit mode, with
diff --git a/tools/libxc/xc_altp2m.c b/tools/libxc/xc_altp2m.c
new file mode 100644
index 0000000..0f3c5ed
--- /dev/null
+++ b/tools/libxc/xc_altp2m.c
@@ -0,0 +1,248 @@
+/******************************************************************************
+ *
+ * xc_altp2m.c
+ *
+ * Interface to altp2m related HVMOPs
+ *
+ * Copyright (c) 2015 Tamas K Lengyel (tamas@tklengyel.com)
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+#include "xc_private.h"
+#include <stdbool.h>
+#include <xen/hvm/hvm_op.h>
+
+int xc_altp2m_get_domain_state(xc_interface *handle, domid_t dom, bool *state)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_get_domain_state;
+    arg->domain = dom;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    if ( !rc )
+        *state = arg->u.domain_state.state;
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_set_domain_state(xc_interface *handle, domid_t dom, bool state)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_set_domain_state;
+    arg->domain = dom;
+    arg->u.domain_state.state = state;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+/* Note: oddly, this operation acts on the current vCPU; the hypervisor
+ * requires domid == DOMID_SELF and vcpuid to match the calling vCPU. */
+int xc_altp2m_set_vcpu_enable_notify(xc_interface *handle, domid_t domid,
+                                     uint32_t vcpuid, xen_pfn_t gfn)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_vcpu_enable_notify;
+    arg->domain = domid;
+    arg->u.enable_notify.vcpu_id = vcpuid;
+    arg->u.enable_notify.gfn = gfn;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_create_view(xc_interface *handle, domid_t domid,
+                          xenmem_access_t default_access, uint16_t *view_id)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_create_p2m;
+    arg->domain = domid;
+    arg->u.view.view = -1;
+    arg->u.view.hvmmem_default_access = default_access;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    if ( !rc )
+        *view_id = arg->u.view.view;
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_destroy_view(xc_interface *handle, domid_t domid,
+                           uint16_t view_id)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_destroy_p2m;
+    arg->domain = domid;
+    arg->u.view.view = view_id;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+/* Switch all vCPUs of the domain to the specified altp2m view */
+int xc_altp2m_switch_to_view(xc_interface *handle, domid_t domid,
+                             uint16_t view_id)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_switch_p2m;
+    arg->domain = domid;
+    arg->u.view.view = view_id;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_set_mem_access(xc_interface *handle, domid_t domid,
+                             uint16_t view_id, xen_pfn_t gfn,
+                             xenmem_access_t access)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_set_mem_access;
+    arg->domain = domid;
+    arg->u.set_mem_access.view = view_id;
+    arg->u.set_mem_access.hvmmem_access = access;
+    arg->u.set_mem_access.gfn = gfn;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
+
+int xc_altp2m_change_gfn(xc_interface *handle, domid_t domid,
+                         uint16_t view_id, xen_pfn_t old_gfn,
+                         xen_pfn_t new_gfn)
+{
+    int rc;
+    DECLARE_HYPERCALL;
+    DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+    arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+    if ( arg == NULL )
+        return -1;
+
+    hypercall.op     = __HYPERVISOR_hvm_op;
+    hypercall.arg[0] = HVMOP_altp2m;
+    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+    arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+    arg->cmd = HVMOP_altp2m_change_gfn;
+    arg->domain = domid;
+    arg->u.change_gfn.view = view_id;
+    arg->u.change_gfn.old_gfn = old_gfn;
+    arg->u.change_gfn.new_gfn = new_gfn;
+
+    rc = do_xen_hypercall(handle, &hypercall);
+
+    xc_hypercall_buffer_free(handle, arg);
+    return rc;
+}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
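
The wrappers compose into the obvious enable/create/switch sequence. A
minimal caller is sketched below; note that, per the interface header,
the view's default access is currently ignored, so permissions still
need a separate xc_altp2m_set_mem_access() pass, as the xen-access
patch that follows does:

    /* Sketch: put a domain into a freshly created altp2m view. */
    #include <xenctrl.h>

    static int enter_altp2m_view(xc_interface *xch, domid_t domid,
                                 uint16_t *view_id)
    {
        int rc = xc_altp2m_set_domain_state(xch, domid, 1);

        if ( rc < 0 )
            return rc;

        rc = xc_altp2m_create_view(xch, domid, XENMEM_access_rx, view_id);
        if ( rc < 0 )
            return rc;

        return xc_altp2m_switch_to_view(xch, domid, *view_id);
    }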

* [PATCH v7 15/15] tools/xen-access: altp2m testcases
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (13 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 14/15] tools/libxc: add support to altp2m hvmops Ed White
@ 2015-07-22 23:01 ` Ed White
  2015-07-23 17:12 ` [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Wei Liu
  2015-07-24  9:56 ` Wei Liu
  16 siblings, 0 replies; 42+ messages in thread
From: Ed White @ 2015-07-22 23:01 UTC (permalink / raw)
  To: xen-devel
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, Ed White, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

From: Tamas K Lengyel <tlengyel@novetta.com>

A working altp2m test case. The test tool is extended to support
singlestepping, to better highlight the core altp2m feature of view
switching.

Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com>
Signed-off-by: Ed White <edmund.h.white@intel.com>

Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
Changes since v6:
        no changes

 tools/tests/xen-access/xen-access.c | 173 ++++++++++++++++++++++++++++++------
 1 file changed, 148 insertions(+), 25 deletions(-)

diff --git a/tools/tests/xen-access/xen-access.c b/tools/tests/xen-access/xen-access.c
index e6ca9ba..cdb8f4e 100644
--- a/tools/tests/xen-access/xen-access.c
+++ b/tools/tests/xen-access/xen-access.c
@@ -275,6 +275,19 @@ xenaccess_t *xenaccess_init(xc_interface **xch_r, domid_t domain_id)
     return NULL;
 }
 
+static inline
+int control_singlestep(
+    xc_interface *xch,
+    domid_t domain_id,
+    unsigned long vcpu,
+    bool enable)
+{
+    uint32_t op = enable ?
+        XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_ON : XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_OFF;
+
+    return xc_domain_debug_control(xch, domain_id, op, vcpu);
+}
+
 /*
  * Note that this function is not thread safe.
  */
@@ -317,13 +330,15 @@ static void put_response(vm_event_t *vm_event, vm_event_response_t *rsp)
 
 void usage(char* progname)
 {
-    fprintf(stderr,
-            "Usage: %s [-m] <domain_id> write|exec|breakpoint\n"
+    fprintf(stderr, "Usage: %s [-m] <domain_id> write|exec", progname);
+#if defined(__i386__) || defined(__x86_64__)
+            fprintf(stderr, "|breakpoint|altp2m_write|altp2m_exec");
+#endif
+            fprintf(stderr,
             "\n"
             "Logs first page writes, execs, or breakpoint traps that occur on the domain.\n"
             "\n"
-            "-m requires this program to run, or else the domain may pause\n",
-            progname);
+            "-m requires this program to run, or else the domain may pause\n");
 }
 
 int main(int argc, char *argv[])
@@ -341,6 +356,8 @@ int main(int argc, char *argv[])
     int required = 0;
     int breakpoint = 0;
     int shutting_down = 0;
+    int altp2m = 0;
+    uint16_t altp2m_view_id = 0;
 
     char* progname = argv[0];
     argv++;
@@ -379,10 +396,22 @@ int main(int argc, char *argv[])
         default_access = XENMEM_access_rw;
         after_first_access = XENMEM_access_rwx;
     }
+#if defined(__i386__) || defined(__x86_64__)
     else if ( !strcmp(argv[0], "breakpoint") )
     {
         breakpoint = 1;
     }
+    else if ( !strcmp(argv[0], "altp2m_write") )
+    {
+        default_access = XENMEM_access_rx;
+        altp2m = 1;
+    }
+    else if ( !strcmp(argv[0], "altp2m_exec") )
+    {
+        default_access = XENMEM_access_rw;
+        altp2m = 1;
+    }
+#endif
     else
     {
         usage(argv[0]);
@@ -415,22 +444,73 @@ int main(int argc, char *argv[])
         goto exit;
     }
 
-    /* Set the default access type and convert all pages to it */
-    rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
-    if ( rc < 0 )
+    /* With altp2m we just create a new, restricted view of the memory */
+    if ( altp2m )
     {
-        ERROR("Error %d setting default mem access type\n", rc);
-        goto exit;
-    }
+        xen_pfn_t gfn = 0;
+        unsigned long perm_set = 0;
+
+        rc = xc_altp2m_set_domain_state( xch, domain_id, 1 );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d enabling altp2m on domain!\n", rc);
+            goto exit;
+        }
+
+        rc = xc_altp2m_create_view( xch, domain_id, default_access, &altp2m_view_id );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d creating altp2m view!\n", rc);
+            goto exit;
+        }
 
-    rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
-                           (xenaccess->max_gpfn - START_PFN) );
+        DPRINTF("altp2m view created with id %u\n", altp2m_view_id);
+        DPRINTF("Setting altp2m mem_access permissions.. ");
 
-    if ( rc < 0 )
+        for ( ; gfn < xenaccess->max_gpfn; ++gfn )
+        {
+            rc = xc_altp2m_set_mem_access( xch, domain_id, altp2m_view_id, gfn,
+                                           default_access);
+            if ( !rc )
+                perm_set++;
+        }
+
+        DPRINTF("done! Permissions set on %lu pages.\n", perm_set);
+
+        rc = xc_altp2m_switch_to_view( xch, domain_id, altp2m_view_id );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d switching to altp2m view!\n", rc);
+            goto exit;
+        }
+
+        rc = xc_monitor_singlestep( xch, domain_id, 1 );
+        if ( rc < 0 )
+        {
+            ERROR("Error %d failed to enable singlestep monitoring!\n", rc);
+            goto exit;
+        }
+    }
+
+    if ( !altp2m )
     {
-        ERROR("Error %d setting all memory to access type %d\n", rc,
-              default_access);
-        goto exit;
+        /* Set the default access type and convert all pages to it */
+        rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
+        if ( rc < 0 )
+        {
+            ERROR("Error %d setting default mem access type\n", rc);
+            goto exit;
+        }
+
+        rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
+                               (xenaccess->max_gpfn - START_PFN) );
+
+        if ( rc < 0 )
+        {
+            ERROR("Error %d setting all memory to access type %d\n", rc,
+                  default_access);
+            goto exit;
+        }
     }
 
     if ( breakpoint )
@@ -448,13 +528,29 @@ int main(int argc, char *argv[])
     {
         if ( interrupted )
         {
+            /* Unregister for every event */
             DPRINTF("xenaccess shutting down on signal %d\n", interrupted);
 
-            /* Unregister for every event */
-            rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, ~0ull, 0);
-            rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, START_PFN,
-                                   (xenaccess->max_gpfn - START_PFN) );
-            rc = xc_monitor_software_breakpoint(xch, domain_id, 0);
+            if ( breakpoint )
+                rc = xc_monitor_software_breakpoint(xch, domain_id, 0);
+
+            if ( altp2m )
+            {
+                uint32_t vcpu_id;
+
+                rc = xc_altp2m_switch_to_view( xch, domain_id, 0 );
+                rc = xc_altp2m_destroy_view(xch, domain_id, altp2m_view_id);
+                rc = xc_altp2m_set_domain_state(xch, domain_id, 0);
+                rc = xc_monitor_singlestep(xch, domain_id, 0);
+
+                for ( vcpu_id = 0; vcpu_id < XEN_LEGACY_MAX_VCPUS; vcpu_id++ )
+                    rc = control_singlestep(xch, domain_id, vcpu_id, 0);
+
+            } else {
+                rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, ~0ull, 0);
+                rc = xc_set_mem_access(xch, domain_id, XENMEM_access_rwx, START_PFN,
+                                       (xenaccess->max_gpfn - START_PFN) );
+            }
 
             shutting_down = 1;
         }
@@ -500,7 +596,7 @@ int main(int argc, char *argv[])
                 }
 
                 printf("PAGE ACCESS: %c%c%c for GFN %"PRIx64" (offset %06"
-                       PRIx64") gla %016"PRIx64" (valid: %c; fault in gpt: %c; fault with gla: %c) (vcpu %u)\n",
+                       PRIx64") gla %016"PRIx64" (valid: %c; fault in gpt: %c; fault with gla: %c) (vcpu %u, altp2m view %u)\n",
                        (req.u.mem_access.flags & MEM_ACCESS_R) ? 'r' : '-',
                        (req.u.mem_access.flags & MEM_ACCESS_W) ? 'w' : '-',
                        (req.u.mem_access.flags & MEM_ACCESS_X) ? 'x' : '-',
@@ -510,9 +606,20 @@ int main(int argc, char *argv[])
                        (req.u.mem_access.flags & MEM_ACCESS_GLA_VALID) ? 'y' : 'n',
                        (req.u.mem_access.flags & MEM_ACCESS_FAULT_IN_GPT) ? 'y' : 'n',
                        (req.u.mem_access.flags & MEM_ACCESS_FAULT_WITH_GLA) ? 'y': 'n',
-                       req.vcpu_id);
+                       req.vcpu_id,
+                       req.altp2m_idx);
 
-                if ( default_access != after_first_access )
+                if ( altp2m && (req.flags & VM_EVENT_FLAG_ALTERNATE_P2M) )
+                {
+                    DPRINTF("\tSwitching back to default view!\n");
+
+                    rsp.reason = req.reason;
+                    rsp.flags = req.flags;
+                    rsp.altp2m_idx = 0;
+
+                    control_singlestep(xch, domain_id, rsp.vcpu_id, 1);
+                }
+                else if ( default_access != after_first_access )
                 {
                     rc = xc_set_mem_access(xch, domain_id, after_first_access,
                                            req.u.mem_access.gfn, 1);
@@ -525,7 +632,6 @@ int main(int argc, char *argv[])
                     }
                 }
 
-
                 rsp.u.mem_access.gfn = req.u.mem_access.gfn;
                 break;
             case VM_EVENT_REASON_SOFTWARE_BREAKPOINT:
@@ -546,6 +652,23 @@ int main(int argc, char *argv[])
                 }
 
                 break;
+            case VM_EVENT_REASON_SINGLESTEP:
+                printf("Singlestep: rip=%016"PRIx64", vcpu %d\n",
+                       req.data.regs.x86.rip,
+                       req.vcpu_id);
+
+                if ( altp2m )
+                {
+                    printf("\tSwitching altp2m to view %u!\n", altp2m_view_id);
+
+                    rsp.reason = req.reason;
+                    rsp.flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
+                    rsp.altp2m_idx = altp2m_view_id;
+                }
+
+                control_singlestep(xch, domain_id, req.vcpu_id, 0);
+
+                break;
             default:
                 fprintf(stderr, "UNKNOWN REASON CODE %d\n", req.reason);
             }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread
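
For reference, the new modes are invoked like the existing ones, e.g.
(the domain id here is just an example):

    # Trap writes via a restricted altp2m view, single-stepping over
    # each faulting access and then switching back to the view:
    ./xen-access 5 altp2m_write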

* Re: [PATCH v7 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-22 23:01 ` [PATCH v7 05/15] x86/altp2m: basic data structures and support routines Ed White
@ 2015-07-23  9:22   ` Jan Beulich
  2015-07-23 14:36     ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23  9:22 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Jun Nakajima, tlengyel, Daniel De Graaf

>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
> @@ -6569,6 +6571,25 @@ void hvm_toggle_singlestep(struct vcpu *v)
>      v->arch.hvm_vcpu.single_step = !v->arch.hvm_vcpu.single_step;
>  }
>  
> +void altp2m_vcpu_update_p2m(struct vcpu *v)
> +{
> +    if ( hvm_funcs.altp2m_vcpu_update_p2m )
> +        hvm_funcs.altp2m_vcpu_update_p2m(v);
> +}
> +
> +void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v)
> +{
> +    if ( hvm_funcs.altp2m_vcpu_update_vmfunc_ve )
> +        hvm_funcs.altp2m_vcpu_update_vmfunc_ve(v);
> +}
> +
> +bool_t altp2m_vcpu_emulate_ve(struct vcpu *v)
> +{
> +    if ( hvm_funcs.altp2m_vcpu_emulate_ve )
> +        return hvm_funcs.altp2m_vcpu_emulate_ve(v);
> +    return 0;
> +}

These are _still_ not inline functions (in hvm.h), and 00/15 also
doesn't mention that this was intentionally left out.

> @@ -498,6 +498,28 @@ int hap_enable(struct domain *d, u32 mode)
>             goto out;
>      }
>  
> +    if ( hvm_altp2m_supported() )
> +    {
> +        /* Init alternate p2m data */
> +        if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
> +        {
> +            rv = -ENOMEM;
> +            goto out;
> +        }
> +
> +        for ( i = 0; i < MAX_EPTP; i++ )
> +            d->arch.altp2m_eptp[i] = INVALID_MFN;

And there is _still_ EPT-specific code left in a generic file.

> @@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct domain *d)
>      }
>  }
>  
> +static void p2m_teardown_altp2m(struct domain *d)
> +{
> +    unsigned int i;
> +    struct p2m_domain *p2m;
> +
> +    for ( i = 0; i < MAX_ALTP2M; i++ )
> +    {
> +        if ( !d->arch.altp2m_p2m[i] )
> +            continue;
> +        p2m = d->arch.altp2m_p2m[i];
> +        p2m_free_one(p2m);
> +        d->arch.altp2m_p2m[i] = NULL;

If you already think it's necessary to latch the array entry into a
local variable, why don't you zap the array entry _before_ freeing
the structure?

> @@ -1940,6 +1988,59 @@ int unmap_mmio_regions(struct domain *d,
>      return err;
>  }
>  
> +unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
> +{
> +    struct p2m_domain *p2m;
> +    struct ept_data *ept;
> +    unsigned int i;
> +
> +    altp2m_list_lock(d);
> +
> +    for ( i = 0; i < MAX_ALTP2M; i++ )
> +    {
> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
> +            continue;
> +
> +        p2m = d->arch.altp2m_p2m[i];
> +        ept = &p2m->ept;
> +
> +        if ( eptp == ept_get_eptp(ept) )
> +            goto out;
> +    }
> +
> +    i = INVALID_ALTP2M;
> +
> +out:

Just to repeat - labels should be indented by at least one space.

And you already know what the comment is regarding this being
EPT-specific code (and here one can't even debate whether it's
just unfortunate naming, since ept_get_eptp() is _very_ EPT-
specific, and that macro - if headers were properly structured -
shouldn't even be visible here).

> --- /dev/null
> +++ b/xen/include/asm-x86/altp2m.h
> @@ -0,0 +1,38 @@
> +/*
> + * Alternate p2m HVM
> + * Copyright (c) 2014, Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
> + * Place - Suite 330, Boston, MA 02111-1307 USA.
> + */
> +
> +#ifndef _ALTP2M_H
> +#define _ALTP2M_H

I'm sure I said so before - this is not a specific enough guard symbol.
It should have at least an _X86 prefix.

> +#include <xen/types.h>
> +#include <xen/sched.h>         /* for struct vcpu, struct domain */
> +#include <asm/hvm/vcpu.h>      /* for vcpu_altp2m */

I don't see the type mentioned in the comment used anywhere
below - is this a stray include?

> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -233,6 +233,10 @@ struct paging_vcpu {
>  typedef xen_domctl_cpuid_t cpuid_input_t;
>  
>  #define MAX_NESTEDP2M 10
> +
> +#define MAX_ALTP2M      ((uint16_t)10)
> +#define INVALID_ALTP2M  ((uint16_t)~0)

Didn't you claim to have removed all stray casts?

Considering how late we are with this, this patch can have my ack
only provided you promise to address _all_ of the issues above in
follow-up patches.

Jan

^ permalink raw reply	[flat|nested] 42+ messages in thread
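
For the first comment, the requested shape is presumably plain inline
wrappers in hvm.h; a sketch of the suggested follow-up, not code from
the series:

    static inline void altp2m_vcpu_update_p2m(struct vcpu *v)
    {
        if ( hvm_funcs.altp2m_vcpu_update_p2m )
            hvm_funcs.altp2m_vcpu_update_p2m(v);
    }

    static inline bool_t altp2m_vcpu_emulate_ve(struct vcpu *v)
    {
        return hvm_funcs.altp2m_vcpu_emulate_ve ?
               hvm_funcs.altp2m_vcpu_emulate_ve(v) : 0;
    }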

* Re: [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-22 23:01 ` [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
@ 2015-07-23  9:43   ` Jan Beulich
  2015-07-23 14:40     ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23  9:43 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Jun Nakajima, tlengyel, Daniel De Graaf

>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
> @@ -1770,6 +1771,105 @@ static bool_t vmx_is_singlestep_supported(void)
>      return cpu_has_monitor_trap_flag;
>  }
>  
> +static void vmx_vcpu_update_eptp(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    struct p2m_domain *p2m = NULL;
> +    struct ept_data *ept;
> +
> +    if ( altp2m_active(d) )
> +        p2m = p2m_get_altp2m(v);
> +    if ( !p2m )
> +        p2m = p2m_get_hostp2m(d);
> +
> +    ept = &p2m->ept;
> +    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
> +
> +    vmx_vmcs_enter(v);
> +
> +    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
> +
> +    if ( v->arch.hvm_vmx.secondary_exec_control &
> +        SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )

Indentation.
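
I.e., the continuation line should align with the first operand:

    if ( v->arch.hvm_vmx.secondary_exec_control &
         SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )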

> +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
> +
> +    if ( !cpu_has_vmx_vmfunc )
> +        return;
> +
> +    if ( cpu_has_vmx_virt_exceptions )
> +        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
> +
> +    vmx_vmcs_enter(v);
> +
> +    if ( !d->is_dying && altp2m_active(d) )
> +    {
> +        v->arch.hvm_vmx.secondary_exec_control |= mask;
> +        __vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
> +        __vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
> +
> +        if ( cpu_has_vmx_virt_exceptions )
> +        {
> +            p2m_type_t t;
> +            mfn_t mfn;
> +
> +            mfn = get_gfn_query_unlocked(d, gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
> +
> +            if ( mfn_x(mfn) != INVALID_MFN )
> +                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);

Considering that the VMCS field holds a byte-aligned address, why
do you have the (later introduced) hvmop specify a GFN instead of
a GPA?

Also you shouldn't be open coding pfn_to_paddr().
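
I.e., this would better read:

    __vmwrite(VIRT_EXCEPTION_INFO, pfn_to_paddr(mfn_x(mfn)));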

> +            else
> +                v->arch.hvm_vmx.secondary_exec_control &=
> +                    ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
> +        }
> +    }
> +    else
> +        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
> +
> +    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> +        v->arch.hvm_vmx.secondary_exec_control);

Indentation again.
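
I.e.:

    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
              v->arch.hvm_vmx.secondary_exec_control);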

Jan


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-22 23:01 ` [PATCH v7 10/15] x86/altp2m: add remaining support routines Ed White
@ 2015-07-23 10:05   ` Jan Beulich
  2015-07-23 14:51     ` Sahita, Ravi
  2015-07-23 19:10   ` George Dunlap
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 10:05 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Jun Nakajima, tlengyel, Daniel De Graaf

>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
> Add the remaining routines required to support enabling the alternate
> p2m functionality.
> 
> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> Changes since v6:
>         rename altp2m lazy copier, make bool_t, use __put_gfn throughout,
>           and move to p2m.c, eliminating altp2m_hap.c
>         change various p2m_altp2m... routines from long to int
>         change uint16_t's/uint8_t's to unsigned int's
>         optimize remapped gfn tracking and altp2m invalidation check
>         mechanical changes due to patch 5 changes

Taken together clearly requiring dropping any earlier review tags.

Non-mm parts
Acked-by: Jan Beulich <jbeulich@suse.com>

> +void p2m_flush_altp2m(struct domain *d)
> +{
> +    unsigned int i;
> +
> +    altp2m_list_lock(d);
> +
> +    for ( i = 0; i < MAX_ALTP2M; i++ )
> +    {
> +        p2m_flush_table(d->arch.altp2m_p2m[i]);
> +        /* Uninit and reinit ept to force TLB shootdown */
> +        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
> +        ept_p2m_init(d->arch.altp2m_p2m[i]);
> +        d->arch.altp2m_eptp[i] = INVALID_MFN;
> +    }

And more EPT-specific code in a generic file...

> +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx)
> +{
> +    struct p2m_domain *p2m;
> +    int rc = -EINVAL;
> +
> +    if ( !idx || idx > MAX_ALTP2M )

>= (and then also elsewhere further down)?
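
I.e., with idx indexing MAX_ALTP2M array entries starting from zero:

    if ( !idx || idx >= MAX_ALTP2M )
        return rc;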

> +        return rc;
> +
> +    domain_pause_except_self(d);
> +
> +    altp2m_list_lock(d);
> +
> +    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
> +    {
> +        p2m = d->arch.altp2m_p2m[idx];
> +
> +        if ( !_atomic_read(p2m->active_vcpus) )
> +        {
> +            p2m_flush_table(d->arch.altp2m_p2m[idx]);
> +            /* Uninit and reinit ept to force TLB shootdown */
> +            ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
> +            ept_p2m_init(d->arch.altp2m_p2m[idx]);
> +            d->arch.altp2m_eptp[idx] = INVALID_MFN;

Perhaps worth making all of the above if() body a helper function
(considering that the loop body in p2m_flush_altp2m() does
exactly the same)?
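
A sketch of what such a helper could look like, shared by both call sites
(the name is illustrative only):

    /* Name is illustrative only. */
    static void p2m_reset_altp2m(struct domain *d, unsigned int idx)
    {
        p2m_flush_table(d->arch.altp2m_p2m[idx]);
        /* Uninit and reinit ept to force TLB shootdown. */
        ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
        ept_p2m_init(d->arch.altp2m_p2m[idx]);
        d->arch.altp2m_eptp[idx] = INVALID_MFN;
    }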

> @@ -758,6 +765,38 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx);
>  /* Check to see if vcpu should be switched to a different p2m. */
>  void p2m_altp2m_check(struct vcpu *v, uint16_t idx);

The context here suggests that I overlooked a still not replaced
uint16_t.
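
I.e., presumably:

    void p2m_altp2m_check(struct vcpu *v, unsigned int idx);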

> +/* Flush all the alternate p2m's for a domain */
> +void p2m_flush_altp2m(struct domain *d);
> +
> +/* Alternate p2m paging */
> +bool_t p2m_altp2m_lazy_copy(struct vcpu *v, paddr_t gpa,
> +    unsigned long gla, struct npfec npfec, struct p2m_domain **ap2m);
> +
> +/* Make a specific alternate p2m valid */
> +int p2m_init_altp2m_by_id(struct domain *d, unsigned int idx);
> +
> +/* Find an available alternate p2m and make it valid */
> +int p2m_init_next_altp2m(struct domain *d, uint16_t *idx);

For this one, however, things really depend on the intended call
sites, i.e. I could see reasons for this to be kept as is.

Jan


* Re: [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-22 23:01 ` [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
@ 2015-07-23 10:22   ` Jan Beulich
  2015-07-23 14:56     ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 10:22 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Jun Nakajima, tlengyel, Daniel De Graaf

>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
> Signed-off-by: Ed White <edmund.h.white@intel.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>

And I have to withdraw this ack pending clarification of (and perhaps
adjustment to) the #VE info address interface.

> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -6138,6 +6138,140 @@ static int hvmop_get_param(
>      return rc;
>  }
>  
> +static int do_altp2m_op(
> +    XEN_GUEST_HANDLE_PARAM(void) arg)
> +{
> +    struct xen_hvm_altp2m_op a;
> +    struct domain *d = NULL;
> +    int rc = 0;
> +
> +    if ( !hvm_altp2m_supported() )
> +        return -EOPNOTSUPP;
> +
> +    if ( copy_from_guest(&a, arg, 1) )
> +        return -EFAULT;
> +
> +    if ( a.pad1 || a.pad2 ||
> +         (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
> +         (a.cmd < HVMOP_altp2m_get_domain_state) ||
> +         (a.cmd > HVMOP_altp2m_change_gfn) )

I'm afraid such a change invalidates any earlier ack, even if it is
correct. Instead of this, why don't you start numbering of the
sub-ops at zero? Or if you really have a reason to start at 1,
why not simply check a.cmd against zero (without using any
particular sub-op value)? And then it escapes me why this can't
be handled in a default case in the switch statement below
anyway.
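
I.e., roughly the following shape (the error value for unknown sub-ops is
only a suggestion):

    switch ( a.cmd )
    {
    case HVMOP_altp2m_get_domain_state:
        ...
        break;

    /* ... further sub-ops ... */

    default:
        rc = -EOPNOTSUPP; /* or whichever error value is agreed upon */
        break;
    }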

Jan


* Re: [PATCH v7 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-23  9:22   ` Jan Beulich
@ 2015-07-23 14:36     ` Sahita, Ravi
  2015-07-23 14:53       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 14:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 2:22 AM
>
>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>> @@ -6569,6 +6571,25 @@ void hvm_toggle_singlestep(struct vcpu *v)
>>      v->arch.hvm_vcpu.single_step = !v->arch.hvm_vcpu.single_step;  }
>>
>> +void altp2m_vcpu_update_p2m(struct vcpu *v) {
>> +    if ( hvm_funcs.altp2m_vcpu_update_p2m )
>> +        hvm_funcs.altp2m_vcpu_update_p2m(v);
>> +}
>> +
>> +void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v) {
>> +    if ( hvm_funcs.altp2m_vcpu_update_vmfunc_ve )
>> +        hvm_funcs.altp2m_vcpu_update_vmfunc_ve(v);
>> +}
>> +
>> +bool_t altp2m_vcpu_emulate_ve(struct vcpu *v) {
>> +    if ( hvm_funcs.altp2m_vcpu_emulate_ve )
>> +        return hvm_funcs.altp2m_vcpu_emulate_ve(v);
>> +    return 0;
>> +}
>
>These are _still_ not inline functions (in hvm.h), and 00/15 also doesn't
>mention that this was intentionally left out.
>

Yup, possibly missed this one.
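
A sketch of the inline form being asked for, in hvm.h:

    static inline void altp2m_vcpu_update_p2m(struct vcpu *v)
    {
        if ( hvm_funcs.altp2m_vcpu_update_p2m )
            hvm_funcs.altp2m_vcpu_update_p2m(v);
    }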

>> @@ -498,6 +498,28 @@ int hap_enable(struct domain *d, u32 mode)
>>             goto out;
>>      }
>>
>> +    if ( hvm_altp2m_supported() )
>> +    {
>> +        /* Init alternate p2m data */
>> +        if ( (d->arch.altp2m_eptp = alloc_xenheap_page()) == NULL )
>> +        {
>> +            rv = -ENOMEM;
>> +            goto out;
>> +        }
>> +
>> +        for ( i = 0; i < MAX_EPTP; i++ )
>> +            d->arch.altp2m_eptp[i] = INVALID_MFN;
>
>And there is _still_ EPT-specific code left in a generic file.
>

This was mentioned explicitly in the cover letter and in the patch header -
although our error was that the comment was only in patch 10, when it should have been in patch 5 also:
"    Patch 10:   ... not done - abstracting some ept functionality from p2m"

>> @@ -183,6 +184,43 @@ static void p2m_teardown_nestedp2m(struct
>domain *d)
>>      }
>>  }
>>
>> +static void p2m_teardown_altp2m(struct domain *d) {
>> +    unsigned int i;
>> +    struct p2m_domain *p2m;
>> +
>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>> +    {
>> +        if ( !d->arch.altp2m_p2m[i] )
>> +            continue;
>> +        p2m = d->arch.altp2m_p2m[i];
>> +        p2m_free_one(p2m);
>> +        d->arch.altp2m_p2m[i] = NULL;
>
>If you already think it's necessary to latch the array entry into a local variable,
>why don't you zap the array entry _before_ freeing the structure?
>

Could be done.
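
I.e., roughly:

    p2m = d->arch.altp2m_p2m[i];
    d->arch.altp2m_p2m[i] = NULL;
    p2m_free_one(p2m);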

>> @@ -1940,6 +1988,59 @@ int unmap_mmio_regions(struct domain *d,
>>      return err;
>>  }
>>
>> +unsigned int p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
>> +{
>> +    struct p2m_domain *p2m;
>> +    struct ept_data *ept;
>> +    unsigned int i;
>> +
>> +    altp2m_list_lock(d);
>> +
>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>> +    {
>> +        if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>> +            continue;
>> +
>> +        p2m = d->arch.altp2m_p2m[i];
>> +        ept = &p2m->ept;
>> +
>> +        if ( eptp == ept_get_eptp(ept) )
>> +            goto out;
>> +    }
>> +
>> +    i = INVALID_ALTP2M;
>> +
>> +out:
>

Missed, sorry.


>Just to repeat - labels should be indented by at least one space.
>
>And you already know what the comment is regarding this being EPT-specific
>code (and here one can't even debate whether it's just unfortunate naming,
>since ept_get_eptp() is _very_ EPT- specific, and that macro - if headers were
>properly structured - shouldn't even be visible here).
>

It was mentioned, albeit in another patch (it should have been mentioned here also).

>> --- /dev/null
>> +++ b/xen/include/asm-x86/altp2m.h
>> @@ -0,0 +1,38 @@
>> +/*
>> + * Alternate p2m HVM
>> + * Copyright (c) 2014, Intel Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> +modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but
>> +WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of
>MERCHANTABILITY
>> +or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
>> +License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> +along with
>> + * this program; if not, write to the Free Software Foundation, Inc.,
>> +59 Temple
>> + * Place - Suite 330, Boston, MA 02111-1307 USA.
>> + */
>> +
>> +#ifndef _ALTP2M_H
>> +#define _ALTP2M_H
>
>I'm sure I said so before - this is not a specific enough guard symbol.
>It should have at least an _X86 prefix.
>

Right (I don't think I saw this before, but the comment makes sense).

>> +#include <xen/types.h>
>> +#include <xen/sched.h>         /* for struct vcpu, struct domain */
>> +#include <asm/hvm/vcpu.h>      /* for vcpu_altp2m */
>
>I don't see the type mentioned in the comment used anywhere below - is this
>a stray include?
>
>> --- a/xen/include/asm-x86/domain.h
>> +++ b/xen/include/asm-x86/domain.h
>> @@ -233,6 +233,10 @@ struct paging_vcpu {  typedef xen_domctl_cpuid_t
>> cpuid_input_t;
>>
>>  #define MAX_NESTEDP2M 10
>> +
>> +#define MAX_ALTP2M      ((uint16_t)10)
>> +#define INVALID_ALTP2M  ((uint16_t)~0)
>
>Didn't you claim to have removed all stray casts?
>
>Considering how late we are with this, this patch can have my ack only
>provided you promise to address _all_ of the issues above in follow-up
>patches.

Yes - no problem.

>
>Jan


* Re: [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-23  9:43   ` Jan Beulich
@ 2015-07-23 14:40     ` Sahita, Ravi
  2015-07-23 15:00       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 14:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 2:43 AM
>
>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>> @@ -1770,6 +1771,105 @@ static bool_t
>vmx_is_singlestep_supported(void)
>>      return cpu_has_monitor_trap_flag;  }
>>
>> +static void vmx_vcpu_update_eptp(struct vcpu *v) {
>> +    struct domain *d = v->domain;
>> +    struct p2m_domain *p2m = NULL;
>> +    struct ept_data *ept;
>> +
>> +    if ( altp2m_active(d) )
>> +        p2m = p2m_get_altp2m(v);
>> +    if ( !p2m )
>> +        p2m = p2m_get_hostp2m(d);
>> +
>> +    ept = &p2m->ept;
>> +    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
>> +
>> +    vmx_vmcs_enter(v);
>> +
>> +    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
>> +
>> +    if ( v->arch.hvm_vmx.secondary_exec_control &
>> +        SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
>
>Indentation.
>

Ok.

>> +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v) {
>> +    struct domain *d = v->domain;
>> +    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
>> +
>> +    if ( !cpu_has_vmx_vmfunc )
>> +        return;
>> +
>> +    if ( cpu_has_vmx_virt_exceptions )
>> +        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
>> +
>> +    vmx_vmcs_enter(v);
>> +
>> +    if ( !d->is_dying && altp2m_active(d) )
>> +    {
>> +        v->arch.hvm_vmx.secondary_exec_control |= mask;
>> +        __vmwrite(VM_FUNCTION_CONTROL,
>VMX_VMFUNC_EPTP_SWITCHING);
>> +        __vmwrite(EPTP_LIST_ADDR,
>> + virt_to_maddr(d->arch.altp2m_eptp));
>> +
>> +        if ( cpu_has_vmx_virt_exceptions )
>> +        {
>> +            p2m_type_t t;
>> +            mfn_t mfn;
>> +
>> +            mfn = get_gfn_query_unlocked(d,
>> + gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
>> +
>> +            if ( mfn_x(mfn) != INVALID_MFN )
>> +                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) <<
>> + PAGE_SHIFT);
>
>Considering that the VMCS field holds a byte-aligned address, why do you
>have the (later introduced) hvmop specify a GFN instead of a GPA?
>

The SDM states:
" If the "EPT-violation #VE" VM-execution control is 1, the virtualization-exception information address must
satisfy the following checks:
- Bits 11:0 of the address must be 0.
- The address must not set any bits beyond the processor's physical-address width."

>Also you shouldn't be open coding pfn_to_paddr().
>

ok

thanks,
Ravi

>> +            else
>> +                v->arch.hvm_vmx.secondary_exec_control &=
>> +                    ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
>> +        }
>> +    }
>> +    else
>> +        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
>> +
>> +    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
>> +        v->arch.hvm_vmx.secondary_exec_control);
>
>Indentation again.
>
>Jan


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-23 10:05   ` Jan Beulich
@ 2015-07-23 14:51     ` Sahita, Ravi
  2015-07-23 15:02       ` Jan Beulich
  2015-07-23 16:08       ` George Dunlap
  0 siblings, 2 replies; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 14:51 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 3:06 AM
>
>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>> Add the remaining routines required to support enabling the alternate
>> p2m functionality.
>>
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> Changes since v6:
>>         rename altp2m lazy copier, make bool_t, use __put_gfn throughout,
>>           and move to p2m.c, eliminating altp2m_hap.c
>>         change various p2m_altp2m... routines from long to int
>>         change uint16_t's/uint8_t's to unsigned int's
>>         optimize remapped gfn tracking and altp2m invalidation check
>>         mechanical changes due to patch 5 changes
>
>Taken together clearly requiring dropping any earlier review tags.
>
>Non-mm parts
>Acked-by: Jan Beulich <jbeulich@suse.com>
>

Ok - other maintainers, please review the mm parts - thanks.

>> +void p2m_flush_altp2m(struct domain *d) {
>> +    unsigned int i;
>> +
>> +    altp2m_list_lock(d);
>> +
>> +    for ( i = 0; i < MAX_ALTP2M; i++ )
>> +    {
>> +        p2m_flush_table(d->arch.altp2m_p2m[i]);
>> +        /* Uninit and reinit ept to force TLB shootdown */
>> +        ept_p2m_uninit(d->arch.altp2m_p2m[i]);
>> +        ept_p2m_init(d->arch.altp2m_p2m[i]);
>> +        d->arch.altp2m_eptp[i] = INVALID_MFN;
>> +    }
>
>And more EPT-specific code in a generic file...
>

This was mentioned explicitly as pending in the cover letter and in the patch header.
"    Patch 10:   ... not done - abstracting some ept functionality from p2m"

>> +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx) {
>> +    struct p2m_domain *p2m;
>> +    int rc = -EINVAL;
>> +
>> +    if ( !idx || idx > MAX_ALTP2M )
>
>     >= (and then also elsewhere further down)?
>

Right.

>> +        return rc;
>> +
>> +    domain_pause_except_self(d);
>> +
>> +    altp2m_list_lock(d);
>> +
>> +    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
>> +    {
>> +        p2m = d->arch.altp2m_p2m[idx];
>> +
>> +        if ( !_atomic_read(p2m->active_vcpus) )
>> +        {
>> +            p2m_flush_table(d->arch.altp2m_p2m[idx]);
>> +            /* Uninit and reinit ept to force TLB shootdown */
>> +            ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
>> +            ept_p2m_init(d->arch.altp2m_p2m[idx]);
>> +            d->arch.altp2m_eptp[idx] = INVALID_MFN;
>
>Perhaps worth making all of the above if() body a helper function (considering
>that the loop body in p2m_flush_altp2m() does exactly the same)?
>

Ok, we can consider that.


>> @@ -758,6 +765,38 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct
>vcpu
>> *v, unsigned int idx);
>>  /* Check to see if vcpu should be switched to a different p2m. */
>> void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
>
>The context here suggests that I overlooked a still not replaced uint16_t.
>

Ok.

Just wanted to make sure these are also ok to do post 4.6.

Thanks,
Ravi

>> +/* Flush all the alternate p2m's for a domain */ void
>> +p2m_flush_altp2m(struct domain *d);
>> +
>> +/* Alternate p2m paging */
>> +bool_t p2m_altp2m_lazy_copy(struct vcpu *v, paddr_t gpa,
>> +    unsigned long gla, struct npfec npfec, struct p2m_domain **ap2m);
>> +
>> +/* Make a specific alternate p2m valid */ int
>> +p2m_init_altp2m_by_id(struct domain *d, unsigned int idx);
>> +
>> +/* Find an available alternate p2m and make it valid */ int
>> +p2m_init_next_altp2m(struct domain *d, uint16_t *idx);
>
>For this one, however, things really depend on the intended call sites, i.e. I
>could see reasons for this to be kept as is.
>
>Jan


* Re: [PATCH v7 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-23 14:36     ` Sahita, Ravi
@ 2015-07-23 14:53       ` Jan Beulich
  2015-07-23 15:00         ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 14:53 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, Jun Nakajima, tlengyel,
	Daniel De Graaf

>>> On 23.07.15 at 16:36, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 23, 2015 2:22 AM
>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>> +out:
>>
> 
> Missed sorry.
> 
> 
>>Just to repeat - labels should be indented by at least one space.

But you realize this isn't the only case, and it looked like you adjusted
only exactly the one place where the comment was made?

Jan


* Re: [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-23 10:22   ` Jan Beulich
@ 2015-07-23 14:56     ` Sahita, Ravi
  2015-07-23 15:08       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 14:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 3:22 AM
>
>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>
>> Acked-by: Jan Beulich <jbeulich@suse.com>
>
>And I have to withdraw this ack pending clarification of (and perhaps
>adjustment to) the #VE info address interface.
>

Could we have the ack back? :-) I clarified the #VE info address interface in the other email - repeating it here:

" If the "EPT-violation #VE" VM-execution control is 1, the virtualization-exception information address must
satisfy the following checks:
- Bits 11:0 of the address must be 0.
- The address must not set any bits beyond the processor's physical-address width."


>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -6138,6 +6138,140 @@ static int hvmop_get_param(
>>      return rc;
>>  }
>>
>> +static int do_altp2m_op(
>> +    XEN_GUEST_HANDLE_PARAM(void) arg) {
>> +    struct xen_hvm_altp2m_op a;
>> +    struct domain *d = NULL;
>> +    int rc = 0;
>> +
>> +    if ( !hvm_altp2m_supported() )
>> +        return -EOPNOTSUPP;
>> +
>> +    if ( copy_from_guest(&a, arg, 1) )
>> +        return -EFAULT;
>> +
>> +    if ( a.pad1 || a.pad2 ||
>> +         (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
>> +         (a.cmd < HVMOP_altp2m_get_domain_state) ||
>> +         (a.cmd > HVMOP_altp2m_change_gfn) )
>
>I'm afraid such a change invalidates any earlier ack, even if ti is correct. Instead
>of this, why don't you start numbering of the sub-ops at zero? Or if you really
>have a reason to start at 1, why not simply check a.cmd against zero (without
>using any particular sub-op value)? And then it escapes me why this can't be
>handled in a default case in the switch statement below anyway.

Hmm - is that a requirement per se? We are checking consistently, per the sub-op definition we have.
We would like this to be considered as is.

As I said in the cover letter, we have constraints on how much more we can do this week -
so we are requesting that the maintainers accept v7, with your review comments on these patches recorded as pending, to be addressed by us.

Thanks much!
Ravi



>
>Jan


* Re: [PATCH v7 05/15] x86/altp2m: basic data structures and support routines.
  2015-07-23 14:53       ` Jan Beulich
@ 2015-07-23 15:00         ` Sahita, Ravi
  0 siblings, 0 replies; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 15:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 7:54 AM
>
>>>> On 23.07.15 at 16:36, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Thursday, July 23, 2015 2:22 AM
>>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>>> +out:
>>>
>>
>> Missed sorry.
>>
>>
>>>Just to repeat - labels should be indented by at least one space.
>
>But you realize this isn't the only case, and it looked like you adjusted only
>exactly the one place where the comment was made?
>

I agree these kinds of things are hard to catch - we did go in and scan our v7 for comment styles, indentation, case indentation, spaces in if conditional expressions, etc.,
but of course we missed some, and this could be addressed - Andrew was also suggesting that some sort of tool to do this would be good to use.

thanks
Ravi

>Jan


* Re: [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-23 14:40     ` Sahita, Ravi
@ 2015-07-23 15:00       ` Jan Beulich
  2015-07-23 15:02         ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 15:00 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, Jun Nakajima, tlengyel,
	Daniel De Graaf

>>> On 23.07.15 at 16:40, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 23, 2015 2:43 AM
>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>> +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v) {
>>> +    struct domain *d = v->domain;
>>> +    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
>>> +
>>> +    if ( !cpu_has_vmx_vmfunc )
>>> +        return;
>>> +
>>> +    if ( cpu_has_vmx_virt_exceptions )
>>> +        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
>>> +
>>> +    vmx_vmcs_enter(v);
>>> +
>>> +    if ( !d->is_dying && altp2m_active(d) )
>>> +    {
>>> +        v->arch.hvm_vmx.secondary_exec_control |= mask;
>>> +        __vmwrite(VM_FUNCTION_CONTROL,
>>VMX_VMFUNC_EPTP_SWITCHING);
>>> +        __vmwrite(EPTP_LIST_ADDR,
>>> + virt_to_maddr(d->arch.altp2m_eptp));
>>> +
>>> +        if ( cpu_has_vmx_virt_exceptions )
>>> +        {
>>> +            p2m_type_t t;
>>> +            mfn_t mfn;
>>> +
>>> +            mfn = get_gfn_query_unlocked(d,
>>> + gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
>>> +
>>> +            if ( mfn_x(mfn) != INVALID_MFN )
>>> +                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) <<
>>> + PAGE_SHIFT);
>>
>>Considering that the VMCS field holds a byte-aligned address, why do you
>>have the (later introduced) hvmop specify a GFN instead of a GPA?
>>
> 
> The SDM states:
> " If the "EPT-violation #VE" VM-execution control is 1, the 
> virtualization-exception information address must
> satisfy the following checks:
> - Bits 11:0 of the address must be 0.
> - The address must not set any bits beyond the processor's physical-address 
> width."

Ah, interesting. Knowing what to search for, I was able to find this.
But to be honest, it should also be stated in the section talking
about #VE, not just in the one talking about checks being done.

With that, just like for the other patch my ack depends on the
promise to deal with all the remaining issues.

Jan


* Re: [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE.
  2015-07-23 15:00       ` Jan Beulich
@ 2015-07-23 15:02         ` Sahita, Ravi
  0 siblings, 0 replies; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 15:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 8:00 AM
>To: Sahita, Ravi
>Cc: Andrew Cooper; Wei Liu; George Dunlap; Ian Jackson; White, Edmund H;
>Nakajima, Jun; xen-devel@lists.xen.org; tlengyel@novetta.com; Daniel De
>Graaf; Tim Deegan
>Subject: RE: [PATCH v7 06/15] VMX/altp2m: add code to support EPTP
>switching and #VE.
>
>>>> On 23.07.15 at 16:40, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Thursday, July 23, 2015 2:43 AM
>>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>>> +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v) {
>>>> +    struct domain *d = v->domain;
>>>> +    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
>>>> +
>>>> +    if ( !cpu_has_vmx_vmfunc )
>>>> +        return;
>>>> +
>>>> +    if ( cpu_has_vmx_virt_exceptions )
>>>> +        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
>>>> +
>>>> +    vmx_vmcs_enter(v);
>>>> +
>>>> +    if ( !d->is_dying && altp2m_active(d) )
>>>> +    {
>>>> +        v->arch.hvm_vmx.secondary_exec_control |= mask;
>>>> +        __vmwrite(VM_FUNCTION_CONTROL,
>>>VMX_VMFUNC_EPTP_SWITCHING);
>>>> +        __vmwrite(EPTP_LIST_ADDR,
>>>> + virt_to_maddr(d->arch.altp2m_eptp));
>>>> +
>>>> +        if ( cpu_has_vmx_virt_exceptions )
>>>> +        {
>>>> +            p2m_type_t t;
>>>> +            mfn_t mfn;
>>>> +
>>>> +            mfn = get_gfn_query_unlocked(d,
>>>> + gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
>>>> +
>>>> +            if ( mfn_x(mfn) != INVALID_MFN )
>>>> +                __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) <<
>>>> + PAGE_SHIFT);
>>>
>>>Considering that the VMCS field holds a byte-aligned address, why do
>>>you have the (later introduced) hvmop specify a GFN instead of a GPA?
>>>
>>
>> The SDM states:
>> " If the "EPT-violation #VE" VM-execution control is 1, the
>> virtualization-exception information address must satisfy the
>> following checks:
>> - Bits 11:0 of the address must be 0.
>> - The address must not set any bits beyond the processor's
>> physical-address width."
>
>Ah, interesting. Knowing what to search for, I was able to find this.
>But to be honest, it should also be stated in the section talking about #VE, not
>just in the one talking about checks being done.
>
>With that, just like for the other patch my ack depends on the promise to deal
>with all the remaining issues.

Yes, our SDM has information that's not organized in the best way...
Agreed on the remaining issues.

Thanks
Ravi

>
>Jan


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-23 14:51     ` Sahita, Ravi
@ 2015-07-23 15:02       ` Jan Beulich
  2015-07-23 16:08       ` George Dunlap
  1 sibling, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 15:02 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, Jun Nakajima, tlengyel,
	Daniel De Graaf

>>> On 23.07.15 at 16:51, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 23, 2015 3:06 AM
>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>> @@ -758,6 +765,38 @@ bool_t p2m_switch_vcpu_altp2m_by_id(struct
>>vcpu
>>> *v, unsigned int idx);
>>>  /* Check to see if vcpu should be switched to a different p2m. */
>>> void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
>>
>>The context here suggests that I overlooked a still not replaced uint16_t.
>>
> 
> Ok.
> 
> Just wanted to make sure these are also ok to do post 4.6

I guess so, but you realize you're stretching it? Plus the patch can't
go in as is anyway, considering the off-by-one bugs found.

Jan


* Re: [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-23 14:56     ` Sahita, Ravi
@ 2015-07-23 15:08       ` Jan Beulich
  2015-07-23 15:16         ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 15:08 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, Jun Nakajima, tlengyel,
	Daniel De Graaf

>>> On 23.07.15 at 16:56, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 23, 2015 3:22 AM
>>
>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>>
>>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>
>>And I have to withdraw this ack pending clarification of (and perhaps
>>adjustment to) the #VE info address interface.
>>
> 
> Could we have the ack back :-) I clarified the #VE info address interface in 
> the other email - repeating here:
> 
> " If the "EPT-violation #VE" VM-execution control is 1, the 
> virtualization-exception information address must
> satisfy the following checks:
> - Bits 11:0 of the address must be 0.
> - The address must not set any bits beyond the processor's physical-address 
> width."

Yes, for this aspect.

>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -6138,6 +6138,140 @@ static int hvmop_get_param(
>>>      return rc;
>>>  }
>>>
>>> +static int do_altp2m_op(
>>> +    XEN_GUEST_HANDLE_PARAM(void) arg) {
>>> +    struct xen_hvm_altp2m_op a;
>>> +    struct domain *d = NULL;
>>> +    int rc = 0;
>>> +
>>> +    if ( !hvm_altp2m_supported() )
>>> +        return -EOPNOTSUPP;
>>> +
>>> +    if ( copy_from_guest(&a, arg, 1) )
>>> +        return -EFAULT;
>>> +
>>> +    if ( a.pad1 || a.pad2 ||
>>> +         (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
>>> +         (a.cmd < HVMOP_altp2m_get_domain_state) ||
>>> +         (a.cmd > HVMOP_altp2m_change_gfn) )
>>
>>I'm afraid such a change invalidates any earlier ack, even if it is correct. 
> Instead
>>of this, why don't you start numbering of the sub-ops at zero? Or if you 
> really
>>have a reason to start at 1, why not simply check a.cmd against zero (without
>>using any particular sub-op value)? And then it escapes me why this can't be
>>handled in a default case in the switch statement below anyway.
> 
> Hmm - is that a requirement per se? We are checking consistently, per the
> sub-op definition we have.

Well, in a way. But doing range checks like this means future
additions of sub-ops would always need to touch this code. Quite
different from doing it in the default case of a switch statement.
Plus, can you see how the expression is going to look if in
interface version 2 you need to remove one or two of the current
entries, replacing them with new, higher numbers?

> Would like this to be considered as is. 
> 
> As I said in the cover letter we have constraints on how much more we can do 
> this week now - 
> so requesting the maintainers to accept v7 with the review comments you have 
> on those recorded as pending to be addressed by us.

Yes, on that basis, albeit extremely hesitantly to be honest. If any
other maintainer would be as hesitant as I am about this, I would
likely put the two together to yield a NAK.

Jan


* Re: [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.
  2015-07-23 15:08       ` Jan Beulich
@ 2015-07-23 15:16         ` Sahita, Ravi
  0 siblings, 0 replies; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 15:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 8:09 AM
>
>>>> On 23.07.15 at 16:56, <ravi.sahita@intel.com> wrote:
>>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>>Sent: Thursday, July 23, 2015 3:22 AM
>>>
>>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>>> Signed-off-by: Ed White <edmund.h.white@intel.com>
>>>>
>>>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>>
>>>And I have to withdraw this ack pending clarification of (and perhaps
>>>adjustment to) the #VE info address interface.
>>>
>>
>> Could we have the ack back :-) I clarified the #VE info address
>> interface in the other email - repeating here:
>>
>> " If the "EPT-violation #VE" VM-execution control is 1, the
>> virtualization-exception information address must satisfy the
>> following checks:
>> - Bits 11:0 of the address must be 0.
>> - The address must not set any bits beyond the processor's
>> physical-address width."
>
>Yes, for this aspect.
>
>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>> @@ -6138,6 +6138,140 @@ static int hvmop_get_param(
>>>>      return rc;
>>>>  }
>>>>
>>>> +static int do_altp2m_op(
>>>> +    XEN_GUEST_HANDLE_PARAM(void) arg) {
>>>> +    struct xen_hvm_altp2m_op a;
>>>> +    struct domain *d = NULL;
>>>> +    int rc = 0;
>>>> +
>>>> +    if ( !hvm_altp2m_supported() )
>>>> +        return -EOPNOTSUPP;
>>>> +
>>>> +    if ( copy_from_guest(&a, arg, 1) )
>>>> +        return -EFAULT;
>>>> +
>>>> +    if ( a.pad1 || a.pad2 ||
>>>> +         (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
>>>> +         (a.cmd < HVMOP_altp2m_get_domain_state) ||
>>>> +         (a.cmd > HVMOP_altp2m_change_gfn) )
>>>
>>>I'm afraid such a change invalidates any earlier ack, even if ti is correct.
>> Instead
>>>of this, why don't you start numbering of the sub-ops at zero? Or if
>>>you
>> really
>>>have a reason to start at 1, why not simply check a.cmd against zero
>>>(without using any particular sub-op value)? And then it escapes me
>>>why this can't be handled in a default case in the switch statement below
>anyway.
>>
>> Hmm - is that a requirement per se? we are consistently checking per
>> the sub-op definition we have.
>
>Well, in a way. But doing range checks like this means future additions of sub-
>ops would always need to touch this code. Quite different from doing it in the
>default case of a switch statement.
>Plus, can you see how the expression is going to look if in interface version
>2 you need to remove one or two of the current entries, replacing them with
>new, higher numbers?
>

Yes, in a revision I would handle this with the default case.

>> Would like this to be considered as is.
>>
>> As I said in the cover letter we have constraints on how much more we
>> can do this week now - so requesting the maintainers to accept v7 with
>> the review comments you have on those recorded as pending to be
>> addressed by us.
>
>Yes, on that basis, albeit extremely hesitantly to be honest. If any other
>maintainer would be as hesitant as I am about this, I would likely put the two
>together to yield a NAK.

Thanks for your flexibility and honesty on this -
I hope the other maintainers are ok with v7 in the same way, keeping the other comments you have already logged as pending for post-4.6.

Ravi

>
>Jan


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-23 14:51     ` Sahita, Ravi
  2015-07-23 15:02       ` Jan Beulich
@ 2015-07-23 16:08       ` George Dunlap
  2015-07-23 16:15         ` Jan Beulich
  1 sibling, 1 reply; 42+ messages in thread
From: George Dunlap @ 2015-07-23 16:08 UTC (permalink / raw)
  To: Sahita, Ravi, Jan Beulich
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

On 07/23/2015 03:51 PM, Sahita, Ravi wrote:
>>> +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx) {
>>> +    struct p2m_domain *p2m;
>>> +    int rc = -EINVAL;
>>> +
>>> +    if ( !idx || idx > MAX_ALTP2M )
>>
>>     >= (and then also elsewhere further down)?
>>
> 
> Right.

[snip]

> Just wanted to make sure these are also ok to do post 4.6

Well the off-by-one errors certainly need to be fixed for 4.6.

If this was the only thing holding it up, the committer could fix it up
on check-in, or we could take a fix-up patch afterwards.

 -George


* Re: [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops
  2015-07-22 23:01 ` [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
@ 2015-07-23 16:08   ` Jan Beulich
  2015-07-23 16:56     ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 16:08 UTC (permalink / raw)
  To: Ed White
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, xen-devel, Jun Nakajima, tlengyel, Daniel De Graaf

>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -6005,6 +6005,9 @@ static int hvmop_set_param(
>                  nestedhvm_vcpu_destroy(v);
>          break;
>      case HVM_PARAM_ALTP2M:
> +        rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
> +        if ( rc )
> +            break;
>          if ( a.value > 1 )
>              rc = -EINVAL;
>          if ( a.value &&
> @@ -6189,6 +6192,9 @@ static int do_altp2m_op(
>          goto out;
>      }
>  
> +    if ( (rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d ? d : current->domain)) )
> +        goto out;
> +

Shouldn't this have got updated after the changes to patch 11?

Jan


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-23 16:08       ` George Dunlap
@ 2015-07-23 16:15         ` Jan Beulich
  2015-07-23 16:50           ` Sahita, Ravi
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2015-07-23 16:15 UTC (permalink / raw)
  To: George Dunlap
  Cc: Tim Deegan, Ravi Sahita, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Edmund H White, xen-devel, Jun Nakajima, tlengyel,
	Daniel De Graaf

>>> On 23.07.15 at 18:08, <george.dunlap@citrix.com> wrote:
> On 07/23/2015 03:51 PM, Sahita, Ravi wrote:
>>>> +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx) {
>>>> +    struct p2m_domain *p2m;
>>>> +    int rc = -EINVAL;
>>>> +
>>>> +    if ( !idx || idx > MAX_ALTP2M )
>>>
>>>     >= (and then also elsewhere further down)?
>>>
>> 
>> Right.
> 
> [snip]
> 
>> Just wanted to make sure these are also ok to do post 4.6
> 
> Well the off-by-one errors certainly need to be fixed for 4.6.
> 
> If this was the only thing holding it up, the committer could fix it up
> on check-in, or we could take a fix-up patch afterwards.

Sure, but on this one in particular I intentionally gave my ack
only for the non-mm parts, hoping you would do another pass
over them (and then hopefully ack them).

Jan


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-23 16:15         ` Jan Beulich
@ 2015-07-23 16:50           ` Sahita, Ravi
  0 siblings, 0 replies; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 16:50 UTC (permalink / raw)
  To: Jan Beulich, George Dunlap
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 9:16 AM
>
>>>> On 23.07.15 at 18:08, <george.dunlap@citrix.com> wrote:
>> On 07/23/2015 03:51 PM, Sahita, Ravi wrote:
>>>>> +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx) {
>>>>> +    struct p2m_domain *p2m;
>>>>> +    int rc = -EINVAL;
>>>>> +
>>>>> +    if ( !idx || idx > MAX_ALTP2M )
>>>>
>>>>     >= (and then also elsewhere further down)?
>>>>
>>>
>>> Right.
>>
>> [snip]
>>
>>> Just wanted to make sure these are also ok to do post 4.6
>>
>> Well the off-by-one errors certainly need to be fixed for 4.6.
>>
>> If this was the only thing holding it up, the committer could fix it
>> up on check-in, or we could take a fix-up patch afterwards.

This was the only one for 4.6; the others were post-4.6 - so if this could be done by the committer, that would be great, George. Thanks, Ravi

>
>Sure, but on this one in particular I intentionally gave my ack only for the non-
>mm parts, hoping you would do another pass over them (and then hopefully
>ack them).
>
>Jan


* Re: [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops
  2015-07-23 16:08   ` Jan Beulich
@ 2015-07-23 16:56     ` Sahita, Ravi
  2015-07-24  7:49       ` Jan Beulich
  0 siblings, 1 reply; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-23 16:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Sahita, Ravi, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, White, Edmund H, xen-devel, Nakajima, Jun, tlengyel,
	Daniel De Graaf

>From: Jan Beulich [mailto:JBeulich@suse.com]
>Sent: Thursday, July 23, 2015 9:09 AM
>
>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -6005,6 +6005,9 @@ static int hvmop_set_param(
>>                  nestedhvm_vcpu_destroy(v);
>>          break;
>>      case HVM_PARAM_ALTP2M:
>> +        rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
>> +        if ( rc )
>> +            break;
>>          if ( a.value > 1 )
>>              rc = -EINVAL;
>>          if ( a.value &&
>> @@ -6189,6 +6192,9 @@ static int do_altp2m_op(
>>          goto out;
>>      }
>>
>> +    if ( (rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d ? d : current->domain)) )
>> +        goto out;
>> +
>
>Shouldn't this have got updated after the changes to patch 11?

It was updated - a number of similar calls to check xsm permissions for this op were collapsed due to the sub-op'ing.
We don't differentiate across sub-ops for altp2m w.r.t. XSM permissions.

Thanks,
Ravi

>
>Jan


* Re: [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (14 preceding siblings ...)
  2015-07-22 23:01 ` [PATCH v7 15/15] tools/xen-access: altp2m testcases Ed White
@ 2015-07-23 17:12 ` Wei Liu
  2015-07-23 19:11   ` George Dunlap
  2015-07-24  9:56 ` Wei Liu
  16 siblings, 1 reply; 42+ messages in thread
From: Wei Liu @ 2015-07-23 17:12 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, xen-devel, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Hi all

As I understand it, most pending issues of this series are minor and the
final day for committing is tomorrow. Checking this series in as-is is
going to create some technical debt that either maintainers or Intel
developers need to pay back in the future (and Intel has signed up for
that, thank you). The clock is ticking but there is still disagreement,
so I would like to share some of my thoughts from a strategic point of
view.

I think this feature is important and gives us strategic advantages. I'm
not aware of other open source hypervisors that support such a feature. It
is essential to VM introspection. It will help the Xen project catch up in
the NFV market. The feature is interesting in its own right; I would imagine
power users / developers can do many interesting things with it.

From my point of view, the benefits outweigh the technical debt.

Wei.


* Re: [PATCH v7 10/15] x86/altp2m: add remaining support routines.
  2015-07-22 23:01 ` [PATCH v7 10/15] x86/altp2m: add remaining support routines Ed White
  2015-07-23 10:05   ` Jan Beulich
@ 2015-07-23 19:10   ` George Dunlap
  1 sibling, 0 replies; 42+ messages in thread
From: George Dunlap @ 2015-07-23 19:10 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Jan Beulich, Tim Deegan, Ian Jackson,
	xen-devel, Jun Nakajima, Andrew Cooper, tlengyel,
	Daniel De Graaf

On Thu, Jul 23, 2015 at 12:01 AM, Ed White <edmund.h.white@intel.com> wrote:
> +int p2m_destroy_altp2m_by_id(struct domain *d, unsigned int idx)
> +{
> +    struct p2m_domain *p2m;
> +    int rc = -EINVAL;
> +
> +    if ( !idx || idx > MAX_ALTP2M )
> +        return rc;
> +
> +    domain_pause_except_self(d);
> +
> +    altp2m_list_lock(d);
> +
> +    if ( d->arch.altp2m_eptp[idx] != INVALID_MFN )
> +    {
> +        p2m = d->arch.altp2m_p2m[idx];
> +
> +        if ( !_atomic_read(p2m->active_vcpus) )
> +        {
> +            p2m_flush_table(d->arch.altp2m_p2m[idx]);
> +            /* Uninit and reinit ept to force TLB shootdown */
> +            ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
> +            ept_p2m_init(d->arch.altp2m_p2m[idx]);
> +            d->arch.altp2m_eptp[idx] = INVALID_MFN;
> +            rc = 0;
> +        }
> +    }
> +
> +    altp2m_list_unlock(d);
> +
> +    domain_unpause_except_self(d);

Can you put on your list of clean-ups to send before the 4.6 release
to have this return -EBUSY rather than -EINVAL if there are active
vcpus?
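
I.e., a sketch of the suggested shape, inside the existing eptp-valid branch:

    rc = -EBUSY;
    if ( !_atomic_read(p2m->active_vcpus) )
    {
        /* Flush, uninit/reinit ept, and invalidate the entry as before. */
        rc = 0;
    }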

Thanks -- see my response to 00/15 for the rest.

 -George


* Re: [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m
  2015-07-23 17:12 ` [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Wei Liu
@ 2015-07-23 19:11   ` George Dunlap
  0 siblings, 0 replies; 42+ messages in thread
From: George Dunlap @ 2015-07-23 19:11 UTC (permalink / raw)
  To: Wei Liu, Ed White
  Cc: Ravi Sahita, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, xen-devel, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

On 07/23/2015 06:12 PM, Wei Liu wrote:
> Hi all
> 
> As I understand it most pending issues of this series are minor and the
> final day for committing is tomorrow. Checking this series in as-is is
> going to create some technical debt that either maintainers or Intel
> developers need to pay back in the future (and Intel has signed up for
> that, thank you). The clock is ticking but there is still disagreement,
> so I would like to share some of my thoughts from a strategic point of
> view.
> 
> I think this feature is important and gives us strategic advantages. I'm
> not aware of other open source hypervisors that support such a feature. It
> is essential to VM introspection. It will help the Xen project catch up in
> the NFV market. The feature is interesting in its own right; I would imagine
> power users / developers can do many interesting things with it.
> 
> From my point of view, the benefits outweigh the technical debt.

I think this makes sense.

I've just looked through the series as a whole, and although there are
lots of things I would like to change, I'm convinced that as a whole it
will work as advertised; and furthermore, as Wei said, that it is
strategically important to get in.

So with that in mind:

mm-code:

Acked-by: George Dunlap <george.dunlap@eu.citrix.com>


* Re: [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops
  2015-07-23 16:56     ` Sahita, Ravi
@ 2015-07-24  7:49       ` Jan Beulich
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2015-07-24  7:49 UTC (permalink / raw)
  To: Ravi Sahita
  Cc: Tim Deegan, Wei Liu, George Dunlap, Andrew Cooper, Ian Jackson,
	Edmund H White, xen-devel, Jun Nakajima, tlengyel,
	Daniel De Graaf

>>> On 23.07.15 at 18:56, <ravi.sahita@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>>Sent: Thursday, July 23, 2015 9:09 AM
>>
>>>>> On 23.07.15 at 01:01, <edmund.h.white@intel.com> wrote:
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -6005,6 +6005,9 @@ static int hvmop_set_param(
>>>                  nestedhvm_vcpu_destroy(v);
>>>          break;
>>>      case HVM_PARAM_ALTP2M:
>>> +        rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
>>> +        if ( rc )
>>> +            break;
>>>          if ( a.value > 1 )
>>>              rc = -EINVAL;
>>>          if ( a.value &&
>>> @@ -6189,6 +6192,9 @@ static int do_altp2m_op(
>>>          goto out;
>>>      }
>>>
>>> +    if ( (rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d ? d : current->domain)) )
>>> +        goto out;
>>> +
>>
>>Shouldn't this have got updated after the changes to patch 11?
> 
> It was updated - a number of similar calls to check xsm permissions for this 
> op were collapsed due to the sub-op'ing.
> We don't differentiate across sub-ops for altp2m w.r.t. XSM permissions.

But I'm talking about the conditional expression still used there.

Jan


* Re: [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m
  2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
                   ` (15 preceding siblings ...)
  2015-07-23 17:12 ` [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Wei Liu
@ 2015-07-24  9:56 ` Wei Liu
  2015-07-24 16:06   ` Sahita, Ravi
  16 siblings, 1 reply; 42+ messages in thread
From: Wei Liu @ 2015-07-24  9:56 UTC (permalink / raw)
  To: Ed White
  Cc: Ravi Sahita, Wei Liu, Jun Nakajima, George Dunlap, Ian Jackson,
	Tim Deegan, xen-devel, Jan Beulich, Andrew Cooper, tlengyel,
	Daniel De Graaf

Based on the understanding that:

1. George and Jan are happy with this series merged as-is.
2. Jan will do minor adjustments to address some issues.
3. Ravi and Ed will send out follow-up patches post-4.6 to fix other
   issues.

Release-acked-by: Wei Liu <wei.liu2@citrix.com>

And thanks to both sides to get this work done in time.


* Re: [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m
  2015-07-24  9:56 ` Wei Liu
@ 2015-07-24 16:06   ` Sahita, Ravi
  0 siblings, 0 replies; 42+ messages in thread
From: Sahita, Ravi @ 2015-07-24 16:06 UTC (permalink / raw)
  To: Wei Liu, White, Edmund H
  Cc: Sahita, Ravi, Nakajima, Jun, George Dunlap, Andrew Cooper,
	Tim Deegan, xen-devel, Jan Beulich, tlengyel, Daniel De Graaf,
	Ian Jackson

>From: Wei Liu [mailto:wei.liu2@citrix.com]
>Sent: Friday, July 24, 2015 2:56 AM
>
>Based on the understanding that:
>
>1. George and Jan are happy with this series merged as-is.
>2. Jan will do minor adjustments to address some issues.
>3. Ravi and Ed will send out follow-up patches post-4.6 to fix other
>   issues.
>
>Release-acked-by: Wei Liu <wei.liu2@citrix.com>
>
>And thanks to both sides to get this work done in time.

Thanks Wei and all maintainers!

Ravi


Thread overview:
2015-07-22 23:01 [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Ed White
2015-07-22 23:01 ` [PATCH v7 01/15] common/domain: Helpers to pause a domain while in context Ed White
2015-07-22 23:01 ` [PATCH v7 02/15] VMX: VMFUNC and #VE definitions and detection Ed White
2015-07-22 23:01 ` [PATCH v7 03/15] VMX: implement suppress #VE Ed White
2015-07-22 23:01 ` [PATCH v7 04/15] x86/HVM: Hardware alternate p2m support detection Ed White
2015-07-22 23:01 ` [PATCH v7 05/15] x86/altp2m: basic data structures and support routines Ed White
2015-07-23  9:22   ` Jan Beulich
2015-07-23 14:36     ` Sahita, Ravi
2015-07-23 14:53       ` Jan Beulich
2015-07-23 15:00         ` Sahita, Ravi
2015-07-22 23:01 ` [PATCH v7 06/15] VMX/altp2m: add code to support EPTP switching and #VE Ed White
2015-07-23  9:43   ` Jan Beulich
2015-07-23 14:40     ` Sahita, Ravi
2015-07-23 15:00       ` Jan Beulich
2015-07-23 15:02         ` Sahita, Ravi
2015-07-22 23:01 ` [PATCH v7 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator Ed White
2015-07-22 23:01 ` [PATCH v7 08/15] x86/altp2m: add control of suppress_ve Ed White
2015-07-22 23:01 ` [PATCH v7 09/15] x86/altp2m: alternate p2m memory events Ed White
2015-07-22 23:01 ` [PATCH v7 10/15] x86/altp2m: add remaining support routines Ed White
2015-07-23 10:05   ` Jan Beulich
2015-07-23 14:51     ` Sahita, Ravi
2015-07-23 15:02       ` Jan Beulich
2015-07-23 16:08       ` George Dunlap
2015-07-23 16:15         ` Jan Beulich
2015-07-23 16:50           ` Sahita, Ravi
2015-07-23 19:10   ` George Dunlap
2015-07-22 23:01 ` [PATCH v7 11/15] x86/altp2m: define and implement alternate p2m HVMOP types Ed White
2015-07-23 10:22   ` Jan Beulich
2015-07-23 14:56     ` Sahita, Ravi
2015-07-23 15:08       ` Jan Beulich
2015-07-23 15:16         ` Sahita, Ravi
2015-07-22 23:01 ` [PATCH v7 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter Ed White
2015-07-22 23:01 ` [PATCH v7 13/15] x86/altp2m: XSM hooks for altp2m HVM ops Ed White
2015-07-23 16:08   ` Jan Beulich
2015-07-23 16:56     ` Sahita, Ravi
2015-07-24  7:49       ` Jan Beulich
2015-07-22 23:01 ` [PATCH v7 14/15] tools/libxc: add support to altp2m hvmops Ed White
2015-07-22 23:01 ` [PATCH v7 15/15] tools/xen-access: altp2m testcases Ed White
2015-07-23 17:12 ` [PATCH v7 00/15] Alternate p2m: support multiple copies of host p2m Wei Liu
2015-07-23 19:11   ` George Dunlap
2015-07-24  9:56 ` Wei Liu
2015-07-24 16:06   ` Sahita, Ravi
