* [PATCHv6 0/2] x86/ept: reduce translation invalidation impact
@ 2015-12-18 13:50 David Vrabel
From: David Vrabel @ 2015-12-18 13:50 UTC
  To: xen-devel
  Cc: Kevin Tian, Jun Nakajima, George Dunlap, Andrew Cooper,
	Tim Deegan, David Vrabel, Jan Beulich

This series improves EPT performance by further reducing the impact
of translation invalidations (ept_sync_domain()). It does so by:

a) Deferring invalidations until the p2m write lock is released.

Prior to this change a 16 VCPU guest could not be successfully
migrated on an (admittedly slow) 160 PCPU box because the p2m write
lock was held for such extended periods of time.  This starved the
read lock needed (by the toolstack) to map the domain's memory,
triggering the watchdog.

After this change a 64 VCPU guest could be successfully migrated.
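
As a rough illustration of the deferral idea in (a), here is a minimal
single-threaded sketch. It is not the code from the patches: struct
p2m_sketch, request_sync() and do_sync() are hypothetical names, the
lock is modelled as a plain depth counter, and do_sync() only counts
calls where the real code would run ept_sync_domain().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a p2m; all field names are illustrative. */
struct p2m_sketch {
    unsigned int lock_depth;  /* > 0 while the write lock is held */
    bool need_sync;           /* an invalidation was requested under the lock */
    unsigned int sync_count;  /* expensive syncs actually performed */
};

/* Stand-in for the expensive cross-CPU invalidation. */
static void do_sync(struct p2m_sketch *p2m)
{
    p2m->sync_count++;
}

static void p2m_write_lock(struct p2m_sketch *p2m)
{
    p2m->lock_depth++;
}

/* Request an invalidation: defer it while locked, else do it at once. */
static void request_sync(struct p2m_sketch *p2m)
{
    if ( p2m->lock_depth > 0 )
        p2m->need_sync = true;
    else
        do_sync(p2m);
}

/* Drop the lock first, then run any invalidation that was deferred. */
static void p2m_write_unlock(struct p2m_sketch *p2m)
{
    assert(p2m->lock_depth > 0);
    if ( --p2m->lock_depth == 0 && p2m->need_sync )
    {
        p2m->need_sync = false;
        do_sync(p2m);
    }
}
```

In this sketch, several request_sync() calls made under one lock hold
collapse into a single do_sync() after the final unlock, so the
expensive step never runs while the lock is held.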

ept_sync_domain() is very expensive because:

a) it uses on_selected_cpus() and the IPI cost can be particularly
   high for a multi-socket machine.

b) on_selected_cpus() is serialized by its own spin lock.

On this particular box, ept_sync_domain() could take ~3-5 ms.

Simply using a fair rw lock was not sufficient to resolve this (but it
was an improvement) as the cost of the ept_sync_domain() calls was
still delaying the read locks enough for the watchdog to trigger (the
toolstack maps a batch of 1024 GFNs at a time, which means trying to
acquire the p2m read lock 1024 times).
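
To put rough numbers on this, assume (purely for illustration) that
every one of the 1024 read-lock acquisitions has to wait behind one
full ept_sync_domain() call. The helper below is hypothetical and just
does the multiplication.

```c
/*
 * Worst-case wait in milliseconds for one batch, assuming every
 * read-lock acquisition waits behind exactly one sync of sync_ms.
 */
static unsigned int batch_wait_ms(unsigned int gfns, unsigned int sync_ms)
{
    return gfns * sync_ms;
}
```

With the ~3-5 ms figure above, batch_wait_ms(1024, 3) is 3072 and
batch_wait_ms(1024, 5) is 5120, i.e. roughly 3 to 5 seconds of waiting
per batch under that assumption.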

Changes in v6:

- Fix performance bug in patch #2.
- Improve comments.

Changes in v5:

- Fix PoD by explicitly doing an invalidation before reclaiming zero
  pages.
- Use the same mechanism for dealing with freeing page table pages.
  This isn't a common path and it's simpler than the deferred list.

Changes in v4:

- __ept_sync_domain() is a no-op -- invalidates are done before VMENTER.
- initialize ept->invalidate to all ones so the initial invalidate is
  always done.
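
A minimal sketch of the invalidate-before-VMENTER idea, under stated
assumptions: the mask is a plain 64-bit word rather than a real CPU
mask, struct ept_sketch and every function name here are hypothetical,
and the flush is modelled as a counter instead of an INVEPT.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 64  /* illustrative limit: one bit per CPU */

struct ept_sketch {
    uint64_t invalidate;   /* bit n set: CPU n must flush before VMENTER */
    unsigned int flushes;  /* simulated flushes performed */
};

/* Start with all ones so the first VMENTER on each CPU always flushes. */
static void ept_init(struct ept_sketch *ept)
{
    ept->invalidate = ~UINT64_C(0);
    ept->flushes = 0;
}

/* Mappings changed: mark every CPU as stale, no flush happens here. */
static void ept_mark_stale(struct ept_sketch *ept)
{
    ept->invalidate = ~UINT64_C(0);
}

/* Just before entering the guest on 'cpu': flush only if marked stale. */
static void vmenter(struct ept_sketch *ept, unsigned int cpu)
{
    uint64_t bit;

    assert(cpu < NR_CPUS_SKETCH);
    bit = UINT64_C(1) << cpu;
    if ( ept->invalidate & bit )
    {
        ept->invalidate &= ~bit;
        ept->flushes++;    /* stand-in for the real invalidation */
    }
}
```

The sketch does not model a CPU that is already running the guest when
the mappings change; such a CPU would still have to be forced through
a VMENTER before it picks up the flush.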

Changes in v3:

- Drop already applied "x86/ept: remove unnecessary sync after
  resolving misconfigured entries".
- Replaced "mm: don't free pages until mm locks are released" with
  "x86/ept: invalidate guest physical mappings on VMENTER".

Changes in v2:

- Use a per-p2m (not per-CPU) list for page table pages to be freed.
- Hold the write lock while updating the synced_mask.

David


Thread overview: 13+ messages
2015-12-18 13:50 [PATCHv6 0/2] x86/ept: reduce translation invalidation impact David Vrabel
2015-12-18 13:50 ` [PATCHv6 1/2] x86/ept: invalidate guest physical mappings on VMENTER David Vrabel
2015-12-18 14:59   ` George Dunlap
2015-12-20  6:51   ` Tian, Kevin
2015-12-18 13:50 ` [PATCHv6 2/2] x86/ept: defer the invalidation until the p2m lock is released David Vrabel
2015-12-20  6:56   ` Tian, Kevin
2016-02-01 14:50     ` David Vrabel
2016-02-02  7:58       ` Tian, Kevin
2015-12-22 12:23   ` George Dunlap
2015-12-22 14:01     ` Andrew Cooper
2015-12-22 14:20       ` David Vrabel
2015-12-22 14:56         ` George Dunlap
2016-02-01 15:57     ` David Vrabel