* [PATCH 0 of 3] RFC: x86 memory sharing performance improvements
@ 2012-04-12 14:16 Andres Lagar-Cavilla
  2012-04-12 14:16 ` [PATCH 1 of 3] x86/mm/sharing: Clean ups for relinquishing shared pages on destroy Andres Lagar-Cavilla
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Andres Lagar-Cavilla @ 2012-04-12 14:16 UTC (permalink / raw)
  To: xen-devel; +Cc: andres, keir.xen, tim, adin

This is an RFC series. I haven't fully tested it yet, but I want the concept to
be known as I intend this to be merged prior to the closing of the 4.2 window.

The sharing subsystem does not scale elegantly with high degrees of page
sharing. The culprit is a reverse map that each shared frame maintains,
resolving to all domain pages pointing to the shared frame. Because the rmap is
implemented as a linked list with O(n) search, CoW unsharing can result in
prolonged search times.

The place where this becomes most obvious is during domain destruction, during
which all shared p2m entries need to be unshared. Destroying a domain with a
lot of sharing could result in minutes of hypervisor freeze-up!

Solutions proposed:
- Make the p2m clean up of shared entries part of the preemptible, synchronous,
domain_kill domctl (as opposed to executing monolithically in the finalize
destruction RCU callback)
- When a shared frame exceeds an arbitrary ref count, mutate the rmap from a
linked list to a hash table.
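
The first proposal above can be illustrated with a minimal sketch of a preemptible teardown loop. This is not the code from the series: `teardown_state`, `unshare_one`, and `relinquish_shared_pages` are hypothetical names, and the per-call `budget` is an assumption standing in for whatever preemption check the hypervisor actually performs. The point is only that progress is saved between calls, so the caller can re-invoke the operation instead of running the whole clean up in one go.

```c
#include <stddef.h>

/* Hypothetical progress record kept across re-invocations. */
struct teardown_state {
    unsigned long next_gfn;   /* first entry not yet processed */
    unsigned long end_gfn;    /* one past the last entry */
    unsigned long unshared;   /* entries handled so far */
};

/* Stand-in for the real per-entry unshare work (assumption). */
static void unshare_one(struct teardown_state *s, unsigned long gfn)
{
    (void)gfn;
    s->unshared++;
}

/*
 * Process at most `budget` entries per call.
 * Returns 0 when every entry has been handled, 1 when the caller
 * should invoke the function again to continue from next_gfn.
 * A budget of 0 never makes progress, so callers must pass budget > 0.
 */
int relinquish_shared_pages(struct teardown_state *s, unsigned long budget)
{
    unsigned long done = 0;

    while (s->next_gfn < s->end_gfn) {
        if (done == budget)
            return 1;             /* yield; state is already saved in *s */
        unshare_one(s, s->next_gfn);
        s->next_gfn++;
        done++;
    }
    return 0;
}
```

A caller would loop on `relinquish_shared_pages(&s, budget)` until it returns 0, leaving room between calls for other work to be scheduled.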

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

 xen/arch/x86/domain.c             |   16 +++-
 xen/arch/x86/mm/mem_sharing.c     |   45 ++++++++++
 xen/arch/x86/mm/p2m.c             |    4 +
 xen/include/asm-arm/mm.h          |    4 +
 xen/include/asm-x86/domain.h      |    1 +
 xen/include/asm-x86/mem_sharing.h |   10 ++
 xen/include/asm-x86/p2m.h         |    4 +
 xen/arch/x86/mm/mem_sharing.c     |  142 +++++++++++++++++++++++--------
 xen/arch/x86/mm/mem_sharing.c     |  170 +++++++++++++++++++++++++++++++++++--
 xen/include/asm-x86/mem_sharing.h |   13 ++-
 10 files changed, 354 insertions(+), 55 deletions(-)

* [PATCH 0 of 3] x86/mem_sharing: Improve performance of rmap, fix cascading bugs
@ 2012-04-24 19:48 Andres Lagar-Cavilla
  2012-04-24 19:48 ` [PATCH 2 of 3] x86/mem_sharing: modularize reverse map for shared frames Andres Lagar-Cavilla
  0 siblings, 1 reply; 12+ messages in thread
From: Andres Lagar-Cavilla @ 2012-04-24 19:48 UTC (permalink / raw)
  To: xen-devel; +Cc: andres, tim

This is a repost of patches 2 and 3 of the series initially posted on Apr 12th.

The first patch has been split into two functionally isolated patches as per
Tim Deegan's request.

The original posting (suitably edited) follows
--------------------------------
The sharing subsystem does not scale elegantly with high degrees of page
sharing. The culprit is a reverse map that each shared frame maintains,
resolving to all domain pages pointing to the shared frame. Because the rmap is
implemented as a linked list with O(n) search, CoW unsharing can result in
prolonged search times.

The place where this becomes most obvious is during domain destruction, during
which all shared p2m entries need to be unshared. Destroying a domain with a
lot of sharing could result in minutes of hypervisor freeze-up: 7 minutes for a
2 GiB domain! As a result, errors cascade throughout the system, including soft
lockups, watchdogs firing, IO drivers timing out, etc.

The proposed solution is to mutate the rmap from a linked list to a hash table
when the number of domain pages referencing the shared frame exceeds a
threshold. This maintains minimal space use for the common case of relatively
low sharing, and switches to an O(1) data structure for heavily shared pages,
with a space overhead of one page. The threshold chosen is 256, as a single
page can fit 256 spill lists for 256 buckets in a hash table.
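
The list-to-hash mutation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patches: `struct rmap`, `rmap_add`, `rmap_del`, and the trivial hash function are all hypothetical, and the sketch uses singly linked chains allocated with `malloc`. Only the threshold of 256 and the bucket count of 256 are taken from the text. The figure of 256 buckets per page presumably follows from a two-pointer list head occupying 16 bytes on a 64-bit build (4096 / 16 = 256).

```c
#include <stddef.h>
#include <stdlib.h>

#define RMAP_THRESHOLD 256u   /* threshold quoted in the posting */
#define RMAP_BUCKETS   256u   /* bucket count quoted in the posting */

/* One (domain, gfn) reference to a shared frame. Hypothetical layout. */
struct rmap_entry {
    struct rmap_entry *next;
    unsigned int domain;
    unsigned long gfn;
};

struct rmap {
    unsigned long count;          /* number of references */
    struct rmap_entry *list;      /* used while buckets == NULL */
    struct rmap_entry **buckets;  /* RMAP_BUCKETS chains once mutated */
};

/* Placeholder hash; a real implementation would pick something better. */
static unsigned int rmap_hash(unsigned int domain, unsigned long gfn)
{
    return (unsigned int)((gfn + domain) % RMAP_BUCKETS);
}

/* Chain that holds (or would hold) the given reference. */
static struct rmap_entry **rmap_head(struct rmap *r, unsigned int domain,
                                     unsigned long gfn)
{
    return r->buckets ? &r->buckets[rmap_hash(domain, gfn)] : &r->list;
}

/* Move every list entry into its bucket. Returns 0, or -1 if out of memory. */
static int rmap_to_hash(struct rmap *r)
{
    struct rmap_entry **b = calloc(RMAP_BUCKETS, sizeof(*b));

    if (!b)
        return -1;
    while (r->list) {
        struct rmap_entry *e = r->list;
        unsigned int h = rmap_hash(e->domain, e->gfn);

        r->list = e->next;
        e->next = b[h];
        b[h] = e;
    }
    r->buckets = b;
    return 0;
}

/* Add a reference; mutate to a hash table once count would exceed 256. */
int rmap_add(struct rmap *r, unsigned int domain, unsigned long gfn)
{
    struct rmap_entry *e = malloc(sizeof(*e));
    struct rmap_entry **head;

    if (!e)
        return -1;
    /* If the table cannot be allocated, stay on the list: slow but correct. */
    if (!r->buckets && r->count >= RMAP_THRESHOLD)
        (void)rmap_to_hash(r);
    e->domain = domain;
    e->gfn = gfn;
    head = rmap_head(r, domain, gfn);
    e->next = *head;
    *head = e;
    r->count++;
    return 0;
}

/* Remove a reference. Returns 1 if it was found, 0 otherwise. */
int rmap_del(struct rmap *r, unsigned int domain, unsigned long gfn)
{
    struct rmap_entry **pp = rmap_head(r, domain, gfn);

    for (; *pp; pp = &(*pp)->next) {
        if ((*pp)->domain == domain && (*pp)->gfn == gfn) {
            struct rmap_entry *e = *pp;

            *pp = e->next;
            free(e);
            r->count--;
            return 1;
        }
    }
    return 0;
}
```

With a list, each `rmap_del` walks up to `count` entries; after the mutation it walks only one chain, which is what makes lookups on a heavily shared frame roughly constant time when the hash spreads references evenly.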

With these patches in place, destruction of a 2 GiB domain that has a shared
frame with over a hundred thousand references drops from 7 minutes to two
seconds.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>

 xen/arch/x86/mm/mem_sharing.c     |    8 +-
 xen/arch/x86/mm/mem_sharing.c     |  138 ++++++++++++++++++++++--------
 xen/arch/x86/mm/mem_sharing.c     |  170 +++++++++++++++++++++++++++++++++++--
 xen/include/asm-x86/mem_sharing.h |   13 ++-
 4 files changed, 274 insertions(+), 55 deletions(-)


Thread overview: 12+ messages
2012-04-12 14:16 [PATCH 0 of 3] RFC: x86 memory sharing performance improvements Andres Lagar-Cavilla
2012-04-12 14:16 ` [PATCH 1 of 3] x86/mm/sharing: Clean ups for relinquishing shared pages on destroy Andres Lagar-Cavilla
2012-04-18 12:42   ` Tim Deegan
2012-04-18 13:06     ` Andres Lagar-Cavilla
2012-04-12 14:16 ` [PATCH 2 of 3] x86/mem_sharing: modularize reverse map for shared frames Andres Lagar-Cavilla
2012-04-18 14:05   ` Tim Deegan
2012-04-18 14:19     ` Andres Lagar-Cavilla
2012-04-12 14:16 ` [PATCH 3 of 3] x86/mem_sharing: For shared pages with many references, use a hash table instead of a list Andres Lagar-Cavilla
2012-04-18 15:35   ` Tim Deegan
2012-04-18 16:18     ` Andres Lagar-Cavilla
2012-04-24 19:33       ` Andres Lagar-Cavilla
2012-04-24 19:48 [PATCH 0 of 3] x86/mem_sharing: Improve performance of rmap, fix cascading bugs Andres Lagar-Cavilla
2012-04-24 19:48 ` [PATCH 2 of 3] x86/mem_sharing: modularize reverse map for shared frames Andres Lagar-Cavilla
