* RFC [PATCH 0/1] ksm: introduce ksm_max_page_sharing per page deduplication limit
@ 2015-11-10 18:44 Andrea Arcangeli
  2015-11-10 18:44 ` [PATCH 1/1] " Andrea Arcangeli
From: Andrea Arcangeli @ 2015-11-10 18:44 UTC
  To: Hugh Dickins, Davidlohr Bueso
  Cc: linux-mm, Petr Holasek, Andrew Morton, Arjan van de Ven

Hello,

This patch fixes the computational complexity of the KSM rmap walk,
which caused stalls in enterprise use on large systems with lots of
RAM and CPUs. It is incremental to the patches posted earlier and
should apply cleanly to upstream.

The KSM page migration code with merge_across_nodes == 0 deserves
special review. I tested merge_across_nodes == 0, but not very hard.
The old code in this area in fact looked flaky when two KSM pages of
equal content ended up being migrated to the same node. With the new
code they instead end up in the same "chain" in the right stable_tree,
or the page under migration is merged into an existing KSM page on the
right node if page_mapcount allows it.

At the moment the chains aren't defragmented, but they could be. It
would be enough to refile a couple of rmap_items at each prune of the
stable_node_chain, moving them from the "dup" with the lowest
rmap_hlist_len to the dup with the highest (not yet full)
rmap_hlist_len. The rmap_hlist_len of a "dup" that had all of its
rmap_items removed would drop to zero, the dup would be garbage
collected at the next prune pass, and the page_sharing/page_shared
ratio would rise to the peak possible for the current
max_page_sharing. However, the "chain" prune logic already tries to
compact the "dups" into as few stable nodes as possible. Overall, if
any defragmentation logic for the stable_node "chains" really turns
out to be a good idea, it's better done in an incremental patch, as
it's an orthogonal problem.
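
To make the refiling idea above concrete, here is a toy userspace model
in Python. It is not kernel code and not what the patch does: a chain is
reduced to a plain list of rmap_hlist_len counters, and refile_step(),
prune() and the budget parameter are names made up for this sketch.

```python
def refile_step(chain, max_page_sharing, budget=2):
    """Move up to `budget` rmap_items from the emptiest dup to the
    fullest dup that is not yet full. `chain` is a list of
    rmap_hlist_len counters (hypothetical model) and is updated in
    place. Returns the number of items moved."""
    moved = 0
    while moved < budget:
        live = [i for i, n in enumerate(chain) if n > 0]
        if len(live) < 2:
            break  # nothing to compact
        src = min(live, key=lambda i: chain[i])
        targets = [i for i in live
                   if i != src and chain[i] < max_page_sharing]
        if not targets:
            break  # every other dup is already full
        dst = max(targets, key=lambda i: chain[i])
        chain[src] -= 1
        chain[dst] += 1
        moved += 1
    return moved


def prune(chain):
    """Model of the prune pass: drop dups whose counter reached zero."""
    return [n for n in chain if n > 0]


chain = [1, 3, 2]
refile_step(chain, max_page_sharing=4)
chain = prune(chain)   # the emptied dup is garbage collected
```

In this model the total number of rmap_items never changes; only the
number of dups shrinks, which is what raises the sharing ratio.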

Changing max_page_sharing without first doing "echo 2
>/sys/kernel/mm/ksm/run" (which gets rid of the entire stable rbtree)
would also be possible, but I didn't think it was worth it, as certain
asserts could no longer be enforced.
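
The resulting procedure for changing the limit can be sketched as a
small Python helper. This is only an illustration: it assumes the
tunable is exposed as a max_page_sharing file next to run in
/sys/kernel/mm/ksm, and set_max_page_sharing() and write_knob() are
hypothetical names. Writing 2 to run stops ksmd and unmerges all pages;
writing 1 starts it again.

```python
import os

KSM_DIR = "/sys/kernel/mm/ksm"


def write_knob(ksm_dir, name, value):
    """Write one value to a KSM sysfs file (needs root on a real system)."""
    with open(os.path.join(ksm_dir, name), "w") as f:
        f.write(f"{value}\n")


def set_max_page_sharing(limit, ksm_dir=KSM_DIR, restart=True):
    """Unmerge everything, set the new limit, optionally restart ksmd.

    Assumption: the limit lives in a file named max_page_sharing.
    """
    write_knob(ksm_dir, "run", 2)             # drop the stable tree first
    write_knob(ksm_dir, "max_page_sharing", limit)
    if restart:
        write_knob(ksm_dir, "run", 1)         # let ksmd merge again
```

The ksm_dir parameter exists only so the helper can be pointed at a
scratch directory when trying it out without touching the real sysfs.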

The code has perhaps too many asserts, mostly VM_BUG_ON though (the
few BUG_ONs are there because in those few cases, if the assert
triggers and ksmd doesn't stop there, it would corrupt memory
randomly). I can remove the VM_BUG_ONs once this gets more testing and
gets out of RFC. Also note, there's no VM_WARN_ON available.

Some test suites that assume knowledge of KSM internals and predict
exact page_sharing/page_shared values may report false positives with
this patch applied, but setting max_page_sharing to a very large value
is enough to pass the old tests. Ideally those test suites should
learn about the max_page_sharing limit and predict the new
page_sharing/page_shared results in order to validate the new code.
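
As a rough illustration of what such a prediction could look like, the
Python sketch below computes the best-case counters for a set of
identical pages. It is an idealised model, not taken from the patch: it
assumes perfectly compacted chains, that every KSM page is mapped by at
most max_page_sharing virtual pages, and that a single leftover page
stays unmerged. The counters called page_sharing/page_shared above are
exposed in sysfs as pages_sharing and pages_shared;
predict_ksm_counters() is a hypothetical name.

```python
def predict_ksm_counters(identical_pages, max_page_sharing):
    """Return (pages_shared, pages_sharing) for `identical_pages`
    virtual pages of equal content, in the best (fully compacted) case.

    Model assumptions: each KSM page holds at most `max_page_sharing`
    mappings, and a lone leftover page cannot form a KSM page.
    """
    if max_page_sharing < 2:
        raise ValueError("a shared page needs at least two mappings")
    full, rest = divmod(identical_pages, max_page_sharing)
    partial = 1 if rest >= 2 else 0       # one extra, partly filled dup
    shared = full + partial               # number of KSM pages
    merged = full * max_page_sharing + (rest if partial else 0)
    return shared, merged - shared        # sharing = extra mappings


# With a very large limit the old expectation is recovered:
# one KSM page, and every other mapping counted as sharing.
old_style = predict_ksm_counters(1000, 10**9)
limited = predict_ksm_counters(1000, 256)
```

Real counters can be lower than this prediction whenever the chains are
not fully compacted, so a test would more safely treat it as an upper
bound on the sharing ratio than as an exact value.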

Comments welcome, thanks,
Andrea

Andrea Arcangeli (1):
  ksm: introduce ksm_max_page_sharing per page deduplication limit

 Documentation/vm/ksm.txt |  63 ++++
 mm/ksm.c                 | 731 ++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 728 insertions(+), 66 deletions(-)


Thread overview: 29+ messages
2015-11-10 18:44 RFC [PATCH 0/1] ksm: introduce ksm_max_page_sharing per page deduplication limit Andrea Arcangeli
2015-11-10 18:44 ` [PATCH 1/1] " Andrea Arcangeli
2015-12-09 16:19   ` Petr Holasek
2015-12-09 17:15     ` Andrea Arcangeli
2015-12-09 18:10       ` Andrea Arcangeli
2015-12-10 16:06         ` Petr Holasek
2015-12-11  0:31   ` Andrew Morton
2016-01-14 23:36   ` Hugh Dickins
2016-01-16 17:49     ` Andrea Arcangeli
2016-01-16 18:00       ` Arjan van de Ven
2016-01-18  8:14         ` Hugh Dickins
2016-01-18 14:43           ` Arjan van de Ven
2016-01-18  9:10       ` Hugh Dickins
2016-01-18  9:45         ` Hugh Dickins
2016-01-18 17:46         ` Andrea Arcangeli
2016-03-17 21:34         ` Hugh Dickins
2016-03-17 21:50           ` Andrew Morton
2016-03-18 16:27           ` Andrea Arcangeli
2016-01-18 11:01     ` Mel Gorman
2016-01-18 22:19       ` Andrea Arcangeli
2016-01-19 10:43         ` Mel Gorman
2016-04-06 20:33   ` Rik van Riel
2016-04-06 22:02     ` Andrea Arcangeli
2016-09-21 15:12       ` Gavin Guo
2016-09-21 15:34         ` Andrea Arcangeli
2016-09-22 10:48           ` Gavin Guo
2016-10-28  6:26             ` Gavin Guo
2016-10-28 18:31               ` Andrea Arcangeli
2017-04-20  3:14                 ` Gavin Guo
