linux-mm.kvack.org archive mirror
* [PATCH 0/3] KSMscale cleanup/optimizations
@ 2017-05-18 17:37 Andrea Arcangeli
  2017-05-18 17:37 ` [PATCH 1/3] ksm: cleanup stable_node chain collapse case Andrea Arcangeli
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Andrea Arcangeli @ 2017-05-18 17:37 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Evgheni Dereveanchin, Andrey Ryabinin, Petr Holasek,
	Hugh Dickins, Arjan van de Ven, Davidlohr Bueso, Gavin Guo,
	Jay Vosburgh, Mel Gorman, Dan Carpenter

Hello,

This is incremental with the two fixes already in -mm.

There are no fixes here, just minor cleanups and optimizations.

1/3 makes the "fix" for the stale stable_node fall into the standard
case without introducing new cases. Setting stable_node to NULL was
marginally safer, but the stale pointer is wiped from the caller
either way, so this looks cleaner.

2/3 should fix the false positive from Dan's static checker. Dan,
could you check if it still complains?

3/3 is a micro-optimization that applies the refile of future
merge-candidate dups at the head of the chain in all cases, and skips
it in the one case where we previously did it but it was a noop (we
used to avoid checking whether the dup was already at the head, but
now we have to check anyway, so the redundant refile got optimized
away).

Andrea Arcangeli (3):
  ksm: cleanup stable_node chain collapse case
  ksm: swap the two output parameters of chain/chain_prune
  ksm: optimize refile of stable_node_dup at the head of the chain

 mm/ksm.c | 163 ++++++++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 103 insertions(+), 60 deletions(-)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] ksm: cleanup stable_node chain collapse case
  2017-05-18 17:37 [PATCH 0/3] KSMscale cleanup/optimizations Andrea Arcangeli
@ 2017-05-18 17:37 ` Andrea Arcangeli
  2017-05-18 17:37 ` [PATCH 2/3] ksm: swap the two output parameters of chain/chain_prune Andrea Arcangeli
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Andrea Arcangeli @ 2017-05-18 17:37 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Evgheni Dereveanchin, Andrey Ryabinin, Petr Holasek,
	Hugh Dickins, Arjan van de Ven, Davidlohr Bueso, Gavin Guo,
	Jay Vosburgh, Mel Gorman, Dan Carpenter

When the stable_node chain is collapsed, we might as well set the
caller's stable_node to match the stable_node_dup returned by
chain_prune().

This way the collapse case becomes indistinguishable from the regular
stable_node case and we can remove two branches from the KSM page
migration handling slow paths.

While it was all correct before, this looks cleaner (and faster), as
the caller has to deal with fewer special cases.
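
To make the new invariant concrete, here is a hedged, compilable
userspace sketch (plain C, not the kernel code; chain_collapse_mock()
is an invented stand-in for the collapse path in stable_node_dup()):

#include <assert.h>

struct node { int dummy; };

/*
 * On collapse, overwrite the caller's pointer with the surviving dup
 * instead of NULLing it, so the caller can no longer distinguish a
 * collapsed chain from a node that never was a chain.
 */
static void chain_collapse_mock(struct node **caller_node,
				struct node *found)
{
	*caller_node = found;
}

int main(void)
{
	struct node chain, dup;
	struct node *stable_node = &chain, *stable_node_dup = &dup;

	chain_collapse_mock(&stable_node, stable_node_dup);
	/* the single check that replaces the two removed branches */
	assert(stable_node == stable_node_dup);
	return 0;
}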

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 mm/ksm.c | 50 ++++++++++++++++++++++++++++----------------------
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index fc0c73bd6cb3..959036064bb7 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1394,14 +1394,18 @@ static struct stable_node *stable_node_dup(struct stable_node **_stable_node,
 			ksm_stable_node_chains--;
 			ksm_stable_node_dups--;
 			/*
-			 * NOTE: the caller depends on the
-			 * *_stable_node to become NULL if the chain
-			 * was collapsed. Enforce that if anything
-			 * uses a stale (freed) stable_node chain a
-			 * visible crash will materialize (instead of
-			 * an use after free).
+			 * NOTE: the caller depends on the stable_node
+			 * to be equal to stable_node_dup if the chain
+			 * was collapsed.
 			 */
-			*_stable_node = stable_node = NULL;
+			*_stable_node = found;
+			/*
+			 * Just for robustness, as stable_node is
+			 * otherwise left as a stale pointer, the
+			 * compiler shall optimize it away at build
+			 * time.
+			 */
+			stable_node = NULL;
 		} else if (__is_page_sharing_candidate(found, 1)) {
 			/*
 			 * Refile our candidate at the head
@@ -1507,7 +1511,11 @@ static struct page *stable_tree_search(struct page *page)
 		 * not NULL. stable_node_dup may have been inserted in
 		 * the rbtree instead as a regular stable_node (in
 		 * order to collapse the stable_node chain if a single
-		 * stable_node dup was found in it).
+		 * stable_node dup was found in it). In such a case
+		 * the stable_node is overwritten by the callee to
+		 * point to the stable_node_dup that was collapsed in
+		 * the stable rbtree, and stable_node will be equal to
+		 * stable_node_dup as if the chain never existed.
 		 */
 		if (!stable_node_dup) {
 			/*
@@ -1625,15 +1633,13 @@ static struct page *stable_tree_search(struct page *page)
 replace:
 	/*
 	 * If stable_node was a chain and chain_prune collapsed it,
-	 * stable_node will be NULL here. In that case the
-	 * stable_node_dup is the regular stable_node that has
-	 * replaced the chain. If stable_node is not NULL and equal to
-	 * stable_node_dup there was no chain and stable_node_dup is
-	 * the regular stable_node in the stable rbtree. Otherwise
-	 * stable_node is the chain and stable_node_dup is the dup to
-	 * replace.
+	 * stable_node has been updated to be the new regular
+	 * stable_node. A collapse of the chain is indistinguishable
+	 * from the case where there was no chain in the stable
+	 * rbtree. Otherwise stable_node is the chain and
+	 * stable_node_dup is the dup to replace.
 	 */
-	if (!stable_node || stable_node_dup == stable_node) {
+	if (stable_node_dup == stable_node) {
 		VM_BUG_ON(is_stable_node_chain(stable_node_dup));
 		VM_BUG_ON(is_stable_node_dup(stable_node_dup));
 		/* there is no chain */
@@ -1678,13 +1684,13 @@ static struct page *stable_tree_search(struct page *page)
 		stable_node_dup = stable_node_any;
 	/*
 	 * If stable_node was a chain and chain_prune collapsed it,
-	 * stable_node will be NULL here. In that case the
-	 * stable_node_dup is the regular stable_node that has
-	 * replaced the chain. If stable_node is not NULL and equal to
-	 * stable_node_dup there was no chain and stable_node_dup is
-	 * the regular stable_node in the stable rbtree.
+	 * stable_node has been updated to be the new regular
+	 * stable_node. A collapse of the chain is indistinguishable
+	 * from the case where there was no chain in the stable
+	 * rbtree. Otherwise stable_node is the chain and
+	 * stable_node_dup is the dup to replace.
 	 */
-	if (!stable_node || stable_node_dup == stable_node) {
+	if (stable_node_dup == stable_node) {
 		VM_BUG_ON(is_stable_node_chain(stable_node_dup));
 		VM_BUG_ON(is_stable_node_dup(stable_node_dup));
 		/* chain is missing so create it */

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] ksm: swap the two output parameters of chain/chain_prune
  2017-05-18 17:37 [PATCH 0/3] KSMscale cleanup/optimizations Andrea Arcangeli
  2017-05-18 17:37 ` [PATCH 1/3] ksm: cleanup stable_node chain collapse case Andrea Arcangeli
@ 2017-05-18 17:37 ` Andrea Arcangeli
  2017-05-18 17:37 ` [PATCH 3/3] ksm: optimize refile of stable_node_dup at the head of the chain Andrea Arcangeli
  2017-05-18 20:35 ` [PATCH 0/3] KSMscale cleanup/optimizations Dan Carpenter
  3 siblings, 0 replies; 5+ messages in thread
From: Andrea Arcangeli @ 2017-05-18 17:37 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Evgheni Dereveanchin, Andrey Ryabinin, Petr Holasek,
	Hugh Dickins, Arjan van de Ven, Davidlohr Bueso, Gavin Guo,
	Jay Vosburgh, Mel Gorman, Dan Carpenter

Dan's static checker complains if chain/chain_prune returns a
potentially stale pointer.

chain/chain_prune have two output parameters: one is tree_page, the
other is stable_node_dup. As with get_ksm_page, the caller has to
check whether tree_page is NULL before touching the stable_node.
Similarly, with chain/chain_prune the caller has to check tree_page
before touching either the returned stable_node_dup or the original
stable_node passed as a parameter.

Because the tree_page is never returned as a stale pointer, it may be
more intuitive to return tree_page and to pass stable_node_dup by
reference instead of the reverse.

This patch purely swaps the two output parameters of chain/chain_prune
as a cleanup for the static checker and to mimic the get_ksm_page
behavior more closely. There's no change to the callers at all except
the swap itself: it's purely a cleanup and a noop from the caller's
point of view.
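
As a hedged illustration of the calling convention after the swap
(userspace mock, not the kernel code; mock_chain_prune() is invented
for the example and mirrors get_ksm_page(): the page is the return
value and the only thing that is safe to test):

#include <stdio.h>
#include <stddef.h>

struct page { int id; };
struct stable_node { struct page *page; };

static struct page *mock_chain_prune(struct stable_node **dup,
				     struct stable_node **node)
{
	/*
	 * The real code may free and overwrite *node; this mock just
	 * fills the second output parameter and returns the page.
	 */
	*dup = *node;
	return (*node)->page;
}

int main(void)
{
	struct page p = { .id = 1 };
	struct stable_node n = { .page = &p };
	struct stable_node *node = &n, *dup;
	struct page *tree_page;

	tree_page = mock_chain_prune(&dup, &node);
	if (!tree_page)			/* only this check is safe... */
		return 1;
	printf("%d\n", dup->page->id);	/* ...before touching dup/node */
	return 0;
}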

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 mm/ksm.c | 78 ++++++++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 52 insertions(+), 26 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 959036064bb7..7b2e26f9cf41 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1315,14 +1315,14 @@ bool is_page_sharing_candidate(struct stable_node *stable_node)
 	return __is_page_sharing_candidate(stable_node, 0);
 }
 
-static struct stable_node *stable_node_dup(struct stable_node **_stable_node,
-					   struct page **tree_page,
-					   struct rb_root *root,
-					   bool prune_stale_stable_nodes)
+static struct page *stable_node_dup(struct stable_node **_stable_node_dup,
+				    struct stable_node **_stable_node,
+				    struct rb_root *root,
+				    bool prune_stale_stable_nodes)
 {
 	struct stable_node *dup, *found = NULL, *stable_node = *_stable_node;
 	struct hlist_node *hlist_safe;
-	struct page *_tree_page;
+	struct page *_tree_page, *tree_page = NULL;
 	int nr = 0;
 	int found_rmap_hlist_len;
 
@@ -1355,14 +1355,14 @@ static struct stable_node *stable_node_dup(struct stable_node **_stable_node,
 			if (!found ||
 			    dup->rmap_hlist_len > found_rmap_hlist_len) {
 				if (found)
-					put_page(*tree_page);
+					put_page(tree_page);
 				found = dup;
 				found_rmap_hlist_len = found->rmap_hlist_len;
-				*tree_page = _tree_page;
+				tree_page = _tree_page;
 
+				/* skip put_page for found dup */
 				if (!prune_stale_stable_nodes)
 					break;
-				/* skip put_page */
 				continue;
 			}
 		}
@@ -1419,7 +1419,8 @@ static struct stable_node *stable_node_dup(struct stable_node **_stable_node,
 		}
 	}
 
-	return found;
+	*_stable_node_dup = found;
+	return tree_page;
 }
 
 static struct stable_node *stable_node_dup_any(struct stable_node *stable_node,
@@ -1435,35 +1436,60 @@ static struct stable_node *stable_node_dup_any(struct stable_node *stable_node,
 			   typeof(*stable_node), hlist_dup);
 }
 
-static struct stable_node *__stable_node_chain(struct stable_node **_stable_node,
-					       struct page **tree_page,
-					       struct rb_root *root,
-					       bool prune_stale_stable_nodes)
+/*
+ * Like for get_ksm_page, this function can free the *_stable_node and
+ * *_stable_node_dup if the returned tree_page is NULL.
+ *
+ * It can also free and overwrite *_stable_node with the found
+ * stable_node_dup if the chain is collapsed (in which case
+ * *_stable_node will be equal to *_stable_node_dup as if the chain
+ * never existed). It's up to the caller to verify tree_page is not
+ * NULL before dereferencing *_stable_node or *_stable_node_dup.
+ *
+ * *_stable_node_dup is really a second output parameter of this
+ * function and will be overwritten in all cases, the caller doesn't
+ * need to initialize it.
+ */
+static struct page *__stable_node_chain(struct stable_node **_stable_node_dup,
+					struct stable_node **_stable_node,
+					struct rb_root *root,
+					bool prune_stale_stable_nodes)
 {
 	struct stable_node *stable_node = *_stable_node;
 	if (!is_stable_node_chain(stable_node)) {
 		if (is_page_sharing_candidate(stable_node)) {
-			*tree_page = get_ksm_page(stable_node, false);
-			return stable_node;
+			*_stable_node_dup = stable_node;
+			return get_ksm_page(stable_node, false);
 		}
+		/*
+		 * _stable_node_dup set to NULL means the stable_node
+		 * reached the ksm_max_page_sharing limit.
+		 */
+		*_stable_node_dup = NULL;
 		return NULL;
 	}
-	return stable_node_dup(_stable_node, tree_page, root,
+	return stable_node_dup(_stable_node_dup, _stable_node, root,
 			       prune_stale_stable_nodes);
 }
 
-static __always_inline struct stable_node *chain_prune(struct stable_node **s_n,
-						       struct page **t_p,
-						       struct rb_root *root)
+static __always_inline struct page *chain_prune(struct stable_node **s_n_d,
+						struct stable_node **s_n,
+						struct rb_root *root)
 {
-	return __stable_node_chain(s_n, t_p, root, true);
+	return __stable_node_chain(s_n_d, s_n, root, true);
 }
 
-static __always_inline struct stable_node *chain(struct stable_node *s_n,
-						 struct page **t_p,
-						 struct rb_root *root)
+static __always_inline struct page *chain(struct stable_node **s_n_d,
+					  struct stable_node *s_n,
+					  struct rb_root *root)
 {
-	return __stable_node_chain(&s_n, t_p, root, false);
+	struct stable_node *old_stable_node = s_n;
+	struct page *tree_page;
+
+	tree_page = __stable_node_chain(s_n_d, &s_n, root, false);
+	/* not pruning dups so s_n cannot have changed */
+	VM_BUG_ON(s_n != old_stable_node);
+	return tree_page;
 }
 
 /*
@@ -1504,7 +1530,7 @@ static struct page *stable_tree_search(struct page *page)
 		cond_resched();
 		stable_node = rb_entry(*new, struct stable_node, node);
 		stable_node_any = NULL;
-		stable_node_dup = chain_prune(&stable_node, &tree_page, root);
+		tree_page = chain_prune(&stable_node_dup, &stable_node, root);
 		/*
 		 * NOTE: stable_node may have been freed by
 		 * chain_prune() if the returned stable_node_dup is
@@ -1745,7 +1771,7 @@ static struct stable_node *stable_tree_insert(struct page *kpage)
 		cond_resched();
 		stable_node = rb_entry(*new, struct stable_node, node);
 		stable_node_any = NULL;
-		stable_node_dup = chain(stable_node, &tree_page, root);
+		tree_page = chain(&stable_node_dup, stable_node, root);
 		if (!stable_node_dup) {
 			/*
 			 * Either all stable_node dups were full in

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] ksm: optimize refile of stable_node_dup at the head of the chain
  2017-05-18 17:37 [PATCH 0/3] KSMscale cleanup/optimizations Andrea Arcangeli
  2017-05-18 17:37 ` [PATCH 1/3] ksm: cleanup stable_node chain collapse case Andrea Arcangeli
  2017-05-18 17:37 ` [PATCH 2/3] ksm: swap the two output parameters of chain/chain_prune Andrea Arcangeli
@ 2017-05-18 17:37 ` Andrea Arcangeli
  2017-05-18 20:35 ` [PATCH 0/3] KSMscale cleanup/optimizations Dan Carpenter
  3 siblings, 0 replies; 5+ messages in thread
From: Andrea Arcangeli @ 2017-05-18 17:37 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: Evgheni Dereveanchin, Andrey Ryabinin, Petr Holasek,
	Hugh Dickins, Arjan van de Ven, Davidlohr Bueso, Gavin Guo,
	Jay Vosburgh, Mel Gorman, Dan Carpenter

If a candidate stable_node_dup has been found and it can accept
further merges, it can be refiled to the head of the list to speed up
the next searches, without altering which dup is found or how the
dups accumulate in the chain.

We already refiled it back to the head in the prune_stale_stable_nodes
case, but we didn't refile it if not pruning (which is the more common
case). And we also refiled it when it was already at the head, which
is unnecessary (in the prune_stale_stable_nodes case, nr > 1 means
there's more than one dup in the chain; it doesn't mean the found dup
isn't already at the head of the chain).

The stable_node_chain list is single threaded and there's no SMP
locking contention, so it should be faster to refile the dup to the
head of the list also when prune_stale_stable_nodes is false.

Profiling shows the refile happens 1.9% of the time when a dup is
found, with a max_page_sharing limit of 3 (with max_page_sharing of 2
the refile never happens, of course, as there's never room for one
more merge), which is reasonably low. At higher max_page_sharing
values it should be much less frequent.

This is just an optimization.
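
As a hedged sketch of the refile logic (plain C with a singly linked
list standing in for the kernel's hlist; the function and field names
are invented for the example, only the head check mirrors the patch):

#include <stdio.h>
#include <stddef.h>

struct dup {
	struct dup *next;
	int len;
};

static void refile_to_head(struct dup **head, struct dup *prev,
			   struct dup *found)
{
	if (*head == found)	/* already first: the noop this */
		return;		/* patch now skips explicitly */
	prev->next = found->next;	/* hlist_del() analogue */
	found->next = *head;		/* hlist_add_head() analogue */
	*head = found;
}

int main(void)
{
	struct dup b = { .next = NULL, .len = 1 };
	struct dup a = { .next = &b, .len = 2 };
	struct dup *head = &a;

	refile_to_head(&head, &a, &b);	/* b moves to the head */
	printf("head len=%d\n", head->len);	/* prints 1 */
	return 0;
}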

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 mm/ksm.c | 35 +++++++++++++++++++++++------------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 7b2e26f9cf41..e02342f4f6aa 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1369,13 +1369,14 @@ static struct page *stable_node_dup(struct stable_node **_stable_node_dup,
 		put_page(_tree_page);
 	}
 
-	/*
-	 * nr is relevant only if prune_stale_stable_nodes is true,
-	 * otherwise we may break the loop at nr == 1 even if there
-	 * are multiple entries.
-	 */
-	if (prune_stale_stable_nodes && found) {
-		if (nr == 1) {
+	if (found) {
+		/*
+		 * nr is counting all dups in the chain only if
+		 * prune_stale_stable_nodes is true, otherwise we may
+		 * break the loop at nr == 1 even if there are
+		 * multiple entries.
+		 */
+		if (prune_stale_stable_nodes && nr == 1) {
 			/*
 			 * If there's not just one entry it would
 			 * corrupt memory, better BUG_ON. In KSM
@@ -1406,12 +1407,22 @@ static struct page *stable_node_dup(struct stable_node **_stable_node_dup,
 			 * time.
 			 */
 			stable_node = NULL;
-		} else if (__is_page_sharing_candidate(found, 1)) {
+		} else if (stable_node->hlist.first != &found->hlist_dup &&
+			   __is_page_sharing_candidate(found, 1)) {
 			/*
-			 * Refile our candidate at the head
-			 * after the prune if our candidate
-			 * can accept one more future sharing
-			 * in addition to the one underway.
+			 * If the found stable_node dup can accept one
+			 * more future merge (in addition to the one
+			 * that is underway) and is not at the head of
+			 * the chain, put it there so the next search
+			 * will be quicker in the
+			 * !prune_stale_stable_nodes case.
+			 *
+			 * NOTE: it would be inaccurate to use nr > 1
+			 * instead of checking the hlist.first pointer
+			 * directly, because in the
+			 * prune_stale_stable_nodes case "nr" isn't
+			 * the position of the found dup in the chain,
+			 * but the total number of dups in the chain.
 			 */
 			hlist_del(&found->hlist_dup);
 			hlist_add_head(&found->hlist_dup,

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] KSMscale cleanup/optimizations
  2017-05-18 17:37 [PATCH 0/3] KSMscale cleanup/optimizations Andrea Arcangeli
                   ` (2 preceding siblings ...)
  2017-05-18 17:37 ` [PATCH 3/3] ksm: optimize refile of stable_node_dup at the head of the chain Andrea Arcangeli
@ 2017-05-18 20:35 ` Dan Carpenter
  3 siblings, 0 replies; 5+ messages in thread
From: Dan Carpenter @ 2017-05-18 20:35 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Andrew Morton, linux-mm, Evgheni Dereveanchin, Andrey Ryabinin,
	Petr Holasek, Hugh Dickins, Arjan van de Ven, Davidlohr Bueso,
	Gavin Guo, Jay Vosburgh, Mel Gorman

On Thu, May 18, 2017 at 07:37:18PM +0200, Andrea Arcangeli wrote:
> 2/3 should fix the false positive from Dan's static checker. Dan could
> you check if it still complains?

That works.  Thanks!

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread

Thread overview: 5+ messages
2017-05-18 17:37 [PATCH 0/3] KSMscale cleanup/optimizations Andrea Arcangeli
2017-05-18 17:37 ` [PATCH 1/3] ksm: cleanup stable_node chain collapse case Andrea Arcangeli
2017-05-18 17:37 ` [PATCH 2/3] ksm: swap the two output parameters of chain/chain_prune Andrea Arcangeli
2017-05-18 17:37 ` [PATCH 3/3] ksm: optimize refile of stable_node_dup at the head of the chain Andrea Arcangeli
2017-05-18 20:35 ` [PATCH 0/3] KSMscale cleanup/optimizations Dan Carpenter
