linux-kernel.vger.kernel.org archive mirror
* [v3 PATCH] mm: ksm: do not block on page lock when searching stable tree
@ 2019-01-29 20:29 Yang Shi
  2019-01-30  7:14 ` John Hubbard
  2019-01-30  8:13 ` Kirill Tkhai
  0 siblings, 2 replies; 5+ messages in thread
From: Yang Shi @ 2019-01-29 20:29 UTC (permalink / raw)
  To: ktkhai, jhubbard, hughd, aarcange, akpm; +Cc: yang.shi, linux-mm, linux-kernel

ksmd needs to search the stable tree to look for a suitable KSM page, but
the KSM page might be locked for a while due to, e.g., a KSM page rmap
walk.  Basically this is not a big deal since commit 2c653d0ee2ae
("ksm: introduce ksm_max_page_sharing per page deduplication limit"),
because max_page_sharing limits the number of shared KSM pages.

But it still does not seem worth waiting for the lock; the page can be
skipped, then merged in the next scan if its content is still intact, to
avoid a potential stall.

Introduce a trylock mode to get_ksm_page() so that it does not block on
the page lock, like what try_to_merge_one_page() does.  And, define the
three possible operations (nolock, lock and trylock) as an enum type to
avoid stacking up bools and to make the code more readable.

Return -EBUSY if trylock fails, since NULL already means no suitable KSM
page was found, which is a valid case.
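
For illustration only (not part of the patch), a minimal caller-side
sketch of how the three outcomes of trylock mode would be told apart,
modeled on the stable_tree_search() and cmp_and_merge_page() hunks below;
the stable_node variable here is assumed:

	struct page *page;

	page = get_ksm_page(stable_node, GET_KSM_PAGE_TRYLOCK);
	if (PTR_ERR(page) == -EBUSY)
		/* page is locked elsewhere; skip it, retry in the next scan */
		return ERR_PTR(-EBUSY);
	if (!page)
		/* stale stable node; get_ksm_page() has already removed it */
		return NULL;
	/* got a locked, refcounted KSM page: unlock_page()/put_page() when done */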

With the default max_page_sharing setting (256), there is almost no
observable change when comparing lock vs trylock.

However, with LTP's ksm02 test, which sets max_page_sharing to 786432, a
reduced ksmd full scan time can be observed.  With the lock version, ksmd
may take 10s - 11s to run two full scans; with the trylock version, ksmd
may take 8s - 11s to run two full scans.  And, the numbers of
pages_sharing and pages_to_scan stay the same.  Basically, this change
does no harm.

Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Suggested-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
Hi folks,

This patch was part of the same series as "mm: vmscan: skip KSM page in
direct reclaim if priority is low" in the initial submission.  Then Hugh
and Andrea pointed out that commit 2c653d0ee2ae ("ksm: introduce
ksm_max_page_sharing per page deduplication limit") is good enough for
limiting the number of shared KSM pages to prevent soft lockups when
walking KSM page rmaps.  That commit does solve the problem.  So, the
series was dropped by Andrew from the -mm tree.

However, I thought the second patch (this one) still sounds useful.  So, I
did some tests and am resubmitting it.  The first version was reviewed by
Kirill Tkhai, so I keep his Reviewed-by tag since there is no change to
the patch except the commit log.

So, would you please reconsider this patch?

v3: Use enum to define get_ksm_page operations (nolock, lock and trylock) per
    John Hubbard
v2: Updated the commit log to reflect some test result and latest discussion

 mm/ksm.c | 46 ++++++++++++++++++++++++++++++++++++----------
 1 file changed, 36 insertions(+), 10 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 6c48ad1..5647bc1 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -667,6 +667,12 @@ static void remove_node_from_stable_tree(struct stable_node *stable_node)
 	free_stable_node(stable_node);
 }
 
+enum get_ksm_page_flags {
+	GET_KSM_PAGE_NOLOCK,
+	GET_KSM_PAGE_LOCK,
+	GET_KSM_PAGE_TRYLOCK
+};
+
 /*
  * get_ksm_page: checks if the page indicated by the stable node
  * is still its ksm page, despite having held no reference to it.
@@ -686,7 +692,8 @@ static void remove_node_from_stable_tree(struct stable_node *stable_node)
  * a page to put something that might look like our key in page->mapping.
  * is on its way to being freed; but it is an anomaly to bear in mind.
  */
-static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it)
+static struct page *get_ksm_page(struct stable_node *stable_node,
+				 enum get_ksm_page_flags flags)
 {
 	struct page *page;
 	void *expected_mapping;
@@ -728,8 +735,15 @@ static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it)
 		goto stale;
 	}
 
-	if (lock_it) {
+	if (flags == GET_KSM_PAGE_TRYLOCK) {
+		if (!trylock_page(page)) {
+			put_page(page);
+			return ERR_PTR(-EBUSY);
+		}
+	} else if (flags == GET_KSM_PAGE_LOCK)
 		lock_page(page);
+
+	if (flags != GET_KSM_PAGE_NOLOCK) {
 		if (READ_ONCE(page->mapping) != expected_mapping) {
 			unlock_page(page);
 			put_page(page);
@@ -763,7 +777,7 @@ static void remove_rmap_item_from_tree(struct rmap_item *rmap_item)
 		struct page *page;
 
 		stable_node = rmap_item->head;
-		page = get_ksm_page(stable_node, true);
+		page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK);
 		if (!page)
 			goto out;
 
@@ -863,7 +877,7 @@ static int remove_stable_node(struct stable_node *stable_node)
 	struct page *page;
 	int err;
 
-	page = get_ksm_page(stable_node, true);
+	page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK);
 	if (!page) {
 		/*
 		 * get_ksm_page did remove_node_from_stable_tree itself.
@@ -1385,7 +1399,7 @@ static struct page *stable_node_dup(struct stable_node **_stable_node_dup,
 		 * stable_node parameter itself will be freed from
 		 * under us if it returns NULL.
 		 */
-		_tree_page = get_ksm_page(dup, false);
+		_tree_page = get_ksm_page(dup, GET_KSM_PAGE_NOLOCK);
 		if (!_tree_page)
 			continue;
 		nr += 1;
@@ -1508,7 +1522,7 @@ static struct page *__stable_node_chain(struct stable_node **_stable_node_dup,
 	if (!is_stable_node_chain(stable_node)) {
 		if (is_page_sharing_candidate(stable_node)) {
 			*_stable_node_dup = stable_node;
-			return get_ksm_page(stable_node, false);
+			return get_ksm_page(stable_node, GET_KSM_PAGE_NOLOCK);
 		}
 		/*
 		 * _stable_node_dup set to NULL means the stable_node
@@ -1613,7 +1627,8 @@ static struct page *stable_tree_search(struct page *page)
 			 * wrprotected at all times. Any will work
 			 * fine to continue the walk.
 			 */
-			tree_page = get_ksm_page(stable_node_any, false);
+			tree_page = get_ksm_page(stable_node_any,
+						 GET_KSM_PAGE_NOLOCK);
 		}
 		VM_BUG_ON(!stable_node_dup ^ !!stable_node_any);
 		if (!tree_page) {
@@ -1673,7 +1688,12 @@ static struct page *stable_tree_search(struct page *page)
 			 * It would be more elegant to return stable_node
 			 * than kpage, but that involves more changes.
 			 */
-			tree_page = get_ksm_page(stable_node_dup, true);
+			tree_page = get_ksm_page(stable_node_dup,
+						 GET_KSM_PAGE_TRYLOCK);
+
+			if (PTR_ERR(tree_page) == -EBUSY)
+				return ERR_PTR(-EBUSY);
+
 			if (unlikely(!tree_page))
 				/*
 				 * The tree may have been rebalanced,
@@ -1842,7 +1862,8 @@ static struct stable_node *stable_tree_insert(struct page *kpage)
 			 * wrprotected at all times. Any will work
 			 * fine to continue the walk.
 			 */
-			tree_page = get_ksm_page(stable_node_any, false);
+			tree_page = get_ksm_page(stable_node_any,
+						 GET_KSM_PAGE_NOLOCK);
 		}
 		VM_BUG_ON(!stable_node_dup ^ !!stable_node_any);
 		if (!tree_page) {
@@ -2060,6 +2081,10 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
 
 	/* We first start with searching the page inside the stable tree */
 	kpage = stable_tree_search(page);
+
+	if (PTR_ERR(kpage) == -EBUSY)
+		return;
+
 	if (kpage == page && rmap_item->head == stable_node) {
 		put_page(kpage);
 		return;
@@ -2242,7 +2267,8 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
 
 			list_for_each_entry_safe(stable_node, next,
 						 &migrate_nodes, list) {
-				page = get_ksm_page(stable_node, false);
+				page = get_ksm_page(stable_node,
+						    GET_KSM_PAGE_NOLOCK);
 				if (page)
 					put_page(page);
 				cond_resched();
-- 
1.8.3.1



* Re: [v3 PATCH] mm: ksm: do not block on page lock when searching stable tree
  2019-01-29 20:29 [v3 PATCH] mm: ksm: do not block on page lock when searching stable tree Yang Shi
@ 2019-01-30  7:14 ` John Hubbard
  2019-01-30 17:47   ` Yang Shi
  2019-01-30  8:13 ` Kirill Tkhai
  1 sibling, 1 reply; 5+ messages in thread
From: John Hubbard @ 2019-01-30  7:14 UTC (permalink / raw)
  To: Yang Shi, ktkhai, hughd, aarcange, akpm; +Cc: linux-mm, linux-kernel

On 1/29/19 12:29 PM, Yang Shi wrote:
> [...]
> @@ -1673,7 +1688,12 @@ static struct page *stable_tree_search(struct page *page)
>   			 * It would be more elegant to return stable_node
>   			 * than kpage, but that involves more changes.
>   			 */
> -			tree_page = get_ksm_page(stable_node_dup, true);
> +			tree_page = get_ksm_page(stable_node_dup,
> +						 GET_KSM_PAGE_TRYLOCK);
> +
> +			if (PTR_ERR(tree_page) == -EBUSY)
> +				return ERR_PTR(-EBUSY);

or just:

	if (PTR_ERR(tree_page) == -EBUSY)
		return tree_page;

right?

> [...]

Hi Yang,

The patch looks correct as far as doing what it claims to do. I'll leave it
to others to decide if a trylock-based approach is really what you want
for KSM scans. It seems reasonable from my very limited knowledge of
KSM: there shouldn't be any cases where you really *need* to wait for
a page lock, because the whole system is really sort of an optimization
anyway.


thanks,
-- 
John Hubbard
NVIDIA


* Re: [v3 PATCH] mm: ksm: do not block on page lock when searching stable tree
  2019-01-29 20:29 [v3 PATCH] mm: ksm: do not block on page lock when searching stable tree Yang Shi
  2019-01-30  7:14 ` John Hubbard
@ 2019-01-30  8:13 ` Kirill Tkhai
  1 sibling, 0 replies; 5+ messages in thread
From: Kirill Tkhai @ 2019-01-30  8:13 UTC (permalink / raw)
  To: Yang Shi, jhubbard, hughd, aarcange, akpm; +Cc: linux-mm, linux-kernel

On 29.01.2019 23:29, Yang Shi wrote:
> [...]
> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>

Also looks good to me.



* Re: [v3 PATCH] mm: ksm: do not block on page lock when searching stable tree
  2019-01-30  7:14 ` John Hubbard
@ 2019-01-30 17:47   ` Yang Shi
  2019-01-30 18:14     ` John Hubbard
  0 siblings, 1 reply; 5+ messages in thread
From: Yang Shi @ 2019-01-30 17:47 UTC (permalink / raw)
  To: John Hubbard, ktkhai, hughd, aarcange, akpm; +Cc: linux-mm, linux-kernel



On 1/29/19 11:14 PM, John Hubbard wrote:
> On 1/29/19 12:29 PM, Yang Shi wrote:
>> [...]
>> @@ -1673,7 +1688,12 @@ static struct page *stable_tree_search(struct page *page)
>>                * It would be more elegant to return stable_node
>>                * than kpage, but that involves more changes.
>>                */
>> -            tree_page = get_ksm_page(stable_node_dup, true);
>> +            tree_page = get_ksm_page(stable_node_dup,
>> +                         GET_KSM_PAGE_TRYLOCK);
>> +
>> +            if (PTR_ERR(tree_page) == -EBUSY)
>> +                return ERR_PTR(-EBUSY);
>
> or just:
>
>     if (PTR_ERR(tree_page) == -EBUSY)
>         return tree_page;
>
> right?

Either looks fine to me. Returning the errno may look more explicit? Anyway,
I really don't have a preference.

>
>> [...]
>
> Hi Yang,
>
> The patch looks correct as far as doing what it claims to do. I'll leave it
> to others to decide if a trylock-based approach is really what you want
> for KSM scans. It seems reasonable from my very limited knowledge of
> KSM: there shouldn't be any cases where you really *need* to wait for
> a page lock, because the whole system is really sort of an optimization
> anyway.

Thanks!

Yang




* Re: [v3 PATCH] mm: ksm: do not block on page lock when searching stable tree
  2019-01-30 17:47   ` Yang Shi
@ 2019-01-30 18:14     ` John Hubbard
  0 siblings, 0 replies; 5+ messages in thread
From: John Hubbard @ 2019-01-30 18:14 UTC (permalink / raw)
  To: Yang Shi, ktkhai, hughd, aarcange, akpm; +Cc: linux-mm, linux-kernel

On 1/30/19 9:47 AM, Yang Shi wrote:
[...]
>>> @@ -1673,7 +1688,12 @@ static struct page *stable_tree_search(struct page *page)
>>>                * It would be more elegant to return stable_node
>>>                * than kpage, but that involves more changes.
>>>                */
>>> -            tree_page = get_ksm_page(stable_node_dup, true);
>>> +            tree_page = get_ksm_page(stable_node_dup,
>>> +                         GET_KSM_PAGE_TRYLOCK);
>>> +
>>> +            if (PTR_ERR(tree_page) == -EBUSY)
>>> +                return ERR_PTR(-EBUSY);
>>
>> or just:
>>
>>     if (PTR_ERR(tree_page) == -EBUSY)
>>         return tree_page;
>>
>> right?
> 
> Either looks fine to me. Returning the errno may look more explicit? Anyway, I really don't have a preference.

Yes, either one is fine. I like to see less code on the screen, all else being equal,
but it's an extremely minor point, and sometimes being explicit instead is better anyway.



thanks,
-- 
John Hubbard
NVIDIA

