linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] thp: tidy and fix khugepaged_prealloc_page
@ 2012-09-12 12:55 Xiao Guangrong
  2012-09-12 12:56 ` [PATCH 1/3] thp: fix forgetting to reset the page alloc indicator Xiao Guangrong
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Xiao Guangrong @ 2012-09-12 12:55 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Hugh Dickins, Linux Memory Management List, LKML

There is a bug in khugepaged_prealloc_page: the page-alloc
indicator is not reset if the previous page request failed,
which then triggers the VM_BUG_ON in khugepaged_alloc_page.
It is fixed by the first patch, which does not need to be
backported since the bug was introduced by a recent commit.
(Sorry for that.)

As Hugh pointed out, there are some ugly portions:
- releasing the mmap_sem lock is hidden in khugepaged_alloc_page
- the page is freed in khugepaged_prealloc_page
The latter two patches address these issues.

Hugh,

If I missed any point, please let me know, and sorry for wasting
your time on my broken patch.



* [PATCH 1/3] thp: fix forgetting to reset the page alloc indicator
  2012-09-12 12:55 [PATCH 0/3] thp: tidy and fix khugepaged_prealloc_page Xiao Guangrong
@ 2012-09-12 12:56 ` Xiao Guangrong
  2012-09-12 12:56 ` [PATCH 2/3] thp: move release mmap_sem lock out of khugepaged_alloc_page Xiao Guangrong
  2012-09-12 12:57 ` [PATCH 3/3] thp: introduce khugepaged_cleanup_page Xiao Guangrong
  2 siblings, 0 replies; 6+ messages in thread
From: Xiao Guangrong @ 2012-09-12 12:56 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Andrew Morton, Hugh Dickins, Linux Memory Management List, LKML

If NUMA is enabled, the indicator is not reset if the previous page
request failed, and this then triggers the VM_BUG_ON in khugepaged_alloc_page.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 mm/huge_memory.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e366ca5..66d2bc6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1825,6 +1825,7 @@ static bool khugepaged_prealloc_page(struct page **hpage, bool *wait)
 			return false;

 		*wait = false;
+		*hpage = NULL;
 		khugepaged_alloc_sleep();
 	} else if (*hpage) {
 		put_page(*hpage);
-- 
1.7.7.6



* [PATCH 2/3] thp: move release mmap_sem lock out of khugepaged_alloc_page
  2012-09-12 12:55 [PATCH 0/3] thp: tidy and fix khugepaged_prealloc_page Xiao Guangrong
  2012-09-12 12:56 ` [PATCH 1/3] thp: fix forgetting to reset the page alloc indicator Xiao Guangrong
@ 2012-09-12 12:56 ` Xiao Guangrong
  2012-09-12 22:18   ` Andrew Morton
  2012-09-12 12:57 ` [PATCH 3/3] thp: introduce khugepaged_cleanup_page Xiao Guangrong
  2 siblings, 1 reply; 6+ messages in thread
From: Xiao Guangrong @ 2012-09-12 12:56 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Andrew Morton, Hugh Dickins, Linux Memory Management List, LKML

To make the code clearer, move releasing the lock out of khugepaged_alloc_page.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 mm/huge_memory.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 66d2bc6..5622347 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1854,11 +1854,6 @@ static struct page
 	*hpage  = alloc_hugepage_vma(khugepaged_defrag(), vma, address,
 				      node, __GFP_OTHER_NODE);

-	/*
-	 * After allocating the hugepage, release the mmap_sem read lock in
-	 * preparation for taking it in write mode.
-	 */
-	up_read(&mm->mmap_sem);
 	if (unlikely(!*hpage)) {
 		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
 		*hpage = ERR_PTR(-ENOMEM);
@@ -1905,7 +1900,6 @@ static struct page
 		       struct vm_area_struct *vma, unsigned long address,
 		       int node)
 {
-	up_read(&mm->mmap_sem);
 	VM_BUG_ON(!*hpage);
 	return  *hpage;
 }
@@ -1931,8 +1925,14 @@ static void collapse_huge_page(struct mm_struct *mm,

 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);

-	/* release the mmap_sem read lock. */
 	new_page = khugepaged_alloc_page(hpage, mm, vma, address, node);
+
+	/*
+	 * After allocating the hugepage, release the mmap_sem read lock in
+	 * preparation for taking it in write mode.
+	 */
+	up_read(&mm->mmap_sem);
+
 	if (!new_page)
 		return;

-- 
1.7.7.6



* [PATCH 3/3] thp: introduce khugepaged_cleanup_page
  2012-09-12 12:55 [PATCH 0/3] thp: tidy and fix khugepaged_prealloc_page Xiao Guangrong
  2012-09-12 12:56 ` [PATCH 1/3] thp: fix forgetting to reset the page alloc indicator Xiao Guangrong
  2012-09-12 12:56 ` [PATCH 2/3] thp: move release mmap_sem lock out of khugepaged_alloc_page Xiao Guangrong
@ 2012-09-12 12:57 ` Xiao Guangrong
  2 siblings, 0 replies; 6+ messages in thread
From: Xiao Guangrong @ 2012-09-12 12:57 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Andrew Morton, Hugh Dickins, Linux Memory Management List, LKML

It is used to release the page on the failure path, so the page
no longer needs to be cleaned up in khugepaged_prealloc_page.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 mm/huge_memory.c |   19 +++++++++++++++----
 1 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 5622347..de0a028 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1827,9 +1827,6 @@ static bool khugepaged_prealloc_page(struct page **hpage, bool *wait)
 		*wait = false;
 		*hpage = NULL;
 		khugepaged_alloc_sleep();
-	} else if (*hpage) {
-		put_page(*hpage);
-		*hpage = NULL;
 	}

 	return true;
@@ -1863,6 +1860,13 @@ static struct page
 	count_vm_event(THP_COLLAPSE_ALLOC);
 	return *hpage;
 }
+
+static void khugepaged_cleanup_page(struct page **hpage)
+{
+	VM_BUG_ON(!*hpage);
+	put_page(*hpage);
+	*hpage = NULL;
+}
 #else
 static struct page *khugepaged_alloc_hugepage(bool *wait)
 {
@@ -1903,6 +1907,10 @@ static struct page
 	VM_BUG_ON(!*hpage);
 	return  *hpage;
 }
+
+static void khugepaged_cleanup_page(struct page **hpage)
+{
+}
 #endif

 static void collapse_huge_page(struct mm_struct *mm,
@@ -1936,8 +1944,10 @@ static void collapse_huge_page(struct mm_struct *mm,
 	if (!new_page)
 		return;

-	if (unlikely(mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL)))
+	if (unlikely(mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL))) {
+		khugepaged_cleanup_page(hpage);
 		return;
+	}

 	/*
 	 * Prevent all access to pagetables with the exception of
@@ -2048,6 +2058,7 @@ out_up_write:
 	return;

 out:
+	khugepaged_cleanup_page(hpage);
 	mem_cgroup_uncharge_page(new_page);
 	goto out_up_write;
 }
-- 
1.7.7.6



* Re: [PATCH 2/3] thp: move release mmap_sem lock out of khugepaged_alloc_page
  2012-09-12 12:56 ` [PATCH 2/3] thp: move release mmap_sem lock out of khugepaged_alloc_page Xiao Guangrong
@ 2012-09-12 22:18   ` Andrew Morton
  2012-09-13  9:16     ` Xiao Guangrong
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2012-09-12 22:18 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Hugh Dickins, Linux Memory Management List, LKML

On Wed, 12 Sep 2012 20:56:41 +0800
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> wrote:

> To make the code clearer, move releasing the lock out of khugepaged_alloc_page.
> 
> ...
>
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1854,11 +1854,6 @@ static struct page
>  	*hpage  = alloc_hugepage_vma(khugepaged_defrag(), vma, address,
>  				      node, __GFP_OTHER_NODE);
> 
> -	/*
> -	 * After allocating the hugepage, release the mmap_sem read lock in
> -	 * preparation for taking it in write mode.
> -	 */
> -	up_read(&mm->mmap_sem);
>  	if (unlikely(!*hpage)) {
>  		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
>  		*hpage = ERR_PTR(-ENOMEM);
> @@ -1905,7 +1900,6 @@ static struct page
>  		       struct vm_area_struct *vma, unsigned long address,
>  		       int node)
>  {
> -	up_read(&mm->mmap_sem);
>  	VM_BUG_ON(!*hpage);
>  	return  *hpage;
>  }
> @@ -1931,8 +1925,14 @@ static void collapse_huge_page(struct mm_struct *mm,
> 
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
> 
> -	/* release the mmap_sem read lock. */
>  	new_page = khugepaged_alloc_page(hpage, mm, vma, address, node);
> +
> +	/*
> +	 * After allocating the hugepage, release the mmap_sem read lock in
> +	 * preparation for taking it in write mode.
> +	 */
> +	up_read(&mm->mmap_sem);
> +
>  	if (!new_page)
>  		return;

Well that's a pretty minor improvement: one still has to go off on a
big hunt to locate the matching down_read().

And the patch will increase mmap_sem hold times by a teeny amount.  Do
we really want to do this?



* Re: [PATCH 2/3] thp: move release mmap_sem lock out of khugepaged_alloc_page
  2012-09-12 22:18   ` Andrew Morton
@ 2012-09-13  9:16     ` Xiao Guangrong
  0 siblings, 0 replies; 6+ messages in thread
From: Xiao Guangrong @ 2012-09-13  9:16 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Hugh Dickins, Linux Memory Management List, LKML

On 09/13/2012 06:18 AM, Andrew Morton wrote:
> On Wed, 12 Sep 2012 20:56:41 +0800
> Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> wrote:
> 
>> To make the code clearer, move releasing the lock out of khugepaged_alloc_page.
>>
>> ...
>>
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -1854,11 +1854,6 @@ static struct page
>>  	*hpage  = alloc_hugepage_vma(khugepaged_defrag(), vma, address,
>>  				      node, __GFP_OTHER_NODE);
>>
>> -	/*
>> -	 * After allocating the hugepage, release the mmap_sem read lock in
>> -	 * preparation for taking it in write mode.
>> -	 */
>> -	up_read(&mm->mmap_sem);
>>  	if (unlikely(!*hpage)) {
>>  		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
>>  		*hpage = ERR_PTR(-ENOMEM);
>> @@ -1905,7 +1900,6 @@ static struct page
>>  		       struct vm_area_struct *vma, unsigned long address,
>>  		       int node)
>>  {
>> -	up_read(&mm->mmap_sem);
>>  	VM_BUG_ON(!*hpage);
>>  	return  *hpage;
>>  }
>> @@ -1931,8 +1925,14 @@ static void collapse_huge_page(struct mm_struct *mm,
>>
>>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>>
>> -	/* release the mmap_sem read lock. */
>>  	new_page = khugepaged_alloc_page(hpage, mm, vma, address, node);
>> +
>> +	/*
>> +	 * After allocating the hugepage, release the mmap_sem read lock in
>> +	 * preparation for taking it in write mode.
>> +	 */
>> +	up_read(&mm->mmap_sem);
>> +
>>  	if (!new_page)
>>  		return;
> 
> Well that's a pretty minor improvement: one still has to go off on a
> big hunt to locate the matching down_read().
> 
> And the patch will increase mmap_sem hold times by a teeny amount.  Do
> we really want to do this?

Andrew,

This is why I did it that way in the previous patch (the lock was released in
the alloc function). But as you noticed, the extra hold time after this patch
is really tiny - it only adds the cost of count_vm_event(), which operates on
a per-cpu variable. And before I posted this patch, I ran kernbench; no
regression was found.

Another approach is to let the function name indicate that the lock will be
released; how about renaming the function to khugepaged_alloc_page_release_lock?

