linux-kernel.vger.kernel.org archive mirror
* [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
@ 2016-06-17  3:03 Huang, Ying
  2016-06-17  3:15 ` Huang, Ying
  2016-06-17  5:31 ` Minchan Kim
  0 siblings, 2 replies; 9+ messages in thread
From: Huang, Ying @ 2016-06-17  3:03 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Huang Ying, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

From: Huang Ying <ying.huang@intel.com>

madvise_free_huge_pmd should return 0 if the fallback PTE operations are
required.  In madvise_free_huge_pmd, if only part of the pages of a THP
are to be discarded, the THP is split, and the fallback PTE operations
should be used if the split succeeds.  But the original code skips the
fallback PTE operations after a successful split.  Fix that by making
madvise_free_huge_pmd return 0 after a successful split, so that the
fallback PTE operations are done.

Known issues: if my understanding is correct, returning 1 from
madvise_free_huge_pmd means that the following processing for the PMD
should be skipped, while returning 0 means that the following processing
is still needed.  So the function should return 0 only if the THP is
split successfully or the PMD is not trans huge.  But the
pmd_trans_unstable check after madvise_free_huge_pmd guarantees that the
following processing is skipped for a huge PMD, so the current code runs
properly.  Still, if my understanding is correct, we can clean up the
return code of madvise_free_huge_pmd accordingly.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
---
 mm/huge_memory.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2ad52d5..64dc95d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1655,14 +1655,9 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	if (next - addr != HPAGE_PMD_SIZE) {
 		get_page(page);
 		spin_unlock(ptl);
-		if (split_huge_page(page)) {
-			put_page(page);
-			unlock_page(page);
-			goto out_unlocked;
-		}
+		split_huge_page(page);
 		put_page(page);
 		unlock_page(page);
-		ret = 1;
 		goto out_unlocked;
 	}
 
-- 
2.8.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
  2016-06-17  3:03 [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting Huang, Ying
@ 2016-06-17  3:15 ` Huang, Ying
  2016-06-17  5:31 ` Minchan Kim
  1 sibling, 0 replies; 9+ messages in thread
From: Huang, Ying @ 2016-06-17  3:15 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Minchan Kim, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

"Huang, Ying" <ying.huang@intel.com> writes:

> From: Huang Ying <ying.huang@intel.com>
>
> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
> required.  In madvise_free_huge_pmd, if part pages of THP are discarded,
> the THP will be split and fallback PTE operations should be used if
> splitting succeeds.  But the original code will make fallback PTE
> operations skipped, after splitting succeeds.  Fix that via make
> madvise_free_huge_pmd return 0 after splitting successfully, so that the
> fallback PTE operations will be done.
>
> Know issues: if my understanding were correct, return 1 from
> madvise_free_huge_pmd means the following processing for the PMD should
> be skipped, while return 0 means the following processing is still
> needed.  So the function should return 0 only if the THP is split
> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
> after madvise_free_huge_pmd guarantee the following processing will be
> skipped for huge PMD.  So current code can run properly.  But if my
> understanding were correct, we can clean up return code of
> madvise_free_huge_pmd accordingly.
>

This patch was tested with the program below.  Given some memory
pressure, the value read back is 0 with the patch, that is, the THP is
split, the page is freed, and the zero page is mapped.  Without the
patch, the value read back is still 1, that is, the page is not freed.

Best Regards,
Huang, Ying

------------------------------->
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>

#ifndef MADV_FREE
#define MADV_FREE	8		/* free pages only if memory pressure */
#endif

#define ONE_MB		(1024 * 1024)
#define THP_SIZE	(2 * ONE_MB)
#define THP_MASK	(THP_SIZE - 1)
#define MAP_SIZE	(16 * ONE_MB)

#define ERR_EXIT_ON(cond, msg)					\
	do {							\
		int __cond_in_macro = (cond);			\
		if (__cond_in_macro)				\
			error_exit(__cond_in_macro, (msg));	\
	} while (0)

void error_exit(int ret, const char *msg)
{
	fprintf(stderr, "Error: %s, ret : %d, error: %s\n",
		msg, ret, strerror(errno));
	exit(1);
}

void write_and_free()
{
	int ret;
	void *addr;
	int *pn;

	addr = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	ERR_EXIT_ON(addr == MAP_FAILED, "mmap");
	ret = madvise(addr, MAP_SIZE, MADV_HUGEPAGE);
	ERR_EXIT_ON(ret, "advise hugepage");
	pn = (int *)(((unsigned long)addr + THP_SIZE) & ~THP_MASK);
	*pn = 1;
	printf("map 1 THP, hit any key to free part of it: ");
	fgetc(stdin);
	ret = madvise(pn, ONE_MB, MADV_FREE);
	ERR_EXIT_ON(ret, "advise free");
	printf("freed part of THP, hit any key to get the new value: ");
	fgetc(stdin);
	printf("val: %d\n", *pn);
}

int main(int argc, char *argv[])
{
	write_and_free();
	return 0;
}

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
  2016-06-17  3:03 [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting Huang, Ying
  2016-06-17  3:15 ` Huang, Ying
@ 2016-06-17  5:31 ` Minchan Kim
  2016-06-17 15:59   ` Huang, Ying
  2016-06-17 19:45   ` Huang, Ying
  1 sibling, 2 replies; 9+ messages in thread
From: Minchan Kim @ 2016-06-17  5:31 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

Hi,

On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
> From: Huang Ying <ying.huang@intel.com>
> 
> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
> required.  In madvise_free_huge_pmd, if part pages of THP are discarded,
> the THP will be split and fallback PTE operations should be used if
> splitting succeeds.  But the original code will make fallback PTE
> operations skipped, after splitting succeeds.  Fix that via make
> madvise_free_huge_pmd return 0 after splitting successfully, so that the
> fallback PTE operations will be done.

You're right. Thanks!

> 
> Know issues: if my understanding were correct, return 1 from
> madvise_free_huge_pmd means the following processing for the PMD should
> be skipped, while return 0 means the following processing is still
> needed.  So the function should return 0 only if the THP is split
> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
> after madvise_free_huge_pmd guarantee the following processing will be
> skipped for huge PMD.  So current code can run properly.  But if my
> understanding were correct, we can clean up return code of
> madvise_free_huge_pmd accordingly.

I like your clean up. Just a minor comment below.

> 
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> ---
>  mm/huge_memory.c | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 2ad52d5..64dc95d 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c

First of all, let's change ret from int to bool.
And then, add a description at the function entry.

/*
 * Return true if we do MADV_FREE successfully on entire pmd page.
 * Otherwise, return false.
 */

And do not set ret to 1 if it is huge_zero_pmd; just goto out to
return false.

Thanks!

> @@ -1655,14 +1655,9 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  	if (next - addr != HPAGE_PMD_SIZE) {
>  		get_page(page);
>  		spin_unlock(ptl);
> -		if (split_huge_page(page)) {
> -			put_page(page);
> -			unlock_page(page);
> -			goto out_unlocked;
> -		}
> +		split_huge_page(page);
>  		put_page(page);
>  		unlock_page(page);
> -		ret = 1;
>  		goto out_unlocked;
>  	}
>  
> -- 
> 2.8.1
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
  2016-06-17  5:31 ` Minchan Kim
@ 2016-06-17 15:59   ` Huang, Ying
  2016-06-19 23:54     ` Minchan Kim
  2016-06-17 19:45   ` Huang, Ying
  1 sibling, 1 reply; 9+ messages in thread
From: Huang, Ying @ 2016-06-17 15:59 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Huang, Ying, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

Minchan Kim <minchan@kernel.org> writes:

> Hi,
>
> On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
>> From: Huang Ying <ying.huang@intel.com>
>> 
>> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
>> required.  In madvise_free_huge_pmd, if part pages of THP are discarded,
>> the THP will be split and fallback PTE operations should be used if
>> splitting succeeds.  But the original code will make fallback PTE
>> operations skipped, after splitting succeeds.  Fix that via make
>> madvise_free_huge_pmd return 0 after splitting successfully, so that the
>> fallback PTE operations will be done.
>
> You're right. Thanks!
>
>> 
>> Know issues: if my understanding were correct, return 1 from
>> madvise_free_huge_pmd means the following processing for the PMD should
>> be skipped, while return 0 means the following processing is still
>> needed.  So the function should return 0 only if the THP is split
>> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
>> after madvise_free_huge_pmd guarantee the following processing will be
>> skipped for huge PMD.  So current code can run properly.  But if my
>> understanding were correct, we can clean up return code of
>> madvise_free_huge_pmd accordingly.
>
> I like your clean up. Just a minor comment below.
>
>> 
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> ---
>>  mm/huge_memory.c | 7 +------
>>  1 file changed, 1 insertion(+), 6 deletions(-)
>> 
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 2ad52d5..64dc95d 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>
> First of all, let's change ret from int to bool.
> And then, add description in the function entry.
>
> /*
>  * Return true if we do MADV_FREE successfully on entire pmd page.
>  * Otherwise, return false.
>  */
>
> And do not set to 1 if it is huge_zero_pmd but just goto out to
> return false.

Do you want to fold the cleanup with this patch or do that in another
patch?

Best Regards,
Huang, Ying

> Thanks!
>
>> @@ -1655,14 +1655,9 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>>  	if (next - addr != HPAGE_PMD_SIZE) {
>>  		get_page(page);
>>  		spin_unlock(ptl);
>> -		if (split_huge_page(page)) {
>> -			put_page(page);
>> -			unlock_page(page);
>> -			goto out_unlocked;
>> -		}
>> +		split_huge_page(page);
>>  		put_page(page);
>>  		unlock_page(page);
>> -		ret = 1;
>>  		goto out_unlocked;
>>  	}
>>  
>> -- 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
  2016-06-17  5:31 ` Minchan Kim
  2016-06-17 15:59   ` Huang, Ying
@ 2016-06-17 19:45   ` Huang, Ying
  2016-06-20  0:15     ` Minchan Kim
  1 sibling, 1 reply; 9+ messages in thread
From: Huang, Ying @ 2016-06-17 19:45 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Huang, Ying, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

Minchan Kim <minchan@kernel.org> writes:

> Hi,
>
> On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
>> From: Huang Ying <ying.huang@intel.com>
>> 
>> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
>> required.  In madvise_free_huge_pmd, if part pages of THP are discarded,
>> the THP will be split and fallback PTE operations should be used if
>> splitting succeeds.  But the original code will make fallback PTE
>> operations skipped, after splitting succeeds.  Fix that via make
>> madvise_free_huge_pmd return 0 after splitting successfully, so that the
>> fallback PTE operations will be done.
>
> You're right. Thanks!
>
>> 
>> Know issues: if my understanding were correct, return 1 from
>> madvise_free_huge_pmd means the following processing for the PMD should
>> be skipped, while return 0 means the following processing is still
>> needed.  So the function should return 0 only if the THP is split
>> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
>> after madvise_free_huge_pmd guarantee the following processing will be
>> skipped for huge PMD.  So current code can run properly.  But if my
>> understanding were correct, we can clean up return code of
>> madvise_free_huge_pmd accordingly.
>
> I like your clean up. Just a minor comment below.
>
>> 
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> ---
>>  mm/huge_memory.c | 7 +------
>>  1 file changed, 1 insertion(+), 6 deletions(-)
>> 
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 2ad52d5..64dc95d 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>
> First of all, let's change ret from int to bool.
> And then, add description in the function entry.

Yes.  bool looks better than int.

> /*
>  * Return true if we do MADV_FREE successfully on entire pmd page.
>  * Otherwise, return false.
>  */

This way, we need to return false if we fail to split the huge page,
which will cause an unnecessary pmd_trans_unstable check.  How about
changing the comment to

/*
 * Return true if we finished processing entire pmd page and needn't
 * fall back pte processing.  Otherwise, return false.
 */

Best Regards,
Huang, Ying

> And do not set to 1 if it is huge_zero_pmd but just goto out to
> return false.
>
> Thanks!
>
>> @@ -1655,14 +1655,9 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>>  	if (next - addr != HPAGE_PMD_SIZE) {
>>  		get_page(page);
>>  		spin_unlock(ptl);
>> -		if (split_huge_page(page)) {
>> -			put_page(page);
>> -			unlock_page(page);
>> -			goto out_unlocked;
>> -		}
>> +		split_huge_page(page);
>>  		put_page(page);
>>  		unlock_page(page);
>> -		ret = 1;
>>  		goto out_unlocked;
>>  	}
>>  
>> -- 
>> 2.8.1
>> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
  2016-06-17 15:59   ` Huang, Ying
@ 2016-06-19 23:54     ` Minchan Kim
  2016-06-20 15:48       ` Huang, Ying
  0 siblings, 1 reply; 9+ messages in thread
From: Minchan Kim @ 2016-06-19 23:54 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Minchan Kim, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

On Fri, Jun 17, 2016 at 08:59:31AM -0700, Huang, Ying wrote:
> Minchan Kim <minchan@kernel.org> writes:
> 
> > Hi,
> >
> > On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
> >> From: Huang Ying <ying.huang@intel.com>
> >> 
> >> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
> >> required.  In madvise_free_huge_pmd, if part pages of THP are discarded,
> >> the THP will be split and fallback PTE operations should be used if
> >> splitting succeeds.  But the original code will make fallback PTE
> >> operations skipped, after splitting succeeds.  Fix that via make
> >> madvise_free_huge_pmd return 0 after splitting successfully, so that the
> >> fallback PTE operations will be done.
> >
> > You're right. Thanks!
> >
> >> 
> >> Know issues: if my understanding were correct, return 1 from
> >> madvise_free_huge_pmd means the following processing for the PMD should
> >> be skipped, while return 0 means the following processing is still
> >> needed.  So the function should return 0 only if the THP is split
> >> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
> >> after madvise_free_huge_pmd guarantee the following processing will be
> >> skipped for huge PMD.  So current code can run properly.  But if my
> >> understanding were correct, we can clean up return code of
> >> madvise_free_huge_pmd accordingly.
> >
> > I like your clean up. Just a minor comment below.
> >
> >> 
> >> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> >> ---
> >>  mm/huge_memory.c | 7 +------
> >>  1 file changed, 1 insertion(+), 6 deletions(-)
> >> 
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 2ad52d5..64dc95d 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >
> > First of all, let's change ret from int to bool.
> > And then, add description in the function entry.
> >
> > /*
> >  * Return true if we do MADV_FREE successfully on entire pmd page.
> >  * Otherwise, return false.
> >  */
> >
> > And do not set to 1 if it is huge_zero_pmd but just goto out to
> > return false.
> 
> Do you want to fold the cleanup with this patch or do that in another
> patch?

I prefer separating the cleanup and the bug fix so that we can send only
the bug-fix patch to the stable tree.

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
  2016-06-17 19:45   ` Huang, Ying
@ 2016-06-20  0:15     ` Minchan Kim
  2016-06-20 15:48       ` Huang, Ying
  0 siblings, 1 reply; 9+ messages in thread
From: Minchan Kim @ 2016-06-20  0:15 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Minchan Kim, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

On Fri, Jun 17, 2016 at 12:45:43PM -0700, Huang, Ying wrote:
> Minchan Kim <minchan@kernel.org> writes:
> 
> > Hi,
> >
> > On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
> >> From: Huang Ying <ying.huang@intel.com>
> >> 
> >> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
> >> required.  In madvise_free_huge_pmd, if part pages of THP are discarded,
> >> the THP will be split and fallback PTE operations should be used if
> >> splitting succeeds.  But the original code will make fallback PTE
> >> operations skipped, after splitting succeeds.  Fix that via make
> >> madvise_free_huge_pmd return 0 after splitting successfully, so that the
> >> fallback PTE operations will be done.
> >
> > You're right. Thanks!
> >
> >> 
> >> Know issues: if my understanding were correct, return 1 from
> >> madvise_free_huge_pmd means the following processing for the PMD should
> >> be skipped, while return 0 means the following processing is still
> >> needed.  So the function should return 0 only if the THP is split
> >> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
> >> after madvise_free_huge_pmd guarantee the following processing will be
> >> skipped for huge PMD.  So current code can run properly.  But if my
> >> understanding were correct, we can clean up return code of
> >> madvise_free_huge_pmd accordingly.
> >
> > I like your clean up. Just a minor comment below.
> >
> >> 
> >> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> >> ---
> >>  mm/huge_memory.c | 7 +------
> >>  1 file changed, 1 insertion(+), 6 deletions(-)
> >> 
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 2ad52d5..64dc95d 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >
> > First of all, let's change ret from int to bool.
> > And then, add description in the function entry.
> 
> Yes.  bool looks better than int.
> 
> > /*
> >  * Return true if we do MADV_FREE successfully on entire pmd page.
> >  * Otherwise, return false.
> >  */
> 
> This way, we need to return false if we failed to split huge page, this
> will cause unnecessary pmd_trans_unstable check.  How about to change
> the comments to

I focused on the function name "madvise_free_huge_pmd". IOW, the function
should free the huge pmd page. If it succeeds, we are done. Otherwise, the
following routines should handle it.

If it fails to split, pmd_trans_unstable will check it and return.
I don't think it's an operation heavy enough to affect performance, so
rather than making the function fast, I wanted to make it simple by
following the function name.

> 
> /*
>  * Return true if we finished processing entire pmd page and needn't
>  * fall back pte processing.  Otherwise, return false.
>  */
> 
> Best Regards,
> Huang, Ying
> 
> > And do not set to 1 if it is huge_zero_pmd but just goto out to
> > return false.
> >
> > Thanks!
> >
> >> @@ -1655,14 +1655,9 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
> >>  	if (next - addr != HPAGE_PMD_SIZE) {
> >>  		get_page(page);
> >>  		spin_unlock(ptl);
> >> -		if (split_huge_page(page)) {
> >> -			put_page(page);
> >> -			unlock_page(page);
> >> -			goto out_unlocked;
> >> -		}
> >> +		split_huge_page(page);
> >>  		put_page(page);
> >>  		unlock_page(page);
> >> -		ret = 1;
> >>  		goto out_unlocked;
> >>  	}
> >>  
> >> -- 
> >> 2.8.1
> >> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
  2016-06-20  0:15     ` Minchan Kim
@ 2016-06-20 15:48       ` Huang, Ying
  0 siblings, 0 replies; 9+ messages in thread
From: Huang, Ying @ 2016-06-20 15:48 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Huang, Ying, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

Minchan Kim <minchan@kernel.org> writes:

> On Fri, Jun 17, 2016 at 12:45:43PM -0700, Huang, Ying wrote:
>> Minchan Kim <minchan@kernel.org> writes:
>> 
>> > Hi,
>> >
>> > On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
>> >> From: Huang Ying <ying.huang@intel.com>
>> >> 
>> >> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
>> >> required.  In madvise_free_huge_pmd, if part pages of THP are discarded,
>> >> the THP will be split and fallback PTE operations should be used if
>> >> splitting succeeds.  But the original code will make fallback PTE
>> >> operations skipped, after splitting succeeds.  Fix that via make
>> >> madvise_free_huge_pmd return 0 after splitting successfully, so that the
>> >> fallback PTE operations will be done.
>> >
>> > You're right. Thanks!
>> >
>> >> 
>> >> Know issues: if my understanding were correct, return 1 from
>> >> madvise_free_huge_pmd means the following processing for the PMD should
>> >> be skipped, while return 0 means the following processing is still
>> >> needed.  So the function should return 0 only if the THP is split
>> >> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
>> >> after madvise_free_huge_pmd guarantee the following processing will be
>> >> skipped for huge PMD.  So current code can run properly.  But if my
>> >> understanding were correct, we can clean up return code of
>> >> madvise_free_huge_pmd accordingly.
>> >
>> > I like your clean up. Just a minor comment below.
>> >
>> >> 
>> >> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> >> ---
>> >>  mm/huge_memory.c | 7 +------
>> >>  1 file changed, 1 insertion(+), 6 deletions(-)
>> >> 
>> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> >> index 2ad52d5..64dc95d 100644
>> >> --- a/mm/huge_memory.c
>> >> +++ b/mm/huge_memory.c
>> >
>> > First of all, let's change ret from int to bool.
>> > And then, add description in the function entry.
>> 
>> Yes.  bool looks better than int.
>> 
>> > /*
>> >  * Return true if we do MADV_FREE successfully on entire pmd page.
>> >  * Otherwise, return false.
>> >  */
>> 
>> This way, we need to return false if we failed to split huge page, this
>> will cause unnecessary pmd_trans_unstable check.  How about to change
>> the comments to
>
> I focused the function name "madvise_free_huge_pmd". IOW, the function
> should free huge pmd page. If it is successful, done. Otherwise, next
> routines should handle it.
>
> If it fail to split, pmd_trans_unstable will check it and return.
> I don't think it's heavy operation to affect performance so rather
> than making function fast, I wanted to make it simple by following
> function name.

Reasonable.  Will do that.

Best Regards,
Huang, Ying

>> 
>> /*
>>  * Return true if we finished processing entire pmd page and needn't
>>  * fall back pte processing.  Otherwise, return false.
>>  */
>> 
>> Best Regards,
>> Huang, Ying
>> 
>> > And do not set to 1 if it is huge_zero_pmd but just goto out to
>> > return false.
>> >
>> > Thanks!
>> >
>> >> @@ -1655,14 +1655,9 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>> >>  	if (next - addr != HPAGE_PMD_SIZE) {
>> >>  		get_page(page);
>> >>  		spin_unlock(ptl);
>> >> -		if (split_huge_page(page)) {
>> >> -			put_page(page);
>> >> -			unlock_page(page);
>> >> -			goto out_unlocked;
>> >> -		}
>> >> +		split_huge_page(page);
>> >>  		put_page(page);
>> >>  		unlock_page(page);
>> >> -		ret = 1;
>> >>  		goto out_unlocked;
>> >>  	}
>> >>  
>> >> -- 
>> >> 2.8.1
>> >> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
  2016-06-19 23:54     ` Minchan Kim
@ 2016-06-20 15:48       ` Huang, Ying
  0 siblings, 0 replies; 9+ messages in thread
From: Huang, Ying @ 2016-06-20 15:48 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Huang, Ying, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

Minchan Kim <minchan@kernel.org> writes:

> On Fri, Jun 17, 2016 at 08:59:31AM -0700, Huang, Ying wrote:
>> Minchan Kim <minchan@kernel.org> writes:
>> 
>> > Hi,
>> >
>> > On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
>> >> From: Huang Ying <ying.huang@intel.com>
>> >> 
>> >> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
>> >> required.  In madvise_free_huge_pmd, if part pages of THP are discarded,
>> >> the THP will be split and fallback PTE operations should be used if
>> >> splitting succeeds.  But the original code will make fallback PTE
>> >> operations skipped, after splitting succeeds.  Fix that via make
>> >> madvise_free_huge_pmd return 0 after splitting successfully, so that the
>> >> fallback PTE operations will be done.
>> >
>> > You're right. Thanks!
>> >
>> >> 
>> >> Know issues: if my understanding were correct, return 1 from
>> >> madvise_free_huge_pmd means the following processing for the PMD should
>> >> be skipped, while return 0 means the following processing is still
>> >> needed.  So the function should return 0 only if the THP is split
>> >> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
>> >> after madvise_free_huge_pmd guarantee the following processing will be
>> >> skipped for huge PMD.  So current code can run properly.  But if my
>> >> understanding were correct, we can clean up return code of
>> >> madvise_free_huge_pmd accordingly.
>> >
>> > I like your clean up. Just a minor comment below.
>> >
>> >> 
>> >> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> >> ---
>> >>  mm/huge_memory.c | 7 +------
>> >>  1 file changed, 1 insertion(+), 6 deletions(-)
>> >> 
>> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> >> index 2ad52d5..64dc95d 100644
>> >> --- a/mm/huge_memory.c
>> >> +++ b/mm/huge_memory.c
>> >
>> > First of all, let's change ret from int to bool.
>> > And then, add description in the function entry.
>> >
>> > /*
>> >  * Return true if we do MADV_FREE successfully on entire pmd page.
>> >  * Otherwise, return false.
>> >  */
>> >
>> > And do not set to 1 if it is huge_zero_pmd but just goto out to
>> > return false.
>> 
>> Do you want to fold the cleanup with this patch or do that in another
>> patch?
>
> I prefer separating cleanup and bug fix so that we can send only bug
> fix patch to stable tree.

Sure.

Best Regards,
Huang, Ying

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-06-20 15:49 UTC | newest]

Thread overview: 9+ messages
2016-06-17  3:03 [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting Huang, Ying
2016-06-17  3:15 ` Huang, Ying
2016-06-17  5:31 ` Minchan Kim
2016-06-17 15:59   ` Huang, Ying
2016-06-19 23:54     ` Minchan Kim
2016-06-20 15:48       ` Huang, Ying
2016-06-17 19:45   ` Huang, Ying
2016-06-20  0:15     ` Minchan Kim
2016-06-20 15:48       ` Huang, Ying
