* [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
From: Huang, Ying @ 2016-06-17  3:03 UTC
  To: Minchan Kim
  Cc: Huang Ying, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

From: Huang Ying <ying.huang@intel.com>

madvise_free_huge_pmd should return 0 if the fallback PTE operations are
required.  In madvise_free_huge_pmd, if only some pages of the THP are
discarded, the THP will be split, and the fallback PTE operations should
be used if splitting succeeds.  But the original code skips the fallback
PTE operations after splitting succeeds.  Fix that by making
madvise_free_huge_pmd return 0 after splitting successfully, so that the
fallback PTE operations will be done.

Known issues: if my understanding is correct, returning 1 from
madvise_free_huge_pmd means the following processing for the PMD should
be skipped, while returning 0 means the following processing is still
needed.  So the function should return 0 only if the THP is split
successfully or the PMD is not trans huge.  But the pmd_trans_unstable
check after madvise_free_huge_pmd guarantees the following processing
will be skipped for a huge PMD, so the current code runs properly.  But
if my understanding is correct, we can clean up the return code of
madvise_free_huge_pmd accordingly.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
---
 mm/huge_memory.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2ad52d5..64dc95d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1655,14 +1655,9 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	if (next - addr != HPAGE_PMD_SIZE) {
 		get_page(page);
 		spin_unlock(ptl);
-		if (split_huge_page(page)) {
-			put_page(page);
-			unlock_page(page);
-			goto out_unlocked;
-		}
+		split_huge_page(page);
 		put_page(page);
 		unlock_page(page);
-		ret = 1;
 		goto out_unlocked;
 	}
 
--
2.8.1
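[Editorial sketch, not part of the posted patch.]  The effect of the
return value on the caller can be modelled in plain userspace C.  The
sketch below is a toy model under stated assumptions: every name in it
(model_pmd, model_free_huge_pmd, model_walk, MODEL_HPAGE_PMD_SIZE) is
hypothetical, and the caller logic only mirrors what the description
above says, namely that a non-zero return skips the PTE fallback and
that a pmd_trans_unstable check follows the call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for HPAGE_PMD_SIZE (2 MB assumed). */
#define MODEL_HPAGE_PMD_SIZE (2UL * 1024 * 1024)

/* Toy PMD state; this is not a kernel data structure. */
struct model_pmd {
	bool huge;		/* true while the PMD still maps a THP */
	bool split_fails;	/* force the modelled split to fail */
};

/*
 * Model of madvise_free_huge_pmd.  With buggy == true it returns 1
 * after a successful split (the behaviour the patch removes); with
 * buggy == false it returns 0 in that case.
 */
static int model_free_huge_pmd(struct model_pmd *pmd, unsigned long addr,
			       unsigned long next, bool buggy)
{
	assert(addr < next);
	if (next - addr != MODEL_HPAGE_PMD_SIZE) {
		if (pmd->split_fails)
			return 0;	/* split failed, PMD stays huge */
		pmd->huge = false;	/* split succeeded */
		return buggy ? 1 : 0;
	}
	return 1;			/* whole THP handled at PMD level */
}

/* Model of the caller: returns true if the PTE fallback would run. */
static bool model_walk(struct model_pmd *pmd, unsigned long addr,
		       unsigned long next, bool buggy)
{
	if (pmd->huge && model_free_huge_pmd(pmd, addr, next, buggy))
		return false;	/* non-zero return: skip the PTE loop */
	if (pmd->huge)		/* stands in for pmd_trans_unstable() */
		return false;
	return true;		/* PTE-level MADV_FREE would run here */
}
```

In this model a partial range with a successful split reaches the PTE
fallback only when buggy is false, which is the case the patch targets.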
* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
From: Huang, Ying @ 2016-06-17  3:15 UTC
  To: Huang, Ying
  Cc: Minchan Kim, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

"Huang, Ying" <ying.huang@intel.com> writes:

> From: Huang Ying <ying.huang@intel.com>
>
> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
> required.  In madvise_free_huge_pmd, if only some pages of the THP are
> discarded, the THP will be split, and the fallback PTE operations should
> be used if splitting succeeds.  But the original code skips the fallback
> PTE operations after splitting succeeds.  Fix that by making
> madvise_free_huge_pmd return 0 after splitting successfully, so that the
> fallback PTE operations will be done.
>
> Known issues: if my understanding is correct, returning 1 from
> madvise_free_huge_pmd means the following processing for the PMD should
> be skipped, while returning 0 means the following processing is still
> needed.  So the function should return 0 only if the THP is split
> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
> check after madvise_free_huge_pmd guarantees the following processing
> will be skipped for a huge PMD, so the current code runs properly.  But
> if my understanding is correct, we can clean up the return code of
> madvise_free_huge_pmd accordingly.
>

This patch was tested with the program below.  Given some memory
pressure, the value read back is 0 with the patch; that is, the THP is
split, the advised range is freed, and the zero page is mapped.  Without
the patch, the value read back is still 1; that is, the page is not
freed.

Best Regards,
Huang, Ying

------------------------------->
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>

#ifndef MADV_FREE
#define MADV_FREE	8	/* free pages only if memory pressure */
#endif

#define ONE_MB		(1024 * 1024)
#define THP_SIZE	(2 * ONE_MB)
#define THP_MASK	(THP_SIZE - 1)
#define MAP_SIZE	(16 * ONE_MB)

#define ERR_EXIT_ON(cond, msg)					\
	do {							\
		int __cond_in_macro = (cond);			\
		if (__cond_in_macro)				\
			error_exit(__cond_in_macro, (msg));	\
	} while (0)

void error_exit(int ret, const char *msg)
{
	fprintf(stderr, "Error: %s, ret : %d, error: %s\n",
		msg, ret, strerror(errno));
	exit(1);
}

void write_and_free(void)
{
	int ret;
	void *addr;
	int *pn;

	addr = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	/* mmap reports failure through MAP_FAILED, not through ret */
	ERR_EXIT_ON(addr == MAP_FAILED, "mmap");
	ret = madvise(addr, MAP_SIZE, MADV_HUGEPAGE);
	ERR_EXIT_ON(ret, "advise hugepage");
	/* pick a THP-aligned address inside the mapping */
	pn = (int *)(((unsigned long)addr + THP_SIZE) & ~THP_MASK);
	*pn = 1;	/* fault in one THP */
	printf("map 1 THP, hit any key to free part of it: ");
	fgetc(stdin);
	/* free only the first half of the THP */
	ret = madvise(pn, ONE_MB, MADV_FREE);
	ERR_EXIT_ON(ret, "advise free");
	printf("freed part of THP, hit any key to get the new value: ");
	fgetc(stdin);
	printf("val: %d\n", *pn);
}

int main(int argc, char *argv[])
{
	write_and_free();
	return 0;
}
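[Editorial sketch, not part of the original message.]  The test program
picks its target address with (addr + THP_SIZE) & ~THP_MASK.  The
snippet below isolates that arithmetic so it can be checked on its own.
The helper names first_thp_boundary_after and thp_fits are hypothetical,
and the 2 MB THP size is the value the test program assumes.  Note that
the expression always yields a boundary strictly above addr, so an
already aligned addr is bumped by a full THP.

```c
#include <assert.h>
#include <stdint.h>

#define ONE_MB		(1024UL * 1024UL)
#define THP_SIZE	(2UL * ONE_MB)
#define THP_MASK	(THP_SIZE - 1)

/* Same expression as in the test program: next 2 MB boundary above addr. */
static uintptr_t first_thp_boundary_after(uintptr_t addr)
{
	return (addr + THP_SIZE) & ~THP_MASK;
}

/*
 * Hypothetical helper: returns 1 if one whole THP starting at that
 * boundary still lies inside a mapping of map_size bytes at addr.
 */
static int thp_fits(uintptr_t addr, uintptr_t map_size)
{
	uintptr_t start = first_thp_boundary_after(addr);

	assert(map_size >= THP_SIZE);
	return start + THP_SIZE <= addr + map_size;
}
```

With the 16 MB mapping used in the test program, an aligned THP-sized
window always fits, whatever alignment mmap happens to return.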
* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
From: Minchan Kim @ 2016-06-17  5:31 UTC
  To: Huang, Ying
  Cc: Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Jerome Marchand,
	Andrea Arcangeli, Ebru Akagunduz, linux-mm, linux-kernel

Hi,

On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
> From: Huang Ying <ying.huang@intel.com>
>
> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
> required.  In madvise_free_huge_pmd, if only some pages of the THP are
> discarded, the THP will be split, and the fallback PTE operations should
> be used if splitting succeeds.  But the original code skips the fallback
> PTE operations after splitting succeeds.  Fix that by making
> madvise_free_huge_pmd return 0 after splitting successfully, so that the
> fallback PTE operations will be done.

You're right. Thanks!

>
> Known issues: if my understanding is correct, returning 1 from
> madvise_free_huge_pmd means the following processing for the PMD should
> be skipped, while returning 0 means the following processing is still
> needed.  So the function should return 0 only if the THP is split
> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
> check after madvise_free_huge_pmd guarantees the following processing
> will be skipped for a huge PMD, so the current code runs properly.  But
> if my understanding is correct, we can clean up the return code of
> madvise_free_huge_pmd accordingly.

I like your clean up. Just a minor comment below.

>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> ---
>  mm/huge_memory.c | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 2ad52d5..64dc95d 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c

First of all, let's change ret from int to bool.
And then, add a description at the function entry.

/*
 * Return true if we do MADV_FREE successfully on entire pmd page.
 * Otherwise, return false.
 */

And do not set ret to 1 if it is huge_zero_pmd; just goto out to
return false.

Thanks!

> @@ -1655,14 +1655,9 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  	if (next - addr != HPAGE_PMD_SIZE) {
>  		get_page(page);
>  		spin_unlock(ptl);
> -		if (split_huge_page(page)) {
> -			put_page(page);
> -			unlock_page(page);
> -			goto out_unlocked;
> -		}
> +		split_huge_page(page);
>  		put_page(page);
>  		unlock_page(page);
> -		ret = 1;
>  		goto out_unlocked;
>  	}
>
> --
> 2.8.1
* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
From: Huang, Ying @ 2016-06-17 15:59 UTC
  To: Minchan Kim
  Cc: Huang, Ying, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

Minchan Kim <minchan@kernel.org> writes:

> Hi,
>
> On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
>> From: Huang Ying <ying.huang@intel.com>
>>
>> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
>> required.  In madvise_free_huge_pmd, if only some pages of the THP are
>> discarded, the THP will be split, and the fallback PTE operations should
>> be used if splitting succeeds.  But the original code skips the fallback
>> PTE operations after splitting succeeds.  Fix that by making
>> madvise_free_huge_pmd return 0 after splitting successfully, so that the
>> fallback PTE operations will be done.
>
> You're right. Thanks!
>
>>
>> Known issues: if my understanding is correct, returning 1 from
>> madvise_free_huge_pmd means the following processing for the PMD should
>> be skipped, while returning 0 means the following processing is still
>> needed.  So the function should return 0 only if the THP is split
>> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
>> check after madvise_free_huge_pmd guarantees the following processing
>> will be skipped for a huge PMD, so the current code runs properly.  But
>> if my understanding is correct, we can clean up the return code of
>> madvise_free_huge_pmd accordingly.
>
> I like your clean up. Just a minor comment below.
>
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> ---
>>  mm/huge_memory.c | 7 +------
>>  1 file changed, 1 insertion(+), 6 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 2ad52d5..64dc95d 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>
> First of all, let's change ret from int to bool.
> And then, add a description at the function entry.
>
> /*
>  * Return true if we do MADV_FREE successfully on entire pmd page.
>  * Otherwise, return false.
>  */
>
> And do not set ret to 1 if it is huge_zero_pmd; just goto out to
> return false.

Do you want to fold the cleanup into this patch, or do that in another
patch?

Best Regards,
Huang, Ying

> Thanks!
>
>> @@ -1655,14 +1655,9 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>>  	if (next - addr != HPAGE_PMD_SIZE) {
>>  		get_page(page);
>>  		spin_unlock(ptl);
>> -		if (split_huge_page(page)) {
>> -			put_page(page);
>> -			unlock_page(page);
>> -			goto out_unlocked;
>> -		}
>> +		split_huge_page(page);
>>  		put_page(page);
>>  		unlock_page(page);
>> -		ret = 1;
>>  		goto out_unlocked;
>>  	}
>>
>> --
* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
From: Minchan Kim @ 2016-06-19 23:54 UTC
  To: Huang, Ying
  Cc: Minchan Kim, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

On Fri, Jun 17, 2016 at 08:59:31AM -0700, Huang, Ying wrote:
> Minchan Kim <minchan@kernel.org> writes:
>
> > Hi,
> >
> > On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
> >> From: Huang Ying <ying.huang@intel.com>
> >>
> >> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
> >> required.  In madvise_free_huge_pmd, if only some pages of the THP are
> >> discarded, the THP will be split, and the fallback PTE operations should
> >> be used if splitting succeeds.  But the original code skips the fallback
> >> PTE operations after splitting succeeds.  Fix that by making
> >> madvise_free_huge_pmd return 0 after splitting successfully, so that the
> >> fallback PTE operations will be done.
> >
> > You're right. Thanks!
> >
> >>
> >> Known issues: if my understanding is correct, returning 1 from
> >> madvise_free_huge_pmd means the following processing for the PMD should
> >> be skipped, while returning 0 means the following processing is still
> >> needed.  So the function should return 0 only if the THP is split
> >> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
> >> check after madvise_free_huge_pmd guarantees the following processing
> >> will be skipped for a huge PMD, so the current code runs properly.  But
> >> if my understanding is correct, we can clean up the return code of
> >> madvise_free_huge_pmd accordingly.
> >
> > I like your clean up. Just a minor comment below.
> >
> >>
> >> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> >> ---
> >>  mm/huge_memory.c | 7 +------
> >>  1 file changed, 1 insertion(+), 6 deletions(-)
> >>
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 2ad52d5..64dc95d 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >
> > First of all, let's change ret from int to bool.
> > And then, add a description at the function entry.
> >
> > /*
> >  * Return true if we do MADV_FREE successfully on entire pmd page.
> >  * Otherwise, return false.
> >  */
> >
> > And do not set ret to 1 if it is huge_zero_pmd; just goto out to
> > return false.
>
> Do you want to fold the cleanup into this patch, or do that in another
> patch?

I prefer separating the cleanup and the bug fix so that we can send only
the bug fix patch to the stable tree.

Thanks.
* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
From: Huang, Ying @ 2016-06-20 15:48 UTC
  To: Minchan Kim
  Cc: Huang, Ying, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

Minchan Kim <minchan@kernel.org> writes:

> On Fri, Jun 17, 2016 at 08:59:31AM -0700, Huang, Ying wrote:
>> Minchan Kim <minchan@kernel.org> writes:
>>
>> > Hi,
>> >
>> > On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
>> >> From: Huang Ying <ying.huang@intel.com>
>> >>
>> >> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
>> >> required.  In madvise_free_huge_pmd, if only some pages of the THP are
>> >> discarded, the THP will be split, and the fallback PTE operations should
>> >> be used if splitting succeeds.  But the original code skips the fallback
>> >> PTE operations after splitting succeeds.  Fix that by making
>> >> madvise_free_huge_pmd return 0 after splitting successfully, so that the
>> >> fallback PTE operations will be done.
>> >
>> > You're right. Thanks!
>> >
>> >>
>> >> Known issues: if my understanding is correct, returning 1 from
>> >> madvise_free_huge_pmd means the following processing for the PMD should
>> >> be skipped, while returning 0 means the following processing is still
>> >> needed.  So the function should return 0 only if the THP is split
>> >> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
>> >> check after madvise_free_huge_pmd guarantees the following processing
>> >> will be skipped for a huge PMD, so the current code runs properly.  But
>> >> if my understanding is correct, we can clean up the return code of
>> >> madvise_free_huge_pmd accordingly.
>> >
>> > I like your clean up. Just a minor comment below.
>> >
>> >>
>> >> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> >> ---
>> >>  mm/huge_memory.c | 7 +------
>> >>  1 file changed, 1 insertion(+), 6 deletions(-)
>> >>
>> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> >> index 2ad52d5..64dc95d 100644
>> >> --- a/mm/huge_memory.c
>> >> +++ b/mm/huge_memory.c
>> >
>> > First of all, let's change ret from int to bool.
>> > And then, add a description at the function entry.
>> >
>> > /*
>> >  * Return true if we do MADV_FREE successfully on entire pmd page.
>> >  * Otherwise, return false.
>> >  */
>> >
>> > And do not set ret to 1 if it is huge_zero_pmd; just goto out to
>> > return false.
>>
>> Do you want to fold the cleanup into this patch, or do that in another
>> patch?
>
> I prefer separating the cleanup and the bug fix so that we can send only
> the bug fix patch to the stable tree.

Sure.

Best Regards,
Huang, Ying
* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
From: Huang, Ying @ 2016-06-17 19:45 UTC
  To: Minchan Kim
  Cc: Huang, Ying, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

Minchan Kim <minchan@kernel.org> writes:

> Hi,
>
> On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
>> From: Huang Ying <ying.huang@intel.com>
>>
>> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
>> required.  In madvise_free_huge_pmd, if only some pages of the THP are
>> discarded, the THP will be split, and the fallback PTE operations should
>> be used if splitting succeeds.  But the original code skips the fallback
>> PTE operations after splitting succeeds.  Fix that by making
>> madvise_free_huge_pmd return 0 after splitting successfully, so that the
>> fallback PTE operations will be done.
>
> You're right. Thanks!
>
>>
>> Known issues: if my understanding is correct, returning 1 from
>> madvise_free_huge_pmd means the following processing for the PMD should
>> be skipped, while returning 0 means the following processing is still
>> needed.  So the function should return 0 only if the THP is split
>> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
>> check after madvise_free_huge_pmd guarantees the following processing
>> will be skipped for a huge PMD, so the current code runs properly.  But
>> if my understanding is correct, we can clean up the return code of
>> madvise_free_huge_pmd accordingly.
>
> I like your clean up. Just a minor comment below.
>
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> ---
>>  mm/huge_memory.c | 7 +------
>>  1 file changed, 1 insertion(+), 6 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 2ad52d5..64dc95d 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>
> First of all, let's change ret from int to bool.
> And then, add a description at the function entry.

Yes. bool looks better than int.

> /*
>  * Return true if we do MADV_FREE successfully on entire pmd page.
>  * Otherwise, return false.
>  */

This way, we need to return false if we fail to split the huge page,
which will cause an unnecessary pmd_trans_unstable check.  How about
changing the comment to

/*
 * Return true if we finished processing the entire pmd page and needn't
 * fall back to pte processing.  Otherwise, return false.
 */

Best Regards,
Huang, Ying

> And do not set ret to 1 if it is huge_zero_pmd; just goto out to
> return false.
>
> Thanks!
>
>> @@ -1655,14 +1655,9 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>>  	if (next - addr != HPAGE_PMD_SIZE) {
>>  		get_page(page);
>>  		spin_unlock(ptl);
>> -		if (split_huge_page(page)) {
>> -			put_page(page);
>> -			unlock_page(page);
>> -			goto out_unlocked;
>> -		}
>> +		split_huge_page(page);
>>  		put_page(page);
>>  		unlock_page(page);
>> -		ret = 1;
>>  		goto out_unlocked;
>>  	}
>>
>> --
>> 2.8.1
* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
From: Minchan Kim @ 2016-06-20  0:15 UTC
  To: Huang, Ying
  Cc: Minchan Kim, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

On Fri, Jun 17, 2016 at 12:45:43PM -0700, Huang, Ying wrote:
> Minchan Kim <minchan@kernel.org> writes:
>
> > Hi,
> >
> > On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
> >> From: Huang Ying <ying.huang@intel.com>
> >>
> >> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
> >> required.  In madvise_free_huge_pmd, if only some pages of the THP are
> >> discarded, the THP will be split, and the fallback PTE operations should
> >> be used if splitting succeeds.  But the original code skips the fallback
> >> PTE operations after splitting succeeds.  Fix that by making
> >> madvise_free_huge_pmd return 0 after splitting successfully, so that the
> >> fallback PTE operations will be done.
> >
> > You're right. Thanks!
> >
> >>
> >> Known issues: if my understanding is correct, returning 1 from
> >> madvise_free_huge_pmd means the following processing for the PMD should
> >> be skipped, while returning 0 means the following processing is still
> >> needed.  So the function should return 0 only if the THP is split
> >> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
> >> check after madvise_free_huge_pmd guarantees the following processing
> >> will be skipped for a huge PMD, so the current code runs properly.  But
> >> if my understanding is correct, we can clean up the return code of
> >> madvise_free_huge_pmd accordingly.
> >
> > I like your clean up. Just a minor comment below.
> >
> >>
> >> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> >> ---
> >>  mm/huge_memory.c | 7 +------
> >>  1 file changed, 1 insertion(+), 6 deletions(-)
> >>
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 2ad52d5..64dc95d 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >
> > First of all, let's change ret from int to bool.
> > And then, add a description at the function entry.
>
> Yes. bool looks better than int.
>
> > /*
> >  * Return true if we do MADV_FREE successfully on entire pmd page.
> >  * Otherwise, return false.
> >  */
>
> This way, we need to return false if we fail to split the huge page,
> which will cause an unnecessary pmd_trans_unstable check.  How about
> changing the comment to

I focused on the function name, "madvise_free_huge_pmd".  IOW, the
function should free the huge pmd page.  If it is successful, we are
done.  Otherwise, the next routines should handle it.

If it fails to split, pmd_trans_unstable will check it and return.
I don't think that is a heavy enough operation to affect performance,
so rather than making the function fast, I wanted to make it simple by
following the function name.

>
> /*
>  * Return true if we finished processing the entire pmd page and needn't
>  * fall back to pte processing.  Otherwise, return false.
>  */
>
> Best Regards,
> Huang, Ying
>
> > And do not set ret to 1 if it is huge_zero_pmd; just goto out to
> > return false.
> >
> > Thanks!
> >
> >> @@ -1655,14 +1655,9 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
> >>  	if (next - addr != HPAGE_PMD_SIZE) {
> >>  		get_page(page);
> >>  		spin_unlock(ptl);
> >> -		if (split_huge_page(page)) {
> >> -			put_page(page);
> >> -			unlock_page(page);
> >> -			goto out_unlocked;
> >> -		}
> >> +		split_huge_page(page);
> >>  		put_page(page);
> >>  		unlock_page(page);
> >> -		ret = 1;
> >>  		goto out_unlocked;
> >>  	}
> >>
> >> --
> >> 2.8.1
* Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting
From: Huang, Ying @ 2016-06-20 15:48 UTC
  To: Minchan Kim
  Cc: Huang, Ying, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka,
	Jerome Marchand, Andrea Arcangeli, Ebru Akagunduz, linux-mm,
	linux-kernel

Minchan Kim <minchan@kernel.org> writes:

> On Fri, Jun 17, 2016 at 12:45:43PM -0700, Huang, Ying wrote:
>> Minchan Kim <minchan@kernel.org> writes:
>>
>> > Hi,
>> >
>> > On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
>> >> From: Huang Ying <ying.huang@intel.com>
>> >>
>> >> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
>> >> required.  In madvise_free_huge_pmd, if only some pages of the THP are
>> >> discarded, the THP will be split, and the fallback PTE operations should
>> >> be used if splitting succeeds.  But the original code skips the fallback
>> >> PTE operations after splitting succeeds.  Fix that by making
>> >> madvise_free_huge_pmd return 0 after splitting successfully, so that the
>> >> fallback PTE operations will be done.
>> >
>> > You're right. Thanks!
>> >
>> >>
>> >> Known issues: if my understanding is correct, returning 1 from
>> >> madvise_free_huge_pmd means the following processing for the PMD should
>> >> be skipped, while returning 0 means the following processing is still
>> >> needed.  So the function should return 0 only if the THP is split
>> >> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
>> >> check after madvise_free_huge_pmd guarantees the following processing
>> >> will be skipped for a huge PMD, so the current code runs properly.  But
>> >> if my understanding is correct, we can clean up the return code of
>> >> madvise_free_huge_pmd accordingly.
>> >
>> > I like your clean up. Just a minor comment below.
>> >
>> >>
>> >> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> >> ---
>> >>  mm/huge_memory.c | 7 +------
>> >>  1 file changed, 1 insertion(+), 6 deletions(-)
>> >>
>> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> >> index 2ad52d5..64dc95d 100644
>> >> --- a/mm/huge_memory.c
>> >> +++ b/mm/huge_memory.c
>> >
>> > First of all, let's change ret from int to bool.
>> > And then, add a description at the function entry.
>>
>> Yes. bool looks better than int.
>>
>> > /*
>> >  * Return true if we do MADV_FREE successfully on entire pmd page.
>> >  * Otherwise, return false.
>> >  */
>>
>> This way, we need to return false if we fail to split the huge page,
>> which will cause an unnecessary pmd_trans_unstable check.  How about
>> changing the comment to
>
> I focused on the function name, "madvise_free_huge_pmd".  IOW, the
> function should free the huge pmd page.  If it is successful, we are
> done.  Otherwise, the next routines should handle it.
>
> If it fails to split, pmd_trans_unstable will check it and return.
> I don't think that is a heavy enough operation to affect performance,
> so rather than making the function fast, I wanted to make it simple by
> following the function name.

Reasonable.  Will do that.

Best Regards,
Huang, Ying

>>
>> /*
>>  * Return true if we finished processing the entire pmd page and needn't
>>  * fall back to pte processing.  Otherwise, return false.
>>  */
>>
>> Best Regards,
>> Huang, Ying
>>
>> > And do not set ret to 1 if it is huge_zero_pmd; just goto out to
>> > return false.
>> >
>> > Thanks!
>> >
>> >> @@ -1655,14 +1655,9 @@ int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>> >>  	if (next - addr != HPAGE_PMD_SIZE) {
>> >>  		get_page(page);
>> >>  		spin_unlock(ptl);
>> >> -		if (split_huge_page(page)) {
>> >> -			put_page(page);
>> >> -			unlock_page(page);
>> >> -			goto out_unlocked;
>> >> -		}
>> >> +		split_huge_page(page);
>> >>  		put_page(page);
>> >>  		unlock_page(page);
>> >> -		ret = 1;
>> >>  		goto out_unlocked;
>> >>  	}
>> >>
>> >> --
>> >> 2.8.1
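[Editorial sketch, not part of the thread.]  The return contract agreed
on above (bool return, true only when MADV_FREE was done on the entire
pmd page, false for huge_zero_pmd and for a failed or partial-range
split) can be written down as a small userspace model.  This is only an
illustration under those assumptions: the follow-up cleanup patch is not
shown in this thread, and all names below (model_pmd_kind,
model_free_huge_pmd_bool, model_pmd_unstable) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for HPAGE_PMD_SIZE (2 MB assumed). */
#define MODEL_HPAGE_PMD_SIZE (2UL * 1024 * 1024)

/* Toy classification of what a PMD currently maps. */
enum model_pmd_kind {
	MODEL_PMD_PTE_TABLE,	/* regular page table, PTE loop applies */
	MODEL_PMD_HUGE_ZERO,	/* huge zero page */
	MODEL_PMD_THP,		/* ordinary transparent huge page */
};

/*
 * Return true if we do MADV_FREE successfully on entire pmd page.
 * Otherwise, return false.
 */
static bool model_free_huge_pmd_bool(enum model_pmd_kind *kind,
				     unsigned long addr, unsigned long next,
				     bool split_ok)
{
	assert(addr < next);
	if (*kind != MODEL_PMD_THP)
		return false;	/* huge zero pmd or not huge: nothing freed */
	if (next - addr != MODEL_HPAGE_PMD_SIZE) {
		if (split_ok)
			*kind = MODEL_PMD_PTE_TABLE;
		return false;	/* caller decides what happens next */
	}
	return true;		/* whole THP handled at PMD level */
}

/* Stand-in for pmd_trans_unstable(): true while the PMD is still huge. */
static bool model_pmd_unstable(enum model_pmd_kind kind)
{
	return kind != MODEL_PMD_PTE_TABLE;
}
```

In this model a failed split returns false and leaves the PMD huge, so
the stand-in for pmd_trans_unstable is what stops the PTE fallback, as
described in the discussion above.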
Thread overview: 9+ messages
2016-06-17  3:03 [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting Huang, Ying
2016-06-17  3:15 ` Huang, Ying
2016-06-17  5:31 ` Minchan Kim
2016-06-17 15:59   ` Huang, Ying
2016-06-19 23:54     ` Minchan Kim
2016-06-20 15:48       ` Huang, Ying
2016-06-17 19:45   ` Huang, Ying
2016-06-20  0:15     ` Minchan Kim
2016-06-20 15:48       ` Huang, Ying