Linux-mm Archive on lore.kernel.org
* [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it
@ 2019-11-28  1:03 Wei Yang
  2019-11-28  1:03 ` [PATCH 2/2] mm/page_vma_mapped: page table boundary is already guaranteed Wei Yang
  2019-11-28  8:32 ` [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it Kirill A. Shutemov
  0 siblings, 2 replies; 15+ messages in thread
From: Wei Yang @ 2019-11-28  1:03 UTC (permalink / raw)
  To: akpm; +Cc: kirill.shutemov, linux-mm, linux-kernel, Wei Yang

At this point, we are sure the page is PageTransHuge, which means
hpage_nr_pages() returns HPAGE_PMD_NR.

It is therefore safe to use PMD_SIZE instead of calculating it.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
---
 mm/page_vma_mapped.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index eff4b4520c8d..76e03650a3ab 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -223,7 +223,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 			if (pvmw->address >= pvmw->vma->vm_end ||
 			    pvmw->address >=
 					__vma_address(pvmw->page, pvmw->vma) +
-					hpage_nr_pages(pvmw->page) * PAGE_SIZE)
+					PMD_SIZE)
 				return not_found(pvmw);
 			/* Did we cross page table boundary? */
 			if (pvmw->address % PMD_SIZE == 0) {
-- 
2.17.1
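[Editorial note] The equivalence the patch relies on, hpage_nr_pages(page) * PAGE_SIZE == PMD_SIZE for a PMD-mapped THP, can be sanity-checked with hypothetical x86-64 constants. The macros below are illustrative stand-ins, not the kernel's real headers:

```c
#include <assert.h>

/* Hypothetical x86-64 values; stand-ins for the kernel's definitions. */
#define PAGE_SHIFT   12
#define PAGE_SIZE    (1UL << PAGE_SHIFT)                /* 4 KiB */
#define PMD_SHIFT    21
#define PMD_SIZE     (1UL << PMD_SHIFT)                 /* 2 MiB */
#define HPAGE_PMD_NR (1UL << (PMD_SHIFT - PAGE_SHIFT))  /* 512   */

/* What the old code computed for a PMD-mapped THP. */
static unsigned long thp_mapped_bytes(void)
{
	return HPAGE_PMD_NR * PAGE_SIZE;
}
```

With these values the old expression and PMD_SIZE are the same 2MiB constant, which is the whole content of the cleanup.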



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 2/2] mm/page_vma_mapped: page table boundary is already guaranteed
  2019-11-28  1:03 [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it Wei Yang
@ 2019-11-28  1:03 ` Wei Yang
  2019-11-28  8:31   ` Kirill A. Shutemov
  2019-11-28  8:32 ` [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it Kirill A. Shutemov
  1 sibling, 1 reply; 15+ messages in thread
From: Wei Yang @ 2019-11-28  1:03 UTC (permalink / raw)
  To: akpm; +Cc: kirill.shutemov, linux-mm, linux-kernel, Wei Yang

The check here is meant to guarantee that the pvmw->address iteration
stays within one page table; to be specific, the address range should
fall within one PMD_SIZE.

If my understanding is correct, this is already guaranteed by the
preceding check:

    address >= __vma_address(page, vma) + PMD_SIZE

The boundary check here therefore seems unnecessary.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>

---
Test:
   a kernel build test of more than 48 hours shows this code path is
   never exercised.
---
 mm/page_vma_mapped.c | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 76e03650a3ab..25aada8a1271 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -163,7 +163,6 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 			return not_found(pvmw);
 		return true;
 	}
-restart:
 	pgd = pgd_offset(mm, pvmw->address);
 	if (!pgd_present(*pgd))
 		return false;
@@ -225,17 +224,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 					__vma_address(pvmw->page, pvmw->vma) +
 					PMD_SIZE)
 				return not_found(pvmw);
-			/* Did we cross page table boundary? */
-			if (pvmw->address % PMD_SIZE == 0) {
-				pte_unmap(pvmw->pte);
-				if (pvmw->ptl) {
-					spin_unlock(pvmw->ptl);
-					pvmw->ptl = NULL;
-				}
-				goto restart;
-			} else {
-				pvmw->pte++;
-			}
+			pvmw->pte++;
 		} while (pte_none(*pvmw->pte));
 
 		if (!pvmw->ptl) {
-- 
2.17.1




* Re: [PATCH 2/2] mm/page_vma_mapped: page table boundary is already guaranteed
  2019-11-28  1:03 ` [PATCH 2/2] mm/page_vma_mapped: page table boundary is already guaranteed Wei Yang
@ 2019-11-28  8:31   ` Kirill A. Shutemov
  2019-11-28 21:09     ` Wei Yang
  0 siblings, 1 reply; 15+ messages in thread
From: Kirill A. Shutemov @ 2019-11-28  8:31 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, kirill.shutemov, linux-mm, linux-kernel

On Thu, Nov 28, 2019 at 09:03:21AM +0800, Wei Yang wrote:
> The check here is to guarantee pvmw->address iteration is limited in one
> page table boundary. To be specific, here the address range should be in
> one PMD_SIZE.
> 
> If my understanding is correct, this check is already done in the above
> check:
> 
>     address >= __vma_address(page, vma) + PMD_SIZE
> 
> The boundary check here seems not necessary.
> 
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>

NAK.

THP can be mapped with PTE not aligned to PMD_SIZE. Consider mremap().

> Test:
>    more than 48 hours kernel build test shows this code is not touched.

Not an argument. I doubt mremap(2) is ever called in a kernel build
workload.

-- 
 Kirill A. Shutemov
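[Editorial note] The objection can be illustrated numerically: once a THP is mapped at a PMD-unaligned address, the walk crosses a page-table boundary well before the __vma_address() + PMD_SIZE end check fires, so the restart logic that patch 2 removes still matters. A minimal sketch with a hypothetical 2MiB PMD and made-up addresses:

```c
#include <assert.h>

#define PAGE_SIZE 0x1000UL
#define PMD_SIZE  0x200000UL	/* hypothetical 2MiB PMD */

/*
 * Count how many PTE steps the walk can take from 'start' before it
 * crosses into a new page table, given the patch-1 end check
 * (start + PMD_SIZE) as the only other limit.
 */
static unsigned long steps_before_restart(unsigned long start)
{
	unsigned long addr = start, steps = 0;

	while (addr < start + PMD_SIZE) {	/* end-of-THP check */
		addr += PAGE_SIZE;
		steps++;
		if (addr % PMD_SIZE == 0)	/* the check patch 2 removes */
			break;
	}
	return steps;
}
```

For a PMD-aligned mapping the boundary coincides with the end check (512 steps), but for a mapping starting at, say, 0x7ff000 the boundary arrives after a single step, long before the end check would stop the loop.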



* Re: [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it
  2019-11-28  1:03 [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it Wei Yang
  2019-11-28  1:03 ` [PATCH 2/2] mm/page_vma_mapped: page table boundary is already guaranteed Wei Yang
@ 2019-11-28  8:32 ` Kirill A. Shutemov
  2019-11-28 21:22   ` Wei Yang
  1 sibling, 1 reply; 15+ messages in thread
From: Kirill A. Shutemov @ 2019-11-28  8:32 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, kirill.shutemov, linux-mm, linux-kernel

On Thu, Nov 28, 2019 at 09:03:20AM +0800, Wei Yang wrote:
> At this point, we are sure page is PageTransHuge, which means
> hpage_nr_pages is HPAGE_PMD_NR.
> 
> This is safe to use PMD_SIZE instead of calculating it.
> 
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> ---
>  mm/page_vma_mapped.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index eff4b4520c8d..76e03650a3ab 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -223,7 +223,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>  			if (pvmw->address >= pvmw->vma->vm_end ||
>  			    pvmw->address >=
>  					__vma_address(pvmw->page, pvmw->vma) +
> -					hpage_nr_pages(pvmw->page) * PAGE_SIZE)
> +					PMD_SIZE)
>  				return not_found(pvmw);
>  			/* Did we cross page table boundary? */
>  			if (pvmw->address % PMD_SIZE == 0) {

It is a dubious cleanup. Maybe use page_size(pvmw->page) instead? That
will not break if we ever get PUD THP pages.

-- 
 Kirill A. Shutemov



* Re: [PATCH 2/2] mm/page_vma_mapped: page table boundary is already guaranteed
  2019-11-28  8:31   ` Kirill A. Shutemov
@ 2019-11-28 21:09     ` Wei Yang
  2019-11-28 22:39       ` Matthew Wilcox
  0 siblings, 1 reply; 15+ messages in thread
From: Wei Yang @ 2019-11-28 21:09 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Wei Yang, akpm, kirill.shutemov, linux-mm, linux-kernel

On Thu, Nov 28, 2019 at 11:31:43AM +0300, Kirill A. Shutemov wrote:
>On Thu, Nov 28, 2019 at 09:03:21AM +0800, Wei Yang wrote:
>> The check here is to guarantee pvmw->address iteration is limited in one
>> page table boundary. To be specific, here the address range should be in
>> one PMD_SIZE.
>> 
>> If my understanding is correct, this check is already done in the above
>> check:
>> 
>>     address >= __vma_address(page, vma) + PMD_SIZE
>> 
>> The boundary check here seems not necessary.
>> 
>> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>
>NAK.
>
>THP can be mapped with PTE not aligned to PMD_SIZE. Consider mremap().
>

Hi, Kirill

Thanks for your comment during the Thanksgiving holiday. Happy holiday :-)

I didn't think about this case before; thanks for the reminder. I then
tried to understand your concern.

mremap() can expand or shrink a memory mapping; in this case, shrinking
is probably the concern. Since pvmw->page and pvmw->vma are not changed
in the loop, the case you mention may be that pvmw->page is the head of
a THP but part of it has been unmapped.

This means the following condition stands:

    vma->vm_start <= vma_address(page) 
    vma->vm_end <=   vma_address(page) + page_size(page)

Since we have already checked the address against vm_end, do you think
this case is covered as well?

I am not sure my understanding is correct; I look forward to your comments.

>> Test:
>>    more than 48 hours kernel build test shows this code is not touched.
>
>Not an argument. I doubt mremap(2) is ever called in kernel build
>workload.
>
>-- 
> Kirill A. Shutemov

-- 
Wei Yang
Help you, Help me



* Re: [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it
  2019-11-28  8:32 ` [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it Kirill A. Shutemov
@ 2019-11-28 21:22   ` Wei Yang
  2019-12-02  8:03     ` Kirill A. Shutemov
  0 siblings, 1 reply; 15+ messages in thread
From: Wei Yang @ 2019-11-28 21:22 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Wei Yang, akpm, kirill.shutemov, linux-mm, linux-kernel

On Thu, Nov 28, 2019 at 11:32:55AM +0300, Kirill A. Shutemov wrote:
>On Thu, Nov 28, 2019 at 09:03:20AM +0800, Wei Yang wrote:
>> At this point, we are sure page is PageTransHuge, which means
>> hpage_nr_pages is HPAGE_PMD_NR.
>> 
>> This is safe to use PMD_SIZE instead of calculating it.
>> 
>> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> ---
>>  mm/page_vma_mapped.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index eff4b4520c8d..76e03650a3ab 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -223,7 +223,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>>  			if (pvmw->address >= pvmw->vma->vm_end ||
>>  			    pvmw->address >=
>>  					__vma_address(pvmw->page, pvmw->vma) +
>> -					hpage_nr_pages(pvmw->page) * PAGE_SIZE)
>> +					PMD_SIZE)
>>  				return not_found(pvmw);
>>  			/* Did we cross page table boundary? */
>>  			if (pvmw->address % PMD_SIZE == 0) {
>
>It is dubious cleanup. Maybe page_size(pvmw->page) instead? It will not
>break if we ever get PUD THP pages.
>

Thanks for your comment.

I took another look at the code and found I may have missed something.

I found that we support PUD THP pages, while hpage_nr_pages() just
returns HPAGE_PMD_NR for PageTransHuge. Why is it not possible for it
to return the PUD number?

Searching the kernel tree, one implementation of a PUD_SIZE fault is
dev_dax_huge_fault(). If a page fault happens there, would this "if"
break out of the loop too early?

>-- 
> Kirill A. Shutemov

-- 
Wei Yang
Help you, Help me



* Re: [PATCH 2/2] mm/page_vma_mapped: page table boundary is already guaranteed
  2019-11-28 21:09     ` Wei Yang
@ 2019-11-28 22:39       ` Matthew Wilcox
  2019-11-29  8:30         ` Wei Yang
  0 siblings, 1 reply; 15+ messages in thread
From: Matthew Wilcox @ 2019-11-28 22:39 UTC (permalink / raw)
  To: Wei Yang
  Cc: Kirill A. Shutemov, Wei Yang, akpm, kirill.shutemov, linux-mm,
	linux-kernel

On Thu, Nov 28, 2019 at 09:09:45PM +0000, Wei Yang wrote:
> On Thu, Nov 28, 2019 at 11:31:43AM +0300, Kirill A. Shutemov wrote:
> >On Thu, Nov 28, 2019 at 09:03:21AM +0800, Wei Yang wrote:
> >> The check here is to guarantee pvmw->address iteration is limited in one
> >> page table boundary. To be specific, here the address range should be in
> >> one PMD_SIZE.
> >> 
> >> If my understanding is correct, this check is already done in the above
> >> check:
> >> 
> >>     address >= __vma_address(page, vma) + PMD_SIZE
> >> 
> >> The boundary check here seems not necessary.
> >> 
> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >
> >NAK.
> >
> >THP can be mapped with PTE not aligned to PMD_SIZE. Consider mremap().
> >
> 
> Hi, Kirill
> 
> Thanks for your comment during Thanks Giving Day. Happy holiday:-)
> 
> I didn't think about this case before, thanks for reminding. Then I tried to
> understand your concern.
> 
> mremap() would expand/shrink a memory mapping. In this case, probably shrink
> is in concern. Since pvmw->page and pvmw->vma are not changed in the loop, the
> case you mentioned maybe pvmw->page is the head of a THP but part of it is
> unmapped.

mremap() can also move a mapping, see MREMAP_FIXED.

> This means the following condition stands:
> 
>     vma->vm_start <= vma_address(page) 
>     vma->vm_end <=   vma_address(page) + page_size(page)
> 
> Since we have checked address with vm_end, do you think this case is also
> guarded?
> 
> Not sure my understanding is correct, look forward your comments.
> 
> >> Test:
> >>    more than 48 hours kernel build test shows this code is not touched.
> >
> >Not an argument. I doubt mremap(2) is ever called in kernel build
> >workload.
> >
> >-- 
> > Kirill A. Shutemov
> 
> -- 
> Wei Yang
> Help you, Help me
> 



* Re: [PATCH 2/2] mm/page_vma_mapped: page table boundary is already guaranteed
  2019-11-28 22:39       ` Matthew Wilcox
@ 2019-11-29  8:30         ` Wei Yang
  2019-11-29 11:18           ` Matthew Wilcox
  0 siblings, 1 reply; 15+ messages in thread
From: Wei Yang @ 2019-11-29  8:30 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Wei Yang, Kirill A. Shutemov, Wei Yang, akpm, kirill.shutemov,
	linux-mm, linux-kernel

On Thu, Nov 28, 2019 at 02:39:04PM -0800, Matthew Wilcox wrote:
>On Thu, Nov 28, 2019 at 09:09:45PM +0000, Wei Yang wrote:
>> On Thu, Nov 28, 2019 at 11:31:43AM +0300, Kirill A. Shutemov wrote:
>> >On Thu, Nov 28, 2019 at 09:03:21AM +0800, Wei Yang wrote:
>> >> The check here is to guarantee pvmw->address iteration is limited in one
>> >> page table boundary. To be specific, here the address range should be in
>> >> one PMD_SIZE.
>> >> 
>> >> If my understanding is correct, this check is already done in the above
>> >> check:
>> >> 
>> >>     address >= __vma_address(page, vma) + PMD_SIZE
>> >> 
>> >> The boundary check here seems not necessary.
>> >> 
>> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> >
>> >NAK.
>> >
>> >THP can be mapped with PTE not aligned to PMD_SIZE. Consider mremap().
>> >
>> 
>> Hi, Kirill
>> 
>> Thanks for your comment during Thanks Giving Day. Happy holiday:-)
>> 
>> I didn't think about this case before, thanks for reminding. Then I tried to
>> understand your concern.
>> 
>> mremap() would expand/shrink a memory mapping. In this case, probably shrink
>> is in concern. Since pvmw->page and pvmw->vma are not changed in the loop, the
>> case you mentioned maybe pvmw->page is the head of a THP but part of it is
>> unmapped.
>
>mremap() can also move a mapping, see MREMAP_FIXED.

Hi, Matthew

Thanks for your comment.

I took a look at the MREMAP_FIXED case, but it is still not clear to me
when it falls into the situation Kirill mentioned.

Per my understanding, moving a mapping is done in two steps:

    * unmap some range in the old vma if old_len >= new_len
    * move the vma

If the length doesn't change, we expect to get a "copy" of the old
vma. This doesn't change the THP PMD mapping.

So the change still happens in the unmap step, if I am correct.

Would you mind giving me more of a hint about the case in which we would
hit the situation Kirill mentioned?

>
>> This means the following condition stands:
>> 
>>     vma->vm_start <= vma_address(page) 
>>     vma->vm_end <=   vma_address(page) + page_size(page)
>> 
>> Since we have checked address with vm_end, do you think this case is also
>> guarded?
>> 
>> Not sure my understanding is correct, look forward your comments.
>> 
>> >> Test:
>> >>    more than 48 hours kernel build test shows this code is not touched.
>> >
>> >Not an argument. I doubt mremap(2) is ever called in kernel build
>> >workload.
>> >
>> >-- 
>> > Kirill A. Shutemov
>> 
>> -- 
>> Wei Yang
>> Help you, Help me
>> 

-- 
Wei Yang
Help you, Help me



* Re: [PATCH 2/2] mm/page_vma_mapped: page table boundary is already guaranteed
  2019-11-29  8:30         ` Wei Yang
@ 2019-11-29 11:18           ` Matthew Wilcox
  2019-12-02  6:53             ` Wei Yang
  0 siblings, 1 reply; 15+ messages in thread
From: Matthew Wilcox @ 2019-11-29 11:18 UTC (permalink / raw)
  To: Wei Yang
  Cc: Wei Yang, Kirill A. Shutemov, akpm, kirill.shutemov, linux-mm,
	linux-kernel

On Fri, Nov 29, 2019 at 04:30:02PM +0800, Wei Yang wrote:
> On Thu, Nov 28, 2019 at 02:39:04PM -0800, Matthew Wilcox wrote:
> >On Thu, Nov 28, 2019 at 09:09:45PM +0000, Wei Yang wrote:
> >> On Thu, Nov 28, 2019 at 11:31:43AM +0300, Kirill A. Shutemov wrote:
> >> >On Thu, Nov 28, 2019 at 09:03:21AM +0800, Wei Yang wrote:
> >> >> The check here is to guarantee pvmw->address iteration is limited in one
> >> >> page table boundary. To be specific, here the address range should be in
> >> >> one PMD_SIZE.
> >> >> 
> >> >> If my understanding is correct, this check is already done in the above
> >> >> check:
> >> >> 
> >> >>     address >= __vma_address(page, vma) + PMD_SIZE
> >> >> 
> >> >> The boundary check here seems not necessary.
> >> >> 
> >> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >> >
> >> >NAK.
> >> >
> >> >THP can be mapped with PTE not aligned to PMD_SIZE. Consider mremap().
> >> >
> >> 
> >> Hi, Kirill
> >> 
> >> Thanks for your comment during Thanks Giving Day. Happy holiday:-)
> >> 
> >> I didn't think about this case before, thanks for reminding. Then I tried to
> >> understand your concern.
> >> 
> >> mremap() would expand/shrink a memory mapping. In this case, probably shrink
> >> is in concern. Since pvmw->page and pvmw->vma are not changed in the loop, the
> >> case you mentioned maybe pvmw->page is the head of a THP but part of it is
> >> unmapped.
> >
> >mremap() can also move a mapping, see MREMAP_FIXED.
> 
> Hi, Matthew
> 
> Thanks for your comment.
> 
> I took a look into the MREMAP_FIXED case, but still not clear in which case it
> fall into the situation Kirill mentioned.
> 
> Per my understanding, move mapping is achieved in two steps:
> 
>     * unmap some range in old vma if old_len >= new_len
>     * move vma
> 
> If the length doesn't change, we are expecting to have the "copy" of old
> vma. This doesn't change the THP PMD mapping.
> 
> So the change still happens in the unmap step, if I am correct.
> 
> Would you mind giving me more hint on the case when we would have the
> situation as Kirill mentioned?

Set up a THP mapping.
Move it to an address which is no longer 2MB aligned.
Unmap it.




* Re: [PATCH 2/2] mm/page_vma_mapped: page table boundary is already guaranteed
  2019-11-29 11:18           ` Matthew Wilcox
@ 2019-12-02  6:53             ` Wei Yang
  0 siblings, 0 replies; 15+ messages in thread
From: Wei Yang @ 2019-12-02  6:53 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Wei Yang, Wei Yang, Kirill A. Shutemov, akpm, kirill.shutemov,
	linux-mm, linux-kernel

On Fri, Nov 29, 2019 at 03:18:01AM -0800, Matthew Wilcox wrote:
>On Fri, Nov 29, 2019 at 04:30:02PM +0800, Wei Yang wrote:
>> On Thu, Nov 28, 2019 at 02:39:04PM -0800, Matthew Wilcox wrote:
>> >On Thu, Nov 28, 2019 at 09:09:45PM +0000, Wei Yang wrote:
>> >> On Thu, Nov 28, 2019 at 11:31:43AM +0300, Kirill A. Shutemov wrote:
>> >> >On Thu, Nov 28, 2019 at 09:03:21AM +0800, Wei Yang wrote:
>> >> >> The check here is to guarantee pvmw->address iteration is limited in one
>> >> >> page table boundary. To be specific, here the address range should be in
>> >> >> one PMD_SIZE.
>> >> >> 
>> >> >> If my understanding is correct, this check is already done in the above
>> >> >> check:
>> >> >> 
>> >> >>     address >= __vma_address(page, vma) + PMD_SIZE
>> >> >> 
>> >> >> The boundary check here seems not necessary.
>> >> >> 
>> >> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> >> >
>> >> >NAK.
>> >> >
>> >> >THP can be mapped with PTE not aligned to PMD_SIZE. Consider mremap().
>> >> >
>> >> 
>> >> Hi, Kirill
>> >> 
>> >> Thanks for your comment during Thanks Giving Day. Happy holiday:-)
>> >> 
>> >> I didn't think about this case before, thanks for reminding. Then I tried to
>> >> understand your concern.
>> >> 
>> >> mremap() would expand/shrink a memory mapping. In this case, probably shrink
>> >> is in concern. Since pvmw->page and pvmw->vma are not changed in the loop, the
>> >> case you mentioned maybe pvmw->page is the head of a THP but part of it is
>> >> unmapped.
>> >
>> >mremap() can also move a mapping, see MREMAP_FIXED.
>> 
>> Hi, Matthew
>> 
>> Thanks for your comment.
>> 
>> I took a look into the MREMAP_FIXED case, but still not clear in which case it
>> fall into the situation Kirill mentioned.
>> 
>> Per my understanding, move mapping is achieved in two steps:
>> 
>>     * unmap some range in old vma if old_len >= new_len
>>     * move vma
>> 
>> If the length doesn't change, we are expecting to have the "copy" of old
>> vma. This doesn't change the THP PMD mapping.
>> 
>> So the change still happens in the unmap step, if I am correct.
>> 
>> Would you mind giving me more hint on the case when we would have the
>> situation as Kirill mentioned?
>
>Set up a THP mapping.
>Move it to an address which is no longer 2MB aligned.
>Unmap it.

Thanks, Matthew.

I got the point, thanks a lot :-)

-- 
Wei Yang
Help you, Help me



* Re: [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it
  2019-11-28 21:22   ` Wei Yang
@ 2019-12-02  8:03     ` Kirill A. Shutemov
  2019-12-02  8:54       ` Wei Yang
  2019-12-02 22:21       ` Wei Yang
  0 siblings, 2 replies; 15+ messages in thread
From: Kirill A. Shutemov @ 2019-12-02  8:03 UTC (permalink / raw)
  To: Wei Yang; +Cc: Wei Yang, akpm, kirill.shutemov, linux-mm, linux-kernel

On Thu, Nov 28, 2019 at 09:22:26PM +0000, Wei Yang wrote:
> On Thu, Nov 28, 2019 at 11:32:55AM +0300, Kirill A. Shutemov wrote:
> >On Thu, Nov 28, 2019 at 09:03:20AM +0800, Wei Yang wrote:
> >> At this point, we are sure page is PageTransHuge, which means
> >> hpage_nr_pages is HPAGE_PMD_NR.
> >> 
> >> This is safe to use PMD_SIZE instead of calculating it.
> >> 
> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >> ---
> >>  mm/page_vma_mapped.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >> 
> >> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> >> index eff4b4520c8d..76e03650a3ab 100644
> >> --- a/mm/page_vma_mapped.c
> >> +++ b/mm/page_vma_mapped.c
> >> @@ -223,7 +223,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> >>  			if (pvmw->address >= pvmw->vma->vm_end ||
> >>  			    pvmw->address >=
> >>  					__vma_address(pvmw->page, pvmw->vma) +
> >> -					hpage_nr_pages(pvmw->page) * PAGE_SIZE)
> >> +					PMD_SIZE)
> >>  				return not_found(pvmw);
> >>  			/* Did we cross page table boundary? */
> >>  			if (pvmw->address % PMD_SIZE == 0) {
> >
> >It is dubious cleanup. Maybe page_size(pvmw->page) instead? It will not
> >break if we ever get PUD THP pages.
> >
> 
> Thanks for your comment.
> 
> I took a look into the code again and found I may miss something.
> 
> I found we support PUD THP pages, whilc hpage_nr_pages() just return
> HPAGE_PMD_NR on PageTransHuge. Why this is not possible to return PUD number?

We only support PUD THP for DAX. That means we don't have a struct
page for it.

-- 
 Kirill A. Shutemov



* Re: [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it
  2019-12-02  8:03     ` Kirill A. Shutemov
@ 2019-12-02  8:54       ` Wei Yang
  2019-12-02 22:21       ` Wei Yang
  1 sibling, 0 replies; 15+ messages in thread
From: Wei Yang @ 2019-12-02  8:54 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Wei Yang, Wei Yang, akpm, kirill.shutemov, linux-mm, linux-kernel

On Mon, Dec 02, 2019 at 11:03:15AM +0300, Kirill A. Shutemov wrote:
>On Thu, Nov 28, 2019 at 09:22:26PM +0000, Wei Yang wrote:
>> On Thu, Nov 28, 2019 at 11:32:55AM +0300, Kirill A. Shutemov wrote:
>> >On Thu, Nov 28, 2019 at 09:03:20AM +0800, Wei Yang wrote:
>> >> At this point, we are sure page is PageTransHuge, which means
>> >> hpage_nr_pages is HPAGE_PMD_NR.
>> >> 
>> >> This is safe to use PMD_SIZE instead of calculating it.
>> >> 
>> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> >> ---
>> >>  mm/page_vma_mapped.c | 2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >> 
>> >> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> >> index eff4b4520c8d..76e03650a3ab 100644
>> >> --- a/mm/page_vma_mapped.c
>> >> +++ b/mm/page_vma_mapped.c
>> >> @@ -223,7 +223,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> >>  			if (pvmw->address >= pvmw->vma->vm_end ||
>> >>  			    pvmw->address >=
>> >>  					__vma_address(pvmw->page, pvmw->vma) +
>> >> -					hpage_nr_pages(pvmw->page) * PAGE_SIZE)
>> >> +					PMD_SIZE)
>> >>  				return not_found(pvmw);
>> >>  			/* Did we cross page table boundary? */
>> >>  			if (pvmw->address % PMD_SIZE == 0) {
>> >
>> >It is dubious cleanup. Maybe page_size(pvmw->page) instead? It will not
>> >break if we ever get PUD THP pages.
>> >
>> 
>> Thanks for your comment.
>> 
>> I took a look into the code again and found I may miss something.
>> 
>> I found we support PUD THP pages, whilc hpage_nr_pages() just return
>> HPAGE_PMD_NR on PageTransHuge. Why this is not possible to return PUD number?
>
>We only support PUD THP for DAX. Means, we don't have struct page for it.
>

OK, there is a lot of background story behind it.

Thanks :-)

>-- 
> Kirill A. Shutemov

-- 
Wei Yang
Help you, Help me



* Re: [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it
  2019-12-02  8:03     ` Kirill A. Shutemov
  2019-12-02  8:54       ` Wei Yang
@ 2019-12-02 22:21       ` Wei Yang
  2019-12-03  9:47         ` Kirill A. Shutemov
  1 sibling, 1 reply; 15+ messages in thread
From: Wei Yang @ 2019-12-02 22:21 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Wei Yang, Wei Yang, akpm, kirill.shutemov, linux-mm, linux-kernel

On Mon, Dec 02, 2019 at 11:03:15AM +0300, Kirill A. Shutemov wrote:
>On Thu, Nov 28, 2019 at 09:22:26PM +0000, Wei Yang wrote:
>> On Thu, Nov 28, 2019 at 11:32:55AM +0300, Kirill A. Shutemov wrote:
>> >On Thu, Nov 28, 2019 at 09:03:20AM +0800, Wei Yang wrote:
>> >> At this point, we are sure page is PageTransHuge, which means
>> >> hpage_nr_pages is HPAGE_PMD_NR.
>> >> 
>> >> This is safe to use PMD_SIZE instead of calculating it.
>> >> 
>> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> >> ---
>> >>  mm/page_vma_mapped.c | 2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >> 
>> >> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> >> index eff4b4520c8d..76e03650a3ab 100644
>> >> --- a/mm/page_vma_mapped.c
>> >> +++ b/mm/page_vma_mapped.c
>> >> @@ -223,7 +223,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> >>  			if (pvmw->address >= pvmw->vma->vm_end ||
>> >>  			    pvmw->address >=
>> >>  					__vma_address(pvmw->page, pvmw->vma) +
>> >> -					hpage_nr_pages(pvmw->page) * PAGE_SIZE)
>> >> +					PMD_SIZE)
>> >>  				return not_found(pvmw);
>> >>  			/* Did we cross page table boundary? */
>> >>  			if (pvmw->address % PMD_SIZE == 0) {
>> >
>> >It is dubious cleanup. Maybe page_size(pvmw->page) instead? It will not
>> >break if we ever get PUD THP pages.
>> >
>> 
>> Thanks for your comment.
>> 
>> I took a look into the code again and found I may miss something.
>> 
>> I found we support PUD THP pages, whilc hpage_nr_pages() just return
>> HPAGE_PMD_NR on PageTransHuge. Why this is not possible to return PUD number?
>
>We only support PUD THP for DAX. Means, we don't have struct page for it.
>

BTW, may I ask why you suggest using page_size() if we are sure that
only PMD THP pages are used here? To be more generic?

>-- 
> Kirill A. Shutemov

-- 
Wei Yang
Help you, Help me



* Re: [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it
  2019-12-02 22:21       ` Wei Yang
@ 2019-12-03  9:47         ` Kirill A. Shutemov
  2019-12-03 15:14           ` Wei Yang
  0 siblings, 1 reply; 15+ messages in thread
From: Kirill A. Shutemov @ 2019-12-03  9:47 UTC (permalink / raw)
  To: Wei Yang; +Cc: Wei Yang, akpm, kirill.shutemov, linux-mm, linux-kernel

On Mon, Dec 02, 2019 at 10:21:51PM +0000, Wei Yang wrote:
> On Mon, Dec 02, 2019 at 11:03:15AM +0300, Kirill A. Shutemov wrote:
> >On Thu, Nov 28, 2019 at 09:22:26PM +0000, Wei Yang wrote:
> >> On Thu, Nov 28, 2019 at 11:32:55AM +0300, Kirill A. Shutemov wrote:
> >> >On Thu, Nov 28, 2019 at 09:03:20AM +0800, Wei Yang wrote:
> >> >> At this point, we are sure page is PageTransHuge, which means
> >> >> hpage_nr_pages is HPAGE_PMD_NR.
> >> >> 
> >> >> This is safe to use PMD_SIZE instead of calculating it.
> >> >> 
> >> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >> >> ---
> >> >>  mm/page_vma_mapped.c | 2 +-
> >> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >> >> 
> >> >> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> >> >> index eff4b4520c8d..76e03650a3ab 100644
> >> >> --- a/mm/page_vma_mapped.c
> >> >> +++ b/mm/page_vma_mapped.c
> >> >> @@ -223,7 +223,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> >> >>  			if (pvmw->address >= pvmw->vma->vm_end ||
> >> >>  			    pvmw->address >=
> >> >>  					__vma_address(pvmw->page, pvmw->vma) +
> >> >> -					hpage_nr_pages(pvmw->page) * PAGE_SIZE)
> >> >> +					PMD_SIZE)
> >> >>  				return not_found(pvmw);
> >> >>  			/* Did we cross page table boundary? */
> >> >>  			if (pvmw->address % PMD_SIZE == 0) {
> >> >
> >> >It is dubious cleanup. Maybe page_size(pvmw->page) instead? It will not
> >> >break if we ever get PUD THP pages.
> >> >
> >> 
> >> Thanks for your comment.
> >> 
> >> I took a look into the code again and found I may miss something.
> >> 
> >> I found we support PUD THP pages, whilc hpage_nr_pages() just return
> >> HPAGE_PMD_NR on PageTransHuge. Why this is not possible to return PUD number?
> >
> >We only support PUD THP for DAX. Means, we don't have struct page for it.
> >
> 
> BTW, may I ask why you suggest to use page_size() if we are sure only PMD THP
> page is used here? To be more generic?

Yeah. I would rather not touch it at all.

-- 
 Kirill A. Shutemov



* Re: [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it
  2019-12-03  9:47         ` Kirill A. Shutemov
@ 2019-12-03 15:14           ` Wei Yang
  0 siblings, 0 replies; 15+ messages in thread
From: Wei Yang @ 2019-12-03 15:14 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Wei Yang, Wei Yang, akpm, kirill.shutemov, linux-mm, linux-kernel

On Tue, Dec 03, 2019 at 12:47:30PM +0300, Kirill A. Shutemov wrote:
>On Mon, Dec 02, 2019 at 10:21:51PM +0000, Wei Yang wrote:
>> On Mon, Dec 02, 2019 at 11:03:15AM +0300, Kirill A. Shutemov wrote:
>> >On Thu, Nov 28, 2019 at 09:22:26PM +0000, Wei Yang wrote:
>> >> On Thu, Nov 28, 2019 at 11:32:55AM +0300, Kirill A. Shutemov wrote:
>> >> >On Thu, Nov 28, 2019 at 09:03:20AM +0800, Wei Yang wrote:
>> >> >> At this point, we are sure page is PageTransHuge, which means
>> >> >> hpage_nr_pages is HPAGE_PMD_NR.
>> >> >> 
>> >> >> This is safe to use PMD_SIZE instead of calculating it.
>> >> >> 
>> >> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> >> >> ---
>> >> >>  mm/page_vma_mapped.c | 2 +-
>> >> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >> >> 
>> >> >> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> >> >> index eff4b4520c8d..76e03650a3ab 100644
>> >> >> --- a/mm/page_vma_mapped.c
>> >> >> +++ b/mm/page_vma_mapped.c
>> >> >> @@ -223,7 +223,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> >> >>  			if (pvmw->address >= pvmw->vma->vm_end ||
>> >> >>  			    pvmw->address >=
>> >> >>  					__vma_address(pvmw->page, pvmw->vma) +
>> >> >> -					hpage_nr_pages(pvmw->page) * PAGE_SIZE)
>> >> >> +					PMD_SIZE)
>> >> >>  				return not_found(pvmw);
>> >> >>  			/* Did we cross page table boundary? */
>> >> >>  			if (pvmw->address % PMD_SIZE == 0) {
>> >> >
>> >> >It is dubious cleanup. Maybe page_size(pvmw->page) instead? It will not
>> >> >break if we ever get PUD THP pages.
>> >> >
>> >> 
>> >> Thanks for your comment.
>> >> 
>> >> I took a look into the code again and found I may miss something.
>> >> 
>> >> I found we support PUD THP pages, whilc hpage_nr_pages() just return
>> >> HPAGE_PMD_NR on PageTransHuge. Why this is not possible to return PUD number?
>> >
>> >We only support PUD THP for DAX. Means, we don't have struct page for it.
>> >
>> 
>> BTW, may I ask why you suggest to use page_size() if we are sure only PMD THP
>> page is used here? To be more generic?
>
>Yeah. I would rather not touch it at all.
>

Got it, thanks.

>-- 
> Kirill A. Shutemov

-- 
Wei Yang
Help you, Help me



Thread overview: 15+ messages
2019-11-28  1:03 [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it Wei Yang
2019-11-28  1:03 ` [PATCH 2/2] mm/page_vma_mapped: page table boundary is already guaranteed Wei Yang
2019-11-28  8:31   ` Kirill A. Shutemov
2019-11-28 21:09     ` Wei Yang
2019-11-28 22:39       ` Matthew Wilcox
2019-11-29  8:30         ` Wei Yang
2019-11-29 11:18           ` Matthew Wilcox
2019-12-02  6:53             ` Wei Yang
2019-11-28  8:32 ` [PATCH 1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it Kirill A. Shutemov
2019-11-28 21:22   ` Wei Yang
2019-12-02  8:03     ` Kirill A. Shutemov
2019-12-02  8:54       ` Wei Yang
2019-12-02 22:21       ` Wei Yang
2019-12-03  9:47         ` Kirill A. Shutemov
2019-12-03 15:14           ` Wei Yang
