linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
@ 2019-04-23 16:43 Yang Shi
  2019-04-23 17:52 ` Michal Hocko
  2019-06-08  3:58 ` Hugh Dickins
  0 siblings, 2 replies; 17+ messages in thread
From: Yang Shi @ 2019-04-23 16:43 UTC (permalink / raw)
  To: mhocko, vbabka, rientjes, kirill, akpm; +Cc: yang.shi, linux-mm, linux-kernel

The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
vma") introduced THPeligible bit for processes' smaps. But, when checking
the eligibility for shmem vma, __transparent_hugepage_enabled() is
called to override the result from shmem_huge_enabled().  It may result
in the anonymous vma's THP flag override shmem's.  For example, running a
simple test which create THP for shmem, but with anonymous THP disabled,
when reading the process's smaps, it may show:

7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
Size:               4096 kB
...
[snip]
...
ShmemPmdMapped:     4096 kB
...
[snip]
...
THPeligible:    0

And, /proc/meminfo does show THP allocated and PMD mapped too:

ShmemHugePages:     4096 kB
ShmemPmdMapped:     4096 kB

This doesn't make too much sense.  The anonymous THP flag should not
intervene shmem THP.  Calling shmem_huge_enabled() with checking
MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
dax vma check since we already checked if the vma is shmem already.

Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each vma")
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
v2: Check VM_NOHUGEPAGE per Michal Hocko

 mm/huge_memory.c | 4 ++--
 mm/shmem.c       | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 165ea46..5881e82 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -67,8 +67,8 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
 {
 	if (vma_is_anonymous(vma))
 		return __transparent_hugepage_enabled(vma);
-	if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
-		return __transparent_hugepage_enabled(vma);
+	if (vma_is_shmem(vma))
+		return shmem_huge_enabled(vma);
 
 	return false;
 }
diff --git a/mm/shmem.c b/mm/shmem.c
index 2275a0f..6f09a31 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3873,6 +3873,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
 	loff_t i_size;
 	pgoff_t off;
 
+	if ((vma->vm_flags & VM_NOHUGEPAGE) ||
+	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
+		return false;
 	if (shmem_huge == SHMEM_HUGE_FORCE)
 		return true;
 	if (shmem_huge == SHMEM_HUGE_DENY)
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-04-23 16:43 [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility Yang Shi
@ 2019-04-23 17:52 ` Michal Hocko
  2019-04-23 18:34   ` Yang Shi
  2019-04-28 19:13   ` Yang Shi
  2019-06-08  3:58 ` Hugh Dickins
  1 sibling, 2 replies; 17+ messages in thread
From: Michal Hocko @ 2019-04-23 17:52 UTC (permalink / raw)
  To: Yang Shi; +Cc: vbabka, rientjes, kirill, akpm, linux-mm, linux-kernel

On Wed 24-04-19 00:43:01, Yang Shi wrote:
> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
> vma") introduced THPeligible bit for processes' smaps. But, when checking
> the eligibility for shmem vma, __transparent_hugepage_enabled() is
> called to override the result from shmem_huge_enabled().  It may result
> in the anonymous vma's THP flag override shmem's.  For example, running a
> simple test which create THP for shmem, but with anonymous THP disabled,
> when reading the process's smaps, it may show:
> 
> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
> Size:               4096 kB
> ...
> [snip]
> ...
> ShmemPmdMapped:     4096 kB
> ...
> [snip]
> ...
> THPeligible:    0
> 
> And, /proc/meminfo does show THP allocated and PMD mapped too:
> 
> ShmemHugePages:     4096 kB
> ShmemPmdMapped:     4096 kB
> 
> This doesn't make too much sense.  The anonymous THP flag should not
> intervene shmem THP.  Calling shmem_huge_enabled() with checking
> MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
> dax vma check since we already checked if the vma is shmem already.

Kirill, can we get a confirmation that this is really intended behavior
rather than an omission please? Is this documented? What is a global
knob to simply disable THP system wise?

I have to say that the THP tuning API is one giant mess :/

Btw. this patch also seem to fix khugepaged behavior because it previously
ignored both VM_NOHUGEPAGE and MMF_DISABLE_THP.

> Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each vma")
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Kirill A. Shutemov <kirill@shutemov.name>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> ---
> v2: Check VM_NOHUGEPAGE per Michal Hocko
> 
>  mm/huge_memory.c | 4 ++--
>  mm/shmem.c       | 3 +++
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 165ea46..5881e82 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -67,8 +67,8 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
>  {
>  	if (vma_is_anonymous(vma))
>  		return __transparent_hugepage_enabled(vma);
> -	if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
> -		return __transparent_hugepage_enabled(vma);
> +	if (vma_is_shmem(vma))
> +		return shmem_huge_enabled(vma);
>  
>  	return false;
>  }
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 2275a0f..6f09a31 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3873,6 +3873,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
>  	loff_t i_size;
>  	pgoff_t off;
>  
> +	if ((vma->vm_flags & VM_NOHUGEPAGE) ||
> +	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
> +		return false;
>  	if (shmem_huge == SHMEM_HUGE_FORCE)
>  		return true;
>  	if (shmem_huge == SHMEM_HUGE_DENY)
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-04-23 17:52 ` Michal Hocko
@ 2019-04-23 18:34   ` Yang Shi
  2019-04-24  0:22     ` Yang Shi
  2019-04-28 19:13   ` Yang Shi
  1 sibling, 1 reply; 17+ messages in thread
From: Yang Shi @ 2019-04-23 18:34 UTC (permalink / raw)
  To: Michal Hocko; +Cc: vbabka, rientjes, kirill, akpm, linux-mm, linux-kernel



On 4/23/19 10:52 AM, Michal Hocko wrote:
> On Wed 24-04-19 00:43:01, Yang Shi wrote:
>> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
>> vma") introduced THPeligible bit for processes' smaps. But, when checking
>> the eligibility for shmem vma, __transparent_hugepage_enabled() is
>> called to override the result from shmem_huge_enabled().  It may result
>> in the anonymous vma's THP flag override shmem's.  For example, running a
>> simple test which create THP for shmem, but with anonymous THP disabled,
>> when reading the process's smaps, it may show:
>>
>> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
>> Size:               4096 kB
>> ...
>> [snip]
>> ...
>> ShmemPmdMapped:     4096 kB
>> ...
>> [snip]
>> ...
>> THPeligible:    0
>>
>> And, /proc/meminfo does show THP allocated and PMD mapped too:
>>
>> ShmemHugePages:     4096 kB
>> ShmemPmdMapped:     4096 kB
>>
>> This doesn't make too much sense.  The anonymous THP flag should not
>> intervene shmem THP.  Calling shmem_huge_enabled() with checking
>> MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
>> dax vma check since we already checked if the vma is shmem already.
> Kirill, can we get a confirmation that this is really intended behavior
> rather than an omission please? Is this documented? What is a global
> knob to simply disable THP system wise?
>
> I have to say that the THP tuning API is one giant mess :/
>
> Btw. this patch also seem to fix khugepaged behavior because it previously
> ignored both VM_NOHUGEPAGE and MMF_DISABLE_THP.

Aha, I didn't notice this. It looks we need separate the patch to fix 
that khugepaged problem for both 5.1-rc and LTS.

>
>> Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each vma")
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: David Rientjes <rientjes@google.com>
>> Cc: Kirill A. Shutemov <kirill@shutemov.name>
>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>> ---
>> v2: Check VM_NOHUGEPAGE per Michal Hocko
>>
>>   mm/huge_memory.c | 4 ++--
>>   mm/shmem.c       | 3 +++
>>   2 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 165ea46..5881e82 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -67,8 +67,8 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
>>   {
>>   	if (vma_is_anonymous(vma))
>>   		return __transparent_hugepage_enabled(vma);
>> -	if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
>> -		return __transparent_hugepage_enabled(vma);
>> +	if (vma_is_shmem(vma))
>> +		return shmem_huge_enabled(vma);
>>   
>>   	return false;
>>   }
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index 2275a0f..6f09a31 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -3873,6 +3873,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
>>   	loff_t i_size;
>>   	pgoff_t off;
>>   
>> +	if ((vma->vm_flags & VM_NOHUGEPAGE) ||
>> +	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
>> +		return false;
>>   	if (shmem_huge == SHMEM_HUGE_FORCE)
>>   		return true;
>>   	if (shmem_huge == SHMEM_HUGE_DENY)
>> -- 
>> 1.8.3.1
>>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-04-23 18:34   ` Yang Shi
@ 2019-04-24  0:22     ` Yang Shi
  2019-04-24  7:58       ` Michal Hocko
  0 siblings, 1 reply; 17+ messages in thread
From: Yang Shi @ 2019-04-24  0:22 UTC (permalink / raw)
  To: Michal Hocko; +Cc: vbabka, rientjes, kirill, akpm, linux-mm, linux-kernel



On 4/23/19 11:34 AM, Yang Shi wrote:
>
>
> On 4/23/19 10:52 AM, Michal Hocko wrote:
>> On Wed 24-04-19 00:43:01, Yang Shi wrote:
>>> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for 
>>> each
>>> vma") introduced THPeligible bit for processes' smaps. But, when 
>>> checking
>>> the eligibility for shmem vma, __transparent_hugepage_enabled() is
>>> called to override the result from shmem_huge_enabled().  It may result
>>> in the anonymous vma's THP flag override shmem's.  For example, 
>>> running a
>>> simple test which create THP for shmem, but with anonymous THP 
>>> disabled,
>>> when reading the process's smaps, it may show:
>>>
>>> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
>>> Size:               4096 kB
>>> ...
>>> [snip]
>>> ...
>>> ShmemPmdMapped:     4096 kB
>>> ...
>>> [snip]
>>> ...
>>> THPeligible:    0
>>>
>>> And, /proc/meminfo does show THP allocated and PMD mapped too:
>>>
>>> ShmemHugePages:     4096 kB
>>> ShmemPmdMapped:     4096 kB
>>>
>>> This doesn't make too much sense.  The anonymous THP flag should not
>>> intervene shmem THP.  Calling shmem_huge_enabled() with checking
>>> MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
>>> dax vma check since we already checked if the vma is shmem already.
>> Kirill, can we get a confirmation that this is really intended behavior
>> rather than an omission please? Is this documented? What is a global
>> knob to simply disable THP system wise?
>>
>> I have to say that the THP tuning API is one giant mess :/
>>
>> Btw. this patch also seem to fix khugepaged behavior because it 
>> previously
>> ignored both VM_NOHUGEPAGE and MMF_DISABLE_THP.

Second look shows this is not ignored. hugepage_vma_check() would check 
this for both anonymous vma and shmem vma before scanning. It is called 
before shmem_huge_enabled().

>
> Aha, I didn't notice this. It looks we need separate the patch to fix 
> that khugepaged problem for both 5.1-rc and LTS.
>
>>
>>> Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each 
>>> vma")
>>> Cc: Michal Hocko <mhocko@suse.com>
>>> Cc: Vlastimil Babka <vbabka@suse.cz>
>>> Cc: David Rientjes <rientjes@google.com>
>>> Cc: Kirill A. Shutemov <kirill@shutemov.name>
>>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>>> ---
>>> v2: Check VM_NOHUGEPAGE per Michal Hocko
>>>
>>>   mm/huge_memory.c | 4 ++--
>>>   mm/shmem.c       | 3 +++
>>>   2 files changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 165ea46..5881e82 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -67,8 +67,8 @@ bool transparent_hugepage_enabled(struct 
>>> vm_area_struct *vma)
>>>   {
>>>       if (vma_is_anonymous(vma))
>>>           return __transparent_hugepage_enabled(vma);
>>> -    if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
>>> -        return __transparent_hugepage_enabled(vma);
>>> +    if (vma_is_shmem(vma))
>>> +        return shmem_huge_enabled(vma);
>>>         return false;
>>>   }
>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>> index 2275a0f..6f09a31 100644
>>> --- a/mm/shmem.c
>>> +++ b/mm/shmem.c
>>> @@ -3873,6 +3873,9 @@ bool shmem_huge_enabled(struct vm_area_struct 
>>> *vma)
>>>       loff_t i_size;
>>>       pgoff_t off;
>>>   +    if ((vma->vm_flags & VM_NOHUGEPAGE) ||
>>> +        test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
>>> +        return false;
>>>       if (shmem_huge == SHMEM_HUGE_FORCE)
>>>           return true;
>>>       if (shmem_huge == SHMEM_HUGE_DENY)
>>> -- 
>>> 1.8.3.1
>>>
>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-04-24  0:22     ` Yang Shi
@ 2019-04-24  7:58       ` Michal Hocko
  0 siblings, 0 replies; 17+ messages in thread
From: Michal Hocko @ 2019-04-24  7:58 UTC (permalink / raw)
  To: Yang Shi; +Cc: vbabka, rientjes, kirill, akpm, linux-mm, linux-kernel

On Tue 23-04-19 17:22:36, Yang Shi wrote:
> 
> 
> On 4/23/19 11:34 AM, Yang Shi wrote:
> > 
> > 
> > On 4/23/19 10:52 AM, Michal Hocko wrote:
> > > On Wed 24-04-19 00:43:01, Yang Shi wrote:
> > > > The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility
> > > > for each
> > > > vma") introduced THPeligible bit for processes' smaps. But, when
> > > > checking
> > > > the eligibility for shmem vma, __transparent_hugepage_enabled() is
> > > > called to override the result from shmem_huge_enabled().  It may result
> > > > in the anonymous vma's THP flag override shmem's.  For example,
> > > > running a
> > > > simple test which create THP for shmem, but with anonymous THP
> > > > disabled,
> > > > when reading the process's smaps, it may show:
> > > > 
> > > > 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
> > > > Size:               4096 kB
> > > > ...
> > > > [snip]
> > > > ...
> > > > ShmemPmdMapped:     4096 kB
> > > > ...
> > > > [snip]
> > > > ...
> > > > THPeligible:    0
> > > > 
> > > > And, /proc/meminfo does show THP allocated and PMD mapped too:
> > > > 
> > > > ShmemHugePages:     4096 kB
> > > > ShmemPmdMapped:     4096 kB
> > > > 
> > > > This doesn't make too much sense.  The anonymous THP flag should not
> > > > intervene shmem THP.  Calling shmem_huge_enabled() with checking
> > > > MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
> > > > dax vma check since we already checked if the vma is shmem already.
> > > Kirill, can we get a confirmation that this is really intended behavior
> > > rather than an omission please? Is this documented? What is a global
> > > knob to simply disable THP system wise?
> > > 
> > > I have to say that the THP tuning API is one giant mess :/
> > > 
> > > Btw. this patch also seem to fix khugepaged behavior because it
> > > previously
> > > ignored both VM_NOHUGEPAGE and MMF_DISABLE_THP.
> 
> Second look shows this is not ignored. hugepage_vma_check() would check this
> for both anonymous vma and shmem vma before scanning. It is called before
> shmem_huge_enabled().

Right. I have missed the earlier check. The main question remains
though.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-04-23 17:52 ` Michal Hocko
  2019-04-23 18:34   ` Yang Shi
@ 2019-04-28 19:13   ` Yang Shi
  2019-05-06 23:37     ` Yang Shi
  1 sibling, 1 reply; 17+ messages in thread
From: Yang Shi @ 2019-04-28 19:13 UTC (permalink / raw)
  To: Michal Hocko, Kirill A. Shutemov
  Cc: vbabka, rientjes, kirill, akpm, linux-mm, linux-kernel



On 4/23/19 10:52 AM, Michal Hocko wrote:
> On Wed 24-04-19 00:43:01, Yang Shi wrote:
>> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
>> vma") introduced THPeligible bit for processes' smaps. But, when checking
>> the eligibility for shmem vma, __transparent_hugepage_enabled() is
>> called to override the result from shmem_huge_enabled().  It may result
>> in the anonymous vma's THP flag override shmem's.  For example, running a
>> simple test which create THP for shmem, but with anonymous THP disabled,
>> when reading the process's smaps, it may show:
>>
>> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
>> Size:               4096 kB
>> ...
>> [snip]
>> ...
>> ShmemPmdMapped:     4096 kB
>> ...
>> [snip]
>> ...
>> THPeligible:    0
>>
>> And, /proc/meminfo does show THP allocated and PMD mapped too:
>>
>> ShmemHugePages:     4096 kB
>> ShmemPmdMapped:     4096 kB
>>
>> This doesn't make too much sense.  The anonymous THP flag should not
>> intervene shmem THP.  Calling shmem_huge_enabled() with checking
>> MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
>> dax vma check since we already checked if the vma is shmem already.
> Kirill, can we get a confirmation that this is really intended behavior
> rather than an omission please? Is this documented? What is a global
> knob to simply disable THP system wise?

Hi Kirill,

Ping. Any comment?

Thanks,
Yang

>
> I have to say that the THP tuning API is one giant mess :/
>
> Btw. this patch also seem to fix khugepaged behavior because it previously
> ignored both VM_NOHUGEPAGE and MMF_DISABLE_THP.
>
>> Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each vma")
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: David Rientjes <rientjes@google.com>
>> Cc: Kirill A. Shutemov <kirill@shutemov.name>
>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>> ---
>> v2: Check VM_NOHUGEPAGE per Michal Hocko
>>
>>   mm/huge_memory.c | 4 ++--
>>   mm/shmem.c       | 3 +++
>>   2 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 165ea46..5881e82 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -67,8 +67,8 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
>>   {
>>   	if (vma_is_anonymous(vma))
>>   		return __transparent_hugepage_enabled(vma);
>> -	if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
>> -		return __transparent_hugepage_enabled(vma);
>> +	if (vma_is_shmem(vma))
>> +		return shmem_huge_enabled(vma);
>>   
>>   	return false;
>>   }
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index 2275a0f..6f09a31 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -3873,6 +3873,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
>>   	loff_t i_size;
>>   	pgoff_t off;
>>   
>> +	if ((vma->vm_flags & VM_NOHUGEPAGE) ||
>> +	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
>> +		return false;
>>   	if (shmem_huge == SHMEM_HUGE_FORCE)
>>   		return true;
>>   	if (shmem_huge == SHMEM_HUGE_DENY)
>> -- 
>> 1.8.3.1
>>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-04-28 19:13   ` Yang Shi
@ 2019-05-06 23:37     ` Yang Shi
  2019-05-07 10:47       ` Michal Hocko
  0 siblings, 1 reply; 17+ messages in thread
From: Yang Shi @ 2019-05-06 23:37 UTC (permalink / raw)
  To: Michal Hocko, Kirill A. Shutemov
  Cc: vbabka, rientjes, kirill, akpm, linux-mm, linux-kernel



On 4/28/19 12:13 PM, Yang Shi wrote:
>
>
> On 4/23/19 10:52 AM, Michal Hocko wrote:
>> On Wed 24-04-19 00:43:01, Yang Shi wrote:
>>> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for 
>>> each
>>> vma") introduced THPeligible bit for processes' smaps. But, when 
>>> checking
>>> the eligibility for shmem vma, __transparent_hugepage_enabled() is
>>> called to override the result from shmem_huge_enabled().  It may result
>>> in the anonymous vma's THP flag override shmem's.  For example, 
>>> running a
>>> simple test which create THP for shmem, but with anonymous THP 
>>> disabled,
>>> when reading the process's smaps, it may show:
>>>
>>> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
>>> Size:               4096 kB
>>> ...
>>> [snip]
>>> ...
>>> ShmemPmdMapped:     4096 kB
>>> ...
>>> [snip]
>>> ...
>>> THPeligible:    0
>>>
>>> And, /proc/meminfo does show THP allocated and PMD mapped too:
>>>
>>> ShmemHugePages:     4096 kB
>>> ShmemPmdMapped:     4096 kB
>>>
>>> This doesn't make too much sense.  The anonymous THP flag should not
>>> intervene shmem THP.  Calling shmem_huge_enabled() with checking
>>> MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
>>> dax vma check since we already checked if the vma is shmem already.
>> Kirill, can we get a confirmation that this is really intended behavior
>> rather than an omission please? Is this documented? What is a global
>> knob to simply disable THP system wise?
>
> Hi Kirill,
>
> Ping. Any comment?

Talked with Kirill at LSFMM, it sounds this is kind of intended behavior 
according to him. But, we all agree it looks inconsistent.

So, we may have two options:
     - Just fix the false negative issue as what the patch does
     - Change the behavior to make it more consistent

I'm not sure whether anyone relies on the behavior explicitly or 
implicitly or not.

If we would like to change the behavior, I may consider to take a step 
further to refactor the code a little bit to use huge_fault() to handle 
THP fault instead of falling back to handle_pte_fault() in the current 
implementation. This may make adding THP for other filesystems easier.

>
> Thanks,
> Yang
>
>>
>> I have to say that the THP tuning API is one giant mess :/
>>
>> Btw. this patch also seem to fix khugepaged behavior because it 
>> previously
>> ignored both VM_NOHUGEPAGE and MMF_DISABLE_THP.
>>
>>> Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each 
>>> vma")
>>> Cc: Michal Hocko <mhocko@suse.com>
>>> Cc: Vlastimil Babka <vbabka@suse.cz>
>>> Cc: David Rientjes <rientjes@google.com>
>>> Cc: Kirill A. Shutemov <kirill@shutemov.name>
>>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>>> ---
>>> v2: Check VM_NOHUGEPAGE per Michal Hocko
>>>
>>>   mm/huge_memory.c | 4 ++--
>>>   mm/shmem.c       | 3 +++
>>>   2 files changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 165ea46..5881e82 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -67,8 +67,8 @@ bool transparent_hugepage_enabled(struct 
>>> vm_area_struct *vma)
>>>   {
>>>       if (vma_is_anonymous(vma))
>>>           return __transparent_hugepage_enabled(vma);
>>> -    if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
>>> -        return __transparent_hugepage_enabled(vma);
>>> +    if (vma_is_shmem(vma))
>>> +        return shmem_huge_enabled(vma);
>>>         return false;
>>>   }
>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>> index 2275a0f..6f09a31 100644
>>> --- a/mm/shmem.c
>>> +++ b/mm/shmem.c
>>> @@ -3873,6 +3873,9 @@ bool shmem_huge_enabled(struct vm_area_struct 
>>> *vma)
>>>       loff_t i_size;
>>>       pgoff_t off;
>>>   +    if ((vma->vm_flags & VM_NOHUGEPAGE) ||
>>> +        test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
>>> +        return false;
>>>       if (shmem_huge == SHMEM_HUGE_FORCE)
>>>           return true;
>>>       if (shmem_huge == SHMEM_HUGE_DENY)
>>> -- 
>>> 1.8.3.1
>>>
>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-05-06 23:37     ` Yang Shi
@ 2019-05-07 10:47       ` Michal Hocko
  2019-05-07 17:10         ` Yang Shi
  0 siblings, 1 reply; 17+ messages in thread
From: Michal Hocko @ 2019-05-07 10:47 UTC (permalink / raw)
  To: Yang Shi
  Cc: Kirill A. Shutemov, vbabka, rientjes, kirill, akpm, linux-mm,
	linux-kernel, Hugh Dickins


[Hmm, I thought, Hugh was CCed]

On Mon 06-05-19 16:37:42, Yang Shi wrote:
> 
> 
> On 4/28/19 12:13 PM, Yang Shi wrote:
> > 
> > 
> > On 4/23/19 10:52 AM, Michal Hocko wrote:
> > > On Wed 24-04-19 00:43:01, Yang Shi wrote:
> > > > The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility
> > > > for each
> > > > vma") introduced THPeligible bit for processes' smaps. But, when
> > > > checking
> > > > the eligibility for shmem vma, __transparent_hugepage_enabled() is
> > > > called to override the result from shmem_huge_enabled().  It may result
> > > > in the anonymous vma's THP flag override shmem's.  For example,
> > > > running a
> > > > simple test which create THP for shmem, but with anonymous THP
> > > > disabled,
> > > > when reading the process's smaps, it may show:
> > > > 
> > > > 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
> > > > Size:               4096 kB
> > > > ...
> > > > [snip]
> > > > ...
> > > > ShmemPmdMapped:     4096 kB
> > > > ...
> > > > [snip]
> > > > ...
> > > > THPeligible:    0
> > > > 
> > > > And, /proc/meminfo does show THP allocated and PMD mapped too:
> > > > 
> > > > ShmemHugePages:     4096 kB
> > > > ShmemPmdMapped:     4096 kB
> > > > 
> > > > This doesn't make too much sense.  The anonymous THP flag should not
> > > > intervene shmem THP.  Calling shmem_huge_enabled() with checking
> > > > MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
> > > > dax vma check since we already checked if the vma is shmem already.
> > > Kirill, can we get a confirmation that this is really intended behavior
> > > rather than an omission please? Is this documented? What is a global
> > > knob to simply disable THP system wise?
> > 
> > Hi Kirill,
> > 
> > Ping. Any comment?
> 
> Talked with Kirill at LSFMM, it sounds this is kind of intended behavior
> according to him. But, we all agree it looks inconsistent.
> 
> So, we may have two options:
>     - Just fix the false negative issue as what the patch does
>     - Change the behavior to make it more consistent
> 
> I'm not sure whether anyone relies on the behavior explicitly or implicitly
> or not.

Well, I would be certainly more happy with a more consistent behavior.
Talked to Hugh at LSFMM about this and he finds treating shmem objects
separately from the anonymous memory. And that is already the case
partially when each mount point might have its own setup. So the primary
question is whether we need a one global knob to controll all THP
allocations. One argument to have that is that it might be helpful to
for an admin to simply disable source of THP at a single place rather
than crawling over all shmem mount points and remount them. Especially
in environments where shmem points are mounted in a container by a
non-root. Why would somebody wanted something like that? One example
would be to temporarily workaround high order allocations issues which
we have seen non trivial amount of in the past and we are likely not at
the end of the tunel.

That being said I would be in favor of treating the global sysfs knob to
be global for all THP allocations. I will not push back on that if there
is a general consensus that shmem and fs in general are a different
class of objects and a single global control is not desirable for
whatever reasons.

Kirill, Hugh othe folks?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-05-07 10:47       ` Michal Hocko
@ 2019-05-07 17:10         ` Yang Shi
  2019-06-06 18:59           ` Yang Shi
  0 siblings, 1 reply; 17+ messages in thread
From: Yang Shi @ 2019-05-07 17:10 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Kirill A. Shutemov, vbabka, rientjes, kirill, akpm, linux-mm,
	linux-kernel, Hugh Dickins



On 5/7/19 3:47 AM, Michal Hocko wrote:
> [Hmm, I thought, Hugh was CCed]
>
> On Mon 06-05-19 16:37:42, Yang Shi wrote:
>>
>> On 4/28/19 12:13 PM, Yang Shi wrote:
>>>
>>> On 4/23/19 10:52 AM, Michal Hocko wrote:
>>>> On Wed 24-04-19 00:43:01, Yang Shi wrote:
>>>>> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility
>>>>> for each
>>>>> vma") introduced THPeligible bit for processes' smaps. But, when
>>>>> checking
>>>>> the eligibility for shmem vma, __transparent_hugepage_enabled() is
>>>>> called to override the result from shmem_huge_enabled().  It may result
>>>>> in the anonymous vma's THP flag override shmem's.  For example,
>>>>> running a
>>>>> simple test which create THP for shmem, but with anonymous THP
>>>>> disabled,
>>>>> when reading the process's smaps, it may show:
>>>>>
>>>>> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
>>>>> Size:               4096 kB
>>>>> ...
>>>>> [snip]
>>>>> ...
>>>>> ShmemPmdMapped:     4096 kB
>>>>> ...
>>>>> [snip]
>>>>> ...
>>>>> THPeligible:    0
>>>>>
>>>>> And, /proc/meminfo does show THP allocated and PMD mapped too:
>>>>>
>>>>> ShmemHugePages:     4096 kB
>>>>> ShmemPmdMapped:     4096 kB
>>>>>
>>>>> This doesn't make too much sense.  The anonymous THP flag should not
>>>>> intervene shmem THP.  Calling shmem_huge_enabled() with checking
>>>>> MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
>>>>> dax vma check since we already checked if the vma is shmem already.
>>>> Kirill, can we get a confirmation that this is really intended behavior
>>>> rather than an omission please? Is this documented? What is a global
>>>> knob to simply disable THP system wise?
>>> Hi Kirill,
>>>
>>> Ping. Any comment?
>> Talked with Kirill at LSFMM, it sounds this is kind of intended behavior
>> according to him. But, we all agree it looks inconsistent.
>>
>> So, we may have two options:
>>      - Just fix the false negative issue as what the patch does
>>      - Change the behavior to make it more consistent
>>
>> I'm not sure whether anyone relies on the behavior explicitly or implicitly
>> or not.
> Well, I would be certainly more happy with a more consistent behavior.
> Talked to Hugh at LSFMM about this and he finds treating shmem objects
> separately from the anonymous memory. And that is already the case
> partially when each mount point might have its own setup. So the primary
> question is whether we need a one global knob to controll all THP
> allocations. One argument to have that is that it might be helpful to
> for an admin to simply disable source of THP at a single place rather
> than crawling over all shmem mount points and remount them. Especially
> in environments where shmem points are mounted in a container by a
> non-root. Why would somebody wanted something like that? One example
> would be to temporarily workaround high order allocations issues which
> we have seen non trivial amount of in the past and we are likely not at
> the end of the tunel.

Shmem has a global control for such use. Setting shmem_enabled to 
"force" or "deny" would enable or disable THP for shmem globally, 
including non-fs objects, i.e. memfd, SYS V shmem, etc.

>
> That being said I would be in favor of treating the global sysfs knob to
> be global for all THP allocations. I will not push back on that if there
> is a general consensus that shmem and fs in general are a different
> class of objects and a single global control is not desirable for
> whatever reasons.

OK, we need more inputs from Kirill, Hugh and other folks.

>
> Kirill, Hugh othe folks?


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-05-07 17:10         ` Yang Shi
@ 2019-06-06 18:59           ` Yang Shi
  2019-06-07 10:57             ` Hugh Dickins
  0 siblings, 1 reply; 17+ messages in thread
From: Yang Shi @ 2019-06-06 18:59 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Kirill A. Shutemov, vbabka, rientjes, kirill, akpm, linux-mm,
	linux-kernel, Hugh Dickins



On 5/7/19 10:10 AM, Yang Shi wrote:
>
>
> On 5/7/19 3:47 AM, Michal Hocko wrote:
>> [Hmm, I thought, Hugh was CCed]
>>
>> On Mon 06-05-19 16:37:42, Yang Shi wrote:
>>>
>>> On 4/28/19 12:13 PM, Yang Shi wrote:
>>>>
>>>> On 4/23/19 10:52 AM, Michal Hocko wrote:
>>>>> On Wed 24-04-19 00:43:01, Yang Shi wrote:
>>>>>> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility
>>>>>> for each
>>>>>> vma") introduced THPeligible bit for processes' smaps. But, when
>>>>>> checking
>>>>>> the eligibility for shmem vma, __transparent_hugepage_enabled() is
>>>>>> called to override the result from shmem_huge_enabled().  It may 
>>>>>> result
>>>>>> in the anonymous vma's THP flag override shmem's.  For example,
>>>>>> running a
>>>>>> simple test which create THP for shmem, but with anonymous THP
>>>>>> disabled,
>>>>>> when reading the process's smaps, it may show:
>>>>>>
>>>>>> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
>>>>>> Size:               4096 kB
>>>>>> ...
>>>>>> [snip]
>>>>>> ...
>>>>>> ShmemPmdMapped:     4096 kB
>>>>>> ...
>>>>>> [snip]
>>>>>> ...
>>>>>> THPeligible:    0
>>>>>>
>>>>>> And, /proc/meminfo does show THP allocated and PMD mapped too:
>>>>>>
>>>>>> ShmemHugePages:     4096 kB
>>>>>> ShmemPmdMapped:     4096 kB
>>>>>>
>>>>>> This doesn't make too much sense.  The anonymous THP flag should not
>>>>>> intervene shmem THP.  Calling shmem_huge_enabled() with checking
>>>>>> MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
>>>>>> dax vma check since we already checked if the vma is shmem already.
>>>>> Kirill, can we get a confirmation that this is really intended 
>>>>> behavior
>>>>> rather than an omission please? Is this documented? What is a global
>>>>> knob to simply disable THP system wise?
>>>> Hi Kirill,
>>>>
>>>> Ping. Any comment?
>>> Talked with Kirill at LSFMM, it sounds this is kind of intended 
>>> behavior
>>> according to him. But, we all agree it looks inconsistent.
>>>
>>> So, we may have two options:
>>>      - Just fix the false negative issue as what the patch does
>>>      - Change the behavior to make it more consistent
>>>
>>> I'm not sure whether anyone relies on the behavior explicitly or 
>>> implicitly
>>> or not.
>> Well, I would be certainly more happy with a more consistent behavior.
>> Talked to Hugh at LSFMM about this and he finds treating shmem objects
>> separately from the anonymous memory. And that is already the case
>> partially when each mount point might have its own setup. So the primary
>> question is whether we need a one global knob to controll all THP
>> allocations. One argument to have that is that it might be helpful to
>> for an admin to simply disable source of THP at a single place rather
>> than crawling over all shmem mount points and remount them. Especially
>> in environments where shmem points are mounted in a container by a
>> non-root. Why would somebody wanted something like that? One example
>> would be to temporarily workaround high order allocations issues which
>> we have seen non trivial amount of in the past and we are likely not at
>> the end of the tunel.
>
> Shmem has a global control for such use. Setting shmem_enabled to 
> "force" or "deny" would enable or disable THP for shmem globally, 
> including non-fs objects, i.e. memfd, SYS V shmem, etc.
>
>>
>> That being said I would be in favor of treating the global sysfs knob to
>> be global for all THP allocations. I will not push back on that if there
>> is a general consensus that shmem and fs in general are a different
>> class of objects and a single global control is not desirable for
>> whatever reasons.
>
> OK, we need more inputs from Kirill, Hugh and other folks.

[Forgot cc to mailing lists]

Hi guys,

How should we move forward for this one? Make the sysfs knob 
(/sys/kernel/mm/transparent_hugepage/enabled) to be global for both 
anonymous and tmpfs? Or just treat shmem objects separately from anon 
memory then fix the false-negative of THP eligibility by this patch?

>
>>
>> Kirill, Hugh othe folks?
>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-06-06 18:59           ` Yang Shi
@ 2019-06-07 10:57             ` Hugh Dickins
  2019-06-07 14:25               ` Michal Hocko
  2019-06-07 18:51               ` Yang Shi
  0 siblings, 2 replies; 17+ messages in thread
From: Hugh Dickins @ 2019-06-07 10:57 UTC (permalink / raw)
  To: Yang Shi
  Cc: Michal Hocko, Kirill A. Shutemov, vbabka, rientjes, kirill, akpm,
	linux-mm, linux-kernel, Hugh Dickins

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5507 bytes --]

On Thu, 6 Jun 2019, Yang Shi wrote:
> On 5/7/19 10:10 AM, Yang Shi wrote:
> > On 5/7/19 3:47 AM, Michal Hocko wrote:
> > > [Hmm, I thought, Hugh was CCed]
> > > 
> > > On Mon 06-05-19 16:37:42, Yang Shi wrote:
> > > > 
> > > > On 4/28/19 12:13 PM, Yang Shi wrote:
> > > > > 
> > > > > On 4/23/19 10:52 AM, Michal Hocko wrote:
> > > > > > On Wed 24-04-19 00:43:01, Yang Shi wrote:
> > > > > > > The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility
> > > > > > > for each
> > > > > > > vma") introduced THPeligible bit for processes' smaps. But, when
> > > > > > > checking
> > > > > > > the eligibility for shmem vma, __transparent_hugepage_enabled()
> > > > > > > is
> > > > > > > called to override the result from shmem_huge_enabled().  It may
> > > > > > > result
> > > > > > > in the anonymous vma's THP flag override shmem's.  For example,
> > > > > > > running a
> > > > > > > simple test which create THP for shmem, but with anonymous THP
> > > > > > > disabled,
> > > > > > > when reading the process's smaps, it may show:
> > > > > > > 
> > > > > > > 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
> > > > > > > Size:               4096 kB
> > > > > > > ...
> > > > > > > [snip]
> > > > > > > ...
> > > > > > > ShmemPmdMapped:     4096 kB
> > > > > > > ...
> > > > > > > [snip]
> > > > > > > ...
> > > > > > > THPeligible:    0
> > > > > > > 
> > > > > > > And, /proc/meminfo does show THP allocated and PMD mapped too:
> > > > > > > 
> > > > > > > ShmemHugePages:     4096 kB
> > > > > > > ShmemPmdMapped:     4096 kB
> > > > > > > 
> > > > > > > This doesn't make too much sense.  The anonymous THP flag should
> > > > > > > not
> > > > > > > intervene shmem THP.  Calling shmem_huge_enabled() with checking
> > > > > > > MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
> > > > > > > dax vma check since we already checked if the vma is shmem
> > > > > > > already.
> > > > > > Kirill, can we get a confirmation that this is really intended
> > > > > > behavior
> > > > > > rather than an omission please? Is this documented? What is a
> > > > > > global
> > > > > > knob to simply disable THP system wise?
> > > > > Hi Kirill,
> > > > > 
> > > > > Ping. Any comment?
> > > > Talked with Kirill at LSFMM, it sounds this is kind of intended
> > > > behavior
> > > > according to him. But, we all agree it looks inconsistent.
> > > > 
> > > > So, we may have two options:
> > > >      - Just fix the false negative issue as what the patch does
> > > >      - Change the behavior to make it more consistent
> > > > 
> > > > I'm not sure whether anyone relies on the behavior explicitly or
> > > > implicitly
> > > > or not.
> > > Well, I would be certainly more happy with a more consistent behavior.
> > > Talked to Hugh at LSFMM about this and he finds treating shmem objects
> > > separately from the anonymous memory. And that is already the case
> > > partially when each mount point might have its own setup. So the primary
> > > question is whether we need a one global knob to controll all THP
> > > allocations. One argument to have that is that it might be helpful to
> > > for an admin to simply disable source of THP at a single place rather
> > > than crawling over all shmem mount points and remount them. Especially
> > > in environments where shmem points are mounted in a container by a
> > > non-root. Why would somebody wanted something like that? One example
> > > would be to temporarily workaround high order allocations issues which
> > > we have seen non trivial amount of in the past and we are likely not at
> > > the end of the tunel.
> > 
> > Shmem has a global control for such use. Setting shmem_enabled to "force"
> > or "deny" would enable or disable THP for shmem globally, including non-fs
> > objects, i.e. memfd, SYS V shmem, etc.
> > 
> > > 
> > > That being said I would be in favor of treating the global sysfs knob to
> > > be global for all THP allocations. I will not push back on that if there
> > > is a general consensus that shmem and fs in general are a different
> > > class of objects and a single global control is not desirable for
> > > whatever reasons.
> > 
> > OK, we need more inputs from Kirill, Hugh and other folks.
> 
> [Forgot cc to mailing lists]
> 
> Hi guys,
> 
> How should we move forward for this one? Make the sysfs knob
> (/sys/kernel/mm/transparent_hugepage/enabled) to be global for both anonymous
> and tmpfs? Or just treat shmem objects separately from anon memory then fix
> the false-negative of THP eligibility by this patch?

Sorry for not getting back to you sooner on this.

I don't like to drive design by smaps. I agree with the word "mess" used
several times of THP tunings in this thread, but it's too easy to make
that mess worse by unnecessary changes, so I'm very cautious here.

The addition of "THPeligible" without an "Anon" in its name was
unfortunate. I suppose we're two releases too late to change that.

Applying process (PR_SET_THP_DISABLE) and mm (MADV_*HUGEPAGE)
limitations to shared filesystem objects doesn't work all that well.

I recommend that you continue to treat shmem objects separately from
anon memory, and just make the smaps "THPeligible" more often accurate.

Is your v2 patch earlier in this thread the best for that?
No answer tonight, I'll re-examine later in the day.

Hugh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-06-07 10:57             ` Hugh Dickins
@ 2019-06-07 14:25               ` Michal Hocko
  2019-06-07 18:51               ` Yang Shi
  1 sibling, 0 replies; 17+ messages in thread
From: Michal Hocko @ 2019-06-07 14:25 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Yang Shi, Kirill A. Shutemov, vbabka, rientjes, kirill, akpm,
	linux-mm, linux-kernel

On Fri 07-06-19 03:57:18, Hugh Dickins wrote:
[...]
> The addition of "THPeligible" without an "Anon" in its name was
> unfortunate. I suppose we're two releases too late to change that.

Well, I do not really see any reason why THPeligible should be Anon
specific at all. Even if ...

> Applying process (PR_SET_THP_DISABLE) and mm (MADV_*HUGEPAGE)
> limitations to shared filesystem objects doesn't work all that well.

... this is what we are going with then it is really important to have a
single place to query the eligibility IMHO.

> I recommend that you continue to treat shmem objects separately from
> anon memory, and just make the smaps "THPeligible" more often accurate.

Agreed on this.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-06-07 10:57             ` Hugh Dickins
  2019-06-07 14:25               ` Michal Hocko
@ 2019-06-07 18:51               ` Yang Shi
  1 sibling, 0 replies; 17+ messages in thread
From: Yang Shi @ 2019-06-07 18:51 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Michal Hocko, Kirill A. Shutemov, vbabka, rientjes, kirill, akpm,
	linux-mm, linux-kernel



On 6/7/19 3:57 AM, Hugh Dickins wrote:
> On Thu, 6 Jun 2019, Yang Shi wrote:
>> On 5/7/19 10:10 AM, Yang Shi wrote:
>>> On 5/7/19 3:47 AM, Michal Hocko wrote:
>>>> [Hmm, I thought, Hugh was CCed]
>>>>
>>>> On Mon 06-05-19 16:37:42, Yang Shi wrote:
>>>>> On 4/28/19 12:13 PM, Yang Shi wrote:
>>>>>> On 4/23/19 10:52 AM, Michal Hocko wrote:
>>>>>>> On Wed 24-04-19 00:43:01, Yang Shi wrote:
>>>>>>>> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility
>>>>>>>> for each
>>>>>>>> vma") introduced THPeligible bit for processes' smaps. But, when
>>>>>>>> checking
>>>>>>>> the eligibility for shmem vma, __transparent_hugepage_enabled()
>>>>>>>> is
>>>>>>>> called to override the result from shmem_huge_enabled().  It may
>>>>>>>> result
>>>>>>>> in the anonymous vma's THP flag override shmem's.  For example,
>>>>>>>> running a
>>>>>>>> simple test which create THP for shmem, but with anonymous THP
>>>>>>>> disabled,
>>>>>>>> when reading the process's smaps, it may show:
>>>>>>>>
>>>>>>>> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
>>>>>>>> Size:               4096 kB
>>>>>>>> ...
>>>>>>>> [snip]
>>>>>>>> ...
>>>>>>>> ShmemPmdMapped:     4096 kB
>>>>>>>> ...
>>>>>>>> [snip]
>>>>>>>> ...
>>>>>>>> THPeligible:    0
>>>>>>>>
>>>>>>>> And, /proc/meminfo does show THP allocated and PMD mapped too:
>>>>>>>>
>>>>>>>> ShmemHugePages:     4096 kB
>>>>>>>> ShmemPmdMapped:     4096 kB
>>>>>>>>
>>>>>>>> This doesn't make too much sense.  The anonymous THP flag should
>>>>>>>> not
>>>>>>>> intervene shmem THP.  Calling shmem_huge_enabled() with checking
>>>>>>>> MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
>>>>>>>> dax vma check since we already checked if the vma is shmem
>>>>>>>> already.
>>>>>>> Kirill, can we get a confirmation that this is really intended
>>>>>>> behavior
>>>>>>> rather than an omission please? Is this documented? What is a
>>>>>>> global
>>>>>>> knob to simply disable THP system wise?
>>>>>> Hi Kirill,
>>>>>>
>>>>>> Ping. Any comment?
>>>>> Talked with Kirill at LSFMM, it sounds this is kind of intended
>>>>> behavior
>>>>> according to him. But, we all agree it looks inconsistent.
>>>>>
>>>>> So, we may have two options:
>>>>>       - Just fix the false negative issue as what the patch does
>>>>>       - Change the behavior to make it more consistent
>>>>>
>>>>> I'm not sure whether anyone relies on the behavior explicitly or
>>>>> implicitly
>>>>> or not.
>>>> Well, I would be certainly more happy with a more consistent behavior.
>>>> Talked to Hugh at LSFMM about this and he finds treating shmem objects
>>>> separately from the anonymous memory. And that is already the case
>>>> partially when each mount point might have its own setup. So the primary
>>>> question is whether we need a one global knob to controll all THP
>>>> allocations. One argument to have that is that it might be helpful to
>>>> for an admin to simply disable source of THP at a single place rather
>>>> than crawling over all shmem mount points and remount them. Especially
>>>> in environments where shmem points are mounted in a container by a
>>>> non-root. Why would somebody wanted something like that? One example
>>>> would be to temporarily workaround high order allocations issues which
>>>> we have seen non trivial amount of in the past and we are likely not at
>>>> the end of the tunel.
>>> Shmem has a global control for such use. Setting shmem_enabled to "force"
>>> or "deny" would enable or disable THP for shmem globally, including non-fs
>>> objects, i.e. memfd, SYS V shmem, etc.
>>>
>>>> That being said I would be in favor of treating the global sysfs knob to
>>>> be global for all THP allocations. I will not push back on that if there
>>>> is a general consensus that shmem and fs in general are a different
>>>> class of objects and a single global control is not desirable for
>>>> whatever reasons.
>>> OK, we need more inputs from Kirill, Hugh and other folks.
>> [Forgot cc to mailing lists]
>>
>> Hi guys,
>>
>> How should we move forward for this one? Make the sysfs knob
>> (/sys/kernel/mm/transparent_hugepage/enabled) to be global for both anonymous
>> and tmpfs? Or just treat shmem objects separately from anon memory then fix
>> the false-negative of THP eligibility by this patch?
> Sorry for not getting back to you sooner on this.
>
> I don't like to drive design by smaps. I agree with the word "mess" used
> several times of THP tunings in this thread, but it's too easy to make
> that mess worse by unnecessary changes, so I'm very cautious here.
>
> The addition of "THPeligible" without an "Anon" in its name was
> unfortunate. I suppose we're two releases too late to change that.

The smaps shows it is anon vma or shmem vma for the most cases.

>
> Applying process (PR_SET_THP_DISABLE) and mm (MADV_*HUGEPAGE)
> limitations to shared filesystem objects doesn't work all that well.

The THP eligibility indicator is per vma, it just reports whether THP is 
eligible for a specific vma. So, I'm supposed it should keep consistent 
with MMF_DISABLE_THP and MADV_*HUGEPAGE setting.

The current implementation in shmem and kuhugepaged also checks these.

>
> I recommend that you continue to treat shmem objects separately from
> anon memory, and just make the smaps "THPeligible" more often accurate.
>
> Is your v2 patch earlier in this thread the best for that?

The v2 patch treats shmem objects separately from anon memory and it 
makes the "THPeligible" more often accurate.

> No answer tonight, I'll re-examine later in the day.
>
> Hugh


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-04-23 16:43 [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility Yang Shi
  2019-04-23 17:52 ` Michal Hocko
@ 2019-06-08  3:58 ` Hugh Dickins
  2019-06-10 17:33   ` Yang Shi
  1 sibling, 1 reply; 17+ messages in thread
From: Hugh Dickins @ 2019-06-08  3:58 UTC (permalink / raw)
  To: Yang Shi
  Cc: mhocko, vbabka, rientjes, kirill, kirill.shutemov, akpm,
	linux-mm, linux-kernel

On Wed, 24 Apr 2019, Yang Shi wrote:

> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
> vma") introduced THPeligible bit for processes' smaps. But, when checking
> the eligibility for shmem vma, __transparent_hugepage_enabled() is
> called to override the result from shmem_huge_enabled().  It may result
> in the anonymous vma's THP flag override shmem's.  For example, running a
> simple test which create THP for shmem, but with anonymous THP disabled,
> when reading the process's smaps, it may show:
> 
> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
> Size:               4096 kB
> ...
> [snip]
> ...
> ShmemPmdMapped:     4096 kB
> ...
> [snip]
> ...
> THPeligible:    0
> 
> And, /proc/meminfo does show THP allocated and PMD mapped too:
> 
> ShmemHugePages:     4096 kB
> ShmemPmdMapped:     4096 kB
> 
> This doesn't make too much sense.  The anonymous THP flag should not
> intervene shmem THP.  Calling shmem_huge_enabled() with checking
> MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
> dax vma check since we already checked if the vma is shmem already.
> 
> Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each vma")
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Kirill A. Shutemov <kirill@shutemov.name>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> ---
> v2: Check VM_NOHUGEPAGE per Michal Hocko
> 
>  mm/huge_memory.c | 4 ++--
>  mm/shmem.c       | 3 +++
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 165ea46..5881e82 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -67,8 +67,8 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
>  {
>  	if (vma_is_anonymous(vma))
>  		return __transparent_hugepage_enabled(vma);
> -	if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
> -		return __transparent_hugepage_enabled(vma);
> +	if (vma_is_shmem(vma))
> +		return shmem_huge_enabled(vma);
>  
>  	return false;
>  }
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 2275a0f..6f09a31 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3873,6 +3873,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
>  	loff_t i_size;
>  	pgoff_t off;
>  
> +	if ((vma->vm_flags & VM_NOHUGEPAGE) ||
> +	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
> +		return false;

Yes, that is correct; and correctly placed. But a little more is needed:
see how mm/memory.c's transhuge_vma_suitable() will only allow a pmd to
be used instead of a pte if the vma offset and size permit. smaps should
not report a shmem vma as THPeligible if its offset or size prevent it.

And I see that should also be fixed on anon vmas: at present smaps
reports even a 4kB anon vma as THPeligible, which is not right.
Maybe a test like transhuge_vma_suitable() can be added into
transparent_hugepage_enabled(), to handle anon and shmem together.
I say "like transhuge_vma_suitable()", because that function needs
an address, which here you don't have.

The anon offset situation is interesting: usually anon vm_pgoff is
initialized to fit with its vm_start, so the anon offset check passes;
but I wonder what happens after mremap to a different address - does
transhuge_vma_suitable() then prevent the use of pmds where they could
actually be used? Not a Number#1 priority to investigate or fix here!
but a curiosity someone might want to look into.

>  	if (shmem_huge == SHMEM_HUGE_FORCE)
>  		return true;
>  	if (shmem_huge == SHMEM_HUGE_DENY)
> -- 
> 1.8.3.1


Even with your changes
ShmemPmdMapped:     4096 kB
THPeligible:    0
will easily be seen: THPeligible reflects whether a huge page can be
allocated and mapped by pmd in that vma; but if something else already
allocated the huge page earlier, it will be mapped by pmd in this vma
if offset and size allow, whatever THPeligible says. We could change
transhuge_vma_suitable() to force ptes in that case, but it would be
a silly change, just to make what smaps shows easier to explain.

Hugh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-06-08  3:58 ` Hugh Dickins
@ 2019-06-10 17:33   ` Yang Shi
  2019-06-12 18:44     ` Hugh Dickins
  0 siblings, 1 reply; 17+ messages in thread
From: Yang Shi @ 2019-06-10 17:33 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: mhocko, vbabka, rientjes, kirill, kirill.shutemov, akpm,
	linux-mm, linux-kernel



On 6/7/19 8:58 PM, Hugh Dickins wrote:
> On Wed, 24 Apr 2019, Yang Shi wrote:
>
>> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
>> vma") introduced THPeligible bit for processes' smaps. But, when checking
>> the eligibility for shmem vma, __transparent_hugepage_enabled() is
>> called to override the result from shmem_huge_enabled().  It may result
>> in the anonymous vma's THP flag override shmem's.  For example, running a
>> simple test which create THP for shmem, but with anonymous THP disabled,
>> when reading the process's smaps, it may show:
>>
>> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
>> Size:               4096 kB
>> ...
>> [snip]
>> ...
>> ShmemPmdMapped:     4096 kB
>> ...
>> [snip]
>> ...
>> THPeligible:    0
>>
>> And, /proc/meminfo does show THP allocated and PMD mapped too:
>>
>> ShmemHugePages:     4096 kB
>> ShmemPmdMapped:     4096 kB
>>
>> This doesn't make too much sense.  The anonymous THP flag should not
>> intervene shmem THP.  Calling shmem_huge_enabled() with checking
>> MMF_DISABLE_THP sounds good enough.  And, we could skip stack and
>> dax vma check since we already checked if the vma is shmem already.
>>
>> Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each vma")
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: David Rientjes <rientjes@google.com>
>> Cc: Kirill A. Shutemov <kirill@shutemov.name>
>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>> ---
>> v2: Check VM_NOHUGEPAGE per Michal Hocko
>>
>>   mm/huge_memory.c | 4 ++--
>>   mm/shmem.c       | 3 +++
>>   2 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 165ea46..5881e82 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -67,8 +67,8 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
>>   {
>>   	if (vma_is_anonymous(vma))
>>   		return __transparent_hugepage_enabled(vma);
>> -	if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
>> -		return __transparent_hugepage_enabled(vma);
>> +	if (vma_is_shmem(vma))
>> +		return shmem_huge_enabled(vma);
>>   
>>   	return false;
>>   }
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index 2275a0f..6f09a31 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -3873,6 +3873,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
>>   	loff_t i_size;
>>   	pgoff_t off;
>>   
>> +	if ((vma->vm_flags & VM_NOHUGEPAGE) ||
>> +	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
>> +		return false;
> Yes, that is correct; and correctly placed. But a little more is needed:
> see how mm/memory.c's transhuge_vma_suitable() will only allow a pmd to
> be used instead of a pte if the vma offset and size permit. smaps should
> not report a shmem vma as THPeligible if its offset or size prevent it.
>
> And I see that should also be fixed on anon vmas: at present smaps
> reports even a 4kB anon vma as THPeligible, which is not right.
> Maybe a test like transhuge_vma_suitable() can be added into
> transparent_hugepage_enabled(), to handle anon and shmem together.
> I say "like transhuge_vma_suitable()", because that function needs
> an address, which here you don't have.

Thanks for the remind. Since we don't have an address I'm supposed we 
just need check if the vma's size is big enough or not other than other 
alignment check.

And, I'm wondering whether we could reuse transhuge_vma_suitable() by 
passing in an impossible address, i.e. -1 since it is not a valid 
userspace address. It can be used as and indicator that this call is 
from THPeligible context.

>
> The anon offset situation is interesting: usually anon vm_pgoff is
> initialized to fit with its vm_start, so the anon offset check passes;
> but I wonder what happens after mremap to a different address - does
> transhuge_vma_suitable() then prevent the use of pmds where they could
> actually be used? Not a Number#1 priority to investigate or fix here!
> but a curiosity someone might want to look into.

Will mark on my TODO list.

>
>>   	if (shmem_huge == SHMEM_HUGE_FORCE)
>>   		return true;
>>   	if (shmem_huge == SHMEM_HUGE_DENY)
>> -- 
>> 1.8.3.1
>
> Even with your changes
> ShmemPmdMapped:     4096 kB
> THPeligible:    0
> will easily be seen: THPeligible reflects whether a huge page can be
> allocated and mapped by pmd in that vma; but if something else already
> allocated the huge page earlier, it will be mapped by pmd in this vma
> if offset and size allow, whatever THPeligible says. We could change
> transhuge_vma_suitable() to force ptes in that case, but it would be
> a silly change, just to make what smaps shows easier to explain.

Where did this come from? From the commit log? If so it is the example 
for the wrong smap output. If that case really happens, I think we could 
document it since THPeligible should just show the current status.

>
> Hugh


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-06-10 17:33   ` Yang Shi
@ 2019-06-12 18:44     ` Hugh Dickins
  2019-06-12 19:59       ` Yang Shi
  0 siblings, 1 reply; 17+ messages in thread
From: Hugh Dickins @ 2019-06-12 18:44 UTC (permalink / raw)
  To: Yang Shi
  Cc: Hugh Dickins, mhocko, vbabka, rientjes, kirill, kirill.shutemov,
	akpm, linux-mm, linux-kernel

On Mon, 10 Jun 2019, Yang Shi wrote:
> On 6/7/19 8:58 PM, Hugh Dickins wrote:
> > Yes, that is correct; and correctly placed. But a little more is needed:
> > see how mm/memory.c's transhuge_vma_suitable() will only allow a pmd to
> > be used instead of a pte if the vma offset and size permit. smaps should
> > not report a shmem vma as THPeligible if its offset or size prevent it.
> > 
> > And I see that should also be fixed on anon vmas: at present smaps
> > reports even a 4kB anon vma as THPeligible, which is not right.
> > Maybe a test like transhuge_vma_suitable() can be added into
> > transparent_hugepage_enabled(), to handle anon and shmem together.
> > I say "like transhuge_vma_suitable()", because that function needs
> > an address, which here you don't have.
> 
> Thanks for the remind. Since we don't have an address I'm supposed we just
> need check if the vma's size is big enough or not other than other alignment
> check.
> 
> And, I'm wondering whether we could reuse transhuge_vma_suitable() by passing
> in an impossible address, i.e. -1 since it is not a valid userspace address.
> It can be used as and indicator that this call is from THPeligible context.

Perhaps, but sounds like it will abuse and uglify transhuge_vma_suitable()
just for smaps. Would passing transhuge_vma_suitable() the address
    ((vma->vm_end & HPAGE_PMD_MASK) - HPAGE_PMD_SIZE)
give the the correct answer in all cases?

> > 
> > The anon offset situation is interesting: usually anon vm_pgoff is
> > initialized to fit with its vm_start, so the anon offset check passes;
> > but I wonder what happens after mremap to a different address - does
> > transhuge_vma_suitable() then prevent the use of pmds where they could
> > actually be used? Not a Number#1 priority to investigate or fix here!
> > but a curiosity someone might want to look into.
> 
> Will mark on my TODO list.
> 
> > Even with your changes
> > ShmemPmdMapped:     4096 kB
> > THPeligible:    0
> > will easily be seen: THPeligible reflects whether a huge page can be
> > allocated and mapped by pmd in that vma; but if something else already
> > allocated the huge page earlier, it will be mapped by pmd in this vma
> > if offset and size allow, whatever THPeligible says. We could change
> > transhuge_vma_suitable() to force ptes in that case, but it would be
> > a silly change, just to make what smaps shows easier to explain.
> 
> Where did this come from? From the commit log? If so it is the example for
> the wrong smap output. If that case really happens, I think we could document
> it since THPeligible should just show the current status.

Please read again what I explained there: it's not necessarily an example
of wrong smaps output, it's reasonable smaps output for a reasonable case.

Yes, maybe Documentation/filesystems/proc.txt should explain "THPeligble"
a little better - "eligible for allocating THP pages" rather than just
"eligible for THP pages" would be good enough? we don't want to write
a book about the various cases.

Oh, and the "THPeligible" output lines up very nicely there in proc.txt:
could the actual alignment of that 0 or 1 be fixed in smaps itself too?

Thanks,
Hugh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
  2019-06-12 18:44     ` Hugh Dickins
@ 2019-06-12 19:59       ` Yang Shi
  0 siblings, 0 replies; 17+ messages in thread
From: Yang Shi @ 2019-06-12 19:59 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: mhocko, vbabka, rientjes, kirill, kirill.shutemov, akpm,
	linux-mm, linux-kernel



On 6/12/19 11:44 AM, Hugh Dickins wrote:
> On Mon, 10 Jun 2019, Yang Shi wrote:
>> On 6/7/19 8:58 PM, Hugh Dickins wrote:
>>> Yes, that is correct; and correctly placed. But a little more is needed:
>>> see how mm/memory.c's transhuge_vma_suitable() will only allow a pmd to
>>> be used instead of a pte if the vma offset and size permit. smaps should
>>> not report a shmem vma as THPeligible if its offset or size prevent it.
>>>
>>> And I see that should also be fixed on anon vmas: at present smaps
>>> reports even a 4kB anon vma as THPeligible, which is not right.
>>> Maybe a test like transhuge_vma_suitable() can be added into
>>> transparent_hugepage_enabled(), to handle anon and shmem together.
>>> I say "like transhuge_vma_suitable()", because that function needs
>>> an address, which here you don't have.
>> Thanks for the remind. Since we don't have an address I'm supposed we just
>> need check if the vma's size is big enough or not other than other alignment
>> check.
>>
>> And, I'm wondering whether we could reuse transhuge_vma_suitable() by passing
>> in an impossible address, i.e. -1 since it is not a valid userspace address.
>> It can be used as and indicator that this call is from THPeligible context.
> Perhaps, but sounds like it will abuse and uglify transhuge_vma_suitable()
> just for smaps. Would passing transhuge_vma_suitable() the address
>      ((vma->vm_end & HPAGE_PMD_MASK) - HPAGE_PMD_SIZE)
> give the the correct answer in all cases?

Yes, it looks better.

>
>>> The anon offset situation is interesting: usually anon vm_pgoff is
>>> initialized to fit with its vm_start, so the anon offset check passes;
>>> but I wonder what happens after mremap to a different address - does
>>> transhuge_vma_suitable() then prevent the use of pmds where they could
>>> actually be used? Not a Number#1 priority to investigate or fix here!
>>> but a curiosity someone might want to look into.
>> Will mark on my TODO list.
>>
>>> Even with your changes
>>> ShmemPmdMapped:     4096 kB
>>> THPeligible:    0
>>> will easily be seen: THPeligible reflects whether a huge page can be
>>> allocated and mapped by pmd in that vma; but if something else already
>>> allocated the huge page earlier, it will be mapped by pmd in this vma
>>> if offset and size allow, whatever THPeligible says. We could change
>>> transhuge_vma_suitable() to force ptes in that case, but it would be
>>> a silly change, just to make what smaps shows easier to explain.
>> Where did this come from? From the commit log? If so it is the example for
>> the wrong smap output. If that case really happens, I think we could document
>> it since THPeligible should just show the current status.
> Please read again what I explained there: it's not necessarily an example
> of wrong smaps output, it's reasonable smaps output for a reasonable case.
>
> Yes, maybe Documentation/filesystems/proc.txt should explain "THPeligble"
> a little better - "eligible for allocating THP pages" rather than just
> "eligible for THP pages" would be good enough? we don't want to write
> a book about the various cases.

Yes, I agree.

>
> Oh, and the "THPeligible" output lines up very nicely there in proc.txt:
> could the actual alignment of that 0 or 1 be fixed in smaps itself too?

Sure.

Thanks,
Yang

>
> Thanks,
> Hugh


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-06-12 19:59 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-23 16:43 [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility Yang Shi
2019-04-23 17:52 ` Michal Hocko
2019-04-23 18:34   ` Yang Shi
2019-04-24  0:22     ` Yang Shi
2019-04-24  7:58       ` Michal Hocko
2019-04-28 19:13   ` Yang Shi
2019-05-06 23:37     ` Yang Shi
2019-05-07 10:47       ` Michal Hocko
2019-05-07 17:10         ` Yang Shi
2019-06-06 18:59           ` Yang Shi
2019-06-07 10:57             ` Hugh Dickins
2019-06-07 14:25               ` Michal Hocko
2019-06-07 18:51               ` Yang Shi
2019-06-08  3:58 ` Hugh Dickins
2019-06-10 17:33   ` Yang Shi
2019-06-12 18:44     ` Hugh Dickins
2019-06-12 19:59       ` Yang Shi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).