linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS
       [not found] <20190624031604.7764-1-hdanton@sina.com>
@ 2019-06-24  4:27 ` Song Liu
  0 siblings, 0 replies; 11+ messages in thread
From: Song Liu @ 2019-06-24  4:27 UTC (permalink / raw)
  To: Hillf Danton
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, Kernel Team, william.kucharski, akpm

Hi Hillf,

> On Jun 23, 2019, at 8:16 PM, Hillf Danton <hdanton@sina.com> wrote:
> 
> 
> Hello
> 
> On Sun, 23 Jun 2019 13:48:47 +0800 Song Liu wrote:
>> This patch is (hopefully) the first step to enable THP for non-shmem
>> filesystems.
>> 
>> This patch enables an application to put part of its text sections to THP
>> via madvise, for example:
>> 
>>    madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);
>> 
>> We tried to reuse the logic for THP on tmpfs.
>> 
>> Currently, write is not supported for non-shmem THP. khugepaged will only
>> process vma with VM_DENYWRITE. The next patch will handle writes, which
>> would only happen when the vma with VM_DENYWRITE is unmapped.
>> 
>> An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
>> feature.
>> 
>> Acked-by: Rik van Riel <riel@surriel.com>
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
>> mm/Kconfig      | 11 ++++++
>> mm/filemap.c    |  4 +--
>> mm/khugepaged.c | 90 ++++++++++++++++++++++++++++++++++++++++---------
>> mm/rmap.c       | 12 ++++---
>> 4 files changed, 96 insertions(+), 21 deletions(-)
>> 
>> diff --git a/mm/Kconfig b/mm/Kconfig
>> index f0c76ba47695..0a8fd589406d 100644
>> --- a/mm/Kconfig
>> +++ b/mm/Kconfig
>> @@ -762,6 +762,17 @@ config GUP_BENCHMARK
>> 
>> 	  See tools/testing/selftests/vm/gup_benchmark.c
>> 
>> +config READ_ONLY_THP_FOR_FS
>> +	bool "Read-only THP for filesystems (EXPERIMENTAL)"
>> +	depends on TRANSPARENT_HUGE_PAGECACHE && SHMEM
>> +
> The ext4 mentioned in the cover letter, along with the subject line of
> this patch, suggests dropping SHMEM from the dependency.

We reuse khugepaged code for SHMEM, so the dependency does exist. 

Thanks,
Song



* Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem)FS
       [not found] <20190624074816.10992-1-hdanton@sina.com>
@ 2019-06-24 21:17 ` Song Liu
  0 siblings, 0 replies; 11+ messages in thread
From: Song Liu @ 2019-06-24 21:17 UTC (permalink / raw)
  To: Hillf Danton
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, Kernel Team, william.kucharski, akpm



> On Jun 24, 2019, at 12:48 AM, Hillf Danton <hdanton@sina.com> wrote:
> 
> 
> Hello
> 
> On Mon, 24 Jun 2019 12:28:32 +0800 Song Liu wrote:
>> 
>> Hi Hillf,
>> 
>>> On Jun 23, 2019, at 8:16 PM, Hillf Danton <hdanton@sina.com> wrote:
>>> 
>>> 
>>> Hello
>>> 
>>> On Sun, 23 Jun 2019 13:48:47 +0800 Song Liu wrote:
>>>> This patch is (hopefully) the first step to enable THP for non-shmem
>>>> filesystems.
>>>> 
>>>> This patch enables an application to put part of its text sections to THP
>>>> via madvise, for example:
>>>> 
>>>>   madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);
>>>> 
>>>> We tried to reuse the logic for THP on tmpfs.
>>>> 
>>>> Currently, write is not supported for non-shmem THP. khugepaged will only
>>>> process vma with VM_DENYWRITE. The next patch will handle writes, which
>>>> would only happen when the vma with VM_DENYWRITE is unmapped.
>>>> 
>>>> An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
>>>> feature.
>>>> 
>>>> Acked-by: Rik van Riel <riel@surriel.com>
>>>> Signed-off-by: Song Liu <songliubraving@fb.com>
>>>> ---
>>>> mm/Kconfig      | 11 ++++++
>>>> mm/filemap.c    |  4 +--
>>>> mm/khugepaged.c | 90 ++++++++++++++++++++++++++++++++++++++++---------
>>>> mm/rmap.c       | 12 ++++---
>>>> 4 files changed, 96 insertions(+), 21 deletions(-)
>>>> 
>>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>>> index f0c76ba47695..0a8fd589406d 100644
>>>> --- a/mm/Kconfig
>>>> +++ b/mm/Kconfig
>>>> @@ -762,6 +762,17 @@ config GUP_BENCHMARK
>>>> 
>>>> 	  See tools/testing/selftests/vm/gup_benchmark.c
>>>> 
>>>> +config READ_ONLY_THP_FOR_FS
>>>> +	bool "Read-only THP for filesystems (EXPERIMENTAL)"
>>>> +	depends on TRANSPARENT_HUGE_PAGECACHE && SHMEM
>>>> +
>>> The ext4 mentioned in the cover letter, along with the subject line of
>>> this patch, suggests dropping SHMEM from the dependency.
>> 
>> We reuse khugepaged code for SHMEM, so the dependency does exist.
>> 
> On the other hand, I see collapse_file() and khugepaged_scan_file(), and
> wonder if ext4 files can be handled by the new functions. If so, we could
> drop that dependency for read-only THP and let ext4 be ext4 and shmem be
> shmem, as they are.

Ext4 files can be handled by these functions. We will need fs-specific
code for writable THPs in the future.

In the longer term, once the code (with write support) becomes more stable,
we will drop this config. For now, I think it is OK to depend on SHMEM.

Thanks,
Song







* Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-24 15:15               ` Kirill A. Shutemov
@ 2019-06-24 16:33                 ` Song Liu
  0 siblings, 0 replies; 11+ messages in thread
From: Song Liu @ 2019-06-24 16:33 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, Kernel Team, william.kucharski, akpm, hdanton



> On Jun 24, 2019, at 8:15 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Mon, Jun 24, 2019 at 03:04:21PM +0000, Song Liu wrote:
>> 
>> 
>>> On Jun 24, 2019, at 7:54 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>>> 
>>> On Mon, Jun 24, 2019 at 02:42:13PM +0000, Song Liu wrote:
>>>> 
>>>> 
>>>>> On Jun 24, 2019, at 7:27 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>>>>> 
>>>>> On Mon, Jun 24, 2019 at 02:01:05PM +0000, Song Liu wrote:
>>>>>>>> @@ -1392,6 +1403,23 @@ static void collapse_file(struct mm_struct *mm,
>>>>>>>> 				result = SCAN_FAIL;
>>>>>>>> 				goto xa_unlocked;
>>>>>>>> 			}
>>>>>>>> +		} else if (!page || xa_is_value(page)) {
>>>>>>>> +			xas_unlock_irq(&xas);
>>>>>>>> +			page_cache_sync_readahead(mapping, &file->f_ra, file,
>>>>>>>> +						  index, PAGE_SIZE);
>>>>>>>> +			lru_add_drain();
>>>>>>> 
>>>>>>> Why?
>>>>>> 
>>>>>> isolate_lru_page() is likely to fail if we don't drain the pagevecs. 
>>>>> 
>>>>> Please add a comment.
>>>> 
>>>> Will do. 
>>>> 
>>>>> 
>>>>>>>> +			page = find_lock_page(mapping, index);
>>>>>>>> +			if (unlikely(page == NULL)) {
>>>>>>>> +				result = SCAN_FAIL;
>>>>>>>> +				goto xa_unlocked;
>>>>>>>> +			}
>>>>>>>> +		} else if (!PageUptodate(page)) {
>>>>>>> 
>>>>>>> Maybe we should try wait_on_page_locked() here before giving up?
>>>>>> 
>>>>>> Are you referring to the "if (!PageUptodate(page))" case? 
>>>>> 
>>>>> Yes.
>>>> 
>>>> I think this case happens when another thread is reading the page in. 
>>>> I could not think of a way to trigger this condition for testing. 
>>>> 
>>>> On the other hand, with current logic, we will retry the page on the 
>>>> next scan, so I guess this is OK. 
>>> 
>>> What I meant is that calling wait_on_page_locked() on a !PageUptodate() page
>>> will likely make it up to date, so we don't need to SCAN_FAIL the attempt.
>>> 
>> 
>> Yeah, I got the point. My only concern is that I don't know how to 
>> reliably trigger this case for testing. I can try to trigger it. But I 
>> don't know whether it will happen easily. 
> 
> Artificially slowing down IO should do the trick.
> 

Let me try that. 

Thanks,
Song



* Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-24 15:04             ` Song Liu
@ 2019-06-24 15:15               ` Kirill A. Shutemov
  2019-06-24 16:33                 ` Song Liu
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill A. Shutemov @ 2019-06-24 15:15 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, Kernel Team, william.kucharski, akpm, hdanton

On Mon, Jun 24, 2019 at 03:04:21PM +0000, Song Liu wrote:
> 
> 
> > On Jun 24, 2019, at 7:54 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > 
> > On Mon, Jun 24, 2019 at 02:42:13PM +0000, Song Liu wrote:
> >> 
> >> 
> >>> On Jun 24, 2019, at 7:27 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> >>> 
> >>> On Mon, Jun 24, 2019 at 02:01:05PM +0000, Song Liu wrote:
> >>>>>> @@ -1392,6 +1403,23 @@ static void collapse_file(struct mm_struct *mm,
> >>>>>> 				result = SCAN_FAIL;
> >>>>>> 				goto xa_unlocked;
> >>>>>> 			}
> >>>>>> +		} else if (!page || xa_is_value(page)) {
> >>>>>> +			xas_unlock_irq(&xas);
> >>>>>> +			page_cache_sync_readahead(mapping, &file->f_ra, file,
> >>>>>> +						  index, PAGE_SIZE);
> >>>>>> +			lru_add_drain();
> >>>>> 
> >>>>> Why?
> >>>> 
> >>>> isolate_lru_page() is likely to fail if we don't drain the pagevecs. 
> >>> 
> >>> Please add a comment.
> >> 
> >> Will do. 
> >> 
> >>> 
> >>>>>> +			page = find_lock_page(mapping, index);
> >>>>>> +			if (unlikely(page == NULL)) {
> >>>>>> +				result = SCAN_FAIL;
> >>>>>> +				goto xa_unlocked;
> >>>>>> +			}
> >>>>>> +		} else if (!PageUptodate(page)) {
> >>>>> 
> >>>>> Maybe we should try wait_on_page_locked() here before giving up?
> >>>> 
> >>>> Are you referring to the "if (!PageUptodate(page))" case? 
> >>> 
> >>> Yes.
> >> 
> >> I think this case happens when another thread is reading the page in. 
> >> I could not think of a way to trigger this condition for testing. 
> >> 
> >> On the other hand, with current logic, we will retry the page on the 
> >> next scan, so I guess this is OK. 
> > 
> > What I meant is that calling wait_on_page_locked() on a !PageUptodate() page
> > will likely make it up to date, so we don't need to SCAN_FAIL the attempt.
> > 
> 
> Yeah, I got the point. My only concern is that I don't know how to 
> reliably trigger this case for testing. I can try to trigger it. But I 
> don't know whether it will happen easily. 

Artificially slowing down IO should do the trick.

-- 
 Kirill A. Shutemov


* Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-24 14:54           ` Kirill A. Shutemov
@ 2019-06-24 15:04             ` Song Liu
  2019-06-24 15:15               ` Kirill A. Shutemov
  0 siblings, 1 reply; 11+ messages in thread
From: Song Liu @ 2019-06-24 15:04 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, Kernel Team, william.kucharski, akpm, hdanton



> On Jun 24, 2019, at 7:54 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Mon, Jun 24, 2019 at 02:42:13PM +0000, Song Liu wrote:
>> 
>> 
>>> On Jun 24, 2019, at 7:27 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>>> 
>>> On Mon, Jun 24, 2019 at 02:01:05PM +0000, Song Liu wrote:
>>>>>> @@ -1392,6 +1403,23 @@ static void collapse_file(struct mm_struct *mm,
>>>>>> 				result = SCAN_FAIL;
>>>>>> 				goto xa_unlocked;
>>>>>> 			}
>>>>>> +		} else if (!page || xa_is_value(page)) {
>>>>>> +			xas_unlock_irq(&xas);
>>>>>> +			page_cache_sync_readahead(mapping, &file->f_ra, file,
>>>>>> +						  index, PAGE_SIZE);
>>>>>> +			lru_add_drain();
>>>>> 
>>>>> Why?
>>>> 
>>>> isolate_lru_page() is likely to fail if we don't drain the pagevecs. 
>>> 
>>> Please add a comment.
>> 
>> Will do. 
>> 
>>> 
>>>>>> +			page = find_lock_page(mapping, index);
>>>>>> +			if (unlikely(page == NULL)) {
>>>>>> +				result = SCAN_FAIL;
>>>>>> +				goto xa_unlocked;
>>>>>> +			}
>>>>>> +		} else if (!PageUptodate(page)) {
>>>>> 
>>>>> Maybe we should try wait_on_page_locked() here before giving up?
>>>> 
>>>> Are you referring to the "if (!PageUptodate(page))" case? 
>>> 
>>> Yes.
>> 
>> I think this case happens when another thread is reading the page in. 
>> I could not think of a way to trigger this condition for testing. 
>> 
>> On the other hand, with current logic, we will retry the page on the 
>> next scan, so I guess this is OK. 
> 
> What I meant is that calling wait_on_page_locked() on a !PageUptodate() page
> will likely make it up to date, so we don't need to SCAN_FAIL the attempt.
> 

Yeah, I got the point. My only concern is that I don't know how to 
reliably trigger this case for testing. I can try to trigger it. But I 
don't know whether it will happen easily. 

Thanks,
Song






* Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-24 14:42         ` Song Liu
@ 2019-06-24 14:54           ` Kirill A. Shutemov
  2019-06-24 15:04             ` Song Liu
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill A. Shutemov @ 2019-06-24 14:54 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, Kernel Team, william.kucharski, akpm, hdanton

On Mon, Jun 24, 2019 at 02:42:13PM +0000, Song Liu wrote:
> 
> 
> > On Jun 24, 2019, at 7:27 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > 
> > On Mon, Jun 24, 2019 at 02:01:05PM +0000, Song Liu wrote:
> >>>> @@ -1392,6 +1403,23 @@ static void collapse_file(struct mm_struct *mm,
> >>>> 				result = SCAN_FAIL;
> >>>> 				goto xa_unlocked;
> >>>> 			}
> >>>> +		} else if (!page || xa_is_value(page)) {
> >>>> +			xas_unlock_irq(&xas);
> >>>> +			page_cache_sync_readahead(mapping, &file->f_ra, file,
> >>>> +						  index, PAGE_SIZE);
> >>>> +			lru_add_drain();
> >>> 
> >>> Why?
> >> 
> >> isolate_lru_page() is likely to fail if we don't drain the pagevecs. 
> > 
> > Please add a comment.
> 
> Will do. 
> 
> > 
> >>>> +			page = find_lock_page(mapping, index);
> >>>> +			if (unlikely(page == NULL)) {
> >>>> +				result = SCAN_FAIL;
> >>>> +				goto xa_unlocked;
> >>>> +			}
> >>>> +		} else if (!PageUptodate(page)) {
> >>> 
> >>> Maybe we should try wait_on_page_locked() here before giving up?
> >> 
> >> Are you referring to the "if (!PageUptodate(page))" case? 
> > 
> > Yes.
> 
> I think this case happens when another thread is reading the page in. 
> I could not think of a way to trigger this condition for testing. 
> 
> On the other hand, with current logic, we will retry the page on the 
> next scan, so I guess this is OK. 

What I meant is that calling wait_on_page_locked() on a !PageUptodate() page
will likely make it up to date, so we don't need to SCAN_FAIL the attempt.

-- 
 Kirill A. Shutemov
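
Concretely, one way to read this suggestion is the sketch below. It is an
illustration only, not code posted in this thread: a page reference is taken
so the page cannot be freed while sleeping, and the page is re-looked-up
afterwards.

        } else if (!PageUptodate(page)) {
                VM_BUG_ON(is_shmem);
                /*
                 * Another thread is probably still reading the page in.
                 * Wait for its page lock to be released instead of
                 * failing the collapse; by then the read has usually
                 * completed and the page is up to date.
                 */
                get_page(page);                 /* pin across the unlock */
                xas_unlock_irq(&xas);
                wait_on_page_locked(page);
                put_page(page);
                page = find_lock_page(mapping, index);
                if (unlikely(!page)) {
                        result = SCAN_FAIL;
                        goto xa_unlocked;
                }
                if (!PageUptodate(page)) {      /* e.g. a read error */
                        unlock_page(page);
                        put_page(page);
                        result = SCAN_FAIL;
                        goto xa_unlocked;
                }

As in the readahead branch above it, the code would then fall through to the
common path with the page locked and referenced.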


* Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-24 14:27       ` Kirill A. Shutemov
@ 2019-06-24 14:42         ` Song Liu
  2019-06-24 14:54           ` Kirill A. Shutemov
  0 siblings, 1 reply; 11+ messages in thread
From: Song Liu @ 2019-06-24 14:42 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, Kernel Team, william.kucharski, akpm, hdanton



> On Jun 24, 2019, at 7:27 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Mon, Jun 24, 2019 at 02:01:05PM +0000, Song Liu wrote:
>>>> @@ -1392,6 +1403,23 @@ static void collapse_file(struct mm_struct *mm,
>>>> 				result = SCAN_FAIL;
>>>> 				goto xa_unlocked;
>>>> 			}
>>>> +		} else if (!page || xa_is_value(page)) {
>>>> +			xas_unlock_irq(&xas);
>>>> +			page_cache_sync_readahead(mapping, &file->f_ra, file,
>>>> +						  index, PAGE_SIZE);
>>>> +			lru_add_drain();
>>> 
>>> Why?
>> 
>> isolate_lru_page() is likely to fail if we don't drain the pagevecs. 
> 
> Please add a comment.

Will do. 

> 
>>>> +			page = find_lock_page(mapping, index);
>>>> +			if (unlikely(page == NULL)) {
>>>> +				result = SCAN_FAIL;
>>>> +				goto xa_unlocked;
>>>> +			}
>>>> +		} else if (!PageUptodate(page)) {
>>> 
>>> Maybe we should try wait_on_page_locked() here before giving up?
>> 
>> Are you referring to the "if (!PageUptodate(page))" case? 
> 
> Yes.

I think this case happens when another thread is reading the page in. 
I could not think of a way to trigger this condition for testing. 

On the other hand, with current logic, we will retry the page on the 
next scan, so I guess this is OK. 

Thanks,
Song
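
For readers following along, here is a minimal sketch of the comment promised
above, layered on the v7 hunk quoted in this message (the wording is an
assumption; the actual follow-up revision may phrase it differently):

        } else if (!page || xa_is_value(page)) {
                xas_unlock_irq(&xas);
                page_cache_sync_readahead(mapping, &file->f_ra, file,
                                          index, PAGE_SIZE);
                /*
                 * Readahead adds the new pages to the per-CPU LRU
                 * pagevecs rather than directly to the LRU lists, and
                 * isolate_lru_page() further down fails for pages that
                 * are not yet on an LRU list.  Drain the pagevecs so
                 * the isolation can succeed.
                 */
                lru_add_drain();
                page = find_lock_page(mapping, index);
                if (unlikely(page == NULL)) {
                        result = SCAN_FAIL;
                        goto xa_unlocked;
                }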


* Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-24 14:01     ` Song Liu
@ 2019-06-24 14:27       ` Kirill A. Shutemov
  2019-06-24 14:42         ` Song Liu
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill A. Shutemov @ 2019-06-24 14:27 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, Kernel Team, william.kucharski, akpm, hdanton

On Mon, Jun 24, 2019 at 02:01:05PM +0000, Song Liu wrote:
> >> @@ -1392,6 +1403,23 @@ static void collapse_file(struct mm_struct *mm,
> >> 				result = SCAN_FAIL;
> >> 				goto xa_unlocked;
> >> 			}
> >> +		} else if (!page || xa_is_value(page)) {
> >> +			xas_unlock_irq(&xas);
> >> +			page_cache_sync_readahead(mapping, &file->f_ra, file,
> >> +						  index, PAGE_SIZE);
> >> +			lru_add_drain();
> > 
> > Why?
> 
> isolate_lru_page() is likely to fail if we don't drain the pagevecs. 

Please add a comment.

> >> +			page = find_lock_page(mapping, index);
> >> +			if (unlikely(page == NULL)) {
> >> +				result = SCAN_FAIL;
> >> +				goto xa_unlocked;
> >> +			}
> >> +		} else if (!PageUptodate(page)) {
> > 
> > Maybe we should try wait_on_page_locked() here before giving up?
> 
> Are you referring to the "if (!PageUptodate(page))" case? 

Yes.

-- 
 Kirill A. Shutemov


* Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-24 12:47   ` Kirill A. Shutemov
@ 2019-06-24 14:01     ` Song Liu
  2019-06-24 14:27       ` Kirill A. Shutemov
  0 siblings, 1 reply; 11+ messages in thread
From: Song Liu @ 2019-06-24 14:01 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, Kernel Team, william.kucharski, akpm, hdanton



> On Jun 24, 2019, at 5:47 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Sat, Jun 22, 2019 at 10:47:48PM -0700, Song Liu wrote:
>> This patch is (hopefully) the first step to enable THP for non-shmem
>> filesystems.
>> 
>> This patch enables an application to put part of its text sections to THP
>> via madvise, for example:
>> 
>>    madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);
>> 
>> We tried to reuse the logic for THP on tmpfs.
>> 
>> Currently, write is not supported for non-shmem THP. khugepaged will only
>> process vma with VM_DENYWRITE. The next patch will handle writes, which
>> would only happen when the vma with VM_DENYWRITE is unmapped.
>> 
>> An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
>> feature.
>> 
>> Acked-by: Rik van Riel <riel@surriel.com>
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
>> mm/Kconfig      | 11 ++++++
>> mm/filemap.c    |  4 +--
>> mm/khugepaged.c | 90 ++++++++++++++++++++++++++++++++++++++++---------
>> mm/rmap.c       | 12 ++++---
>> 4 files changed, 96 insertions(+), 21 deletions(-)
>> 
>> diff --git a/mm/Kconfig b/mm/Kconfig
>> index f0c76ba47695..0a8fd589406d 100644
>> --- a/mm/Kconfig
>> +++ b/mm/Kconfig
>> @@ -762,6 +762,17 @@ config GUP_BENCHMARK
>> 
>> 	  See tools/testing/selftests/vm/gup_benchmark.c
>> 
>> +config READ_ONLY_THP_FOR_FS
>> +	bool "Read-only THP for filesystems (EXPERIMENTAL)"
>> +	depends on TRANSPARENT_HUGE_PAGECACHE && SHMEM
>> +
>> +	help
>> +	  Allow khugepaged to put read-only file-backed pages in THP.
>> +
>> +	  This is marked experimental because it is a new feature. Write
>> +	  support of file THPs will be developed in the next few release
>> +	  cycles.
>> +
>> config ARCH_HAS_PTE_SPECIAL
>> 	bool
>> 
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 5f072a113535..e79ceccdc6df 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -203,8 +203,8 @@ static void unaccount_page_cache_page(struct address_space *mapping,
>> 		__mod_node_page_state(page_pgdat(page), NR_SHMEM, -nr);
>> 		if (PageTransHuge(page))
>> 			__dec_node_page_state(page, NR_SHMEM_THPS);
>> -	} else {
>> -		VM_BUG_ON_PAGE(PageTransHuge(page), page);
>> +	} else if (PageTransHuge(page)) {
>> +		__dec_node_page_state(page, NR_FILE_THPS);
>> 	}
>> 
>> 	/*
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 158cad542627..090127e4e185 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -48,6 +48,7 @@ enum scan_result {
>> 	SCAN_CGROUP_CHARGE_FAIL,
>> 	SCAN_EXCEED_SWAP_PTE,
>> 	SCAN_TRUNCATED,
>> +	SCAN_PAGE_HAS_PRIVATE,
>> };
>> 
>> #define CREATE_TRACE_POINTS
>> @@ -404,7 +405,11 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
>> 	    (vm_flags & VM_NOHUGEPAGE) ||
>> 	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
>> 		return false;
>> -	if (shmem_file(vma->vm_file)) {
>> +
>> +	if (shmem_file(vma->vm_file) ||
>> +	    (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
>> +	     vma->vm_file &&
>> +	     (vm_flags & VM_DENYWRITE))) {
>> 		if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGE_PAGECACHE))
>> 			return false;
>> 		return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
>> @@ -456,8 +461,9 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
>> 	unsigned long hstart, hend;
>> 
>> 	/*
>> -	 * khugepaged does not yet work on non-shmem files or special
>> -	 * mappings. And file-private shmem THP is not supported.
>> +	 * khugepaged only supports read-only files for non-shmem files.
>> +	 * khugepaged does not yet work on special mappings. And
>> +	 * file-private shmem THP is not supported.
>> 	 */
>> 	if (!hugepage_vma_check(vma, vm_flags))
>> 		return 0;
>> @@ -1287,12 +1293,12 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
>> }
>> 
>> /**
>> - * collapse_file - collapse small tmpfs/shmem pages into huge one.
>> + * collapse_file - collapse filemap/tmpfs/shmem pages into huge one.
>>  *
>>  * Basic scheme is simple, details are more complex:
>>  *  - allocate and lock a new huge page;
>>  *  - scan page cache replacing old pages with the new one
>> - *    + swap in pages if necessary;
>> + *    + swap/gup in pages if necessary;
>>  *    + fill in gaps;
>>  *    + keep old pages around in case rollback is required;
>>  *  - if replacing succeeds:
>> @@ -1316,7 +1322,11 @@ static void collapse_file(struct mm_struct *mm,
>> 	LIST_HEAD(pagelist);
>> 	XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER);
>> 	int nr_none = 0, result = SCAN_SUCCEED;
>> +	bool is_shmem = shmem_file(file);
>> 
>> +#ifndef CONFIG_READ_ONLY_THP_FOR_FS
>> +	VM_BUG_ON(!is_shmem);
>> +#endif
> 
> 	VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);

Will fix. 

> 
>> 	VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
>> 
>> 	/* Only allocate from the target node */
>> @@ -1348,7 +1358,8 @@ static void collapse_file(struct mm_struct *mm,
>> 	} while (1);
>> 
>> 	__SetPageLocked(new_page);
>> -	__SetPageSwapBacked(new_page);
>> +	if (is_shmem)
>> +		__SetPageSwapBacked(new_page);
>> 	new_page->index = start;
>> 	new_page->mapping = mapping;
>> 
>> @@ -1363,7 +1374,7 @@ static void collapse_file(struct mm_struct *mm,
>> 		struct page *page = xas_next(&xas);
>> 
>> 		VM_BUG_ON(index != xas.xa_index);
>> -		if (!page) {
>> +		if (is_shmem && !page) {
>> 			/*
>> 			 * Stop if extent has been truncated or hole-punched,
>> 			 * and is now completely empty.
>> @@ -1384,7 +1395,7 @@ static void collapse_file(struct mm_struct *mm,
>> 			continue;
>> 		}
>> 
>> -		if (xa_is_value(page) || !PageUptodate(page)) {
>> +		if (is_shmem && (xa_is_value(page) || !PageUptodate(page))) {
>> 			xas_unlock_irq(&xas);
>> 			/* swap in or instantiate fallocated page */
>> 			if (shmem_getpage(mapping->host, index, &page,
>> @@ -1392,6 +1403,23 @@ static void collapse_file(struct mm_struct *mm,
>> 				result = SCAN_FAIL;
>> 				goto xa_unlocked;
>> 			}
>> +		} else if (!page || xa_is_value(page)) {
>> +			xas_unlock_irq(&xas);
>> +			page_cache_sync_readahead(mapping, &file->f_ra, file,
>> +						  index, PAGE_SIZE);
>> +			lru_add_drain();
> 
> Why?

isolate_lru_page() is likely to fail if we don't drain the pagevecs. 

> 
>> +			page = find_lock_page(mapping, index);
>> +			if (unlikely(page == NULL)) {
>> +				result = SCAN_FAIL;
>> +				goto xa_unlocked;
>> +			}
>> +		} else if (!PageUptodate(page)) {
> 
> Maybe we should try wait_on_page_locked() here before give up?

Are you referring to the "if (!PageUptodate(page))" case? 

> 
>> +			VM_BUG_ON(is_shmem);
>> +			result = SCAN_FAIL;
>> +			goto xa_locked;
>> +		} else if (!is_shmem && PageDirty(page)) {
>> +			result = SCAN_FAIL;
>> +			goto xa_locked;
>> 		} else if (trylock_page(page)) {
>> 			get_page(page);
>> 			xas_unlock_irq(&xas);
> -- 
> Kirill A. Shutemov



* Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-23  5:47 ` [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS Song Liu
@ 2019-06-24 12:47   ` Kirill A. Shutemov
  2019-06-24 14:01     ` Song Liu
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill A. Shutemov @ 2019-06-24 12:47 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, kernel-team, william.kucharski, akpm, hdanton

On Sat, Jun 22, 2019 at 10:47:48PM -0700, Song Liu wrote:
> This patch is (hopefully) the first step to enable THP for non-shmem
> filesystems.
> 
> This patch enables an application to put part of its text sections to THP
> via madvise, for example:
> 
>     madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);
> 
> We tried to reuse the logic for THP on tmpfs.
> 
> Currently, write is not supported for non-shmem THP. khugepaged will only
> process vma with VM_DENYWRITE. The next patch will handle writes, which
> would only happen when the vma with VM_DENYWRITE is unmapped.
> 
> An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
> feature.
> 
> Acked-by: Rik van Riel <riel@surriel.com>
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  mm/Kconfig      | 11 ++++++
>  mm/filemap.c    |  4 +--
>  mm/khugepaged.c | 90 ++++++++++++++++++++++++++++++++++++++++---------
>  mm/rmap.c       | 12 ++++---
>  4 files changed, 96 insertions(+), 21 deletions(-)
> 
> diff --git a/mm/Kconfig b/mm/Kconfig
> index f0c76ba47695..0a8fd589406d 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -762,6 +762,17 @@ config GUP_BENCHMARK
>  
>  	  See tools/testing/selftests/vm/gup_benchmark.c
>  
> +config READ_ONLY_THP_FOR_FS
> +	bool "Read-only THP for filesystems (EXPERIMENTAL)"
> +	depends on TRANSPARENT_HUGE_PAGECACHE && SHMEM
> +
> +	help
> +	  Allow khugepaged to put read-only file-backed pages in THP.
> +
> +	  This is marked experimental because it is a new feature. Write
> +	  support of file THPs will be developed in the next few release
> +	  cycles.
> +
>  config ARCH_HAS_PTE_SPECIAL
>  	bool
>  
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 5f072a113535..e79ceccdc6df 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -203,8 +203,8 @@ static void unaccount_page_cache_page(struct address_space *mapping,
>  		__mod_node_page_state(page_pgdat(page), NR_SHMEM, -nr);
>  		if (PageTransHuge(page))
>  			__dec_node_page_state(page, NR_SHMEM_THPS);
> -	} else {
> -		VM_BUG_ON_PAGE(PageTransHuge(page), page);
> +	} else if (PageTransHuge(page)) {
> +		__dec_node_page_state(page, NR_FILE_THPS);
>  	}
>  
>  	/*
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 158cad542627..090127e4e185 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -48,6 +48,7 @@ enum scan_result {
>  	SCAN_CGROUP_CHARGE_FAIL,
>  	SCAN_EXCEED_SWAP_PTE,
>  	SCAN_TRUNCATED,
> +	SCAN_PAGE_HAS_PRIVATE,
>  };
>  
>  #define CREATE_TRACE_POINTS
> @@ -404,7 +405,11 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
>  	    (vm_flags & VM_NOHUGEPAGE) ||
>  	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
>  		return false;
> -	if (shmem_file(vma->vm_file)) {
> +
> +	if (shmem_file(vma->vm_file) ||
> +	    (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
> +	     vma->vm_file &&
> +	     (vm_flags & VM_DENYWRITE))) {
>  		if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGE_PAGECACHE))
>  			return false;
>  		return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
> @@ -456,8 +461,9 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
>  	unsigned long hstart, hend;
>  
>  	/*
> -	 * khugepaged does not yet work on non-shmem files or special
> -	 * mappings. And file-private shmem THP is not supported.
> +	 * khugepaged only supports read-only files for non-shmem files.
> +	 * khugepaged does not yet work on special mappings. And
> +	 * file-private shmem THP is not supported.
>  	 */
>  	if (!hugepage_vma_check(vma, vm_flags))
>  		return 0;
> @@ -1287,12 +1293,12 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
>  }
>  
>  /**
> - * collapse_file - collapse small tmpfs/shmem pages into huge one.
> + * collapse_file - collapse filemap/tmpfs/shmem pages into huge one.
>   *
>   * Basic scheme is simple, details are more complex:
>   *  - allocate and lock a new huge page;
>   *  - scan page cache replacing old pages with the new one
> - *    + swap in pages if necessary;
> + *    + swap/gup in pages if necessary;
>   *    + fill in gaps;
>   *    + keep old pages around in case rollback is required;
>   *  - if replacing succeeds:
> @@ -1316,7 +1322,11 @@ static void collapse_file(struct mm_struct *mm,
>  	LIST_HEAD(pagelist);
>  	XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER);
>  	int nr_none = 0, result = SCAN_SUCCEED;
> +	bool is_shmem = shmem_file(file);
>  
> +#ifndef CONFIG_READ_ONLY_THP_FOR_FS
> +	VM_BUG_ON(!is_shmem);
> +#endif

	VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);

>  	VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
>  
>  	/* Only allocate from the target node */
> @@ -1348,7 +1358,8 @@ static void collapse_file(struct mm_struct *mm,
>  	} while (1);
>  
>  	__SetPageLocked(new_page);
> -	__SetPageSwapBacked(new_page);
> +	if (is_shmem)
> +		__SetPageSwapBacked(new_page);
>  	new_page->index = start;
>  	new_page->mapping = mapping;
>  
> @@ -1363,7 +1374,7 @@ static void collapse_file(struct mm_struct *mm,
>  		struct page *page = xas_next(&xas);
>  
>  		VM_BUG_ON(index != xas.xa_index);
> -		if (!page) {
> +		if (is_shmem && !page) {
>  			/*
>  			 * Stop if extent has been truncated or hole-punched,
>  			 * and is now completely empty.
> @@ -1384,7 +1395,7 @@ static void collapse_file(struct mm_struct *mm,
>  			continue;
>  		}
>  
> -		if (xa_is_value(page) || !PageUptodate(page)) {
> +		if (is_shmem && (xa_is_value(page) || !PageUptodate(page))) {
>  			xas_unlock_irq(&xas);
>  			/* swap in or instantiate fallocated page */
>  			if (shmem_getpage(mapping->host, index, &page,
> @@ -1392,6 +1403,23 @@ static void collapse_file(struct mm_struct *mm,
>  				result = SCAN_FAIL;
>  				goto xa_unlocked;
>  			}
> +		} else if (!page || xa_is_value(page)) {
> +			xas_unlock_irq(&xas);
> +			page_cache_sync_readahead(mapping, &file->f_ra, file,
> +						  index, PAGE_SIZE);
> +			lru_add_drain();

Why?

> +			page = find_lock_page(mapping, index);
> +			if (unlikely(page == NULL)) {
> +				result = SCAN_FAIL;
> +				goto xa_unlocked;
> +			}
> +		} else if (!PageUptodate(page)) {

Maybe we should try wait_on_page_locked() here before giving up?

> +			VM_BUG_ON(is_shmem);
> +			result = SCAN_FAIL;
> +			goto xa_locked;
> +		} else if (!is_shmem && PageDirty(page)) {
> +			result = SCAN_FAIL;
> +			goto xa_locked;
>  		} else if (trylock_page(page)) {
>  			get_page(page);
>  			xas_unlock_irq(&xas);
-- 
 Kirill A. Shutemov


* [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS
  2019-06-23  5:47 [PATCH v7 0/6] Enable THP for text section of non-shmem files Song Liu
@ 2019-06-23  5:47 ` Song Liu
  2019-06-24 12:47   ` Kirill A. Shutemov
  0 siblings, 1 reply; 11+ messages in thread
From: Song Liu @ 2019-06-23  5:47 UTC (permalink / raw)
  To: linux-mm, linux-fsdevel, linux-kernel
  Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	akpm, hdanton, Song Liu

This patch is (hopefully) the first step to enable THP for non-shmem
filesystems.

This patch enables an application to put part of its text sections to THP
via madvise, for example:

    madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);

We tried to reuse the logic for THP on tmpfs.

Currently, write is not supported for non-shmem THP. khugepaged will only
process vma with VM_DENYWRITE. The next patch will handle writes, which
would only happen when the vma with VM_DENYWRITE is unmapped.

An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
feature.

Acked-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
---
 mm/Kconfig      | 11 ++++++
 mm/filemap.c    |  4 +--
 mm/khugepaged.c | 90 ++++++++++++++++++++++++++++++++++++++++---------
 mm/rmap.c       | 12 ++++---
 4 files changed, 96 insertions(+), 21 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index f0c76ba47695..0a8fd589406d 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -762,6 +762,17 @@ config GUP_BENCHMARK
 
 	  See tools/testing/selftests/vm/gup_benchmark.c
 
+config READ_ONLY_THP_FOR_FS
+	bool "Read-only THP for filesystems (EXPERIMENTAL)"
+	depends on TRANSPARENT_HUGE_PAGECACHE && SHMEM
+
+	help
+	  Allow khugepaged to put read-only file-backed pages in THP.
+
+	  This is marked experimental because it is a new feature. Write
+	  support of file THPs will be developed in the next few release
+	  cycles.
+
 config ARCH_HAS_PTE_SPECIAL
 	bool
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 5f072a113535..e79ceccdc6df 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -203,8 +203,8 @@ static void unaccount_page_cache_page(struct address_space *mapping,
 		__mod_node_page_state(page_pgdat(page), NR_SHMEM, -nr);
 		if (PageTransHuge(page))
 			__dec_node_page_state(page, NR_SHMEM_THPS);
-	} else {
-		VM_BUG_ON_PAGE(PageTransHuge(page), page);
+	} else if (PageTransHuge(page)) {
+		__dec_node_page_state(page, NR_FILE_THPS);
 	}
 
 	/*
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 158cad542627..090127e4e185 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -48,6 +48,7 @@ enum scan_result {
 	SCAN_CGROUP_CHARGE_FAIL,
 	SCAN_EXCEED_SWAP_PTE,
 	SCAN_TRUNCATED,
+	SCAN_PAGE_HAS_PRIVATE,
 };
 
 #define CREATE_TRACE_POINTS
@@ -404,7 +405,11 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 	    (vm_flags & VM_NOHUGEPAGE) ||
 	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
 		return false;
-	if (shmem_file(vma->vm_file)) {
+
+	if (shmem_file(vma->vm_file) ||
+	    (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
+	     vma->vm_file &&
+	     (vm_flags & VM_DENYWRITE))) {
 		if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGE_PAGECACHE))
 			return false;
 		return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
@@ -456,8 +461,9 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
 	unsigned long hstart, hend;
 
 	/*
-	 * khugepaged does not yet work on non-shmem files or special
-	 * mappings. And file-private shmem THP is not supported.
+	 * khugepaged only supports read-only files for non-shmem files.
+	 * khugepaged does not yet work on special mappings. And
+	 * file-private shmem THP is not supported.
 	 */
 	if (!hugepage_vma_check(vma, vm_flags))
 		return 0;
@@ -1287,12 +1293,12 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 }
 
 /**
- * collapse_file - collapse small tmpfs/shmem pages into huge one.
+ * collapse_file - collapse filemap/tmpfs/shmem pages into huge one.
  *
  * Basic scheme is simple, details are more complex:
  *  - allocate and lock a new huge page;
  *  - scan page cache replacing old pages with the new one
- *    + swap in pages if necessary;
+ *    + swap/gup in pages if necessary;
  *    + fill in gaps;
  *    + keep old pages around in case rollback is required;
  *  - if replacing succeeds:
@@ -1316,7 +1322,11 @@ static void collapse_file(struct mm_struct *mm,
 	LIST_HEAD(pagelist);
 	XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER);
 	int nr_none = 0, result = SCAN_SUCCEED;
+	bool is_shmem = shmem_file(file);
 
+#ifndef CONFIG_READ_ONLY_THP_FOR_FS
+	VM_BUG_ON(!is_shmem);
+#endif
 	VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
 
 	/* Only allocate from the target node */
@@ -1348,7 +1358,8 @@ static void collapse_file(struct mm_struct *mm,
 	} while (1);
 
 	__SetPageLocked(new_page);
-	__SetPageSwapBacked(new_page);
+	if (is_shmem)
+		__SetPageSwapBacked(new_page);
 	new_page->index = start;
 	new_page->mapping = mapping;
 
@@ -1363,7 +1374,7 @@ static void collapse_file(struct mm_struct *mm,
 		struct page *page = xas_next(&xas);
 
 		VM_BUG_ON(index != xas.xa_index);
-		if (!page) {
+		if (is_shmem && !page) {
 			/*
 			 * Stop if extent has been truncated or hole-punched,
 			 * and is now completely empty.
@@ -1384,7 +1395,7 @@ static void collapse_file(struct mm_struct *mm,
 			continue;
 		}
 
-		if (xa_is_value(page) || !PageUptodate(page)) {
+		if (is_shmem && (xa_is_value(page) || !PageUptodate(page))) {
 			xas_unlock_irq(&xas);
 			/* swap in or instantiate fallocated page */
 			if (shmem_getpage(mapping->host, index, &page,
@@ -1392,6 +1403,23 @@ static void collapse_file(struct mm_struct *mm,
 				result = SCAN_FAIL;
 				goto xa_unlocked;
 			}
+		} else if (!page || xa_is_value(page)) {
+			xas_unlock_irq(&xas);
+			page_cache_sync_readahead(mapping, &file->f_ra, file,
+						  index, PAGE_SIZE);
+			lru_add_drain();
+			page = find_lock_page(mapping, index);
+			if (unlikely(page == NULL)) {
+				result = SCAN_FAIL;
+				goto xa_unlocked;
+			}
+		} else if (!PageUptodate(page)) {
+			VM_BUG_ON(is_shmem);
+			result = SCAN_FAIL;
+			goto xa_locked;
+		} else if (!is_shmem && PageDirty(page)) {
+			result = SCAN_FAIL;
+			goto xa_locked;
 		} else if (trylock_page(page)) {
 			get_page(page);
 			xas_unlock_irq(&xas);
@@ -1426,6 +1454,12 @@ static void collapse_file(struct mm_struct *mm,
 			goto out_unlock;
 		}
 
+		if (page_has_private(page) &&
+		    !try_to_release_page(page, GFP_KERNEL)) {
+			result = SCAN_PAGE_HAS_PRIVATE;
+			break;
+		}
+
 		if (page_mapped(page))
 			unmap_mapping_pages(mapping, index, 1, false);
 
@@ -1463,12 +1497,18 @@ static void collapse_file(struct mm_struct *mm,
 		goto xa_unlocked;
 	}
 
-	__inc_node_page_state(new_page, NR_SHMEM_THPS);
+	if (is_shmem)
+		__inc_node_page_state(new_page, NR_SHMEM_THPS);
+	else
+		__inc_node_page_state(new_page, NR_FILE_THPS);
+
 	if (nr_none) {
 		struct zone *zone = page_zone(new_page);
 
 		__mod_node_page_state(zone->zone_pgdat, NR_FILE_PAGES, nr_none);
-		__mod_node_page_state(zone->zone_pgdat, NR_SHMEM, nr_none);
+		if (is_shmem)
+			__mod_node_page_state(zone->zone_pgdat,
+					      NR_SHMEM, nr_none);
 	}
 
 xa_locked:
@@ -1506,10 +1546,15 @@ static void collapse_file(struct mm_struct *mm,
 
 		SetPageUptodate(new_page);
 		page_ref_add(new_page, HPAGE_PMD_NR - 1);
-		set_page_dirty(new_page);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
+
+		if (is_shmem) {
+			set_page_dirty(new_page);
+			lru_cache_add_anon(new_page);
+		} else {
+			lru_cache_add_file(new_page);
+		}
 		count_memcg_events(memcg, THP_COLLAPSE_ALLOC, 1);
-		lru_cache_add_anon(new_page);
 
 		/*
 		 * Remove pte page tables, so we can re-fault the page as huge.
@@ -1524,7 +1569,9 @@ static void collapse_file(struct mm_struct *mm,
 		/* Something went wrong: roll back page cache changes */
 		xas_lock_irq(&xas);
 		mapping->nrpages -= nr_none;
-		shmem_uncharge(mapping->host, nr_none);
+
+		if (is_shmem)
+			shmem_uncharge(mapping->host, nr_none);
 
 		xas_set(&xas, start);
 		xas_for_each(&xas, page, end - 1) {
@@ -1607,6 +1654,17 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 			break;
 		}
 
+		if (page_has_private(page) && trylock_page(page)) {
+			int ret;
+
+			ret = try_to_release_page(page, GFP_KERNEL);
+			unlock_page(page);
+			if (!ret) {
+				result = SCAN_PAGE_HAS_PRIVATE;
+				break;
+			}
+		}
+
 		if (page_count(page) != 1 + page_mapcount(page)) {
 			result = SCAN_PAGE_COUNT;
 			break;
@@ -1713,11 +1771,13 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 			VM_BUG_ON(khugepaged_scan.address < hstart ||
 				  khugepaged_scan.address + HPAGE_PMD_SIZE >
 				  hend);
-			if (shmem_file(vma->vm_file)) {
+			if (vma->vm_file) {
 				struct file *file;
 				pgoff_t pgoff = linear_page_index(vma,
 						khugepaged_scan.address);
-				if (!shmem_huge_enabled(vma))
+
+				if (shmem_file(vma->vm_file)
+				    && !shmem_huge_enabled(vma))
 					goto skip;
 				file = get_file(vma->vm_file);
 				up_read(&mm->mmap_sem);
diff --git a/mm/rmap.c b/mm/rmap.c
index e5dfe2ae6b0d..87cfa2c19eda 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1192,8 +1192,10 @@ void page_add_file_rmap(struct page *page, bool compound)
 		}
 		if (!atomic_inc_and_test(compound_mapcount_ptr(page)))
 			goto out;
-		VM_BUG_ON_PAGE(!PageSwapBacked(page), page);
-		__inc_node_page_state(page, NR_SHMEM_PMDMAPPED);
+		if (PageSwapBacked(page))
+			__inc_node_page_state(page, NR_SHMEM_PMDMAPPED);
+		else
+			__inc_node_page_state(page, NR_FILE_PMDMAPPED);
 	} else {
 		if (PageTransCompound(page) && page_mapping(page)) {
 			VM_WARN_ON_ONCE(!PageLocked(page));
@@ -1232,8 +1234,10 @@ static void page_remove_file_rmap(struct page *page, bool compound)
 		}
 		if (!atomic_add_negative(-1, compound_mapcount_ptr(page)))
 			goto out;
-		VM_BUG_ON_PAGE(!PageSwapBacked(page), page);
-		__dec_node_page_state(page, NR_SHMEM_PMDMAPPED);
+		if (PageSwapBacked(page))
+			__dec_node_page_state(page, NR_SHMEM_PMDMAPPED);
+		else
+			__dec_node_page_state(page, NR_FILE_PMDMAPPED);
 	} else {
 		if (!atomic_add_negative(-1, &page->_mapcount))
 			goto out;
-- 
2.17.1
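
The madvise() call in the commit message is the entire user-visible interface.
As a hedged illustration (the linker symbols, alignment handling, and error
reporting below are assumptions for the example, not something posted in this
thread), a program whose text was linked 2MB-aligned could advise its own text
segment like this:

/*
 * Sketch: ask khugepaged to back this binary's text with read-only THPs.
 * Assumes the executable was linked so that the text's virtual address
 * and file offset stay congruent modulo 2MB; __executable_start and
 * etext come from the default GNU ld script.
 */
#include <stdio.h>
#include <sys/mman.h>

#define HPAGE_SIZE      (2UL * 1024 * 1024)

extern char __executable_start[], etext[];

int main(void)
{
        unsigned long start = ((unsigned long)__executable_start +
                               HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
        unsigned long end = (unsigned long)etext & ~(HPAGE_SIZE - 1);

        if (end > start &&
            madvise((void *)start, end - start, MADV_HUGEPAGE))
                perror("madvise(MADV_HUGEPAGE)");
        return 0;
}

Once khugepaged collapses the range, the pages are accounted through the
NR_FILE_THPS and NR_FILE_PMDMAPPED counters touched in this patch.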



Thread overview: 11+ messages
     [not found] <20190624031604.7764-1-hdanton@sina.com>
2019-06-24  4:27 ` [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS Song Liu
     [not found] <20190624074816.10992-1-hdanton@sina.com>
2019-06-24 21:17 ` [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem)FS Song Liu
2019-06-23  5:47 [PATCH v7 0/6] Enable THP for text section of non-shmem files Song Liu
2019-06-23  5:47 ` [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS Song Liu
2019-06-24 12:47   ` Kirill A. Shutemov
2019-06-24 14:01     ` Song Liu
2019-06-24 14:27       ` Kirill A. Shutemov
2019-06-24 14:42         ` Song Liu
2019-06-24 14:54           ` Kirill A. Shutemov
2019-06-24 15:04             ` Song Liu
2019-06-24 15:15               ` Kirill A. Shutemov
2019-06-24 16:33                 ` Song Liu
