linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH uprobe, thp 3/4] uprobe: support huge page by only splitting the pmd
       [not found] ` <20190529212049.2413886-4-songliubraving@fb.com>
@ 2019-05-30 11:08   ` William Kucharski
  2019-05-30 17:24     ` Song Liu
  2019-05-30 12:14   ` Kirill A. Shutemov
  1 sibling, 1 reply; 12+ messages in thread
From: William Kucharski @ 2019-05-30 11:08 UTC (permalink / raw)
  To: Song Liu
  Cc: LKML, Linux-MM, namit, Peter Zijlstra, oleg, Steven Rostedt,
	mhiramat, Matthew Wilcox, kirill.shutemov, kernel-team,
	Chad Mynhier, mike.kravetz


Is there any reason to worry about supporting PUD-sized uprobe pages if
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD is defined? I would prefer
not to bake in the assumption that "huge" means PMD-sized and more than
it already is.

For example, if CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD is configured,
mm_address_trans_huge() should either make the call to pud_trans_huge()
if appropriate, or a VM_BUG_ON_PAGE should be added in case the routine
is ever called with one.

Otherwise it looks pretty reasonable to me.

    -- Bill


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH uprobe, thp 1/4] mm, thp: allow preallocate pgtable for split_huge_pmd_address()
       [not found] ` <20190529212049.2413886-2-songliubraving@fb.com>
@ 2019-05-30 11:10   ` Kirill A. Shutemov
  2019-05-30 11:14     ` Kirill A. Shutemov
  0 siblings, 1 reply; 12+ messages in thread
From: Kirill A. Shutemov @ 2019-05-30 11:10 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, linux-mm, namit, peterz, oleg, rostedt, mhiramat,
	matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	chad.mynhier, mike.kravetz

On Wed, May 29, 2019 at 02:20:46PM -0700, Song Liu wrote:
> @@ -2133,10 +2133,15 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>  	VM_BUG_ON_VMA(vma->vm_end < haddr + HPAGE_PMD_SIZE, vma);
>  	VM_BUG_ON(!is_pmd_migration_entry(*pmd) && !pmd_trans_huge(*pmd)
>  				&& !pmd_devmap(*pmd));
> +	/* only file backed vma need preallocate pgtable*/
> +	VM_BUG_ON(vma_is_anonymous(vma) && prealloc_pgtable);
>  
>  	count_vm_event(THP_SPLIT_PMD);
>  
> -	if (!vma_is_anonymous(vma)) {
> +	if (prealloc_pgtable) {
> +		pgtable_trans_huge_deposit(mm, pmd, prealloc_pgtable);
> +		mm_inc_nr_pmds(mm);
> +	} else if (!vma_is_anonymous(vma)) {
>  		_pmd = pmdp_huge_clear_flush_notify(vma, haddr, pmd);
>  		/*
>  		 * We are going to unmap this huge page. So

Nope. This going to leak a page table for architectures where
arch_needs_pgtable_deposit() is true.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH uprobe, thp 1/4] mm, thp: allow preallocate pgtable for split_huge_pmd_address()
  2019-05-30 11:10   ` [PATCH uprobe, thp 1/4] mm, thp: allow preallocate pgtable for split_huge_pmd_address() Kirill A. Shutemov
@ 2019-05-30 11:14     ` Kirill A. Shutemov
  2019-05-30 17:23       ` Song Liu
  0 siblings, 1 reply; 12+ messages in thread
From: Kirill A. Shutemov @ 2019-05-30 11:14 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, linux-mm, namit, peterz, oleg, rostedt, mhiramat,
	matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	chad.mynhier, mike.kravetz

On Thu, May 30, 2019 at 02:10:15PM +0300, Kirill A. Shutemov wrote:
> On Wed, May 29, 2019 at 02:20:46PM -0700, Song Liu wrote:
> > @@ -2133,10 +2133,15 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
> >  	VM_BUG_ON_VMA(vma->vm_end < haddr + HPAGE_PMD_SIZE, vma);
> >  	VM_BUG_ON(!is_pmd_migration_entry(*pmd) && !pmd_trans_huge(*pmd)
> >  				&& !pmd_devmap(*pmd));
> > +	/* only file backed vma need preallocate pgtable*/
> > +	VM_BUG_ON(vma_is_anonymous(vma) && prealloc_pgtable);
> >  
> >  	count_vm_event(THP_SPLIT_PMD);
> >  
> > -	if (!vma_is_anonymous(vma)) {
> > +	if (prealloc_pgtable) {
> > +		pgtable_trans_huge_deposit(mm, pmd, prealloc_pgtable);
> > +		mm_inc_nr_pmds(mm);
> > +	} else if (!vma_is_anonymous(vma)) {
> >  		_pmd = pmdp_huge_clear_flush_notify(vma, haddr, pmd);
> >  		/*
> >  		 * We are going to unmap this huge page. So
> 
> Nope. This going to leak a page table for architectures where
> arch_needs_pgtable_deposit() is true.

And I don't there's correct handling of dirty bit.

And what about DAX? Will it blow up? I think so.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH uprobe, thp 2/4] uprobe: use original page when all uprobes are removed
       [not found] ` <20190529212049.2413886-3-songliubraving@fb.com>
@ 2019-05-30 11:17   ` Kirill A. Shutemov
  2019-05-30 17:18     ` Song Liu
  0 siblings, 1 reply; 12+ messages in thread
From: Kirill A. Shutemov @ 2019-05-30 11:17 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, linux-mm, namit, peterz, oleg, rostedt, mhiramat,
	matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	chad.mynhier, mike.kravetz

On Wed, May 29, 2019 at 02:20:47PM -0700, Song Liu wrote:
> @@ -501,6 +512,20 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>  	copy_highpage(new_page, old_page);
>  	copy_to_page(new_page, vaddr, &opcode, UPROBE_SWBP_INSN_SIZE);
>  
> +	index = vaddr_to_offset(vma, vaddr & PAGE_MASK) >> PAGE_SHIFT;
> +	orig_page = find_get_page(vma->vm_file->f_inode->i_mapping, index);
> +	if (orig_page) {
> +		if (memcmp(page_address(orig_page),
> +			   page_address(new_page), PAGE_SIZE) == 0) {

Does it work for highmem?


-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH uprobe, thp 3/4] uprobe: support huge page by only splitting the pmd
       [not found] ` <20190529212049.2413886-4-songliubraving@fb.com>
  2019-05-30 11:08   ` [PATCH uprobe, thp 3/4] uprobe: support huge page by only splitting the pmd William Kucharski
@ 2019-05-30 12:14   ` Kirill A. Shutemov
  2019-05-30 17:37     ` Song Liu
  1 sibling, 1 reply; 12+ messages in thread
From: Kirill A. Shutemov @ 2019-05-30 12:14 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, linux-mm, namit, peterz, oleg, rostedt, mhiramat,
	matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	chad.mynhier, mike.kravetz

On Wed, May 29, 2019 at 02:20:48PM -0700, Song Liu wrote:
> Instead of splitting the compound page with FOLL_SPLIT, this patch allows
> uprobe to only split pmd for huge pages.
> 
> A helper function mm_address_trans_huge(mm, address) was introduced to
> test whether the address in mm is pointing to THP.

Maybe it would be cleaner to have FOLL_SPLIT_PMD which would strip
trans_huge PMD if any and then set pte using get_locked_pte()?

This way you'll not need any changes in split_huge_pmd() path. Clearing
PMD will be fine.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH uprobe, thp 4/4] uprobe: collapse THP pmd after removing all uprobes
       [not found] ` <20190529212049.2413886-5-songliubraving@fb.com>
@ 2019-05-30 12:20   ` Kirill A. Shutemov
  2019-05-30 17:26     ` Song Liu
  0 siblings, 1 reply; 12+ messages in thread
From: Kirill A. Shutemov @ 2019-05-30 12:20 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, linux-mm, namit, peterz, oleg, rostedt, mhiramat,
	matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	chad.mynhier, mike.kravetz

On Wed, May 29, 2019 at 02:20:49PM -0700, Song Liu wrote:
> After all uprobes are removed from the huge page (with PTE pgtable), it
> is possible to collapse the pmd and benefit from THP again. This patch
> does the collapse.

I don't think it's right way to go. We should deferred it to khugepaged.
We need to teach khugepaged to deal with PTE-mapped compound page.
And uprobe should only kick khugepaged for a VMA. Maybe synchronously.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH uprobe, thp 2/4] uprobe: use original page when all uprobes are removed
  2019-05-30 11:17   ` [PATCH uprobe, thp 2/4] uprobe: use original page when all uprobes are removed Kirill A. Shutemov
@ 2019-05-30 17:18     ` Song Liu
  0 siblings, 0 replies; 12+ messages in thread
From: Song Liu @ 2019-05-30 17:18 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-kernel, linux-mm, namit, peterz, oleg, rostedt, mhiramat,
	matthew.wilcox, kirill.shutemov, Kernel Team, william.kucharski,
	chad.mynhier, mike.kravetz



> On May 30, 2019, at 4:17 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Wed, May 29, 2019 at 02:20:47PM -0700, Song Liu wrote:
>> @@ -501,6 +512,20 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>> 	copy_highpage(new_page, old_page);
>> 	copy_to_page(new_page, vaddr, &opcode, UPROBE_SWBP_INSN_SIZE);
>> 
>> +	index = vaddr_to_offset(vma, vaddr & PAGE_MASK) >> PAGE_SHIFT;
>> +	orig_page = find_get_page(vma->vm_file->f_inode->i_mapping, index);
>> +	if (orig_page) {
>> +		if (memcmp(page_address(orig_page),
>> +			   page_address(new_page), PAGE_SIZE) == 0) {
> 
> Does it work for highmem?

Good catch! I will fix it in v2. 

Thanks!
Song

> 
> 
> -- 
> Kirill A. Shutemov


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH uprobe, thp 1/4] mm, thp: allow preallocate pgtable for split_huge_pmd_address()
  2019-05-30 11:14     ` Kirill A. Shutemov
@ 2019-05-30 17:23       ` Song Liu
  0 siblings, 0 replies; 12+ messages in thread
From: Song Liu @ 2019-05-30 17:23 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-kernel, linux-mm, namit, peterz, oleg, rostedt, mhiramat,
	matthew.wilcox, kirill.shutemov, Kernel Team, william.kucharski,
	chad.mynhier, mike.kravetz



> On May 30, 2019, at 4:14 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Thu, May 30, 2019 at 02:10:15PM +0300, Kirill A. Shutemov wrote:
>> On Wed, May 29, 2019 at 02:20:46PM -0700, Song Liu wrote:
>>> @@ -2133,10 +2133,15 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>>> 	VM_BUG_ON_VMA(vma->vm_end < haddr + HPAGE_PMD_SIZE, vma);
>>> 	VM_BUG_ON(!is_pmd_migration_entry(*pmd) && !pmd_trans_huge(*pmd)
>>> 				&& !pmd_devmap(*pmd));
>>> +	/* only file backed vma need preallocate pgtable*/
>>> +	VM_BUG_ON(vma_is_anonymous(vma) && prealloc_pgtable);
>>> 
>>> 	count_vm_event(THP_SPLIT_PMD);
>>> 
>>> -	if (!vma_is_anonymous(vma)) {
>>> +	if (prealloc_pgtable) {
>>> +		pgtable_trans_huge_deposit(mm, pmd, prealloc_pgtable);
>>> +		mm_inc_nr_pmds(mm);
>>> +	} else if (!vma_is_anonymous(vma)) {
>>> 		_pmd = pmdp_huge_clear_flush_notify(vma, haddr, pmd);
>>> 		/*
>>> 		 * We are going to unmap this huge page. So
>> 
>> Nope. This going to leak a page table for architectures where
>> arch_needs_pgtable_deposit() is true.
> 
> And I don't there's correct handling of dirty bit.
> 
> And what about DAX? Will it blow up? I think so.
> 

Let me look into these cases. Thanks for the feedback!

Song

> -- 
> Kirill A. Shutemov


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH uprobe, thp 3/4] uprobe: support huge page by only splitting the pmd
  2019-05-30 11:08   ` [PATCH uprobe, thp 3/4] uprobe: support huge page by only splitting the pmd William Kucharski
@ 2019-05-30 17:24     ` Song Liu
  0 siblings, 0 replies; 12+ messages in thread
From: Song Liu @ 2019-05-30 17:24 UTC (permalink / raw)
  To: William Kucharski
  Cc: LKML, Linux-MM, namit, Peter Zijlstra, oleg, Steven Rostedt,
	mhiramat, Matthew Wilcox, kirill.shutemov, Kernel Team,
	Chad Mynhier, mike.kravetz



> On May 30, 2019, at 4:08 AM, William Kucharski <william.kucharski@oracle.com> wrote:
> 
> 
> Is there any reason to worry about supporting PUD-sized uprobe pages if
> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD is defined? I would prefer
> not to bake in the assumption that "huge" means PMD-sized and more than
> it already is.
> 
> For example, if CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD is configured,
> mm_address_trans_huge() should either make the call to pud_trans_huge()
> if appropriate, or a VM_BUG_ON_PAGE should be added in case the routine
> is ever called with one.
> 
> Otherwise it looks pretty reasonable to me.
> 
>    -- Bill
> 

I will try that in v2. 

Thanks,
Song

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH uprobe, thp 4/4] uprobe: collapse THP pmd after removing all uprobes
  2019-05-30 12:20   ` [PATCH uprobe, thp 4/4] uprobe: collapse THP pmd after removing all uprobes Kirill A. Shutemov
@ 2019-05-30 17:26     ` Song Liu
  2019-05-31  7:00       ` Kirill A. Shutemov
  0 siblings, 1 reply; 12+ messages in thread
From: Song Liu @ 2019-05-30 17:26 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: LKML, Linux-MM, namit, peterz, oleg, rostedt, mhiramat,
	matthew.wilcox, kirill.shutemov, Kernel Team, william.kucharski,
	chad.mynhier, mike.kravetz



> On May 30, 2019, at 5:20 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Wed, May 29, 2019 at 02:20:49PM -0700, Song Liu wrote:
>> After all uprobes are removed from the huge page (with PTE pgtable), it
>> is possible to collapse the pmd and benefit from THP again. This patch
>> does the collapse.
> 
> I don't think it's right way to go. We should deferred it to khugepaged.
> We need to teach khugepaged to deal with PTE-mapped compound page.
> And uprobe should only kick khugepaged for a VMA. Maybe synchronously.
> 

I guess that would be the same logic, but run in khugepaged? It doesn't
have to be done synchronously. 

Let me try that

Thanks,
Song


> -- 
> Kirill A. Shutemov


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH uprobe, thp 3/4] uprobe: support huge page by only splitting the pmd
  2019-05-30 12:14   ` Kirill A. Shutemov
@ 2019-05-30 17:37     ` Song Liu
  0 siblings, 0 replies; 12+ messages in thread
From: Song Liu @ 2019-05-30 17:37 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: LKML, Linux-MM, namit, peterz, oleg, rostedt, mhiramat,
	matthew.wilcox, kirill.shutemov, Kernel Team, william.kucharski,
	chad.mynhier, mike.kravetz



> On May 30, 2019, at 5:14 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> On Wed, May 29, 2019 at 02:20:48PM -0700, Song Liu wrote:
>> Instead of splitting the compound page with FOLL_SPLIT, this patch allows
>> uprobe to only split pmd for huge pages.
>> 
>> A helper function mm_address_trans_huge(mm, address) was introduced to
>> test whether the address in mm is pointing to THP.
> 
> Maybe it would be cleaner to have FOLL_SPLIT_PMD which would strip
> trans_huge PMD if any and then set pte using get_locked_pte()?

FOLL_SPLIT_PMD sounds like a great idea! Let me try it. 

Thanks,
Song

> 
> This way you'll not need any changes in split_huge_pmd() path. Clearing
> PMD will be fine.
> 
> -- 
> Kirill A. Shutemov


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH uprobe, thp 4/4] uprobe: collapse THP pmd after removing all uprobes
  2019-05-30 17:26     ` Song Liu
@ 2019-05-31  7:00       ` Kirill A. Shutemov
  0 siblings, 0 replies; 12+ messages in thread
From: Kirill A. Shutemov @ 2019-05-31  7:00 UTC (permalink / raw)
  To: Song Liu
  Cc: LKML, Linux-MM, namit, peterz, oleg, rostedt, mhiramat,
	matthew.wilcox, kirill.shutemov, Kernel Team, william.kucharski,
	chad.mynhier, mike.kravetz

On Thu, May 30, 2019 at 05:26:38PM +0000, Song Liu wrote:
> 
> 
> > On May 30, 2019, at 5:20 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > 
> > On Wed, May 29, 2019 at 02:20:49PM -0700, Song Liu wrote:
> >> After all uprobes are removed from the huge page (with PTE pgtable), it
> >> is possible to collapse the pmd and benefit from THP again. This patch
> >> does the collapse.
> > 
> > I don't think it's right way to go. We should deferred it to khugepaged.
> > We need to teach khugepaged to deal with PTE-mapped compound page.
> > And uprobe should only kick khugepaged for a VMA. Maybe synchronously.
> > 
> 
> I guess that would be the same logic, but run in khugepaged? It doesn't
> have to be done synchronously. 

My idea was that since we have all required locking in place we can call
into khugepaged code that does the collapse, without waithing for it to
get to the VMA.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-05-31  7:00 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190529212049.2413886-1-songliubraving@fb.com>
     [not found] ` <20190529212049.2413886-4-songliubraving@fb.com>
2019-05-30 11:08   ` [PATCH uprobe, thp 3/4] uprobe: support huge page by only splitting the pmd William Kucharski
2019-05-30 17:24     ` Song Liu
2019-05-30 12:14   ` Kirill A. Shutemov
2019-05-30 17:37     ` Song Liu
     [not found] ` <20190529212049.2413886-2-songliubraving@fb.com>
2019-05-30 11:10   ` [PATCH uprobe, thp 1/4] mm, thp: allow preallocate pgtable for split_huge_pmd_address() Kirill A. Shutemov
2019-05-30 11:14     ` Kirill A. Shutemov
2019-05-30 17:23       ` Song Liu
     [not found] ` <20190529212049.2413886-3-songliubraving@fb.com>
2019-05-30 11:17   ` [PATCH uprobe, thp 2/4] uprobe: use original page when all uprobes are removed Kirill A. Shutemov
2019-05-30 17:18     ` Song Liu
     [not found] ` <20190529212049.2413886-5-songliubraving@fb.com>
2019-05-30 12:20   ` [PATCH uprobe, thp 4/4] uprobe: collapse THP pmd after removing all uprobes Kirill A. Shutemov
2019-05-30 17:26     ` Song Liu
2019-05-31  7:00       ` Kirill A. Shutemov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).