linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v8 2/4] uprobe: use original page when all uprobes are removed
       [not found] ` <20190724083600.832091-3-songliubraving@fb.com>
@ 2019-07-24  9:07   ` Oleg Nesterov
  2019-07-24  9:17     ` Oleg Nesterov
  2019-07-24 11:37   ` Oleg Nesterov
  1 sibling, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2019-07-24  9:07 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, linux-mm, akpm, matthew.wilcox, kirill.shutemov,
	peterz, rostedt, kernel-team, william.kucharski

On 07/24, Song Liu wrote:
>
> This patch allows uprobe to use original page when possible (all uprobes
> on the page are already removed).

and only if the original page is already in the page cache and uptodate,
right?

another reason why I think unmap makes more sense... but I won't argue.

Oleg.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/4] uprobe: use original page when all uprobes are removed
  2019-07-24  9:07   ` [PATCH v8 2/4] uprobe: use original page when all uprobes are removed Oleg Nesterov
@ 2019-07-24  9:17     ` Oleg Nesterov
  2019-07-24  9:20       ` Song Liu
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2019-07-24  9:17 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, linux-mm, akpm, matthew.wilcox, kirill.shutemov,
	peterz, rostedt, kernel-team, william.kucharski

On 07/24, Oleg Nesterov wrote:
>
> On 07/24, Song Liu wrote:
> >
> > This patch allows uprobe to use original page when possible (all uprobes
> > on the page are already removed).
>
> and only if the original page is already in the page cache and uptodate,
> right?
>
> another reason why I think unmap makes more sense... but I won't argue.

but somehow I forgot we need to read the original page anyway to check
pages_identical(), so unmap is not really better, please forget.

Oleg.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/4] uprobe: use original page when all uprobes are removed
  2019-07-24  9:17     ` Oleg Nesterov
@ 2019-07-24  9:20       ` Song Liu
  0 siblings, 0 replies; 10+ messages in thread
From: Song Liu @ 2019-07-24  9:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: lkml, linux-mm, akpm, matthew.wilcox, kirill.shutemov, peterz,
	rostedt, Kernel Team, william.kucharski



> On Jul 24, 2019, at 2:17 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> 
> On 07/24, Oleg Nesterov wrote:
>> 
>> On 07/24, Song Liu wrote:
>>> 
>>> This patch allows uprobe to use original page when possible (all uprobes
>>> on the page are already removed).
>> 
>> and only if the original page is already in the page cache and uptodate,
>> right?
>> 
>> another reason why I think unmap makes more sense... but I won't argue.
> 
> but somehow I forgot we need to read the original page anyway to check
> pages_identical(), so unmap is not really better, please forget.
> 

Yeah, I was trying to explain this. :)

Thanks for your feedbacks. 
Song


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/4] uprobe: use original page when all uprobes are removed
       [not found] ` <20190724083600.832091-3-songliubraving@fb.com>
  2019-07-24  9:07   ` [PATCH v8 2/4] uprobe: use original page when all uprobes are removed Oleg Nesterov
@ 2019-07-24 11:37   ` Oleg Nesterov
  2019-07-24 18:52     ` Song Liu
  1 sibling, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2019-07-24 11:37 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-kernel, linux-mm, akpm, matthew.wilcox, kirill.shutemov,
	peterz, rostedt, kernel-team, william.kucharski

On 07/24, Song Liu wrote:
>
>  	lock_page(old_page);
> @@ -177,15 +180,24 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
>  	mmu_notifier_invalidate_range_start(&range);
>  	err = -EAGAIN;
>  	if (!page_vma_mapped_walk(&pvmw)) {
> -		mem_cgroup_cancel_charge(new_page, memcg, false);
> +		if (!orig)
> +			mem_cgroup_cancel_charge(new_page, memcg, false);
>  		goto unlock;
>  	}
>  	VM_BUG_ON_PAGE(addr != pvmw.address, old_page);
>  
>  	get_page(new_page);
> -	page_add_new_anon_rmap(new_page, vma, addr, false);
> -	mem_cgroup_commit_charge(new_page, memcg, false, false);
> -	lru_cache_add_active_or_unevictable(new_page, vma);
> +	if (orig) {
> +		lock_page(new_page);  /* for page_add_file_rmap() */
> +		page_add_file_rmap(new_page, false);


Shouldn't we re-check new_page->mapping after lock_page() ? Or we can't
race with truncate?


and I am worried this code can try to lock the same page twice...
Say, the probed application does MADV_DONTNEED and then writes "int3"
into vma->vm_file at the same address to fool verify_opcode().

Oleg.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/4] uprobe: use original page when all uprobes are removed
  2019-07-24 11:37   ` Oleg Nesterov
@ 2019-07-24 18:52     ` Song Liu
  2019-07-25  8:14       ` Oleg Nesterov
  0 siblings, 1 reply; 10+ messages in thread
From: Song Liu @ 2019-07-24 18:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: lkml, Linux-MM, Andrew Morton, matthew.wilcox, kirill.shutemov,
	peterz, rostedt, Kernel Team, william.kucharski



> On Jul 24, 2019, at 4:37 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> 
> On 07/24, Song Liu wrote:
>> 
>> 	lock_page(old_page);
>> @@ -177,15 +180,24 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
>> 	mmu_notifier_invalidate_range_start(&range);
>> 	err = -EAGAIN;
>> 	if (!page_vma_mapped_walk(&pvmw)) {
>> -		mem_cgroup_cancel_charge(new_page, memcg, false);
>> +		if (!orig)
>> +			mem_cgroup_cancel_charge(new_page, memcg, false);
>> 		goto unlock;
>> 	}
>> 	VM_BUG_ON_PAGE(addr != pvmw.address, old_page);
>> 
>> 	get_page(new_page);
>> -	page_add_new_anon_rmap(new_page, vma, addr, false);
>> -	mem_cgroup_commit_charge(new_page, memcg, false, false);
>> -	lru_cache_add_active_or_unevictable(new_page, vma);
>> +	if (orig) {
>> +		lock_page(new_page);  /* for page_add_file_rmap() */
>> +		page_add_file_rmap(new_page, false);
> 
> 
> Shouldn't we re-check new_page->mapping after lock_page() ? Or we can't
> race with truncate?

We can't race with truncate, because the file is open as binary and 
protected with DENYWRITE (ETXTBSY). 

> 
> 
> and I am worried this code can try to lock the same page twice...
> Say, the probed application does MADV_DONTNEED and then writes "int3"
> into vma->vm_file at the same address to fool verify_opcode().
> 

Do you mean the case where old_page == new_page? I think this won't 
happen, because in uprobe_write_opcode() we only do orig_page for 
!is_register case. 

Thanks,
Song


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/4] uprobe: use original page when all uprobes are removed
  2019-07-24 18:52     ` Song Liu
@ 2019-07-25  8:14       ` Oleg Nesterov
  2019-07-25 18:17         ` Song Liu
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2019-07-25  8:14 UTC (permalink / raw)
  To: Song Liu
  Cc: lkml, Linux-MM, Andrew Morton, matthew.wilcox, kirill.shutemov,
	peterz, rostedt, Kernel Team, william.kucharski

On 07/24, Song Liu wrote:
>
>
> > On Jul 24, 2019, at 4:37 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > On 07/24, Song Liu wrote:
> >>
> >> 	lock_page(old_page);
> >> @@ -177,15 +180,24 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
> >> 	mmu_notifier_invalidate_range_start(&range);
> >> 	err = -EAGAIN;
> >> 	if (!page_vma_mapped_walk(&pvmw)) {
> >> -		mem_cgroup_cancel_charge(new_page, memcg, false);
> >> +		if (!orig)
> >> +			mem_cgroup_cancel_charge(new_page, memcg, false);
> >> 		goto unlock;
> >> 	}
> >> 	VM_BUG_ON_PAGE(addr != pvmw.address, old_page);
> >>
> >> 	get_page(new_page);
> >> -	page_add_new_anon_rmap(new_page, vma, addr, false);
> >> -	mem_cgroup_commit_charge(new_page, memcg, false, false);
> >> -	lru_cache_add_active_or_unevictable(new_page, vma);
> >> +	if (orig) {
> >> +		lock_page(new_page);  /* for page_add_file_rmap() */
> >> +		page_add_file_rmap(new_page, false);
> >
> >
> > Shouldn't we re-check new_page->mapping after lock_page() ? Or we can't
> > race with truncate?
>
> We can't race with truncate, because the file is open as binary and
> protected with DENYWRITE (ETXTBSY).

No. Yes, deny_write_access() protects mm->exe_file, but not the dynamic
libraries or other files which can be mmaped.

> > and I am worried this code can try to lock the same page twice...
> > Say, the probed application does MADV_DONTNEED and then writes "int3"
> > into vma->vm_file at the same address to fool verify_opcode().
> >
>
> Do you mean the case where old_page == new_page?

Yes,

> I think this won't
> happen, because in uprobe_write_opcode() we only do orig_page for
> !is_register case.

See above.

!is_register doesn't necessarily mean the original page was previously cow'ed.
And even if it was cow'ed, MADV_DONTNEED can restore the original mapping.

Oleg.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/4] uprobe: use original page when all uprobes are removed
  2019-07-25  8:14       ` Oleg Nesterov
@ 2019-07-25 18:17         ` Song Liu
  2019-07-26  6:07           ` Song Liu
  2019-07-26  8:44           ` Oleg Nesterov
  0 siblings, 2 replies; 10+ messages in thread
From: Song Liu @ 2019-07-25 18:17 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: lkml, Linux-MM, Andrew Morton, matthew.wilcox, kirill.shutemov,
	peterz, rostedt, Kernel Team, william.kucharski



> On Jul 25, 2019, at 1:14 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> 
> On 07/24, Song Liu wrote:
>> 
>> 
>>> On Jul 24, 2019, at 4:37 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>>> 
>>> On 07/24, Song Liu wrote:
>>>> 
>>>> 	lock_page(old_page);
>>>> @@ -177,15 +180,24 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
>>>> 	mmu_notifier_invalidate_range_start(&range);
>>>> 	err = -EAGAIN;
>>>> 	if (!page_vma_mapped_walk(&pvmw)) {
>>>> -		mem_cgroup_cancel_charge(new_page, memcg, false);
>>>> +		if (!orig)
>>>> +			mem_cgroup_cancel_charge(new_page, memcg, false);
>>>> 		goto unlock;
>>>> 	}
>>>> 	VM_BUG_ON_PAGE(addr != pvmw.address, old_page);
>>>> 
>>>> 	get_page(new_page);
>>>> -	page_add_new_anon_rmap(new_page, vma, addr, false);
>>>> -	mem_cgroup_commit_charge(new_page, memcg, false, false);
>>>> -	lru_cache_add_active_or_unevictable(new_page, vma);
>>>> +	if (orig) {
>>>> +		lock_page(new_page);  /* for page_add_file_rmap() */
>>>> +		page_add_file_rmap(new_page, false);
>>> 
>>> 
>>> Shouldn't we re-check new_page->mapping after lock_page() ? Or we can't
>>> race with truncate?
>> 
>> We can't race with truncate, because the file is open as binary and
>> protected with DENYWRITE (ETXTBSY).
> 
> No. Yes, deny_write_access() protects mm->exe_file, but not the dynamic
> libraries or other files which can be mmaped.

I see. Let me see how we can cover this. 

> 
>>> and I am worried this code can try to lock the same page twice...
>>> Say, the probed application does MADV_DONTNEED and then writes "int3"
>>> into vma->vm_file at the same address to fool verify_opcode().
>>> 
>> 
>> Do you mean the case where old_page == new_page?
> 
> Yes,
> 
>> I think this won't
>> happen, because in uprobe_write_opcode() we only do orig_page for
>> !is_register case.
> 
> See above.
> 
> !is_register doesn't necessarily mean the original page was previously cow'ed.
> And even if it was cow'ed, MADV_DONTNEED can restore the original mapping.

I guess I know the case now. We can probably avoid this with an simp\x10le 
check for old_page == new_page?

Thanks,
Song


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/4] uprobe: use original page when all uprobes are removed
  2019-07-25 18:17         ` Song Liu
@ 2019-07-26  6:07           ` Song Liu
  2019-07-26  8:44           ` Oleg Nesterov
  1 sibling, 0 replies; 10+ messages in thread
From: Song Liu @ 2019-07-26  6:07 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: lkml, Linux-MM, Andrew Morton, matthew.wilcox, kirill.shutemov,
	peterz, rostedt, Kernel Team, william.kucharski


Hi Oleg, 

>> 
>> No. Yes, deny_write_access() protects mm->exe_file, but not the dynamic
>> libraries or other files which can be mmaped.
> 
> I see. Let me see how we can cover this. 
> 
>> 
>>>> and I am worried this code can try to lock the same page twice...
>>>> Say, the probed application does MADV_DONTNEED and then writes "int3"
>>>> into vma->vm_file at the same address to fool verify_opcode().
>>>> 
>>> 
>>> Do you mean the case where old_page == new_page?
>> 
>> Yes,
>> 
>>> I think this won't
>>> happen, because in uprobe_write_opcode() we only do orig_page for
>>> !is_register case.
>> 
>> See above.
>> 
>> !is_register doesn't necessarily mean the original page was previously cow'ed.
>> And even if it was cow'ed, MADV_DONTNEED can restore the original mapping.
> 
> I guess I know the case now. We can probably avoid this with an simp\x10le 
> check for old_page == new_page?

I decided to follow your suggestion of "unmap old_page; fault in orig_page". 
Please see v9 of the set. 

Thanks,
Song


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/4] uprobe: use original page when all uprobes are removed
  2019-07-25 18:17         ` Song Liu
  2019-07-26  6:07           ` Song Liu
@ 2019-07-26  8:44           ` Oleg Nesterov
  2019-07-26 21:19             ` Song Liu
  1 sibling, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2019-07-26  8:44 UTC (permalink / raw)
  To: Song Liu
  Cc: lkml, Linux-MM, Andrew Morton, matthew.wilcox, kirill.shutemov,
	peterz, rostedt, Kernel Team, william.kucharski

On 07/25, Song Liu wrote:
>
> I guess I know the case now. We can probably avoid this with an simp\x10le 
> check for old_page == new_page?

better yet, I think we can check PageAnon(old_page) and avoid the unnecessary
__replace_page() in this case. See the patch below.

Anyway, why __replace_page() needs to lock both pages? This doesn't look nice
even if it were correct. I think it can do lock_page(old_page) later.

Oleg.


--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -488,6 +488,10 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
 		ref_ctr_updated = 1;
 	}
 
+	ret = 0;
+	if (!is_register && !PageAnon(old_page))
+		goto put_old;
+
 	ret = anon_vma_prepare(vma);
 	if (ret)
 		goto put_old;


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v8 2/4] uprobe: use original page when all uprobes are removed
  2019-07-26  8:44           ` Oleg Nesterov
@ 2019-07-26 21:19             ` Song Liu
  0 siblings, 0 replies; 10+ messages in thread
From: Song Liu @ 2019-07-26 21:19 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: lkml, Linux-MM, Andrew Morton, matthew.wilcox, kirill.shutemov,
	peterz, rostedt, Kernel Team, william.kucharski



> On Jul 26, 2019, at 1:44 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> 
> On 07/25, Song Liu wrote:
>> 
>> I guess I know the case now. We can probably avoid this with an simp\x10le 
>> check for old_page == new_page?
> 
> better yet, I think we can check PageAnon(old_page) and avoid the unnecessary
> __replace_page() in this case. See the patch below.

I added PageAnon(old_page) check in v9 of the patch. 

> 
> Anyway, why __replace_page() needs to lock both pages? This doesn't look nice
> even if it were correct. I think it can do lock_page(old_page) later.
> 

Agreed. I have changed the v9 to only unmap old_page. So it should be cleaner. 

Thanks again for these good suggestions,
Song

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-07-26 21:20 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190724083600.832091-1-songliubraving@fb.com>
     [not found] ` <20190724083600.832091-3-songliubraving@fb.com>
2019-07-24  9:07   ` [PATCH v8 2/4] uprobe: use original page when all uprobes are removed Oleg Nesterov
2019-07-24  9:17     ` Oleg Nesterov
2019-07-24  9:20       ` Song Liu
2019-07-24 11:37   ` Oleg Nesterov
2019-07-24 18:52     ` Song Liu
2019-07-25  8:14       ` Oleg Nesterov
2019-07-25 18:17         ` Song Liu
2019-07-26  6:07           ` Song Liu
2019-07-26  8:44           ` Oleg Nesterov
2019-07-26 21:19             ` Song Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).