All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails
@ 2012-07-28 16:31 Oleg Nesterov
  2012-07-28 16:34 ` Oleg Nesterov
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Oleg Nesterov @ 2012-07-28 16:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	Srikar Dronamraju, William Cohen, linux-kernel

https://bugzilla.redhat.com/show_bug.cgi?id=843640

If mmap_region()->uprobe_mmap() fails, unmap_and_free_vma path
does unmap_region() but does not remove the soon-to-be-freed vma
from rb tree (actually there are more problems).

Perhaps we could do do_munmap() + return in this case, but in fact
it is simply wrong to abort if uprobe_mmap() fails. Until at least
we move the !UPROBE_COPY_INSN code from install_breakpoint() to
uprobe_register().

For example, uprobe_mmap()->install_breakpoint() can fail if the
probed insn is not supported (remember, uprobe_register() succeeds
if nobody mmaps inode/offset), mmap() should not fail in this case.

dup_mmap()->uprobe_mmap() is wrong too by the same reason, fork()
can race with uprobe_register() and fail for no reason if it wins
the race and does install_breakpoint() first.

Change mmap_region() and dup_mmap() to ignore the error code from
uprobe_mmap().

Reported-by: William Cohen <wcohen@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org> # v3.5
---
 kernel/fork.c |    4 ++--
 mm/mmap.c     |    5 ++---
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index ab5211b..54bb88a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -454,8 +454,8 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
 		if (retval)
 			goto out;
 
-		if (file && uprobe_mmap(tmp))
-			goto out;
+		if (file)
+			uprobe_mmap(tmp);
 	}
 	/* a new mm has just been created */
 	arch_dup_mmap(oldmm, mm);
diff --git a/mm/mmap.c b/mm/mmap.c
index 4fe2697..f25fd3f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1355,9 +1355,8 @@ out:
 	} else if ((flags & MAP_POPULATE) && !(flags & MAP_NONBLOCK))
 		make_pages_present(addr, addr + len);
 
-	if (file && uprobe_mmap(vma))
-		/* matching probes but cannot insert */
-		goto unmap_and_free_vma;
+	if (file)
+		uprobe_mmap(vma);
 
 	return addr;
 
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails
  2012-07-28 16:31 [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails Oleg Nesterov
@ 2012-07-28 16:34 ` Oleg Nesterov
  2012-07-30 13:22 ` William Cohen
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Oleg Nesterov @ 2012-07-28 16:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	Srikar Dronamraju, William Cohen, linux-kernel

On 07/28, Oleg Nesterov wrote:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=843640
>
> If mmap_region()->uprobe_mmap() fails, unmap_and_free_vma path
> does unmap_region() but does not remove the soon-to-be-freed vma
> from rb tree (actually there are more problems).

Just in case...

Ingo, this is orthogonal to other pending changes I sent. I think
3.6 (and 3.5-stable) needs this fix.

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails
  2012-07-28 16:31 [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails Oleg Nesterov
  2012-07-28 16:34 ` Oleg Nesterov
@ 2012-07-30 13:22 ` William Cohen
  2012-07-31  6:47 ` Srikar Dronamraju
  2012-08-03 17:46 ` [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails Srikar Dronamraju
  3 siblings, 0 replies; 18+ messages in thread
From: William Cohen @ 2012-07-30 13:22 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	Srikar Dronamraju, linux-kernel

On 07/28/2012 12:31 PM, Oleg Nesterov wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=843640

Hi Oleg,

I checked the following patch and it does fix the problem on the 3.5.0+ kernel.

-Will

> 
> If mmap_region()->uprobe_mmap() fails, unmap_and_free_vma path
> does unmap_region() but does not remove the soon-to-be-freed vma
> from rb tree (actually there are more problems).
> 
> Perhaps we could do do_munmap() + return in this case, but in fact
> it is simply wrong to abort if uprobe_mmap() fails. Until at least
> we move the !UPROBE_COPY_INSN code from install_breakpoint() to
> uprobe_register().
> 
> For example, uprobe_mmap()->install_breakpoint() can fail if the
> probed insn is not supported (remember, uprobe_register() succeeds
> if nobody mmaps inode/offset), mmap() should not fail in this case.
> 
> dup_mmap()->uprobe_mmap() is wrong too by the same reason, fork()
> can race with uprobe_register() and fail for no reason if it wins
> the race and does install_breakpoint() first.
> 
> Change mmap_region() and dup_mmap() to ignore the error code from
> uprobe_mmap().
> 
> Reported-by: William Cohen <wcohen@redhat.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> Cc: <stable@vger.kernel.org> # v3.5
> ---
>  kernel/fork.c |    4 ++--
>  mm/mmap.c     |    5 ++---
>  2 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index ab5211b..54bb88a 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -454,8 +454,8 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
>  		if (retval)
>  			goto out;
>  
> -		if (file && uprobe_mmap(tmp))
> -			goto out;
> +		if (file)
> +			uprobe_mmap(tmp);
>  	}
>  	/* a new mm has just been created */
>  	arch_dup_mmap(oldmm, mm);
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 4fe2697..f25fd3f 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1355,9 +1355,8 @@ out:
>  	} else if ((flags & MAP_POPULATE) && !(flags & MAP_NONBLOCK))
>  		make_pages_present(addr, addr + len);
>  
> -	if (file && uprobe_mmap(vma))
> -		/* matching probes but cannot insert */
> -		goto unmap_and_free_vma;
> +	if (file)
> +		uprobe_mmap(vma);
>  
>  	return addr;
>  


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails
  2012-07-28 16:31 [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails Oleg Nesterov
  2012-07-28 16:34 ` Oleg Nesterov
  2012-07-30 13:22 ` William Cohen
@ 2012-07-31  6:47 ` Srikar Dronamraju
  2012-07-31 12:48   ` Oleg Nesterov
  2012-08-03 17:46 ` [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails Srikar Dronamraju
  3 siblings, 1 reply; 18+ messages in thread
From: Srikar Dronamraju @ 2012-07-31  6:47 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

> 
> If mmap_region()->uprobe_mmap() fails, unmap_and_free_vma path
> does unmap_region() but does not remove the soon-to-be-freed vma
> from rb tree (actually there are more problems).
> 
> Perhaps we could do do_munmap() + return in this case, but in fact
> it is simply wrong to abort if uprobe_mmap() fails. Until at least
> we move the !UPROBE_COPY_INSN code from install_breakpoint() to
> uprobe_register().
> 
> For example, uprobe_mmap()->install_breakpoint() can fail if the
> probed insn is not supported (remember, uprobe_register() succeeds
> if nobody mmaps inode/offset), mmap() should not fail in this case.
> 
> dup_mmap()->uprobe_mmap() is wrong too by the same reason, fork()
> can race with uprobe_register() and fail for no reason if it wins
> the race and does install_breakpoint() first.
> 
> Change mmap_region() and dup_mmap() to ignore the error code from
> uprobe_mmap().
> 
> Reported-by: William Cohen <wcohen@redhat.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> Cc: <stable@vger.kernel.org> # v3.5
> ---
>  kernel/fork.c |    4 ++--
>  mm/mmap.c     |    5 ++---
>  2 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index ab5211b..54bb88a 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -454,8 +454,8 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
>  		if (retval)
>  			goto out;
> 
> -		if (file && uprobe_mmap(tmp))
> -			goto out;
> +		if (file)
> +			uprobe_mmap(tmp);

I am not comfortable with this fix.

Lets say there were 10 probes that were to be installed in that vma.
we were able to install five probes and the 6th one happened to fail
because of invalid instruction; we dont continue with the registering
probes for the remaining 4 probes.

Your fix allows probe hits for 5 registered probes that can lead to
misleading analysis. For example if one of the first five probes was a
malloc and the probe at free() was one of the last probes which wasnt
registered, then a person doing an analysis based on probes can say
there was a memory leak.

The intention behind failing mmap()/fork() if uprobe_mmap failed,
was to make sure that we always report the correct number of events.

Infact this was something that Peter advocated very strongly
http://article.gmane.org/gmane.linux.kernel.mm/59956

I think the long term solution is as you mentioned, move the
instruction analysis to register.

Till then should we just ignore invalid instruction probes similar to 
existing probes. i.e we return -ESRCH or some such which is ignored in
uprobe_mmap(), but is taken care of in the uprobe_register path.

The above may not be an elegant solution but.. 

-- 
thanks and regards
Srikar


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails
  2012-07-31  6:47 ` Srikar Dronamraju
@ 2012-07-31 12:48   ` Oleg Nesterov
  2012-07-31 13:25     ` Oleg Nesterov
  2012-08-02 10:05     ` [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap Srikar Dronamraju
  0 siblings, 2 replies; 18+ messages in thread
From: Oleg Nesterov @ 2012-07-31 12:48 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

On 07/31, Srikar Dronamraju wrote:
>
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -454,8 +454,8 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
> >  		if (retval)
> >  			goto out;
> >
> > -		if (file && uprobe_mmap(tmp))
> > -			goto out;
> > +		if (file)
> > +			uprobe_mmap(tmp);
>
> I am not comfortable with this fix.

OK, so what you suggest for now?

Please note that it is very trivial to crash the kernel. Just
do something like

	echo "p /bin/true:OFFSET_OF_SYSCALL_INSN" > /sys/kernel/debug/tracing/uprobe_events
	/bin/true

(or any other unsupported insn)

Yes sure, I agree that in the long term this change should be
reconsidered.

> I think the long term solution is as you mentioned, move the
> instruction analysis to register.

Exactly. And we already discusssed this, we have a lot of other
reasons to do this.

> Lets say there were 10 probes that were to be installed in that vma.
> we were able to install five probes and the 6th one happened to fail
> because of invalid instruction; we dont continue with the registering
> probes for the remaining 4 probes.

Yes. And I already have the patch. I didn't send it because, unlike
this fix, it depends on other changes (already in -tip).

Until we move analysis to register, until we teach the callers of
uprobe_mmap() to bailout (and please note that vma_adjust() ignores
the result too), uprobe_mmap() should not give up if install fails,
it should continue.

> The intention behind failing mmap()/fork() if uprobe_mmap failed,
> was to make sure that we always report the correct number of events.

Sure, I understand and agree.

But what we can do right now?

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails
  2012-07-31 12:48   ` Oleg Nesterov
@ 2012-07-31 13:25     ` Oleg Nesterov
  2012-08-02 10:05     ` [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap Srikar Dronamraju
  1 sibling, 0 replies; 18+ messages in thread
From: Oleg Nesterov @ 2012-07-31 13:25 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

On 07/31, Oleg Nesterov wrote:
>
> OK, so what you suggest for now?
>
> Please note that it is very trivial to crash the kernel. Just
> do something like
>
> 	echo "p /bin/true:OFFSET_OF_SYSCALL_INSN" > /sys/kernel/debug/tracing/uprobe_events
> 	/bin/true

Forgot to mention...

And even it it didn't crash, mmap() (and thus exec) should not fail.

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap
  2012-07-31 12:48   ` Oleg Nesterov
  2012-07-31 13:25     ` Oleg Nesterov
@ 2012-08-02 10:05     ` Srikar Dronamraju
  2012-08-02 13:53       ` Oleg Nesterov
  2012-08-02 14:17       ` Oleg Nesterov
  1 sibling, 2 replies; 18+ messages in thread
From: Srikar Dronamraju @ 2012-08-02 10:05 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

uprobe_mmap()->install_breakpoint() can fail if the probed insn is not
supported (remember, uprobe_register() succeeds if nobody mmaps
inode/offset). Failure in uprobe_mmap() causes mmap_region/do_fork to
fail too.

However failing mmap_region()/do_fork() because of a probe on an
unsupported instruction is wrong.

Hence change uprobe_mmap() to ignore unsupported instructions.

Oleg Nesterov analyzed the root cause of this problem.

While at it, add a missing put_uprobe() in the path where uprobe_mmap()
races with uprobe_unregister().

Reported-by: William Cohen <wcohen@redhat.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org> # v3.5
---
 kernel/events/uprobes.c |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index c08a22d..c8a8c39 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1051,8 +1051,10 @@ int uprobe_mmap(struct vm_area_struct *vma)
 			if (ret == -EEXIST) {
 				ret = 0;
 
-				if (!is_swbp_at_addr(vma->vm_mm, vaddr))
+				if (!is_swbp_at_addr(vma->vm_mm, vaddr)) {
+					put_uprobe(uprobe);
 					continue;
+				}
 
 				/*
 				 * Unable to insert a breakpoint, but
@@ -1060,6 +1062,15 @@ int uprobe_mmap(struct vm_area_struct *vma)
 				 * probe count.
 				 */
 				atomic_inc(&vma->vm_mm->uprobes_state.count);
+			} else if (ret == -ENOTSUPP) {
+				/*
+				 * A probe at unsupported instruction
+				 * shouldnt cause mmap_region() / do_fork()
+				 * to fail.
+				 */
+				ret = 0;
+				put_uprobe(uprobe);
+				continue;
 			}
 
 			if (!ret)


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap
  2012-08-02 10:05     ` [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap Srikar Dronamraju
@ 2012-08-02 13:53       ` Oleg Nesterov
  2012-08-02 16:42         ` Srikar Dronamraju
  2012-08-03 12:13         ` Srikar Dronamraju
  2012-08-02 14:17       ` Oleg Nesterov
  1 sibling, 2 replies; 18+ messages in thread
From: Oleg Nesterov @ 2012-08-02 13:53 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

On 08/02, Srikar Dronamraju wrote:
>
> uprobe_mmap()->install_breakpoint() can fail if the probed insn is not
> supported

But there are other reasons why it can fail,

> However failing mmap_region()/do_fork() because of a probe on an
> unsupported instruction is wrong.

Srikar, I strongly, absolutely disagree. Please correct me, but..

Do you realize how much uprobes hooks in mmap_region/dup_mmap are broken?

(cough, can't resist, vma_adjust()->uprobe_mmap() is not right too, but
 this is another story).

OK, lets start with dup_mmap:

		// retval == 0

		if (file && uprobe_mmap(tmp))
			goto out;

	out:
		up_write(&mm->mmap_sem);
		flush_tlb_mm(oldmm);
		up_write(&oldmm->mmap_sem);
		return retval;

Given that retval == 0, what do you think dup_mmap() returns if
uprobe_mmap() fails? And note that we didn't copy all vmas.
OK, at least this can't crash (afaics), and easy to fix.


But mmap_region() is worse, much worse. It simply can _not_ fail
after uprobe_mmap (of course, I am not saying this is unfixable)
without the crash. And note that the crash is "delayed". And btw,
like dup_mmap(), mmap_region() doesn't return the error too.

Srikar, I strongly believe this horror must not exist. Either
we should teach mmap_region() and dup_mmap() (and vma_adjust!)
to fail correctly, or we should ignore the error code.

It is that simple, isn't it?

Whatever you do with uprobe_mmap(), even if you change it to always
return 0, the code in mmap_region() is absolutely, absolutely broken.

Do you agree?

And once again. I agree, in the long term we should reconsider
this change. But we need a simple fix for now/stable.

> Hence change uprobe_mmap() to ignore unsupported instructions.

OK. Now suppose that mmap_region()-> uprobe_mmap() fails because
the caller is SIGKILL'ed (so __get_user_pages fails). Given that
mmap_region() can't handle the error correctly, the kernel can
crash.

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap
  2012-08-02 10:05     ` [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap Srikar Dronamraju
  2012-08-02 13:53       ` Oleg Nesterov
@ 2012-08-02 14:17       ` Oleg Nesterov
  2012-08-02 16:54         ` Srikar Dronamraju
  1 sibling, 1 reply; 18+ messages in thread
From: Oleg Nesterov @ 2012-08-02 14:17 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

Forgot to mention...

On 08/02, Srikar Dronamraju wrote:
>
> While at it, add a missing put_uprobe() in the path where uprobe_mmap()
> races with uprobe_unregister().
> ...
> @@ -1051,8 +1051,10 @@ int uprobe_mmap(struct vm_area_struct *vma)
>  			if (ret == -EEXIST) {
>  				ret = 0;
>
> -				if (!is_swbp_at_addr(vma->vm_mm, vaddr))
> +				if (!is_swbp_at_addr(vma->vm_mm, vaddr)) {
> +					put_uprobe(uprobe);
>  					continue;
> +				}

Yes, this part looks correct.

In fact, I think this is not really correct anyway (wrt counter)
but we are going to kill it.

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap
  2012-08-02 13:53       ` Oleg Nesterov
@ 2012-08-02 16:42         ` Srikar Dronamraju
  2012-08-02 17:48           ` Oleg Nesterov
  2012-08-03 12:13         ` Srikar Dronamraju
  1 sibling, 1 reply; 18+ messages in thread
From: Srikar Dronamraju @ 2012-08-02 16:42 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

> > uprobe_mmap()->install_breakpoint() can fail if the probed insn is not
> > supported
> 
> But there are other reasons why it can fail,
> 
> > However failing mmap_region()/do_fork() because of a probe on an
> > unsupported instruction is wrong.
> 
> Srikar, I strongly, absolutely disagree. Please correct me, but..
> 
> Do you realize how much uprobes hooks in mmap_region/dup_mmap are broken?
> 
> (cough, can't resist, vma_adjust()->uprobe_mmap() is not right too, but
>  this is another story).
> 
> OK, lets start with dup_mmap:
> 
> 		// retval == 0
> 
> 		if (file && uprobe_mmap(tmp))
> 			goto out;
> 
> 	out:
> 		up_write(&mm->mmap_sem);
> 		flush_tlb_mm(oldmm);
> 		up_write(&oldmm->mmap_sem);
> 		return retval;
> 
> Given that retval == 0, what do you think dup_mmap() returns if
> uprobe_mmap() fails? And note that we didn't copy all vmas.
> OK, at least this can't crash (afaics), and easy to fix.
> 
> 
> But mmap_region() is worse, much worse. It simply can _not_ fail
> after uprobe_mmap (of course, I am not saying this is unfixable)
> without the crash. And note that the crash is "delayed". And btw,
> like dup_mmap(), mmap_region() doesn't return the error too.

Oh, thats really a bad mistake from my side. I apologize.

> 
> Srikar, I strongly believe this horror must not exist. Either
> we should teach mmap_region() and dup_mmap() (and vma_adjust!)
> to fail correctly, or we should ignore the error code.
> 
> It is that simple, isn't it?
> 

Right.

> Whatever you do with uprobe_mmap(), even if you change it to always
> return 0, the code in mmap_region() is absolutely, absolutely broken.
> 

But uprobe_mmap() still needs to walk thro the probe list and 
insert the other probes. Currently we just ignore the remaining probes
and that has to be fixed within uprobe_mmap.

> Do you agree?
> 
> And once again. I agree, in the long term we should reconsider
> this change. But we need a simple fix for now/stable.
> 

Agree.

> > Hence change uprobe_mmap() to ignore unsupported instructions.
> 
> OK. Now suppose that mmap_region()-> uprobe_mmap() fails because
> the caller is SIGKILL'ed (so __get_user_pages fails). Given that
> mmap_region() can't handle the error correctly, the kernel can
> crash.
> 

How about having your fix + my patch to fix the uprobe_mmap()?

-- 
Thanks and Regards
Srikar


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap
  2012-08-02 14:17       ` Oleg Nesterov
@ 2012-08-02 16:54         ` Srikar Dronamraju
  2012-08-02 17:53           ` Oleg Nesterov
  0 siblings, 1 reply; 18+ messages in thread
From: Srikar Dronamraju @ 2012-08-02 16:54 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

* Oleg Nesterov <oleg@redhat.com> [2012-08-02 16:17:57]:

> Forgot to mention...
> 
> On 08/02, Srikar Dronamraju wrote:
> >
> > While at it, add a missing put_uprobe() in the path where uprobe_mmap()
> > races with uprobe_unregister().
> > ...
> > @@ -1051,8 +1051,10 @@ int uprobe_mmap(struct vm_area_struct *vma)
> >  			if (ret == -EEXIST) {
> >  				ret = 0;
> >
> > -				if (!is_swbp_at_addr(vma->vm_mm, vaddr))
> > +				if (!is_swbp_at_addr(vma->vm_mm, vaddr)) {
> > +					put_uprobe(uprobe);
> >  					continue;
> > +				}
> 
> Yes, this part looks correct.
> 
> In fact, I think this is not really correct anyway (wrt counter)
> but we are going to kill it.
> 
> 

Are you expecting the counter to be decreased/increased here?

This is case where the uprobe_mmap() and uprobe_unregister() raced, and
by the time install_breakpoint() was called by uprobe_mmap(), there were
no consumers.  i.e there are no uprobe->consumers and the underlying
instruction is still not a breakpoint instruction. 

Since we are refusing to add a breakpoint and that there is no
breakpoint, there is no need to increment/decrement the counter here.

Do let me know if I have missed something.

-- 
Thanks and Regards
Srikar


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap
  2012-08-02 16:42         ` Srikar Dronamraju
@ 2012-08-02 17:48           ` Oleg Nesterov
  0 siblings, 0 replies; 18+ messages in thread
From: Oleg Nesterov @ 2012-08-02 17:48 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

On 08/02, Srikar Dronamraju wrote:
>
> How about having your fix + my patch to fix the uprobe_mmap()?

(I guess you mean -ENOTSUPP part)

Perhaps...

But, as I said, I think that until we fix the callers uprobe_mmap()
should ignore all errors and proceed. Except, perhaps, it makes sense
to check fatal_signal_pending() to not upset oom-killer.

I already had this patch, but then I decided it would be better to
do other changes first. See my next reply to your another email.

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap
  2012-08-02 16:54         ` Srikar Dronamraju
@ 2012-08-02 17:53           ` Oleg Nesterov
  2012-08-03  1:20             ` Srikar Dronamraju
  0 siblings, 1 reply; 18+ messages in thread
From: Oleg Nesterov @ 2012-08-02 17:53 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

On 08/02, Srikar Dronamraju wrote:
>
> * Oleg Nesterov <oleg@redhat.com> [2012-08-02 16:17:57]:
>
> > Forgot to mention...
> >
> > On 08/02, Srikar Dronamraju wrote:
> > >
> > > While at it, add a missing put_uprobe() in the path where uprobe_mmap()
> > > races with uprobe_unregister().
> > > ...
> > > @@ -1051,8 +1051,10 @@ int uprobe_mmap(struct vm_area_struct *vma)
> > >  			if (ret == -EEXIST) {
> > >  				ret = 0;
> > >
> > > -				if (!is_swbp_at_addr(vma->vm_mm, vaddr))
> > > +				if (!is_swbp_at_addr(vma->vm_mm, vaddr)) {
> > > +					put_uprobe(uprobe);
> > >  					continue;
> > > +				}
> >
> > Yes, this part looks correct.
> >
> > In fact, I think this is not really correct anyway (wrt counter)
> > but we are going to kill it.
> >
> >
>
> Are you expecting the counter to be decreased/increased here?

uprobes_state.count is very wrong, afaics. I'll try to send the fixes
"soon", after we solve the pending problems (this one + stepping).

> This is case where the uprobe_mmap() and uprobe_unregister() raced, and
> by the time install_breakpoint() was called by uprobe_mmap(), there were
> no consumers.

Yes, exactly, and this case doesn't look 100% right too,

> i.e there are no uprobe->consumers and the underlying
> instruction is still not a breakpoint instruction.

Yes, but what if it _IS_ "int3" ?

Yet another reason to move arch_uprobe_analyze_insn/etc to _register.

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap
  2012-08-02 17:53           ` Oleg Nesterov
@ 2012-08-03  1:20             ` Srikar Dronamraju
  2012-08-03 13:47               ` Oleg Nesterov
  0 siblings, 1 reply; 18+ messages in thread
From: Srikar Dronamraju @ 2012-08-03  1:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

* Oleg Nesterov <oleg@redhat.com> [2012-08-02 19:53:12]:

> On 08/02, Srikar Dronamraju wrote:
> >
> > * Oleg Nesterov <oleg@redhat.com> [2012-08-02 16:17:57]:
> >
> > > Forgot to mention...
> > >
> > > On 08/02, Srikar Dronamraju wrote:
> > > >
> > > > While at it, add a missing put_uprobe() in the path where uprobe_mmap()
> > > > races with uprobe_unregister().
> > > > ...
> > > > @@ -1051,8 +1051,10 @@ int uprobe_mmap(struct vm_area_struct *vma)
> > > >  			if (ret == -EEXIST) {
> > > >  				ret = 0;
> > > >
> > > > -				if (!is_swbp_at_addr(vma->vm_mm, vaddr))
> > > > +				if (!is_swbp_at_addr(vma->vm_mm, vaddr)) {
> > > > +					put_uprobe(uprobe);
> > > >  					continue;
> > > > +				}
> > >
> > > Yes, this part looks correct.
> > >
> > > In fact, I think this is not really correct anyway (wrt counter)
> > > but we are going to kill it.
> > >
> > >
> >
> > Are you expecting the counter to be decreased/increased here?
> 
> uprobes_state.count is very wrong, afaics. I'll try to send the fixes
> "soon", after we solve the pending problems (this one + stepping).
> 
> > This is case where the uprobe_mmap() and uprobe_unregister() raced, and
> > by the time install_breakpoint() was called by uprobe_mmap(), there were
> > no consumers.
> 
> Yes, exactly, and this case doesn't look 100% right too,
> 
> > i.e there are no uprobe->consumers and the underlying
> > instruction is still not a breakpoint instruction.
> 
> Yes, but what if it _IS_ "int3" ?

for int3, install_breakpoint returns -ENOTSUPP as install_breakpoint
does an explicit check if the instruction is breakpoint instruction 
and x86 analyse_insn() also returns -ENOTSUPP.

> 
> Yet another reason to move arch_uprobe_analyze_insn/etc to _register.
> 

I am for moving the stuff to _register that avoids us from looking at
these cases.

-- 
Thanks and Regards


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap
  2012-08-02 13:53       ` Oleg Nesterov
  2012-08-02 16:42         ` Srikar Dronamraju
@ 2012-08-03 12:13         ` Srikar Dronamraju
  2012-08-03 13:38           ` Oleg Nesterov
  1 sibling, 1 reply; 18+ messages in thread
From: Srikar Dronamraju @ 2012-08-03 12:13 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

> OK, lets start with dup_mmap:
> 
> 		// retval == 0
> 
> 		if (file && uprobe_mmap(tmp))
> 			goto out;
> 
> 	out:
> 		up_write(&mm->mmap_sem);
> 		flush_tlb_mm(oldmm);
> 		up_write(&oldmm->mmap_sem);
> 		return retval;
> 
> Given that retval == 0, what do you think dup_mmap() returns if
> uprobe_mmap() fails? And note that we didn't copy all vmas.
> OK, at least this can't crash (afaics), and easy to fix.
> 
> 
> But mmap_region() is worse, much worse. It simply can _not_ fail
> after uprobe_mmap (of course, I am not saying this is unfixable)
> without the crash. And note that the crash is "delayed". And btw,
> like dup_mmap(), mmap_region() doesn't return the error too.
> 
> Srikar, I strongly believe this horror must not exist. Either
> we should teach mmap_region() and dup_mmap() (and vma_adjust!)
> to fail correctly, or we should ignore the error code.
> 
> It is that simple, isn't it?

I think you would have thought of this approach already but just wanted
to check if below is fine with you.

diff --git a/kernel/fork.c b/kernel/fork.c
index f00e319..78bfd94 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -456,10 +456,10 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
 		if (tmp->vm_ops && tmp->vm_ops->open)
 			tmp->vm_ops->open(tmp);
 
-		if (retval)
-			goto out;
+		if (!retval && file)
+			retval = uprobe_mmap(tmp);
 
-		if (file && uprobe_mmap(tmp))
+		if (retval)
 			goto out;
 	}
 	/* a new mm has just been created */
diff --git a/mm/mmap.c b/mm/mmap.c
index 4fe2697..91d36fb 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1355,9 +1355,12 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	} else if ((flags & MAP_POPULATE) && !(flags & MAP_NONBLOCK))
 		make_pages_present(addr, addr + len);
 
-	if (file && uprobe_mmap(vma))
+	if (file) {
+		error = uprobe_mmap(vma)
 		/* matching probes but cannot insert */
-		goto unmap_and_free_vma;
+		if (error)
+			goto unmap_and_free_vma;
+	}
 
 	return addr;
 
Basically, I am setting the return value of uprobe_mmap() so that
mmap_region and do_fork() fail.  This still needs the fix in
uprobe_mmap() to ignore unsupported instructions.

I am completely _okay_ with not setting the return values as proposed by you. 
Just that setting return values and the fix in uprobe_mmap() might help
in minor issues

- We either probe all probes where possible or intimate to the user that
  the requested operation wasnt successful.

- If valid probes follow a probe with invalid instruction, we still
  allow valid probes.

- If get_user_pages()/set_swbp() fail because of genuine reasons like
  ENOMEM, then we dont retry to place probes on the subsequent vmas.

-- 
thanks and regards
Srikar


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap
  2012-08-03 12:13         ` Srikar Dronamraju
@ 2012-08-03 13:38           ` Oleg Nesterov
  0 siblings, 0 replies; 18+ messages in thread
From: Oleg Nesterov @ 2012-08-03 13:38 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

On 08/03, Srikar Dronamraju wrote:
>
> > But mmap_region() is worse, much worse. It simply can _not_ fail
> > after uprobe_mmap (of course, I am not saying this is unfixable)
> > without the crash. And note that the crash is "delayed". And btw,
> > like dup_mmap(), mmap_region() doesn't return the error too.
> >
> > Srikar, I strongly believe this horror must not exist. Either
> > we should teach mmap_region() and dup_mmap() (and vma_adjust!)
> > to fail correctly, or we should ignore the error code.
> >
> > It is that simple, isn't it?
>
> I think you would have thought of this approach already but just wanted
> to check if below is fine with you.
>
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1355,9 +1355,12 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>  	} else if ((flags & MAP_POPULATE) && !(flags & MAP_NONBLOCK))
>  		make_pages_present(addr, addr + len);
>
> -	if (file && uprobe_mmap(vma))
> +	if (file) {
> +		error = uprobe_mmap(vma)
>  		/* matching probes but cannot insert */
> -		goto unmap_and_free_vma;
> +		if (error)
> +			goto unmap_and_free_vma;

No, this is wrong.

Please look at the code under unmap_and_free_vma. The main part is
unmap_region(), but this does NOT remove vma from mm->mm_rb and then
this vma is kmem_cache_free'ed. IOW, this leaves the freed object
in rb tree.

Afaics, there are other things this error path doesn't do but this
is how William noticed the bug (kernel hang). I don't think the fix
is suitable for stable.

Srikar, can't we make these fixes on top of my simple patch which
fixes the easy-to-trigger kernel crashes/hangs?

If yes, may be you can ack that patch for Ingo?


And yes, uprobe_mmap() needs changes too. But can't we do this a
bit later? Currently uprobes_state.count is very wrong, it simply
does not count uprobes correctly even in the very simple cases.

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap
  2012-08-03  1:20             ` Srikar Dronamraju
@ 2012-08-03 13:47               ` Oleg Nesterov
  0 siblings, 0 replies; 18+ messages in thread
From: Oleg Nesterov @ 2012-08-03 13:47 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

On 08/03, Srikar Dronamraju wrote:
>
> * Oleg Nesterov <oleg@redhat.com> [2012-08-02 19:53:12]:
>
> > On 08/02, Srikar Dronamraju wrote:
> > >
> > > This is case where the uprobe_mmap() and uprobe_unregister() raced, and
> > > by the time install_breakpoint() was called by uprobe_mmap(), there were
> > > no consumers.
> >
> > Yes, exactly, and this case doesn't look 100% right too,
> >
> > > i.e there are no uprobe->consumers and the underlying
> > > instruction is still not a breakpoint instruction.
> >
> > Yes, but what if it _IS_ "int3" ?
>
> for int3, install_breakpoint returns -ENOTSUPP as install_breakpoint
> does an explicit check if the instruction is breakpoint instruction
> and x86 analyse_insn() also returns -ENOTSUPP.

install_breakpoint() checks ->consumers first and returns EEXIST.

OK. Suppose that the probed insn is int3, and nobody mmaps it.

	1. uprobe_register() succeeds

	2. uprobe_unregister() is called, it does consumer_del(),
	   but before it calls delete_uprobe()...

	3. uprobe_mmap() finds this uprobe and install_breakpoint()
	   returns -EEXIST.

We could fix this particular problem (and other similar), but I think
this is pointless. This all is broken. Please give me some time to try
to make a patch which removes this all.

> > Yet another reason to move arch_uprobe_analyze_insn/etc to _register.
> >
>
> I am for moving the stuff to _register that avoids us from looking at
> these cases.

Yes. Lets try to do this step-by-step, after we fix the pending/discussed
problems.

Oleg.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails
  2012-07-28 16:31 [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails Oleg Nesterov
                   ` (2 preceding siblings ...)
  2012-07-31  6:47 ` Srikar Dronamraju
@ 2012-08-03 17:46 ` Srikar Dronamraju
  3 siblings, 0 replies; 18+ messages in thread
From: Srikar Dronamraju @ 2012-08-03 17:46 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, Frank Ch. Eigler, Peter Zijlstra,
	William Cohen, linux-kernel

* Oleg Nesterov <oleg@redhat.com> [2012-07-28 18:31:57]:

> https://bugzilla.redhat.com/show_bug.cgi?id=843640
> 
> If mmap_region()->uprobe_mmap() fails, unmap_and_free_vma path
> does unmap_region() but does not remove the soon-to-be-freed vma
> from rb tree (actually there are more problems).
> 
> Perhaps we could do do_munmap() + return in this case, but in fact
> it is simply wrong to abort if uprobe_mmap() fails. Until at least
> we move the !UPROBE_COPY_INSN code from install_breakpoint() to
> uprobe_register().
> 
> For example, uprobe_mmap()->install_breakpoint() can fail if the
> probed insn is not supported (remember, uprobe_register() succeeds
> if nobody mmaps inode/offset), mmap() should not fail in this case.
> 
> dup_mmap()->uprobe_mmap() is wrong too by the same reason, fork()
> can race with uprobe_register() and fail for no reason if it wins
> the race and does install_breakpoint() first.
> 
> Change mmap_region() and dup_mmap() to ignore the error code from
> uprobe_mmap().
> 
> Reported-by: William Cohen <wcohen@redhat.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> Cc: <stable@vger.kernel.org> # v3.5

After discussions with Oleg, its very clear that this apt fix for now.

Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

> ---
>  kernel/fork.c |    4 ++--
>  mm/mmap.c     |    5 ++---
>  2 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index ab5211b..54bb88a 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -454,8 +454,8 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
>  		if (retval)
>  			goto out;
> 
> -		if (file && uprobe_mmap(tmp))
> -			goto out;
> +		if (file)
> +			uprobe_mmap(tmp);
>  	}
>  	/* a new mm has just been created */
>  	arch_dup_mmap(oldmm, mm);
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 4fe2697..f25fd3f 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1355,9 +1355,8 @@ out:
>  	} else if ((flags & MAP_POPULATE) && !(flags & MAP_NONBLOCK))
>  		make_pages_present(addr, addr + len);
> 
> -	if (file && uprobe_mmap(vma))
> -		/* matching probes but cannot insert */
> -		goto unmap_and_free_vma;
> +	if (file)
> +		uprobe_mmap(vma);
> 
>  	return addr;
> 
> -- 
> 1.5.5.1
> 
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2012-08-03 17:47 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-28 16:31 [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails Oleg Nesterov
2012-07-28 16:34 ` Oleg Nesterov
2012-07-30 13:22 ` William Cohen
2012-07-31  6:47 ` Srikar Dronamraju
2012-07-31 12:48   ` Oleg Nesterov
2012-07-31 13:25     ` Oleg Nesterov
2012-08-02 10:05     ` [PATCH] uprobes: Ignore unsupported instructions in uprobe_mmap Srikar Dronamraju
2012-08-02 13:53       ` Oleg Nesterov
2012-08-02 16:42         ` Srikar Dronamraju
2012-08-02 17:48           ` Oleg Nesterov
2012-08-03 12:13         ` Srikar Dronamraju
2012-08-03 13:38           ` Oleg Nesterov
2012-08-02 14:17       ` Oleg Nesterov
2012-08-02 16:54         ` Srikar Dronamraju
2012-08-02 17:53           ` Oleg Nesterov
2012-08-03  1:20             ` Srikar Dronamraju
2012-08-03 13:47               ` Oleg Nesterov
2012-08-03 17:46 ` [PATCH] uprobes: mmap_region() corrupts mm->mm_rb if uprobe_mmap() fails Srikar Dronamraju

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.