* [PATCH] Fix race in process_vm_rw_core
@ 2012-01-13 11:30 Christopher Yeoh
  2012-01-13 16:04 ` Oleg Nesterov
  2012-01-13 22:30 ` Andrew Morton
  0 siblings, 2 replies; 8+ messages in thread
From: Christopher Yeoh @ 2012-01-13 11:30 UTC (permalink / raw)
  To: linux-kernel, Oleg Nesterov, Linus Torvalds; +Cc: Andrew Morton, David Howells

Hi Linus,

Below is a patch which fixes the race in process_vm_rw_core found by
Oleg (http://article.gmane.org/gmane.linux.kernel/1235667/).
It consolidates some code with mm_for_maps since what they do is almost
identical.

Oleg - I've kept the breakout of ptrace_may_attach and get_task_mm to
preserve only having to take the task lock once. I see some performance
difference with a microbenchmark but haven't had a chance to test with
some HPC benchmarks yet so for the moment I'd like to leave it in. At
this stage I think it's more important to get the race fixed and I'm at
Linux.conf.au all next week. I'll send a patch out for the
rw_copy_check_uvector cleanup after I get back from LCA.

Regards,

Chris
-- 
cyeoh@au.ibm.com
Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Cc: stable@vger.kernel.org
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 851ba3d..094d650 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -254,22 +254,7 @@ static struct mm_struct *check_mem_permission(struct task_struct *task)
 
 struct mm_struct *mm_for_maps(struct task_struct *task)
 {
-	struct mm_struct *mm;
-	int err;
-
-	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
-	if (err)
-		return ERR_PTR(err);
-
-	mm = get_task_mm(task);
-	if (mm && mm != current->mm &&
-			!ptrace_may_access(task, PTRACE_MODE_READ)) {
-		mmput(mm);
-		mm = ERR_PTR(-EACCES);
-	}
-	mutex_unlock(&task->signal->cred_guard_mutex);
-
-	return mm;
+	return get_check_task_mm(task, PTRACE_MODE_READ);
 }
 
 static int proc_pid_cmdline(struct task_struct *task, char * buffer)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 1c4f3e9..8a64cae 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2235,6 +2235,10 @@ static inline void mmdrop(struct mm_struct * mm)
 extern void mmput(struct mm_struct *);
 /* Grab a reference to a task's mm, if it is not already going away */
 extern struct mm_struct *get_task_mm(struct task_struct *task);
+/* Grab a reference to a task's mm, if it is not already going away
+   and ptrace_may_access with the mode parameter passed to it succeeds */
+extern struct mm_struct *get_check_task_mm(struct task_struct *task,
+					   unsigned int mode);
 /* Remove the current tasks stale references to the old mm_struct */
 extern void mm_release(struct task_struct *, struct mm_struct *);
 /* Allocate a new mm structure and copy contents from tsk->mm */
diff --git a/kernel/fork.c b/kernel/fork.c
index da4a6a1..9688fb0 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -644,6 +644,37 @@ struct mm_struct *get_task_mm(struct task_struct *task)
 }
 EXPORT_SYMBOL_GPL(get_task_mm);
 
+struct mm_struct *get_check_task_mm(struct task_struct *task, unsigned int mode)
+{
+	struct mm_struct *mm;
+	int err;
+
+	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
+	if (err)
+		return ERR_PTR(err);
+
+	task_lock(task);
+	if (__ptrace_may_access(task, mode)) {
+		mm = ERR_PTR(-EACCES);
+		goto out;
+	}
+
+	mm = task->mm;
+	if (mm) {
+		if (task->flags & PF_KTHREAD)
+			mm = NULL;
+		else
+			atomic_inc(&mm->mm_users);
+	}
+
+out:
+	task_unlock(task);
+	mutex_unlock(&task->signal->cred_guard_mutex);
+
+	return mm;
+}
+EXPORT_SYMBOL_GPL(get_check_task_mm);
+
 /* Please note the differences between mmput and mm_release.
  * mmput is called whenever we stop holding onto a mm_struct,
  * error success whatever.
diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index e920aa3..aa8009d 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -298,23 +298,15 @@ static ssize_t process_vm_rw_core(pid_t pid, const struct iovec *lvec,
 		goto free_proc_pages;
 	}
 
-	task_lock(task);
-	if (__ptrace_may_access(task, PTRACE_MODE_ATTACH)) {
-		task_unlock(task);
-		rc = -EPERM;
-		goto put_task_struct;
-	}
-	mm = task->mm;
-
-	if (!mm || (task->flags & PF_KTHREAD)) {
-		task_unlock(task);
-		rc = -EINVAL;
+	mm = get_check_task_mm(task, PTRACE_MODE_ATTACH);
+	if (!mm || IS_ERR(mm)) {
+		if (!mm)
+			rc = -EINVAL;
+		else
+			rc = -EPERM;
 		goto put_task_struct;
 	}
 
-	atomic_inc(&mm->mm_users);
-	task_unlock(task);
-
 	for (i = 0; i < riovcnt && iov_l_curr_idx < liovcnt; i++) {
 		rc = process_vm_rw_single_vec(
 			(unsigned long)rvec[i].iov_base, rvec[i].iov_len,



* Re: [PATCH] Fix race in process_vm_rw_core
  2012-01-13 11:30 [PATCH] Fix race in process_vm_rw_core Christopher Yeoh
@ 2012-01-13 16:04 ` Oleg Nesterov
  2012-01-13 23:26   ` Christopher Yeoh
  2012-01-13 22:30 ` Andrew Morton
  1 sibling, 1 reply; 8+ messages in thread
From: Oleg Nesterov @ 2012-01-13 16:04 UTC (permalink / raw)
  To: Christopher Yeoh
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, David Howells

On 01/13, Christopher Yeoh wrote:
>
> Hi Linus,
>
> Below is a patch which fixes the race in process_vm_rw_core found by
> Oleg (http://article.gmane.org/gmane.linux.kernel/1235667/).
> It consolidates some code with mm_for_maps since what they do is almost
> identical.
>
> Oleg - I've kept the breakout of ptrace_may_attach and get_task_mm to
> preserve only having to take the task lock once. I see some performance
> difference with a microbenchmark but haven't had a chance to test with
> some HPC benchmarks yet so for the moment I'd like to leave it in.

I still think we should avoid the copy-and-paste code, we can do this
without the extra unlock+lock if it hurts.

However,

> At
> this stage I think it's more important to get the race fixed and I'm at
> Linux.conf.au all next week. I'll send a patch out for the
> rw_copy_check_uvector cleanup after I get back from LCA.

OK, let's fix the bug first.

>  struct mm_struct *mm_for_maps(struct task_struct *task)
>  {
> -	struct mm_struct *mm;
> -	int err;
> -
> -	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
> -	if (err)
> -		return ERR_PTR(err);
> -
> -	mm = get_task_mm(task);
> -	if (mm && mm != current->mm &&
> -			!ptrace_may_access(task, PTRACE_MODE_READ)) {
> -		mmput(mm);
> -		mm = ERR_PTR(-EACCES);
> -	}
> -	mutex_unlock(&task->signal->cred_guard_mutex);
> -
> -	return mm;
> +	return get_check_task_mm(task, PTRACE_MODE_READ);
>  }
> ...
> +struct mm_struct *get_check_task_mm(struct task_struct *task, unsigned int mode)
> +{
> +	struct mm_struct *mm;
> +	int err;
> +
> +	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
> +	if (err)
> +		return ERR_PTR(err);
> +
> +	task_lock(task);
> +	if (__ptrace_may_access(task, mode)) {
> +		mm = ERR_PTR(-EACCES);
> +		goto out;
> +	}

Probably you should check "mm != current->mm" before __ptrace_may_access(),
otherwise this changes the rules for, say, /proc/pid/maps.

> @@ -298,23 +298,15 @@ static ssize_t process_vm_rw_core(pid_t pid, const struct iovec *lvec,
>  		goto free_proc_pages;
>  	}
>
> -	task_lock(task);
> -	if (__ptrace_may_access(task, PTRACE_MODE_ATTACH)) {
> -		task_unlock(task);
> -		rc = -EPERM;
> -		goto put_task_struct;
> -	}
> -	mm = task->mm;
> -
> -	if (!mm || (task->flags & PF_KTHREAD)) {
> -		task_unlock(task);
> -		rc = -EINVAL;
> +	mm = get_check_task_mm(task, PTRACE_MODE_ATTACH);
> +	if (!mm || IS_ERR(mm)) {
> +		if (!mm)
> +			rc = -EINVAL;
> +		else
> +			rc = -EPERM;

Cosmetic nit. I won't insist, but why is -EPERM better than the -EACCES
returned by get_check_task_mm()? IOW, why not rc = PTR_ERR() ?

Note that get_check_task_mm() can return -EINTR, in this case -EPERM
looks confusing even if this doesn't really matter (the killed task
can't return to usermode).
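
The difference between collapsing every error to -EPERM and propagating
the encoded errno can be sketched in userspace. This is only an
illustration: err_ptr(), ptr_err() and is_err() are local stand-ins
modelled on the kernel's ERR_PTR helpers, and rc_collapse() and
rc_propagate() are hypothetical names, not functions from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR helpers (illustrative only). */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int is_err(const void *ptr)
{
	/* Error values occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Variant A: as in the posted patch, any error pointer becomes -EPERM. */
static long rc_collapse(const void *mm)
{
	if (!mm)
		return -EINVAL;
	if (is_err(mm))
		return -EPERM;
	return 0;
}

/* Variant B: the PTR_ERR() suggestion, the encoded errno is passed through. */
static long rc_propagate(const void *mm)
{
	if (!mm)
		return -EINVAL;
	if (is_err(mm))
		return ptr_err(mm);
	return 0;
}
```

With variant A, an interrupted mutex_lock_killable() (modelled here as
err_ptr(-EINTR)) is reported as -EPERM; with variant B it stays -EINTR
and a permission failure stays -EACCES.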

Oleg.



* Re: [PATCH] Fix race in process_vm_rw_core
  2012-01-13 11:30 [PATCH] Fix race in process_vm_rw_core Christopher Yeoh
  2012-01-13 16:04 ` Oleg Nesterov
@ 2012-01-13 22:30 ` Andrew Morton
  2012-01-13 23:30   ` Christopher Yeoh
  1 sibling, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2012-01-13 22:30 UTC (permalink / raw)
  To: Christopher Yeoh
  Cc: linux-kernel, Oleg Nesterov, Linus Torvalds, David Howells

On Fri, 13 Jan 2012 22:00:28 +1030
Christopher Yeoh <cyeoh@au1.ibm.com> wrote:

> Hi Linus,
> 
> Below is a patch which fixes the race in process_vm_rw_core found by
> Oleg (http://article.gmane.org/gmane.linux.kernel/1235667/).
> It consolidates some code with mm_for_maps since what they do is almost
> identical.
> 
> Oleg - I've kept the breakout of ptrace_may_attach and get_task_mm to
> preserve only having to take the task lock once. I see some performance
> difference with a microbenchmark but haven't had a chance to test with
> some HPC benchmarks yet so for the moment I'd like to leave it in. At
> this stage I think it's more important to get the race fixed and I'm at
> Linux.conf.au all next week. I'll send a patch out for the
> rw_copy_check_uvector cleanup after I get back from LCA.
> 
> ...
>
> +struct mm_struct *get_check_task_mm(struct task_struct *task, unsigned int mode)
> +{
> +	struct mm_struct *mm;
> +	int err;
> +
> +	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
> +	if (err)
> +		return ERR_PTR(err);
> +
> +	task_lock(task);
> +	if (__ptrace_may_access(task, mode)) {
> +		mm = ERR_PTR(-EACCES);
> +		goto out;
> +	}
> +
> +	mm = task->mm;
> +	if (mm) {
> +		if (task->flags & PF_KTHREAD)
> +			mm = NULL;
> +		else
> +			atomic_inc(&mm->mm_users);
> +	}
> +
> +out:
> +	task_unlock(task);
> +	mutex_unlock(&task->signal->cred_guard_mutex);
> +
> +	return mm;
> +}
> +EXPORT_SYMBOL_GPL(get_check_task_mm);

I don't think the export is needed - CONFIG_PROC_FS=m isn't supported.

btw, I'm trying to work out why we didn't make the whole process_vm_access.o
feature Kconfigurable, so people who don't want it do not get burdened
with it?



* Re: [PATCH] Fix race in process_vm_rw_core
  2012-01-13 16:04 ` Oleg Nesterov
@ 2012-01-13 23:26   ` Christopher Yeoh
  2012-01-14 17:58     ` Oleg Nesterov
  0 siblings, 1 reply; 8+ messages in thread
From: Christopher Yeoh @ 2012-01-13 23:26 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: linux-kernel, Linus Torvalds, Andrew Morton, David Howells

On Fri, 13 Jan 2012 17:04:42 +0100
Oleg Nesterov <oleg@redhat.com> wrote:
> On 01/13, Christopher Yeoh wrote:
> > ...
> > +struct mm_struct *get_check_task_mm(struct task_struct *task, unsigned int mode)
> > +{
> > +	struct mm_struct *mm;
> > +	int err;
> > +
> > +	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
> > +	if (err)
> > +		return ERR_PTR(err);
> > +
> > +	task_lock(task);
> > +	if (__ptrace_may_access(task, mode)) {
> > +		mm = ERR_PTR(-EACCES);
> > +		goto out;
> > +	}
> 
> Probably you should check "mm != current->mm" before
> __ptrace_may_access(), otherwise this changes the rules for,
> say, /proc/pid/maps.

__ptrace_may_access has a check for task == current already - 
Is that sufficient?

	/* Don't let security modules deny introspection */
	if (task == current)
		return 0;

> > @@ -298,23 +298,15 @@ static ssize_t process_vm_rw_core(pid_t pid, const struct iovec *lvec,
> >  		goto free_proc_pages;
> >  	}
> >
> > -	task_lock(task);
> > -	if (__ptrace_may_access(task, PTRACE_MODE_ATTACH)) {
> > -		task_unlock(task);
> > -		rc = -EPERM;
> > -		goto put_task_struct;
> > -	}
> > -	mm = task->mm;
> > -
> > -	if (!mm || (task->flags & PF_KTHREAD)) {
> > -		task_unlock(task);
> > -		rc = -EINVAL;
> > +	mm = get_check_task_mm(task, PTRACE_MODE_ATTACH);
> > +	if (!mm || IS_ERR(mm)) {
> > +		if (!mm)
> > +			rc = -EINVAL;
> > +		else
> > +			rc = -EPERM;
> 
> Cosmetic nit. I won't insist, but why -EPERM is better than -EACCES
> returned by get_check_task_mm()? IOW, why not rc = PTR_ERR() ?

Maybe I should just convert EACCES to EPERM for process_vm_rw_core. I
left get_check_task_mm with EACCES to preserve existing behaviour
for mm_for_maps.

SUSv3 defines EACCES and EPERM as 

[EACCES]
Permission denied. An attempt was made to access a file in a way
forbidden by its file access permissions.

[EPERM]
Operation not permitted. An attempt was made to perform an operation
limited to processes with appropriate privileges or to the owner of a
file or other resource.

So EPERM is more appropriate for process_vm_readv/writev.

Chris
-- 
cyeoh@au.ibm.com



* Re: [PATCH] Fix race in process_vm_rw_core
  2012-01-13 22:30 ` Andrew Morton
@ 2012-01-13 23:30   ` Christopher Yeoh
  0 siblings, 0 replies; 8+ messages in thread
From: Christopher Yeoh @ 2012-01-13 23:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Oleg Nesterov, Linus Torvalds, David Howells

On Fri, 13 Jan 2012 14:30:50 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:
> > +struct mm_struct *get_check_task_mm(struct task_struct *task, unsigned int mode)
> > +{
> > +	struct mm_struct *mm;
> > +	int err;
> > +
> > +	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
> > +	if (err)
> > +		return ERR_PTR(err);
> > +
> > +	task_lock(task);
> > +	if (__ptrace_may_access(task, mode)) {
> > +		mm = ERR_PTR(-EACCES);
> > +		goto out;
> > +	}
> > +
> > +	mm = task->mm;
> > +	if (mm) {
> > +		if (task->flags & PF_KTHREAD)
> > +			mm = NULL;
> > +		else
> > +			atomic_inc(&mm->mm_users);
> > +	}
> > +
> > +out:
> > +	task_unlock(task);
> > +	mutex_unlock(&task->signal->cred_guard_mutex);
> > +
> > +	return mm;
> > +}
> > +EXPORT_SYMBOL_GPL(get_check_task_mm);
> 
> I don't think the export is needed - CONFIG_PROC_FS=m isn't supported.

ok.

> btw, I'm trying to work out why we didn't make the whole
> process_vm_access.o feature Kconfigurable, so people who don't want
> it do not get burdened with it?

I don't know. I'll make this change if you'd like but I won't have time
to do it before leaving for LCA tomorrow. Have a toddler about to arrive
at my house.

Chris
-- 
cyeoh@au.ibm.com



* Re: [PATCH] Fix race in process_vm_rw_core
  2012-01-13 23:26   ` Christopher Yeoh
@ 2012-01-14 17:58     ` Oleg Nesterov
  2012-01-16  2:56       ` Christopher Yeoh
  0 siblings, 1 reply; 8+ messages in thread
From: Oleg Nesterov @ 2012-01-14 17:58 UTC (permalink / raw)
  To: Christopher Yeoh
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, David Howells

On 01/14, Christopher Yeoh wrote:
>
> On Fri, 13 Jan 2012 17:04:42 +0100
> Oleg Nesterov <oleg@redhat.com> wrote:
> > On 01/13, Christopher Yeoh wrote:
> > > ...
> > > > +struct mm_struct *get_check_task_mm(struct task_struct *task, unsigned int mode)
> > > > +{
> > > > +	struct mm_struct *mm;
> > > > +	int err;
> > > > +
> > > > +	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
> > > +	if (err)
> > > +		return ERR_PTR(err);
> > > +
> > > +	task_lock(task);
> > > +	if (__ptrace_may_access(task, mode)) {
> > > +		mm = ERR_PTR(-EACCES);
> > > +		goto out;
> > > +	}
> >
> > Probably you should check "mm != current->mm" before
> > __ptrace_may_access(), otherwise this changes the rules for,
> > say, /proc/pid/maps.
>
> __ptrace_may_access has a check for task == current already -
> Is that sufficient?
>
> 	/* Don't let security modules deny introspection */
> 	if (task == current)
> 		return 0;

I don't think this is sufficient in the multithreaded or CLONE_VM case,
task_cred/etc is per-thread.

It is not that I think that this "current->mm != mm" check is important,
in fact personally I think it shouldn't exist.

But we shouldn't add a subtle and undocumented behavioural change, and
obviously process_vm_rw() has no security problems if mm == current->mm.
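
Why "task == current" does not cover the CLONE_VM case can be shown
with a small userspace model. The struct and function names below
(mm_model, task_model, is_current_task, shares_mm) are hypothetical and
only mirror the two comparisons being discussed; nothing here is kernel
code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an address space shared by several tasks. */
struct mm_model {
	int mm_users;
};

/* Toy model of a task; threads created with CLONE_VM share ->mm. */
struct task_model {
	int pid;
	struct mm_model *mm;
};

/* The shortcut __ptrace_may_access() takes: same task only. */
static bool is_current_task(const struct task_model *task,
			    const struct task_model *current_task)
{
	return task == current_task;
}

/* The check the old mm_for_maps() made: same address space. */
static bool shares_mm(const struct task_model *task,
		      const struct task_model *current_task)
{
	return task->mm != NULL && task->mm == current_task->mm;
}
```

For two threads of one process, is_current_task() is false when one
thread looks at the other, while shares_mm() is true, so only the mm
comparison skips the permission check for a sibling thread.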

> > > +	mm = get_check_task_mm(task, PTRACE_MODE_ATTACH);
> > > +	if (!mm || IS_ERR(mm)) {
> > > +		if (!mm)
> > > +			rc = -EINVAL;
> > > +		else
> > > +			rc = -EPERM;
> >
> > Cosmetic nit. I won't insist, but why -EPERM is better than -EACCES
> > returned by get_check_task_mm()? IOW, why not rc = PTR_ERR() ?
>
> Maybe I should just convert EACCES to EPERM for process_vm_rw_core. I
> left get_check_task_mm with EACCES to preserve existing behaviour
> for mm_for_maps.
>
> SUSv3 defines EACCES and EPERM as
>
> [EACCES]
> Permission denied. An attempt was made to access a file in a way
> forbidden by its file access permissions.
>
> [EPERM]
> Operation not permitted. An attempt was made to perform an operation
> limited to processes with appropriate privileges or to the owner of a
> file or other resource.
>
> So EPERM is more appropriate for process_vm_readv/writev

Well, imho EACCES would be fine too and my point was that s/EINTR/EPERM/
looks a bit confusing.

But OK, this is subjective and minor, I won't argue.

Oleg.



* Re: [PATCH] Fix race in process_vm_rw_core
  2012-01-14 17:58     ` Oleg Nesterov
@ 2012-01-16  2:56       ` Christopher Yeoh
  2012-01-16 18:59         ` Oleg Nesterov
  0 siblings, 1 reply; 8+ messages in thread
From: Christopher Yeoh @ 2012-01-16  2:56 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: linux-kernel, Linus Torvalds, Andrew Morton, David Howells

On Sat, 14 Jan 2012 18:58:29 +0100
Oleg Nesterov <oleg@redhat.com> wrote:

> On 01/14, Christopher Yeoh wrote:
> >
> > On Fri, 13 Jan 2012 17:04:42 +0100
> > Oleg Nesterov <oleg@redhat.com> wrote:
> > > On 01/13, Christopher Yeoh wrote:
> > > > ...
> > > > > +struct mm_struct *get_check_task_mm(struct task_struct *task, unsigned int mode)
> > > > > +{
> > > > > +	struct mm_struct *mm;
> > > > > +	int err;
> > > > > +
> > > > > +	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
> > > > +	if (err)
> > > > +		return ERR_PTR(err);
> > > > +
> > > > +	task_lock(task);
> > > > +	if (__ptrace_may_access(task, mode)) {
> > > > +		mm = ERR_PTR(-EACCES);
> > > > +		goto out;
> > > > +	}
> > >
> > > Probably you should check "mm != current->mm" before
> > > __ptrace_may_access(), otherwise this changes the rules for,
> > > say, /proc/pid/maps.
> >
> > __ptrace_may_access has a check for task == current already -
> > Is that sufficient?
> >
> > 	/* Don't let security modules deny introspection */
> > 	if (task == current)
> > 		return 0;
> 
> I don't think this is sufficient in the multithreaded or CLONE_VM
> case, task_cred/etc is per-thread.
> 
> It is not that I think that this "current->mm != mm" check is
> important, in fact personally I think it shouldn't exist.
> 
> But we shouldn't add the subtle and not documented behavioural
> change, and obviously process_vm_rw() has no security problems if mm
> == current->mm.
> 

Ok, updated patch below:

- adds the "current->mm != mm" check
- removes EXPORT_SYMBOL_GPL for get_check_task_mm

 fs/proc/base.c         |   17 +----------------
 include/linux/sched.h  |    4 ++++
 kernel/fork.c          |   30 ++++++++++++++++++++++++++++++
 mm/process_vm_access.c |   20 ++++++--------------
 4 files changed, 41 insertions(+), 30 deletions(-)
Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Cc: stable@vger.kernel.org
---
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 851ba3d..094d650 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -254,22 +254,7 @@ static struct mm_struct *check_mem_permission(struct task_struct *task)
 
 struct mm_struct *mm_for_maps(struct task_struct *task)
 {
-	struct mm_struct *mm;
-	int err;
-
-	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
-	if (err)
-		return ERR_PTR(err);
-
-	mm = get_task_mm(task);
-	if (mm && mm != current->mm &&
-			!ptrace_may_access(task, PTRACE_MODE_READ)) {
-		mmput(mm);
-		mm = ERR_PTR(-EACCES);
-	}
-	mutex_unlock(&task->signal->cred_guard_mutex);
-
-	return mm;
+	return get_check_task_mm(task, PTRACE_MODE_READ);
 }
 
 static int proc_pid_cmdline(struct task_struct *task, char * buffer)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 1c4f3e9..8a64cae 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2235,6 +2235,10 @@ static inline void mmdrop(struct mm_struct * mm)
 extern void mmput(struct mm_struct *);
 /* Grab a reference to a task's mm, if it is not already going away */
 extern struct mm_struct *get_task_mm(struct task_struct *task);
+/* Grab a reference to a task's mm, if it is not already going away
+   and ptrace_may_access with the mode parameter passed to it succeeds */
+extern struct mm_struct *get_check_task_mm(struct task_struct *task,
+					   unsigned int mode);
 /* Remove the current tasks stale references to the old mm_struct */
 extern void mm_release(struct task_struct *, struct mm_struct *);
 /* Allocate a new mm structure and copy contents from tsk->mm */
diff --git a/kernel/fork.c b/kernel/fork.c
index da4a6a1..b6c193a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -644,6 +644,36 @@ struct mm_struct *get_task_mm(struct task_struct *task)
 }
 EXPORT_SYMBOL_GPL(get_task_mm);
 
+struct mm_struct *get_check_task_mm(struct task_struct *task, unsigned int mode)
+{
+	struct mm_struct *mm;
+	int err;
+
+	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
+	if (err)
+		return ERR_PTR(err);
+
+	task_lock(task);
+	mm = task->mm;
+	if (mm != current->mm && __ptrace_may_access(task, mode)) {
+		mm = ERR_PTR(-EACCES);
+		goto out;
+	}
+
+	if (mm) {
+		if (task->flags & PF_KTHREAD)
+			mm = NULL;
+		else
+			atomic_inc(&mm->mm_users);
+	}
+
+out:
+	task_unlock(task);
+	mutex_unlock(&task->signal->cred_guard_mutex);
+
+	return mm;
+}
+
 /* Please note the differences between mmput and mm_release.
  * mmput is called whenever we stop holding onto a mm_struct,
  * error success whatever.
diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index e920aa3..aa8009d 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -298,23 +298,15 @@ static ssize_t process_vm_rw_core(pid_t pid, const struct iovec *lvec,
 		goto free_proc_pages;
 	}
 
-	task_lock(task);
-	if (__ptrace_may_access(task, PTRACE_MODE_ATTACH)) {
-		task_unlock(task);
-		rc = -EPERM;
-		goto put_task_struct;
-	}
-	mm = task->mm;
-
-	if (!mm || (task->flags & PF_KTHREAD)) {
-		task_unlock(task);
-		rc = -EINVAL;
+	mm = get_check_task_mm(task, PTRACE_MODE_ATTACH);
+	if (!mm || IS_ERR(mm)) {
+		if (!mm)
+			rc = -EINVAL;
+		else
+			rc = -EPERM;
 		goto put_task_struct;
 	}
 
-	atomic_inc(&mm->mm_users);
-	task_unlock(task);
-
 	for (i = 0; i < riovcnt && iov_l_curr_idx < liovcnt; i++) {
 		rc = process_vm_rw_single_vec(
 			(unsigned long)rvec[i].iov_base, rvec[i].iov_len,



* Re: [PATCH] Fix race in process_vm_rw_core
  2012-01-16  2:56       ` Christopher Yeoh
@ 2012-01-16 18:59         ` Oleg Nesterov
  0 siblings, 0 replies; 8+ messages in thread
From: Oleg Nesterov @ 2012-01-16 18:59 UTC (permalink / raw)
  To: Christopher Yeoh
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, David Howells

On 01/16, Christopher Yeoh wrote:
>
> +struct mm_struct *get_check_task_mm(struct task_struct *task, unsigned int mode)
> +{
> +	struct mm_struct *mm;
> +	int err;
> +
> +	err =  mutex_lock_killable(&task->signal->cred_guard_mutex);
> +	if (err)
> +		return ERR_PTR(err);
> +
> +	task_lock(task);
> +	mm = task->mm;
> +	if (mm != current->mm && __ptrace_may_access(task, mode)) {
> +		mm = ERR_PTR(-EACCES);
> +		goto out;
> +	}
> +
> +	if (mm) {
> +		if (task->flags & PF_KTHREAD)
> +			mm = NULL;
> +		else
> +			atomic_inc(&mm->mm_users);
> +	}

This still looks a bit strange, we call __ptrace_may_access()
before we check ->mm != NULL even if this is safe... Really, we
should simply fix the bug and then try to microoptimize this code.

But OK, I promised I won't argue ;)

I believe the patch is correct and fixes the problem.
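
The ordering remark can be made concrete with a userspace model. This
is a sketch under assumptions: task_model, access_denied_model() and
the MM_EACCES sentinel are invented for illustration, and
get_mm_null_first() is a hypothetical reordering that nobody in the
thread posted. Note that the two orders differ in more than cost: for
a task with no mm and a failing permission check, the posted order
reports the access error while the reordered one returns NULL.

```c
#include <stdbool.h>
#include <stddef.h>

struct mm_model {
	int mm_users;
};

struct task_model {
	struct mm_model *mm;	/* NULL once the task has no mm */
	bool is_kthread;	/* stands in for PF_KTHREAD */
	bool access_denied;	/* stands in for a failing __ptrace_may_access() */
};

/* Counts how often the modelled permission check runs. */
static int access_checks;

static bool access_denied_model(const struct task_model *task)
{
	access_checks++;
	return task->access_denied;
}

/* Sentinel standing in for ERR_PTR(-EACCES). */
static struct mm_model eacces_sentinel;
#define MM_EACCES (&eacces_sentinel)

/* Order used in the posted patch: permission check, then NULL/kthread check. */
static struct mm_model *get_mm_posted_order(struct task_model *task,
					    const struct mm_model *current_mm)
{
	struct mm_model *mm = task->mm;

	if (mm != current_mm && access_denied_model(task))
		return MM_EACCES;
	if (mm) {
		if (task->is_kthread)
			return NULL;
		mm->mm_users++;
	}
	return mm;
}

/* Hypothetical alternative: return early on a NULL mm or a kthread. */
static struct mm_model *get_mm_null_first(struct task_model *task,
					  const struct mm_model *current_mm)
{
	struct mm_model *mm = task->mm;

	if (!mm || task->is_kthread)
		return NULL;
	if (mm != current_mm && access_denied_model(task))
		return MM_EACCES;
	mm->mm_users++;
	return mm;
}
```

For a task whose mm is already NULL, the posted order still runs the
permission check once, whereas the hypothetical reordering skips it.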

Oleg.



