linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
@ 2012-02-29 15:16 Cyrill Gorcunov
  2012-02-29 15:23 ` Pavel Emelyanov
  2012-02-29 19:24 ` Oleg Nesterov
  0 siblings, 2 replies; 21+ messages in thread
From: Cyrill Gorcunov @ 2012-02-29 15:16 UTC (permalink / raw)
  To: LKML
  Cc: Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo, Oleg Nesterov

Hi guys,

at restore time we would like to have a way for /proc/pid/exe
symlink recovering. So I thought extending prctl might be
a good idea.

Still maybe there some other good and 'right' way to do it,
so I would like to gather opinions.

Please review, thanks!

	Cyrill
---
From: Cyrill Gorcunov <gorcunov@openvz.org>
Subject: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file

When we do restore we would like to have a way to setup
a former mm_struct::exe_file so that /proc/pid/exe would
point to the original executable file a process had at
checkpoint time.

For this sake PR_SET_MM_EXE_FILE code is introduced.

Note, if mm_struct::exe_file already mapped more than once
we refuse to change anything (which prevents kernel from
potential problems).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 include/linux/prctl.h |    1 
 kernel/sys.c          |   73 +++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 73 insertions(+), 1 deletion(-)

Index: linux-2.6.git/include/linux/prctl.h
===================================================================
--- linux-2.6.git.orig/include/linux/prctl.h
+++ linux-2.6.git/include/linux/prctl.h
@@ -118,5 +118,6 @@
 # define PR_SET_MM_ENV_START		10
 # define PR_SET_MM_ENV_END		11
 # define PR_SET_MM_AUXV			12
+# define PR_SET_MM_EXE_FILE		13
 
 #endif /* _LINUX_PRCTL_H */
Index: linux-2.6.git/kernel/sys.c
===================================================================
--- linux-2.6.git.orig/kernel/sys.c
+++ linux-2.6.git/kernel/sys.c
@@ -1701,6 +1701,66 @@ static bool vma_flags_mismatch(struct vm
 		(vma->vm_flags & banned);
 }
 
+/* Expects mm->mmap_sem is read-taken */
+static int prctl_set_mm_exe_file(struct mm_struct *mm,
+				 const void __user *path,
+				 size_t size)
+{
+	struct file *new_exe_file;
+	char *pathbuf;
+	int ret = 0;
+
+	if (size >= PATH_MAX)
+		return -EINVAL;
+
+	/*
+	 * We allow to change only those exe's which
+	 * are not mapped several times. This one
+	 * is early test while mmap_sem is taken.
+	 */
+	if (mm->num_exe_file_vmas > 1)
+		return -EBUSY;
+
+	up_read(&mm->mmap_sem);
+
+	pathbuf = kmalloc(size, GFP_TEMPORARY);
+	if (!pathbuf) {
+		ret = -ENOMEM;
+		goto err_down;
+	}
+
+	if (copy_from_user(pathbuf, path, size)) {
+		kfree(pathbuf);
+		ret = -EFAULT;
+		goto err_down;
+	}
+	pathbuf[size-1] = '\0';
+
+	new_exe_file = open_exec(pathbuf);
+	kfree(pathbuf);
+
+	down_read(&mm->mmap_sem);
+
+	if (IS_ERR(new_exe_file))
+		return PTR_ERR(new_exe_file);
+
+	/*
+	 * We allow to change only those exe's which
+	 * are not mapped several times.
+	 */
+	if (mm->num_exe_file_vmas < 2) {
+		set_mm_exe_file(mm, new_exe_file);
+		ret = 0;
+	} else
+		ret = -EBUSY;
+
+	return ret;
+
+err_down:
+	down_read(&mm->mmap_sem);
+	return ret;
+}
+
 static int prctl_set_mm(int opt, unsigned long addr,
 			unsigned long arg4, unsigned long arg5)
 {
@@ -1709,7 +1769,9 @@ static int prctl_set_mm(int opt, unsigne
 	struct vm_area_struct *vma;
 	int error = 0;
 
-	if (arg5 || (arg4 && opt != PR_SET_MM_AUXV))
+	if (arg5 || (arg4 &&
+		     opt != PR_SET_MM_AUXV &&
+		     opt != PR_SET_MM_EXE_FILE))
 		return -EINVAL;
 
 	if (!capable(CAP_SYS_ADMIN))
@@ -1837,6 +1899,15 @@ static int prctl_set_mm(int opt, unsigne
 
 		return 0;
 	}
+
+	/*
+	 * This to restore /proc/self/exe link.
+	 */
+	case PR_SET_MM_EXE_FILE:
+		error = prctl_set_mm_exe_file(mm, (const void __user *)addr, arg4);
+		if (error)
+			goto out;
+		break;
 	default:
 		error = -EINVAL;
 		goto out;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-02-29 15:16 [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file Cyrill Gorcunov
@ 2012-02-29 15:23 ` Pavel Emelyanov
  2012-02-29 15:31   ` Cyrill Gorcunov
  2012-02-29 19:24 ` Oleg Nesterov
  1 sibling, 1 reply; 21+ messages in thread
From: Pavel Emelyanov @ 2012-02-29 15:23 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Kees Cook, Tejun Heo,
	Oleg Nesterov

On 02/29/2012 07:16 PM, Cyrill Gorcunov wrote:
> Hi guys,
> 
> at restore time we would like to have a way for /proc/pid/exe
> symlink recovering. So I thought extending prctl might be
> a good idea.
> 
> Still maybe there some other good and 'right' way to do it,
> so I would like to gather opinions.

If using the prctl for this, then taking a given file descriptor
(with fcheck) instead of taking path and opening it in kernel is
better.

> Please review, thanks!
> 
> 	Cyrill

Thanks,
Pavel

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-02-29 15:23 ` Pavel Emelyanov
@ 2012-02-29 15:31   ` Cyrill Gorcunov
  0 siblings, 0 replies; 21+ messages in thread
From: Cyrill Gorcunov @ 2012-02-29 15:31 UTC (permalink / raw)
  To: Pavel Emelyanov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Kees Cook, Tejun Heo,
	Oleg Nesterov

On Wed, Feb 29, 2012 at 07:23:07PM +0400, Pavel Emelyanov wrote:
> On 02/29/2012 07:16 PM, Cyrill Gorcunov wrote:
> > Hi guys,
> > 
> > at restore time we would like to have a way for /proc/pid/exe
> > symlink recovering. So I thought extending prctl might be
> > a good idea.
> > 
> > Still maybe there some other good and 'right' way to do it,
> > so I would like to gather opinions.
> 
> If using the prctl for this, then taking a given file descriptor
> (with fcheck) instead of taking path and opening it in kernel is
> better.
> 

Sounds like a plan. Thanks Pavel!

	Cyrill

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-02-29 15:16 [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file Cyrill Gorcunov
  2012-02-29 15:23 ` Pavel Emelyanov
@ 2012-02-29 19:24 ` Oleg Nesterov
  2012-02-29 20:01   ` Cyrill Gorcunov
  1 sibling, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2012-02-29 19:24 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On 02/29, Cyrill Gorcunov wrote:
>
> +static int prctl_set_mm_exe_file(struct mm_struct *mm,
> +				 const void __user *path,
> +				 size_t size)
> +{
> +	struct file *new_exe_file;
> +	char *pathbuf;
> +	int ret = 0;
> +
> +	if (size >= PATH_MAX)
> +		return -EINVAL;
> +
> +	/*
> +	 * We allow to change only those exe's which
> +	 * are not mapped several times. This one
> +	 * is early test while mmap_sem is taken.
> +	 */
> +	if (mm->num_exe_file_vmas > 1)
> +		return -EBUSY;

I don't really understand this check, but it is racy. Another thread
can change ->num_exe_file_vmas right after the check.

> +       up_read(&mm->mmap_sem);

up? I do not see down...

> +	new_exe_file = open_exec(pathbuf);
> +	kfree(pathbuf);
> +
> +	down_read(&mm->mmap_sem);

probably you meant "up" here. OK, I am ignoring ->mmap_sem, I can't
understand what did you really mean ;)

> +	if (IS_ERR(new_exe_file))
> +		return PTR_ERR(new_exe_file);
> +
> +	/*
> +	 * We allow to change only those exe's which
> +	 * are not mapped several times.
> +	 */
> +	if (mm->num_exe_file_vmas < 2) {
> +		set_mm_exe_file(mm, new_exe_file);
> +		ret = 0;
> +	} else
> +		ret = -EBUSY;
> +
> +	return ret;

Both success/EBUSY leak new_exe_file. And I agree with Pavel,
prctl_set_mm_exe_file() should take fd, not filename.

I simply can't understand why set_mm_exe_file() is safe. What
if we race with another thread doing set_mm_exe_file() too?
Or it can race with added_exe_file_vma/removed_exe_file_vma.

And. set_mm_exe_file() sets ->num_exe_file_vmas = 0, this is
simply wrong? It should match the number of VM_EXECUTABLE
vmas.

In short, I do not understand the patch at all. It seems, you
only need to replace mm->exe_file under down_write(mmap_sem)
and nothing else.

Oleg.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-02-29 19:24 ` Oleg Nesterov
@ 2012-02-29 20:01   ` Cyrill Gorcunov
  2012-03-01 18:06     ` Oleg Nesterov
  0 siblings, 1 reply; 21+ messages in thread
From: Cyrill Gorcunov @ 2012-02-29 20:01 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On Wed, Feb 29, 2012 at 08:24:00PM +0100, Oleg Nesterov wrote:
> On 02/29, Cyrill Gorcunov wrote:
> >
> > +static int prctl_set_mm_exe_file(struct mm_struct *mm,
> > +				 const void __user *path,
> > +				 size_t size)
> > +{
> > +	struct file *new_exe_file;
> > +	char *pathbuf;
> > +	int ret = 0;
> > +
> > +	if (size >= PATH_MAX)
> > +		return -EINVAL;
> > +
> > +	/*
> > +	 * We allow to change only those exe's which
> > +	 * are not mapped several times. This one
> > +	 * is early test while mmap_sem is taken.
> > +	 */
> > +	if (mm->num_exe_file_vmas > 1)
> > +		return -EBUSY;
> 
> I don't really understand this check, but it is racy. Another thread
> can change ->num_exe_file_vmas right after the check.
> 
> > +       up_read(&mm->mmap_sem);
> 
> up? I do not see down...

down is taken in calling routine (as pointed in comment on
prctl_set_mm_exe_file), thus I suppose I miss something since
the calling functions which increment/decrement num_exe_file_vmas
(such as mremap) do down_write(mmap_sem) first.

> 
> > +	new_exe_file = open_exec(pathbuf);
> > +	kfree(pathbuf);
> > +
> > +	down_read(&mm->mmap_sem);
> 
> probably you meant "up" here. OK, I am ignoring ->mmap_sem, I can't
> understand what did you really mean ;)
> 

nop, down instead ;)

> > +	if (IS_ERR(new_exe_file))
> > +		return PTR_ERR(new_exe_file);
> > +
> > +	/*
> > +	 * We allow to change only those exe's which
> > +	 * are not mapped several times.
> > +	 */
> > +	if (mm->num_exe_file_vmas < 2) {
> > +		set_mm_exe_file(mm, new_exe_file);
> > +		ret = 0;
> > +	} else
> > +		ret = -EBUSY;
> > +
> > +	return ret;
> 
> Both success/EBUSY leak new_exe_file. And I agree with Pavel,

yeah, will fix, thanks!

> prctl_set_mm_exe_file() should take fd, not filename.
> 

yes, i'll switch to this idea

> I simply can't understand why set_mm_exe_file() is safe. What
> if we race with another thread doing set_mm_exe_file() too?
> Or it can race with added_exe_file_vma/removed_exe_file_vma.

really, Oleg, I don't see race here since this routine is
caller under down_read and I've been releasing mmap_sem for
short time then reacquiring it, and recheck for number of
num_exe_file_vmas. so I presume I miss something obvious
here.

> 
> And. set_mm_exe_file() sets ->num_exe_file_vmas = 0, this is
> simply wrong? It should match the number of VM_EXECUTABLE
> vmas.
> 

yes, it's a nit which sould be fixed. thanks!

> In short, I do not understand the patch at all. It seems, you
> only need to replace mm->exe_file under down_write(mmap_sem)
> and nothing else.

I can't just replace it, I wanted to check it the new symlink
will indeed point to executable (such ceheck btw is done
in open_exec() helper) and I actually wonted to replace
only freshly created executables which didn't have any
remaps on executable VMA (still I might be wrong
here and it's indeed safe just to replace old exe_file).

That's why I posted it as RFC and really appreciate
feedback (so, thanks a lot, Oleg!).

	Cyrill

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-02-29 20:01   ` Cyrill Gorcunov
@ 2012-03-01 18:06     ` Oleg Nesterov
  2012-03-01 19:17       ` Cyrill Gorcunov
  0 siblings, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2012-03-01 18:06 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On 03/01, Cyrill Gorcunov wrote:
>
> On Wed, Feb 29, 2012 at 08:24:00PM +0100, Oleg Nesterov wrote:
> > On 02/29, Cyrill Gorcunov wrote:
> > >
> > > +static int prctl_set_mm_exe_file(struct mm_struct *mm,
> > > +				 const void __user *path,
> > > +				 size_t size)
> > > +{
> > > +	struct file *new_exe_file;
> > > +	char *pathbuf;
> > > +	int ret = 0;
> > > +
> > > +	if (size >= PATH_MAX)
> > > +		return -EINVAL;
> > > +
> > > +	/*
> > > +	 * We allow to change only those exe's which
> > > +	 * are not mapped several times. This one
> > > +	 * is early test while mmap_sem is taken.
> > > +	 */
> > > +	if (mm->num_exe_file_vmas > 1)
> > > +		return -EBUSY;
> >
> > I don't really understand this check, but it is racy. Another thread
> > can change ->num_exe_file_vmas right after the check.
> >
> > > +       up_read(&mm->mmap_sem);
> >
> > up? I do not see down...
>
> down is taken in calling routine (as pointed in comment on
> prctl_set_mm_exe_file),

Ah, indeed, stupid me. Somehow I thought that the caller is sys_prctl().
So it is called by prctl_set_mm() which holds ->mmap_sem for reading.

> thus I suppose I miss something since
> the calling functions which increment/decrement num_exe_file_vmas
> (such as mremap) do down_write(mmap_sem) first.

Yes, so ->num_exe_file_vmas is stable under mmap_sem. But it can
be changed right after up_read(), so I don't underastand this check
anyway.

OK, you recheck this counter later, under mmap_sem.

> > I simply can't understand why set_mm_exe_file() is safe. What
> > if we race with another thread doing set_mm_exe_file() too?
> > Or it can race with added_exe_file_vma/removed_exe_file_vma.
>
> really, Oleg, I don't see race here since this routine is
> caller under down_read and I've been releasing mmap_sem for
> short time then reacquiring it, and recheck for number of
> num_exe_file_vmas. so I presume I miss something obvious
> here.

OK, now that I understand the locking, we can't race with
added_exe_file_vma/removed_exe_file_vma. But I still think we
can race with set_mm_exe_file().

And yes, I think you missed something obvious ;) Suppose that
2 threads call prctl_set_mm(PR_SET_MM_EXE_FILE) at the same
time. Both threads can take ->mmap_sem for reading and do
set_mm_exe_file() at the same time.

> > And. set_mm_exe_file() sets ->num_exe_file_vmas = 0, this is
> > simply wrong? It should match the number of VM_EXECUTABLE
> > vmas.
> >
>
> yes, it's a nit which sould be fixed. thanks!

OK, but then you do not need to check ->num_exe_file_vmas at all?

Except, of course, I think we should fail if this counter is zero.

The changelog says:

	Note, if mm_struct::exe_file already mapped more than once
	we refuse to change anything (which prevents kernel from
	potential problems).

why? which problems?

> > In short, I do not understand the patch at all. It seems, you
> > only need to replace mm->exe_file under down_write(mmap_sem)
> > and nothing else.
>
> I can't just replace it, I wanted to check it the new symlink
> will indeed point to executable

I meant I see no reason to play with num_exe_file_vmas, you only
need to replace ->exe_file.

As for additional checks, I have no opinion. I don't know if it
really makes sense to verify it is executable.

But, hmm. There is another problem with your patch. open_exec()
does deny_write_access(), and I do not see who does the necessary
allow_write_access().

> and I actually wonted to replace
> only freshly created executables which didn't have any
> remaps on executable VMA

I don't really understand what do you mean.

In any case, PR_SET_MM_EXE_FILE is cheating. The new file doesn't
match ->vm_file of VM_EXECUTABLE vmas. And it can be writable.

But why do we require num_exe_file_vmas == 1?

Oleg.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-01 18:06     ` Oleg Nesterov
@ 2012-03-01 19:17       ` Cyrill Gorcunov
  2012-03-01 19:41         ` Oleg Nesterov
  0 siblings, 1 reply; 21+ messages in thread
From: Cyrill Gorcunov @ 2012-03-01 19:17 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On Thu, Mar 01, 2012 at 07:06:16PM +0100, Oleg Nesterov wrote:
...
> 
> OK, now that I understand the locking, we can't race with
> added_exe_file_vma/removed_exe_file_vma. But I still think we
> can race with set_mm_exe_file().
> 
> And yes, I think you missed something obvious ;) Suppose that
> 2 threads call prctl_set_mm(PR_SET_MM_EXE_FILE) at the same
> time. Both threads can take ->mmap_sem for reading and do
> set_mm_exe_file() at the same time.
> 

Hi Oleg, thanks a lot for comments! Today I started remakeing
the patch and found one obvious problem -- the mmap_sem should
be taken for write not read, since otherwise someone might get
wrong reference when we're chenging this symlink. So sure, this
was incorrect in first place ;)

> > > And. set_mm_exe_file() sets ->num_exe_file_vmas = 0, this is
> > > simply wrong? It should match the number of VM_EXECUTABLE
> > > vmas.
> > >
> >
> > yes, it's a nit which sould be fixed. thanks!
> 
> OK, but then you do not need to check ->num_exe_file_vmas at all?
> 
> Except, of course, I think we should fail if this counter is zero.
> 
> The changelog says:
> 
> 	Note, if mm_struct::exe_file already mapped more than once
> 	we refuse to change anything (which prevents kernel from
> 	potential problems).
> 
> why? which problems?
> 

I've some gut feeling that if I have num_exe_file_vmas more than
one I might broke something if I replace exe_file (actually I'm
checking this, was interrupted with other duties).

> > > In short, I do not understand the patch at all. It seems, you
> > > only need to replace mm->exe_file under down_write(mmap_sem)
> > > and nothing else.
> >
> > I can't just replace it, I wanted to check it the new symlink
> > will indeed point to executable
> 
> I meant I see no reason to play with num_exe_file_vmas, you only
> need to replace ->exe_file.

Yes, I tend to think the same (but as I mentioned, I'm checking
if this really wont break anything).

So, Oleg, basically the new version will use opened fd in form
like (note, I'll recheck for refs and locks more so this is
a draft version to point idea)

static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
{
	struct files_struct *files;
	struct file *file;
	int ret;

	/*
	 * Make sure if someone is trying to obtain
	 * the existing exe_file it will not get
	 * results until we've finished.
	 */
	down_write(&mm->mmap_sem);

	files = get_files_struct(current);
	if (!files)
		return -EINVAL;

	spin_lock(&files->file_lock);

	file = fcheck_files(files, fd);
	if (!file) {
		ret = -EBADFD;
		goto out_unlock;
	}
	get_file(file);
	spin_unlock(&files->file_lock);

	if (mm->num_exe_file_vmas)
		removed_exe_file_vma(mm);

-> This are two lines I'm doubt about
->	set_mm_exe_file(mm, file);
->	added_exe_file_vma(mm);

out_unlock:
	put_files_struct(files);
out:
	up_write(&mm->mmap_sem);
	return ret;
}

	Cyrill

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-01 19:17       ` Cyrill Gorcunov
@ 2012-03-01 19:41         ` Oleg Nesterov
  2012-03-01 20:00           ` Cyrill Gorcunov
  2012-03-02 14:26           ` Cyrill Gorcunov
  0 siblings, 2 replies; 21+ messages in thread
From: Oleg Nesterov @ 2012-03-01 19:41 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On 03/01, Cyrill Gorcunov wrote:
>
> So, Oleg, basically the new version will use opened fd in form
> like (note, I'll recheck for refs and locks more so this is
> a draft version to point idea)
>
> static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
> {
> 	struct files_struct *files;
> 	struct file *file;
> 	int ret;
>
> 	/*
> 	 * Make sure if someone is trying to obtain
> 	 * the existing exe_file it will not get
> 	 * results until we've finished.
> 	 */
> 	down_write(&mm->mmap_sem);
>
> 	files = get_files_struct(current);
> 	if (!files)
> 		return -EINVAL;
>
> 	spin_lock(&files->file_lock);
>
> 	file = fcheck_files(files, fd);
> 	if (!file) {
> 		ret = -EBADFD;
> 		goto out_unlock;
> 	}
> 	get_file(file);
> 	spin_unlock(&files->file_lock);
>
> 	if (mm->num_exe_file_vmas)
> 		removed_exe_file_vma(mm);

Still can't understand. I think you need:

	file = fget(fd);
	if (!file)
		return -EBADF;

	down_write(&mm->mmap_sem);
	if (mm->num_exe_file_vmas) {
		fput(mm->exe_file);
		mm->exe_file = file;
		file = NULL;
	}
	up_write(&mm->mmap_sem);

	if (!file)
		return 0;

	fput(file);
	return -ESOMETHING;

and that is all.

Oleg.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-01 19:41         ` Oleg Nesterov
@ 2012-03-01 20:00           ` Cyrill Gorcunov
  2012-03-02 15:03             ` Oleg Nesterov
  2012-03-02 14:26           ` Cyrill Gorcunov
  1 sibling, 1 reply; 21+ messages in thread
From: Cyrill Gorcunov @ 2012-03-01 20:00 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On Thu, Mar 01, 2012 at 08:41:20PM +0100, Oleg Nesterov wrote:
...
> 
> Still can't understand. I think you need:
> 
> 	file = fget(fd);
> 	if (!file)
> 		return -EBADF;
> 
> 	down_write(&mm->mmap_sem);
> 	if (mm->num_exe_file_vmas) {
> 		fput(mm->exe_file);
> 		mm->exe_file = file;
> 		file = NULL;
> 	}
> 	up_write(&mm->mmap_sem);
> 
> 	if (!file)
> 		return 0;
> 
> 	fput(file);
> 	return -ESOMETHING;
> 
> and that is all.

This breaks overall logic of num_exe_file_vmas.
What the point to have it at all then? I mean,
if there several executable sections in elf file,
once loader finish its work we will have
num_exe_file_vmas more than 1.

Then the process calls for prctl and replaces
own exe_file (I'm talking about possible scenario
since for our own tool we know that there will be
only one .text section and we're more-less safe
in replacing own exe_file, but this interface
will be available for everyone who has c/r config
entry turned on, so I'm trying to find which
negative impact this feature might have, call me
paranoid), so once process has replaced own exe_file
to something else the code which depends on
num_exe_file_vmas become broken.

May not we have a scenario when removed_exe_file_vma
is be called somewhere else later, once this prctl
finished its work? That's what I fear of.

	Cyrill

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-01 19:41         ` Oleg Nesterov
  2012-03-01 20:00           ` Cyrill Gorcunov
@ 2012-03-02 14:26           ` Cyrill Gorcunov
  2012-03-02 15:26             ` Oleg Nesterov
  1 sibling, 1 reply; 21+ messages in thread
From: Cyrill Gorcunov @ 2012-03-02 14:26 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On Thu, Mar 01, 2012 at 08:41:20PM +0100, Oleg Nesterov wrote:
...
> Still can't understand. I think you need:
> 
> 	file = fget(fd);
> 	if (!file)
> 		return -EBADF;
> 
> 	down_write(&mm->mmap_sem);
> 	if (mm->num_exe_file_vmas) {
> 		fput(mm->exe_file);
> 		mm->exe_file = file;
> 		file = NULL;
> 	}
> 	up_write(&mm->mmap_sem);
> 
> 	if (!file)
> 		return 0;
> 
> 	fput(file);
> 	return -ESOMETHING;
> 
> and that is all.

Hi Oleg, sure you were right. I even think testing for
num_exe_file_vmas is not needed, since until real
executable VMA is read and mapped it might remain
zero. So I guess something like below should work.

	Cyrill
---
From: Cyrill Gorcunov <gorcunov@openvz.org>
Subject: c/r: prctl: Add ability to set new mm_struct::exe_file v2

When we do restore we would like to have a way to setup
a former mm_struct::exe_file so that /proc/pid/exe would
point to the original executable file a process had at
checkpoint time.

For this sake PR_SET_MM_EXE_FILE code is introduced.

This feature is available iif CONFIG_CHECKPOINT_RESTORE is set.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Kees Cook <keescook@chromium.org>
CC: Tejun Heo <tj@kernel.org>
---
 include/linux/prctl.h |    1 +
 kernel/sys.c          |   24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+)

Index: linux-2.6.git/include/linux/prctl.h
===================================================================
--- linux-2.6.git.orig/include/linux/prctl.h
+++ linux-2.6.git/include/linux/prctl.h
@@ -118,5 +118,6 @@
 # define PR_SET_MM_ENV_START		10
 # define PR_SET_MM_ENV_END		11
 # define PR_SET_MM_AUXV			12
+# define PR_SET_MM_EXE_FILE		13
 
 #endif /* _LINUX_PRCTL_H */
Index: linux-2.6.git/kernel/sys.c
===================================================================
--- linux-2.6.git.orig/kernel/sys.c
+++ linux-2.6.git/kernel/sys.c
@@ -36,6 +36,7 @@
 #include <linux/personality.h>
 #include <linux/ptrace.h>
 #include <linux/fs_struct.h>
+#include <linux/file.h>
 #include <linux/gfp.h>
 #include <linux/syscore_ops.h>
 #include <linux/version.h>
@@ -1701,6 +1702,25 @@ static bool vma_flags_mismatch(struct vm
 		(vma->vm_flags & banned);
 }
 
+static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
+{
+	struct file *new_exe_file, *old_exe_file;
+
+	new_exe_file = fget(fd);
+	if (!new_exe_file)
+		return -EBADF;
+
+	down_write(&mm->mmap_sem);
+	old_exe_file = mm->exe_file;
+	mm->exe_file = new_exe_file;
+	up_write(&mm->mmap_sem);
+
+	if (old_exe_file)
+		fput(old_exe_file);
+
+	return 0;
+}
+
 static int prctl_set_mm(int opt, unsigned long addr,
 			unsigned long arg4, unsigned long arg5)
 {
@@ -1715,6 +1735,9 @@ static int prctl_set_mm(int opt, unsigne
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
+	if (opt == PR_SET_MM_EXE_FILE)
+		return prctl_set_mm_exe_file(mm, (unsigned int)addr);
+
 	if (addr >= TASK_SIZE)
 		return -EINVAL;
 
@@ -1837,6 +1860,7 @@ static int prctl_set_mm(int opt, unsigne
 
 		return 0;
 	}
+
 	default:
 		error = -EINVAL;
 		goto out;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-01 20:00           ` Cyrill Gorcunov
@ 2012-03-02 15:03             ` Oleg Nesterov
  0 siblings, 0 replies; 21+ messages in thread
From: Oleg Nesterov @ 2012-03-02 15:03 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On 03/02, Cyrill Gorcunov wrote:
>
> On Thu, Mar 01, 2012 at 08:41:20PM +0100, Oleg Nesterov wrote:
> ...
> >
> > Still can't understand. I think you need:
> >
> > 	file = fget(fd);
> > 	if (!file)
> > 		return -EBADF;
> >
> > 	down_write(&mm->mmap_sem);
> > 	if (mm->num_exe_file_vmas) {
> > 		fput(mm->exe_file);
> > 		mm->exe_file = file;
> > 		file = NULL;
> > 	}
> > 	up_write(&mm->mmap_sem);
> >
> > 	if (!file)
> > 		return 0;
> >
> > 	fput(file);
> > 	return -ESOMETHING;
> >
> > and that is all.
>
> This breaks overall logic of num_exe_file_vmas.

Why? In my opinion, your patch breaks the logic ;)

> What the point to have it at all then?

I think it should die. I already suggested to do

	- struct file *exe_file;
	+ struct path *exe_path;

and kill this counter, but this is off-topic.

> I mean,
> if there several executable sections in elf file,
> once loader finish its work we will have
> num_exe_file_vmas more than 1.

Yes. And?

> Then the process calls for prctl and replaces
> own exe_file (I'm talking about possible scenario
> since for our own tool we know that there will be
> only one .text section and we're more-less safe
> in replacing own exe_file,

confused. I do not see the "num_exe_file_vmas == 1" check in the
last version. (yes, I think it is not needed).

OTOH, you should check num_exe_file_vmas != 0, otherwise you break
the current logic.

> but this interface
> will be available for everyone who has c/r config
> entry turned on,

Yes, and thus it should work in any case.

> so I'm trying to find which
> negative impact this feature might have,

If you find something negative - please explain and correct me ;)

Your message starts with "This breaks overall logic" without any
explanation.

> so once process has replaced own exe_file
> to something else the code which depends on
> num_exe_file_vmas become broken.

Again, why???

> May not we have a scenario when removed_exe_file_vma
> is be called somewhere else later, once this prctl
> finished its work? That's what I fear of.

Of course, removed_exe_file_vma() or added_exe_file_vma() can
be called after prctl(). And we should keep the current logic:
mm->exe_file exists until num_exe_file_vmas != 0.

To simplify, currently we have:

	- num_exe_file_vmas is equal to the number of
	  MAP_EXECUTABLE vmas

	- (num_exe_file_vmas != 0) <=> (exe_file != NULL)

You should keep this. Or you should change the rules and explain
why you are doing this.

Oleg.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-02 14:26           ` Cyrill Gorcunov
@ 2012-03-02 15:26             ` Oleg Nesterov
  2012-03-02 16:12               ` Cyrill Gorcunov
  0 siblings, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2012-03-02 15:26 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On 03/02, Cyrill Gorcunov wrote:
>
> On Thu, Mar 01, 2012 at 08:41:20PM +0100, Oleg Nesterov wrote:
> ...
> > Still can't understand. I think you need:
> >
> > 	file = fget(fd);
> > 	if (!file)
> > 		return -EBADF;
> >
> > 	down_write(&mm->mmap_sem);
> > 	if (mm->num_exe_file_vmas) {
> > 		fput(mm->exe_file);
> > 		mm->exe_file = file;
> > 		file = NULL;
> > 	}
> > 	up_write(&mm->mmap_sem);
> >
> > 	if (!file)
> > 		return 0;
> >
> > 	fput(file);
> > 	return -ESOMETHING;
> >
> > and that is all.
>
> Hi Oleg, sure you were right. I even think testing for
> num_exe_file_vmas is not needed, since until real
> executable VMA is read and mapped it might remain
> zero. So I guess something like below should work.

No, you need to check num_exe_file_vmas != 0.

> +static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
> +{
> +	struct file *new_exe_file, *old_exe_file;
> +
> +	new_exe_file = fget(fd);
> +	if (!new_exe_file)
> +		return -EBADF;
> +
> +	down_write(&mm->mmap_sem);
> +	old_exe_file = mm->exe_file;
> +	mm->exe_file = new_exe_file;
> +	up_write(&mm->mmap_sem);

This changes the current rule: mm->exe_file goes away once
num_exe_file_vmas becomes zero.

Oleg.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-02 15:26             ` Oleg Nesterov
@ 2012-03-02 16:12               ` Cyrill Gorcunov
  2012-03-03 22:33                 ` Cyrill Gorcunov
  0 siblings, 1 reply; 21+ messages in thread
From: Cyrill Gorcunov @ 2012-03-02 16:12 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On Fri, Mar 02, 2012 at 04:26:22PM +0100, Oleg Nesterov wrote:
...
> > Hi Oleg, sure you were right. I even think testing for
> > num_exe_file_vmas is not needed, since until real
> > executable VMA is read and mapped it might remain
> > zero. So I guess something like below should work.
> 
> No, you need to check num_exe_file_vmas != 0.
> 
> > +static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
> > +{
> > +	struct file *new_exe_file, *old_exe_file;
> > +
> > +	new_exe_file = fget(fd);
> > +	if (!new_exe_file)
> > +		return -EBADF;
> > +
> > +	down_write(&mm->mmap_sem);
> > +	old_exe_file = mm->exe_file;
> > +	mm->exe_file = new_exe_file;
> > +	up_write(&mm->mmap_sem);
> 
> This changes the current rule: mm->exe_file goes away once
> num_exe_file_vmas becomes zero.

Yes, Oleg, you're right of course! The set_mm_exe_file confused
me a bit (I didn't reply to your previous emails simply because
you were asking me to prove what exactly is broken when num_exe_file_vmas
greater 2 and we reassign new exe_file, and I didn't know what to reply
since I knew I was wrong in first assumption).

As a conclusion -- sure I need to check for num_exe_file_vmas to
not break current logic, thanks!

	Cyrill

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-02 16:12               ` Cyrill Gorcunov
@ 2012-03-03 22:33                 ` Cyrill Gorcunov
  2012-03-05 14:21                   ` Oleg Nesterov
  0 siblings, 1 reply; 21+ messages in thread
From: Cyrill Gorcunov @ 2012-03-03 22:33 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On Fri, Mar 02, 2012 at 08:12:47PM +0400, Cyrill Gorcunov wrote:
...
> 
> Yes, Oleg, you're right of course! The set_mm_exe_file confused
> me a bit (I didn't reply to your previous emails simply because
> you were asking me to prove what exactly is broken when num_exe_file_vmas
> greater 2 and we reassign new exe_file, and I didn't know what to reply
> since I knew I was wrong in first assumption).
> 
> As a conclusion -- sure I need to check for num_exe_file_vmas to
> not break current logic, thanks!
> 

Hi Oleg,

i suppose the below one should do the trick. I mostly concerned
at moment if I added all permission tests needed.

	Cyrill
---
c/r: prctl: Add ability to set new mm_struct::exe_file v2

From: Cyrill Gorcunov <gorcunov@openvz.org>

When we do restore we would like to have a way to setup
a former mm_struct::exe_file so that /proc/pid/exe would
point to the original executable file a process had at
checkpoint time.

For this the PR_SET_MM_EXE_FILE code is introduced.

This feature is available iif CONFIG_CHECKPOINT_RESTORE is set.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Kees Cook <keescook@chromium.org>
CC: Tejun Heo <tj@kernel.org>
---
 include/linux/prctl.h |    1 
 kernel/sys.c          |   56 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

Index: linux-2.6.git/include/linux/prctl.h
===================================================================
--- linux-2.6.git.orig/include/linux/prctl.h
+++ linux-2.6.git/include/linux/prctl.h
@@ -118,5 +118,6 @@
 # define PR_SET_MM_ENV_START		10
 # define PR_SET_MM_ENV_END		11
 # define PR_SET_MM_AUXV			12
+# define PR_SET_MM_EXE_FILE		13
 
 #endif /* _LINUX_PRCTL_H */
Index: linux-2.6.git/kernel/sys.c
===================================================================
--- linux-2.6.git.orig/kernel/sys.c
+++ linux-2.6.git/kernel/sys.c
@@ -36,6 +36,8 @@
 #include <linux/personality.h>
 #include <linux/ptrace.h>
 #include <linux/fs_struct.h>
+#include <linux/file.h>
+#include <linux/mount.h>
 #include <linux/gfp.h>
 #include <linux/syscore_ops.h>
 #include <linux/version.h>
@@ -1701,6 +1703,56 @@ static bool vma_flags_mismatch(struct vm
 		(vma->vm_flags & banned);
 }
 
+static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
+{
+	struct file *exe_file;
+	struct dentry *dentry;
+	int err;
+
+	exe_file = fget(fd);
+	if (!exe_file)
+		return -EBADF;
+
+	dentry = exe_file->f_path.dentry;
+
+	/*
+	 * Permissions should be the same as if the
+	 * file has being opened by the kernel for
+	 * execution.
+	 */
+	err = -EACCES;
+	if (!S_ISREG(dentry->d_inode->i_mode)	||
+	    exe_file->f_path.mnt->mnt_flags & MNT_NOEXEC)
+		goto exit;
+
+	if ((exe_file->f_flags & O_ACCMODE) != O_RDONLY)
+		goto exit;
+
+	err = inode_permission(dentry->d_inode, MAY_EXEC);
+	if (err)
+		goto exit;
+
+	down_write(&mm->mmap_sem);
+	if (mm->num_exe_file_vmas) {
+
+		err = deny_write_access(exe_file);
+		if (err) {
+			up_write(&mm->mmap_sem);
+			goto exit;
+		}
+
+		fput(mm->exe_file);
+		mm->exe_file = exe_file;
+	} else {
+		fput(exe_file);
+		err = -ENXIO;
+	}
+	up_write(&mm->mmap_sem);
+
+exit:
+	return err;
+}
+
 static int prctl_set_mm(int opt, unsigned long addr,
 			unsigned long arg4, unsigned long arg5)
 {
@@ -1715,6 +1767,9 @@ static int prctl_set_mm(int opt, unsigne
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
+	if (opt == PR_SET_MM_EXE_FILE)
+		return prctl_set_mm_exe_file(mm, (unsigned int)addr);
+
 	if (addr >= TASK_SIZE)
 		return -EINVAL;
 
@@ -1837,6 +1892,7 @@ static int prctl_set_mm(int opt, unsigne
 
 		return 0;
 	}
+
 	default:
 		error = -EINVAL;
 		goto out;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-03 22:33                 ` Cyrill Gorcunov
@ 2012-03-05 14:21                   ` Oleg Nesterov
  2012-03-05 14:26                     ` Oleg Nesterov
  0 siblings, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2012-03-05 14:21 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On 03/04, Cyrill Gorcunov wrote:
>
> +static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
> +{
> +	struct file *exe_file;
> +	struct dentry *dentry;
> +	int err;
> +
> +	exe_file = fget(fd);
> +	if (!exe_file)
> +		return -EBADF;
> +
> +	dentry = exe_file->f_path.dentry;
> +
> +	/*
> +	 * Permissions should be the same as if the
> +	 * file has being opened by the kernel for
> +	 * execution.
> +	 */

Why?

> +	err = -EACCES;
> +	if (!S_ISREG(dentry->d_inode->i_mode)	||
> +	    exe_file->f_path.mnt->mnt_flags & MNT_NOEXEC)
> +		goto exit;
> +
> +	if ((exe_file->f_flags & O_ACCMODE) != O_RDONLY)
> +		goto exit;
> +
> +	err = inode_permission(dentry->d_inode, MAY_EXEC);
> +	if (err)
> +		goto exit;

OK, I won't argue, probably this makes sense to make sure that
admin can't get a heart attack looking at /proc/pid/exe.

But the O_RDONLY check looks strange. We are not going to write
to this file, we only set the name (and that is why I think it
should be mm->exe_path). What is the point to check that the file
was opened without FMODE_WRITE? Even if there were any security
risk the apllication can open this file again with the different
flags.

And btw this check is redundant anyway because you do
deny_write_access() below. However, this deny_write_access() looks
wrong:

> +	down_write(&mm->mmap_sem);
> +	if (mm->num_exe_file_vmas) {
> +
> +		err = deny_write_access(exe_file);

And who does allow_write_access() ?

Oleg.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-05 14:21                   ` Oleg Nesterov
@ 2012-03-05 14:26                     ` Oleg Nesterov
  2012-03-05 14:46                       ` Cyrill Gorcunov
  0 siblings, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2012-03-05 14:26 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On 03/05, Oleg Nesterov wrote:
>
> On 03/04, Cyrill Gorcunov wrote:
> >
> > +	err = -EACCES;
> > +	if (!S_ISREG(dentry->d_inode->i_mode)	||
> > +	    exe_file->f_path.mnt->mnt_flags & MNT_NOEXEC)
> > +		goto exit;
> > +
> > +	if ((exe_file->f_flags & O_ACCMODE) != O_RDONLY)
> > +		goto exit;
> > +
> > +	err = inode_permission(dentry->d_inode, MAY_EXEC);
> > +	if (err)
> > +		goto exit;
>
> OK, I won't argue, probably this makes sense to make sure that
> admin can't get a heart attack looking at /proc/pid/exe.
>
> But the O_RDONLY check looks strange. We are not going to write
> to this file, we only set the name (and that is why I think it
> should be mm->exe_path). What is the point to check that the file
> was opened without FMODE_WRITE? Even if there were any security
> risk the apllication can open this file again with the different
> flags.

Seriously, I think we should cleanup this before c/r adds more
ugliness. I'll try to make the patch today.

And with all these checks I am no longer sure that fd is better
than filename ;)

Oleg.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-05 14:26                     ` Oleg Nesterov
@ 2012-03-05 14:46                       ` Cyrill Gorcunov
  2012-03-05 15:40                         ` Oleg Nesterov
  0 siblings, 1 reply; 21+ messages in thread
From: Cyrill Gorcunov @ 2012-03-05 14:46 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On Mon, Mar 05, 2012 at 03:26:55PM +0100, Oleg Nesterov wrote:
> > OK, I won't argue, probably this makes sense to make sure that
> > admin can't get a heart attack looking at /proc/pid/exe.
> >
> > But the O_RDONLY check looks strange. We are not going to write
> > to this file, we only set the name (and that is why I think it
> > should be mm->exe_path). What is the point to check that the file
> > was opened without FMODE_WRITE? Even if there were any security
> > risk the apllication can open this file again with the different
> > flags.
>

Hi Oleg!

Replying to both your email -- I wanted to be as close to open_exec
as possible. This prctl does cheat the kernel but with this tests
the cheating should be minimized (it's almost the same as open_exec
does).

> Seriously, I think we should cleanup this before c/r adds more
> ugliness. I'll try to make the patch today.
> 

Cleanup what? If you mean this patch -- just point me what
should I do.

> And with all these checks I am no longer sure that fd is better
> than filename ;)

This security tests was a reason why I've used open_exec in
first version of the patch (and I still would prefer to
have open_exec here instead of fd).

As to allow-write-access -- it should be cleaned once process
finished, no?

	Cyrill

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-05 14:46                       ` Cyrill Gorcunov
@ 2012-03-05 15:40                         ` Oleg Nesterov
  2012-03-05 16:01                           ` Cyrill Gorcunov
  0 siblings, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2012-03-05 15:40 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On 03/05, Cyrill Gorcunov wrote:
>
> On Mon, Mar 05, 2012 at 03:26:55PM +0100, Oleg Nesterov wrote:
> > > OK, I won't argue, probably this makes sense to make sure that
> > > admin can't get a heart attack looking at /proc/pid/exe.
> > >
> > > But the O_RDONLY check looks strange. We are not going to write
> > > to this file, we only set the name (and that is why I think it
> > > should be mm->exe_path). What is the point to check that the file
> > > was opened without FMODE_WRITE? Even if there were any security
> > > risk the apllication can open this file again with the different
> > > flags.
> >
>
> Hi Oleg!
>
> Replying to both your email -- I wanted to be as close to open_exec
> as possible.

I see. But open_exec() is different, it returns the file we are going
to read/mmap. PR_SET_MM_EXE_FILE is different, I think O_RDONLY buys
nothing and looks confusing.

Anyway, as I said I won't argue.

> This prctl does cheat the kernel

Yep. Except, well, it cheats the user-space.

> but with this tests
> the cheating should be minimized (it's almost the same as open_exec
> does).

I don't reallt understand "minimized" ;) With this tests
proc/pid/exe can't look "obviously wrong", I agree. But that is all.

> > Seriously, I think we should cleanup this before c/r adds more
> > ugliness. I'll try to make the patch today.
>
> Cleanup what? If you mean this patch -- just point me what
> should I do.

I just sent the patch, "turn mm->exe_file into mm->exe_path"

> > And with all these checks I am no longer sure that fd is better
> > than filename ;)
>
> This security tests was a reason why I've used open_exec in
> first version of the patch

Yes, but me and Pavel forced you to use "int fd" ;)

> (and I still would prefer to
> have open_exec here instead of fd).

With the patch I sent "struct file *" is not needed at all.
I think prctl() can use user_path().

> As to allow-write-access -- it should be cleaned once process
> finished, no?

Exactly! And who will increment ->i_writecount? Nobody, that is
the problem.

Oleg.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-05 15:40                         ` Oleg Nesterov
@ 2012-03-05 16:01                           ` Cyrill Gorcunov
  2012-03-05 16:31                             ` Oleg Nesterov
  0 siblings, 1 reply; 21+ messages in thread
From: Cyrill Gorcunov @ 2012-03-05 16:01 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On Mon, Mar 05, 2012 at 04:40:29PM +0100, Oleg Nesterov wrote:
> >
> > Hi Oleg!
> >
> > Replying to both your email -- I wanted to be as close to open_exec
> > as possible.
> 
> I see. But open_exec() is different, it returns the file we are going
> to read/mmap. PR_SET_MM_EXE_FILE is different, I think O_RDONLY buys
> nothing and looks confusing.
> 
> Anyway, as I said I won't argue.
> 

OK

> > This prctl does cheat the kernel
> 
> Yep. Except, well, it cheats the user-space.

And kernel as well, since this link is not anymore
the same as it was, and memory read from this file
(at execution time) will keep data irrelevant to what
the new symlink points to (but it's exactly by design).

> 
> > but with this tests
> > the cheating should be minimized (it's almost the same as open_exec
> > does).
> 
> I don't reallt understand "minimized" ;) With this tests
> proc/pid/exe can't look "obviously wrong", I agree. But that is all.

call it "minimized wrong" ;)

> 
> > > Seriously, I think we should cleanup this before c/r adds more
> > > ugliness. I'll try to make the patch today.
> >
> > Cleanup what? If you mean this patch -- just point me what
> > should I do.
> 
> I just sent the patch, "turn mm->exe_file into mm->exe_path"
> 

Cool, just got it, thanks!

> > > And with all these checks I am no longer sure that fd is better
> > > than filename ;)
> >
> > This security tests was a reason why I've used open_exec in
> > first version of the patch
> 
> Yes, but me and Pavel forced you to use "int fd" ;)
> 

With plain fd it doesn't require kernel to allocate temp
memory and copy path from user space, this is a positive
effect.

> > (and I still would prefer to
> > have open_exec here instead of fd).
> 
> With the patch I sent "struct file *" is not needed at all.
> I think prctl() can use user_path().

OK!

> 
> > As to allow-write-access -- it should be cleaned once process
> > finished, no?
> 
> Exactly! And who will increment ->i_writecount? Nobody, that is
> the problem.

I see, thanks!

	Cyrill

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-05 16:01                           ` Cyrill Gorcunov
@ 2012-03-05 16:31                             ` Oleg Nesterov
  2012-03-05 16:45                               ` Cyrill Gorcunov
  0 siblings, 1 reply; 21+ messages in thread
From: Oleg Nesterov @ 2012-03-05 16:31 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On 03/05, Cyrill Gorcunov wrote:
>
> On Mon, Mar 05, 2012 at 04:40:29PM +0100, Oleg Nesterov wrote:
>
> > > This prctl does cheat the kernel
> >
> > Yep. Except, well, it cheats the user-space.
>
> And kernel as well, since this link is not anymore
> the same as it was, and memory read from this file
> (at execution time) will keep data irrelevant to what
> the new symlink points to (but it's exactly by design).

The kernel will never use this file for reading. It is only
used as a placeholder for ->f_path shown in proc.

User-space can be confused, yes, this is by design.

Oleg.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-05 16:31                             ` Oleg Nesterov
@ 2012-03-05 16:45                               ` Cyrill Gorcunov
  0 siblings, 0 replies; 21+ messages in thread
From: Cyrill Gorcunov @ 2012-03-05 16:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: LKML, Andrew Morton, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
	Tejun Heo

On Mon, Mar 05, 2012 at 05:31:30PM +0100, Oleg Nesterov wrote:
> 
> User-space can be confused, yes, this is by design.
> 

OK ;)

	Cyrill

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2012-03-05 16:45 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-29 15:16 [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file Cyrill Gorcunov
2012-02-29 15:23 ` Pavel Emelyanov
2012-02-29 15:31   ` Cyrill Gorcunov
2012-02-29 19:24 ` Oleg Nesterov
2012-02-29 20:01   ` Cyrill Gorcunov
2012-03-01 18:06     ` Oleg Nesterov
2012-03-01 19:17       ` Cyrill Gorcunov
2012-03-01 19:41         ` Oleg Nesterov
2012-03-01 20:00           ` Cyrill Gorcunov
2012-03-02 15:03             ` Oleg Nesterov
2012-03-02 14:26           ` Cyrill Gorcunov
2012-03-02 15:26             ` Oleg Nesterov
2012-03-02 16:12               ` Cyrill Gorcunov
2012-03-03 22:33                 ` Cyrill Gorcunov
2012-03-05 14:21                   ` Oleg Nesterov
2012-03-05 14:26                     ` Oleg Nesterov
2012-03-05 14:46                       ` Cyrill Gorcunov
2012-03-05 15:40                         ` Oleg Nesterov
2012-03-05 16:01                           ` Cyrill Gorcunov
2012-03-05 16:31                             ` Oleg Nesterov
2012-03-05 16:45                               ` Cyrill Gorcunov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).