* [patch 0/2] [PATCH 0/2] prctl extension in a sake of c/r
From: Cyrill Gorcunov @ 2012-03-16 20:55 UTC
To: LKML
Cc: Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

Hi,

this series is the result of a long discussion about the ability to
restore /proc/pid/exe. I hope it suits everyone.

While the exe symlink patch has been reviewed (and actually fixed,
thanks a lot Oleg!), I've got no opinions yet on the "Add ability to
get clear_tid_address" one. Anyway, both patches have been tested
intensively.

	Cyrill

* [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
From: Cyrill Gorcunov @ 2012-03-16 20:55 UTC
To: LKML
Cc: Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley, Cyrill Gorcunov

On restore we would like a way to set up the former
mm_struct::exe_file so that /proc/pid/exe points to the
original executable file the process had at checkpoint time.

For this the PR_SET_MM_EXE_FILE code is introduced.
This option takes a file descriptor which will be
set as the source for the new /proc/$pid/exe symlink.

Note that it allows changing /proc/$pid/exe iff there
are no VM_EXECUTABLE vmas present for the current process,
simply because this feature is specific to C/R
and mm::num_exe_file_vmas becomes meaningless after
that.

Also, this action is one-shot only. For security reasons
we don't allow changing the symlink several times.

This feature is available iff CONFIG_CHECKPOINT_RESTORE is set.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Kees Cook <keescook@chromium.org>
CC: Tejun Heo <tj@kernel.org>
CC: Matt Helsley <matthltc@us.ibm.com>
---
 include/linux/prctl.h |    1 
 kernel/sys.c          |   56 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

Index: linux-2.6.git/include/linux/prctl.h
===================================================================
--- linux-2.6.git.orig/include/linux/prctl.h
+++ linux-2.6.git/include/linux/prctl.h
@@ -118,5 +118,6 @@
 # define PR_SET_MM_ENV_START	10
 # define PR_SET_MM_ENV_END		11
 # define PR_SET_MM_AUXV		12
+# define PR_SET_MM_EXE_FILE	13
 
 #endif /* _LINUX_PRCTL_H */
Index: linux-2.6.git/kernel/sys.c
===================================================================
--- linux-2.6.git.orig/kernel/sys.c
+++ linux-2.6.git/kernel/sys.c
@@ -36,6 +36,8 @@
 #include <linux/personality.h>
 #include <linux/ptrace.h>
 #include <linux/fs_struct.h>
+#include <linux/file.h>
+#include <linux/mount.h>
 #include <linux/gfp.h>
 #include <linux/syscore_ops.h>
 #include <linux/version.h>
@@ -1701,6 +1703,57 @@ static bool vma_flags_mismatch(struct vm
 		(vma->vm_flags & banned);
 }
 
+static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
+{
+	struct file *exe_file;
+	struct dentry *dentry;
+	int err;
+
+	/*
+	 * Setting new mm::exe_file is only allowed
+	 * when no VM_EXECUTABLE vma's left. So make
+	 * a fast test first.
+	 */
+	if (mm->num_exe_file_vmas)
+		return -EBUSY;
+
+	exe_file = fget(fd);
+	if (!exe_file)
+		return -EBADF;
+
+	dentry = exe_file->f_path.dentry;
+
+	/*
+	 * Because the original mm->exe_file
+	 * points to executable file, make sure
+	 * this one is executable as well to not
+	 * break an overall picture.
+	 */
+	err = -EACCES;
+	if (!S_ISREG(dentry->d_inode->i_mode)	||
+	    exe_file->f_path.mnt->mnt_flags & MNT_NOEXEC)
+		goto exit;
+
+	err = inode_permission(dentry->d_inode, MAY_EXEC);
+	if (err)
+		goto exit;
+
+	/*
+	 * For security reason changing mm->exe_file
+	 * is one-shot action.
+	 */
+	down_write(&mm->mmap_sem);
+	if (likely(!mm->exe_file))
+		set_mm_exe_file(mm, exe_file);
+	else
+		err = -EBUSY;
+	up_write(&mm->mmap_sem);
+
+exit:
+	fput(exe_file);
+	return err;
+}
+
 static int prctl_set_mm(int opt, unsigned long addr,
 			unsigned long arg4, unsigned long arg5)
 {
@@ -1715,6 +1768,9 @@ static int prctl_set_mm(int opt, unsigne
 	if (!capable(CAP_SYS_RESOURCE))
 		return -EPERM;
 
+	if (opt == PR_SET_MM_EXE_FILE)
+		return prctl_set_mm_exe_file(mm, (unsigned int)addr);
+
 	if (addr >= TASK_SIZE)
 		return -EINVAL;
 

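For illustration, the userspace side of the interface proposed above might look roughly like the sketch below. It is not part of the patch: restore_exe_link() is a hypothetical helper name, and the fallback value 13 for PR_SET_MM_EXE_FILE is simply copied from the hunk above for builds against older headers. The call can only succeed on a kernel carrying this patch with CONFIG_CHECKPOINT_RESTORE set, for a caller holding CAP_SYS_RESOURCE.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Fallbacks for older headers; 13 is the value added by the patch above. */
#ifndef PR_SET_MM
# define PR_SET_MM		35
#endif
#ifndef PR_SET_MM_EXE_FILE
# define PR_SET_MM_EXE_FILE	13
#endif

/*
 * Hypothetical helper (not from the patch): ask the kernel to make
 * /proc/self/exe point at the file named by @path.
 * Returns 0 on success or a negative errno value.
 */
static int restore_exe_link(const char *path)
{
	int fd, ret = 0;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	/* The descriptor travels in the "addr" argument of PR_SET_MM. */
	if (prctl(PR_SET_MM, PR_SET_MM_EXE_FILE, (unsigned long)fd, 0UL, 0UL) < 0)
		ret = -errno;

	/* Our descriptor is no longer needed whether or not the call worked. */
	close(fd);
	return ret;
}
```

A restorer would call this once, after the original executable mappings are gone. Judging by the code above, the expected failures are -EPERM without CAP_SYS_RESOURCE, -EBUSY while VM_EXECUTABLE vmas remain or the link was already set, and -EACCES for a file that is not an executable regular file on an exec-enabled mount.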
* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
From: Andrew Morton @ 2012-03-19 22:15 UTC
To: Cyrill Gorcunov
Cc: LKML, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Sat, 17 Mar 2012 00:55:57 +0400
Cyrill Gorcunov <gorcunov@openvz.org> wrote:

> On restore we would like a way to set up the former
> mm_struct::exe_file so that /proc/pid/exe points to the
> original executable file the process had at checkpoint time.
>
> For this the PR_SET_MM_EXE_FILE code is introduced.
> This option takes a file descriptor which will be
> set as the source for the new /proc/$pid/exe symlink.
>
> Note that it allows changing /proc/$pid/exe iff there
> are no VM_EXECUTABLE vmas present for the current process,
> simply because this feature is specific to C/R
> and mm::num_exe_file_vmas becomes meaningless after
> that.
>
> Also, this action is one-shot only. For security reasons
> we don't allow changing the symlink several times.

What is this mysterious "security reason"?

>
> ...
>
> +static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
> +{
> +	struct file *exe_file;
> +	struct dentry *dentry;
> +	int err;
> +
> +	/*
> +	 * Setting new mm::exe_file is only allowed
> +	 * when no VM_EXECUTABLE vma's left. So make
> +	 * a fast test first.
> +	 */
> +	if (mm->num_exe_file_vmas)
> +		return -EBUSY;
> +
> +	exe_file = fget(fd);
> +	if (!exe_file)
> +		return -EBADF;
> +
> +	dentry = exe_file->f_path.dentry;
> +
> +	/*
> +	 * Because the original mm->exe_file
> +	 * points to executable file, make sure
> +	 * this one is executable as well to not
> +	 * break an overall picture.
> +	 */
> +	err = -EACCES;
> +	if (!S_ISREG(dentry->d_inode->i_mode)	||
> +	    exe_file->f_path.mnt->mnt_flags & MNT_NOEXEC)
> +		goto exit;
> +
> +	err = inode_permission(dentry->d_inode, MAY_EXEC);
> +	if (err)
> +		goto exit;
> +
> +	/*
> +	 * For security reason changing mm->exe_file
> +	 * is one-shot action.
> +	 */

It should be explained here also. The comment is pretty useless - if
we don't tell people what this "security reason" is, how can future
developers be sure that they aren't violating it?

> +	down_write(&mm->mmap_sem);
> +	if (likely(!mm->exe_file))
> +		set_mm_exe_file(mm, exe_file);
> +	else
> +		err = -EBUSY;
> +	up_write(&mm->mmap_sem);
> +
> +exit:
> +	fput(exe_file);
> +	return err;
> +}
> +
>  static int prctl_set_mm(int opt, unsigned long addr,
>  			unsigned long arg4, unsigned long arg5)
>  {
>
> ...
>

* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
From: Cyrill Gorcunov @ 2012-03-19 22:39 UTC
To: Andrew Morton
Cc: LKML, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Mon, Mar 19, 2012 at 03:15:07PM -0700, Andrew Morton wrote:
...
> >
> > Also, this action is one-shot only. For security reasons
> > we don't allow changing the symlink several times.
>
> What is this mysterious "security reason"?
>

Oh, sorry, I should have included Matt's comment here:

 | Before this patch that state was rather ephemeral and almost entirely
 | under the control of the kernel. The only way userspace could change it
 | was by unmapping the region(s) mapped during exec*(). At that point it
 | could not "lie" and insert some other symlink there and the admin would
 | be better able to determine what had happened.
 |
 | With this patch -- especially the multi-shot form -- the symlink will
 | be entirely under the control of (potentially untrusted) userspace code
 | and the admin is totally at the mercy of the userspace code. In
 | single-shot form programs could use the prctl() to ensure the symlink
 | could not be changed later -- the restart tool would be the only program
 | that would need to ensure that prctl() had not been used since the last
 | exec*().

...
>
> It should be explained here also. The comment is pretty useless - if
> we don't tell people what this "security reason" is, how can future
> developers be sure that they aren't violating it?
>

Actually I liked the multi-shot version more, but Matt's arguments
convinced me that the one-shot fashion is more "secure" in terms of
overall kernel state and potential transitions/changes of this
/proc/pid/exe symlink.

At least with the one-shot version the admin may be sure that the
symlink is never changed more than once, ever.

	Cyrill

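To make the one-shot rule discussed above concrete, here is a small userspace model of the order of checks in prctl_set_mm_exe_file(). It is only a sketch: struct model_mm, struct model_file and model_set_exe_file() are hypothetical stand-ins invented for illustration, with no locking and no real files, and it is not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct file and struct mm_struct. */
struct model_file {
	int is_regular;		/* models S_ISREG() on the inode */
	int noexec_mount;	/* models MNT_NOEXEC on the mount */
	int may_exec;		/* models inode_permission(MAY_EXEC) == 0 */
};

struct model_mm {
	unsigned long num_exe_file_vmas;
	const struct model_file *exe_file;
};

/* Mirrors the sequence of checks in the patch, minus mmap_sem. */
static int model_set_exe_file(struct model_mm *mm, const struct model_file *f)
{
	assert(mm != NULL);

	if (mm->num_exe_file_vmas)	/* original mappings still present */
		return -EBUSY;
	if (!f)				/* models fget() failing */
		return -EBADF;
	if (!f->is_regular || f->noexec_mount || !f->may_exec)
		return -EACCES;
	if (mm->exe_file)		/* one-shot: already set once */
		return -EBUSY;

	mm->exe_file = f;
	return 0;
}
```

In this model the first successful call fixes exe_file for good. Every later call returns -EBUSY whatever file is offered, which is the property Matt's comment asks for.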
* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
From: richard -rw- weinberger @ 2012-03-19 22:41 UTC
To: Cyrill Gorcunov
Cc: Andrew Morton, LKML, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Mon, Mar 19, 2012 at 11:39 PM, Cyrill Gorcunov <gorcunov@openvz.org> wrote:
> On Mon, Mar 19, 2012 at 03:15:07PM -0700, Andrew Morton wrote:
> ...
>> >
>> > Also, this action is one-shot only. For security reasons
>> > we don't allow changing the symlink several times.
>>
>> What is this mysterious "security reason"?
>>
>
> Oh, sorry, I should have included Matt's comment here:
>
>  | Before this patch that state was rather ephemeral and almost entirely
>  | under the control of the kernel. The only way userspace could change it
>  | was by unmapping the region(s) mapped during exec*(). At that point it
>  | could not "lie" and insert some other symlink there and the admin would
>  | be better able to determine what had happened.
>  |
>  | With this patch -- especially the multi-shot form -- the symlink will
>  | be entirely under the control of (potentially untrusted) userspace code
>  | and the admin is totally at the mercy of the userspace code. In
>  | single-shot form programs could use the prctl() to ensure the symlink
>  | could not be changed later -- the restart tool would be the only program
>  | that would need to ensure that prctl() had not been used since the last
>  | exec*().
>
> ...
>>
>> It should be explained here also. The comment is pretty useless - if
>> we don't tell people what this "security reason" is, how can future
>> developers be sure that they aren't violating it?
>>
>
> Actually I liked the multi-shot version more, but Matt's arguments
> convinced me that the one-shot fashion is more "secure" in terms of
> overall kernel state and potential transitions/changes of this
> /proc/pid/exe symlink.
>
> At least with the one-shot version the admin may be sure that the
> symlink is never changed more than once, ever.
>

And changing it once does not harm security?
I'm sure that rootkit writers will like this feature...

-- 
Thanks,
//richard

* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
From: Andrew Morton @ 2012-03-19 22:46 UTC
To: richard -rw- weinberger
Cc: Cyrill Gorcunov, LKML, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Mon, 19 Mar 2012 23:41:36 +0100
richard -rw- weinberger <richard.weinberger@gmail.com> wrote:

> On Mon, Mar 19, 2012 at 11:39 PM, Cyrill Gorcunov <gorcunov@openvz.org> wrote:
> > On Mon, Mar 19, 2012 at 03:15:07PM -0700, Andrew Morton wrote:
> > ...
> >> >
> >> > Also, this action is one-shot only. For security reasons
> >> > we don't allow changing the symlink several times.
> >>
> >> What is this mysterious "security reason"?
> >>
> >
> > Oh, sorry, I should have included Matt's comment here:

Please send a patch with the updated changelog and improved comment?

> >
> > Actually I liked the multi-shot version more, but Matt's arguments
> > convinced me that the one-shot fashion is more "secure" in terms of
> > overall kernel state and potential transitions/changes of this
> > /proc/pid/exe symlink.
> >
> > At least with the one-shot version the admin may be sure that the
> > symlink is never changed more than once, ever.
> >
>
> And changing it once does not harm security?
> I'm sure that rootkit writers will like this feature...

Well, let's discuss this more completely. In what ways could an
attacker use this? How serious is the problem? What actions can be
taken to lessen it? etcetera.

* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
From: Cyrill Gorcunov @ 2012-03-19 22:50 UTC
To: Andrew Morton
Cc: richard -rw- weinberger, LKML, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Mon, Mar 19, 2012 at 03:46:49PM -0700, Andrew Morton wrote:
>
> Please send a patch with the updated changelog and improved comment?

Sure, I'll resend.

>
> > >
> > > Actually I liked the multi-shot version more, but Matt's arguments
> > > convinced me that the one-shot fashion is more "secure" in terms of
> > > overall kernel state and potential transitions/changes of this
> > > /proc/pid/exe symlink.
> > >
> > > At least with the one-shot version the admin may be sure that the
> > > symlink is never changed more than once, ever.
> > >
> >
> > And changing it once does not harm security?
> > I'm sure that rootkit writers will like this feature...
>
> Well, let's discuss this more completely. In what ways could an
> attacker use this? How serious is the problem? What actions can be
> taken to lessen it? etcetera.

An attacker can use it iff CAP_SYS_RESOURCE is granted.
Otherwise the call fails with -EPERM.

	Cyrill

* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
From: Andrew Morton @ 2012-03-19 22:59 UTC
To: Cyrill Gorcunov
Cc: richard -rw- weinberger, LKML, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Tue, 20 Mar 2012 02:50:20 +0400
Cyrill Gorcunov <gorcunov@openvz.org> wrote:

> On Mon, Mar 19, 2012 at 03:46:49PM -0700, Andrew Morton wrote:
> >
> > Please send a patch with the updated changelog and improved comment?
>
> Sure, I'll resend.
>
> >
> > > >
> > > > Actually I liked the multi-shot version more, but Matt's arguments
> > > > convinced me that the one-shot fashion is more "secure" in terms of
> > > > overall kernel state and potential transitions/changes of this
> > > > /proc/pid/exe symlink.
> > > >
> > > > At least with the one-shot version the admin may be sure that the
> > > > symlink is never changed more than once, ever.
> > > >
> > >
> > > And changing it once does not harm security?
> > > I'm sure that rootkit writers will like this feature...
> >
> > Well, let's discuss this more completely. In what ways could an
> > attacker use this? How serious is the problem? What actions can be
> > taken to lessen it? etcetera.
>
> An attacker can use it iff CAP_SYS_RESOURCE is granted.
> Otherwise the call fails with -EPERM.

A rootkit already obtained CAP_SYS_RESOURCE. What we're concerned
about here is its ability to hide itself from view and its ability to
obscure the way in which it obtained elevated privs.

How much this patch worsens the situation is unclear to me, so let's
think it through.

* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
From: Cyrill Gorcunov @ 2012-03-19 23:12 UTC
To: Andrew Morton
Cc: richard -rw- weinberger, LKML, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Mon, Mar 19, 2012 at 03:59:26PM -0700, Andrew Morton wrote:
...
> >
> > An attacker can use it iff CAP_SYS_RESOURCE is granted.
> > Otherwise the call fails with -EPERM.
>
> A rootkit already obtained CAP_SYS_RESOURCE. What we're concerned
> about here is its ability to hide itself from view and its ability to
> obscure the way in which it obtained elevated privs.

Well, if a rootkit has got CAP_SYS_RESOURCE, I think we're in a bad
situation already -- it might change the symlink to some 'known' and
trusted application and you'll never notice that without scanning the
memory area such a rootkit uses. Note 'scanning' here: you need to
inspect the memory contents to figure out that they do not correspond
to the file the symlink points to. Actually, being able to restore a
program 'transparently' is a primary aim of checkpoint-restore itself.

>
> How much this patch worsens the situation is unclear to me, so let's
> think it through.

Dunno, Andrew, the /proc/pid/exe symlink was never a trusted source of
info, I guess. But I need to think some more...

	Cyrill

* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
From: richard -rw- weinberger @ 2012-03-19 23:02 UTC
To: Andrew Morton
Cc: Cyrill Gorcunov, LKML, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Mon, Mar 19, 2012 at 11:46 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> Well, let's discuss this more completely. In what ways could an
> attacker use this? How serious is the problem? What actions can be
> taken to lessen it? etcetera.

After considering the problem a bit more I think it's not a big problem.
We must not trust /proc/pid/exe in any way.
An attacker can always execute another binary without calling execve().

So why does the one-shot fashion make the feature more secure?
Let the user change the exe symlink as often as he wants.
From a security point of view the exe symlink is useless anyway.

-- 
Thanks,
//richard

* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
From: Cyrill Gorcunov @ 2012-03-19 23:17 UTC
To: richard -rw- weinberger
Cc: Andrew Morton, LKML, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Tue, Mar 20, 2012 at 12:02:44AM +0100, richard -rw- weinberger wrote:
> On Mon, Mar 19, 2012 at 11:46 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> > Well, let's discuss this more completely. In what ways could an
> > attacker use this? How serious is the problem? What actions can be
> > taken to lessen it? etcetera.
>
> After considering the problem a bit more I think it's not a big problem.
> We must not trust /proc/pid/exe in any way.

Well, Richard, we probably do not trust it anyway, but sysadmins might
(and this was another reason for the one-shot behaviour -- to not give
sysadmins heart attacks, and everyone would know this link can be
changed only one time ;)

> An attacker can always execute another binary without calling execve().

That's what c/r basically does :)

>
> So why does the one-shot fashion make the feature more secure?
> Let the user change the exe symlink as often as he wants.
> From a security point of view the exe symlink is useless anyway.

Maybe it's better to call it 'predictable' rather than 'secure', then?

	Cyrill

* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
From: richard -rw- weinberger @ 2012-03-19 23:23 UTC
To: Cyrill Gorcunov
Cc: Andrew Morton, LKML, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Tue, Mar 20, 2012 at 12:17 AM, Cyrill Gorcunov <gorcunov@openvz.org> wrote:
>> So why does the one-shot fashion make the feature more secure?
>> Let the user change the exe symlink as often as he wants.
>> From a security point of view the exe symlink is useless anyway.
>
> Maybe it's better to call it 'predictable' rather than 'secure', then?

I'd say one-shot makes it less confusing. :-)

-- 
Thanks,
//richard

* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
From: Cyrill Gorcunov @ 2012-03-20 6:55 UTC
To: Andrew Morton
Cc: richard -rw- weinberger, LKML, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Mon, Mar 19, 2012 at 03:46:49PM -0700, Andrew Morton wrote:
> > >>
> > >> What is this mysterious "security reason"?
> > >>
> > >
> > > Oh, sorry, I should have included Matt's comment here:
>
> Please send a patch with the updated changelog and improved comment?
>

Andrew, take a look please: do the changelog and comments look
better now?

	Cyrill
---
From: Cyrill Gorcunov <gorcunov@openvz.org>
Subject: c/r: prctl: add ability to set new mm_struct::exe_file

On restore we would like a way to set up the former
mm_struct::exe_file so that /proc/pid/exe points to the
original executable file the process had at checkpoint time.

For this the PR_SET_MM_EXE_FILE code is introduced.
This option takes a file descriptor which will be
set as the source for the new /proc/$pid/exe symlink.

Note that it allows changing /proc/$pid/exe iff there
are no VM_EXECUTABLE vmas present for the current process,
simply because this feature is specific to C/R
and mm::num_exe_file_vmas becomes meaningless after
that.

To minimize the number of transitions the /proc/pid/exe
symlink might go through, this feature is implemented in a
one-shot manner. Thus, once changed, the symlink can't be
changed again. This should help sysadmins monitor the symlinks
of all processes running in a system.

In particular, one could take a snapshot of the processes and
raise an alarm if there are unexpected changes of /proc/pid/exe's
in a system.

Note -- this feature is available iff CONFIG_CHECKPOINT_RESTORE
is set, and the caller must have the CAP_SYS_RESOURCE capability
granted, otherwise the request to change the symlink will be
rejected.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Kees Cook <keescook@chromium.org>
CC: Tejun Heo <tj@kernel.org>
CC: Matt Helsley <matthltc@us.ibm.com>
CC: richard -rw- weinberger <richard.weinberger@gmail.com>
---
 include/linux/prctl.h |    1 
 kernel/sys.c          |   56 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

Index: linux-2.6.git/include/linux/prctl.h
===================================================================
--- linux-2.6.git.orig/include/linux/prctl.h
+++ linux-2.6.git/include/linux/prctl.h
@@ -118,5 +118,6 @@
 # define PR_SET_MM_ENV_START	10
 # define PR_SET_MM_ENV_END		11
 # define PR_SET_MM_AUXV		12
+# define PR_SET_MM_EXE_FILE	13
 
 #endif /* _LINUX_PRCTL_H */
Index: linux-2.6.git/kernel/sys.c
===================================================================
--- linux-2.6.git.orig/kernel/sys.c
+++ linux-2.6.git/kernel/sys.c
@@ -36,6 +36,8 @@
 #include <linux/personality.h>
 #include <linux/ptrace.h>
 #include <linux/fs_struct.h>
+#include <linux/file.h>
+#include <linux/mount.h>
 #include <linux/gfp.h>
 #include <linux/syscore_ops.h>
 #include <linux/version.h>
@@ -1701,6 +1703,57 @@ static bool vma_flags_mismatch(struct vm
 		(vma->vm_flags & banned);
 }
 
+static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
+{
+	struct file *exe_file;
+	struct dentry *dentry;
+	int err;
+
+	/*
+	 * Setting new mm::exe_file is only allowed when no VM_EXECUTABLE vma's
+	 * remain. So perform a quick test first.
+	 */
+	if (mm->num_exe_file_vmas)
+		return -EBUSY;
+
+	exe_file = fget(fd);
+	if (!exe_file)
+		return -EBADF;
+
+	dentry = exe_file->f_path.dentry;
+
+	/*
+	 * Because the original mm->exe_file points to executable file, make
+	 * sure that this one is executable as well, to avoid breaking an
+	 * overall picture.
+	 */
+	err = -EACCES;
+	if (!S_ISREG(dentry->d_inode->i_mode)	||
+	    exe_file->f_path.mnt->mnt_flags & MNT_NOEXEC)
+		goto exit;
+
+	err = inode_permission(dentry->d_inode, MAY_EXEC);
+	if (err)
+		goto exit;
+
+	/*
+	 * The symlink can be changed only once, just to disallow arbitrary
+	 * transitions malicious software might bring in. This means one
+	 * could make a snapshot over all processes running and monitor
+	 * /proc/pid/exe changes to notice unusual activity if needed.
+	 */
+	down_write(&mm->mmap_sem);
+	if (likely(!mm->exe_file))
+		set_mm_exe_file(mm, exe_file);
+	else
+		err = -EBUSY;
+	up_write(&mm->mmap_sem);
+
+exit:
+	fput(exe_file);
+	return err;
+}
+
 static int prctl_set_mm(int opt, unsigned long addr,
 			unsigned long arg4, unsigned long arg5)
 {
@@ -1715,6 +1768,9 @@ static int prctl_set_mm(int opt, unsigne
 	if (!capable(CAP_SYS_RESOURCE))
 		return -EPERM;
 
+	if (opt == PR_SET_MM_EXE_FILE)
+		return prctl_set_mm_exe_file(mm, (unsigned int)addr);
+
 	if (addr >= TASK_SIZE)
 		return -EINVAL;
 

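The snapshot-and-monitor idea from the changelog above could be approximated from userspace roughly as follows. This is a minimal sketch under stated assumptions, not an existing tool: read_exe_link() and exe_link_changed() are hypothetical helper names, and it relies only on readlink(2) on /proc/<pid>/exe.

```c
#define _POSIX_C_SOURCE 200809L	/* for readlink() under strict C modes */

#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the target of /proc/<pid>/exe into @buf
 * as a NUL-terminated string. Returns the length, or -1 on failure.
 * Targets longer than @len - 1 bytes are silently truncated.
 */
static ssize_t read_exe_link(pid_t pid, char *buf, size_t len)
{
	char path[64];
	ssize_t n;

	assert(buf != NULL && len > 1);

	snprintf(path, sizeof(path), "/proc/%ld/exe", (long)pid);
	n = readlink(path, buf, len - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return n;
}

/*
 * Compare the current link against an earlier snapshot.
 * Returns 1 if it differs, 0 if unchanged, -1 if it can't be read.
 */
static int exe_link_changed(pid_t pid, const char *snapshot)
{
	char now[4096];

	if (read_exe_link(pid, now, sizeof(now)) < 0)
		return -1;
	return strcmp(now, snapshot) != 0;
}
```

A monitor would record read_exe_link() for each pid of interest and periodically call exe_link_changed(). With the one-shot rule, a given mm can legitimately report a change made through this prctl at most once.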
* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file 2012-03-20 6:55 ` Cyrill Gorcunov @ 2012-03-22 23:38 ` Eric W. Biederman 2012-03-23 6:41 ` Cyrill Gorcunov 2012-03-23 17:06 ` Matt Helsley 0 siblings, 2 replies; 22+ messages in thread From: Eric W. Biederman @ 2012-03-22 23:38 UTC (permalink / raw) To: Cyrill Gorcunov Cc: Andrew Morton, richard -rw- weinberger, LKML, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley Cyrill Gorcunov <gorcunov@openvz.org> writes: > On Mon, Mar 19, 2012 at 03:46:49PM -0700, Andrew Morton wrote: >> > >> >> > >> What is this mysterious "security reason"? >> > >> >> > > >> > > Oh, sorry I should have included Matt's comment here >> >> Please send a patch with the updated changelog and improved comment? >> > > Andrew, take a look please, will the changelog and comments look > better? Can you change this to take an actual address and get the exe_file from an mmapped area and make certain that the mmaped_area is already mapped MAP_EXEC. That will prevent out-right lies. At least then we will know that exe_file will at least be a file that is mapped executable in the process's address space. It's not a lot better but it makes /proc/<pid>/exe at almost as trustable as it is now. > Cyrill > --- > From: Cyrill Gorcunov <gorcunov@openvz.org> > Subject: c/r: prctl: add ability to set new mm_struct::exe_file > > When we do restore we would like to have a way to setup > a former mm_struct::exe_file so that /proc/pid/exe would > point to the original executable file a process had at > checkpoint time. > > For this the PR_SET_MM_EXE_FILE code is introduced. > This option takes a file descriptor which will be > set as a source for new /proc/$pid/exe symlink. > > Note it allows to change /proc/$pid/exe iif there > are no VM_EXECUTABLE vmas present for current process, > simply because this feature is a special to C/R > and mm::num_exe_file_vmas become meaningless after > that. 
>
> To minimize the amount of transition the /proc/pid/exe
> symlink might have, this feature is implemented in one-shot
> manner. Thus once changed the symlink can't be changed
> again. This should help sysadmins to monitor the symlinks
> over all process running in a system.
>
> In particular one could make a snapshot of processes and
> ring alarm if there unexpected changes of /proc/pid/exe's
> in a system.
>
> Note -- this feature is available iif CONFIG_CHECKPOINT_RESTORE
> is set and the caller must have CAP_SYS_RESOURCE capability
> granted, otherwise the request to change symlink will be
> rejected.
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
> Reviewed-by: Oleg Nesterov <oleg@redhat.com>
> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> CC: Pavel Emelyanov <xemul@parallels.com>
> CC: Kees Cook <keescook@chromium.org>
> CC: Tejun Heo <tj@kernel.org>
> CC: Matt Helsley <matthltc@us.ibm.com>
> CC: richard -rw- weinberger <richard.weinberger@gmail.com>
> ---
>  include/linux/prctl.h |    1
>  kernel/sys.c          |   56 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 57 insertions(+)
>
> Index: linux-2.6.git/include/linux/prctl.h
> ===================================================================
> --- linux-2.6.git.orig/include/linux/prctl.h
> +++ linux-2.6.git/include/linux/prctl.h
> @@ -118,5 +118,6 @@
>  # define PR_SET_MM_ENV_START		10
>  # define PR_SET_MM_ENV_END		11
>  # define PR_SET_MM_AUXV		12
> +# define PR_SET_MM_EXE_FILE		13
>
>  #endif /* _LINUX_PRCTL_H */
> Index: linux-2.6.git/kernel/sys.c
> ===================================================================
> --- linux-2.6.git.orig/kernel/sys.c
> +++ linux-2.6.git/kernel/sys.c
> @@ -36,6 +36,8 @@
>  #include <linux/personality.h>
>  #include <linux/ptrace.h>
>  #include <linux/fs_struct.h>
> +#include <linux/file.h>
> +#include <linux/mount.h>
>  #include <linux/gfp.h>
>  #include <linux/syscore_ops.h>
>  #include <linux/version.h>
> @@ -1701,6 +1703,57 @@ static bool vma_flags_mismatch(struct vm
>  		(vma->vm_flags & banned);
>  }
>
> +static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
> +{
> +	struct file *exe_file;
> +	struct dentry *dentry;
> +	int err;
> +
> +	/*
> +	 * Setting new mm::exe_file is only allowed when no VM_EXECUTABLE vma's
> +	 * remain. So perform a quick test first.
> +	 */
> +	if (mm->num_exe_file_vmas)
> +		return -EBUSY;
> +
> +	exe_file = fget(fd);
> +	if (!exe_file)
> +		return -EBADF;
> +
> +	dentry = exe_file->f_path.dentry;
> +
> +	/*
> +	 * Because the original mm->exe_file points to executable file, make
> +	 * sure that this one is executable as well, to avoid breaking an
> +	 * overall picture.
> +	 */
> +	err = -EACCES;
> +	if (!S_ISREG(dentry->d_inode->i_mode)	||
> +	    exe_file->f_path.mnt->mnt_flags & MNT_NOEXEC)
> +		goto exit;
> +
> +	err = inode_permission(dentry->d_inode, MAY_EXEC);
> +	if (err)
> +		goto exit;
> +
> +	/*
> +	 * The symlink can be changed only once, just to disallow arbitrary
> +	 * transitions malicious software might bring in. This means one
> +	 * could make a snapshot over all processes running and monitor
> +	 * /proc/pid/exe changes to notice unusual activity if needed.
> +	 */
> +	down_write(&mm->mmap_sem);
> +	if (likely(!mm->exe_file))
> +		set_mm_exe_file(mm, exe_file);
> +	else
> +		err = -EBUSY;
> +	up_write(&mm->mmap_sem);
> +
> +exit:
> +	fput(exe_file);
> +	return err;
> +}
> +
>  static int prctl_set_mm(int opt, unsigned long addr,
>  			unsigned long arg4, unsigned long arg5)
>  {
> @@ -1715,6 +1768,9 @@ static int prctl_set_mm(int opt, unsigne
>  	if (!capable(CAP_SYS_RESOURCE))
>  		return -EPERM;
>
> +	if (opt == PR_SET_MM_EXE_FILE)
> +		return prctl_set_mm_exe_file(mm, (unsigned int)addr);
> +
>  	if (addr >= TASK_SIZE)
>  		return -EINVAL;
>
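For reference, a userspace caller of the interface in the quoted patch could look roughly like the sketch below. It is only an illustration under a few assumptions: restore_exe_link() is a made-up helper name, the fallback value 13 for PR_SET_MM_EXE_FILE is the one from the patch, and the fallback value 35 for PR_SET_MM is not stated anywhere in this thread but is what mainline headers use. The error codes in the comment are read off the quoted kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Fallbacks for older userspace headers. Both values are assumptions:
 * 13 is taken from the patch, 35 from mainline headers. */
#ifndef PR_SET_MM
#define PR_SET_MM		35
#endif
#ifndef PR_SET_MM_EXE_FILE
#define PR_SET_MM_EXE_FILE	13
#endif

/*
 * Hypothetical restore-side helper: ask the kernel to make
 * /proc/self/exe point at @path.
 * Returns 0 on success or a negative errno value.
 */
static int restore_exe_link(const char *path)
{
	int fd, ret = 0;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	/*
	 * Going by the quoted patch: -EPERM without CAP_SYS_RESOURCE,
	 * -EBUSY while VM_EXECUTABLE vmas remain or mm->exe_file is
	 * still set, -EACCES if the file is not a regular file or sits
	 * on a noexec mount.
	 */
	if (prctl(PR_SET_MM, PR_SET_MM_EXE_FILE, (unsigned long)fd, 0UL, 0UL) < 0)
		ret = -errno;

	/* The kernel takes its own reference, so the fd can be closed. */
	close(fd);
	return ret;
}
```

A restorer would call restore_exe_link() with the path recorded at checkpoint time, after unmapping its own executable. An ordinary process should expect an error rather than success.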
* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-22 23:38 ` Eric W. Biederman
@ 2012-03-23  6:41   ` Cyrill Gorcunov
  2012-03-23  6:47     ` Cyrill Gorcunov
  2012-03-23 17:06   ` Matt Helsley
  1 sibling, 1 reply; 22+ messages in thread
From: Cyrill Gorcunov @ 2012-03-23  6:41 UTC
To: Eric W. Biederman
Cc: Andrew Morton, richard -rw- weinberger, LKML, Oleg Nesterov,
    KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Thu, Mar 22, 2012 at 04:38:43PM -0700, Eric W. Biederman wrote:
> >
> > Andrew, take a look please, will the changelog and comments look
> > better?
>
> Can you change this to take an actual address and get the exe_file
> from an mmapped area and make certain that the mmaped_area is already
> mapped MAP_EXEC.
>
> That will prevent out-right lies.
>
> At least then we will know that exe_file will at least be a file that is
> mapped executable in the process's address space. It's not a lot better
> but it makes /proc/<pid>/exe at almost as trustable as it is now.

This won't work for all cases. When we restore a program we map new
VM_EXEC areas _without_ vma::vm_file field.

	Cyrill
* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-23  6:41 ` Cyrill Gorcunov
@ 2012-03-23  6:47   ` Cyrill Gorcunov
  0 siblings, 0 replies; 22+ messages in thread
From: Cyrill Gorcunov @ 2012-03-23  6:47 UTC
To: Eric W. Biederman
Cc: Andrew Morton, richard -rw- weinberger, LKML, Oleg Nesterov,
    KOSAKI Motohiro, Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Fri, Mar 23, 2012 at 10:41:36AM +0400, Cyrill Gorcunov wrote:
> On Thu, Mar 22, 2012 at 04:38:43PM -0700, Eric W. Biederman wrote:
> > >
> > > Andrew, take a look please, will the changelog and comments look
> > > better?
> >
> > Can you change this to take an actual address and get the exe_file
> > from an mmapped area and make certain that the mmaped_area is already
> > mapped MAP_EXEC.
> >
> > That will prevent out-right lies.
> >
> > At least then we will know that exe_file will at least be a file that is
> > mapped executable in the process's address space. It's not a lot better
> > but it makes /proc/<pid>/exe at almost as trustable as it is now.
>
> This won't work for all cases. When we restore a program we map new
> VM_EXEC areas _without_ vma::vm_file field.
>

Well, that's not completely true ;) At the moment all exec areas we
re-map during restore do correspond to executable files, but I think
having the ability to set this symlink without a requirement such as
'every VM_EXEC area should be mapped with a file' is a win.

That said, I would prefer to leave this interface as is, unless there
are strong objections.

	Cyrill
* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-22 23:38 ` Eric W. Biederman
  2012-03-23  6:41   ` Cyrill Gorcunov
@ 2012-03-23 17:06   ` Matt Helsley
  1 sibling, 0 replies; 22+ messages in thread
From: Matt Helsley @ 2012-03-23 17:06 UTC
To: Eric W. Biederman
Cc: Cyrill Gorcunov, Andrew Morton, richard -rw- weinberger, LKML,
    Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
    Tejun Heo, Matt Helsley

On Thu, Mar 22, 2012 at 04:38:43PM -0700, Eric W. Biederman wrote:
> Cyrill Gorcunov <gorcunov@openvz.org> writes:
>
> > On Mon, Mar 19, 2012 at 03:46:49PM -0700, Andrew Morton wrote:
> >> > >>
> >> > >> What is this mysterious "security reason"?
> >> > >>
> >> > >
> >> > > Oh, sorry I should have included Matt's comment here
> >>
> >> Please send a patch with the updated changelog and improved comment?
> >>
> >
> > Andrew, take a look please, will the changelog and comments look
> > better?
>
> Can you change this to take an actual address and get the exe_file
> from an mmapped area and make certain that the mmaped_area is already
> mapped MAP_EXEC.

Do you mean PROT_EXEC/VM_EXEC?

>
> That will prevent out-right lies.
>
> At least then we will know that exe_file will at least be a file that is
> mapped executable in the process's address space. It's not a lot better
> but it makes /proc/<pid>/exe at almost as trustable as it is now.

I don't dislike the idea. However just because it's mapped with one of
those flags does not mean that a single instruction of it will ever be
executed. So it's not much better than using the fd :/.

Perhaps there is some way to use the userspace stack and/or regs to get
a reasonable instruction pointer, lookup its VMA, and use that? I'm not
sure that would work for c/r though...

Cheers,
	-Matt
* Re: [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file
  2012-03-19 22:41 ` richard -rw- weinberger
  2012-03-19 22:46   ` Andrew Morton
@ 2012-03-19 22:47   ` Cyrill Gorcunov
  1 sibling, 0 replies; 22+ messages in thread
From: Cyrill Gorcunov @ 2012-03-19 22:47 UTC
To: richard -rw- weinberger
Cc: Andrew Morton, LKML, Oleg Nesterov, KOSAKI Motohiro,
    Pavel Emelyanov, Kees Cook, Tejun Heo, Matt Helsley

On Mon, Mar 19, 2012 at 11:41:36PM +0100, richard -rw- weinberger wrote:
> >
> > Actually I liked multi-shot version more but Matt arguments convinced
> > me that one-short fashion is more "secure" in terms of overall kernel
> > state and potential transitions/changes of this /proc/pid/exe symlink.
> >
> > At least with one-shot version the admin may be sure that the symlink
> > is never changed more than once, ever.
> >
>
> And changing it once does not harm security?
> I'm sure that rootkit writers will like this feature...

The one-shot rule limits the number of transitions, but you still have
to obtain CAP_SYS_RESOURCE before you can change this symlink (i.e. it's
not an 'anyone-can-change-it' feature).

	Cyrill
* [patch 2/2] c/r: prctl: Add ability to get clear_tid_address
  2012-03-16 20:55 [patch 0/2] [PATCH 0/2] prctl extension in a sake of c/r Cyrill Gorcunov
  2012-03-16 20:55 ` [patch 1/2] c/r: prctl: Add ability to set new mm_struct::exe_file Cyrill Gorcunov
@ 2012-03-16 20:55 ` Cyrill Gorcunov
  2012-03-19 16:51   ` Kees Cook
  1 sibling, 1 reply; 22+ messages in thread
From: Cyrill Gorcunov @ 2012-03-16 20:55 UTC
To: LKML
Cc: Andrew Morton, Oleg Nesterov, KOSAKI Motohiro, Pavel Emelyanov,
    Kees Cook, Tejun Heo, Matt Helsley, Andrew Vagin, Cyrill Gorcunov,
    Pedro Alves

[-- Attachment #1: c-r-prctl-add-PR_GET_TID_ADDRESS --]
[-- Type: text/plain, Size: 2295 bytes --]

Zero is written at clear_tid_address, when
the process exits. This functionality is used
by pthread_join().

We already have sys_set_tid_address() to change this
address for current task but there is no way to obtain
it from a user space.

Without ability to find this address and dump it we can't
restore pthread'ed apps which do call pthread_join() once
they have been restored.

This patch introduces PR_GET_TID_ADDRESS prctl option which
allow current process to obtain own clear_tid_address.

This feature is available iif CONFIG_CHECKPOINT_RESTORE is set.

Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Pedro Alves <palves@redhat.com>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Tejun Heo <tj@kernel.org>
---
 include/linux/prctl.h |    2 ++
 kernel/sys.c          |   13 +++++++++++++
 2 files changed, 15 insertions(+)

Index: linux-2.6.git/include/linux/prctl.h
===================================================================
--- linux-2.6.git.orig/include/linux/prctl.h
+++ linux-2.6.git/include/linux/prctl.h
@@ -120,4 +120,6 @@
 # define PR_SET_MM_AUXV		12
 # define PR_SET_MM_EXE_FILE	13
 
+#define PR_GET_TID_ADDRESS	36
+
 #endif /* _LINUX_PRCTL_H */
Index: linux-2.6.git/kernel/sys.c
===================================================================
--- linux-2.6.git.orig/kernel/sys.c
+++ linux-2.6.git/kernel/sys.c
@@ -1901,12 +1901,22 @@ out:
 	up_read(&mm->mmap_sem);
 	return error;
 }
+
+static int prctl_get_tid_address(struct task_struct *me, int __user **tid_addr)
+{
+	return put_user(me->clear_child_tid, tid_addr);
+}
+
 #else /* CONFIG_CHECKPOINT_RESTORE */
 static int prctl_set_mm(int opt, unsigned long addr,
 			unsigned long arg4, unsigned long arg5)
 {
 	return -EINVAL;
 }
+static int prctl_get_tid_address(struct task_struct *me, int __user **tid_addr)
+{
+	return -EINVAL;
+}
 #endif
 
 SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
@@ -2061,6 +2071,9 @@ SYSCALL_DEFINE5(prctl, int, option, unsi
 		case PR_SET_MM:
 			error = prctl_set_mm(arg2, arg3, arg4, arg5);
 			break;
+		case PR_GET_TID_ADDRESS:
+			error = prctl_get_tid_address(me, (int __user **)arg2);
+			break;
 		default:
 			error = -EINVAL;
 			break;
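From userspace, the dump/restore pairing this patch is after could be sketched as follows. The helper names are hypothetical. One caveat on the option number: this posting defines PR_GET_TID_ADDRESS as 36, while the released kernel headers I am aware of define it as 40, so the sketch takes the value from the system header and only falls back to 40 as an assumption. On kernels built without CONFIG_CHECKPOINT_RESTORE the get side is expected to fail with -EINVAL, as in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: fall back to the value used by released kernel headers;
 * the posting in this thread uses 36 instead. */
#ifndef PR_GET_TID_ADDRESS
#define PR_GET_TID_ADDRESS	40
#endif

/*
 * Dump side (hypothetical helper): fetch the caller's own
 * clear_child_tid pointer into *addr.
 * Returns 0 on success or a negative errno value.
 */
static int get_clear_tid_address(int **addr)
{
	assert(addr != NULL);

	if (prctl(PR_GET_TID_ADDRESS, (unsigned long)addr, 0UL, 0UL, 0UL) < 0)
		return -errno;
	return 0;
}

/*
 * Restore side (hypothetical helper): install a previously dumped
 * pointer again. set_tid_address() returns the caller's thread ID.
 */
static long set_clear_tid_address(int *addr)
{
	return syscall(SYS_set_tid_address, addr);
}
```

A checkpoint tool would run get_clear_tid_address() inside each thread being dumped and, after recreating the thread on restore, call set_clear_tid_address() with the saved pointer so that pthread_join() on that thread keeps working.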
* Re: [patch 2/2] c/r: prctl: Add ability to get clear_tid_address
  2012-03-16 20:55 ` [patch 2/2] c/r: prctl: Add ability to get clear_tid_address Cyrill Gorcunov
@ 2012-03-19 16:51   ` Kees Cook
  2012-03-19 16:55     ` Cyrill Gorcunov
  0 siblings, 1 reply; 22+ messages in thread
From: Kees Cook @ 2012-03-19 16:51 UTC
To: Cyrill Gorcunov
Cc: LKML, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro,
    Pavel Emelyanov, Tejun Heo, Matt Helsley, Andrew Vagin, Pedro Alves

On Fri, Mar 16, 2012 at 1:55 PM, Cyrill Gorcunov <gorcunov@openvz.org> wrote:
> Zero is written at clear_tid_address, when
> the process exits. This functionality is used
> by pthread_join().
>
> We already have sys_set_tid_address() to change this
> address for current task but there is no way to obtain
> it from a user space.

Is it worth introducing a syscall for this just for symmetry? I
suspect not, in which case:

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

-- 
Kees Cook
ChromeOS Security
* Re: [patch 2/2] c/r: prctl: Add ability to get clear_tid_address
  2012-03-19 16:51 ` Kees Cook
@ 2012-03-19 16:55   ` Cyrill Gorcunov
  0 siblings, 0 replies; 22+ messages in thread
From: Cyrill Gorcunov @ 2012-03-19 16:55 UTC
To: Kees Cook
Cc: LKML, Andrew Morton, Oleg Nesterov, KOSAKI Motohiro,
    Pavel Emelyanov, Tejun Heo, Matt Helsley, Andrew Vagin, Pedro Alves

On Mon, Mar 19, 2012 at 09:51:36AM -0700, Kees Cook wrote:
> On Fri, Mar 16, 2012 at 1:55 PM, Cyrill Gorcunov <gorcunov@openvz.org> wrote:
> > Zero is written at clear_tid_address, when
> > the process exits. This functionality is used
> > by pthread_join().
> >
> > We already have sys_set_tid_address() to change this
> > address for current task but there is no way to obtain
> > it from a user space.
>
> Is it worth introducing a syscall for this just for symmetry? I
> suspect not, in which case:
>
> Acked-by: Kees Cook <keescook@chromium.org>
>

Thanks Kees! A syscall was considered a "no-no" in the previous
conversation about this patch, so we switched to prctl.

	Cyrill
* [patch 0/2] c/r update to prctl
@ 2012-03-09 21:41 Cyrill Gorcunov
  2012-03-09 21:41 ` [patch 2/2] c/r: prctl: Add ability to get clear_tid_address Cyrill Gorcunov
  0 siblings, 1 reply; 22+ messages in thread
From: Cyrill Gorcunov @ 2012-03-09 21:41 UTC
To: LKML, Andrew Morton
Cc: Matt Helsley, KOSAKI Motohiro, Pavel Emelyanov, Kees Cook,
    Tejun Heo, Oleg Nesterov

Hi,

this is a series to extend prctl a bit for c/r needs.

 - new PR_SET_MM_EXE_FILE option to setup /proc/pid/exe symlink
   on restored program

 - new PR_GET_TID_ADDRESS option to retrieve process'
   clear_tid_address

Andrew, please drop Oleg's exe_path patch from -mm tree

 | The patch titled
 |      Subject: mm/exec: rename mm->exe_file to mm->exe_path
 | has been added to the -mm tree.  Its filename is
 |      mm-exec-rename-mm-exe_file-to-mm-exe_path.patch

This patch was deprecated.

	Cyrill