linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization
@ 2015-03-14 22:39 Davidlohr Bueso
  2015-03-14 22:39 ` [PATCH 1/4] " Davidlohr Bueso
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Davidlohr Bueso @ 2015-03-14 22:39 UTC (permalink / raw)
  To: akpm; +Cc: viro, gorcunov, oleg, koct9i, dave, linux-mm, linux-kernel

This is a set I created on top of patch 1/4 which also includes mm_struct cleanups
and dealing with prctl exe_file functionality. Specific details are in each patch.
Patch 4 is an extra trivial one I found while going through the code.

Applies on top of next-20150313.

Thanks!

Davidlohr Bueso (4):
  mm: replace mmap_sem for mm->exe_file serialization
  mm: introduce struct exe_file
  prctl: move MMF_EXE_FILE_CHANGED into exe_file struct
  kernel/fork: use pr_alert() for rss counter bugs

 fs/exec.c                |   6 +++
 include/linux/mm.h       |   4 ++
 include/linux/mm_types.h |   8 +++-
 include/linux/sched.h    |   5 +--
 kernel/fork.c            |  72 ++++++++++++++++++++++++++------
 kernel/sys.c             | 106 +++++++++++++++++++++++++++--------------------
 6 files changed, 141 insertions(+), 60 deletions(-)

-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/4] mm: replace mmap_sem for mm->exe_file serialization
  2015-03-14 22:39 [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization Davidlohr Bueso
@ 2015-03-14 22:39 ` Davidlohr Bueso
  2015-03-14 22:39 ` [PATCH 2/4] mm: introduce struct exe_file Davidlohr Bueso
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Davidlohr Bueso @ 2015-03-14 22:39 UTC (permalink / raw)
  To: akpm
  Cc: viro, gorcunov, oleg, koct9i, dave, linux-mm, linux-kernel,
	Davidlohr Bueso

We currently use the mmap_sem to serialize the mm exe_file.
This is atrocious and a clear example of the misuses this
lock has all over the place, making any significant changes
to the address space locking that much more complex and tedious.
This also has to do of how we used to check for the vma's vm_file
being VM_EXECUTABLE (much of which was replaced by 2dd8ad81e31).

This patch, therefore, removes the mmap_sem dependency and
introduces a specific lock for the exe_file (rwlock_t, as it is
read mostly and protects a trivial critical region). As mentioned,
the motivation is to cleanup mmap_sem (as opposed to exe_file
performance). A nice side effect of this is that we avoid taking
the mmap_sem (shared) in fork paths for the exe_file handling.

Now that callers have been updated and standardized[1, 2] around
the get_mm_set_exe_file() interface, changing the locking scheme
is quite straightforward. The exception being the prctl calls
(ie PR_SET_MM_EXE_FILE). Because this caller actually updates
the mm->exe_file, we need to handle it in the same patch that changes
the locking rules. For this we need to reorganize prctl_set_mm_exe_file,
such that:

o mmap_sem is taken when actually needed.

o a new set_mm_exe_file_locked() function is introduced to be used by
  prctl. We now need to explicitly acquire the exe_file_lock as before
  it was implicit in holding the mmap_sem for write.

o a new __prctl_set_mm_exe_file() helper is created, which actually
  does the exe_file handling for the mm side -- needing the write
  lock for updating the mm->flags.

mm: improve handling of mm->exe_file
[1] Part 1: https://lkml.org/lkml/2015/2/18/721
[2] Part 2: https://lkml.org/lkml/2015/2/25/679

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
CC: Konstantin Khlebnikov <koct9i@gmail.com>
---
 fs/exec.c                |  6 ++++
 include/linux/mm.h       |  3 ++
 include/linux/mm_types.h |  1 +
 kernel/fork.c            | 26 +++++++++++---
 kernel/sys.c             | 92 ++++++++++++++++++++++++++++--------------------
 5 files changed, 85 insertions(+), 43 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 314e8d8..02bfd98 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1082,7 +1082,13 @@ int flush_old_exec(struct linux_binprm * bprm)
 	if (retval)
 		goto out;
 
+	/*
+	 * Must be called _before_ exec_mmap() as bprm->mm is
+	 * not visibile until then. This also enables the update
+	 * to be lockless.
+	 */
 	set_mm_exe_file(bprm->mm, bprm->file);
+
 	/*
 	 * Release all of the old mmap stuff
 	 */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6571dd78..0c0720d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1909,7 +1909,10 @@ static inline int check_data_rlimit(unsigned long rlim,
 extern int mm_take_all_locks(struct mm_struct *mm);
 extern void mm_drop_all_locks(struct mm_struct *mm);
 
+/* mm->exe_file handling */
 extern void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file);
+extern void set_mm_exe_file_locked(struct mm_struct *mm,
+				   struct file *new_exe_file);
 extern struct file *get_mm_exe_file(struct mm_struct *mm);
 
 extern int may_expand_vm(struct mm_struct *mm, unsigned long npages);
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 590630e..5951baf 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -429,6 +429,7 @@ struct mm_struct {
 #endif
 
 	/* store ref to file /proc/<pid>/exe symlink points to */
+	rwlock_t exe_file_lock;
 	struct file *exe_file;
 #ifdef CONFIG_MMU_NOTIFIER
 	struct mmu_notifier_mm *mmu_notifier_mm;
diff --git a/kernel/fork.c b/kernel/fork.c
index ab1a008..a573b18 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -566,6 +566,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p)
 	mm_init_owner(mm, p);
 	mmu_notifier_mm_init(mm);
 	clear_tlb_flush_pending(mm);
+	rwlock_init(&mm->exe_file_lock);
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
 	mm->pmd_huge_pte = NULL;
 #endif
@@ -674,12 +675,25 @@ void mmput(struct mm_struct *mm)
 }
 EXPORT_SYMBOL_GPL(mmput);
 
+void set_mm_exe_file_locked(struct mm_struct *mm, struct file *new_exe_file)
+{
+	if (new_exe_file)
+		get_file(new_exe_file);
+	if (mm->exe_file)
+		fput(mm->exe_file);
+
+	write_lock(&mm->exe_file_lock);
+	mm->exe_file = new_exe_file;
+	write_unlock(&mm->exe_file_lock);
+}
+
 void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
 {
 	if (new_exe_file)
 		get_file(new_exe_file);
 	if (mm->exe_file)
 		fput(mm->exe_file);
+
 	mm->exe_file = new_exe_file;
 }
 
@@ -687,20 +701,22 @@ struct file *get_mm_exe_file(struct mm_struct *mm)
 {
 	struct file *exe_file;
 
-	/* We need mmap_sem to protect against races with removal of exe_file */
-	down_read(&mm->mmap_sem);
+	read_lock(&mm->exe_file_lock);
 	exe_file = mm->exe_file;
 	if (exe_file)
 		get_file(exe_file);
-	up_read(&mm->mmap_sem);
+	read_unlock(&mm->exe_file_lock);
+
 	return exe_file;
 }
 EXPORT_SYMBOL(get_mm_exe_file);
 
 static void dup_mm_exe_file(struct mm_struct *oldmm, struct mm_struct *newmm)
 {
-	/* It's safe to write the exe_file pointer without exe_file_lock because
-	 * this is called during fork when the task is not yet in /proc */
+	/*
+	 * It's safe to write the exe_file without the
+	 * exe_file_lock as we are just setting up the new task.
+	 */
 	newmm->exe_file = get_mm_exe_file(oldmm);
 }
 
diff --git a/kernel/sys.c b/kernel/sys.c
index 3be3449..a4d70f0 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1649,14 +1649,59 @@ SYSCALL_DEFINE1(umask, int, mask)
 	return mask;
 }
 
-static int prctl_set_mm_exe_file_locked(struct mm_struct *mm, unsigned int fd)
+static int __prctl_set_mm_exe_file(struct mm_struct *mm, struct fd exe)
+{
+	int err;
+	struct file *exe_file;
+
+	/*
+	 * Forbid mm->exe_file change if old file still mapped.
+	 */
+	exe_file = get_mm_exe_file(mm);
+	err = -EBUSY;
+	down_write(&mm->mmap_sem);
+	if (exe_file) {
+		struct vm_area_struct *vma;
+
+		for (vma = mm->mmap; vma; vma = vma->vm_next) {
+			if (!vma->vm_file)
+				continue;
+			if (path_equal(&vma->vm_file->f_path,
+				       &exe_file->f_path)) {
+				fput(exe_file);
+				goto exit_err;
+			}
+		}
+
+		fput(exe_file);
+	}
+
+	/*
+	 * The symlink can be changed only once, just to disallow arbitrary
+	 * transitions malicious software might bring in. This means one
+	 * could make a snapshot over all processes running and monitor
+	 * /proc/pid/exe changes to notice unusual activity if needed.
+	 */
+	err = -EPERM;
+	if (test_and_set_bit(MMF_EXE_FILE_CHANGED, &mm->flags))
+		goto exit_err;
+
+	up_write(&mm->mmap_sem);
+
+	/* this grabs a reference to exe.file */
+	set_mm_exe_file_locked(mm, exe.file);
+	return 0;
+exit_err:
+	up_write(&mm->mmap_sem);
+	return err;
+}
+
+static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
 {
 	struct fd exe;
 	struct inode *inode;
 	int err;
 
-	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm);
-
 	exe = fdget(fd);
 	if (!exe.file)
 		return -EBADF;
@@ -1677,32 +1722,7 @@ static int prctl_set_mm_exe_file_locked(struct mm_struct *mm, unsigned int fd)
 	if (err)
 		goto exit;
 
-	/*
-	 * Forbid mm->exe_file change if old file still mapped.
-	 */
-	err = -EBUSY;
-	if (mm->exe_file) {
-		struct vm_area_struct *vma;
-
-		for (vma = mm->mmap; vma; vma = vma->vm_next)
-			if (vma->vm_file &&
-			    path_equal(&vma->vm_file->f_path,
-				       &mm->exe_file->f_path))
-				goto exit;
-	}
-
-	/*
-	 * The symlink can be changed only once, just to disallow arbitrary
-	 * transitions malicious software might bring in. This means one
-	 * could make a snapshot over all processes running and monitor
-	 * /proc/pid/exe changes to notice unusual activity if needed.
-	 */
-	err = -EPERM;
-	if (test_and_set_bit(MMF_EXE_FILE_CHANGED, &mm->flags))
-		goto exit;
-
-	err = 0;
-	set_mm_exe_file(mm, exe.file);	/* this grabs a reference to exe.file */
+	err = __prctl_set_mm_exe_file(mm, exe);
 exit:
 	fdput(exe);
 	return err;
@@ -1840,10 +1860,10 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
 		user_auxv[AT_VECTOR_SIZE - 1] = AT_NULL;
 	}
 
-	down_write(&mm->mmap_sem);
 	if (prctl_map.exe_fd != (u32)-1)
-		error = prctl_set_mm_exe_file_locked(mm, prctl_map.exe_fd);
-	downgrade_write(&mm->mmap_sem);
+		error = prctl_set_mm_exe_file(mm, prctl_map.exe_fd);
+
+	down_read(&mm->mmap_sem);
 	if (error)
 		goto out;
 
@@ -1909,12 +1929,8 @@ static int prctl_set_mm(int opt, unsigned long addr,
 	if (!capable(CAP_SYS_RESOURCE))
 		return -EPERM;
 
-	if (opt == PR_SET_MM_EXE_FILE) {
-		down_write(&mm->mmap_sem);
-		error = prctl_set_mm_exe_file_locked(mm, (unsigned int)addr);
-		up_write(&mm->mmap_sem);
-		return error;
-	}
+	if (opt == PR_SET_MM_EXE_FILE)
+		return prctl_set_mm_exe_file(mm, (unsigned int)addr);
 
 	if (addr >= TASK_SIZE || addr < mmap_min_addr)
 		return -EINVAL;
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/4] mm: introduce struct exe_file
  2015-03-14 22:39 [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization Davidlohr Bueso
  2015-03-14 22:39 ` [PATCH 1/4] " Davidlohr Bueso
@ 2015-03-14 22:39 ` Davidlohr Bueso
  2015-03-14 22:39 ` [PATCH 3/4] prctl: move MMF_EXE_FILE_CHANGED into exe_file struct Davidlohr Bueso
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Davidlohr Bueso @ 2015-03-14 22:39 UTC (permalink / raw)
  To: akpm
  Cc: viro, gorcunov, oleg, koct9i, dave, linux-mm, linux-kernel,
	Davidlohr Bueso

This patch isolates exe_file handling into its own data
structure, tiding up the mm_struct bits (which must remain
there as we provide prctl thread interfaces to change it).
Note that none of the interfaces have changed, users will
continue dealing with the actual backing struct file, but
internally we isolate things, serialization remaining the same.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
CC: Konstantin Khlebnikov <koct9i@gmail.com>
---
 include/linux/mm_types.h |  8 ++++++--
 kernel/fork.c            | 27 ++++++++++++++-------------
 2 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5951baf..1fc994e 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -326,6 +326,11 @@ struct core_state {
 	struct completion startup;
 };
 
+struct exe_file {
+	rwlock_t lock;
+	struct file *file;
+};
+
 enum {
 	MM_FILEPAGES,
 	MM_ANONPAGES,
@@ -429,8 +434,7 @@ struct mm_struct {
 #endif
 
 	/* store ref to file /proc/<pid>/exe symlink points to */
-	rwlock_t exe_file_lock;
-	struct file *exe_file;
+	struct exe_file exe_file;
 #ifdef CONFIG_MMU_NOTIFIER
 	struct mmu_notifier_mm *mmu_notifier_mm;
 #endif
diff --git a/kernel/fork.c b/kernel/fork.c
index a573b18..aa0332b 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -566,7 +566,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p)
 	mm_init_owner(mm, p);
 	mmu_notifier_mm_init(mm);
 	clear_tlb_flush_pending(mm);
-	rwlock_init(&mm->exe_file_lock);
+	rwlock_init(&mm->exe_file.lock);
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
 	mm->pmd_huge_pte = NULL;
 #endif
@@ -679,45 +679,46 @@ void set_mm_exe_file_locked(struct mm_struct *mm, struct file *new_exe_file)
 {
 	if (new_exe_file)
 		get_file(new_exe_file);
-	if (mm->exe_file)
-		fput(mm->exe_file);
+	if (mm->exe_file.file)
+		fput(mm->exe_file.file);
 
-	write_lock(&mm->exe_file_lock);
-	mm->exe_file = new_exe_file;
-	write_unlock(&mm->exe_file_lock);
+	write_lock(&mm->exe_file.lock);
+	mm->exe_file.file = new_exe_file;
+	write_unlock(&mm->exe_file.lock);
 }
 
 void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
 {
 	if (new_exe_file)
 		get_file(new_exe_file);
-	if (mm->exe_file)
-		fput(mm->exe_file);
+	if (mm->exe_file.file)
+		fput(mm->exe_file.file);
 
-	mm->exe_file = new_exe_file;
+	mm->exe_file.file = new_exe_file;
 }
 
 struct file *get_mm_exe_file(struct mm_struct *mm)
 {
 	struct file *exe_file;
 
-	read_lock(&mm->exe_file_lock);
-	exe_file = mm->exe_file;
+	read_lock(&mm->exe_file.lock);
+	exe_file = mm->exe_file.file;
 	if (exe_file)
 		get_file(exe_file);
-	read_unlock(&mm->exe_file_lock);
+	read_unlock(&mm->exe_file.lock);
 
 	return exe_file;
 }
 EXPORT_SYMBOL(get_mm_exe_file);
 
+
 static void dup_mm_exe_file(struct mm_struct *oldmm, struct mm_struct *newmm)
 {
 	/*
 	 * It's safe to write the exe_file without the
 	 * exe_file_lock as we are just setting up the new task.
 	 */
-	newmm->exe_file = get_mm_exe_file(oldmm);
+	newmm->exe_file.file = get_mm_exe_file(oldmm);
 }
 
 /**
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/4] prctl: move MMF_EXE_FILE_CHANGED into exe_file struct
  2015-03-14 22:39 [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization Davidlohr Bueso
  2015-03-14 22:39 ` [PATCH 1/4] " Davidlohr Bueso
  2015-03-14 22:39 ` [PATCH 2/4] mm: introduce struct exe_file Davidlohr Bueso
@ 2015-03-14 22:39 ` Davidlohr Bueso
  2015-03-15  2:13   ` Davidlohr Bueso
  2015-03-14 22:39 ` [PATCH 4/4] kernel/fork: use pr_alert() for rss counter bugs Davidlohr Bueso
  2015-03-15 14:21 ` [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization Oleg Nesterov
  4 siblings, 1 reply; 15+ messages in thread
From: Davidlohr Bueso @ 2015-03-14 22:39 UTC (permalink / raw)
  To: akpm
  Cc: viro, gorcunov, oleg, koct9i, dave, linux-mm, linux-kernel,
	Davidlohr Bueso

Given the introduction of the exe_file structure, this
functionality should be associated with. Because this is
a very specific prctl property, it is easily to do so. As
of now, when the file has already changed, mmap_sem is not
taken at all (however we do need it of course to check the
old mapping, but this is now shared) and we maintain the
test-and-set logic ensuring nothing raced when we were not
holding the exe_file lock.

Now, the downside is that this patch makes MMF_EXE_FILE_CHANGED
functionality general, of course trivially enlarging the mm_struct
to users that don't care about this - which is the most usual case.
But I don't see this of any importance really. Similarly the funcs
that prctl makes use of are also global, in fork.c -- I preferred
leaving things generic for any(?) future user(s), but it could very
well be moved to sys.c.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
CC: Konstantin Khlebnikov <koct9i@gmail.com>
---
 include/linux/mm.h       |  5 +++--
 include/linux/mm_types.h |  1 +
 include/linux/sched.h    |  5 ++---
 kernel/fork.c            | 37 +++++++++++++++++++++++++++---
 kernel/sys.c             | 58 +++++++++++++++++++++++++-----------------------
 5 files changed, 70 insertions(+), 36 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0c0720d..90eae9f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1911,9 +1911,10 @@ extern void mm_drop_all_locks(struct mm_struct *mm);
 
 /* mm->exe_file handling */
 extern void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file);
-extern void set_mm_exe_file_locked(struct mm_struct *mm,
-				   struct file *new_exe_file);
 extern struct file *get_mm_exe_file(struct mm_struct *mm);
+extern bool test_and_set_mm_exe_file(struct mm_struct *mm,
+				     struct file *new_exe_file);
+extern bool mm_exe_file_changed(struct mm_struct *mm);
 
 extern int may_expand_vm(struct mm_struct *mm, unsigned long npages);
 extern struct vm_area_struct *_install_special_mapping(struct mm_struct *mm,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 1fc994e..2d8b06b 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -328,6 +328,7 @@ struct core_state {
 
 struct exe_file {
 	rwlock_t lock;
+	bool changed; /* see prctl_set_mm_exe_file() */
 	struct file *file;
 };
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6d77432..0caf62e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -494,10 +494,9 @@ static inline int get_dumpable(struct mm_struct *mm)
 					/* leave room for more dump flags */
 #define MMF_VM_MERGEABLE	16	/* KSM may merge identical pages */
 #define MMF_VM_HUGEPAGE		17	/* set when VM_HUGEPAGE is set on vma */
-#define MMF_EXE_FILE_CHANGED	18	/* see prctl_set_mm_exe_file() */
 
-#define MMF_HAS_UPROBES		19	/* has uprobes */
-#define MMF_RECALC_UPROBES	20	/* MMF_HAS_UPROBES can be wrong */
+#define MMF_HAS_UPROBES		18	/* has uprobes */
+#define MMF_RECALC_UPROBES	19	/* MMF_HAS_UPROBES can be wrong */
 
 #define MMF_INIT_MASK		(MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK)
 
diff --git a/kernel/fork.c b/kernel/fork.c
index aa0332b..54b0b91 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -675,18 +675,50 @@ void mmput(struct mm_struct *mm)
 }
 EXPORT_SYMBOL_GPL(mmput);
 
-void set_mm_exe_file_locked(struct mm_struct *mm, struct file *new_exe_file)
+/*
+ * exe_file handling is differentiated by the caller's need to
+ * be aware of the file being changed -- which will always require
+ * holding the exe_file lock. As such, the following functions
+ * keep track of this are (currently only used by prctl):
+ *   - test_and_set_mm_exe_file()
+ *   - mm_exe_file_changed()
+ *
+ * The rest of the callers should only stick to:
+ *   - set_mm_exe_file()
+ *   - get_mm_exe_file()
+ */
+bool test_and_set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
 {
+	bool ret = false;
+
+	write_lock(&mm->exe_file.lock);
+	if (mm->exe_file.changed)
+		goto done;
+
 	if (new_exe_file)
 		get_file(new_exe_file);
 	if (mm->exe_file.file)
 		fput(mm->exe_file.file);
 
-	write_lock(&mm->exe_file.lock);
 	mm->exe_file.file = new_exe_file;
+	ret = mm->exe_file.changed = true;
+done:
 	write_unlock(&mm->exe_file.lock);
+	return ret;
 }
 
+bool mm_exe_file_changed(struct mm_struct *mm)
+{
+	bool ret;
+
+	read_lock(&mm->exe_file.lock);
+	ret = mm->exe_file.changed;
+	read_lock(&mm->exe_file.lock);
+
+	return ret;
+}
+
+/* lockless -- see each caller for justification */
 void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
 {
 	if (new_exe_file)
@@ -711,7 +743,6 @@ struct file *get_mm_exe_file(struct mm_struct *mm)
 }
 EXPORT_SYMBOL(get_mm_exe_file);
 
-
 static void dup_mm_exe_file(struct mm_struct *oldmm, struct mm_struct *newmm)
 {
 	/*
diff --git a/kernel/sys.c b/kernel/sys.c
index a4d70f0..a82d0c4 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1649,17 +1649,27 @@ SYSCALL_DEFINE1(umask, int, mask)
 	return mask;
 }
 
-static int __prctl_set_mm_exe_file(struct mm_struct *mm, struct fd exe)
+static int __prctl_set_mm_exe_file(struct mm_struct *mm, struct fd exefd)
 {
-	int err;
 	struct file *exe_file;
 
 	/*
+	 * The symlink can be changed only once, just to disallow arbitrary
+	 * transitions malicious software might bring in. This means one
+	 * could make a snapshot over all processes running and monitor
+	 * /proc/pid/exe changes to notice unusual activity if needed.
+	 */
+	if (mm_exe_file_changed(mm))
+		return -EPERM;
+
+	/*
 	 * Forbid mm->exe_file change if old file still mapped.
 	 */
 	exe_file = get_mm_exe_file(mm);
-	err = -EBUSY;
-	down_write(&mm->mmap_sem);
+	if (!exe_file)
+		goto set_file;
+
+	down_read(&mm->mmap_sem);
 	if (exe_file) {
 		struct vm_area_struct *vma;
 
@@ -1669,44 +1679,36 @@ static int __prctl_set_mm_exe_file(struct mm_struct *mm, struct fd exe)
 			if (path_equal(&vma->vm_file->f_path,
 				       &exe_file->f_path)) {
 				fput(exe_file);
-				goto exit_err;
+				up_read(&mm->mmap_sem);
+				return -EBUSY;
 			}
 		}
 
 		fput(exe_file);
 	}
+	up_read(&mm->mmap_sem);
 
+set_file:
 	/*
-	 * The symlink can be changed only once, just to disallow arbitrary
-	 * transitions malicious software might bring in. This means one
-	 * could make a snapshot over all processes running and monitor
-	 * /proc/pid/exe changes to notice unusual activity if needed.
+	 * Recheck the file state again before setting.
+	 * This grabs a reference to exefd.file.
 	 */
-	err = -EPERM;
-	if (test_and_set_bit(MMF_EXE_FILE_CHANGED, &mm->flags))
-		goto exit_err;
-
-	up_write(&mm->mmap_sem);
-
-	/* this grabs a reference to exe.file */
-	set_mm_exe_file_locked(mm, exe.file);
-	return 0;
-exit_err:
-	up_write(&mm->mmap_sem);
-	return err;
+	if (test_and_set_mm_exe_file(mm, exefd.file))
+		return 0;
+	return -EPERM;
 }
 
 static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
 {
-	struct fd exe;
+	struct fd exefd;
 	struct inode *inode;
 	int err;
 
-	exe = fdget(fd);
-	if (!exe.file)
+	exefd = fdget(fd);
+	if (!exefd.file)
 		return -EBADF;
 
-	inode = file_inode(exe.file);
+	inode = file_inode(exefd.file);
 
 	/*
 	 * Because the original mm->exe_file points to executable file, make
@@ -1715,16 +1717,16 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
 	 */
 	err = -EACCES;
 	if (!S_ISREG(inode->i_mode)	||
-	    exe.file->f_path.mnt->mnt_flags & MNT_NOEXEC)
+	    exefd.file->f_path.mnt->mnt_flags & MNT_NOEXEC)
 		goto exit;
 
 	err = inode_permission(inode, MAY_EXEC);
 	if (err)
 		goto exit;
 
-	err = __prctl_set_mm_exe_file(mm, exe);
+	err = __prctl_set_mm_exe_file(mm, exefd);
 exit:
-	fdput(exe);
+	fdput(exefd);
 	return err;
 }
 
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/4] kernel/fork: use pr_alert() for rss counter bugs
  2015-03-14 22:39 [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization Davidlohr Bueso
                   ` (2 preceding siblings ...)
  2015-03-14 22:39 ` [PATCH 3/4] prctl: move MMF_EXE_FILE_CHANGED into exe_file struct Davidlohr Bueso
@ 2015-03-14 22:39 ` Davidlohr Bueso
  2015-03-16 11:30   ` Konstantin Khlebnikov
  2015-03-15 14:21 ` [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization Oleg Nesterov
  4 siblings, 1 reply; 15+ messages in thread
From: Davidlohr Bueso @ 2015-03-14 22:39 UTC (permalink / raw)
  To: akpm
  Cc: viro, gorcunov, oleg, koct9i, dave, linux-mm, linux-kernel,
	Davidlohr Bueso

... everyone else does.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
CC: Konstantin Khlebnikov <koct9i@gmail.com>
---
 kernel/fork.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 54b0b91..fc5d4f3 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -602,8 +602,8 @@ static void check_mm(struct mm_struct *mm)
 		long x = atomic_long_read(&mm->rss_stat.count[i]);
 
 		if (unlikely(x))
-			printk(KERN_ALERT "BUG: Bad rss-counter state "
-					  "mm:%p idx:%d val:%ld\n", mm, i, x);
+			pr_alert("BUG: Bad rss-counter state "
+				 "mm:%p idx:%d val:%ld\n", mm, i, x);
 	}
 
 	if (atomic_long_read(&mm->nr_ptes))
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] prctl: move MMF_EXE_FILE_CHANGED into exe_file struct
  2015-03-14 22:39 ` [PATCH 3/4] prctl: move MMF_EXE_FILE_CHANGED into exe_file struct Davidlohr Bueso
@ 2015-03-15  2:13   ` Davidlohr Bueso
  0 siblings, 0 replies; 15+ messages in thread
From: Davidlohr Bueso @ 2015-03-15  2:13 UTC (permalink / raw)
  To: akpm; +Cc: viro, gorcunov, oleg, koct9i, linux-mm, linux-kernel

On Sat, 2015-03-14 at 15:39 -0700, Davidlohr Bueso wrote:
> +	if (test_and_set_mm_exe_file(mm, exefd.file))
> +		return 0;
> +	return -EPERM;

Bah, this is obviously bogus. We'd need the following folded in:

diff --git a/kernel/sys.c b/kernel/sys.c
index a82d0c4..41b27bd 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1694,8 +1694,8 @@ set_file:
         * This grabs a reference to exefd.file.
         */
        if (test_and_set_mm_exe_file(mm, exefd.file))
-               return 0;
-       return -EPERM;
+               return -EPERM;
+       return 0;
 }


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization
  2015-03-14 22:39 [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization Davidlohr Bueso
                   ` (3 preceding siblings ...)
  2015-03-14 22:39 ` [PATCH 4/4] kernel/fork: use pr_alert() for rss counter bugs Davidlohr Bueso
@ 2015-03-15 14:21 ` Oleg Nesterov
  2015-03-15 14:54   ` Davidlohr Bueso
  4 siblings, 1 reply; 15+ messages in thread
From: Oleg Nesterov @ 2015-03-15 14:21 UTC (permalink / raw)
  To: Davidlohr Bueso; +Cc: akpm, viro, gorcunov, koct9i, linux-mm, linux-kernel

I didn't even read this version, but honestly I don't like it anyway.

I leave the review to Cyrill and Konstantin though, If they like these
changes I won't argue.

But I simply can't understand why are you doing this.



Yes, this code needs cleanups, I agree. Does this series makes it better?
To me it doesn't, and the diffstat below shows that it blows the code.

In fact, to me it complicates this code. For example. Personally I think
that MMF_EXE_FILE_CHANGED should die. And currently we can just remove it.
Not after your patch which adds another dependency.



Or do you think this is performance improvement? I don't think so. Yes,
prctl() abuses mmap_sem, but this not a hot path and the task can only
abuse its own ->mm.

OK, I agree, dup_mm_exe_file() is horrible. But as I already said it can
simply die. We can move this code into dup_mmap() and avoid another
down_read/up_read.


Hmm. And this series is simply wrong without more changes in audit paths.
Unfortunately this is fixable, but let me NACK at least this version ;)


Speaking of cleanups... IIRC Konstantin suggested to rcuify this pointer
and I agree, this looks better than the new lock. But in fact I think
that the cleanups should start with s/get_mm_exe_file//get_mm_exe_path/.
Note that nobody actually needs "file *", every caller needs "struct path".
Plus kill dup_mm_exe_file().

Oleg.


On 03/14, Davidlohr Bueso wrote:
>
> This is a set I created on top of patch 1/4 which also includes mm_struct cleanups
> and dealing with prctl exe_file functionality. Specific details are in each patch.
> Patch 4 is an extra trivial one I found while going through the code.
>
> Applies on top of next-20150313.
>
> Thanks!
>
> Davidlohr Bueso (4):
>   mm: replace mmap_sem for mm->exe_file serialization
>   mm: introduce struct exe_file
>   prctl: move MMF_EXE_FILE_CHANGED into exe_file struct
>   kernel/fork: use pr_alert() for rss counter bugs
>
>  fs/exec.c                |   6 +++
>  include/linux/mm.h       |   4 ++
>  include/linux/mm_types.h |   8 +++-
>  include/linux/sched.h    |   5 +--
>  kernel/fork.c            |  72 ++++++++++++++++++++++++++------
>  kernel/sys.c             | 106 +++++++++++++++++++++++++++--------------------
>  6 files changed, 141 insertions(+), 60 deletions(-)
>
> --
> 2.1.4
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization
  2015-03-15 14:21 ` [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization Oleg Nesterov
@ 2015-03-15 14:54   ` Davidlohr Bueso
  2015-03-15 15:26     ` Oleg Nesterov
  0 siblings, 1 reply; 15+ messages in thread
From: Davidlohr Bueso @ 2015-03-15 14:54 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: akpm, viro, gorcunov, koct9i, linux-mm, linux-kernel

On Sun, 2015-03-15 at 15:21 +0100, Oleg Nesterov wrote:
> I didn't even read this version, but honestly I don't like it anyway.
> 
> I leave the review to Cyrill and Konstantin though, If they like these
> changes I won't argue.
> 
> But I simply can't understand why are you doing this.
> 
> 
> 
> Yes, this code needs cleanups, I agree. Does this series makes it better?
> To me it doesn't, and the diffstat below shows that it blows the code.

Looking at some of the caller paths now, I have to disagree.

> 
> In fact, to me it complicates this code. For example. Personally I think
> that MMF_EXE_FILE_CHANGED should die. And currently we can just remove it.

How could you remove this? I mean it's user functionality, so you need
some way of keeping track of a changed file. But you might be talking
about something else.

> Not after your patch which adds another dependency.

I don't add another dependency, I just replace the current one.

> 
> Or do you think this is performance improvement? I don't think so. Yes,
> prctl() abuses mmap_sem, but this not a hot path and the task can only
> abuse its own ->mm.

I've tried to make it as clear as possible this is a not performance
patch. I guess I've failed. Let me repeat it again: this is *not*
performance motivated ;) This kind of things under mmap_sem prevents
lock breakup.

> OK, I agree, dup_mm_exe_file() is horrible. But as I already said it can
> simply die. We can move this code into dup_mmap() and avoid another
> down_read/up_read.

If this series goes to the dumpster then ok I'll send a patch for this,
I have no objection.

> 
> Hmm. And this series is simply wrong without more changes in audit paths.
> Unfortunately this is fixable, but let me NACK at least this version ;)

Could you explain this? Are you referring to the audit.c user? If so
that caller has already been updated.

> 
> 
> Speaking of cleanups... IIRC Konstantin suggested to rcuify this pointer
> and I agree, this looks better than the new lock.

Yes, I can do that in patch 1, but as mentioned, rcu is not really the
question to me, it's the lock for when we change the exe file, so if
it's not mmap_sem we'd still need another lock. If mmap_sem is kept, yes
we can use the read lock in things like get_mm_exe_file() and still rely
on the file ref counting so we wouldn't need to do everything under rcu,
which was a though I originally had.

Thanks,
Davidlohr

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization
  2015-03-15 14:54   ` Davidlohr Bueso
@ 2015-03-15 15:26     ` Oleg Nesterov
  2015-03-15 15:42       ` Davidlohr Bueso
  0 siblings, 1 reply; 15+ messages in thread
From: Oleg Nesterov @ 2015-03-15 15:26 UTC (permalink / raw)
  To: Davidlohr Bueso; +Cc: akpm, viro, gorcunov, koct9i, linux-mm, linux-kernel

On 03/15, Davidlohr Bueso wrote:
>
> On Sun, 2015-03-15 at 15:21 +0100, Oleg Nesterov wrote:
> > I didn't even read this version, but honestly I don't like it anyway.
> >
> > I leave the review to Cyrill and Konstantin though, If they like these
> > changes I won't argue.
> >
> > But I simply can't understand why are you doing this.
> >
> >
> >
> > Yes, this code needs cleanups, I agree. Does this series makes it better?
> > To me it doesn't, and the diffstat below shows that it blows the code.
>
> Looking at some of the caller paths now, I have to disagree.

And I believe you are wrong. But let me repeat, I leave this to Cyrill
and Konstantin. Cleanups are always subjective.

> > In fact, to me it complicates this code. For example. Personally I think
> > that MMF_EXE_FILE_CHANGED should die. And currently we can just remove it.
>
> How could you remove this?

Just remove this flag and the test_and_set_bit(MMF_EXE_FILE_CHANGED) check.
Again, this is subjective, but to me it looks ugly. Why do we allow to
change ->exe_file but only once?

> > Not after your patch which adds another dependency.
>
> I don't add another dependency, I just replace the current one.

But you did. If we remove test_and_set_bit(MMF_EXE_FILE_CHANGED)
set_mm_exe_file() becomes racy with your patch. Sure, this is fixable too.

> > Or do you think this is performance improvement? I don't think so. Yes,
> > prctl() abuses mmap_sem, but this not a hot path and the task can only
> > abuse its own ->mm.
>
> I've tried to make it as clear as possible this is a not performance
> patch. I guess I've failed. Let me repeat it again: this is *not*
> performance motivated ;)

OK.

> This kind of things under mmap_sem prevents
> lock breakup.

Could you spell?

> > Hmm. And this series is simply wrong without more changes in audit paths.
> > Unfortunately this is fixable, but let me NACK at least this version ;)
>
> Could you explain this? Are you referring to the audit.c user? If so
> that caller has already been updated.

I do not see these changes in Linus's tree. OK, if those caller's were
already changed somewhere else then unfortunately I can't nack this patch
by technical reasons ;)

But perhaps you should mention that this change depends on other patches
and name them.

> > Speaking of cleanups... IIRC Konstantin suggested to rcuify this pointer
> > and I agree, this looks better than the new lock.
>
> Yes, I can do that in patch 1, but as mentioned, rcu is not really the
> question to me, it's the lock for when we change the exe file, so if
> it's not mmap_sem we'd still need another lock.

Not if we keep MMF_EXE_FILE_CHANGED. See above, we can change it lockless.
And even without MMF_EXE_FILE_CHANGED, we can use xchg().

Oleg.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization
  2015-03-15 15:26     ` Oleg Nesterov
@ 2015-03-15 15:42       ` Davidlohr Bueso
  2015-03-15 17:05         ` Cyrill Gorcunov
  0 siblings, 1 reply; 15+ messages in thread
From: Davidlohr Bueso @ 2015-03-15 15:42 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: akpm, viro, gorcunov, koct9i, linux-mm, linux-kernel

On Sun, 2015-03-15 at 16:26 +0100, Oleg Nesterov wrote:
> On 03/15, Davidlohr Bueso wrote:
> >
> > On Sun, 2015-03-15 at 15:21 +0100, Oleg Nesterov wrote:
> > > I didn't even read this version, but honestly I don't like it anyway.
> > >
> > > I leave the review to Cyrill and Konstantin though, If they like these
> > > changes I won't argue.
> > >
> > > But I simply can't understand why are you doing this.
> > >
> > >
> > >
> > > Yes, this code needs cleanups, I agree. Does this series makes it better?
> > > To me it doesn't, and the diffstat below shows that it blows the code.
> >
> > Looking at some of the caller paths now, I have to disagree.
> 
> And I believe you are wrong. But let me repeat, I leave this to Cyrill
> and Konstantin. Cleanups are always subjective.
> 
> > > In fact, to me it complicates this code. For example. Personally I think
> > > that MMF_EXE_FILE_CHANGED should die. And currently we can just remove it.
> >
> > How could you remove this?
> 
> Just remove this flag and the test_and_set_bit(MMF_EXE_FILE_CHANGED) check.
> Again, this is subjective, but to me it looks ugly. Why do we allow to
> change ->exe_file but only once?

Ok I think I am finally seeing where you are going. And I like it *a
lot* because it allows us to basically replace mmap_sem with rcu
(MMF_EXE_FILE_CHANGED being the only user that requires a lock!!), but
am afraid it might not be possible. I mean currently we have no rule wrt
to users that don't deal with prctl. 

Forbidding multiple exe_file changes to be generic would certainly
change address space semantics, probably for the better (tighter around
security), but changed nonetheless so users would have a right to
complain, no? So if we can get away with removing MMF_EXE_FILE_CHANGED
I'm all for it. Andrew?

Thanks,
Davidlohr

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization
  2015-03-15 15:42       ` Davidlohr Bueso
@ 2015-03-15 17:05         ` Cyrill Gorcunov
  2015-03-15 17:34           ` Davidlohr Bueso
  2015-03-16 22:08           ` Kees Cook
  0 siblings, 2 replies; 15+ messages in thread
From: Cyrill Gorcunov @ 2015-03-15 17:05 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Oleg Nesterov, akpm, viro, koct9i, linux-mm, linux-kernel, Kees Cook

On Sun, Mar 15, 2015 at 08:42:05AM -0700, Davidlohr Bueso wrote:
> > > > Yes, this code needs cleanups, I agree. Does this series makes it better?
> > > > To me it doesn't, and the diffstat below shows that it blows the code.
> > >
> > > Looking at some of the caller paths now, I have to disagree.
> > 
> > And I believe you are wrong. But let me repeat, I leave this to Cyrill
> > and Konstantin. Cleanups are always subjective.
> > 
> > > > In fact, to me it complicates this code. For example. Personally I think
> > > > that MMF_EXE_FILE_CHANGED should die. And currently we can just remove it.
> > >
> > > How could you remove this?
> > 
> > Just remove this flag and the test_and_set_bit(MMF_EXE_FILE_CHANGED) check.
> > Again, this is subjective, but to me it looks ugly. Why do we allow to
> > change ->exe_file but only once?

This came from very first versions of the functionality implemented
in prctl. It supposed to help sysadmins to notice if there exe
transition happened. As to me it doesn't bring much security, if I
would be a virus I would simply replace executing code with ptrace
or via other ways without telling outside world that i've changed
exe path. That said I would happily rip off this MMF_EXE_FILE_CHANGED
bit but I fear security guys won't be that happy about it.
(CC'ing Kees)

As to series as a "cleanup" in general -- we need to measure that
at least it doesn't bring perf downgrade at least.

> Ok I think I am finally seeing where you are going. And I like it *a
> lot* because it allows us to basically replace mmap_sem with rcu
> (MMF_EXE_FILE_CHANGED being the only user that requires a lock!!), but
> am afraid it might not be possible. I mean currently we have no rule wrt
> to users that don't deal with prctl. 
> 
> Forbidding multiple exe_file changes to be generic would certainly
> change address space semantics, probably for the better (tighter around
> security), but changed nonetheless so users would have a right to
> complain, no? So if we can get away with removing MMF_EXE_FILE_CHANGED
> I'm all for it. Andrew?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization
  2015-03-15 17:05         ` Cyrill Gorcunov
@ 2015-03-15 17:34           ` Davidlohr Bueso
  2015-03-16 22:08           ` Kees Cook
  1 sibling, 0 replies; 15+ messages in thread
From: Davidlohr Bueso @ 2015-03-15 17:34 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Oleg Nesterov, akpm, viro, koct9i, linux-mm, linux-kernel,
	Kees Cook, Michael Kerrisk (man-pages),
	Linux API

On Sun, 2015-03-15 at 20:05 +0300, Cyrill Gorcunov wrote:
> On Sun, Mar 15, 2015 at 08:42:05AM -0700, Davidlohr Bueso wrote:
> > > > > Yes, this code needs cleanups, I agree. Does this series makes it better?
> > > > > To me it doesn't, and the diffstat below shows that it blows the code.
> > > >
> > > > Looking at some of the caller paths now, I have to disagree.
> > > 
> > > And I believe you are wrong. But let me repeat, I leave this to Cyrill
> > > and Konstantin. Cleanups are always subjective.
> > > 
> > > > > In fact, to me it complicates this code. For example. Personally I think
> > > > > that MMF_EXE_FILE_CHANGED should die. And currently we can just remove it.
> > > >
> > > > How could you remove this?
> > > 
> > > Just remove this flag and the test_and_set_bit(MMF_EXE_FILE_CHANGED) check.
> > > Again, this is subjective, but to me it looks ugly. Why do we allow to
> > > change ->exe_file but only once?
> 
> This came from very first versions of the functionality implemented
> in prctl. It supposed to help sysadmins to notice if there exe
> transition happened. As to me it doesn't bring much security, if I
> would be a virus I would simply replace executing code with ptrace
> or via other ways without telling outside world that i've changed
> exe path. That said I would happily rip off this MMF_EXE_FILE_CHANGED
> bit but I fear security guys won't be that happy about it.
> (CC'ing Kees)

Also adding Michael for any prctl manpage and api changes.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/4] kernel/fork: use pr_alert() for rss counter bugs
  2015-03-14 22:39 ` [PATCH 4/4] kernel/fork: use pr_alert() for rss counter bugs Davidlohr Bueso
@ 2015-03-16 11:30   ` Konstantin Khlebnikov
  0 siblings, 0 replies; 15+ messages in thread
From: Konstantin Khlebnikov @ 2015-03-16 11:30 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Andrew Morton, Al Viro, Cyrill Gorcunov, Oleg Nesterov, linux-mm,
	Linux Kernel Mailing List, Davidlohr Bueso

On Sun, Mar 15, 2015 at 1:39 AM, Davidlohr Bueso <dave@stgolabs.net> wrote:
> ... everyone else does.
>
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> Cc: Oleg Nesterov <oleg@redhat.com>
> CC: Konstantin Khlebnikov <koct9i@gmail.com>
> ---
>  kernel/fork.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 54b0b91..fc5d4f3 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -602,8 +602,8 @@ static void check_mm(struct mm_struct *mm)
>                 long x = atomic_long_read(&mm->rss_stat.count[i]);
>
>                 if (unlikely(x))
> -                       printk(KERN_ALERT "BUG: Bad rss-counter state "
> -                                         "mm:%p idx:%d val:%ld\n", mm, i, x);
> +                       pr_alert("BUG: Bad rss-counter state "
> +                                "mm:%p idx:%d val:%ld\n", mm, i, x);
>         }

Ack.

>
>         if (atomic_long_read(&mm->nr_ptes))
> --
> 2.1.4
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization
  2015-03-15 17:05         ` Cyrill Gorcunov
  2015-03-15 17:34           ` Davidlohr Bueso
@ 2015-03-16 22:08           ` Kees Cook
  2015-03-20 16:09             ` Cyrill Gorcunov
  1 sibling, 1 reply; 15+ messages in thread
From: Kees Cook @ 2015-03-16 22:08 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Davidlohr Bueso, Oleg Nesterov, Andrew Morton, Al Viro, koct9i,
	Linux-MM, LKML

On Sun, Mar 15, 2015 at 10:05 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> On Sun, Mar 15, 2015 at 08:42:05AM -0700, Davidlohr Bueso wrote:
>> > > > Yes, this code needs cleanups, I agree. Does this series makes it better?
>> > > > To me it doesn't, and the diffstat below shows that it blows the code.
>> > >
>> > > Looking at some of the caller paths now, I have to disagree.
>> >
>> > And I believe you are wrong. But let me repeat, I leave this to Cyrill
>> > and Konstantin. Cleanups are always subjective.
>> >
>> > > > In fact, to me it complicates this code. For example. Personally I think
>> > > > that MMF_EXE_FILE_CHANGED should die. And currently we can just remove it.
>> > >
>> > > How could you remove this?
>> >
>> > Just remove this flag and the test_and_set_bit(MMF_EXE_FILE_CHANGED) check.
>> > Again, this is subjective, but to me it looks ugly. Why do we allow to
>> > change ->exe_file but only once?
>
> This came from very first versions of the functionality implemented
> in prctl. It supposed to help sysadmins to notice if there exe
> transition happened. As to me it doesn't bring much security, if I
> would be a virus I would simply replace executing code with ptrace
> or via other ways without telling outside world that i've changed
> exe path. That said I would happily rip off this MMF_EXE_FILE_CHANGED
> bit but I fear security guys won't be that happy about it.
> (CC'ing Kees)
>
> As to series as a "cleanup" in general -- we need to measure that
> at least it doesn't bring perf downgrade at least.
>
>> Ok I think I am finally seeing where you are going. And I like it *a
>> lot* because it allows us to basically replace mmap_sem with rcu
>> (MMF_EXE_FILE_CHANGED being the only user that requires a lock!!), but
>> am afraid it might not be possible. I mean currently we have no rule wrt
>> to users that don't deal with prctl.
>>
>> Forbidding multiple exe_file changes to be generic would certainly
>> change address space semantics, probably for the better (tighter around
>> security), but changed nonetheless so users would have a right to
>> complain, no? So if we can get away with removing MMF_EXE_FILE_CHANGED
>> I'm all for it. Andrew?

I can't figure out why MMF_EXE_FILE_CHANGED is used to stop a second
change. But it does seem useful to mark a process as "hey, we know for
sure this the exe_file changed on this process" from an accounting
perspective.

And I'd agree about the malware: it would never use this interface, so
there's no security benefit I can see. Maybe I haven't had enough
coffee, though. :)

-Kees

-- 
Kees Cook
Chrome OS Security

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization
  2015-03-16 22:08           ` Kees Cook
@ 2015-03-20 16:09             ` Cyrill Gorcunov
  0 siblings, 0 replies; 15+ messages in thread
From: Cyrill Gorcunov @ 2015-03-20 16:09 UTC (permalink / raw)
  To: Kees Cook
  Cc: Davidlohr Bueso, Oleg Nesterov, Andrew Morton, Al Viro, koct9i,
	Linux-MM, LKML

On Mon, Mar 16, 2015 at 03:08:40PM -0700, Kees Cook wrote:
> >
> >> Ok I think I am finally seeing where you are going. And I like it *a
> >> lot* because it allows us to basically replace mmap_sem with rcu
> >> (MMF_EXE_FILE_CHANGED being the only user that requires a lock!!), but
> >> am afraid it might not be possible. I mean currently we have no rule wrt
> >> to users that don't deal with prctl.
> >>
> >> Forbidding multiple exe_file changes to be generic would certainly
> >> change address space semantics, probably for the better (tighter around
> >> security), but changed nonetheless so users would have a right to
> >> complain, no? So if we can get away with removing MMF_EXE_FILE_CHANGED
> >> I'm all for it. Andrew?
> 
> I can't figure out why MMF_EXE_FILE_CHANGED is used to stop a second
> change. But it does seem useful to mark a process as "hey, we know for
> sure this the exe_file changed on this process" from an accounting
> perspective.

Sure, except it start being more stopper for further development so
ripping it off would help ;)

> 
> And I'd agree about the malware: it would never use this interface, so
> there's no security benefit I can see. Maybe I haven't had enough
> coffee, though. :)

Yes, same here, would never use it either.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2015-03-20 16:09 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-14 22:39 [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization Davidlohr Bueso
2015-03-14 22:39 ` [PATCH 1/4] " Davidlohr Bueso
2015-03-14 22:39 ` [PATCH 2/4] mm: introduce struct exe_file Davidlohr Bueso
2015-03-14 22:39 ` [PATCH 3/4] prctl: move MMF_EXE_FILE_CHANGED into exe_file struct Davidlohr Bueso
2015-03-15  2:13   ` Davidlohr Bueso
2015-03-14 22:39 ` [PATCH 4/4] kernel/fork: use pr_alert() for rss counter bugs Davidlohr Bueso
2015-03-16 11:30   ` Konstantin Khlebnikov
2015-03-15 14:21 ` [PATCH -next v2 0/4] mm: replace mmap_sem for mm->exe_file serialization Oleg Nesterov
2015-03-15 14:54   ` Davidlohr Bueso
2015-03-15 15:26     ` Oleg Nesterov
2015-03-15 15:42       ` Davidlohr Bueso
2015-03-15 17:05         ` Cyrill Gorcunov
2015-03-15 17:34           ` Davidlohr Bueso
2015-03-16 22:08           ` Kees Cook
2015-03-20 16:09             ` Cyrill Gorcunov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).