linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s)
@ 2013-10-13 19:18 Oleg Nesterov
  2013-10-13 19:18 ` [PATCH 1/5] uprobes: Change the callsite of uprobe_copy_process() Oleg Nesterov
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-13 19:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Arapov, David Smith, Frank Ch. Eigler, Martin Cermak,
	Peter Zijlstra, Srikar Dronamraju, linux-kernel

Hello.

Please review, this series fixes the serious bug reported by
Martin and David and cc's stable. See the changelog in 5/5.

Oleg.

 kernel/events/uprobes.c |  144 +++++++++++++++++++++++++++++++++++------------
 kernel/fork.c           |    2 +-
 2 files changed, 109 insertions(+), 37 deletions(-)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/5] uprobes: Change the callsite of uprobe_copy_process()
  2013-10-13 19:18 [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s) Oleg Nesterov
@ 2013-10-13 19:18 ` Oleg Nesterov
  2013-10-16 12:37   ` Srikar Dronamraju
  2013-10-13 19:18 ` [PATCH 2/5] uprobes: Introduce __create_xol_area() Oleg Nesterov
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-13 19:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Arapov, David Smith, Frank Ch. Eigler, Martin Cermak,
	Peter Zijlstra, Srikar Dronamraju, linux-kernel

Preparation for the next patches.

Move the callsite of uprobe_copy_process() in copy_process() down
to the succesfull return. We do not care if copy_process() fails,
uprobe_free_utask() won't be called in this case so the wrong
->utask != NULL doesn't matter.

OTOH, with this change we know that copy_process() can't fail when
uprobe_copy_process() is called, the new task should either return
to user-mode or call do_exit(). This way uprobe_copy_process() can:

	1. setup p->utask != NULL if necessary

	2. setup uprobes_state.xol_area

	3. use task_work_add(p)

Also, move the definition of uprobe_copy_process() down so that it
can see get_utask().

Cc: stable@vger.kernel.org # 3.9+
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/events/uprobes.c |   16 ++++++++--------
 kernel/fork.c           |    2 +-
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index ad8e1bd..db7a1dc 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1345,14 +1345,6 @@ void uprobe_free_utask(struct task_struct *t)
 }
 
 /*
- * Called in context of a new clone/fork from copy_process.
- */
-void uprobe_copy_process(struct task_struct *t)
-{
-	t->utask = NULL;
-}
-
-/*
  * Allocate a uprobe_task object for the task if if necessary.
  * Called when the thread hits a breakpoint.
  *
@@ -1368,6 +1360,14 @@ static struct uprobe_task *get_utask(void)
 }
 
 /*
+ * Called in context of a new clone/fork from copy_process.
+ */
+void uprobe_copy_process(struct task_struct *t)
+{
+	t->utask = NULL;
+}
+
+/*
  * Current area->vaddr notion assume the trampoline address is always
  * equal area->vaddr.
  *
diff --git a/kernel/fork.c b/kernel/fork.c
index 086fe73..d3603b8 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1373,7 +1373,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	INIT_LIST_HEAD(&p->pi_state_list);
 	p->pi_state_cache = NULL;
 #endif
-	uprobe_copy_process(p);
 	/*
 	 * sigaltstack should be cleared when sharing the same VM
 	 */
@@ -1490,6 +1489,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	perf_event_fork(p);
 
 	trace_task_newtask(p, clone_flags);
+	uprobe_copy_process(p);
 
 	return p;
 
-- 
1.5.5.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/5] uprobes: Introduce __create_xol_area()
  2013-10-13 19:18 [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s) Oleg Nesterov
  2013-10-13 19:18 ` [PATCH 1/5] uprobes: Change the callsite of uprobe_copy_process() Oleg Nesterov
@ 2013-10-13 19:18 ` Oleg Nesterov
  2013-10-16 12:41   ` Srikar Dronamraju
  2013-10-13 19:18 ` [PATCH 3/5] uprobes: Teach __create_xol_area() to accept the predefined vaddr Oleg Nesterov
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-13 19:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Arapov, David Smith, Frank Ch. Eigler, Martin Cermak,
	Peter Zijlstra, Srikar Dronamraju, linux-kernel

No functional changes, preparation.

Extract the code which actually allocates/installs the new area
into the new helper, __create_xol_area().

While at it remove the unnecessary "ret = ENOMEM" and "ret = 0"
in xol_add_vma(), they both have no effect.

Cc: stable@vger.kernel.org # 3.9+
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/events/uprobes.c |   49 ++++++++++++++++++++++++----------------------
 1 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index db7a1dc..c09d417 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1096,17 +1096,15 @@ void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned lon
 }
 
 /* Slot allocation for XOL */
-static int xol_add_vma(struct xol_area *area)
+static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
 {
-	struct mm_struct *mm = current->mm;
 	int ret = -EALREADY;
 
 	down_write(&mm->mmap_sem);
 	if (mm->uprobes_state.xol_area)
 		goto fail;
 
-	ret = -ENOMEM;
-	/* Try to map as high as possible, this is only a hint. */
+		/* Try to map as high as possible, this is only a hint. */
 	area->vaddr = get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE, PAGE_SIZE, 0, 0);
 	if (area->vaddr & ~PAGE_MASK) {
 		ret = area->vaddr;
@@ -1120,28 +1118,17 @@ static int xol_add_vma(struct xol_area *area)
 
 	smp_wmb();	/* pairs with get_xol_area() */
 	mm->uprobes_state.xol_area = area;
-	ret = 0;
  fail:
 	up_write(&mm->mmap_sem);
 
 	return ret;
 }
 
-/*
- * get_xol_area - Allocate process's xol_area if necessary.
- * This area will be used for storing instructions for execution out of line.
- *
- * Returns the allocated area or NULL.
- */
-static struct xol_area *get_xol_area(void)
+static struct xol_area *__create_xol_area(void)
 {
 	struct mm_struct *mm = current->mm;
-	struct xol_area *area;
 	uprobe_opcode_t insn = UPROBE_SWBP_INSN;
-
-	area = mm->uprobes_state.xol_area;
-	if (area)
-		goto ret;
+	struct xol_area *area;
 
 	area = kzalloc(sizeof(*area), GFP_KERNEL);
 	if (unlikely(!area))
@@ -1155,13 +1142,13 @@ static struct xol_area *get_xol_area(void)
 	if (!area->page)
 		goto free_bitmap;
 
-	/* allocate first slot of task's xol_area for the return probes */
+	init_waitqueue_head(&area->wq);
+	/* Reserve the 1st slot for get_trampoline_vaddr() */
 	set_bit(0, area->bitmap);
-	copy_to_page(area->page, 0, &insn, UPROBE_SWBP_INSN_SIZE);
 	atomic_set(&area->slot_count, 1);
-	init_waitqueue_head(&area->wq);
+	copy_to_page(area->page, 0, &insn, UPROBE_SWBP_INSN_SIZE);
 
-	if (!xol_add_vma(area))
+	if (!xol_add_vma(mm, area))
 		return area;
 
 	__free_page(area->page);
@@ -1170,9 +1157,25 @@ static struct xol_area *get_xol_area(void)
  free_area:
 	kfree(area);
  out:
+ 	return NULL;
+}
+
+/*
+ * get_xol_area - Allocate process's xol_area if necessary.
+ * This area will be used for storing instructions for execution out of line.
+ *
+ * Returns the allocated area or NULL.
+ */
+static struct xol_area *get_xol_area(void)
+{
+	struct mm_struct *mm = current->mm;
+	struct xol_area *area;
+
+	if (!mm->uprobes_state.xol_area)
+		__create_xol_area();
+
 	area = mm->uprobes_state.xol_area;
- ret:
-	smp_read_barrier_depends();     /* pairs with wmb in xol_add_vma() */
+	smp_read_barrier_depends();	/* pairs with wmb in xol_add_vma() */
 	return area;
 }
 
-- 
1.5.5.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/5] uprobes: Teach __create_xol_area() to accept the predefined vaddr
  2013-10-13 19:18 [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s) Oleg Nesterov
  2013-10-13 19:18 ` [PATCH 1/5] uprobes: Change the callsite of uprobe_copy_process() Oleg Nesterov
  2013-10-13 19:18 ` [PATCH 2/5] uprobes: Introduce __create_xol_area() Oleg Nesterov
@ 2013-10-13 19:18 ` Oleg Nesterov
  2013-10-16 12:43   ` Srikar Dronamraju
  2013-10-13 19:18 ` [PATCH 4/5] uprobes: Change uprobe_copy_process() to dup return_instances Oleg Nesterov
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-13 19:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Arapov, David Smith, Frank Ch. Eigler, Martin Cermak,
	Peter Zijlstra, Srikar Dronamraju, linux-kernel

Currently xol_add_vma() uses get_unmapped_area() for area->vaddr,
but the next patches need to use the fixed address. So this patch
adds the new "vaddr" argument to __create_xol_area() which should
be used as area->vaddr if it is nonzero.

xol_add_vma() doesn't bother to verify that the predefined addr is
not used, insert_vm_struct() should fail if find_vma_links() detects
the overlap with the existing vma.

Also, __create_xol_area() doesn't need __GFP_ZERO to allocate area.

Cc: stable@vger.kernel.org # 3.9+
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/events/uprobes.c |   18 +++++++++++-------
 1 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index c09d417..c836135 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1104,11 +1104,14 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
 	if (mm->uprobes_state.xol_area)
 		goto fail;
 
+	if (!area->vaddr) {
 		/* Try to map as high as possible, this is only a hint. */
-	area->vaddr = get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE, PAGE_SIZE, 0, 0);
-	if (area->vaddr & ~PAGE_MASK) {
-		ret = area->vaddr;
-		goto fail;
+		area->vaddr = get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE,
+						PAGE_SIZE, 0, 0);
+		if (area->vaddr & ~PAGE_MASK) {
+			ret = area->vaddr;
+			goto fail;
+		}
 	}
 
 	ret = install_special_mapping(mm, area->vaddr, PAGE_SIZE,
@@ -1124,13 +1127,13 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
 	return ret;
 }
 
-static struct xol_area *__create_xol_area(void)
+static struct xol_area *__create_xol_area(unsigned long vaddr)
 {
 	struct mm_struct *mm = current->mm;
 	uprobe_opcode_t insn = UPROBE_SWBP_INSN;
 	struct xol_area *area;
 
-	area = kzalloc(sizeof(*area), GFP_KERNEL);
+	area = kmalloc(sizeof(*area), GFP_KERNEL);
 	if (unlikely(!area))
 		goto out;
 
@@ -1142,6 +1145,7 @@ static struct xol_area *__create_xol_area(void)
 	if (!area->page)
 		goto free_bitmap;
 
+	area->vaddr = vaddr;
 	init_waitqueue_head(&area->wq);
 	/* Reserve the 1st slot for get_trampoline_vaddr() */
 	set_bit(0, area->bitmap);
@@ -1172,7 +1176,7 @@ static struct xol_area *get_xol_area(void)
 	struct xol_area *area;
 
 	if (!mm->uprobes_state.xol_area)
-		__create_xol_area();
+		__create_xol_area(0);
 
 	area = mm->uprobes_state.xol_area;
 	smp_read_barrier_depends();	/* pairs with wmb in xol_add_vma() */
-- 
1.5.5.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 4/5] uprobes: Change uprobe_copy_process() to dup return_instances
  2013-10-13 19:18 [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s) Oleg Nesterov
                   ` (2 preceding siblings ...)
  2013-10-13 19:18 ` [PATCH 3/5] uprobes: Teach __create_xol_area() to accept the predefined vaddr Oleg Nesterov
@ 2013-10-13 19:18 ` Oleg Nesterov
  2013-10-14 18:45   ` Peter Zijlstra
  2013-10-16 12:47   ` Srikar Dronamraju
  2013-10-13 19:18 ` [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area Oleg Nesterov
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-13 19:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Arapov, David Smith, Frank Ch. Eigler, Martin Cermak,
	Peter Zijlstra, Srikar Dronamraju, linux-kernel

uprobe_copy_process() assumes that the new child doesn't need
->utask, it should be allocated by demand.

But this is not true if the forking task has the pending ret-
probes, the child should report them as well and thus it needs
the copy of parent's ->return_instances chain. Otherwise the
child crashes when it returns from the probed function.

Note: this change alone doesn't fix the problem, see the next
change.

Cc: stable@vger.kernel.org # 3.9+
Reported-by: Martin Cermak <mcermak@redhat.com>
Reported-by: David Smith <dsmith@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/events/uprobes.c |   43 +++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 43 insertions(+), 0 deletions(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index c836135..67b65c8 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1366,12 +1366,55 @@ static struct uprobe_task *get_utask(void)
 	return current->utask;
 }
 
+static int dup_utask(struct task_struct *t, struct uprobe_task *o_utask)
+{
+	struct uprobe_task *n_utask;
+	struct return_instance **p, *o, *n;
+
+	n_utask = kzalloc(sizeof(struct uprobe_task), GFP_KERNEL);
+	if (!n_utask)
+		return -ENOMEM;
+	t->utask = n_utask;
+
+	p = &n_utask->return_instances;
+	for (o = o_utask->return_instances; o; o = o->next) {
+		n = kmalloc(sizeof(struct return_instance), GFP_KERNEL);
+		if (!n)
+			return -ENOMEM;
+
+		*n = *o;
+		atomic_inc(&n->uprobe->ref);
+		n->next = NULL;
+
+		*p = n;
+		p = &n->next;
+		n_utask->depth++;
+	}
+
+	return 0;
+}
+
+static void uprobe_warn(struct task_struct *t, const char *msg)
+{
+	pr_warn("uprobe: %s:%d failed to %s\n",
+			current->comm, current->pid, msg);
+}
+
 /*
  * Called in context of a new clone/fork from copy_process.
  */
 void uprobe_copy_process(struct task_struct *t)
 {
+	struct uprobe_task *utask = current->utask;
+	struct mm_struct *mm = current->mm;
+
 	t->utask = NULL;
+
+	if (mm == t->mm || !utask || !utask->return_instances)
+		return;
+
+	if (dup_utask(t, utask))
+		return uprobe_warn(t, "dup ret instances");
 }
 
 /*
-- 
1.5.5.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area
  2013-10-13 19:18 [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s) Oleg Nesterov
                   ` (3 preceding siblings ...)
  2013-10-13 19:18 ` [PATCH 4/5] uprobes: Change uprobe_copy_process() to dup return_instances Oleg Nesterov
@ 2013-10-13 19:18 ` Oleg Nesterov
  2013-10-14 14:09   ` Peter Zijlstra
                     ` (2 more replies)
  2013-10-14 18:29 ` [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s) Oleg Nesterov
  2013-10-16 17:39 ` [PATCH 6/5] uprobes: Teach uprobe_copy_process() to handle CLONE_VFORK Oleg Nesterov
  6 siblings, 3 replies; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-13 19:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Arapov, David Smith, Frank Ch. Eigler, Martin Cermak,
	Peter Zijlstra, Srikar Dronamraju, linux-kernel

This finally fixes the serious bug in uretprobes: a forked child
crashes if the parent called fork() with the pending ret probe.

Trivial test-case:

	# perf probe -x /lib/libc.so.6 __fork%return
	# perf record -e probe_libc:__fork perl -le 'fork || print "OK"'

(the child doesn't print "OK", it is killed by SIGSEGV)

If the child returns from the probed function it actually returns
to trampoline_vaddr, because it got the copy of parent's stack
mangled by prepare_uretprobe() when the parent entered this func.

It crashes because a) this address is not mapped and b) until the
previous change it doesn't have the proper->return_instances info.

This means that uprobe_copy_process() has to create xol_area which
has the trampoline slot, and its vaddr should be equal to parent's
xol_area->vaddr.

Unfortunately, uprobe_copy_process() can not simply do
__create_xol_area(child, xol_area->vaddr). This could actually work
but perf_event_mmap() doesn't expect the usage of foreign ->mm. So
we offload this to task_work_run(), and pass the argument via not
yet used utask->vaddr.

We know that this vaddr is fine for install_special_mapping(), the
necessary hole was recently "created" by dup_mmap() which skips the
parent's VM_DONTCOPY area, and nobody else could use the new mm.

Unfortunately, this also means that we can not handle the errors
properly, we obviously can not abort the already completed fork().
So we simply print the warning if GFP_KERNEL allocation (the only
possible reason) fails.

Cc: stable@vger.kernel.org # 3.9+
Reported-by: Martin Cermak <mcermak@redhat.com>
Reported-by: David Smith <dsmith@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/events/uprobes.c |   22 ++++++++++++++++++++++
 1 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 67b65c8..0c5d9d4 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -35,6 +35,7 @@
 #include <linux/kdebug.h>	/* notifier mechanism */
 #include "../../mm/internal.h"	/* munlock_vma_page */
 #include <linux/percpu-rwsem.h>
+#include <linux/task_work.h>
 
 #include <linux/uprobes.h>
 
@@ -1400,6 +1401,17 @@ static void uprobe_warn(struct task_struct *t, const char *msg)
 			current->comm, current->pid, msg);
 }
 
+static void dup_xol_work(struct callback_head *work)
+{
+	kfree(work);
+
+	if (current->flags & PF_EXITING)
+		return;
+
+	if (!__create_xol_area(current->utask->vaddr))
+		uprobe_warn(current, "dup xol area");
+}
+
 /*
  * Called in context of a new clone/fork from copy_process.
  */
@@ -1407,6 +1419,7 @@ void uprobe_copy_process(struct task_struct *t)
 {
 	struct uprobe_task *utask = current->utask;
 	struct mm_struct *mm = current->mm;
+	struct callback_head *work;
 
 	t->utask = NULL;
 
@@ -1415,6 +1428,15 @@ void uprobe_copy_process(struct task_struct *t)
 
 	if (dup_utask(t, utask))
 		return uprobe_warn(t, "dup ret instances");
+
+	/* TODO: move it into the union in uprobe_task */
+	work = kmalloc(sizeof(*work), GFP_KERNEL);
+	if (!work)
+		return uprobe_warn(t, "dup xol area");
+
+	utask->vaddr = mm->uprobes_state.xol_area->vaddr;
+	init_task_work(work, dup_xol_work);
+	task_work_add(t, work, true);
 }
 
 /*
-- 
1.5.5.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area
  2013-10-13 19:18 ` [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area Oleg Nesterov
@ 2013-10-14 14:09   ` Peter Zijlstra
  2013-10-14 14:55     ` Oleg Nesterov
  2013-10-16 12:53   ` Srikar Dronamraju
  2013-10-18 15:49   ` Oleg Nesterov
  2 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2013-10-14 14:09 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, David Smith, Frank Ch. Eigler,
	Martin Cermak, Srikar Dronamraju, linux-kernel

On Sun, Oct 13, 2013 at 09:18:44PM +0200, Oleg Nesterov wrote:
> This finally fixes the serious bug in uretprobes: a forked child
> crashes if the parent called fork() with the pending ret probe.
> 
> Trivial test-case:
> 
> 	# perf probe -x /lib/libc.so.6 __fork%return
> 	# perf record -e probe_libc:__fork perl -le 'fork || print "OK"'
> 
> (the child doesn't print "OK", it is killed by SIGSEGV)
> 
> If the child returns from the probed function it actually returns
> to trampoline_vaddr, because it got the copy of parent's stack
> mangled by prepare_uretprobe() when the parent entered this func.
> 
> It crashes because a) this address is not mapped and b) until the
> previous change it doesn't have the proper->return_instances info.
> 
> This means that uprobe_copy_process() has to create xol_area which
> has the trampoline slot, and its vaddr should be equal to parent's
> xol_area->vaddr.
> 
> Unfortunately, uprobe_copy_process() can not simply do
> __create_xol_area(child, xol_area->vaddr). This could actually work
> but perf_event_mmap() doesn't expect the usage of foreign ->mm. So
> we offload this to task_work_run(), and pass the argument via not
> yet used utask->vaddr.
> 
> We know that this vaddr is fine for install_special_mapping(), the
> necessary hole was recently "created" by dup_mmap() which skips the
> parent's VM_DONTCOPY area, and nobody else could use the new mm.
> 
> Unfortunately, this also means that we can not handle the errors
> properly, we obviously can not abort the already completed fork().
> So we simply print the warning if GFP_KERNEL allocation (the only
> possible reason) fails.

Oh cute.. so we could actually ignore this perf_event_mmap() because we
got it for the parent when we inserted the probe, and the perf tools
assume the child mm layout is identical to the parent layout (it doesn't
actually see the VM_DONTCOPY bit).

So we could add: 'if (vma->vm_mm != current->mm) return;' to
perf_event_mmap() with a very big nasty comment.

That said; should we hide the XOL vma from perf altogether? That is; it
will greatly obfuscate the perf data to get hits from the XOL table as
we've got no means of mapping it back to an instruction.

We could transform the perf IP from XOL areas back to the original
instruction site. The only side effect that has is that since the XOL
code is far more expensive than the original single instruction the
instruction appears excessively more expensive than expected.

Thoughts?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area
  2013-10-14 14:09   ` Peter Zijlstra
@ 2013-10-14 14:55     ` Oleg Nesterov
  2013-10-14 15:47       ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-14 14:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Anton Arapov, David Smith, Frank Ch. Eigler,
	Martin Cermak, Srikar Dronamraju, linux-kernel

On 10/14, Peter Zijlstra wrote:
>
> On Sun, Oct 13, 2013 at 09:18:44PM +0200, Oleg Nesterov wrote:
> > This finally fixes the serious bug in uretprobes: a forked child
> > crashes if the parent called fork() with the pending ret probe.
> > ...
> >
> > Unfortunately, this also means that we can not handle the errors
> > properly, we obviously can not abort the already completed fork().
> > So we simply print the warning if GFP_KERNEL allocation (the only
> > possible reason) fails.
>
> Oh cute.. so we could actually ignore this perf_event_mmap() because we
> got it for the parent when we inserted the probe, and the perf tools
> assume the child mm layout is identical to the parent layout (it doesn't
> actually see the VM_DONTCOPY bit).
>
> So we could add: 'if (vma->vm_mm != current->mm) return;' to
> perf_event_mmap() with a very big nasty comment.

Perhaps. I can't really comment, but this is really nasty. I mean, this
simply doesn't look good. perf_event_mmap() will be reported or not
depending on how/why the task creates xol_area.

> That said; should we hide the XOL vma from perf altogether? That is; it
> will greatly obfuscate the perf data to get hits from the XOL table as
> we've got no means of mapping it back to an instruction.

Again, I can't really comment. But this creates the special case. OK,
xol_area is "special" anon mapping anyway, but still. And of course
this needs __install_special_mapping().

So I'd prefer to push these changes as is for now. GFP_KERNEL should
"never" fail and we need the fix for stable.

I agree, in the long term we should fix the inability to handle the
errors correctly. But this needs more changes and more uprobes hooks.
To simplify, suppose that we simply remove perf_event_mmap() from
install_special_mapping() (yes, wes, we cant'). Then we should:

	1. revert 1/5, it already moves uprobe_copy_process() to the
	   point-of-no-return (for 4-5).

	2. uprobe_copy_process() can avoid task_work_add() _and_ it can
	   return an error if dup_utask/dup_xol fails.

	3. However, 2. means that we need to handle the potential errors
	   after uprobe_copy_process() suceeds. This means we need, say,
	   uprobe_abort_fork() somewhere near bad_fork_cleanup_mm.

So will you agree with task_work for now?

Oleg.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area
  2013-10-14 14:55     ` Oleg Nesterov
@ 2013-10-14 15:47       ` Peter Zijlstra
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2013-10-14 15:47 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, David Smith, Frank Ch. Eigler,
	Martin Cermak, Srikar Dronamraju, linux-kernel

On Mon, Oct 14, 2013 at 04:55:39PM +0200, Oleg Nesterov wrote:
> So will you agree with task_work for now?

oh sure, these suggestions were more a where next thing. because
 the current state is very much undesirable any which way. 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s)
  2013-10-13 19:18 [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s) Oleg Nesterov
                   ` (4 preceding siblings ...)
  2013-10-13 19:18 ` [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area Oleg Nesterov
@ 2013-10-14 18:29 ` Oleg Nesterov
  2013-10-16 17:38   ` Oleg Nesterov
  2013-10-16 17:39 ` [PATCH 6/5] uprobes: Teach uprobe_copy_process() to handle CLONE_VFORK Oleg Nesterov
  6 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-14 18:29 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Arapov, David Smith, Frank Ch. Eigler, Martin Cermak,
	Peter Zijlstra, Srikar Dronamraju, linux-kernel

On 10/13, Oleg Nesterov wrote:
>
> Please review, this series fixes the serious bug reported by
> Martin and David and cc's stable. See the changelog in 5/5.

Forgot to mention...

This probably needs another patch to handle the special case, vfork().
In this case it would be more correct to dup return_instances but
(obviously) avoid dup_xol_area.

However I think this is not that important, the child should not "unwind"
the stack if it shares mm/stack with its parent, otherwise it can corrupt
the parent's stack.

So I think that this change (which needs to pass "clone_flags" to
uprobe_copy_process) is not the stable material. But I'll update the
changelog in 4/5 to explain that "mm == t->mm" check should be updated.

Oleg.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 4/5] uprobes: Change uprobe_copy_process() to dup return_instances
  2013-10-13 19:18 ` [PATCH 4/5] uprobes: Change uprobe_copy_process() to dup return_instances Oleg Nesterov
@ 2013-10-14 18:45   ` Peter Zijlstra
  2013-10-14 19:00     ` Oleg Nesterov
  2013-10-16 12:47   ` Srikar Dronamraju
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2013-10-14 18:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, David Smith, Frank Ch. Eigler,
	Martin Cermak, Srikar Dronamraju, linux-kernel

On Sun, Oct 13, 2013 at 09:18:41PM +0200, Oleg Nesterov wrote:
> uprobe_copy_process() assumes that the new child doesn't need
> ->utask, it should be allocated by demand.
> 
> But this is not true if the forking task has the pending ret-
> probes, the child should report them as well and thus it needs
> the copy of parent's ->return_instances chain. Otherwise the
> child crashes when it returns from the probed function.

So children don't automagically inherit the same probes (only though the
high level interface -- like perf), so wouldn't simply fixing up the
child stack be a solution?

If not; its not entirely clear to my why this isn't a good solution
based on these changelogs.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 4/5] uprobes: Change uprobe_copy_process() to dup return_instances
  2013-10-14 18:45   ` Peter Zijlstra
@ 2013-10-14 19:00     ` Oleg Nesterov
  0 siblings, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-14 19:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Anton Arapov, David Smith, Frank Ch. Eigler,
	Martin Cermak, Srikar Dronamraju, linux-kernel

On 10/14, Peter Zijlstra wrote:
>
> On Sun, Oct 13, 2013 at 09:18:41PM +0200, Oleg Nesterov wrote:
> > uprobe_copy_process() assumes that the new child doesn't need
> > ->utask, it should be allocated by demand.
> >
> > But this is not true if the forking task has the pending ret-
> > probes, the child should report them as well and thus it needs
> > the copy of parent's ->return_instances chain. Otherwise the
> > child crashes when it returns from the probed function.
>
> So children don't automagically inherit the same probes

They actually do. And in this case they also "inherit" the fact that
the probed function was called, even if the forked child didn't do
this actually.

> so wouldn't simply fixing up the
> child stack be a solution?

This was plan A ;)

> If not; its not entirely clear to my why this isn't a good solution

Tthis doesn't look correct, although "correct" is subjective and we
never tried to enforce the rules before. But at least stap wants to
see the reports from the child.

Another reason is that this needs the arch-specific changes/hooks.
Say, I simply do not know how we can "revert" the effect of
"regs->link = trampoline_vaddr" on powerpc, this looks simply
impossible.

And even on x86 we either need __access_remote_vm() from copy_process()
or or dup_utask() + task_work_run() so that the child can do this itself.

(plus we also need to change prepare_uretprobe(), say, on x86 it should
 record regs->sp in return_instance, but this is minor).

> based on these changelogs.

Note the "the child should report them as well"... but yes, agreed,
I will update the changelog.

Oleg.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/5] uprobes: Change the callsite of uprobe_copy_process()
  2013-10-13 19:18 ` [PATCH 1/5] uprobes: Change the callsite of uprobe_copy_process() Oleg Nesterov
@ 2013-10-16 12:37   ` Srikar Dronamraju
  0 siblings, 0 replies; 22+ messages in thread
From: Srikar Dronamraju @ 2013-10-16 12:37 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, David Smith, Frank Ch. Eigler,
	Martin Cermak, Peter Zijlstra, linux-kernel

* Oleg Nesterov <oleg@redhat.com> [2013-10-13 21:18:31]:

> Preparation for the next patches.
> 
> Move the callsite of uprobe_copy_process() in copy_process() down
> to the succesfull return. We do not care if copy_process() fails,
> uprobe_free_utask() won't be called in this case so the wrong
> ->utask != NULL doesn't matter.
> 
> OTOH, with this change we know that copy_process() can't fail when
> uprobe_copy_process() is called, the new task should either return
> to user-mode or call do_exit(). This way uprobe_copy_process() can:
> 
> 	1. setup p->utask != NULL if necessary
> 
> 	2. setup uprobes_state.xol_area
> 
> 	3. use task_work_add(p)
> 
> Also, move the definition of uprobe_copy_process() down so that it
> can see get_utask().
> 
> Cc: stable@vger.kernel.org # 3.9+
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>


-- 
Thanks and Regards
Srikar Dronamraju


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/5] uprobes: Introduce __create_xol_area()
  2013-10-13 19:18 ` [PATCH 2/5] uprobes: Introduce __create_xol_area() Oleg Nesterov
@ 2013-10-16 12:41   ` Srikar Dronamraju
  2013-10-16 12:50     ` Srikar Dronamraju
  0 siblings, 1 reply; 22+ messages in thread
From: Srikar Dronamraju @ 2013-10-16 12:41 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, David Smith, Frank Ch. Eigler,
	Martin Cermak, Peter Zijlstra, linux-kernel

* Oleg Nesterov <oleg@redhat.com> [2013-10-13 21:18:35]:

> No functional changes, preparation.
> 
> Extract the code which actually allocates/installs the new area
> into the new helper, __create_xol_area().
> 
> While at it remove the unnecessary "ret = ENOMEM" and "ret = 0"
> in xol_add_vma(), they both have no effect.
> 
> Cc: stable@vger.kernel.org # 3.9+
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

One nit below.

> 
>  /* Slot allocation for XOL */
> -static int xol_add_vma(struct xol_area *area)
> +static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
>  {
> -	struct mm_struct *mm = current->mm;
>  	int ret = -EALREADY;
> 
>  	down_write(&mm->mmap_sem);
>  	if (mm->uprobes_state.xol_area)
>  		goto fail;
> 
> -	ret = -ENOMEM;
> -	/* Try to map as high as possible, this is only a hint. */
> +		/* Try to map as high as possible, this is only a hint. */

Nit: This comment seems to be shifted unnecessarily.


-- 
Thanks and Regards
Srikar Dronamraju


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/5] uprobes: Teach __create_xol_area() to accept the predefined vaddr
  2013-10-13 19:18 ` [PATCH 3/5] uprobes: Teach __create_xol_area() to accept the predefined vaddr Oleg Nesterov
@ 2013-10-16 12:43   ` Srikar Dronamraju
  0 siblings, 0 replies; 22+ messages in thread
From: Srikar Dronamraju @ 2013-10-16 12:43 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, David Smith, Frank Ch. Eigler,
	Martin Cermak, Peter Zijlstra, linux-kernel

* Oleg Nesterov <oleg@redhat.com> [2013-10-13 21:18:38]:

> Currently xol_add_vma() uses get_unmapped_area() for area->vaddr,
> but the next patches need to use the fixed address. So this patch
> adds the new "vaddr" argument to __create_xol_area() which should
> be used as area->vaddr if it is nonzero.
> 
> xol_add_vma() doesn't bother to verify that the predefined addr is
> not used, insert_vm_struct() should fail if find_vma_links() detects
> the overlap with the existing vma.
> 
> Also, __create_xol_area() doesn't need __GFP_ZERO to allocate area.
> 
> Cc: stable@vger.kernel.org # 3.9+
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

-- 
Thanks and Regards
Srikar Dronamraju


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 4/5] uprobes: Change uprobe_copy_process() to dup return_instances
  2013-10-13 19:18 ` [PATCH 4/5] uprobes: Change uprobe_copy_process() to dup return_instances Oleg Nesterov
  2013-10-14 18:45   ` Peter Zijlstra
@ 2013-10-16 12:47   ` Srikar Dronamraju
  1 sibling, 0 replies; 22+ messages in thread
From: Srikar Dronamraju @ 2013-10-16 12:47 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, David Smith, Frank Ch. Eigler,
	Martin Cermak, Peter Zijlstra, linux-kernel

* Oleg Nesterov <oleg@redhat.com> [2013-10-13 21:18:41]:

> uprobe_copy_process() assumes that the new child doesn't need
> ->utask, it should be allocated by demand.
> 
> But this is not true if the forking task has the pending ret-
> probes, the child should report them as well and thus it needs
> the copy of parent's ->return_instances chain. Otherwise the
> child crashes when it returns from the probed function.
> 
> Note: this change alone doesn't fix the problem, see the next
> change.
> 
> Cc: stable@vger.kernel.org # 3.9+
> Reported-by: Martin Cermak <mcermak@redhat.com>
> Reported-by: David Smith <dsmith@redhat.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>


-- 
Thanks and Regards
Srikar Dronamraju


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/5] uprobes: Introduce __create_xol_area()
  2013-10-16 12:41   ` Srikar Dronamraju
@ 2013-10-16 12:50     ` Srikar Dronamraju
  0 siblings, 0 replies; 22+ messages in thread
From: Srikar Dronamraju @ 2013-10-16 12:50 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, David Smith, Frank Ch. Eigler,
	Martin Cermak, Peter Zijlstra, linux-kernel

> 
> > 
> >  /* Slot allocation for XOL */
> > -static int xol_add_vma(struct xol_area *area)
> > +static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
> >  {
> > -	struct mm_struct *mm = current->mm;
> >  	int ret = -EALREADY;
> > 
> >  	down_write(&mm->mmap_sem);
> >  	if (mm->uprobes_state.xol_area)
> >  		goto fail;
> > 
> > -	ret = -ENOMEM;
> > -	/* Try to map as high as possible, this is only a hint. */
> > +		/* Try to map as high as possible, this is only a hint. */
> 
> Nit: This comment seems to be shifted unnecessarily.
> 

This gets corrected in the next patch.. Sorry for the noise.

-- 
Thanks and Regards
Srikar Dronamraju


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area
  2013-10-13 19:18 ` [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area Oleg Nesterov
  2013-10-14 14:09   ` Peter Zijlstra
@ 2013-10-16 12:53   ` Srikar Dronamraju
  2013-10-16 16:09     ` Oleg Nesterov
  2013-10-18 15:49   ` Oleg Nesterov
  2 siblings, 1 reply; 22+ messages in thread
From: Srikar Dronamraju @ 2013-10-16 12:53 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Anton Arapov, David Smith, Frank Ch. Eigler,
	Martin Cermak, Peter Zijlstra, linux-kernel

* Oleg Nesterov <oleg@redhat.com> [2013-10-13 21:18:44]:

> This finally fixes the serious bug in uretprobes: a forked child
> crashes if the parent called fork() with the pending ret probe.
> 
> Trivial test-case:
> 
> 	# perf probe -x /lib/libc.so.6 __fork%return
> 	# perf record -e probe_libc:__fork perl -le 'fork || print "OK"'
> 
> (the child doesn't print "OK", it is killed by SIGSEGV)
> 
> If the child returns from the probed function it actually returns
> to trampoline_vaddr, because it got the copy of parent's stack
> mangled by prepare_uretprobe() when the parent entered this func.
> 
> It crashes because a) this address is not mapped and b) until the
> previous change it doesn't have the proper->return_instances info.
> 
> This means that uprobe_copy_process() has to create xol_area which
> has the trampoline slot, and its vaddr should be equal to parent's
> xol_area->vaddr.
> 
> Unfortunately, uprobe_copy_process() can not simply do
> __create_xol_area(child, xol_area->vaddr). This could actually work
> but perf_event_mmap() doesn't expect the usage of foreign ->mm. So
> we offload this to task_work_run(), and pass the argument via not
> yet used utask->vaddr.
> 
> We know that this vaddr is fine for install_special_mapping(), the
> necessary hole was recently "created" by dup_mmap() which skips the
> parent's VM_DONTCOPY area, and nobody else could use the new mm.


I was actually thinking if we can remove the VM_DONTCOPY from
install_special_mapping, But there are obvious issues with that approach
+ lot more house keeping. So your approach is much much better.

> 
> Unfortunately, this also means that we can not handle the errors
> properly, we obviously can not abort the already completed fork().
> So we simply print the warning if GFP_KERNEL allocation (the only
> possible reason) fails.
> 
> Cc: stable@vger.kernel.org # 3.9+
> Reported-by: Martin Cermak <mcermak@redhat.com>
> Reported-by: David Smith <dsmith@redhat.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

-- 
Thanks and Regards
Srikar Dronamraju


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area
  2013-10-16 12:53   ` Srikar Dronamraju
@ 2013-10-16 16:09     ` Oleg Nesterov
  0 siblings, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-16 16:09 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Ingo Molnar, Anton Arapov, David Smith, Frank Ch. Eigler,
	Martin Cermak, Peter Zijlstra, linux-kernel

On 10/16, Srikar Dronamraju wrote:
>
> * Oleg Nesterov <oleg@redhat.com> [2013-10-13 21:18:44]:
>
> > We know that this vaddr is fine for install_special_mapping(), the
> > necessary hole was recently "created" by dup_mmap() which skips the
> > parent's VM_DONTCOPY area, and nobody else could use the new mm.
>
>
> I was actually thinking if we can remove the VM_DONTCOPY from
> install_special_mapping,

I considered this option. I even thought about playing with vm_flags
in uprobe_start/stop_dup_mmap ;)

> But there are obvious issues with that approach

and we simply can't do this. Unlike, say, vdso xol vma can not be
cloned automatically, we obviously can't use the same area->page.
Plus we need to dup area itself, at least for ->bitmap.

> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

Thanks!

Oleg.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s)
  2013-10-14 18:29 ` [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s) Oleg Nesterov
@ 2013-10-16 17:38   ` Oleg Nesterov
  0 siblings, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-16 17:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Arapov, David Smith, Frank Ch. Eigler, Martin Cermak,
	Peter Zijlstra, Srikar Dronamraju, linux-kernel

On 10/14, Oleg Nesterov wrote:
>
> On 10/13, Oleg Nesterov wrote:
> >
> > Please review, this series fixes the serious bug reported by
> > Martin and David and cc's stable. See the changelog in 5/5.

OK, nobody seems to object, I am going to ask Ingo to pull this fix.

But,

> This probably needs another patch to handle the special case, vfork().
> In this case it would be more correct to dup return_instances but
> (obviously) avoid dup_xol_area.
>
> However I think this is not that important, the child should not "unwind"
> the stack if it shares mm/stack with its parent, otherwise it can corrupt
> the parent's stack.

Yes, but I forgot that at least the child should return from vfork()
itself and it can be ret-probed.

So I am sending the additional 6/5 in reply to 0/5. This change can
be joined with 1/5, but I'd prefer to do this in a separate patch for
better documentation.

Oleg.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 6/5] uprobes: Teach uprobe_copy_process() to handle CLONE_VFORK
  2013-10-13 19:18 [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s) Oleg Nesterov
                   ` (5 preceding siblings ...)
  2013-10-14 18:29 ` [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s) Oleg Nesterov
@ 2013-10-16 17:39 ` Oleg Nesterov
  6 siblings, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-16 17:39 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Arapov, David Smith, Frank Ch. Eigler, Martin Cermak,
	Peter Zijlstra, Srikar Dronamraju, linux-kernel

uprobe_copy_process() does nothing if the child shares ->mm with
the forking process, but there is a special case: CLONE_VFORK.
In this case it would be more correct to do dup_utask() but avoid
dup_xol(). This is not that important, the child should not unwind
its stack too much, this can corrupt the parent's stack, but at
least we need this to allow to ret-probe __vfork() itself.

Note: in theory, it would be better to check task_pt_regs(p)->sp
instead of CLONE_VFORK, we need to dup_utask() if and only if the
child can return from the function called by the parent. But this
needs the arch-dependant helper, and I think that nobody actually
does clone(same_stack, CLONE_VM).

Cc: stable@vger.kernel.org # 3.9+
Reported-by: Martin Cermak <mcermak@redhat.com>
Reported-by: David Smith <dsmith@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/uprobes.h |    4 ++--
 kernel/events/uprobes.c |   10 ++++++++--
 kernel/fork.c           |    2 +-
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index 06f28be..13a7f13 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -117,7 +117,7 @@ extern void uprobe_start_dup_mmap(void);
 extern void uprobe_end_dup_mmap(void);
 extern void uprobe_dup_mmap(struct mm_struct *oldmm, struct mm_struct *newmm);
 extern void uprobe_free_utask(struct task_struct *t);
-extern void uprobe_copy_process(struct task_struct *t);
+extern void uprobe_copy_process(struct task_struct *t, unsigned long flags);
 extern unsigned long __weak uprobe_get_swbp_addr(struct pt_regs *regs);
 extern int uprobe_post_sstep_notifier(struct pt_regs *regs);
 extern int uprobe_pre_sstep_notifier(struct pt_regs *regs);
@@ -174,7 +174,7 @@ static inline unsigned long uprobe_get_swbp_addr(struct pt_regs *regs)
 static inline void uprobe_free_utask(struct task_struct *t)
 {
 }
-static inline void uprobe_copy_process(struct task_struct *t)
+static inline void uprobe_copy_process(struct task_struct *t, unsigned long flags)
 {
 }
 static inline void uprobe_clear_state(struct mm_struct *mm)
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 0c5d9d4..a18dcb6 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1415,7 +1415,7 @@ static void dup_xol_work(struct callback_head *work)
 /*
  * Called in context of a new clone/fork from copy_process.
  */
-void uprobe_copy_process(struct task_struct *t)
+void uprobe_copy_process(struct task_struct *t, unsigned long flags)
 {
 	struct uprobe_task *utask = current->utask;
 	struct mm_struct *mm = current->mm;
@@ -1423,12 +1423,18 @@ void uprobe_copy_process(struct task_struct *t)
 
 	t->utask = NULL;
 
-	if (mm == t->mm || !utask || !utask->return_instances)
+	if (!utask || !utask->return_instances)
+		return;
+
+	if (mm == t->mm && !(flags & CLONE_VFORK))
 		return;
 
 	if (dup_utask(t, utask))
 		return uprobe_warn(t, "dup ret instances");
 
+	if (mm == t->mm)
+		return;
+
 	/* TODO: move it into the union in uprobe_task */
 	work = kmalloc(sizeof(*work), GFP_KERNEL);
 	if (!work)
diff --git a/kernel/fork.c b/kernel/fork.c
index d3603b8..8531609 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1489,7 +1489,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	perf_event_fork(p);
 
 	trace_task_newtask(p, clone_flags);
-	uprobe_copy_process(p);
+	uprobe_copy_process(p, clone_flags);
 
 	return p;
 
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area
  2013-10-13 19:18 ` [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area Oleg Nesterov
  2013-10-14 14:09   ` Peter Zijlstra
  2013-10-16 12:53   ` Srikar Dronamraju
@ 2013-10-18 15:49   ` Oleg Nesterov
  2 siblings, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2013-10-18 15:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Arapov, David Smith, Frank Ch. Eigler, Martin Cermak,
	Peter Zijlstra, Srikar Dronamraju, linux-kernel

On 10/13, Oleg Nesterov wrote:
>
> Unfortunately, uprobe_copy_process() can not simply do
> __create_xol_area(child, xol_area->vaddr). This could actually work
> but perf_event_mmap() doesn't expect the usage of foreign ->mm. So
> we offload this to task_work_run(), and pass the argument via not
> yet used utask->vaddr.

OK, this patch needs a fix, I'll send v2 in a minute.


> +	work = kmalloc(sizeof(*work), GFP_KERNEL);
> +	if (!work)
> +		return uprobe_warn(t, "dup xol area");
> +
> +	utask->vaddr = mm->uprobes_state.xol_area->vaddr;

Yes, currently utask->return_instances && !uprobes_state.xol_area
is not possible.

> +	init_task_work(work, dup_xol_work);
> +	task_work_add(t, work, true);

But if dup_xol_work() fails and the child does another fork(), it
can hit area == NULL, so we need to check this.

Oleg.


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2013-10-18 15:57 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-13 19:18 [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s) Oleg Nesterov
2013-10-13 19:18 ` [PATCH 1/5] uprobes: Change the callsite of uprobe_copy_process() Oleg Nesterov
2013-10-16 12:37   ` Srikar Dronamraju
2013-10-13 19:18 ` [PATCH 2/5] uprobes: Introduce __create_xol_area() Oleg Nesterov
2013-10-16 12:41   ` Srikar Dronamraju
2013-10-16 12:50     ` Srikar Dronamraju
2013-10-13 19:18 ` [PATCH 3/5] uprobes: Teach __create_xol_area() to accept the predefined vaddr Oleg Nesterov
2013-10-16 12:43   ` Srikar Dronamraju
2013-10-13 19:18 ` [PATCH 4/5] uprobes: Change uprobe_copy_process() to dup return_instances Oleg Nesterov
2013-10-14 18:45   ` Peter Zijlstra
2013-10-14 19:00     ` Oleg Nesterov
2013-10-16 12:47   ` Srikar Dronamraju
2013-10-13 19:18 ` [PATCH 5/5] uprobes: Change uprobe_copy_process() to dup xol_area Oleg Nesterov
2013-10-14 14:09   ` Peter Zijlstra
2013-10-14 14:55     ` Oleg Nesterov
2013-10-14 15:47       ` Peter Zijlstra
2013-10-16 12:53   ` Srikar Dronamraju
2013-10-16 16:09     ` Oleg Nesterov
2013-10-18 15:49   ` Oleg Nesterov
2013-10-14 18:29 ` [PATCH 0/5] uprobes: fix fork() with the pending ret-probe(s) Oleg Nesterov
2013-10-16 17:38   ` Oleg Nesterov
2013-10-16 17:39 ` [PATCH 6/5] uprobes: Teach uprobe_copy_process() to handle CLONE_VFORK Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).