linux-kernel.vger.kernel.org archive mirror
* [PATCH v6 0/6] task_work_add: generic process-context callbacks
@ 2012-04-19 23:14 Oleg Nesterov
  2012-04-19 23:14 ` [PATCH v6 1/6] " Oleg Nesterov
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: Oleg Nesterov @ 2012-04-19 23:14 UTC (permalink / raw)
  To: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner
  Cc: Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Richard Kuo,
	Tejun Heo, linux-arch, linux-kernel

Hello.

The final-final-final version. Andrew, I hope you can take it.

Compared to v5, this series adds two new patches, 5/6 and 6/6,
which remove the dead code.

While doing 5/6 I noticed that arch/hexagon/kernel/signal.c
lacks the tracehook_notify_resume() call; this one-liner comes as 3/6.

The patches from v5 were not changed.

To all: sorry for this endless spam. It seems that nobody objects,
I'll trim CC if I need to send v7.

Oleg.



* [PATCH v6 1/6] task_work_add: generic process-context callbacks
  2012-04-19 23:14 [PATCH v6 0/6] task_work_add: generic process-context callbacks Oleg Nesterov
@ 2012-04-19 23:14 ` Oleg Nesterov
  2012-04-19 23:15 ` [PATCH v6 2/6] genirq: reimplement exit_irq_thread() hook via task_work_add() Oleg Nesterov
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2012-04-19 23:14 UTC (permalink / raw)
  To: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner
  Cc: Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Richard Kuo,
	Tejun Heo, linux-arch, linux-kernel

Provide a simple mechanism that allows running code in the
(nonatomic) context of an arbitrary task.

The caller does task_work_add(task, task_work) and this task
executes task_work->func() either from do_notify_resume() or
from do_exit(). The callback can rely on PF_EXITING to detect
the latter case.

"struct task_work" can be embedded in another struct, but it
also has "void *data" to handle the most common/simple case.

This allows us to kill the ->replacement_session_keyring hack,
and potentially this can have more users.

Performance-wise, this adds 2 "unlikely(!hlist_empty())" checks
into tracehook_notify_resume() and do_exit(). But at the same
time we can remove the "replacement_session_keyring != NULL"
checks from arch/*/signal.c and exit_creds().

Note: task_work_add/task_work_run abuse ->pi_lock. This is
only because this lock is already used by lookup_pi_state() to
synchronize with do_exit() setting PF_EXITING. Fortunately the
scope of this lock in task_work.c is really tiny, and the code
path is unlikely anyway.

v2:
	- implement task_work_cancel(func), it removes the first
	  task_work with the same callback.
v3:
	- task_work_add() gets a new arg, "bool notify", to
	  conditionalize set_notify_resume(). This makes it usable
	  for kthreads, and task_work_add(notify => false) can
	  work without TIF_NOTIFY_RESUME.

	- don't add the dummy "ifndef TIF_NOTIFY_RESUME" inlines,
	  just add the simple check in task_work_add().
v4:
	- s/task_work_queue/task_work_add/
v5:
	- task_work_run() uses current explicitly

Todo:
	- move clear_thread_flag(TIF_NOTIFY_RESUME) from arch/
	  to tracehook_notify_resume()

	- rename tracehook_notify_resume() and move it into
	  linux/task_work.h

	- m68k and xtensa don't have TIF_NOTIFY_RESUME and thus
	  task_work_add(notify => true) fails with -ENOTSUPP.

	  However, ->replacement_session_keyring equally needs
	  this flag, task_work_add() is not worse.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/sched.h     |    2 +
 include/linux/task_work.h |   33 +++++++++++++++++
 include/linux/tracehook.h |   13 ++++++-
 kernel/Makefile           |    2 +-
 kernel/exit.c             |    5 ++-
 kernel/fork.c             |    1 +
 kernel/task_work.c        |   84 +++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 136 insertions(+), 4 deletions(-)
 create mode 100644 include/linux/task_work.h
 create mode 100644 kernel/task_work.c

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 81a173c..be004ac 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1445,6 +1445,8 @@ struct task_struct {
 	int (*notifier)(void *priv);
 	void *notifier_data;
 	sigset_t *notifier_mask;
+	struct hlist_head task_works;
+
 	struct audit_context *audit_context;
 #ifdef CONFIG_AUDITSYSCALL
 	uid_t loginuid;
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
new file mode 100644
index 0000000..294d5d5
--- /dev/null
+++ b/include/linux/task_work.h
@@ -0,0 +1,33 @@
+#ifndef _LINUX_TASK_WORK_H
+#define _LINUX_TASK_WORK_H
+
+#include <linux/list.h>
+#include <linux/sched.h>
+
+struct task_work;
+typedef void (*task_work_func_t)(struct task_work *);
+
+struct task_work {
+	struct hlist_node hlist;
+	task_work_func_t func;
+	void *data;
+};
+
+static inline void
+init_task_work(struct task_work *twork, task_work_func_t func, void *data)
+{
+	twork->func = func;
+	twork->data = data;
+}
+
+int task_work_add(struct task_struct *task, struct task_work *twork, bool);
+struct task_work *task_work_cancel(struct task_struct *, task_work_func_t);
+void task_work_run(void);
+
+static inline void exit_task_work(struct task_struct *task)
+{
+	if (unlikely(!hlist_empty(&task->task_works)))
+		task_work_run();
+}
+
+#endif	/* _LINUX_TASK_WORK_H */
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index 51bd91d..6a4d82b 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -49,6 +49,7 @@
 #include <linux/sched.h>
 #include <linux/ptrace.h>
 #include <linux/security.h>
+#include <linux/task_work.h>
 struct linux_binprm;
 
 /*
@@ -153,7 +154,6 @@ static inline void tracehook_signal_handler(int sig, siginfo_t *info,
 		ptrace_notify(SIGTRAP);
 }
 
-#ifdef TIF_NOTIFY_RESUME
 /**
  * set_notify_resume - cause tracehook_notify_resume() to be called
  * @task:		task that will call tracehook_notify_resume()
@@ -165,8 +165,10 @@ static inline void tracehook_signal_handler(int sig, siginfo_t *info,
  */
 static inline void set_notify_resume(struct task_struct *task)
 {
+#ifdef TIF_NOTIFY_RESUME
 	if (!test_and_set_tsk_thread_flag(task, TIF_NOTIFY_RESUME))
 		kick_process(task);
+#endif
 }
 
 /**
@@ -184,7 +186,14 @@ static inline void set_notify_resume(struct task_struct *task)
  */
 static inline void tracehook_notify_resume(struct pt_regs *regs)
 {
+	/*
+	 * The caller just cleared TIF_NOTIFY_RESUME. This barrier
+	 * pairs with task_work_add()->set_notify_resume() after
+	 * hlist_add_head(task->task_works);
+	 */
+	smp_mb__after_clear_bit();
+	if (unlikely(!hlist_empty(&current->task_works)))
+		task_work_run();
 }
-#endif	/* TIF_NOTIFY_RESUME */
 
 #endif	/* <linux/tracehook.h> */
diff --git a/kernel/Makefile b/kernel/Makefile
index cb41b95..5790f8b 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -10,7 +10,7 @@ obj-y     = fork.o exec_domain.o panic.o printk.o \
 	    kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
 	    hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \
 	    notifier.o ksysfs.o cred.o \
-	    async.o range.o groups.o
+	    async.o range.o groups.o task_work.o
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not trace debug files and internal ftrace files
diff --git a/kernel/exit.c b/kernel/exit.c
index d8bd3b4..b82c38e 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -946,11 +946,14 @@ void do_exit(long code)
 	exit_signals(tsk);  /* sets PF_EXITING */
 	/*
 	 * tsk->flags are checked in the futex code to protect against
-	 * an exiting task cleaning up the robust pi futexes.
+	 * an exiting task cleaning up the robust pi futexes, and in
+	 * task_work_add() to avoid the race with exit_task_work().
 	 */
 	smp_mb();
 	raw_spin_unlock_wait(&tsk->pi_lock);
 
+	exit_task_work(tsk);
+
 	exit_irq_thread();
 
 	if (unlikely(in_atomic()))
diff --git a/kernel/fork.c b/kernel/fork.c
index b9372a0..d1108ac 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1380,6 +1380,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	 */
 	p->group_leader = p;
 	INIT_LIST_HEAD(&p->thread_group);
+	INIT_HLIST_HEAD(&p->task_works);
 
 	/* Now that the task is set up, run cgroup callbacks if
 	 * necessary. We need to run them before the task is visible
diff --git a/kernel/task_work.c b/kernel/task_work.c
new file mode 100644
index 0000000..82d1c79
--- /dev/null
+++ b/kernel/task_work.c
@@ -0,0 +1,84 @@
+#include <linux/spinlock.h>
+#include <linux/task_work.h>
+#include <linux/tracehook.h>
+
+int
+task_work_add(struct task_struct *task, struct task_work *twork, bool notify)
+{
+	unsigned long flags;
+	int err = -ESRCH;
+
+#ifndef TIF_NOTIFY_RESUME
+	if (notify)
+		return -ENOTSUPP;
+#endif
+	/*
+	 * We must not insert the new work if the task has already passed
+	 * exit_task_work(). We rely on do_exit()->raw_spin_unlock_wait()
+	 * and check PF_EXITING under pi_lock.
+	 */
+	raw_spin_lock_irqsave(&task->pi_lock, flags);
+	if (likely(!(task->flags & PF_EXITING))) {
+		hlist_add_head(&twork->hlist, &task->task_works);
+		err = 0;
+	}
+	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+
+	/* test_and_set_bit() implies mb(), see tracehook_notify_resume(). */
+	if (likely(!err) && notify)
+		set_notify_resume(task);
+	return err;
+}
+
+struct task_work *
+task_work_cancel(struct task_struct *task, task_work_func_t func)
+{
+	unsigned long flags;
+	struct task_work *twork;
+	struct hlist_node *pos;
+
+	raw_spin_lock_irqsave(&task->pi_lock, flags);
+	hlist_for_each_entry(twork, pos, &task->task_works, hlist) {
+		if (twork->func == func) {
+			hlist_del(&twork->hlist);
+			goto found;
+		}
+	}
+	twork = NULL;
+ found:
+	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+
+	return twork;
+}
+
+void task_work_run(void)
+{
+	struct task_struct *task = current;
+	struct hlist_head task_works;
+	struct hlist_node *pos;
+
+	raw_spin_lock_irq(&task->pi_lock);
+	hlist_move_list(&task->task_works, &task_works);
+	raw_spin_unlock_irq(&task->pi_lock);
+
+	if (unlikely(hlist_empty(&task_works)))
+		return;
+	/*
+	 * We use hlist to save the space in task_struct, but we want fifo.
+	 * Find the last entry, the list should be short, then process them
+	 * in reverse order.
+	 */
+	for (pos = task_works.first; pos->next; pos = pos->next)
+		;
+
+	for (;;) {
+		struct hlist_node **pprev = pos->pprev;
+		struct task_work *twork = container_of(pos, struct task_work,
+							hlist);
+		twork->func(twork);
+
+		if (pprev == &task_works.first)
+			break;
+		pos = container_of(pprev, struct hlist_node, next);
+	}
+}
-- 
1.5.5.1




* [PATCH v6 2/6] genirq: reimplement exit_irq_thread() hook via task_work_add()
  2012-04-19 23:14 [PATCH v6 0/6] task_work_add: generic process-context callbacks Oleg Nesterov
  2012-04-19 23:14 ` [PATCH v6 1/6] " Oleg Nesterov
@ 2012-04-19 23:15 ` Oleg Nesterov
  2012-04-19 23:15 ` [PATCH v6 3/6] hexagon: do_notify_resume() needs tracehook_notify_resume() Oleg Nesterov
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2012-04-19 23:15 UTC (permalink / raw)
  To: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner
  Cc: Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Richard Kuo,
	Tejun Heo, linux-arch, linux-kernel

exit_irq_thread() and task->irq_thread are needed to handle
the unexpected (and unlikely) exit of an irq thread.

We can use task_work instead and make this all private to
kernel/irq/manage.c: a cleanup plus a micro-optimization.

1. rename exit_irq_thread() to irq_thread_dtor(), make it
   static, and move it up before irq_thread().

2. change irq_thread() to do task_work_add(irq_thread_dtor)
   at the start and task_work_cancel() before return.

   tracehook_notify_resume() can never play with kthreads,
   only do_exit()->exit_task_work() can call the callback
   and this is what we want.

3. remove task_struct->irq_thread and the special hook
   in do_exit().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/interrupt.h |    4 --
 include/linux/sched.h     |   10 +-----
 kernel/exit.c             |    2 -
 kernel/irq/manage.c       |   69 +++++++++++++++++++++-----------------------
 4 files changed, 35 insertions(+), 50 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 2aea5d2..1cdd4d0 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -142,8 +142,6 @@ request_any_context_irq(unsigned int irq, irq_handler_t handler,
 extern int __must_check
 request_percpu_irq(unsigned int irq, irq_handler_t handler,
 		   const char *devname, void __percpu *percpu_dev_id);
-
-extern void exit_irq_thread(void);
 #else
 
 extern int __must_check
@@ -177,8 +175,6 @@ request_percpu_irq(unsigned int irq, irq_handler_t handler,
 {
 	return request_irq(irq, handler, 0, devname, percpu_dev_id);
 }
-
-static inline void exit_irq_thread(void) { }
 #endif
 
 extern void free_irq(unsigned int, void *);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index be004ac..e36edfb 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1346,11 +1346,6 @@ struct task_struct {
 	unsigned sched_reset_on_fork:1;
 	unsigned sched_contributes_to_load:1;
 
-#ifdef CONFIG_GENERIC_HARDIRQS
-	/* IRQ handler threads */
-	unsigned irq_thread:1;
-#endif
-
 	pid_t pid;
 	pid_t tgid;
 
@@ -1358,10 +1353,9 @@ struct task_struct {
 	/* Canary value for the -fstack-protector gcc feature */
 	unsigned long stack_canary;
 #endif
-
-	/* 
+	/*
 	 * pointers to (original) parent process, youngest child, younger sibling,
-	 * older sibling, respectively.  (p->father can be replaced with 
+	 * older sibling, respectively.  (p->father can be replaced with
 	 * p->real_parent->pid)
 	 */
 	struct task_struct __rcu *real_parent; /* real parent process */
diff --git a/kernel/exit.c b/kernel/exit.c
index b82c38e..8135243 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -954,8 +954,6 @@ void do_exit(long code)
 
 	exit_task_work(tsk);
 
-	exit_irq_thread();
-
 	if (unlikely(in_atomic()))
 		printk(KERN_INFO "note: %s[%d] exited with preempt_count %d\n",
 				current->comm, task_pid_nr(current),
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 89a3ea8..525391c 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -14,6 +14,7 @@
 #include <linux/interrupt.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
+#include <linux/task_work.h>
 
 #include "internals.h"
 
@@ -773,11 +774,39 @@ static void wake_threads_waitq(struct irq_desc *desc)
 		wake_up(&desc->wait_for_threads);
 }
 
+static void irq_thread_dtor(struct task_work *unused)
+{
+	struct task_struct *tsk = current;
+	struct irq_desc *desc;
+	struct irqaction *action;
+
+	if (WARN_ON_ONCE(!(current->flags & PF_EXITING)))
+		return;
+
+	action = kthread_data(tsk);
+
+	printk(KERN_ERR
+	       "exiting task \"%s\" (%d) is an active IRQ thread (irq %d)\n",
+	       tsk->comm ? tsk->comm : "", tsk->pid, action->irq);
+
+	desc = irq_to_desc(action->irq);
+	/*
+	 * If IRQTF_RUNTHREAD is set, we need to decrement
+	 * desc->threads_active and wake possible waiters.
+	 */
+	if (test_and_clear_bit(IRQTF_RUNTHREAD, &action->thread_flags))
+		wake_threads_waitq(desc);
+
+	/* Prevent a stale desc->threads_oneshot */
+	irq_finalize_oneshot(desc, action);
+}
+
 /*
  * Interrupt handler thread
  */
 static int irq_thread(void *data)
 {
+	struct task_work on_exit_work;
 	static const struct sched_param param = {
 		.sched_priority = MAX_USER_RT_PRIO/2,
 	};
@@ -793,7 +822,9 @@ static int irq_thread(void *data)
 		handler_fn = irq_thread_fn;
 
 	sched_setscheduler(current, SCHED_FIFO, &param);
-	current->irq_thread = 1;
+
+	init_task_work(&on_exit_work, irq_thread_dtor, NULL);
+	task_work_add(current, &on_exit_work, false);
 
 	while (!irq_wait_for_interrupt(action)) {
 		irqreturn_t action_ret;
@@ -815,45 +846,11 @@ static int irq_thread(void *data)
 	 * cannot touch the oneshot mask at this point anymore as
 	 * __setup_irq() might have given out currents thread_mask
 	 * again.
-	 *
-	 * Clear irq_thread. Otherwise exit_irq_thread() would make
-	 * fuzz about an active irq thread going into nirvana.
 	 */
-	current->irq_thread = 0;
+	task_work_cancel(current, irq_thread_dtor);
 	return 0;
 }
 
-/*
- * Called from do_exit()
- */
-void exit_irq_thread(void)
-{
-	struct task_struct *tsk = current;
-	struct irq_desc *desc;
-	struct irqaction *action;
-
-	if (!tsk->irq_thread)
-		return;
-
-	action = kthread_data(tsk);
-
-	printk(KERN_ERR
-	       "exiting task \"%s\" (%d) is an active IRQ thread (irq %d)\n",
-	       tsk->comm ? tsk->comm : "", tsk->pid, action->irq);
-
-	desc = irq_to_desc(action->irq);
-
-	/*
-	 * If IRQTF_RUNTHREAD is set, we need to decrement
-	 * desc->threads_active and wake possible waiters.
-	 */
-	if (test_and_clear_bit(IRQTF_RUNTHREAD, &action->thread_flags))
-		wake_threads_waitq(desc);
-
-	/* Prevent a stale desc->threads_oneshot */
-	irq_finalize_oneshot(desc, action);
-}
-
 static void irq_setup_forced_threading(struct irqaction *new)
 {
 	if (!force_irqthreads)
-- 
1.5.5.1
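
The "add at the start, cancel before return" pattern from this patch
can be modelled in userspace. This is only a sketch under loud
assumptions: `struct task`, `work_add()`, `work_cancel()`, `task_exit()`
and `thread_body()` are hypothetical toy versions, single-threaded and
without any locking, and `exiting` merely stands in for PF_EXITING.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct work;
typedef void (*work_fn)(struct work *);

struct work {
	struct work *next;
	work_fn func;
};

/* Toy task: no locking, single-threaded model only. */
struct task {
	int exiting;		/* stands in for PF_EXITING */
	struct work *works;
};

/* Queue a work unless the task is already exiting. */
static int work_add(struct task *t, struct work *w, work_fn fn)
{
	if (t->exiting)
		return -ESRCH;
	w->func = fn;
	w->next = t->works;
	t->works = w;
	return 0;
}

/* Remove and return the first queued work whose callback is fn. */
static struct work *work_cancel(struct task *t, work_fn fn)
{
	struct work **pp;

	for (pp = &t->works; *pp; pp = &(*pp)->next) {
		if ((*pp)->func == fn) {
			struct work *w = *pp;

			*pp = w->next;
			return w;
		}
	}
	return NULL;
}

/* Models do_exit()->exit_task_work(): run whatever is still queued. */
static void task_exit(struct task *t)
{
	t->exiting = 1;
	while (t->works) {
		struct work *w = t->works;

		t->works = w->next;
		w->func(w);
	}
}

static int dtor_calls;

static void thread_dtor(struct work *unused)
{
	(void)unused;
	dtor_calls++;
}

/*
 * Models irq_thread(): arm the destructor on entry and disarm it on
 * the normal return path. With die_early set, the thread "exits"
 * unexpectedly while the destructor is still armed.
 */
static void thread_body(struct task *t, struct work *w, int die_early)
{
	work_add(t, w, thread_dtor);
	if (die_early) {
		task_exit(t);
		return;
	}
	work_cancel(t, thread_dtor);
	task_exit(t);
}
```

In this model the destructor runs only on the unexpected exit path,
which is the property the old ->irq_thread flag provided.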




* [PATCH v6 3/6] hexagon: do_notify_resume() needs tracehook_notify_resume()
  2012-04-19 23:14 [PATCH v6 0/6] task_work_add: generic process-context callbacks Oleg Nesterov
  2012-04-19 23:14 ` [PATCH v6 1/6] " Oleg Nesterov
  2012-04-19 23:15 ` [PATCH v6 2/6] genirq: reimplement exit_irq_thread() hook via task_work_add() Oleg Nesterov
@ 2012-04-19 23:15 ` Oleg Nesterov
  2012-04-20  0:24   ` Richard Kuo
  2012-04-19 23:15 ` [PATCH v6 4/6] keys: change keyctl_session_to_parent() to use task_work_add() Oleg Nesterov
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2012-04-19 23:15 UTC (permalink / raw)
  To: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner
  Cc: Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Richard Kuo,
	Tejun Heo, linux-arch, linux-kernel

arch/hexagon/kernel/signal.c:do_notify_resume() forgets to call
tracehook_notify_resume() if TIF_NOTIFY_RESUME is set.

Cc: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 arch/hexagon/kernel/signal.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/hexagon/kernel/signal.c b/arch/hexagon/kernel/signal.c
index ecbab34..ab5f5ad 100644
--- a/arch/hexagon/kernel/signal.c
+++ b/arch/hexagon/kernel/signal.c
@@ -272,6 +272,7 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
 
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
+		tracehook_notify_resume(regs);
 		if (current->replacement_session_keyring)
 			key_replace_session_keyring();
 	}
-- 
1.5.5.1
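
Why the missing call matters once 1/6 is applied: the queueing side
publishes the work first and sets the flag second, and the resume side
clears the flag first and looks at the list second. A rough userspace
model with C11 atomics follows. Everything in it is an assumption made
for illustration: `struct toy_task`, `toy_add()` and
`toy_notify_resume()` are hypothetical names, a counter replaces the
real list, and sequentially consistent atomics stand in for the kernel
barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_task {
	atomic_bool notify_resume;	/* stands in for TIF_NOTIFY_RESUME */
	atomic_int pending;		/* stands in for a non-empty ->task_works */
	int ran;			/* works executed so far */
};

/* Queueing side, like task_work_add(notify => true). */
static void toy_add(struct toy_task *t)
{
	/* 1. publish the work */
	atomic_fetch_add(&t->pending, 1);
	/*
	 * 2. set the flag; a seq_cst exchange is a full barrier here,
	 *    in the role of test_and_set_tsk_thread_flag().
	 */
	atomic_exchange(&t->notify_resume, true);
}

/* Resume side, like do_notify_resume() + tracehook_notify_resume(). */
static void toy_notify_resume(struct toy_task *t)
{
	/* 1. clear the flag; nothing to do if it was not set */
	if (!atomic_exchange(&t->notify_resume, false))
		return;
	/*
	 * 2. only then look at the pending work; the seq_cst exchange
	 *    above orders the two steps, in the role of
	 *    smp_mb__after_clear_bit().
	 */
	t->ran += atomic_exchange(&t->pending, 0);
}
```

If step 2 of toy_notify_resume() is left out, as in the hexagon code
before this patch, the flag is consumed but the queued work never runs.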




* [PATCH v6 4/6] keys: change keyctl_session_to_parent() to use task_work_add()
  2012-04-19 23:14 [PATCH v6 0/6] task_work_add: generic process-context callbacks Oleg Nesterov
                   ` (2 preceding siblings ...)
  2012-04-19 23:15 ` [PATCH v6 3/6] hexagon: do_notify_resume() needs tracehook_notify_resume() Oleg Nesterov
@ 2012-04-19 23:15 ` Oleg Nesterov
  2012-04-19 23:16 ` [PATCH v6 5/6] keys: kill the dummy key_replace_session_keyring() Oleg Nesterov
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2012-04-19 23:15 UTC (permalink / raw)
  To: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner
  Cc: Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Richard Kuo,
	Tejun Heo, linux-arch, linux-kernel

Change keyctl_session_to_parent() to use task_work_add() and
move key_replace_session_keyring() logic into task_work->func().

Note that we do task_work_cancel() before task_work_add() to
ensure that only one work can be pending at any time. This is
important: we must not allow user-space to abuse the parent's
->task_works list.

The callback, replace_session_keyring(), checks PF_EXITING.
I guess this is not really needed but looks better.

As a side effect, this fixes an (unlikely) race. The callers
of key_replace_session_keyring() and keyctl_session_to_parent()
lack the necessary barriers, so the parent can miss the request.

Now we can remove task_struct->replacement_session_keyring and
related code.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/key.h          |    6 +--
 security/keys/internal.h     |    2 +
 security/keys/keyctl.c       |   73 ++++++++++++++++++++----------------------
 security/keys/process_keys.c |   20 ++++-------
 4 files changed, 46 insertions(+), 55 deletions(-)

diff --git a/include/linux/key.h b/include/linux/key.h
index 96933b1..0c263d6 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -33,6 +33,8 @@ typedef uint32_t key_perm_t;
 
 struct key;
 
+#define key_replace_session_keyring()	do { } while (0)
+
 #ifdef CONFIG_KEYS
 
 #undef KEY_DEBUGGING
@@ -302,9 +304,6 @@ static inline bool key_is_instantiated(const struct key *key)
 #ifdef CONFIG_SYSCTL
 extern ctl_table key_sysctls[];
 #endif
-
-extern void key_replace_session_keyring(void);
-
 /*
  * the userspace interface
  */
@@ -327,7 +326,6 @@ extern void key_init(void);
 #define key_fsuid_changed(t)		do { } while(0)
 #define key_fsgid_changed(t)		do { } while(0)
 #define key_init()			do { } while(0)
-#define key_replace_session_keyring()	do { } while(0)
 
 #endif /* CONFIG_KEYS */
 #endif /* __KERNEL__ */
diff --git a/security/keys/internal.h b/security/keys/internal.h
index 65647f8..c2986da 100644
--- a/security/keys/internal.h
+++ b/security/keys/internal.h
@@ -14,6 +14,7 @@
 
 #include <linux/sched.h>
 #include <linux/key-type.h>
+#include <linux/task_work.h>
 
 #ifdef __KDEBUG
 #define kenter(FMT, ...) \
@@ -148,6 +149,7 @@ extern key_ref_t lookup_user_key(key_serial_t id, unsigned long flags,
 #define KEY_LOOKUP_FOR_UNLINK	0x04
 
 extern long join_session_keyring(const char *name);
+extern void key_change_session_keyring(struct task_work *twork);
 
 extern struct work_struct key_gc_work;
 extern unsigned key_gc_delay;
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index fb767c6..55a451b 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -1423,50 +1423,57 @@ long keyctl_get_security(key_serial_t keyid,
  */
 long keyctl_session_to_parent(void)
 {
-#ifdef TIF_NOTIFY_RESUME
 	struct task_struct *me, *parent;
 	const struct cred *mycred, *pcred;
-	struct cred *cred, *oldcred;
+	struct task_work *newwork, *oldwork;
 	key_ref_t keyring_r;
+	struct cred *cred;
 	int ret;
 
 	keyring_r = lookup_user_key(KEY_SPEC_SESSION_KEYRING, 0, KEY_LINK);
 	if (IS_ERR(keyring_r))
 		return PTR_ERR(keyring_r);
 
+	ret = -ENOMEM;
+	newwork = kmalloc(sizeof(struct task_work), GFP_KERNEL);
+	if (!newwork)
+		goto error_keyring;
+
 	/* our parent is going to need a new cred struct, a new tgcred struct
 	 * and new security data, so we allocate them here to prevent ENOMEM in
 	 * our parent */
-	ret = -ENOMEM;
 	cred = cred_alloc_blank();
 	if (!cred)
-		goto error_keyring;
+		goto error_newwork;
 
 	cred->tgcred->session_keyring = key_ref_to_ptr(keyring_r);
-	keyring_r = NULL;
+	init_task_work(newwork, key_change_session_keyring, cred);
 
 	me = current;
 	rcu_read_lock();
 	write_lock_irq(&tasklist_lock);
 
-	parent = me->real_parent;
 	ret = -EPERM;
+	oldwork = NULL;
+	parent = me->real_parent;
 
 	/* the parent mustn't be init and mustn't be a kernel thread */
 	if (parent->pid <= 1 || !parent->mm)
-		goto not_permitted;
+		goto unlock;
 
 	/* the parent must be single threaded */
 	if (!thread_group_empty(parent))
-		goto not_permitted;
+		goto unlock;
 
 	/* the parent and the child must have different session keyrings or
 	 * there's no point */
 	mycred = current_cred();
 	pcred = __task_cred(parent);
 	if (mycred == pcred ||
-	    mycred->tgcred->session_keyring == pcred->tgcred->session_keyring)
-		goto already_same;
+	    mycred->tgcred->session_keyring == pcred->tgcred->session_keyring) {
+		ret = 0;
+		goto unlock;
+	}
 
 	/* the parent must have the same effective ownership and mustn't be
 	 * SUID/SGID */
@@ -1476,50 +1483,40 @@ long keyctl_session_to_parent(void)
 	    pcred->gid	!= mycred->egid	||
 	    pcred->egid	!= mycred->egid	||
 	    pcred->sgid	!= mycred->egid)
-		goto not_permitted;
+		goto unlock;
 
 	/* the keyrings must have the same UID */
 	if ((pcred->tgcred->session_keyring &&
 	     pcred->tgcred->session_keyring->uid != mycred->euid) ||
 	    mycred->tgcred->session_keyring->uid != mycred->euid)
-		goto not_permitted;
+		goto unlock;
 
-	/* if there's an already pending keyring replacement, then we replace
-	 * that */
-	oldcred = parent->replacement_session_keyring;
+	/* cancel an already pending keyring replacement */
+	oldwork = task_work_cancel(parent, key_change_session_keyring);
 
 	/* the replacement session keyring is applied just prior to userspace
 	 * restarting */
-	parent->replacement_session_keyring = cred;
-	cred = NULL;
-	set_ti_thread_flag(task_thread_info(parent), TIF_NOTIFY_RESUME);
-
+	ret = task_work_add(parent, newwork, true);
+	if (!ret)
+		newwork = NULL;
+unlock:
 	write_unlock_irq(&tasklist_lock);
 	rcu_read_unlock();
-	if (oldcred)
-		put_cred(oldcred);
-	return 0;
-
-already_same:
-	ret = 0;
-not_permitted:
-	write_unlock_irq(&tasklist_lock);
-	rcu_read_unlock();
-	put_cred(cred);
+	if (oldwork) {
+		put_cred(oldwork->data);
+		kfree(oldwork);
+	}
+	if (newwork) {
+		put_cred(newwork->data);
+		kfree(newwork);
+	}
 	return ret;
 
+error_newwork:
+	kfree(newwork);
 error_keyring:
 	key_ref_put(keyring_r);
 	return ret;
-
-#else /* !TIF_NOTIFY_RESUME */
-	/*
-	 * To be removed when TIF_NOTIFY_RESUME has been implemented on
-	 * m68k/xtensa
-	 */
-#warning TIF_NOTIFY_RESUME not implemented
-	return -EOPNOTSUPP;
-#endif /* !TIF_NOTIFY_RESUME */
 }
 
 /*
diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c
index be7ecb2..7ab0d8b 100644
--- a/security/keys/process_keys.c
+++ b/security/keys/process_keys.c
@@ -832,23 +832,17 @@ error:
  * Replace a process's session keyring on behalf of one of its children when
  * the target  process is about to resume userspace execution.
  */
-void key_replace_session_keyring(void)
+void key_change_session_keyring(struct task_work *twork)
 {
-	const struct cred *old;
-	struct cred *new;
-
-	if (!current->replacement_session_keyring)
-		return;
+	const struct cred *old = current_cred();
+	struct cred *new = twork->data;
 
-	write_lock_irq(&tasklist_lock);
-	new = current->replacement_session_keyring;
-	current->replacement_session_keyring = NULL;
-	write_unlock_irq(&tasklist_lock);
-
-	if (!new)
+	kfree(twork);
+	if (unlikely(current->flags & PF_EXITING)) {
+		put_cred(new);
 		return;
+	}
 
-	old = current_cred();
 	new->  uid	= old->  uid;
 	new-> euid	= old-> euid;
 	new-> suid	= old-> suid;
-- 
1.5.5.1
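
The cancel-before-add rule from the changelog can be sketched in
userspace as follows. This is a toy model, not the keys code:
`struct task`, `post_value()`, `apply_value()`, `pending()` and
`run_works()` are hypothetical, an int replaces the cred, malloc/free
replace kmalloc/kfree and put_cred(), and there is no locking.

```c
#include <assert.h>
#include <stdlib.h>

struct work;
typedef void (*work_fn)(struct work *);

struct work {
	struct work *next;
	work_fn func;
	void *data;
};

struct task {
	struct work *works;
	int applied;		/* last value applied by the callback */
};

static struct task *running;	/* stands in for "current" */

/* Remove and return the first queued work whose callback is fn. */
static struct work *work_cancel(struct task *t, work_fn fn)
{
	struct work **pp;

	for (pp = &t->works; *pp; pp = &(*pp)->next) {
		if ((*pp)->func == fn) {
			struct work *w = *pp;

			*pp = w->next;
			return w;
		}
	}
	return NULL;
}

static int pending(const struct task *t)
{
	const struct work *w;
	int n = 0;

	for (w = t->works; w; w = w->next)
		n++;
	return n;
}

/* Callback: consumes the work and its payload. */
static void apply_value(struct work *w)
{
	int *val = w->data;

	running->applied = *val;
	free(val);
	free(w);
}

/* Queue a replacement; returns 0 on success, -1 if allocation fails. */
static int post_value(struct task *t, int value)
{
	struct work *neww = malloc(sizeof(*neww));
	int *val = malloc(sizeof(*val));
	struct work *oldw;

	/* Allocate up front so the target never has to handle ENOMEM. */
	if (!neww || !val) {
		free(neww);
		free(val);
		return -1;
	}
	*val = value;
	neww->func = apply_value;
	neww->data = val;

	/* Drop a still-pending request, then queue the new one. */
	oldw = work_cancel(t, apply_value);
	neww->next = t->works;
	t->works = neww;

	if (oldw) {
		free(oldw->data);
		free(oldw);
	}
	return 0;
}

/* Run and drain everything queued on t. */
static void run_works(struct task *t)
{
	running = t;
	while (t->works) {
		struct work *w = t->works;

		t->works = w->next;
		w->func(w);
	}
}
```

However many times post_value() is called before the target runs, the
list never holds more than one entry for this callback, so the
requester cannot grow the target's list without bound.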




* [PATCH v6 5/6] keys: kill the dummy key_replace_session_keyring()
  2012-04-19 23:14 [PATCH v6 0/6] task_work_add: generic process-context callbacks Oleg Nesterov
                   ` (3 preceding siblings ...)
  2012-04-19 23:15 ` [PATCH v6 4/6] keys: change keyctl_session_to_parent() to use task_work_add() Oleg Nesterov
@ 2012-04-19 23:16 ` Oleg Nesterov
  2012-04-19 23:16 ` [PATCH v6 6/6] keys: kill task_struct->replacement_session_keyring Oleg Nesterov
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2012-04-19 23:16 UTC (permalink / raw)
  To: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner
  Cc: Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Richard Kuo,
	Tejun Heo, linux-arch, linux-kernel

After the previous change key_replace_session_keyring() becomes
a nop. Remove the dummy definition in key.h and update the callers
in arch/*/kernel/signal.c.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 arch/alpha/kernel/signal.c     |    2 --
 arch/arm/kernel/signal.c       |    2 --
 arch/avr32/kernel/signal.c     |    2 --
 arch/blackfin/kernel/signal.c  |    2 --
 arch/c6x/kernel/signal.c       |    2 --
 arch/cris/kernel/ptrace.c      |    2 --
 arch/frv/kernel/signal.c       |    2 --
 arch/h8300/kernel/signal.c     |    2 --
 arch/hexagon/kernel/signal.c   |    2 --
 arch/ia64/kernel/process.c     |    2 --
 arch/m32r/kernel/signal.c      |    2 --
 arch/mips/kernel/signal.c      |    2 --
 arch/mn10300/kernel/signal.c   |    2 --
 arch/openrisc/kernel/signal.c  |    2 --
 arch/parisc/kernel/signal.c    |    2 --
 arch/powerpc/kernel/signal.c   |    2 --
 arch/s390/kernel/signal.c      |    2 --
 arch/sh/kernel/signal_32.c     |    2 --
 arch/sh/kernel/signal_64.c     |    2 --
 arch/sparc/kernel/signal_32.c  |    2 --
 arch/sparc/kernel/signal_64.c  |    2 --
 arch/tile/kernel/process.c     |    2 --
 arch/unicore32/kernel/signal.c |    2 --
 arch/x86/kernel/signal.c       |    2 --
 include/linux/key.h            |    2 --
 25 files changed, 0 insertions(+), 50 deletions(-)

diff --git a/arch/alpha/kernel/signal.c b/arch/alpha/kernel/signal.c
index 35f2ef4..bb61c58 100644
--- a/arch/alpha/kernel/signal.c
+++ b/arch/alpha/kernel/signal.c
@@ -616,7 +616,5 @@ do_notify_resume(struct pt_regs *regs, struct switch_stack *sw,
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index 7cb532f..5c34754 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -782,7 +782,5 @@ do_notify_resume(struct pt_regs *regs, unsigned int thread_flags, int syscall)
 	if (thread_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
diff --git a/arch/avr32/kernel/signal.c b/arch/avr32/kernel/signal.c
index 64f886f..0d512c5 100644
--- a/arch/avr32/kernel/signal.c
+++ b/arch/avr32/kernel/signal.c
@@ -327,7 +327,5 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, struct thread_info *ti)
 	if (ti->flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
diff --git a/arch/blackfin/kernel/signal.c b/arch/blackfin/kernel/signal.c
index d536f35..0af1a1b 100644
--- a/arch/blackfin/kernel/signal.c
+++ b/arch/blackfin/kernel/signal.c
@@ -347,8 +347,6 @@ asmlinkage void do_notify_resume(struct pt_regs *regs)
 	if (test_thread_flag(TIF_NOTIFY_RESUME)) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
 
diff --git a/arch/c6x/kernel/signal.c b/arch/c6x/kernel/signal.c
index 3b5a050..ee7beae 100644
--- a/arch/c6x/kernel/signal.c
+++ b/arch/c6x/kernel/signal.c
@@ -361,7 +361,5 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, u32 thread_info_flags,
 	if (thread_info_flags & (1 << TIF_NOTIFY_RESUME)) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
diff --git a/arch/cris/kernel/ptrace.c b/arch/cris/kernel/ptrace.c
index d114ad3..58d44ee 100644
--- a/arch/cris/kernel/ptrace.c
+++ b/arch/cris/kernel/ptrace.c
@@ -40,7 +40,5 @@ void do_notify_resume(int canrestart, struct pt_regs *regs,
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
diff --git a/arch/frv/kernel/signal.c b/arch/frv/kernel/signal.c
index bab0129..151a3fd 100644
--- a/arch/frv/kernel/signal.c
+++ b/arch/frv/kernel/signal.c
@@ -583,8 +583,6 @@ asmlinkage void do_notify_resume(__u32 thread_info_flags)
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(__frame);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 
 } /* end do_notify_resume() */
diff --git a/arch/h8300/kernel/signal.c b/arch/h8300/kernel/signal.c
index af842c3..14c46e8 100644
--- a/arch/h8300/kernel/signal.c
+++ b/arch/h8300/kernel/signal.c
@@ -557,7 +557,5 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, u32 thread_info_flags)
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
diff --git a/arch/hexagon/kernel/signal.c b/arch/hexagon/kernel/signal.c
index ab5f5ad..f16152e 100644
--- a/arch/hexagon/kernel/signal.c
+++ b/arch/hexagon/kernel/signal.c
@@ -273,8 +273,6 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
 
diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c
index ce74e14..99f34a8 100644
--- a/arch/ia64/kernel/process.c
+++ b/arch/ia64/kernel/process.c
@@ -199,8 +199,6 @@ do_notify_resume_user(sigset_t *unused, struct sigscratch *scr, long in_syscall)
 	if (test_thread_flag(TIF_NOTIFY_RESUME)) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(&scr->pt);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 
 	/* copy user rbs to kernel rbs */
diff --git a/arch/m32r/kernel/signal.c b/arch/m32r/kernel/signal.c
index a08697f..d0a24f4 100644
--- a/arch/m32r/kernel/signal.c
+++ b/arch/m32r/kernel/signal.c
@@ -391,8 +391,6 @@ void do_notify_resume(struct pt_regs *regs, __u32 thread_info_flags)
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 
 	clear_thread_flag(TIF_IRET);
diff --git a/arch/mips/kernel/signal.c b/arch/mips/kernel/signal.c
index 185ca00..cebed39 100644
--- a/arch/mips/kernel/signal.c
+++ b/arch/mips/kernel/signal.c
@@ -669,8 +669,6 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, void *unused,
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
 
diff --git a/arch/mn10300/kernel/signal.c b/arch/mn10300/kernel/signal.c
index 690f4e9..909059e 100644
--- a/arch/mn10300/kernel/signal.c
+++ b/arch/mn10300/kernel/signal.c
@@ -575,7 +575,5 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, u32 thread_info_flags)
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(current_frame());
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
diff --git a/arch/openrisc/kernel/signal.c b/arch/openrisc/kernel/signal.c
index e970743..9ae6115 100644
--- a/arch/openrisc/kernel/signal.c
+++ b/arch/openrisc/kernel/signal.c
@@ -376,7 +376,5 @@ asmlinkage void do_notify_resume(struct pt_regs *regs)
 	if (current_thread_info()->flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c
index 12c1ed3..c3102c4 100644
--- a/arch/parisc/kernel/signal.c
+++ b/arch/parisc/kernel/signal.c
@@ -647,7 +647,5 @@ void do_notify_resume(struct pt_regs *regs, long in_syscall)
 	if (test_thread_flag(TIF_NOTIFY_RESUME)) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index 651c596..bfc3ec1 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -193,8 +193,6 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
 
diff --git a/arch/s390/kernel/signal.c b/arch/s390/kernel/signal.c
index f7582b2..fbf2495 100644
--- a/arch/s390/kernel/signal.c
+++ b/arch/s390/kernel/signal.c
@@ -517,6 +517,4 @@ void do_notify_resume(struct pt_regs *regs)
 {
 	clear_thread_flag(TIF_NOTIFY_RESUME);
 	tracehook_notify_resume(regs);
-	if (current->replacement_session_keyring)
-		key_replace_session_keyring();
 }
diff --git a/arch/sh/kernel/signal_32.c b/arch/sh/kernel/signal_32.c
index 5901fba..a6b74c3 100644
--- a/arch/sh/kernel/signal_32.c
+++ b/arch/sh/kernel/signal_32.c
@@ -633,7 +633,5 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, unsigned int save_r0,
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
diff --git a/arch/sh/kernel/signal_64.c b/arch/sh/kernel/signal_64.c
index 3c9a6f7..52bdfb0 100644
--- a/arch/sh/kernel/signal_64.c
+++ b/arch/sh/kernel/signal_64.c
@@ -737,7 +737,5 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, unsigned long thread_info
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
diff --git a/arch/sparc/kernel/signal_32.c b/arch/sparc/kernel/signal_32.c
index 1e750e4..72bd0f7 100644
--- a/arch/sparc/kernel/signal_32.c
+++ b/arch/sparc/kernel/signal_32.c
@@ -603,8 +603,6 @@ void do_notify_resume(struct pt_regs *regs, unsigned long orig_i0,
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
 
diff --git a/arch/sparc/kernel/signal_64.c b/arch/sparc/kernel/signal_64.c
index 48b0f57..f2dcaab 100644
--- a/arch/sparc/kernel/signal_64.c
+++ b/arch/sparc/kernel/signal_64.c
@@ -618,8 +618,6 @@ void do_notify_resume(struct pt_regs *regs, unsigned long orig_i0, unsigned long
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
 
diff --git a/arch/tile/kernel/process.c b/arch/tile/kernel/process.c
index 2d5ef61..ff15527 100644
--- a/arch/tile/kernel/process.c
+++ b/arch/tile/kernel/process.c
@@ -584,8 +584,6 @@ int do_work_pending(struct pt_regs *regs, u32 thread_info_flags)
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 		return 1;
 	}
 	if (thread_info_flags & _TIF_SINGLESTEP) {
diff --git a/arch/unicore32/kernel/signal.c b/arch/unicore32/kernel/signal.c
index 911b549..b1e5cd5 100644
--- a/arch/unicore32/kernel/signal.c
+++ b/arch/unicore32/kernel/signal.c
@@ -470,8 +470,6 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 	if (thread_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 }
 
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 115eac4..d727de0 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -831,8 +831,6 @@ do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
-		if (current->replacement_session_keyring)
-			key_replace_session_keyring();
 	}
 	if (thread_info_flags & _TIF_USER_RETURN_NOTIFY)
 		fire_user_return_notifiers();
diff --git a/include/linux/key.h b/include/linux/key.h
index 0c263d6..16cbdb0 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -33,8 +33,6 @@ typedef uint32_t key_perm_t;
 
 struct key;
 
-#define key_replace_session_keyring()	do { } while (0)
-
 #ifdef CONFIG_KEYS
 
 #undef KEY_DEBUGGING
-- 
1.5.5.1




* [PATCH v6 6/6] keys: kill task_struct->replacement_session_keyring
  2012-04-19 23:14 [PATCH v6 0/6] task_work_add: generic process-context callbacks Oleg Nesterov
                   ` (4 preceding siblings ...)
  2012-04-19 23:16 ` [PATCH v6 5/6] keys: kill the dummy key_replace_session_keyring() Oleg Nesterov
@ 2012-04-19 23:16 ` Oleg Nesterov
  2012-04-20  8:45 ` [PATCH v6 4/6] keys: change keyctl_session_to_parent() to use task_work_add() David Howells
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2012-04-19 23:16 UTC (permalink / raw)
  To: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner
  Cc: Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Richard Kuo,
	Tejun Heo, linux-arch, linux-kernel

Kill the no longer used task_struct->replacement_session_keyring,
update copy_creds() and exit_creds().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/sched.h |    2 --
 kernel/cred.c         |    9 ---------
 2 files changed, 0 insertions(+), 11 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e36edfb..ac04a02 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1402,8 +1402,6 @@ struct task_struct {
 					 * credentials (COW) */
 	const struct cred __rcu *cred;	/* effective (overridable) subjective task
 					 * credentials (COW) */
-	struct cred *replacement_session_keyring; /* for KEYCTL_SESSION_TO_PARENT */
-
 	char comm[TASK_COMM_LEN]; /* executable name excluding path
 				     - access with [gs]et_task_comm (which lock
 				       it with task_lock())
diff --git a/kernel/cred.c b/kernel/cred.c
index e70683d..9570736 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -198,13 +198,6 @@ void exit_creds(struct task_struct *tsk)
 	validate_creds(cred);
 	alter_cred_subscribers(cred, -1);
 	put_cred(cred);
-
-	cred = (struct cred *) tsk->replacement_session_keyring;
-	if (cred) {
-		tsk->replacement_session_keyring = NULL;
-		validate_creds(cred);
-		put_cred(cred);
-	}
 }
 
 /**
@@ -386,8 +379,6 @@ int copy_creds(struct task_struct *p, unsigned long clone_flags)
 	struct cred *new;
 	int ret;
 
-	p->replacement_session_keyring = NULL;
-
 	if (
 #ifdef CONFIG_KEYS
 		!p->cred->thread_keyring &&
-- 
1.5.5.1




* Re: [PATCH v6 3/6] hexagon: do_notify_resume() needs tracehook_notify_resume()
  2012-04-19 23:15 ` [PATCH v6 3/6] hexagon: do_notify_resume() needs tracehook_notify_resume() Oleg Nesterov
@ 2012-04-20  0:24   ` Richard Kuo
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Kuo @ 2012-04-20  0:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-arch, linux-kernel

On Fri, Apr 20, 2012 at 01:15:27AM +0200, Oleg Nesterov wrote:
> arch/hexagon/kernel/signal.c:do_notify_resume() forgets to call
> tracehook_notify_resume() if TIF_NOTIFY_RESUME is set.
> 
> Cc: Richard Kuo <rkuo@codeaurora.org>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  arch/hexagon/kernel/signal.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 

Thanks for catching that!

Acked-by: Richard Kuo <rkuo@codeaurora.org>


-- 

Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


* Re: [PATCH v6 4/6] keys: change keyctl_session_to_parent() to use task_work_add()
  2012-04-19 23:14 [PATCH v6 0/6] task_work_add: generic process-context callbacks Oleg Nesterov
                   ` (5 preceding siblings ...)
  2012-04-19 23:16 ` [PATCH v6 6/6] keys: kill task_struct->replacement_session_keyring Oleg Nesterov
@ 2012-04-20  8:45 ` David Howells
  2012-04-20  8:45 ` [PATCH v6 5/6] keys: kill the dummy key_replace_session_keyring() David Howells
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2012-04-20  8:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Richard Kuo,
	Tejun Heo, linux-arch, linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> Change keyctl_session_to_parent() to use task_work_add() and
> move key_replace_session_keyring() logic into task_work->func().
> 
> Note that we do task_work_cancel() before task_work_add() to
> ensure that only one work can be pending at any time. This is
> important, we must not allow user-space to abuse the parent's
> ->task_works list.
> 
> The callback, replace_session_keyring(), checks PF_EXITING.
> I guess this is not really needed but looks better.
> 
> As a side effect, this fixes the (unlikely) race. The callers
> of key_replace_session_keyring() and keyctl_session_to_parent()
> lack the necessary barriers, the parent can miss the request.
> 
> Now we can remove task_struct->replacement_session_keyring and
> related code.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: David Howells <dhowells@redhat.com>
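
The cancel-before-add rule quoted above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct work, struct parent, cancel_work(), request_replace() and the integer payload are hypothetical stand-ins rather than the code from the patch, and no locking is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; names are illustrative, not the kernel's. */
struct work {
	struct work *next;
	void (*func)(struct work *);
	int payload;		/* stands in for the new credentials */
};

struct parent {
	struct work *pending;	/* models the parent's ->task_works list */
};

/* Placeholder callback; a real one would install the work's payload. */
static void replace_keyring(struct work *w)
{
	(void)w;
}

/* Number of works currently queued on the parent. */
static size_t pending_count(const struct parent *p)
{
	const struct work *w;
	size_t n = 0;

	for (w = p->pending; w; w = w->next)
		n++;
	return n;
}

/* Unlink and return the first pending work that uses func, or NULL. */
static struct work *cancel_work(struct parent *p, void (*func)(struct work *))
{
	struct work **pp;

	for (pp = &p->pending; *pp; pp = &(*pp)->next) {
		if ((*pp)->func == func) {
			struct work *w = *pp;

			*pp = w->next;
			return w;
		}
	}
	return NULL;
}

/*
 * Queue a replacement request. Any earlier, still-pending request is
 * cancelled first, so repeated calls cannot grow the parent's list.
 * Returns 0 on success, -1 if allocation fails.
 */
static int request_replace(struct parent *p, int payload)
{
	struct work *newwork, *oldwork;

	assert(p);
	newwork = malloc(sizeof(*newwork));
	if (!newwork)
		return -1;
	newwork->func = replace_keyring;
	newwork->payload = payload;

	oldwork = cancel_work(p, replace_keyring);
	newwork->next = p->pending;
	p->pending = newwork;

	free(oldwork);		/* free(NULL) is a no-op */
	return 0;
}
```

However many times request_replace() is called in this model, at most one replace_keyring work stays queued, which is the property the commit message relies on to keep user space from growing the parent's list.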


* Re: [PATCH v6 5/6] keys: kill the dummy key_replace_session_keyring()
  2012-04-19 23:14 [PATCH v6 0/6] task_work_add: generic process-context callbacks Oleg Nesterov
                   ` (6 preceding siblings ...)
  2012-04-20  8:45 ` [PATCH v6 4/6] keys: change keyctl_session_to_parent() to use task_work_add() David Howells
@ 2012-04-20  8:45 ` David Howells
  2012-04-20  8:45 ` [PATCH v6 6/6] keys: kill task_struct->replacement_session_keyring David Howells
  2012-04-20  8:48 ` [PATCH v6 1/6] task_work_add: generic process-context callbacks David Howells
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2012-04-20  8:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Richard Kuo,
	Tejun Heo, linux-arch, linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> After the previous change key_replace_session_keyring() becomes
> a nop. Remove the dummy definition in key.h and update the callers
> in arch/*/kernel/signal.c.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: David Howells <dhowells@redhat.com>


* Re: [PATCH v6 6/6] keys: kill task_struct->replacement_session_keyring
  2012-04-19 23:14 [PATCH v6 0/6] task_work_add: generic process-context callbacks Oleg Nesterov
                   ` (7 preceding siblings ...)
  2012-04-20  8:45 ` [PATCH v6 5/6] keys: kill the dummy key_replace_session_keyring() David Howells
@ 2012-04-20  8:45 ` David Howells
  2012-04-20  8:48 ` [PATCH v6 1/6] task_work_add: generic process-context callbacks David Howells
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2012-04-20  8:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Richard Kuo,
	Tejun Heo, linux-arch, linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> Kill the no longer used task_struct->replacement_session_keyring,
> update copy_creds() and exit_creds().
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: David Howells <dhowells@redhat.com>


* Re: [PATCH v6 1/6] task_work_add: generic process-context callbacks
  2012-04-19 23:14 [PATCH v6 0/6] task_work_add: generic process-context callbacks Oleg Nesterov
                   ` (8 preceding siblings ...)
  2012-04-20  8:45 ` [PATCH v6 6/6] keys: kill task_struct->replacement_session_keyring David Howells
@ 2012-04-20  8:48 ` David Howells
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2012-04-20  8:48 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Richard Kuo,
	Tejun Heo, linux-arch, linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> Provide a simple mechanism that allows running code in the
> (nonatomic) context of the arbitrary task.
> 
> The caller does task_work_add(task, task_work) and this task
> executes task_work->func() either from do_notify_resume() or
> from do_exit(). The callback can rely on PF_EXITING to detect
> the latter case.
> 
> "struct task_work" can be embedded in another struct, still it
> has "void *data" to handle the most common/simple case.
> 
> This allows us to kill the ->replacement_session_keyring hack,
> and potentially this can have more users.
> 
> Performance-wise, this adds 2 "unlikely(!hlist_empty())" checks
> into tracehook_notify_resume() and do_exit(). But at the same
> time we can remove the "replacement_session_keyring != NULL"
> checks from arch/*/signal.c and exit_creds().
> 
> Note: task_work_add/task_work_run abuses ->pi_lock. This is
> only because this lock is already used by lookup_pi_state() to
> synchronize with do_exit() setting PF_EXITING. Fortunately the
> scope of this lock in task_work.c is really tiny, and the code
> is unlikely anyway.
> 
> v2:
> 	- implement task_work_cancel(func), it removes the first
> 	  task_work with the same callback.
> v3:
> 	- task_work_add() gets the new arg, "bool notify" to
> 	  conditionalize set_notify_resume(), this makes it useable
> 	  for kthreads and task_work_add(notify => false) can
> 	  work without TIF_NOTIFY_RESUME.
> 
> 	- don't add the dummy "ifndef TIF_NOTIFY_RESUME" inlines,
> 	  just add the simple check in task_work_add().
> v4:
> 	- s/task_work_queue/task_work_add/
> v5:
> 	- task_work_run() uses current explicitly
> 
> Todo:
> 	- move clear_thread_flag(TIF_NOTIFY_RESUME) from arch/
> 	  to tracehook_notify_resume()
> 
> 	- rename tracehook_notify_resume() and move it into
> 	  linux/task_work.h
> 
> 	- m68k and xtensa don't have TIF_NOTIFY_RESUME and thus
> 	  task_work_add(notify => true) fails with -ENOTSUPP.
> 
> 	  However, ->replacement_session_keyring equally needs
> 	  this flag, task_work_add() is not worse.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: David Howells <dhowells@redhat.com>
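
For readers new to the idea, the add/cancel/run life cycle described in the quoted changelog can be modelled in userspace roughly as follows. This is a minimal sketch, not the kernel implementation: struct task, its exiting and notify_resume fields, and current_task are hypothetical stand-ins for task_struct, PF_EXITING, TIF_NOTIFY_RESUME and "current". The ->pi_lock synchronization is left out, and the list is a plain singly linked list instead of an hlist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; see the assumptions stated in the text above. */
struct task_work;
typedef void (*task_work_func_t)(struct task_work *);

struct task_work {
	struct task_work *next;
	task_work_func_t func;
	void *data;			/* simple payload for the common case */
};

struct task {
	struct task_work *works;	/* pending callbacks */
	bool exiting;			/* models PF_EXITING */
	bool notify_resume;		/* models TIF_NOTIFY_RESUME */
};

static struct task *current_task;	/* models "current" during a run */

/* Queue work on the target task; optionally ask it to run before resuming. */
static int task_work_add(struct task *t, struct task_work *w, bool notify)
{
	assert(t && w && w->func);
	w->next = t->works;
	t->works = w;
	if (notify)
		t->notify_resume = true;
	return 0;
}

/* Remove and return the first pending work with this callback, or NULL. */
static struct task_work *task_work_cancel(struct task *t, task_work_func_t func)
{
	struct task_work **pp;

	for (pp = &t->works; *pp; pp = &(*pp)->next) {
		if ((*pp)->func == func) {
			struct task_work *w = *pp;

			*pp = w->next;
			w->next = NULL;
			return w;
		}
	}
	return NULL;
}

/* Called by the task itself: from the resume path or from exit. */
static void task_work_run(struct task *t)
{
	struct task_work *w = t->works;

	t->works = NULL;
	t->notify_resume = false;
	current_task = t;
	while (w) {
		struct task_work *next = w->next;

		w->func(w);		/* the callback may free w */
		w = next;
	}
	current_task = NULL;
}

/* Example callback: count runs, but skip the real work on the exit path. */
static void count_unless_exiting(struct task_work *w)
{
	if (!current_task->exiting)
		++*(int *)w->data;
}
```

In this model the resume path corresponds to calling task_work_run() while exiting is false, and the exit path to calling it with exiting set to true. The example callback checks the flag to tell the two apart, in the way the changelog suggests a callback can rely on PF_EXITING.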


