All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/3] task_work_add (was: task_work_queue)
@ 2012-04-14  2:12 Oleg Nesterov
  2012-04-14  2:12 ` [PATCH v4 1/3] task_work_add: generic process-context callbacks Oleg Nesterov
                   ` (6 more replies)
  0 siblings, 7 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-14  2:12 UTC (permalink / raw)
  To: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner
  Cc: Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

Hello.

Changes based on review from Andrew:

	- s/task_work_queue/task_work_add/

	- don't create a secret dependency upon task_work.h
	  including sched.h in tracehook.h and keys/internal.h

	- add more includes into task_work.[ch] to not rely
	  on "sched.h includes everything"

The genirq patch comes as 2/3, it was already acked by Thomas.

3/3 still needs the ack from David, although his last review
was positive and I hope I fully addressed his comments.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v4 1/3] task_work_add: generic process-context callbacks
  2012-04-14  2:12 [PATCH v4 0/3] task_work_add (was: task_work_queue) Oleg Nesterov
@ 2012-04-14  2:12 ` Oleg Nesterov
  2012-04-14  2:40   ` Linus Torvalds
  2012-04-14  2:12 ` [PATCH v4 2/3] genirq: reimplement exit_irq_thread() hook via task_work_add() Oleg Nesterov
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-14  2:12 UTC (permalink / raw)
  To: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner
  Cc: Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

Provide a simple mechanism that allows running code in the
(nonatomic) context of the arbitrary task.

The caller does task_work_add(task, task_work) and this task
executes task_work->func() either from do_notify_resume() or
from do_exit(). The callback can rely on PF_EXITING to detect
the latter case.

"struct task_work" can be embedded in another struct, still it
has "void *data" to handle the most common/simple case.

This allows us to kill the ->replacement_session_keyring hack,
and potentially this can have more users.

Performance-wise, this adds 2 "unlikely(!hlist_empty())" checks
into tracehook_notify_resume() and do_exit(). But at the same
time we can remove the "replacement_session_keyring != NULL"
checks from arch/*/signal.c and exit_creds().

Note: task_work_add/task_work_run abuses ->pi_lock. This is
only because this lock is already used by lookup_pi_state() to
synchronize with do_exit() setting PF_EXITING. Fortunately the
scope of this lock in task_work.c is really tiny, and the code
is unlikely anyway.

v2:
	- implement task_work_cancel(func), it removes the first
	  task_work with the same callback.
v3:
	- task_work_add() gets the new arg, "bool notify" to
	  conditionalize set_notify_resume(), this makes it useable
	  for kthreads and task_work_add(notify => false) can
	  work without TIF_NOTIFY_RESUME.

	- don't add the dummy "ifndef TIF_NOTIFY_RESUME" inlines,
	  just add the simple check in task_work_add().
v4:
	- s/task_work_queue/task_work_add/

Todo:
	- move clear_thread_flag(TIF_NOTIFY_RESUME) from arch/
	  to tracehook_notify_resume()

	- rename tracehook_notify_resume() and move it into
	  linux/task_work.h

	- m68k and xtensa don't have TIF_NOTIFY_RESUME and thus
	  task_work_add(notify => true) fails with -ENOTSUPP.

	  However, ->replacement_session_keyring equally needs
	  this flag, task_work_add() is not worse.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/sched.h     |    2 +
 include/linux/task_work.h |   33 ++++++++++++++++++
 include/linux/tracehook.h |   13 ++++++-
 kernel/Makefile           |    2 +-
 kernel/exit.c             |    5 ++-
 kernel/fork.c             |    1 +
 kernel/task_work.c        |   83 +++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 135 insertions(+), 4 deletions(-)
 create mode 100644 include/linux/task_work.h
 create mode 100644 kernel/task_work.c

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 81a173c..be004ac 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1445,6 +1445,8 @@ struct task_struct {
 	int (*notifier)(void *priv);
 	void *notifier_data;
 	sigset_t *notifier_mask;
+	struct hlist_head task_works;
+
 	struct audit_context *audit_context;
 #ifdef CONFIG_AUDITSYSCALL
 	uid_t loginuid;
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
new file mode 100644
index 0000000..9ed6c31
--- /dev/null
+++ b/include/linux/task_work.h
@@ -0,0 +1,33 @@
+#ifndef _LINUX_TASK_WORK_H
+#define _LINUX_TASK_WORK_H
+
+#include <linux/list.h>
+#include <linux/sched.h>
+
+struct task_work;
+typedef void (*task_work_func_t)(struct task_work *);
+
+struct task_work {
+	struct hlist_node hlist;
+	task_work_func_t func;
+	void *data;
+};
+
+static inline void
+init_task_work(struct task_work *twork, task_work_func_t func, void *data)
+{
+	twork->func = func;
+	twork->data = data;
+}
+
+int task_work_add(struct task_struct *task, struct task_work *twork, bool);
+struct task_work *task_work_cancel(struct task_struct *, task_work_func_t);
+void task_work_run(struct task_struct *task);
+
+static inline void exit_task_work(struct task_struct *task)
+{
+	if (unlikely(!hlist_empty(&task->task_works)))
+		task_work_run(task);
+}
+
+#endif	/* _LINUX_TASK_WORK_H */
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index 51bd91d..a575f26 100644
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -49,6 +49,7 @@
 #include <linux/sched.h>
 #include <linux/ptrace.h>
 #include <linux/security.h>
+#include <linux/task_work.h>
 struct linux_binprm;
 
 /*
@@ -153,7 +154,6 @@ static inline void tracehook_signal_handler(int sig, siginfo_t *info,
 		ptrace_notify(SIGTRAP);
 }
 
-#ifdef TIF_NOTIFY_RESUME
 /**
  * set_notify_resume - cause tracehook_notify_resume() to be called
  * @task:		task that will call tracehook_notify_resume()
@@ -165,8 +165,10 @@ static inline void tracehook_signal_handler(int sig, siginfo_t *info,
  */
 static inline void set_notify_resume(struct task_struct *task)
 {
+#ifdef TIF_NOTIFY_RESUME
 	if (!test_and_set_tsk_thread_flag(task, TIF_NOTIFY_RESUME))
 		kick_process(task);
+#endif
 }
 
 /**
@@ -184,7 +186,14 @@ static inline void set_notify_resume(struct task_struct *task)
  */
 static inline void tracehook_notify_resume(struct pt_regs *regs)
 {
+	/*
+	 * The caller just cleared TIF_NOTIFY_RESUME. This barrier
+	 * pairs with task_work_add()->set_notify_resume() after
+	 * hlist_add_head(task->task_works);
+	 */
+	smp_mb__after_clear_bit();
+	if (unlikely(!hlist_empty(&current->task_works)))
+		task_work_run(current);
 }
-#endif	/* TIF_NOTIFY_RESUME */
 
 #endif	/* <linux/tracehook.h> */
diff --git a/kernel/Makefile b/kernel/Makefile
index cb41b95..5790f8b 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -10,7 +10,7 @@ obj-y     = fork.o exec_domain.o panic.o printk.o \
 	    kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
 	    hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \
 	    notifier.o ksysfs.o cred.o \
-	    async.o range.o groups.o
+	    async.o range.o groups.o task_work.o
 
 ifdef CONFIG_FUNCTION_TRACER
 # Do not trace debug files and internal ftrace files
diff --git a/kernel/exit.c b/kernel/exit.c
index d8bd3b4..b82c38e 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -946,11 +946,14 @@ void do_exit(long code)
 	exit_signals(tsk);  /* sets PF_EXITING */
 	/*
 	 * tsk->flags are checked in the futex code to protect against
-	 * an exiting task cleaning up the robust pi futexes.
+	 * an exiting task cleaning up the robust pi futexes, and in
+	 * task_work_add() to avoid the race with exit_task_work().
 	 */
 	smp_mb();
 	raw_spin_unlock_wait(&tsk->pi_lock);
 
+	exit_task_work(tsk);
+
 	exit_irq_thread();
 
 	if (unlikely(in_atomic()))
diff --git a/kernel/fork.c b/kernel/fork.c
index b9372a0..d1108ac 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1380,6 +1380,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	 */
 	p->group_leader = p;
 	INIT_LIST_HEAD(&p->thread_group);
+	INIT_HLIST_HEAD(&p->task_works);
 
 	/* Now that the task is set up, run cgroup callbacks if
 	 * necessary. We need to run them before the task is visible
diff --git a/kernel/task_work.c b/kernel/task_work.c
new file mode 100644
index 0000000..a7fe579
--- /dev/null
+++ b/kernel/task_work.c
@@ -0,0 +1,83 @@
+#include <linux/spinlock.h>
+#include <linux/task_work.h>
+#include <linux/tracehook.h>
+
+int
+task_work_add(struct task_struct *task, struct task_work *twork, bool notify)
+{
+	unsigned long flags;
+	int err = -ESRCH;
+
+#ifndef TIF_NOTIFY_RESUME
+	if (notify)
+		return -ENOTSUPP;
+#endif
+	/*
+	 * We must not insert the new work if the task has already passed
+	 * exit_task_work(). We rely on do_exit()->raw_spin_unlock_wait()
+	 * and check PF_EXITING under pi_lock.
+	 */
+	raw_spin_lock_irqsave(&task->pi_lock, flags);
+	if (likely(!(task->flags & PF_EXITING))) {
+		hlist_add_head(&twork->hlist, &task->task_works);
+		err = 0;
+	}
+	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+
+	/* test_and_set_bit() implies mb(), see tracehook_notify_resume(). */
+	if (likely(!err) && notify)
+		set_notify_resume(task);
+	return err;
+}
+
+struct task_work *
+task_work_cancel(struct task_struct *task, task_work_func_t func)
+{
+	unsigned long flags;
+	struct task_work *twork;
+	struct hlist_node *pos;
+
+	raw_spin_lock_irqsave(&task->pi_lock, flags);
+	hlist_for_each_entry(twork, pos, &task->task_works, hlist) {
+		if (twork->func == func) {
+			hlist_del(&twork->hlist);
+			goto found;
+		}
+	}
+	twork = NULL;
+ found:
+	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+
+	return twork;
+}
+
+void task_work_run(struct task_struct *task)
+{
+	struct hlist_head task_works;
+	struct hlist_node *pos;
+
+	raw_spin_lock_irq(&task->pi_lock);
+	hlist_move_list(&task->task_works, &task_works);
+	raw_spin_unlock_irq(&task->pi_lock);
+
+	if (unlikely(hlist_empty(&task_works)))
+		return;
+	/*
+	 * We use hlist to save the space in task_struct, but we want fifo.
+	 * Find the last entry, the list should be short, then process them
+	 * in reverse order.
+	 */
+	for (pos = task_works.first; pos->next; pos = pos->next)
+		;
+
+	for (;;) {
+		struct hlist_node **pprev = pos->pprev;
+		struct task_work *twork = container_of(pos, struct task_work,
+							hlist);
+		twork->func(twork);
+
+		if (pprev == &task_works.first)
+			break;
+		pos = container_of(pprev, struct hlist_node, next);
+	}
+}
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v4 2/3] genirq: reimplement exit_irq_thread() hook via task_work_add()
  2012-04-14  2:12 [PATCH v4 0/3] task_work_add (was: task_work_queue) Oleg Nesterov
  2012-04-14  2:12 ` [PATCH v4 1/3] task_work_add: generic process-context callbacks Oleg Nesterov
@ 2012-04-14  2:12 ` Oleg Nesterov
  2012-04-14  2:12 ` [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add() Oleg Nesterov
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-14  2:12 UTC (permalink / raw)
  To: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner
  Cc: Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

exit_irq_thread() and task->irq_thread are needed to handle
the unexpected (and unlikely) exit of irq-thread.

We can use task_work instead and make this all private to
kernel/irq/manage.c, cleanup plus micro-optimization.

1. rename exit_irq_thread() to irq_thread_dtor(), make it
   static, and move it up before irq_thread().

2. change irq_thread() to do task_queue_work(irq_thread_dtor)
   at the start and task_work_cancel() before return.

   tracehook_notify_resume() can never play with kthreads,
   only do_exit()->exit_task_work() can call the callback
   and this is what we want.

3. remove task_struct->irq_thread and the special hook
   in do_exit().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/interrupt.h |    4 --
 include/linux/sched.h     |   10 +-----
 kernel/exit.c             |    2 -
 kernel/irq/manage.c       |   69 +++++++++++++++++++++-----------------------
 4 files changed, 35 insertions(+), 50 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 2aea5d2..1cdd4d0 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -142,8 +142,6 @@ request_any_context_irq(unsigned int irq, irq_handler_t handler,
 extern int __must_check
 request_percpu_irq(unsigned int irq, irq_handler_t handler,
 		   const char *devname, void __percpu *percpu_dev_id);
-
-extern void exit_irq_thread(void);
 #else
 
 extern int __must_check
@@ -177,8 +175,6 @@ request_percpu_irq(unsigned int irq, irq_handler_t handler,
 {
 	return request_irq(irq, handler, 0, devname, percpu_dev_id);
 }
-
-static inline void exit_irq_thread(void) { }
 #endif
 
 extern void free_irq(unsigned int, void *);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index be004ac..e36edfb 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1346,11 +1346,6 @@ struct task_struct {
 	unsigned sched_reset_on_fork:1;
 	unsigned sched_contributes_to_load:1;
 
-#ifdef CONFIG_GENERIC_HARDIRQS
-	/* IRQ handler threads */
-	unsigned irq_thread:1;
-#endif
-
 	pid_t pid;
 	pid_t tgid;
 
@@ -1358,10 +1353,9 @@ struct task_struct {
 	/* Canary value for the -fstack-protector gcc feature */
 	unsigned long stack_canary;
 #endif
-
-	/* 
+	/*
 	 * pointers to (original) parent process, youngest child, younger sibling,
-	 * older sibling, respectively.  (p->father can be replaced with 
+	 * older sibling, respectively.  (p->father can be replaced with
 	 * p->real_parent->pid)
 	 */
 	struct task_struct __rcu *real_parent; /* real parent process */
diff --git a/kernel/exit.c b/kernel/exit.c
index b82c38e..8135243 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -954,8 +954,6 @@ void do_exit(long code)
 
 	exit_task_work(tsk);
 
-	exit_irq_thread();
-
 	if (unlikely(in_atomic()))
 		printk(KERN_INFO "note: %s[%d] exited with preempt_count %d\n",
 				current->comm, task_pid_nr(current),
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 89a3ea8..525391c 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -14,6 +14,7 @@
 #include <linux/interrupt.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
+#include <linux/task_work.h>
 
 #include "internals.h"
 
@@ -773,11 +774,39 @@ static void wake_threads_waitq(struct irq_desc *desc)
 		wake_up(&desc->wait_for_threads);
 }
 
+static void irq_thread_dtor(struct task_work *unused)
+{
+	struct task_struct *tsk = current;
+	struct irq_desc *desc;
+	struct irqaction *action;
+
+	if (WARN_ON_ONCE(!(current->flags & PF_EXITING)))
+		return;
+
+	action = kthread_data(tsk);
+
+	printk(KERN_ERR
+	       "exiting task \"%s\" (%d) is an active IRQ thread (irq %d)\n",
+	       tsk->comm ? tsk->comm : "", tsk->pid, action->irq);
+
+	desc = irq_to_desc(action->irq);
+	/*
+	 * If IRQTF_RUNTHREAD is set, we need to decrement
+	 * desc->threads_active and wake possible waiters.
+	 */
+	if (test_and_clear_bit(IRQTF_RUNTHREAD, &action->thread_flags))
+		wake_threads_waitq(desc);
+
+	/* Prevent a stale desc->threads_oneshot */
+	irq_finalize_oneshot(desc, action);
+}
+
 /*
  * Interrupt handler thread
  */
 static int irq_thread(void *data)
 {
+	struct task_work on_exit_work;
 	static const struct sched_param param = {
 		.sched_priority = MAX_USER_RT_PRIO/2,
 	};
@@ -793,7 +822,9 @@ static int irq_thread(void *data)
 		handler_fn = irq_thread_fn;
 
 	sched_setscheduler(current, SCHED_FIFO, &param);
-	current->irq_thread = 1;
+
+	init_task_work(&on_exit_work, irq_thread_dtor, NULL);
+	task_work_add(current, &on_exit_work, false);
 
 	while (!irq_wait_for_interrupt(action)) {
 		irqreturn_t action_ret;
@@ -815,45 +846,11 @@ static int irq_thread(void *data)
 	 * cannot touch the oneshot mask at this point anymore as
 	 * __setup_irq() might have given out currents thread_mask
 	 * again.
-	 *
-	 * Clear irq_thread. Otherwise exit_irq_thread() would make
-	 * fuzz about an active irq thread going into nirvana.
 	 */
-	current->irq_thread = 0;
+	task_work_cancel(current, irq_thread_dtor);
 	return 0;
 }
 
-/*
- * Called from do_exit()
- */
-void exit_irq_thread(void)
-{
-	struct task_struct *tsk = current;
-	struct irq_desc *desc;
-	struct irqaction *action;
-
-	if (!tsk->irq_thread)
-		return;
-
-	action = kthread_data(tsk);
-
-	printk(KERN_ERR
-	       "exiting task \"%s\" (%d) is an active IRQ thread (irq %d)\n",
-	       tsk->comm ? tsk->comm : "", tsk->pid, action->irq);
-
-	desc = irq_to_desc(action->irq);
-
-	/*
-	 * If IRQTF_RUNTHREAD is set, we need to decrement
-	 * desc->threads_active and wake possible waiters.
-	 */
-	if (test_and_clear_bit(IRQTF_RUNTHREAD, &action->thread_flags))
-		wake_threads_waitq(desc);
-
-	/* Prevent a stale desc->threads_oneshot */
-	irq_finalize_oneshot(desc, action);
-}
-
 static void irq_setup_forced_threading(struct irqaction *new)
 {
 	if (!force_irqthreads)
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-14  2:12 [PATCH v4 0/3] task_work_add (was: task_work_queue) Oleg Nesterov
  2012-04-14  2:12 ` [PATCH v4 1/3] task_work_add: generic process-context callbacks Oleg Nesterov
  2012-04-14  2:12 ` [PATCH v4 2/3] genirq: reimplement exit_irq_thread() hook via task_work_add() Oleg Nesterov
@ 2012-04-14  2:12 ` Oleg Nesterov
  2012-04-14  2:32 ` [PATCH v4 0/3] task_work_add (was: task_work_queue) Linus Torvalds
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-14  2:12 UTC (permalink / raw)
  To: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner
  Cc: Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

Change keyctl_session_to_parent() to use task_work_add() and
move key_replace_session_keyring() logic into task_work->func().

Note that we do task_work_cancel() before task_work_add() to
ensure that only one work can be pending at any time. This is
important, we must not allow user-space to abuse the parent's
->task_works list.

The callback, replace_session_keyring(), checks PF_EXITING.
I guess this is not really needed but looks better.

As a side effect, this fixes the (unlikely) race. The callers
of key_replace_session_keyring() and keyctl_session_to_parent()
lack the necessary barriers, the parent can miss the request.

Now we can remove task_struct->replacement_session_keyring and
related code.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/key.h          |    6 +--
 security/keys/internal.h     |    2 +
 security/keys/keyctl.c       |   76 ++++++++++++++++++++----------------------
 security/keys/process_keys.c |   25 ++++----------
 4 files changed, 47 insertions(+), 62 deletions(-)

diff --git a/include/linux/key.h b/include/linux/key.h
index 96933b1..0c263d6 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -33,6 +33,8 @@ typedef uint32_t key_perm_t;
 
 struct key;
 
+#define key_replace_session_keyring()	do { } while (0)
+
 #ifdef CONFIG_KEYS
 
 #undef KEY_DEBUGGING
@@ -302,9 +304,6 @@ static inline bool key_is_instantiated(const struct key *key)
 #ifdef CONFIG_SYSCTL
 extern ctl_table key_sysctls[];
 #endif
-
-extern void key_replace_session_keyring(void);
-
 /*
  * the userspace interface
  */
@@ -327,7 +326,6 @@ extern void key_init(void);
 #define key_fsuid_changed(t)		do { } while(0)
 #define key_fsgid_changed(t)		do { } while(0)
 #define key_init()			do { } while(0)
-#define key_replace_session_keyring()	do { } while(0)
 
 #endif /* CONFIG_KEYS */
 #endif /* __KERNEL__ */
diff --git a/security/keys/internal.h b/security/keys/internal.h
index 65647f8..c2986da 100644
--- a/security/keys/internal.h
+++ b/security/keys/internal.h
@@ -14,6 +14,7 @@
 
 #include <linux/sched.h>
 #include <linux/key-type.h>
+#include <linux/task_work.h>
 
 #ifdef __KDEBUG
 #define kenter(FMT, ...) \
@@ -148,6 +149,7 @@ extern key_ref_t lookup_user_key(key_serial_t id, unsigned long flags,
 #define KEY_LOOKUP_FOR_UNLINK	0x04
 
 extern long join_session_keyring(const char *name);
+extern void key_change_session_keyring(struct task_work *twork);
 
 extern struct work_struct key_gc_work;
 extern unsigned key_gc_delay;
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index fb767c6..f946fca 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -1423,50 +1423,57 @@ long keyctl_get_security(key_serial_t keyid,
  */
 long keyctl_session_to_parent(void)
 {
-#ifdef TIF_NOTIFY_RESUME
 	struct task_struct *me, *parent;
 	const struct cred *mycred, *pcred;
-	struct cred *cred, *oldcred;
+	struct task_work *newwork, *oldwork;
 	key_ref_t keyring_r;
-	int ret;
+	struct cred *cred;
+	int err;
 
 	keyring_r = lookup_user_key(KEY_SPEC_SESSION_KEYRING, 0, KEY_LINK);
 	if (IS_ERR(keyring_r))
 		return PTR_ERR(keyring_r);
 
+	err = -ENOMEM;
+	newwork = kmalloc(sizeof(struct task_work), GFP_KERNEL);
+	if (!newwork)
+		goto error_keyring;
+
 	/* our parent is going to need a new cred struct, a new tgcred struct
 	 * and new security data, so we allocate them here to prevent ENOMEM in
 	 * our parent */
-	ret = -ENOMEM;
 	cred = cred_alloc_blank();
 	if (!cred)
 		goto error_keyring;
 
 	cred->tgcred->session_keyring = key_ref_to_ptr(keyring_r);
-	keyring_r = NULL;
+	init_task_work(newwork, key_change_session_keyring, cred);
 
 	me = current;
 	rcu_read_lock();
 	write_lock_irq(&tasklist_lock);
 
+	err = -EPERM;
+	oldwork = NULL;
 	parent = me->real_parent;
-	ret = -EPERM;
 
 	/* the parent mustn't be init and mustn't be a kernel thread */
 	if (parent->pid <= 1 || !parent->mm)
-		goto not_permitted;
+		goto unlock;
 
 	/* the parent must be single threaded */
 	if (!thread_group_empty(parent))
-		goto not_permitted;
+		goto unlock;
 
 	/* the parent and the child must have different session keyrings or
 	 * there's no point */
 	mycred = current_cred();
 	pcred = __task_cred(parent);
 	if (mycred == pcred ||
-	    mycred->tgcred->session_keyring == pcred->tgcred->session_keyring)
-		goto already_same;
+	    mycred->tgcred->session_keyring == pcred->tgcred->session_keyring) {
+		err = 0;
+		goto unlock;
+	}
 
 	/* the parent must have the same effective ownership and mustn't be
 	 * SUID/SGID */
@@ -1476,50 +1483,39 @@ long keyctl_session_to_parent(void)
 	    pcred->gid	!= mycred->egid	||
 	    pcred->egid	!= mycred->egid	||
 	    pcred->sgid	!= mycred->egid)
-		goto not_permitted;
+		goto unlock;
 
 	/* the keyrings must have the same UID */
 	if ((pcred->tgcred->session_keyring &&
 	     pcred->tgcred->session_keyring->uid != mycred->euid) ||
 	    mycred->tgcred->session_keyring->uid != mycred->euid)
-		goto not_permitted;
+		goto unlock;
 
-	/* if there's an already pending keyring replacement, then we replace
-	 * that */
-	oldcred = parent->replacement_session_keyring;
+	/* cancel an already pending keyring replacement */
+	oldwork = task_work_cancel(parent, key_change_session_keyring);
 
 	/* the replacement session keyring is applied just prior to userspace
 	 * restarting */
-	parent->replacement_session_keyring = cred;
-	cred = NULL;
-	set_ti_thread_flag(task_thread_info(parent), TIF_NOTIFY_RESUME);
-
-	write_unlock_irq(&tasklist_lock);
-	rcu_read_unlock();
-	if (oldcred)
-		put_cred(oldcred);
-	return 0;
-
-already_same:
-	ret = 0;
-not_permitted:
+	err = task_work_add(parent, newwork, true);
+	if (!err)
+		newwork = NULL;
+unlock:
 	write_unlock_irq(&tasklist_lock);
 	rcu_read_unlock();
-	put_cred(cred);
-	return ret;
+	if (oldwork) {
+		put_cred(oldwork->data);
+		kfree(oldwork);
+	}
+	if (newwork) {
+		put_cred(newwork->data);
+		kfree(newwork);
+	}
+	return err;
 
 error_keyring:
+	kfree(newwork);
 	key_ref_put(keyring_r);
-	return ret;
-
-#else /* !TIF_NOTIFY_RESUME */
-	/*
-	 * To be removed when TIF_NOTIFY_RESUME has been implemented on
-	 * m68k/xtensa
-	 */
-#warning TIF_NOTIFY_RESUME not implemented
-	return -EOPNOTSUPP;
-#endif /* !TIF_NOTIFY_RESUME */
+	return err;
 }
 
 /*
diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c
index be7ecb2..eb6d804 100644
--- a/security/keys/process_keys.c
+++ b/security/keys/process_keys.c
@@ -828,27 +828,17 @@ error:
 	return ret;
 }
 
-/*
- * Replace a process's session keyring on behalf of one of its children when
- * the target  process is about to resume userspace execution.
- */
-void key_replace_session_keyring(void)
+void key_change_session_keyring(struct task_work *twork)
 {
-	const struct cred *old;
-	struct cred *new;
+	const struct cred *old = current_cred();
+	struct cred *new = twork->data;
 
-	if (!current->replacement_session_keyring)
-		return;
-
-	write_lock_irq(&tasklist_lock);
-	new = current->replacement_session_keyring;
-	current->replacement_session_keyring = NULL;
-	write_unlock_irq(&tasklist_lock);
-
-	if (!new)
+	kfree(twork);
+	if (unlikely(current->flags & PF_EXITING)) {
+		put_cred(new);
 		return;
+	}
 
-	old = current_cred();
 	new->  uid	= old->  uid;
 	new-> euid	= old-> euid;
 	new-> suid	= old-> suid;
@@ -873,6 +863,5 @@ void key_replace_session_keyring(void)
 	new->tgcred->process_keyring = key_get(old->tgcred->process_keyring);
 
 	security_transfer_creds(new, old);
-
 	commit_creds(new);
 }
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 0/3] task_work_add (was: task_work_queue)
  2012-04-14  2:12 [PATCH v4 0/3] task_work_add (was: task_work_queue) Oleg Nesterov
                   ` (2 preceding siblings ...)
  2012-04-14  2:12 ` [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add() Oleg Nesterov
@ 2012-04-14  2:32 ` Linus Torvalds
  2012-04-14  3:28   ` Oleg Nesterov
  2012-04-14 18:08 ` Peter Zijlstra
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2012-04-14  2:32 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, David Howells, Thomas Gleixner, Alexander Gordeev,
	Chris Zankel, David Smith, Frank Ch. Eigler, Geert Uytterhoeven,
	Larry Woodman, Peter Zijlstra, Tejun Heo, linux-kernel

On Fri, Apr 13, 2012 at 7:12 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>
> Changes based on review from Andrew:
>
>        - s/task_work_queue/task_work_add/
>
>        - don't create a secret dependency upon task_work.h
>          including sched.h in tracehook.h and keys/internal.h
>
>        - add more includes into task_work.[ch] to not rely
>          on "sched.h includes everything"

I thought based on Andrew's comments that you were going to remove the
extra code to do FIFO for no obvious reason. Apparently nothing
actually wanted/needed it, so why do it?

                  Linus

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 1/3] task_work_add: generic process-context callbacks
  2012-04-14  2:12 ` [PATCH v4 1/3] task_work_add: generic process-context callbacks Oleg Nesterov
@ 2012-04-14  2:40   ` Linus Torvalds
  2012-04-14  3:05     ` Oleg Nesterov
                       ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Linus Torvalds @ 2012-04-14  2:40 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, David Howells, Thomas Gleixner, Alexander Gordeev,
	Chris Zankel, David Smith, Frank Ch. Eigler, Geert Uytterhoeven,
	Larry Woodman, Peter Zijlstra, Tejun Heo, linux-kernel

This is seriously buggy:

On Fri, Apr 13, 2012 at 7:12 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>
> +void task_work_run(struct task_struct *task)
> +{
> +       struct hlist_head task_works;
> +       struct hlist_node *pos;
> +
> +       raw_spin_lock_irq(&task->pi_lock);
> +       hlist_move_list(&task->task_works, &task_works);
> +       raw_spin_unlock_irq(&task->pi_lock);
> +
> +       if (unlikely(hlist_empty(&task_works)))
> +               return;
> +       /*
> +        * We use hlist to save the space in task_struct, but we want fifo.
> +        * Find the last entry, the list should be short, then process them
> +        * in reverse order.
> +        */
> +       for (pos = task_works.first; pos->next; pos = pos->next)
> +               ;
> +
> +       for (;;) {
> +               struct hlist_node **pprev = pos->pprev;
> +               struct task_work *twork = container_of(pos, struct task_work,
> +                                                       hlist);
> +               twork->func(twork);
> +
> +               if (pprev == &task_works.first)
> +                       break;
> +               pos = container_of(pprev, struct hlist_node, next);
> +       }
> +}

No can do. You've removed the task-work from the process list, and you
no longer hold the spinlock that protects that list. That means that
you *cannot* access the task-work data structure any more, because it
may long be gone.

Look at the users of this interface that you wrote yourself. They
allocate the task-work on the stack, and do a "task_work_cancel()"
before returning. That data structure is *gone*. You can't dereference
it any more.

So quite frankly, the only safe approach is to copy the twork->func
while holding the lock. And passing in the "twork" to the function
isn't safe either, as far as I can see, since it may be gone too.

Basically, *any* access of 'twork' after it is removed from the list
and you have released the task spinlock is unsafe, as far as I can
tell.

Alternatively, you must make the rule be that the data can only be
freed by the caller *if* it was returned from "task_work_cancel()".
But then you can't allocate it on the stack any more, and have to
allocate it separately.

Or you need to implement some kind of "task_work_cancel_sync()"
function that guarantees that it waits for the actual work function to
finish. And I don't know how you'd do that.

But as it is, this series looks seriously buggy.

                      Linus

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 1/3] task_work_add: generic process-context callbacks
  2012-04-14  2:40   ` Linus Torvalds
@ 2012-04-14  3:05     ` Oleg Nesterov
  2012-04-14  3:39     ` Oleg Nesterov
  2012-04-17 13:53     ` David Howells
  2 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-14  3:05 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, David Howells, Thomas Gleixner, Alexander Gordeev,
	Chris Zankel, David Smith, Frank Ch. Eigler, Geert Uytterhoeven,
	Larry Woodman, Peter Zijlstra, Tejun Heo, linux-kernel

On 04/13, Linus Torvalds wrote:
>
> This is seriously buggy:

I am already sleep. But,

> On Fri, Apr 13, 2012 at 7:12 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > +void task_work_run(struct task_struct *task)
> > +{
> > +       struct hlist_head task_works;
> > +       struct hlist_node *pos;
> > +
> > +       raw_spin_lock_irq(&task->pi_lock);
> > +       hlist_move_list(&task->task_works, &task_works);
> > +       raw_spin_unlock_irq(&task->pi_lock);
> > +
> > +       if (unlikely(hlist_empty(&task_works)))
> > +               return;
> > +       /*
> > +        * We use hlist to save the space in task_struct, but we want fifo.
> > +        * Find the last entry, the list should be short, then process them
> > +        * in reverse order.
> > +        */
> > +       for (pos = task_works.first; pos->next; pos = pos->next)
> > +               ;
> > +
> > +       for (;;) {
> > +               struct hlist_node **pprev = pos->pprev;
> > +               struct task_work *twork = container_of(pos, struct task_work,
> > +                                                       hlist);
> > +               twork->func(twork);
> > +
> > +               if (pprev == &task_works.first)
> > +                       break;
> > +               pos = container_of(pprev, struct hlist_node, next);
> > +       }
> > +}
>
> No can do. You've removed the task-work from the process list, and you
> no longer hold the spinlock that protects that list. That means that
> you *cannot* access the task-work data structure any more, because it
> may long be gone.
>
> Look at the users of this interface that you wrote yourself. They
> allocate the task-work on the stack, and do a "task_work_cancel()"
> before returning. That data structure is *gone*. You can't dereference
> it any more.

tsk is always "current", probably this should be documented, I'll add
the comment.

So this can't race with irq_thread() which uses the task_work on stack.

> Basically, *any* access of 'twork' after it is removed from the list
> and you have released the task spinlock is unsafe, as far as I can
> tell.

I don't follow.

Once task_work_run() removes task_work from list (and drops the lock)
nobody can use this twork. task_work_cancel obviously can't find it,
it will return NULL.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 0/3] task_work_add (was: task_work_queue)
  2012-04-14  2:32 ` [PATCH v4 0/3] task_work_add (was: task_work_queue) Linus Torvalds
@ 2012-04-14  3:28   ` Oleg Nesterov
  0 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-14  3:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, David Howells, Thomas Gleixner, Alexander Gordeev,
	Chris Zankel, David Smith, Frank Ch. Eigler, Geert Uytterhoeven,
	Larry Woodman, Peter Zijlstra, Tejun Heo, linux-kernel

On 04/13, Linus Torvalds wrote:
>
> On Fri, Apr 13, 2012 at 7:12 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > Changes based on review from Andrew:
> >
> >        - s/task_work_queue/task_work_add/
> >
> >        - don't create a secret dependency upon task_work.h
> >          including sched.h in tracehook.h and keys/internal.h
> >
> >        - add more includes into task_work.[ch] to not rely
> >          on "sched.h includes everything"
>
> I thought based on Andrew's comments that you were going to remove the
> extra code to do FIFO for no obvious reason.

No, I tried to defense fifo,

> Apparently nothing
> actually wanted/needed it,

True, currently nothing. And most probably never will. task_work_add()
should only be used for the "unlikely" events.

> so why do it?

I can remove it.

But, unless you strongly object, personally I'd prefer to keep fifo.

Once again. Imho of course, but fifo is simply "obviously better"
from the common sense pov when it comes to submit-the-work-for-
execution.

For example. keyctl_session_to_parent() does _cancel only to protect
from exploits doing keyctl(KEYCTL_SESSION_TO_PARENT) in an endless
loop. It could simply do task_work_add(), but in this case we need
fifo for correctness.

But, again, again, I am not going to argue too much.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 1/3] task_work_add: generic process-context callbacks
  2012-04-14  2:40   ` Linus Torvalds
  2012-04-14  3:05     ` Oleg Nesterov
@ 2012-04-14  3:39     ` Oleg Nesterov
  2012-04-14  8:00       ` Linus Torvalds
  2012-04-17 13:53     ` David Howells
  2 siblings, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-14  3:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, David Howells, Thomas Gleixner, Alexander Gordeev,
	Chris Zankel, David Smith, Frank Ch. Eigler, Geert Uytterhoeven,
	Larry Woodman, Peter Zijlstra, Tejun Heo, linux-kernel

probably I should also comment this part...

On 04/13, Linus Torvalds wrote:
>
> Alternatively, you must make the rule be that the data can only be
> freed by the caller *if* it was returned from "task_work_cancel()".

Exactly.

Once the caller does task_work_add(twork), it no longer "owns" this
twork.

But, if task_work_cancel() succeeds - you own it again.

> But then you can't allocate it on the stack any more, and have to
> allocate it separately.

Yes, unless you do task_work_add/cancel(current).

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 1/3] task_work_add: generic process-context callbacks
  2012-04-14  3:39     ` Oleg Nesterov
@ 2012-04-14  8:00       ` Linus Torvalds
  2012-04-14 20:27         ` Oleg Nesterov
  0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2012-04-14  8:00 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, David Howells, Thomas Gleixner, Alexander Gordeev,
	Chris Zankel, David Smith, Frank Ch. Eigler, Geert Uytterhoeven,
	Larry Woodman, Peter Zijlstra, Tejun Heo, linux-kernel

On Fri, Apr 13, 2012 at 8:39 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>>
>> Alternatively, you must make the rule be that the data can only be
>> freed by the caller *if* it was returned from "task_work_cancel()".
>
> Exactly.
>
> Once the caller does task_work_add(twork), it no longer "owns" this
> twork.
>
> But, if task_work_cancel() succeeds - you own it again.

*IF* it succeeds.

My argument was that it might race, and not successfully remove the
work entry at all - but because it was allocated on the stack, it gets
freed *regardless* of whether the cancel got it or not.

>> But then you can't allocate it on the stack any more, and have to
>> allocate it separately.
>
> Yes, unless you do task_work_add/cancel(current).

Ok, your argument seems to be that "current" is special, and can not
race, because the work execution is always synchronous with the task
it got scheduled on.

I can buy that. But if so, it needs some big honking comment. Maybe
the whole "add yourself onto your *own* task work" should be an
explicit separate interface.

And that whole "run_task_work()" function should *not* take a "task"
pointer, because it would be horribly horribly wrong to ever run it in
any context than "current".

                 Linus

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 0/3] task_work_add (was: task_work_queue)
  2012-04-14  2:12 [PATCH v4 0/3] task_work_add (was: task_work_queue) Oleg Nesterov
                   ` (3 preceding siblings ...)
  2012-04-14  2:32 ` [PATCH v4 0/3] task_work_add (was: task_work_queue) Linus Torvalds
@ 2012-04-14 18:08 ` Peter Zijlstra
  2012-04-14 20:17   ` Oleg Nesterov
  2012-04-17 14:11 ` [PATCH v4 2/3] genirq: reimplement exit_irq_thread() hook via task_work_add() David Howells
  2012-04-17 14:23 ` [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add() David Howells
  6 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2012-04-14 18:08 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Tejun Heo, linux-kernel

On Sat, 2012-04-14 at 04:12 +0200, Oleg Nesterov wrote:
> Hello.
> 
> Changes based on review from Andrew:
> 
> 	- s/task_work_queue/task_work_add/
> 
> 	- don't create a secret dependency upon task_work.h
> 	  including sched.h in tracehook.h and keys/internal.h
> 
> 	- add more includes into task_work.[ch] to not rely
> 	  on "sched.h includes everything"
> 
> The genirq patch comes as 2/3, it was already acked by Thomas.
> 
> 3/3 still needs the ack from David, although his last review
> was positive and I hope I fully addressed his comments.

Can you use this to get rid of TIF_UPROBE as well?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 0/3] task_work_add (was: task_work_queue)
  2012-04-14 18:08 ` Peter Zijlstra
@ 2012-04-14 20:17   ` Oleg Nesterov
  0 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-14 20:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrew Morton, David Howells, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Tejun Heo, linux-kernel

On 04/14, Peter Zijlstra wrote:
>
> Can you use this to get rid of TIF_UPROBE as well?

Yes, I thought about uprobes. May be it can use task_work, but
it is not easy to avoid TIF_UPROBE. The problem is the first
UTASK_BP_HIT when current->utask == NULL, and ->utask is the
only "right" place to hold task_work. We will see.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 1/3] task_work_add: generic process-context callbacks
  2012-04-14  8:00       ` Linus Torvalds
@ 2012-04-14 20:27         ` Oleg Nesterov
  0 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-14 20:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, David Howells, Thomas Gleixner, Alexander Gordeev,
	Chris Zankel, David Smith, Frank Ch. Eigler, Geert Uytterhoeven,
	Larry Woodman, Peter Zijlstra, Tejun Heo, linux-kernel

On 04/14, Linus Torvalds wrote:
>
> On Fri, Apr 13, 2012 at 8:39 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> >>
> > Once the caller does task_work_add(twork), it no longer "owns" this
> > twork.
> >
> > But, if task_work_cancel() succeeds - you own it again.
>
> *IF* it succeeds.

Sure.

> >> But then you can't allocate it on the stack any more, and have to
> >> allocate it separately.
> >
> > Yes, unless you do task_work_add/cancel(current).
>
> Ok, your argument seems to be that "current" is special, and can not
> race, because the work execution is always synchronous with the task
> it got scheduled on.

Yes, exactly.

> And that whole "run_task_work()" function should *not* take a "task"
> pointer, because it would be horribly horribly wrong to ever run it in
> any context than "current".

And it was task_work_queue(void) initially. But then I decided to
micro-optimize this, the callers already have this task_struct in
the register. And we have other examples like this, say, exit_mm().

However. I agree that it would be more understandable and clean
to use current in task_work_run(void), and percpu_read is cheap.

So I'll remove this argument and send v5 after David reviews 3/3.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 1/3] task_work_add: generic process-context callbacks
  2012-04-14  2:40   ` Linus Torvalds
  2012-04-14  3:05     ` Oleg Nesterov
  2012-04-14  3:39     ` Oleg Nesterov
@ 2012-04-17 13:53     ` David Howells
  2012-04-17 16:16       ` Oleg Nesterov
  2 siblings, 1 reply; 29+ messages in thread
From: David Howells @ 2012-04-17 13:53 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Linus Torvalds, Andrew Morton, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> tsk is always "current", probably this should be documented, I'll add
> the comment.

If tsk is always current, then don't pass it as a parameter.  Some arches keep
current in a register and it's much more efficient to use current directly in
those cases - and those that don't have to be able to obtain it reasonably
quickly.

David

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 2/3] genirq: reimplement exit_irq_thread() hook via task_work_add()
  2012-04-14  2:12 [PATCH v4 0/3] task_work_add (was: task_work_queue) Oleg Nesterov
                   ` (4 preceding siblings ...)
  2012-04-14 18:08 ` Peter Zijlstra
@ 2012-04-17 14:11 ` David Howells
  2012-04-17 16:29   ` Oleg Nesterov
  2012-04-17 14:23 ` [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add() David Howells
  6 siblings, 1 reply; 29+ messages in thread
From: David Howells @ 2012-04-17 14:11 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> 2. change irq_thread() to do task_queue_work(irq_thread_dtor)
>    at the start and task_work_cancel() before return.
> 
>    tracehook_notify_resume() can never play with kthreads,
>    only do_exit()->exit_task_work() can call the callback
>    and this is what we want.

Hmmm...  This seems wrong.  You're now using the hook in two distinct ways:
the primary use of that the hook is to detect that userspace is about to
resume processing (via TIF_NOTIFY_RESUME) and then you're abusing the fact
that the hook is also invoked via do_exit() to perform a clean up because
we've got to get rid of it somehow under that circumstance.

This only works for you because you're operating in a kernel thread which
doesn't have a userspace (and so will never see TIF_NOTIFY_RESUME).  However,
if someone tries that in an ordinary thread, it is liable to malfunction as
the record could be executed and deleted at some unpredictable point in the
future.

David

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-14  2:12 [PATCH v4 0/3] task_work_add (was: task_work_queue) Oleg Nesterov
                   ` (5 preceding siblings ...)
  2012-04-17 14:11 ` [PATCH v4 2/3] genirq: reimplement exit_irq_thread() hook via task_work_add() David Howells
@ 2012-04-17 14:23 ` David Howells
  2012-04-17 16:36   ` Oleg Nesterov
  6 siblings, 1 reply; 29+ messages in thread
From: David Howells @ 2012-04-17 14:23 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> -	int ret;
> +	struct cred *cred;
> +	int err;

There's no need to change ret to be err.  It produces slightly less churn if
you don't.

> +	err = -ENOMEM;
> +	newwork = kmalloc(sizeof(struct task_work), GFP_KERNEL);
> +	if (!newwork)
> +		goto error_keyring;

This passes unnecessarily through kfree(newwork).  Please add a new goto label
and go to that.

> -/*
> - * Replace a process's session keyring on behalf of one of its children when
> - * the target  process is about to resume userspace execution.
> - */

Please don't delete the banner comment saying what the function does.

> -void key_replace_session_keyring(void)
> +void key_change_session_keyring(struct task_work *twork)

There's no particular need to change the name of the function.

David

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 1/3] task_work_add: generic process-context callbacks
  2012-04-17 13:53     ` David Howells
@ 2012-04-17 16:16       ` Oleg Nesterov
  0 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-17 16:16 UTC (permalink / raw)
  To: David Howells
  Cc: Linus Torvalds, Andrew Morton, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

On 04/17, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > tsk is always "current", probably this should be documented, I'll add
> > the comment.
>
> If tsk is always current, then don't pass it as a parameter.

This is already changed, I removed this arg.

> Some arches keep
> current in a register and it's much more efficient to use current directly in
> those cases

OK, thanks.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 2/3] genirq: reimplement exit_irq_thread() hook via task_work_add()
  2012-04-17 14:11 ` [PATCH v4 2/3] genirq: reimplement exit_irq_thread() hook via task_work_add() David Howells
@ 2012-04-17 16:29   ` Oleg Nesterov
  0 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-17 16:29 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

On 04/17, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > 2. change irq_thread() to do task_queue_work(irq_thread_dtor)
> >    at the start and task_work_cancel() before return.
> >
> >    tracehook_notify_resume() can never play with kthreads,
> >    only do_exit()->exit_task_work() can call the callback
> >    and this is what we want.
>
> Hmmm...  This seems wrong.

I don't agree...

> You're now using the hook in two distinct ways:
> the primary use of that the hook is to detect that userspace is about to
> resume processing (via TIF_NOTIFY_RESUME)

Yes,

> and then you're abusing the fact
> that the hook is also invoked via do_exit() to perform a clean up because
> we've got to get rid of it somehow under that circumstance.

Yes, and please note that this cleanup is only needed if irq thread
crashes.

> This only works for you because you're operating in a kernel thread which
> doesn't have a userspace (and so will never see TIF_NOTIFY_RESUME).

Yes,

> However,
> if someone tries that in an ordinary thread,

Sure, nobody should do this with the ordinary thread. At least
exactly this.

> it is liable to malfunction as
> the record could be executed and deleted at some unpredictable point in the
> future.

"In the future" is not possible (and this doesn't depend on
PF_KTHREAD). irq_thread() does task_work_cancel() before return.
And until it returns it returns the work can't be executed
unless this task exits in between.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-17 14:23 ` [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add() David Howells
@ 2012-04-17 16:36   ` Oleg Nesterov
  2012-04-17 19:34     ` Oleg Nesterov
  2012-04-19 16:52     ` David Howells
  0 siblings, 2 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-17 16:36 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

On 04/17, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > -	int ret;
> > +	struct cred *cred;
> > +	int err;
>
> There's no need to change ret to be err.  It produces slightly less churn if
> you don't.

OK, I'll change it back.

>
> > +	err = -ENOMEM;
> > +	newwork = kmalloc(sizeof(struct task_work), GFP_KERNEL);
> > +	if (!newwork)
> > +		goto error_keyring;
>
> This passes unnecessarily through kfree(newwork).

but harmless, newwork == NULL in this case,

> Please add a new goto label
> and go to that.

OK, probably this is more clear, will do.

> > -/*
> > - * Replace a process's session keyring on behalf of one of its children when
> > - * the target  process is about to resume userspace execution.
> > - */
>
> Please don't delete the banner comment saying what the function does.

OOPS, sorry, this was unintentional.

> > -void key_replace_session_keyring(void)
> > +void key_change_session_keyring(struct task_work *twork)
>
> There's no particular need to change the name of the function.

There is.

Until we cleanup arch/* do_notify_resume() still does

	if (current->replacement_session_keyring)
		key_replace_session_keyring();

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-17 16:36   ` Oleg Nesterov
@ 2012-04-17 19:34     ` Oleg Nesterov
  2012-04-19 16:52     ` David Howells
  1 sibling, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-17 19:34 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

On 04/17, Oleg Nesterov wrote:
>
> On 04/17, David Howells wrote:
> >
> > > -void key_replace_session_keyring(void)
> > > +void key_change_session_keyring(struct task_work *twork)
> >
> > There's no particular need to change the name of the function.
>
> There is.
>
> Until we cleanup arch/* do_notify_resume() still does
>
> 	if (current->replacement_session_keyring)
> 		key_replace_session_keyring();

So the rename is still here.

Anything else? Please see the updated patch below.

Oleg.

>From 2ce2ff8e8715d0d25cc0228fa58f5721caebb780 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg@redhat.com>
Date: Thu, 12 Apr 2012 02:30:44 +0200
Subject: [PATCH] cred: change keyctl_session_to_parent() to use task_work_add()

Change keyctl_session_to_parent() to use task_work_add() and
move key_replace_session_keyring() logic into task_work->func().

Note that we do task_work_cancel() before task_work_add() to
ensure that only one work can be pending at any time. This is
important, we must not allow user-space to abuse the parent's
->task_works list.

The callback, replace_session_keyring(), checks PF_EXITING.
I guess this is not really needed but looks better.

As a side effect, this fixes the (unlikely) race. The callers
of key_replace_session_keyring() and keyctl_session_to_parent()
lack the necessary barriers, the parent can miss the request.

Now we can remove task_struct->replacement_session_keyring and
related code.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/key.h          |    6 +--
 security/keys/internal.h     |    2 +
 security/keys/keyctl.c       |   73 ++++++++++++++++++++----------------------
 security/keys/process_keys.c |   21 ++++--------
 4 files changed, 46 insertions(+), 56 deletions(-)

diff --git a/include/linux/key.h b/include/linux/key.h
index 96933b1..0c263d6 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -33,6 +33,8 @@ typedef uint32_t key_perm_t;
 
 struct key;
 
+#define key_replace_session_keyring()	do { } while (0)
+
 #ifdef CONFIG_KEYS
 
 #undef KEY_DEBUGGING
@@ -302,9 +304,6 @@ static inline bool key_is_instantiated(const struct key *key)
 #ifdef CONFIG_SYSCTL
 extern ctl_table key_sysctls[];
 #endif
-
-extern void key_replace_session_keyring(void);
-
 /*
  * the userspace interface
  */
@@ -327,7 +326,6 @@ extern void key_init(void);
 #define key_fsuid_changed(t)		do { } while(0)
 #define key_fsgid_changed(t)		do { } while(0)
 #define key_init()			do { } while(0)
-#define key_replace_session_keyring()	do { } while(0)
 
 #endif /* CONFIG_KEYS */
 #endif /* __KERNEL__ */
diff --git a/security/keys/internal.h b/security/keys/internal.h
index 65647f8..c2986da 100644
--- a/security/keys/internal.h
+++ b/security/keys/internal.h
@@ -14,6 +14,7 @@
 
 #include <linux/sched.h>
 #include <linux/key-type.h>
+#include <linux/task_work.h>
 
 #ifdef __KDEBUG
 #define kenter(FMT, ...) \
@@ -148,6 +149,7 @@ extern key_ref_t lookup_user_key(key_serial_t id, unsigned long flags,
 #define KEY_LOOKUP_FOR_UNLINK	0x04
 
 extern long join_session_keyring(const char *name);
+extern void key_change_session_keyring(struct task_work *twork);
 
 extern struct work_struct key_gc_work;
 extern unsigned key_gc_delay;
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index fb767c6..55a451b 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -1423,50 +1423,57 @@ long keyctl_get_security(key_serial_t keyid,
  */
 long keyctl_session_to_parent(void)
 {
-#ifdef TIF_NOTIFY_RESUME
 	struct task_struct *me, *parent;
 	const struct cred *mycred, *pcred;
-	struct cred *cred, *oldcred;
+	struct task_work *newwork, *oldwork;
 	key_ref_t keyring_r;
+	struct cred *cred;
 	int ret;
 
 	keyring_r = lookup_user_key(KEY_SPEC_SESSION_KEYRING, 0, KEY_LINK);
 	if (IS_ERR(keyring_r))
 		return PTR_ERR(keyring_r);
 
+	ret = -ENOMEM;
+	newwork = kmalloc(sizeof(struct task_work), GFP_KERNEL);
+	if (!newwork)
+		goto error_keyring;
+
 	/* our parent is going to need a new cred struct, a new tgcred struct
 	 * and new security data, so we allocate them here to prevent ENOMEM in
 	 * our parent */
-	ret = -ENOMEM;
 	cred = cred_alloc_blank();
 	if (!cred)
-		goto error_keyring;
+		goto error_newwork;
 
 	cred->tgcred->session_keyring = key_ref_to_ptr(keyring_r);
-	keyring_r = NULL;
+	init_task_work(newwork, key_change_session_keyring, cred);
 
 	me = current;
 	rcu_read_lock();
 	write_lock_irq(&tasklist_lock);
 
-	parent = me->real_parent;
 	ret = -EPERM;
+	oldwork = NULL;
+	parent = me->real_parent;
 
 	/* the parent mustn't be init and mustn't be a kernel thread */
 	if (parent->pid <= 1 || !parent->mm)
-		goto not_permitted;
+		goto unlock;
 
 	/* the parent must be single threaded */
 	if (!thread_group_empty(parent))
-		goto not_permitted;
+		goto unlock;
 
 	/* the parent and the child must have different session keyrings or
 	 * there's no point */
 	mycred = current_cred();
 	pcred = __task_cred(parent);
 	if (mycred == pcred ||
-	    mycred->tgcred->session_keyring == pcred->tgcred->session_keyring)
-		goto already_same;
+	    mycred->tgcred->session_keyring == pcred->tgcred->session_keyring) {
+		ret = 0;
+		goto unlock;
+	}
 
 	/* the parent must have the same effective ownership and mustn't be
 	 * SUID/SGID */
@@ -1476,50 +1483,40 @@ long keyctl_session_to_parent(void)
 	    pcred->gid	!= mycred->egid	||
 	    pcred->egid	!= mycred->egid	||
 	    pcred->sgid	!= mycred->egid)
-		goto not_permitted;
+		goto unlock;
 
 	/* the keyrings must have the same UID */
 	if ((pcred->tgcred->session_keyring &&
 	     pcred->tgcred->session_keyring->uid != mycred->euid) ||
 	    mycred->tgcred->session_keyring->uid != mycred->euid)
-		goto not_permitted;
+		goto unlock;
 
-	/* if there's an already pending keyring replacement, then we replace
-	 * that */
-	oldcred = parent->replacement_session_keyring;
+	/* cancel an already pending keyring replacement */
+	oldwork = task_work_cancel(parent, key_change_session_keyring);
 
 	/* the replacement session keyring is applied just prior to userspace
 	 * restarting */
-	parent->replacement_session_keyring = cred;
-	cred = NULL;
-	set_ti_thread_flag(task_thread_info(parent), TIF_NOTIFY_RESUME);
-
+	ret = task_work_add(parent, newwork, true);
+	if (!ret)
+		newwork = NULL;
+unlock:
 	write_unlock_irq(&tasklist_lock);
 	rcu_read_unlock();
-	if (oldcred)
-		put_cred(oldcred);
-	return 0;
-
-already_same:
-	ret = 0;
-not_permitted:
-	write_unlock_irq(&tasklist_lock);
-	rcu_read_unlock();
-	put_cred(cred);
+	if (oldwork) {
+		put_cred(oldwork->data);
+		kfree(oldwork);
+	}
+	if (newwork) {
+		put_cred(newwork->data);
+		kfree(newwork);
+	}
 	return ret;
 
+error_newwork:
+	kfree(newwork);
 error_keyring:
 	key_ref_put(keyring_r);
 	return ret;
-
-#else /* !TIF_NOTIFY_RESUME */
-	/*
-	 * To be removed when TIF_NOTIFY_RESUME has been implemented on
-	 * m68k/xtensa
-	 */
-#warning TIF_NOTIFY_RESUME not implemented
-	return -EOPNOTSUPP;
-#endif /* !TIF_NOTIFY_RESUME */
 }
 
 /*
diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c
index be7ecb2..556dea2 100644
--- a/security/keys/process_keys.c
+++ b/security/keys/process_keys.c
@@ -832,23 +832,17 @@ error:
  * Replace a process's session keyring on behalf of one of its children when
  * the target  process is about to resume userspace execution.
  */
-void key_replace_session_keyring(void)
+void key_change_session_keyring(struct task_work *twork)
 {
-	const struct cred *old;
-	struct cred *new;
+	const struct cred *old = current_cred();
+	struct cred *new = twork->data;
 
-	if (!current->replacement_session_keyring)
-		return;
-
-	write_lock_irq(&tasklist_lock);
-	new = current->replacement_session_keyring;
-	current->replacement_session_keyring = NULL;
-	write_unlock_irq(&tasklist_lock);
-
-	if (!new)
+	kfree(twork);
+	if (unlikely(current->flags & PF_EXITING)) {
+		put_cred(new);
 		return;
+	}
 
-	old = current_cred();
 	new->  uid	= old->  uid;
 	new-> euid	= old-> euid;
 	new-> suid	= old-> suid;
@@ -873,6 +867,5 @@ void key_replace_session_keyring(void)
 	new->tgcred->process_keyring = key_get(old->tgcred->process_keyring);
 
 	security_transfer_creds(new, old);
-
 	commit_creds(new);
 }
-- 
1.5.5.1




^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-17 16:36   ` Oleg Nesterov
  2012-04-17 19:34     ` Oleg Nesterov
@ 2012-04-19 16:52     ` David Howells
  2012-04-19 17:34       ` Oleg Nesterov
  2012-04-19 17:55       ` David Howells
  1 sibling, 2 replies; 29+ messages in thread
From: David Howells @ 2012-04-19 16:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> +#define key_replace_session_keyring()	do { } while (0)
> +

Ummm... Why?  You've made the function no longer exist and so this shouldn't be
there anymore.  You shouldn't just leave the call sites in place.

David

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-19 16:52     ` David Howells
@ 2012-04-19 17:34       ` Oleg Nesterov
  2012-04-19 17:36         ` Oleg Nesterov
  2012-04-19 17:55       ` David Howells
  1 sibling, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-19 17:34 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

On 04/19, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > +#define key_replace_session_keyring()	do { } while (0)
> > +
>
> Ummm... Why?  You've made the function no longer exist and so this shouldn't be
> there anymore.  You shouldn't just leave the call sites in place.

Sure!

I'll send the (trivial) patches to cleanup the code in arch/ (24 callers),
plus we should kill the no longer used task ->replacement_session_keyring
and update copy_creds/exit_creds.

But I certainly do not want to put these changes in 3/3 or in this series.
I'll send the patches to Andrew once (I hope ;) he takes 1-3.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-19 17:34       ` Oleg Nesterov
@ 2012-04-19 17:36         ` Oleg Nesterov
  0 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-19 17:36 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

forgot to mention...

On 04/19, Oleg Nesterov wrote:
> On 04/19, David Howells wrote:
> >
> > Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > > +#define key_replace_session_keyring()	do { } while (0)
> > > +
> >
> > Ummm... Why?  You've made the function no longer exist and so this shouldn't be
> > there anymore.  You shouldn't just leave the call sites in place.
>
> Sure!
>
> I'll send the (trivial) patches to cleanup the code in arch/ (24 callers),
> plus we should kill the no longer used task ->replacement_session_keyring
> and update copy_creds/exit_creds.

... and we should kill tracehook_notify_resume()

> But I certainly do not want to put these changes in 3/3 or in this series.
> I'll send the patches to Andrew once (I hope ;) he takes 1-3.

Yes.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-19 16:52     ` David Howells
  2012-04-19 17:34       ` Oleg Nesterov
@ 2012-04-19 17:55       ` David Howells
  2012-04-19 18:10         ` Oleg Nesterov
  1 sibling, 1 reply; 29+ messages in thread
From: David Howells @ 2012-04-19 17:55 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> But I certainly do not want to put these changes in 3/3 or in this series.

Why not?  It seems the correct place for them.

David

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-19 17:55       ` David Howells
@ 2012-04-19 18:10         ` Oleg Nesterov
  2012-04-19 18:40           ` Oleg Nesterov
  2012-04-19 19:34           ` David Howells
  0 siblings, 2 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-19 18:10 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

On 04/19, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > But I certainly do not want to put these changes in 3/3 or in this series.
>
> Why not?  It seems the correct place for them.

I wanted to discuss the functional changes first, then send the trivial
cleanups. And the changes in arch/ are always painful. And, "remove
tracehook_notify_resume" is a bit off-topic, but I do not want to change
the same code in arch/ twice.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-19 18:10         ` Oleg Nesterov
@ 2012-04-19 18:40           ` Oleg Nesterov
  2012-04-19 19:34           ` David Howells
  1 sibling, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-19 18:40 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

On 04/19, Oleg Nesterov wrote:
>
> On 04/19, David Howells wrote:
> >
> > Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > > But I certainly do not want to put these changes in 3/3 or in this series.
> >
> > Why not?  It seems the correct place for them.
>
> I wanted to discuss the functional changes first, then send the trivial
> cleanups. And the changes in arch/ are always painful. And, "remove
> tracehook_notify_resume" is a bit off-topic, but I do not want to change
> the same code in arch/ twice.

That said, if you do not agree with 3/3 without "remove the call sites",
I'll send v6 which doesn't rename key_replace_session_keyring() but
updates the old callers:

	 arch/alpha/kernel/signal.c     |    2 -
	 arch/arm/kernel/signal.c       |    2 -
	 arch/avr32/kernel/signal.c     |    2 -
	 arch/blackfin/kernel/signal.c  |    2 -
	 arch/c6x/kernel/signal.c       |    2 -
	 arch/cris/kernel/ptrace.c      |    2 -
	 arch/frv/kernel/signal.c       |    2 -
	 arch/h8300/kernel/signal.c     |    2 -
	 arch/hexagon/kernel/signal.c   |    3 +-
	 arch/ia64/kernel/process.c     |    2 -
	 arch/m32r/kernel/signal.c      |    2 -
	 arch/mips/kernel/signal.c      |    2 -
	 arch/mn10300/kernel/signal.c   |    2 -
	 arch/openrisc/kernel/signal.c  |    2 -
	 arch/parisc/kernel/signal.c    |    2 -
	 arch/powerpc/kernel/signal.c   |    2 -
	 arch/s390/kernel/signal.c      |    2 -
	 arch/sh/kernel/signal_32.c     |    2 -
	 arch/sh/kernel/signal_64.c     |    2 -
	 arch/sparc/kernel/signal_32.c  |    2 -
	 arch/sparc/kernel/signal_64.c  |    2 -
	 arch/tile/kernel/process.c     |    2 -
	 arch/unicore32/kernel/signal.c |    2 -
	 arch/x86/kernel/signal.c       |    2 -
	 include/linux/key.h            |    4 --
	 include/linux/sched.h          |    2 -
	 kernel/cred.c                  |    9 -----
	 security/keys/internal.h       |    2 +
	 security/keys/keyctl.c         |   73 +++++++++++++++++++---------------------
	 security/keys/process_keys.c   |   20 ++++-------
	 30 files changed, 45 insertions(+), 114 deletions(-)

I am not going to argue, but honestly I don't think it is a good
idea to mix the functional changes in keyctl.c and the simple
(but obviously untested) cleanups.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-19 18:10         ` Oleg Nesterov
  2012-04-19 18:40           ` Oleg Nesterov
@ 2012-04-19 19:34           ` David Howells
  2012-04-19 19:47             ` Oleg Nesterov
  2012-04-19 22:26             ` David Howells
  1 sibling, 2 replies; 29+ messages in thread
From: David Howells @ 2012-04-19 19:34 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> I am not going to argue, but honestly I don't think it is a good
> idea to mix the functional changes in keyctl.c and the simple
> (but obviously untested) cleanups.

I meant add the cleanup as patch 4.  That way they are still separate.  I'll
let you change the name of the function to make it easier;-)

David

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-19 19:34           ` David Howells
@ 2012-04-19 19:47             ` Oleg Nesterov
  2012-04-19 22:26             ` David Howells
  1 sibling, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2012-04-19 19:47 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

On 04/19, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > I am not going to argue, but honestly I don't think it is a good
> > idea to mix the functional changes in keyctl.c and the simple
> > (but obviously untested) cleanups.
>
> I meant add the cleanup as patch 4.  That way they are still separate.  I'll
> let you change the name of the function to make it easier;-)

OK.

I'd like to avoid the unnecessary resends, lets discuss this before
I send the patch(es).

At least I'd like to split this into 2 patches:

	4/3: only removes the dead code from arch/*

	5/3: updates copy_creds/exit_creds and kills
	     ->replacement_session_keyring, plus renames
	     key_change_ back to key_replace_

Do you agree?

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add()
  2012-04-19 19:34           ` David Howells
  2012-04-19 19:47             ` Oleg Nesterov
@ 2012-04-19 22:26             ` David Howells
  1 sibling, 0 replies; 29+ messages in thread
From: David Howells @ 2012-04-19 22:26 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Alexander Gordeev, Chris Zankel, David Smith, Frank Ch. Eigler,
	Geert Uytterhoeven, Larry Woodman, Peter Zijlstra, Tejun Heo,
	linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> On 04/19, David Howells wrote:
> >
> > Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > > I am not going to argue, but honestly I don't think it is a good
> > > idea to mix the functional changes in keyctl.c and the simple
> > > (but obviously untested) cleanups.
> >
> > I meant add the cleanup as patch 4.  That way they are still separate.  I'll
> > let you change the name of the function to make it easier;-)
> 
> OK.
> 
> I'd like to avoid the unnecessary resends, lets discuss this before
> I send the patch(es).
> 
> At least I'd like to split this into 2 patches:
> 
> 	4/3: only removes the dead code from arch/*
> 
> 	5/3: updates copy_creds/exit_creds and kills
> 	     ->replacement_session_keyring, plus renames
> 	     key_change_ back to key_replace_
> 
> Do you agree?

Well, don't bother renaming the function back, but apart from that, okay.

David

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2012-04-19 22:51 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-14  2:12 [PATCH v4 0/3] task_work_add (was: task_work_queue) Oleg Nesterov
2012-04-14  2:12 ` [PATCH v4 1/3] task_work_add: generic process-context callbacks Oleg Nesterov
2012-04-14  2:40   ` Linus Torvalds
2012-04-14  3:05     ` Oleg Nesterov
2012-04-14  3:39     ` Oleg Nesterov
2012-04-14  8:00       ` Linus Torvalds
2012-04-14 20:27         ` Oleg Nesterov
2012-04-17 13:53     ` David Howells
2012-04-17 16:16       ` Oleg Nesterov
2012-04-14  2:12 ` [PATCH v4 2/3] genirq: reimplement exit_irq_thread() hook via task_work_add() Oleg Nesterov
2012-04-14  2:12 ` [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add() Oleg Nesterov
2012-04-14  2:32 ` [PATCH v4 0/3] task_work_add (was: task_work_queue) Linus Torvalds
2012-04-14  3:28   ` Oleg Nesterov
2012-04-14 18:08 ` Peter Zijlstra
2012-04-14 20:17   ` Oleg Nesterov
2012-04-17 14:11 ` [PATCH v4 2/3] genirq: reimplement exit_irq_thread() hook via task_work_add() David Howells
2012-04-17 16:29   ` Oleg Nesterov
2012-04-17 14:23 ` [PATCH v4 3/3] cred: change keyctl_session_to_parent() to use task_work_add() David Howells
2012-04-17 16:36   ` Oleg Nesterov
2012-04-17 19:34     ` Oleg Nesterov
2012-04-19 16:52     ` David Howells
2012-04-19 17:34       ` Oleg Nesterov
2012-04-19 17:36         ` Oleg Nesterov
2012-04-19 17:55       ` David Howells
2012-04-19 18:10         ` Oleg Nesterov
2012-04-19 18:40           ` Oleg Nesterov
2012-04-19 19:34           ` David Howells
2012-04-19 19:47             ` Oleg Nesterov
2012-04-19 22:26             ` David Howells

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.