All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] allow multiple kthreadd's
@ 2020-05-01 16:01 J. Bruce Fields
  2020-05-01 16:01 ` [PATCH 1/4] kthreads: minor kthreadd refactoring J. Bruce Fields
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-01 16:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-nfs, Jeff Layton, David Howells, Tejun Heo, Shaohua Li,
	Oleg Nesterov, linux-kernel, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

These patches allow a caller to create its own kthreadd.

The motivation is file delegations: currently any write operation from a
client breaks all delegations, even delegations held by the same client.

To fix that, we need to know which client is performing a given
operation.

So, we let nfsd put all the nfsd threads into the same thread group (by
spawning them from its own private kthreadd), then patch the delegation
code to treat delegation breaks from the same thread group as not
conflicting, and then leave it to nfsd to sort out conflicts among its
own clients.  Those patches are in:

	git://linux-nfs.org/~bfields/linux.git deleg-fix-self-conflicts

This was an idea from Trond.  Part of his motivation was that it could
work for userspace servers (like Ganesha and Samba) as well.  (We don't
currently let them request delegations, but probably will some day--it
shouldn't be difficult.)

Previously I considered instead adding a new field somewhere in the
struct task.  That might require a new system call to expose to user
space.  Or we might be able to put this in a keyring, if David Howells
thought that would work.

Before that I tried passing the identity of the breaker explicitly, but
that looks like it would require passing the new argument around to huge
swaths of the VFS.

Anyway, does this multiple kthreadd approach look reasonable?

(If so, who should handle the patches?)

--b.

J. Bruce Fields (4):
  kthreads: minor kthreadd refactoring
  kthreads: Simplify tsk_fork_get_node
  kthreads: allow multiple kthreadd's
  kthreads: allow cloning threads with different flags

 include/linux/kthread.h |  21 +++++-
 init/init_task.c        |   3 +
 init/main.c             |   4 +-
 kernel/fork.c           |   4 ++
 kernel/kthread.c        | 140 +++++++++++++++++++++++++++++-----------
 5 files changed, 132 insertions(+), 40 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/4] kthreads: minor kthreadd refactoring
  2020-05-01 16:01 [PATCH 0/4] allow multiple kthreadd's J. Bruce Fields
@ 2020-05-01 16:01 ` J. Bruce Fields
  2020-05-01 16:01 ` [PATCH 2/4] kthreads: Simplify tsk_fork_get_node J. Bruce Fields
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-01 16:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-nfs, Jeff Layton, David Howells, Tejun Heo, Shaohua Li,
	Oleg Nesterov, linux-kernel, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Trivial refactoring, no change in behavior.

Not really necessary, a separate function for the inner loop just seems
a little nicer to me.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 kernel/kthread.c | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index bfbfa481be3a..4217fded891a 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -578,6 +578,24 @@ int kthread_stop(struct task_struct *k)
 }
 EXPORT_SYMBOL(kthread_stop);
 
+void kthread_do_work(void)
+{
+	spin_lock(&kthread_create_lock);
+	while (!list_empty(&kthread_create_list)) {
+		struct kthread_create_info *create;
+
+		create = list_entry(kthread_create_list.next,
+				    struct kthread_create_info, list);
+		list_del_init(&create->list);
+		spin_unlock(&kthread_create_lock);
+
+		create_kthread(create);
+
+		spin_lock(&kthread_create_lock);
+	}
+	spin_unlock(&kthread_create_lock);
+}
+
 int kthreadd(void *unused)
 {
 	struct task_struct *tsk = current;
@@ -597,20 +615,7 @@ int kthreadd(void *unused)
 			schedule();
 		__set_current_state(TASK_RUNNING);
 
-		spin_lock(&kthread_create_lock);
-		while (!list_empty(&kthread_create_list)) {
-			struct kthread_create_info *create;
-
-			create = list_entry(kthread_create_list.next,
-					    struct kthread_create_info, list);
-			list_del_init(&create->list);
-			spin_unlock(&kthread_create_lock);
-
-			create_kthread(create);
-
-			spin_lock(&kthread_create_lock);
-		}
-		spin_unlock(&kthread_create_lock);
+		kthread_do_work();
 	}
 
 	return 0;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/4] kthreads: Simplify tsk_fork_get_node
  2020-05-01 16:01 [PATCH 0/4] allow multiple kthreadd's J. Bruce Fields
  2020-05-01 16:01 ` [PATCH 1/4] kthreads: minor kthreadd refactoring J. Bruce Fields
@ 2020-05-01 16:01 ` J. Bruce Fields
  2020-05-01 16:01 ` [PATCH 3/4] kthreads: allow multiple kthreadd's J. Bruce Fields
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-01 16:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-nfs, Jeff Layton, David Howells, Tejun Heo, Shaohua Li,
	Oleg Nesterov, linux-kernel, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

This will also simplify a following patch that allows multiple
kthreadd's.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 init/init_task.c | 3 +++
 kernel/fork.c    | 4 ++++
 kernel/kthread.c | 3 +--
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/init/init_task.c b/init/init_task.c
index bd403ed3e418..fdd760393760 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -154,6 +154,9 @@ struct task_struct init_task
 	.vtime.starttime = 0,
 	.vtime.state	= VTIME_SYS,
 #endif
+#ifdef CONFIG_NUMA
+	.pref_node_fork = NUMA_NO_NODE,
+#endif
 #ifdef CONFIG_NUMA_BALANCING
 	.numa_preferred_nid = NUMA_NO_NODE,
 	.numa_group	= NULL,
diff --git a/kernel/fork.c b/kernel/fork.c
index 8c700f881d92..fa35890534d5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -942,6 +942,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	tsk->fail_nth = 0;
 #endif
 
+#ifdef CONFIG_NUMA
+	tsk->pref_node_fork = NUMA_NO_NODE;
+#endif
+
 #ifdef CONFIG_BLK_CGROUP
 	tsk->throttle_queue = NULL;
 	tsk->use_memdelay = 0;
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 4217fded891a..483bee57a9c8 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -274,8 +274,7 @@ static int kthread(void *_create)
 int tsk_fork_get_node(struct task_struct *tsk)
 {
 #ifdef CONFIG_NUMA
-	if (tsk == kthreadd_task)
-		return tsk->pref_node_fork;
+	return tsk->pref_node_fork;
 #endif
 	return NUMA_NO_NODE;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/4] kthreads: allow multiple kthreadd's
  2020-05-01 16:01 [PATCH 0/4] allow multiple kthreadd's J. Bruce Fields
  2020-05-01 16:01 ` [PATCH 1/4] kthreads: minor kthreadd refactoring J. Bruce Fields
  2020-05-01 16:01 ` [PATCH 2/4] kthreads: Simplify tsk_fork_get_node J. Bruce Fields
@ 2020-05-01 16:01 ` J. Bruce Fields
  2020-05-01 16:01 ` [PATCH 4/4] kthreads: allow cloning threads with different flags J. Bruce Fields
  2020-05-01 17:59 ` [PATCH 0/4] allow multiple kthreadd's Linus Torvalds
  4 siblings, 0 replies; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-01 16:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-nfs, Jeff Layton, David Howells, Tejun Heo, Shaohua Li,
	Oleg Nesterov, linux-kernel, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

Allow subsystems to run their own kthreadd's.

I'm experimenting with this to allow nfsd to put its threads into its
own thread group to make it easy for the vfs to tell when nfsd is
breaking one of its own leases.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 include/linux/kthread.h |  20 ++++++-
 init/main.c             |   4 +-
 kernel/kthread.c        | 113 ++++++++++++++++++++++++++++++----------
 3 files changed, 107 insertions(+), 30 deletions(-)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 8bbcaad7ef0f..a7ffdf96a3b2 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -5,6 +5,24 @@
 #include <linux/err.h>
 #include <linux/sched.h>
 
+struct kthread_group {
+	char *name;
+	spinlock_t create_lock;
+	struct list_head create_list;
+	struct task_struct *task;
+};
+
+extern struct kthread_group kthreadd_default;
+
+struct kthread_group *kthread_start_group(char *);
+void kthread_stop_group(struct kthread_group *);
+
+struct task_struct *kthread_group_create_on_node(struct kthread_group *,
+					int (*threadfn)(void *data),
+					 void *data,
+					 int node,
+					 const char namefmt[], ...);
+
 __printf(4, 5)
 struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
 					   void *data,
@@ -63,7 +81,7 @@ int kthread_park(struct task_struct *k);
 void kthread_unpark(struct task_struct *k);
 void kthread_parkme(void);
 
-int kthreadd(void *unused);
+int kthreadd(void *);
 extern struct task_struct *kthreadd_task;
 extern int tsk_fork_get_node(struct task_struct *tsk);
 
diff --git a/init/main.c b/init/main.c
index a48617f2e5e5..5256071b0e05 100644
--- a/init/main.c
+++ b/init/main.c
@@ -641,9 +641,9 @@ noinline void __ref rest_init(void)
 	rcu_read_unlock();
 
 	numa_default_policy();
-	pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);
+	pid = kernel_thread(kthreadd, &kthreadd_default, CLONE_FS | CLONE_FILES);
 	rcu_read_lock();
-	kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
+	kthreadd_default.task = find_task_by_pid_ns(pid, &init_pid_ns);
 	rcu_read_unlock();
 
 	/*
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 483bee57a9c8..5232f6f597b5 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -25,9 +25,44 @@
 #include <linux/numa.h>
 #include <trace/events/sched.h>
 
-static DEFINE_SPINLOCK(kthread_create_lock);
-static LIST_HEAD(kthread_create_list);
-struct task_struct *kthreadd_task;
+struct kthread_group kthreadd_default = {
+	.name = "kthreadd",
+	.create_lock = __SPIN_LOCK_UNLOCKED(kthreadd_default.create_lock),
+	.create_list = LIST_HEAD_INIT(kthreadd_default.create_list),
+};
+
+void wake_kthreadd(struct kthread_group *kg)
+{
+	wake_up_process(kg->task);
+}
+
+struct kthread_group *kthread_start_group(char *name)
+{
+	struct kthread_group *new;
+	struct task_struct *task;
+
+	new = kmalloc(sizeof(struct kthread_group), GFP_KERNEL);
+	if (!new)
+		return ERR_PTR(-ENOMEM);
+	spin_lock_init(&new->create_lock);
+	INIT_LIST_HEAD(&new->create_list);
+	new->name = name;
+	task = kthread_run(kthreadd, new, name);
+	if (IS_ERR(task)) {
+		kfree(new);
+		return ERR_CAST(task);
+	}
+	new->task = task;
+	return new;
+}
+EXPORT_SYMBOL_GPL(kthread_start_group);
+
+void kthread_stop_group(struct kthread_group *kg)
+{
+	kthread_stop(kg->task);
+	kfree(kg);
+}
+EXPORT_SYMBOL_GPL(kthread_stop_group);
 
 struct kthread_create_info
 {
@@ -301,11 +336,13 @@ static void create_kthread(struct kthread_create_info *create)
 	}
 }
 
-static __printf(4, 0)
-struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
-						    void *data, int node,
-						    const char namefmt[],
-						    va_list args)
+
+static __printf(5, 0)
+struct task_struct *__kthread_group_create_on_node(struct kthread_group *kg,
+						int (*threadfn)(void *data),
+						void *data, int node,
+						const char namefmt[],
+						va_list args)
 {
 	DECLARE_COMPLETION_ONSTACK(done);
 	struct task_struct *task;
@@ -319,11 +356,11 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
 	create->node = node;
 	create->done = &done;
 
-	spin_lock(&kthread_create_lock);
-	list_add_tail(&create->list, &kthread_create_list);
-	spin_unlock(&kthread_create_lock);
+	spin_lock(&kg->create_lock);
+	list_add_tail(&create->list, &kg->create_list);
+	spin_unlock(&kg->create_lock);
 
-	wake_up_process(kthreadd_task);
+	wake_kthreadd(kg);
 	/*
 	 * Wait for completion in killable state, for I might be chosen by
 	 * the OOM killer while kthreadd is trying to allocate memory for
@@ -365,6 +402,25 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
 	return task;
 }
 
+__printf(5, 0)
+struct task_struct *kthread_group_create_on_node(struct kthread_group *kg,
+						int (*threadfn)(void *data),
+						void *data, int node,
+						const char namefmt[],
+						...)
+{
+	struct task_struct *task;
+	va_list args;
+
+	va_start(args, namefmt);
+	task = __kthread_group_create_on_node(kg, threadfn,
+						data, node, namefmt, args);
+	va_end(args);
+
+	return task;
+}
+EXPORT_SYMBOL_GPL(kthread_group_create_on_node);
+
 /**
  * kthread_create_on_node - create a kthread.
  * @threadfn: the function to run until signal_pending(current).
@@ -397,7 +453,8 @@ struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
 	va_list args;
 
 	va_start(args, namefmt);
-	task = __kthread_create_on_node(threadfn, data, node, namefmt, args);
+	task = __kthread_group_create_on_node(&kthreadd_default, threadfn,
+						data, node, namefmt, args);
 	va_end(args);
 
 	return task;
@@ -577,30 +634,31 @@ int kthread_stop(struct task_struct *k)
 }
 EXPORT_SYMBOL(kthread_stop);
 
-void kthread_do_work(void)
+void kthread_do_work(struct kthread_group *kg)
 {
-	spin_lock(&kthread_create_lock);
-	while (!list_empty(&kthread_create_list)) {
+	spin_lock(&kg->create_lock);
+	while (!list_empty(&kg->create_list)) {
 		struct kthread_create_info *create;
 
-		create = list_entry(kthread_create_list.next,
+		create = list_entry(kg->create_list.next,
 				    struct kthread_create_info, list);
 		list_del_init(&create->list);
-		spin_unlock(&kthread_create_lock);
+		spin_unlock(&kg->create_lock);
 
 		create_kthread(create);
 
-		spin_lock(&kthread_create_lock);
+		spin_lock(&kg->create_lock);
 	}
-	spin_unlock(&kthread_create_lock);
+	spin_unlock(&kg->create_lock);
 }
 
-int kthreadd(void *unused)
+int kthreadd(void *data)
 {
+	struct kthread_group *kg = data;
 	struct task_struct *tsk = current;
 
 	/* Setup a clean context for our children to inherit. */
-	set_task_comm(tsk, "kthreadd");
+	set_task_comm(tsk, kg->name);
 	ignore_signals(tsk);
 	set_cpus_allowed_ptr(tsk, cpu_all_mask);
 	set_mems_allowed(node_states[N_MEMORY]);
@@ -608,13 +666,13 @@ int kthreadd(void *unused)
 	current->flags |= PF_NOFREEZE;
 	cgroup_init_kthreadd();
 
-	for (;;) {
+	while (current == kthreadd_default.task || !kthread_should_stop()) {
 		set_current_state(TASK_INTERRUPTIBLE);
-		if (list_empty(&kthread_create_list))
+		if (list_empty(&kg->create_list))
 			schedule();
 		__set_current_state(TASK_RUNNING);
 
-		kthread_do_work();
+		kthread_do_work(kg);
 	}
 
 	return 0;
@@ -712,8 +770,9 @@ __kthread_create_worker(int cpu, unsigned int flags,
 	if (cpu >= 0)
 		node = cpu_to_node(cpu);
 
-	task = __kthread_create_on_node(kthread_worker_fn, worker,
-						node, namefmt, args);
+	task = __kthread_group_create_on_node(&kthreadd_default,
+					kthread_worker_fn,
+					worker, node, namefmt, args);
 	if (IS_ERR(task))
 		goto fail_task;
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 4/4] kthreads: allow cloning threads with different flags
  2020-05-01 16:01 [PATCH 0/4] allow multiple kthreadd's J. Bruce Fields
                   ` (2 preceding siblings ...)
  2020-05-01 16:01 ` [PATCH 3/4] kthreads: allow multiple kthreadd's J. Bruce Fields
@ 2020-05-01 16:01 ` J. Bruce Fields
  2020-05-01 17:59 ` [PATCH 0/4] allow multiple kthreadd's Linus Torvalds
  4 siblings, 0 replies; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-01 16:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-nfs, Jeff Layton, David Howells, Tejun Heo, Shaohua Li,
	Oleg Nesterov, linux-kernel, J. Bruce Fields

From: "J. Bruce Fields" <bfields@redhat.com>

This is so knfsd can add CLONE_THREAD.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 include/linux/kthread.h |  3 ++-
 kernel/kthread.c        | 11 +++++++----
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index a7ffdf96a3b2..7069feb6da65 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -10,11 +10,12 @@ struct kthread_group {
 	spinlock_t create_lock;
 	struct list_head create_list;
 	struct task_struct *task;
+	unsigned long flags;
 };
 
 extern struct kthread_group kthreadd_default;
 
-struct kthread_group *kthread_start_group(char *);
+struct kthread_group *kthread_start_group(unsigned long, char *);
 void kthread_stop_group(struct kthread_group *);
 
 struct task_struct *kthread_group_create_on_node(struct kthread_group *,
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 5232f6f597b5..57f6687ecec7 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -29,6 +29,7 @@ struct kthread_group kthreadd_default = {
 	.name = "kthreadd",
 	.create_lock = __SPIN_LOCK_UNLOCKED(kthreadd_default.create_lock),
 	.create_list = LIST_HEAD_INIT(kthreadd_default.create_list),
+	.flags = CLONE_FS | CLONE_FILES | SIGCHLD,
 };
 
 void wake_kthreadd(struct kthread_group *kg)
@@ -36,7 +37,7 @@ void wake_kthreadd(struct kthread_group *kg)
 	wake_up_process(kg->task);
 }
 
-struct kthread_group *kthread_start_group(char *name)
+struct kthread_group *kthread_start_group(unsigned long flags, char *name)
 {
 	struct kthread_group *new;
 	struct task_struct *task;
@@ -47,6 +48,7 @@ struct kthread_group *kthread_start_group(char *name)
 	spin_lock_init(&new->create_lock);
 	INIT_LIST_HEAD(&new->create_list);
 	new->name = name;
+	new->flags = flags;
 	task = kthread_run(kthreadd, new, name);
 	if (IS_ERR(task)) {
 		kfree(new);
@@ -314,7 +316,8 @@ int tsk_fork_get_node(struct task_struct *tsk)
 	return NUMA_NO_NODE;
 }
 
-static void create_kthread(struct kthread_create_info *create)
+static void create_kthread(struct kthread_create_info *create,
+			   unsigned long flags)
 {
 	int pid;
 
@@ -322,7 +325,7 @@ static void create_kthread(struct kthread_create_info *create)
 	current->pref_node_fork = create->node;
 #endif
 	/* We want our own signal handler (we take no signals by default). */
-	pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
+	pid = kernel_thread(kthread, create, flags);
 	if (pid < 0) {
 		/* If user was SIGKILLed, I release the structure. */
 		struct completion *done = xchg(&create->done, NULL);
@@ -645,7 +648,7 @@ void kthread_do_work(struct kthread_group *kg)
 		list_del_init(&create->list);
 		spin_unlock(&kg->create_lock);
 
-		create_kthread(create);
+		create_kthread(create, kg->flags);
 
 		spin_lock(&kg->create_lock);
 	}
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-01 16:01 [PATCH 0/4] allow multiple kthreadd's J. Bruce Fields
                   ` (3 preceding siblings ...)
  2020-05-01 16:01 ` [PATCH 4/4] kthreads: allow cloning threads with different flags J. Bruce Fields
@ 2020-05-01 17:59 ` Linus Torvalds
  2020-05-01 18:21   ` Tejun Heo
  4 siblings, 1 reply; 22+ messages in thread
From: Linus Torvalds @ 2020-05-01 17:59 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Tejun Heo, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

On Fri, May 1, 2020 at 9:02 AM J. Bruce Fields <bfields@redhat.com> wrote:
>
> Anyway, does this multiple kthreadd approach look reasonable?

I don't see anything that looks alarming.

My main reaction was that I don't like the "kthreadd" name, but that's
because for some reason I always read it as "kthre add". That may be
just me. It normally doesn't bother me (this code doesn't get all that
much work on it, it's been very stable), but it was very obvious when
reading your patches.

In fact, I liked _your_ naming better, to the point where I was going
"'kthread_group' is a much better name than 'kthreadd', and that
'kthreadd()' function would read better as 'kthread_group_run()' or
something".

But that may just be a personal quirk of mine, and isn't a big deal.
On the whole the patches looked all sane to me.

> (If so, who should handle the patches?)

We have had _very_ little work in this area, probably because most of
the kthread work has been subsumed by workqueues.

Which kind of makes me want to point a finger at Tejun. But it's been
mostly PeterZ touching this file lately..

                Linus

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-01 17:59 ` [PATCH 0/4] allow multiple kthreadd's Linus Torvalds
@ 2020-05-01 18:21   ` Tejun Heo
  2020-05-01 18:30     ` Linus Torvalds
                       ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Tejun Heo @ 2020-05-01 18:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: J. Bruce Fields, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

Hello,

On Fri, May 01, 2020 at 10:59:24AM -0700, Linus Torvalds wrote:
> Which kind of makes me want to point a finger at Tejun. But it's been
> mostly PeterZ touching this file lately..

Looks fine to me too. I don't quite understand the usecase tho. It looks
like all it's being used for is to tag some kthreads as belonging to the
same group. Can't that be done with kthread_data()?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-01 18:21   ` Tejun Heo
@ 2020-05-01 18:30     ` Linus Torvalds
  2020-05-01 19:02       ` J. Bruce Fields
  2020-05-01 18:49     ` J. Bruce Fields
  2020-05-05  2:15     ` J. Bruce Fields
  2 siblings, 1 reply; 22+ messages in thread
From: Linus Torvalds @ 2020-05-01 18:30 UTC (permalink / raw)
  To: Tejun Heo
  Cc: J. Bruce Fields, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

On Fri, May 1, 2020 at 11:22 AM Tejun Heo <tj@kernel.org> wrote:
>
> Looks fine to me too. I don't quite understand the usecase tho. It looks
> like all it's being used for is to tag some kthreads as belonging to the
> same group. Can't that be done with kthread_data()?

I _think_ Bruce wants the signal handling unification too, because
nfsd wants to react to being shut down with signals.

But you're right, more explanation of why nfsd wants/needs a separate
grouping is a good idea.

                    Linus

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-01 18:21   ` Tejun Heo
  2020-05-01 18:30     ` Linus Torvalds
@ 2020-05-01 18:49     ` J. Bruce Fields
  2020-05-01 19:05       ` Trond Myklebust
  2020-05-05  2:15     ` J. Bruce Fields
  2 siblings, 1 reply; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-01 18:49 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linus Torvalds, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

On Fri, May 01, 2020 at 02:21:54PM -0400, Tejun Heo wrote:
> Hello,
> 
> On Fri, May 01, 2020 at 10:59:24AM -0700, Linus Torvalds wrote:
> > Which kind of makes me want to point a finger at Tejun. But it's been
> > mostly PeterZ touching this file lately..
> 
> Looks fine to me too. I don't quite understand the usecase tho. It looks
> like all it's being used for is to tag some kthreads as belonging to the
> same group.

Pretty much.

> Can't that be done with kthread_data()?

Huh, maybe so, thanks.

I need to check this from generic file locking code that could be run by
any task--but I assume there's an easy way I can check if I'm a kthread
before calling  kthread_data(current).

I do expect to expose a delegation interface for userspace servers
eventually too.  But we could do the tgid check for them and still use
kthread_data() for nfsd.  That might work.

--b.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-01 18:30     ` Linus Torvalds
@ 2020-05-01 19:02       ` J. Bruce Fields
  0 siblings, 0 replies; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-01 19:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tejun Heo, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

On Fri, May 01, 2020 at 11:30:38AM -0700, Linus Torvalds wrote:
> On Fri, May 1, 2020 at 11:22 AM Tejun Heo <tj@kernel.org> wrote:
> >
> > Looks fine to me too. I don't quite understand the usecase tho. It looks
> > like all it's being used for is to tag some kthreads as belonging to the
> > same group. Can't that be done with kthread_data()?
> 
> I _think_ Bruce wants the signal handling unification too, because
> nfsd wants to react to being shut down with signals.

No, maybe kthread_data() might do the job.

(I don't see how this would help with signal handling.  But, I'm kind of
ignorant of how signalling works.)

--b.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-01 18:49     ` J. Bruce Fields
@ 2020-05-01 19:05       ` Trond Myklebust
  2020-05-01 19:20         ` tj
  2020-05-01 19:22         ` J. Bruce Fields
  0 siblings, 2 replies; 22+ messages in thread
From: Trond Myklebust @ 2020-05-01 19:05 UTC (permalink / raw)
  To: bfields, tj
  Cc: torvalds, jlayton, linux-nfs, shli, linux-kernel, dhowells, oleg

On Fri, 2020-05-01 at 14:49 -0400, J. Bruce Fields wrote:
> On Fri, May 01, 2020 at 02:21:54PM -0400, Tejun Heo wrote:
> > Hello,
> > 
> > On Fri, May 01, 2020 at 10:59:24AM -0700, Linus Torvalds wrote:
> > > Which kind of makes me want to point a finger at Tejun. But it's
> > > been
> > > mostly PeterZ touching this file lately..
> > 
> > Looks fine to me too. I don't quite understand the usecase tho. It
> > looks
> > like all it's being used for is to tag some kthreads as belonging
> > to the
> > same group.
> 
> Pretty much.
> 

Wen running an instance of knfsd from inside a container, you want to
be able to have the knfsd kthreads be parented to the container init
process so that they get killed off when the container is killed.

Right now, we can easily leak those kernel threads simply by killing
the container.

> > Can't that be done with kthread_data()?
> 
> Huh, maybe so, thanks.
> 
> I need to check this from generic file locking code that could be run
> by
> any task--but I assume there's an easy way I can check if I'm a
> kthread
> before calling  kthread_data(current).
> 
> I do expect to expose a delegation interface for userspace servers
> eventually too.  But we could do the tgid check for them and still
> use
> kthread_data() for nfsd.  That might work.
> 
> --b.
> 
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-01 19:05       ` Trond Myklebust
@ 2020-05-01 19:20         ` tj
  2020-05-01 19:22         ` J. Bruce Fields
  1 sibling, 0 replies; 22+ messages in thread
From: tj @ 2020-05-01 19:20 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: bfields, torvalds, jlayton, linux-nfs, shli, linux-kernel,
	dhowells, oleg

Hello,

On Fri, May 01, 2020 at 07:05:46PM +0000, Trond Myklebust wrote:
> Wen running an instance of knfsd from inside a container, you want to
> be able to have the knfsd kthreads be parented to the container init
> process so that they get killed off when the container is killed.
> 
> Right now, we can easily leak those kernel threads simply by killing
> the container.

Hmm... I'm not sure that'd be a lot easier because they're in their own
thread group. It's not like you can queue signal to the group leader cause
the other kthreads to automatically die. Each would have to handle the exit
explicitly. As long as there is a way to iterate the member kthreads (e.g.
list going through kthread_data or sth else hanging off there), just using
existing kthread interface might not be much different, or maybe even easier
given how hairy signal handling in kthreads can get.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-01 19:05       ` Trond Myklebust
  2020-05-01 19:20         ` tj
@ 2020-05-01 19:22         ` J. Bruce Fields
  1 sibling, 0 replies; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-01 19:22 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: tj, torvalds, jlayton, linux-nfs, shli, linux-kernel, dhowells, oleg

On Fri, May 01, 2020 at 07:05:46PM +0000, Trond Myklebust wrote:
> On Fri, 2020-05-01 at 14:49 -0400, J. Bruce Fields wrote:
> > On Fri, May 01, 2020 at 02:21:54PM -0400, Tejun Heo wrote:
> > > Hello,
> > > 
> > > On Fri, May 01, 2020 at 10:59:24AM -0700, Linus Torvalds wrote:
> > > > Which kind of makes me want to point a finger at Tejun. But it's
> > > > been
> > > > mostly PeterZ touching this file lately..
> > > 
> > > Looks fine to me too. I don't quite understand the usecase tho. It
> > > looks
> > > like all it's being used for is to tag some kthreads as belonging
> > > to the
> > > same group.
> > 
> > Pretty much.
> 
> Wen running an instance of knfsd from inside a container, you want to
> be able to have the knfsd kthreads be parented to the container init
> process so that they get killed off when the container is killed.
> 
> Right now, we can easily leak those kernel threads simply by killing
> the container.

Oh, got it.

Currently knfsd supports nfs service in containers, but it uses a single
set of threads to serve requests from any container.  It should shut the
server threads down when the last container using them goes away.

--b.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-01 18:21   ` Tejun Heo
  2020-05-01 18:30     ` Linus Torvalds
  2020-05-01 18:49     ` J. Bruce Fields
@ 2020-05-05  2:15     ` J. Bruce Fields
  2020-05-05 15:54       ` Tejun Heo
  2020-05-05 21:01       ` J. Bruce Fields
  2 siblings, 2 replies; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-05  2:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linus Torvalds, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

On Fri, May 01, 2020 at 02:21:54PM -0400, Tejun Heo wrote:
> Hello,
> 
> On Fri, May 01, 2020 at 10:59:24AM -0700, Linus Torvalds wrote:
> > Which kind of makes me want to point a finger at Tejun. But it's been
> > mostly PeterZ touching this file lately..
> 
> Looks fine to me too. I don't quite understand the usecase tho. It looks
> like all it's being used for is to tag some kthreads as belonging to the
> same group. Can't that be done with kthread_data()?

Yeah, so I'd forgotten about kthread->data.

We're currently using it to pass the struct svc_rqst that a new nfsd
thread needs.  But once the new thread has gotten that, I guess it could
set kthread->data to some global value that it uses to say "I'm a knfsd
thread"?

I suppose that would work.

Though now I'm feeling greedy: it would be nice to have both some kind
of global flag, *and* keep kthread->data pointing to svc_rqst (as that
would give me a simpler and quicker way to figure out which client is
conflicting).  Could I take a flag bit in kthread->flags, maybe?

--b.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-05  2:15     ` J. Bruce Fields
@ 2020-05-05 15:54       ` Tejun Heo
  2020-05-05 16:23         ` J. Bruce Fields
  2020-05-05 21:01       ` J. Bruce Fields
  1 sibling, 1 reply; 22+ messages in thread
From: Tejun Heo @ 2020-05-05 15:54 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Linus Torvalds, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

Hello, Bruce.

On Mon, May 04, 2020 at 10:15:14PM -0400, J. Bruce Fields wrote:
> We're currently using it to pass the struct svc_rqst that a new nfsd
> thread needs.  But once the new thread has gotten that, I guess it could
> set kthread->data to some global value that it uses to say "I'm a knfsd
> thread"?
> 
> I suppose that would work.
> 
> Though now I'm feeling greedy: it would be nice to have both some kind
> of global flag, *and* keep kthread->data pointing to svc_rqst (as that
> would give me a simpler and quicker way to figure out which client is
> conflicting).  Could I take a flag bit in kthread->flags, maybe?

Hmm... that'd be solvable if kthread->data can point to a struct which does
both things, right? Because it doesn't have free() callback, it's a bit
awkward but the threadfn itself can unlink and RCU-free it before returning.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-05 15:54       ` Tejun Heo
@ 2020-05-05 16:23         ` J. Bruce Fields
  0 siblings, 0 replies; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-05 16:23 UTC (permalink / raw)
  To: Tejun Heo
  Cc: J. Bruce Fields, Linus Torvalds, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

On Tue, May 05, 2020 at 11:54:05AM -0400, Tejun Heo wrote:
> Hello, Bruce.
> 
> On Mon, May 04, 2020 at 10:15:14PM -0400, J. Bruce Fields wrote:
> > We're currently using it to pass the struct svc_rqst that a new nfsd
> > thread needs.  But once the new thread has gotten that, I guess it could
> > set kthread->data to some global value that it uses to say "I'm a knfsd
> > thread"?
> > 
> > I suppose that would work.
> > 
> > Though now I'm feeling greedy: it would be nice to have both some kind
> > of global flag, *and* keep kthread->data pointing to svc_rqst (as that
> > would give me a simpler and quicker way to figure out which client is
> > conflicting).  Could I take a flag bit in kthread->flags, maybe?
> 
> Hmm... that'd be solvable if kthread->data can point to a struct which does
> both things, right?

Isn't this some sort of chicken-and-egg problem?

If you don't know whether a given kthread is an nfsd thread or not, then
it's not safe to assume that kthread->data points to some nfsd-specific
structure that might tell you whether it's an nfsd thread.

> Because it doesn't have free() callback, it's a bit
> awkward but the threadfn itself can unlink and RCU-free it before returning.

It's only ever going to be referenced from the thread itself.  This is
just a way to ask "am I running as an nfsd thread?" when we're deep
inside generic filesystem code somewhere.  So I don't think there's any
complicated lifetime issues here.

--b.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-05  2:15     ` J. Bruce Fields
  2020-05-05 15:54       ` Tejun Heo
@ 2020-05-05 21:01       ` J. Bruce Fields
  2020-05-05 21:09         ` Tejun Heo
  1 sibling, 1 reply; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-05 21:01 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Tejun Heo, Linus Torvalds, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

On Mon, May 04, 2020 at 10:15:14PM -0400, J. Bruce Fields wrote:
> Though now I'm feeling greedy: it would be nice to have both some kind
> of global flag, *and* keep kthread->data pointing to svc_rqst (as that
> would give me a simpler and quicker way to figure out which client is
> conflicting).  Could I take a flag bit in kthread->flags, maybe?

Would something like this be too hacky?:

	--- a/kernel/kthread.c
	+++ b/kernel/kthread.c
	@@ -58,6 +58,7 @@ enum KTHREAD_BITS {
	 	KTHREAD_IS_PER_CPU = 0,
	 	KTHREAD_SHOULD_STOP,
	 	KTHREAD_SHOULD_PARK,
	+	KTHREAD_IS_NFSD,
	 };
	 
	 static inline void set_kthread_struct(void *kthread)
	@@ -164,6 +165,25 @@ void *kthread_data(struct task_struct *task)
	 	return to_kthread(task)->data;
	 }
	 
	+void kthread_set_nfsd()
	+{
	+	set_bit(KTHREAD_IS_NFSD, &to_kthread(current)->flags);
	+}
	+EXPORT_SYMBOL_GPL(kthread_set_nfsd);
	+
	+void *kthread_nfsd_data()
	+{
	+	struct kthread *k;
	+
	+	if (!(current->flags & PF_KTHREAD))
	+		return NULL;
	+	k = to_kthread(current);
	+	if (test_bit(KTHREAD_IS_NFSD, &k->flags))
	+		return k->data;
	+	return NULL;
	+}
	+EXPORT_SYMBOL_GPL(kthread_nfsd_data);
	+
	 /**
	  * kthread_probe_data - speculative version of kthread_data()
	  * @task: possible kthread task in question

It feels weird to make such a special case for nfsd, but it makes this
all very easy for me; complete patch below.

--b.

commit 8b0a2e86dafa
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Fri Jul 28 16:35:15 2017 -0400

    nfsd: clients don't need to break their own delegations
    
    We currently revoke read delegations on any write open or any operation
    that modifies file data or metadata (including rename, link, and
    unlink).  But if the delegation in question is the only read delegation
    and is held by the client performing the operation, that's not really
    necessary.
    
    It's not always possible to prevent this in the NFSv4.0 case, because
    there's not always a way to determine which client an NFSv4.0 delegation
    came from.  (In theory we could try to guess this from the transport
    layer, e.g., by assuming all traffic on a given TCP connection comes
    from the same client.  But that's not really correct.)
    
    In the NFSv4.1 case the session layer always tells us the client.
    
    This patch should remove such self-conflicts in all cases where we can
    reliably determine the client from the compound.
    
    To do that we need to track "who" is performing a given (possibly
    lease-breaking) file operation.  We're doing that by storing the
    information in the svc_rqst and using kthread_data() to map the current
    task back to a svc_rqst.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 5057e4d9dcd1..9fdcec416614 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -425,6 +425,7 @@ prototypes::
 	int (*lm_grant)(struct file_lock *, struct file_lock *, int);
 	void (*lm_break)(struct file_lock *); /* break_lease callback */
 	int (*lm_change)(struct file_lock **, int);
+	bool (*lm_breaker_owns_lease)(struct file_lock *);
 
 locking rules:
 
@@ -435,6 +436,7 @@ lm_notify:		yes		yes			no
 lm_grant:		no		no			no
 lm_break:		yes		no			no
 lm_change		yes		no			no
+lm_breaker_owns_lease:	no		no			no
 ==========		=============	=================	=========
 
 buffer_head
diff --git a/fs/locks.c b/fs/locks.c
index b8a31c1c4fff..a3f186846e93 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1557,6 +1557,9 @@ static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
 {
 	bool rc;
 
+	if (lease->fl_lmops->lm_breaker_owns_lease
+			&& lease->fl_lmops->lm_breaker_owns_lease(lease))
+		return false;
 	if ((breaker->fl_flags & FL_LAYOUT) != (lease->fl_flags & FL_LAYOUT)) {
 		rc = false;
 		goto trace;
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 0e75f7fb5fec..a6d73aa51ce4 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2302,6 +2302,8 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
 	}
 	check_if_stalefh_allowed(args);
 
+	rqstp->rq_lease_breaker = (void **)&cstate->clp;
+
 	trace_nfsd_compound(rqstp, args->opcnt);
 	while (!status && resp->opcnt < args->opcnt) {
 		op = &args->ops[resp->opcnt++];
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e32ecedece0f..2368051bbef3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4520,6 +4520,19 @@ nfsd_break_deleg_cb(struct file_lock *fl)
 	return ret;
 }
 
+static bool nfsd_breaker_owns_lease(struct file_lock *fl)
+{
+	struct nfs4_delegation *dl = fl->fl_owner;
+	struct svc_rqst *rqst;
+	struct nfs4_client *clp;
+
+	rqst = kthread_nfsd_data();
+	if (!rqst)
+		return false;
+	clp = *(rqst->rq_lease_breaker);
+	return dl->dl_stid.sc_client == clp;
+}
+
 static int
 nfsd_change_deleg_cb(struct file_lock *onlist, int arg,
 		     struct list_head *dispose)
@@ -4531,6 +4544,7 @@ nfsd_change_deleg_cb(struct file_lock *onlist, int arg,
 }
 
 static const struct lock_manager_operations nfsd_lease_mng_ops = {
+	.lm_breaker_owns_lease = nfsd_breaker_owns_lease,
 	.lm_break = nfsd_break_deleg_cb,
 	.lm_change = nfsd_change_deleg_cb,
 };
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index ca9fd348548b..9c15b77a726f 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -888,6 +888,8 @@ nfsd(void *vrqstp)
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 	int err;
 
+	kthread_set_nfsd();
+
 	/* Lock module and set up kernel thread */
 	mutex_lock(&nfsd_mutex);
 
@@ -1011,6 +1013,7 @@ nfsd_dispatch(struct svc_rqst *rqstp, __be32 *statp)
 		*statp = rpc_garbage_args;
 		return 1;
 	}
+	rqstp->rq_lease_breaker = NULL;
 	/*
 	 * Give the xdr decoder a chance to change this if it wants
 	 * (necessary in the NFSv4.0 compound case)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4f6f59b4f22a..4b784560ffaf 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1045,6 +1045,7 @@ struct lock_manager_operations {
 	bool (*lm_break)(struct file_lock *);
 	int (*lm_change)(struct file_lock *, int, struct list_head *);
 	void (*lm_setup)(struct file_lock *, void **);
+	bool (*lm_breaker_owns_lease)(struct file_lock *);
 };
 
 struct lock_manager {
diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 8bbcaad7ef0f..d374cad65931 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -58,6 +58,8 @@ bool kthread_should_park(void);
 bool __kthread_should_park(struct task_struct *k);
 bool kthread_freezable_should_stop(bool *was_frozen);
 void *kthread_data(struct task_struct *k);
+void kthread_set_nfsd(void);
+void *kthread_nfsd_data(void);
 void *kthread_probe_data(struct task_struct *k);
 int kthread_park(struct task_struct *k);
 void kthread_unpark(struct task_struct *k);
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index fd390894a584..abf4a57ce4a7 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -299,6 +299,7 @@ struct svc_rqst {
 	struct net		*rq_bc_net;	/* pointer to backchannel's
 						 * net namespace
 						 */
+	void **			rq_lease_breaker; /* The v4 client breaking a lease */
 };
 
 #define SVC_NET(rqst) (rqst->rq_xprt ? rqst->rq_xprt->xpt_net : rqst->rq_bc_net)
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 09b103b92c5a..54d83f329b85 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -58,6 +58,7 @@ enum KTHREAD_BITS {
 	KTHREAD_IS_PER_CPU = 0,
 	KTHREAD_SHOULD_STOP,
 	KTHREAD_SHOULD_PARK,
+	KTHREAD_IS_NFSD,
 };
 
 static inline void set_kthread_struct(void *kthread)
@@ -164,6 +165,25 @@ void *kthread_data(struct task_struct *task)
 	return to_kthread(task)->data;
 }
 
+void kthread_set_nfsd()
+{
+	set_bit(KTHREAD_IS_NFSD, &to_kthread(current)->flags);
+}
+EXPORT_SYMBOL_GPL(kthread_set_nfsd);
+
+void *kthread_nfsd_data()
+{
+	struct kthread *k;
+
+	if (!(current->flags & PF_KTHREAD))
+		return NULL;
+	k = to_kthread(current);
+	if (test_bit(KTHREAD_IS_NFSD, &k->flags))
+		return k->data;
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(kthread_nfsd_data);
+
 /**
  * kthread_probe_data - speculative version of kthread_data()
  * @task: possible kthread task in question

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-05 21:01       ` J. Bruce Fields
@ 2020-05-05 21:09         ` Tejun Heo
  2020-05-05 21:25           ` J. Bruce Fields
  0 siblings, 1 reply; 22+ messages in thread
From: Tejun Heo @ 2020-05-05 21:09 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: J. Bruce Fields, Linus Torvalds, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

Hello,

On Tue, May 05, 2020 at 05:01:18PM -0400, J. Bruce Fields wrote:
> On Mon, May 04, 2020 at 10:15:14PM -0400, J. Bruce Fields wrote:
> > Though now I'm feeling greedy: it would be nice to have both some kind
> > of global flag, *and* keep kthread->data pointing to svc_rqst (as that
> > would give me a simpler and quicker way to figure out which client is
> > conflicting).  Could I take a flag bit in kthread->flags, maybe?
> 
> Would something like this be too hacky?:

It's not the end of the world but a bit hacky. I wonder whether something
like the following would work better for identifying worker type so that you
can do sth like

 if (kthread_fn(current) == nfsd)
        return kthread_data(current);
 else
        return NULL;     

Thanks.

diff --git a/kernel/kthread.c b/kernel/kthread.c
index bfbfa481be3a..4f3ab9f2c994 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -46,6 +46,7 @@ struct kthread_create_info
 struct kthread {
 	unsigned long flags;
 	unsigned int cpu;
+	int (*threadfn)(void *);
 	void *data;
 	struct completion parked;
 	struct completion exited;
@@ -152,6 +153,13 @@ bool kthread_freezable_should_stop(bool *was_frozen)
 }
 EXPORT_SYMBOL_GPL(kthread_freezable_should_stop);
 
+void *kthread_fn(struct task_struct *task)
+{
+	if (task->flags & PF_KTHREAD)
+		return to_kthread(task)->threadfn;
+	return NULL;
+}
+
 /**
  * kthread_data - return data value specified on kthread creation
  * @task: kthread task in question
@@ -244,6 +252,7 @@ static int kthread(void *_create)
 		do_exit(-ENOMEM);
 	}
 
+	self->threadfn = threadfn;
 	self->data = data;
 	init_completion(&self->exited);
 	init_completion(&self->parked);

-- 
tejun

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-05 21:09         ` Tejun Heo
@ 2020-05-05 21:25           ` J. Bruce Fields
  2020-05-06 15:36             ` J. Bruce Fields
  0 siblings, 1 reply; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-05 21:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: J. Bruce Fields, Linus Torvalds, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

On Tue, May 05, 2020 at 05:09:56PM -0400, Tejun Heo wrote:
> Hello,
> 
> On Tue, May 05, 2020 at 05:01:18PM -0400, J. Bruce Fields wrote:
> > On Mon, May 04, 2020 at 10:15:14PM -0400, J. Bruce Fields wrote:
> > > Though now I'm feeling greedy: it would be nice to have both some kind
> > > of global flag, *and* keep kthread->data pointing to svc_rqst (as that
> > > would give me a simpler and quicker way to figure out which client is
> > > conflicting).  Could I take a flag bit in kthread->flags, maybe?
> > 
> > Would something like this be too hacky?:
> 
> It's not the end of the world but a bit hacky. I wonder whether something
> like the following would work better for identifying worker type so that you
> can do sth like
> 
>  if (kthread_fn(current) == nfsd)
>         return kthread_data(current);
>  else
>         return NULL;     

Yes, definitely more generic, looks good to me.

--b.

> 
> Thanks.
> 
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index bfbfa481be3a..4f3ab9f2c994 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -46,6 +46,7 @@ struct kthread_create_info
>  struct kthread {
>  	unsigned long flags;
>  	unsigned int cpu;
> +	int (*threadfn)(void *);
>  	void *data;
>  	struct completion parked;
>  	struct completion exited;
> @@ -152,6 +153,13 @@ bool kthread_freezable_should_stop(bool *was_frozen)
>  }
>  EXPORT_SYMBOL_GPL(kthread_freezable_should_stop);
>  
> +void *kthread_fn(struct task_struct *task)
> +{
> +	if (task->flags & PF_KTHREAD)
> +		return to_kthread(task)->threadfn;
> +	return NULL;
> +}
> +
>  /**
>   * kthread_data - return data value specified on kthread creation
>   * @task: kthread task in question
> @@ -244,6 +252,7 @@ static int kthread(void *_create)
>  		do_exit(-ENOMEM);
>  	}
>  
> +	self->threadfn = threadfn;
>  	self->data = data;
>  	init_completion(&self->exited);
>  	init_completion(&self->parked);
> 
> -- 
> tejun

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-05 21:25           ` J. Bruce Fields
@ 2020-05-06 15:36             ` J. Bruce Fields
  2020-05-06 15:39               ` Tejun Heo
  0 siblings, 1 reply; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-06 15:36 UTC (permalink / raw)
  To: Tejun Heo
  Cc: J. Bruce Fields, Linus Torvalds, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

On Tue, May 05, 2020 at 05:25:27PM -0400, J. Bruce Fields wrote:
> On Tue, May 05, 2020 at 05:09:56PM -0400, Tejun Heo wrote:
> > It's not the end of the world but a bit hacky. I wonder whether something
> > like the following would work better for identifying worker type so that you
> > can do sth like
> > 
> >  if (kthread_fn(current) == nfsd)
> >         return kthread_data(current);
> >  else
> >         return NULL;     
> 
> Yes, definitely more generic, looks good to me.

This is what I'm testing with.

If it's OK with you, could I add your Signed-off-by and take it through
the nfsd tree? I'll have some other patches that will depend on it.

--b.


commit 379bfe5257b6
Author: Tejun Heo <tj@kernel.org>
Date:   Tue May 5 21:26:07 2020 -0400

    kthread: save thread function
    
    It's handy to keep the kthread_fn just as a unique cookie to identify
    classes of kthreads.  E.g. if you can verify that a given task is
    running your thread_fn, then you may know what sort of type kthread_data
    points to.
    
    We'll use this in nfsd to pass some information into the vfs.  Note it
    will need kthread_data() exported too.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 8bbcaad7ef0f..c00ee443833f 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -57,6 +57,7 @@ bool kthread_should_stop(void);
 bool kthread_should_park(void);
 bool __kthread_should_park(struct task_struct *k);
 bool kthread_freezable_should_stop(bool *was_frozen);
+void *kthread_fn(struct task_struct *k);
 void *kthread_data(struct task_struct *k);
 void *kthread_probe_data(struct task_struct *k);
 int kthread_park(struct task_struct *k);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index bfbfa481be3a..b87c4a9ba91d 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -46,6 +46,7 @@ struct kthread_create_info
 struct kthread {
 	unsigned long flags;
 	unsigned int cpu;
+	int (*threadfn)(void *);
 	void *data;
 	struct completion parked;
 	struct completion exited;
@@ -152,6 +153,20 @@ bool kthread_freezable_should_stop(bool *was_frozen)
 }
 EXPORT_SYMBOL_GPL(kthread_freezable_should_stop);
 
+/**
+ * kthread_fn - return the function specified on kthread creation
+ * @task: kthread task in question
+ *
+ * Returns NULL if the task is not a kthread.
+ */
+void *kthread_fn(struct task_struct *task)
+{
+	if (task->flags & PF_KTHREAD)
+		return to_kthread(task)->threadfn;
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(kthread_fn);
+
 /**
  * kthread_data - return data value specified on kthread creation
  * @task: kthread task in question
@@ -164,6 +179,7 @@ void *kthread_data(struct task_struct *task)
 {
 	return to_kthread(task)->data;
 }
+EXPORT_SYMBOL_GPL(kthread_data);
 
 /**
  * kthread_probe_data - speculative version of kthread_data()
@@ -244,6 +260,7 @@ static int kthread(void *_create)
 		do_exit(-ENOMEM);
 	}
 
+	self->threadfn = threadfn;
 	self->data = data;
 	init_completion(&self->exited);
 	init_completion(&self->parked);

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-06 15:36             ` J. Bruce Fields
@ 2020-05-06 15:39               ` Tejun Heo
  2020-05-06 15:54                 ` J. Bruce Fields
  0 siblings, 1 reply; 22+ messages in thread
From: Tejun Heo @ 2020-05-06 15:39 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: J. Bruce Fields, Linus Torvalds, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

Hello, Bruce.

On Wed, May 06, 2020 at 11:36:58AM -0400, J. Bruce Fields wrote:
> On Tue, May 05, 2020 at 05:25:27PM -0400, J. Bruce Fields wrote:
> > On Tue, May 05, 2020 at 05:09:56PM -0400, Tejun Heo wrote:
> > > It's not the end of the world but a bit hacky. I wonder whether something
> > > like the following would work better for identifying worker type so that you
> > > can do sth like
> > > 
> > >  if (kthread_fn(current) == nfsd)
> > >         return kthread_data(current);
> > >  else
> > >         return NULL;     
> > 
> > Yes, definitely more generic, looks good to me.
> 
> This is what I'm testing with.
> 
> If it's OK with you, could I add your Signed-off-by and take it through
> the nfsd tree? I'll have some other patches that will depend on it.

Please feel free to use the code however you see fit. Given that it'll be
originating from you, my signed-off-by might not be the right tag. Something
like Original-patch-by should be good (nothing is fine too).

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] allow multiple kthreadd's
  2020-05-06 15:39               ` Tejun Heo
@ 2020-05-06 15:54                 ` J. Bruce Fields
  0 siblings, 0 replies; 22+ messages in thread
From: J. Bruce Fields @ 2020-05-06 15:54 UTC (permalink / raw)
  To: Tejun Heo
  Cc: J. Bruce Fields, Linus Torvalds, open list:NFS, SUNRPC, AND...,
	Jeff Layton, David Howells, Shaohua Li, Oleg Nesterov,
	Linux Kernel Mailing List

On Wed, May 06, 2020 at 11:39:20AM -0400, Tejun Heo wrote:
> Hello, Bruce.
> 
> On Wed, May 06, 2020 at 11:36:58AM -0400, J. Bruce Fields wrote:
> > On Tue, May 05, 2020 at 05:25:27PM -0400, J. Bruce Fields wrote:
> > > On Tue, May 05, 2020 at 05:09:56PM -0400, Tejun Heo wrote:
> > > > It's not the end of the world but a bit hacky. I wonder whether something
> > > > like the following would work better for identifying worker type so that you
> > > > can do sth like
> > > > 
> > > >  if (kthread_fn(current) == nfsd)
> > > >         return kthread_data(current);
> > > >  else
> > > >         return NULL;     
> > > 
> > > Yes, definitely more generic, looks good to me.
> > 
> > This is what I'm testing with.
> > 
> > If it's OK with you, could I add your Signed-off-by and take it through
> > the nfsd tree? I'll have some other patches that will depend on it.
> 
> Please feel free to use the code however you see fit. Given that it'll be
> originating from you, my signed-off-by might not be the right tag. Something
> like Original-patch-by should be good (nothing is fine too).

OK, I'll do that, thanks!

--b.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2020-05-06 15:54 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-01 16:01 [PATCH 0/4] allow multiple kthreadd's J. Bruce Fields
2020-05-01 16:01 ` [PATCH 1/4] kthreads: minor kthreadd refactoring J. Bruce Fields
2020-05-01 16:01 ` [PATCH 2/4] kthreads: Simplify tsk_fork_get_node J. Bruce Fields
2020-05-01 16:01 ` [PATCH 3/4] kthreads: allow multiple kthreadd's J. Bruce Fields
2020-05-01 16:01 ` [PATCH 4/4] kthreads: allow cloning threads with different flags J. Bruce Fields
2020-05-01 17:59 ` [PATCH 0/4] allow multiple kthreadd's Linus Torvalds
2020-05-01 18:21   ` Tejun Heo
2020-05-01 18:30     ` Linus Torvalds
2020-05-01 19:02       ` J. Bruce Fields
2020-05-01 18:49     ` J. Bruce Fields
2020-05-01 19:05       ` Trond Myklebust
2020-05-01 19:20         ` tj
2020-05-01 19:22         ` J. Bruce Fields
2020-05-05  2:15     ` J. Bruce Fields
2020-05-05 15:54       ` Tejun Heo
2020-05-05 16:23         ` J. Bruce Fields
2020-05-05 21:01       ` J. Bruce Fields
2020-05-05 21:09         ` Tejun Heo
2020-05-05 21:25           ` J. Bruce Fields
2020-05-06 15:36             ` J. Bruce Fields
2020-05-06 15:39               ` Tejun Heo
2020-05-06 15:54                 ` J. Bruce Fields

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.