linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 00/10] kthread: Kthread worker API improvements
@ 2016-05-30 14:59 Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 01/10] kthread/smpboot: Do not park in kthread_create_on_cpu() Petr Mladek
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: Petr Mladek @ 2016-05-30 14:59 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar, Peter Zijlstra
  Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
	Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
	linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek,
	Catalin Marinas, linux-watchdog, Corey Minyard,
	openipmi-developer, Doug Ledford, Sean Hefty, Hal Rosenstock,
	linux-rdma, Maxim Levitsky, Zhang Rui, Eduardo Valentin,
	Jacob Pan, linux-pm, Sebastian Andrzej Siewior

I send the kthread worker API improvements separately as discussed
in v6, see
https://lkml.kernel.org/g/<20160511105224.GE2749@pathway.suse.cz>
They seem to be ready for inclusion in 4.8.

I will send the conversion of the particular kthreads once
the API changes are in some maintainers three (-mm?) and
visible in linux-next. If nobody suggests some other approach.

Also I plan to continue with conversion of more kthreads.

Just to remember. The intention of this patchset is to make
it easier to manipulate and maintain kthreads. Especially, I want
to replace all the custom main cycles with a generic one.
Also I want to make the kthreads sleep in a consistent
state in a common place when there is no work.


Changes against v6:

  + no changes.


Changes against v5:

  + removed spin_trylock() from delayed_kthread_work_timer_fn();
    instead temporary released worked->lock() when calling
    del_timer_sync(); made sure that any queueing was blocked
    by work->canceling in the meatime

  + used 0th byte for KTW_FREEZABLE to reduce confusion

  + fixed warnings in comments reported by make htmldocs

  + sigh, there was no easy way to create an empty va_list
    that would work on all architectures; decided to make
    @namefmt generic in create_kthread_worker_on_cpu()

  + converted khungtaskd a better way; it was inspired by
    the recent changes that appeared in 4.6-rc1


Changes against v4:

  + added worker->delayed_work_list; it simplified the check
    for pending work; we do not longer need the new timer_active()
    function; also we do not need the link work->timer. On the
    other hand we need to distinguish between the normal and
    the delayed work by a boolean parameter passed to
    the common functions, e.g. __cancel_kthread_work_sync()
    
  + replaced most try_lock repeat cycles with a WARN_ON();
    the API does not allow to use the work with more workers;
    so such a situation would be a bug; it removed the
    complex try_lock_kthread_work() function that supported
    more modes;

  + renamed kthread_work_pending() to queuing_blocked();
    added this function later when really needed

  + renamed try_to_cancel_kthread_work() to __cancel_kthread_work();
    in fact, this a common implementation for the async cancel()
    function

  + removed a dull check for invalid cpu number in
    create_kthread_worker_on_cpu(); removed some other unnecessary
    code structures as suggested by Tejun

  + consistently used bool return value in all new __cancel functions

  + fixed ordering of cpu and flags parameters in
    create_kthread_worker_on_cpu() vs. create_kthread_worker()

  + used memset in the init_kthread_worker()

  + updated many comments as suggested by Tejun and as
    required the above changes

  + removed obsolete patch adding timer_active()

  + removed obsolete patch for using try_lock in flush_kthread_worker()

  + double checked all existing users of kthread worker API
    that they reinitialized the work when the worker was started
    and would not print false warnings; all looked fine

  + added taken acks for the Intel Powerclamp conversion
    

Changes against v3:

  + allow to free struct kthread_work from its callback; do not touch
    the struct from the worker post-mortem; as a side effect, the structure
    must be reinitialized when the worker gets restarted; updated
    khugepaged, and kmemleak accordingly

  + call del_timer_sync() with worker->lock; instead, detect canceling
    in the timer callback and give up an attempt to get the lock there;
    do busy loop with spin_is_locked() to reduce cache bouncing

  + renamed ipmi+func() -> ipmi_kthread_worker_func() as suggested
    by Corey

  + added some collected Reviewed-by

  
Changes against v2:

  + used worker->lock to synchronize the operations with the work
    instead of the PENDING bit as suggested by Tejun Heo; it simplified
    the implementation in several ways

  + added timer_active(); used it together with del_timer_sync()
    to cancel the work a less tricky way

  + removed the controversial conversion of the RCU kthreads

  + added several other examples: hung_task, kmemleak, ipmi,
    IB/fmr_pool, memstick/r592, intel_powerclamp

  + the helper fixes for the ring buffer benchmark has been improved
    as suggested by Steven; they already are in the Linus tree now

  + fixed a possible race between the check for existing khugepaged
    worker and queuing the work
 

Changes against v1:

  + remove wrappers to manipulate the scheduling policy and priority

  + remove questionable wakeup_and_destroy_kthread_worker() variant

  + do not check for chained work when draining the queue

  + allocate struct kthread worker in create_kthread_work() and
    use more simple checks for running worker

  + add support for delayed kthread works and use them instead
    of waiting inside the works

  + rework the "unrelated" fixes for the ring buffer benchmark
    as discussed in the 1st RFC; also sent separately

  + convert also the consumer in the ring buffer benchmark


I have tested this patch set against the stable Linus tree
for 4.7-rc1.

Comments against v6 can be found at
https://lkml.kernel.org/g/<1460646879-617-1-git-send-email-pmladek@suse.com>

Petr Mladek (10):
  kthread/smpboot: Do not park in kthread_create_on_cpu()
  kthread: Allow to call __kthread_create_on_node() with va_list args
  kthread: Add create_kthread_worker*()
  kthread: Add drain_kthread_worker()
  kthread: Add destroy_kthread_worker()
  kthread: Detect when a kthread work is used by more workers
  kthread: Initial support for delayed kthread work
  kthread: Allow to cancel kthread work
  kthread: Allow to modify delayed kthread work
  kthread: Better support freezable kthread workers

 include/linux/kthread.h |  57 +++++
 kernel/kthread.c        | 571 +++++++++++++++++++++++++++++++++++++++++++-----
 kernel/smpboot.c        |   5 +
 3 files changed, 581 insertions(+), 52 deletions(-)

CC: Catalin Marinas <catalin.marinas@arm.com>
CC: linux-watchdog@vger.kernel.org
CC: Corey Minyard <minyard@acm.org>
CC: openipmi-developer@lists.sourceforge.net
CC: Doug Ledford <dledford@redhat.com>
CC: Sean Hefty <sean.hefty@intel.com>
CC: Hal Rosenstock <hal.rosenstock@gmail.com>
CC: linux-rdma@vger.kernel.org
CC: Maxim Levitsky <maximlevitsky@gmail.com>
CC: Zhang Rui <rui.zhang@intel.com>
CC: Eduardo Valentin <edubezval@gmail.com>
CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
CC: linux-pm@vger.kernel.org
CC: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

-- 
1.8.5.6

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v7 01/10] kthread/smpboot: Do not park in kthread_create_on_cpu()
  2016-05-30 14:59 [PATCH v7 00/10] kthread: Kthread worker API improvements Petr Mladek
@ 2016-05-30 14:59 ` Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 02/10] kthread: Allow to call __kthread_create_on_node() with va_list args Petr Mladek
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2016-05-30 14:59 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar, Peter Zijlstra
  Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
	Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
	linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek

kthread_create_on_cpu() was added by the commit 2a1d446019f9a5983e
("kthread: Implement park/unpark facility"). It is currently used
only when enabling new CPU. For this purpose, the newly created
kthread has to be parked.

The CPU binding is a bit tricky. The kthread is parked when the CPU
has not been allowed yet. And the CPU is bound when the kthread
is unparked.

The function would be useful for more per-CPU kthreads, e.g.
bnx2fc_thread, fcoethread. For this purpose, the newly created
kthread should stay in the uninterruptible state.

This patch moves the parking into smpboot. It binds the thread
already when created. Then the function might be used universally.
Also the behavior is consistent with kthread_create() and
kthread_create_on_node().

Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/kthread.c | 8 ++++++--
 kernel/smpboot.c | 5 +++++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 9ff173dca1ae..1ffc11ec5546 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -390,10 +390,10 @@ struct task_struct *kthread_create_on_cpu(int (*threadfn)(void *data),
 				   cpu);
 	if (IS_ERR(p))
 		return p;
+	kthread_bind(p, cpu);
+	/* CPU hotplug need to bind once again when unparking the thread. */
 	set_bit(KTHREAD_IS_PER_CPU, &to_kthread(p)->flags);
 	to_kthread(p)->cpu = cpu;
-	/* Park the thread to get it out of TASK_UNINTERRUPTIBLE state */
-	kthread_park(p);
 	return p;
 }
 
@@ -407,6 +407,10 @@ static void __kthread_unpark(struct task_struct *k, struct kthread *kthread)
 	 * which might be about to be cleared.
 	 */
 	if (test_and_clear_bit(KTHREAD_IS_PARKED, &kthread->flags)) {
+		/*
+		 * Newly created kthread was parked when the CPU was offline.
+		 * The binding was lost and we need to set it again.
+		 */
 		if (test_bit(KTHREAD_IS_PER_CPU, &kthread->flags))
 			__kthread_bind(k, kthread->cpu, TASK_PARKED);
 		wake_up_state(k, TASK_PARKED);
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index 13bc43d1fb22..4a5c6e73ecd4 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -186,6 +186,11 @@ __smpboot_create_thread(struct smp_hotplug_thread *ht, unsigned int cpu)
 		kfree(td);
 		return PTR_ERR(tsk);
 	}
+	/*
+	 * Park the thread so that it could start right on the CPU
+	 * when it is available.
+	 */
+	kthread_park(tsk);
 	get_task_struct(tsk);
 	*per_cpu_ptr(ht->store, cpu) = tsk;
 	if (ht->create) {
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v7 02/10] kthread: Allow to call __kthread_create_on_node() with va_list args
  2016-05-30 14:59 [PATCH v7 00/10] kthread: Kthread worker API improvements Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 01/10] kthread/smpboot: Do not park in kthread_create_on_cpu() Petr Mladek
@ 2016-05-30 14:59 ` Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 03/10] kthread: Add create_kthread_worker*() Petr Mladek
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2016-05-30 14:59 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar, Peter Zijlstra
  Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
	Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
	linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek

kthread_create_on_node() implements a bunch of logic to create
the kthread. It is already called by kthread_create_on_cpu().

We are going to extend the kthread worker API and will
need to call kthread_create_on_node() with va_list args there.

This patch does only a refactoring and does not modify the existing
behavior.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/kthread.c | 72 +++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 42 insertions(+), 30 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 1ffc11ec5546..bfe8742c4217 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -244,33 +244,10 @@ static void create_kthread(struct kthread_create_info *create)
 	}
 }
 
-/**
- * kthread_create_on_node - create a kthread.
- * @threadfn: the function to run until signal_pending(current).
- * @data: data ptr for @threadfn.
- * @node: task and thread structures for the thread are allocated on this node
- * @namefmt: printf-style name for the thread.
- *
- * Description: This helper function creates and names a kernel
- * thread.  The thread will be stopped: use wake_up_process() to start
- * it.  See also kthread_run().  The new thread has SCHED_NORMAL policy and
- * is affine to all CPUs.
- *
- * If thread is going to be bound on a particular cpu, give its node
- * in @node, to get NUMA affinity for kthread stack, or else give NUMA_NO_NODE.
- * When woken, the thread will run @threadfn() with @data as its
- * argument. @threadfn() can either call do_exit() directly if it is a
- * standalone thread for which no one will call kthread_stop(), or
- * return when 'kthread_should_stop()' is true (which means
- * kthread_stop() has been called).  The return value should be zero
- * or a negative error number; it will be passed to kthread_stop().
- *
- * Returns a task_struct or ERR_PTR(-ENOMEM) or ERR_PTR(-EINTR).
- */
-struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
-					   void *data, int node,
-					   const char namefmt[],
-					   ...)
+static struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
+						    void *data, int node,
+						    const char namefmt[],
+						    va_list args)
 {
 	DECLARE_COMPLETION_ONSTACK(done);
 	struct task_struct *task;
@@ -311,11 +288,8 @@ struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
 	task = create->result;
 	if (!IS_ERR(task)) {
 		static const struct sched_param param = { .sched_priority = 0 };
-		va_list args;
 
-		va_start(args, namefmt);
 		vsnprintf(task->comm, sizeof(task->comm), namefmt, args);
-		va_end(args);
 		/*
 		 * root may have changed our (kthreadd's) priority or CPU mask.
 		 * The kernel thread should not inherit these properties.
@@ -326,6 +300,44 @@ struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
 	kfree(create);
 	return task;
 }
+
+/**
+ * kthread_create_on_node - create a kthread.
+ * @threadfn: the function to run until signal_pending(current).
+ * @data: data ptr for @threadfn.
+ * @node: task and thread structures for the thread are allocated on this node
+ * @namefmt: printf-style name for the thread.
+ *
+ * Description: This helper function creates and names a kernel
+ * thread.  The thread will be stopped: use wake_up_process() to start
+ * it.  See also kthread_run().  The new thread has SCHED_NORMAL policy and
+ * is affine to all CPUs.
+ *
+ * If thread is going to be bound on a particular cpu, give its node
+ * in @node, to get NUMA affinity for kthread stack, or else give NUMA_NO_NODE.
+ * When woken, the thread will run @threadfn() with @data as its
+ * argument. @threadfn() can either call do_exit() directly if it is a
+ * standalone thread for which no one will call kthread_stop(), or
+ * return when 'kthread_should_stop()' is true (which means
+ * kthread_stop() has been called).  The return value should be zero
+ * or a negative error number; it will be passed to kthread_stop().
+ *
+ * Returns a task_struct or ERR_PTR(-ENOMEM) or ERR_PTR(-EINTR).
+ */
+struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
+					   void *data, int node,
+					   const char namefmt[],
+					   ...)
+{
+	struct task_struct *task;
+	va_list args;
+
+	va_start(args, namefmt);
+	task = __kthread_create_on_node(threadfn, data, node, namefmt, args);
+	va_end(args);
+
+	return task;
+}
 EXPORT_SYMBOL(kthread_create_on_node);
 
 static void __kthread_bind_mask(struct task_struct *p, const struct cpumask *mask, long state)
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v7 03/10] kthread: Add create_kthread_worker*()
  2016-05-30 14:59 [PATCH v7 00/10] kthread: Kthread worker API improvements Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 01/10] kthread/smpboot: Do not park in kthread_create_on_cpu() Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 02/10] kthread: Allow to call __kthread_create_on_node() with va_list args Petr Mladek
@ 2016-05-30 14:59 ` Petr Mladek
  2016-06-01 19:36   ` Andrew Morton
  2016-05-30 14:59 ` [PATCH v7 04/10] kthread: Add drain_kthread_worker() Petr Mladek
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 12+ messages in thread
From: Petr Mladek @ 2016-05-30 14:59 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar, Peter Zijlstra
  Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
	Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
	linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek

Kthread workers are currently created using the classic kthread API,
namely kthread_run(). kthread_worker_fn() is passed as the @threadfn
parameter.

This patch defines create_kthread_worker() and
create_kthread_worker_on_cpu() functions that hide implementation details.

They enforce using kthread_worker_fn() for the main thread. But I doubt
that there are any plans to create any alternative. In fact, I think
that we do not want any alternative main thread because it would be
hard to support consistency with the rest of the kthread worker API.

The naming and function of create_kthread_worker() is inspired by
the workqueues API like the rest of the kthread worker API.

The create_kthread_worker_on_cpu() variant is motivated by the original
kthread_create_on_cpu(). Note that we need to bind per-CPU kthread
workers already when they are created. It makes the life easier.
kthread_bind() could not be used later for an already running worker.

This patch does _not_ convert existing kthread workers. The kthread worker
API need more improvements first, e.g. a function to destroy the worker.

IMPORTANT:

create_kthread_worker_on_cpu() allows to use any format of the
worker name, in compare with kthread_create_on_cpu(). The good thing
is that it is more generic. The bad thing is that most users will
need to pass the cpu number in two parameters, e.g.
create_kthread_worker_on_cpu(cpu, "helper/%d", cpu).

To be honest, the main motivation was to avoid the need for an
empty va_list. The only legal way was to create a helper function that
would be called with an empty list. Other attempts caused compilation
warnings or even errors on different architectures.

There were also other alternatives, for example, using #define or
splitting __create_kthread_worker(). The used solution looked
like the least ugly.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 include/linux/kthread.h |   7 +++
 kernel/kthread.c        | 113 +++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 110 insertions(+), 10 deletions(-)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index e691b6a23f72..468011efa68d 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -124,6 +124,13 @@ extern void __init_kthread_worker(struct kthread_worker *worker,
 
 int kthread_worker_fn(void *worker_ptr);
 
+__printf(1, 2)
+struct kthread_worker *
+create_kthread_worker(const char namefmt[], ...);
+
+struct kthread_worker *
+create_kthread_worker_on_cpu(int cpu, const char namefmt[], ...);
+
 bool queue_kthread_work(struct kthread_worker *worker,
 			struct kthread_work *work);
 void flush_kthread_work(struct kthread_work *work);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index bfe8742c4217..76364374ff98 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -567,23 +567,24 @@ EXPORT_SYMBOL_GPL(__init_kthread_worker);
  * kthread_worker_fn - kthread function to process kthread_worker
  * @worker_ptr: pointer to initialized kthread_worker
  *
- * This function can be used as @threadfn to kthread_create() or
- * kthread_run() with @worker_ptr argument pointing to an initialized
- * kthread_worker.  The started kthread will process work_list until
- * the it is stopped with kthread_stop().  A kthread can also call
- * this function directly after extra initialization.
+ * This function implements the main cycle of kthread worker. It processes
+ * work_list until it is stopped with kthread_stop(). It sleeps when the queue
+ * is empty.
  *
- * Different kthreads can be used for the same kthread_worker as long
- * as there's only one kthread attached to it at any given time.  A
- * kthread_worker without an attached kthread simply collects queued
- * kthread_works.
+ * The works are not allowed to keep any locks, disable preemption or interrupts
+ * when they finish. There is defined a safe point for freezing when one work
+ * finishes and before a new one is started.
  */
 int kthread_worker_fn(void *worker_ptr)
 {
 	struct kthread_worker *worker = worker_ptr;
 	struct kthread_work *work;
 
-	WARN_ON(worker->task);
+	/*
+	 * FIXME: Update the check and remove the assignment when all kthread
+	 * worker users are created using create_kthread_worker*() functions.
+	 */
+	WARN_ON(worker->task && worker->task != current);
 	worker->task = current;
 repeat:
 	set_current_state(TASK_INTERRUPTIBLE);	/* mb paired w/ kthread_stop */
@@ -617,6 +618,98 @@ repeat:
 }
 EXPORT_SYMBOL_GPL(kthread_worker_fn);
 
+static struct kthread_worker *
+__create_kthread_worker(int cpu, const char namefmt[], va_list args)
+{
+	struct kthread_worker *worker;
+	struct task_struct *task;
+
+	worker = kzalloc(sizeof(*worker), GFP_KERNEL);
+	if (!worker)
+		return ERR_PTR(-ENOMEM);
+
+	init_kthread_worker(worker);
+
+	if (cpu >= 0) {
+		char name[TASK_COMM_LEN];
+
+		/*
+		 * creare_kthread_worker_on_cpu() allows to pass a generic
+		 * namefmt in compare with kthread_create_on_cpu. We need
+		 * to format it here.
+		 */
+		vsnprintf(name, sizeof(name), namefmt, args);
+		task = kthread_create_on_cpu(kthread_worker_fn, worker,
+					     cpu, name);
+	} else {
+		task = __kthread_create_on_node(kthread_worker_fn, worker,
+						-1, namefmt, args);
+	}
+
+	if (IS_ERR(task))
+		goto fail_task;
+
+	worker->task = task;
+	wake_up_process(task);
+	return worker;
+
+fail_task:
+	kfree(worker);
+	return ERR_CAST(task);
+}
+
+/**
+ * create_kthread_worker - create a kthread worker
+ * @namefmt: printf-style name for the kthread worker (task).
+ *
+ * Returns a pointer to the allocated worker on success, ERR_PTR(-ENOMEM)
+ * when the needed structures could not get allocated, and ERR_PTR(-EINTR)
+ * when the worker was SIGKILLed.
+ */
+struct kthread_worker *
+create_kthread_worker(const char namefmt[], ...)
+{
+	struct kthread_worker *worker;
+	va_list args;
+
+	va_start(args, namefmt);
+	worker = __create_kthread_worker(-1, namefmt, args);
+	va_end(args);
+
+	return worker;
+}
+EXPORT_SYMBOL(create_kthread_worker);
+
+/**
+ * create_kthread_worker_on_cpu - create a kthread worker and bind it
+ *	it to a given CPU and the associated NUMA node.
+ * @cpu: CPU number
+ * @namefmt: printf-style name for the kthread worker (task).
+ *
+ * Use a valid CPU number if you want to bind the kthread worker
+ * to the given CPU and the associated NUMA node.
+ *
+ * A good practice is to add the cpu number also into the worker name.
+ * For example, use create_kthread_worker_on_cpu(cpu, "helper/%d", cpu).
+ *
+ * Returns a pointer to the allocated worker on success, ERR_PTR(-ENOMEM)
+ * when the needed structures could not get allocated, and ERR_PTR(-EINTR)
+ * when the worker was SIGKILLed.
+ */
+struct kthread_worker *
+create_kthread_worker_on_cpu(int cpu, const char namefmt[], ...)
+{
+	struct kthread_worker *worker;
+	va_list args;
+
+	va_start(args, namefmt);
+	worker = __create_kthread_worker(cpu, namefmt, args);
+	va_end(args);
+
+	return worker;
+}
+EXPORT_SYMBOL(create_kthread_worker_on_cpu);
+
 /* insert @work before @pos in @worker */
 static void insert_kthread_work(struct kthread_worker *worker,
 			       struct kthread_work *work,
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v7 04/10] kthread: Add drain_kthread_worker()
  2016-05-30 14:59 [PATCH v7 00/10] kthread: Kthread worker API improvements Petr Mladek
                   ` (2 preceding siblings ...)
  2016-05-30 14:59 ` [PATCH v7 03/10] kthread: Add create_kthread_worker*() Petr Mladek
@ 2016-05-30 14:59 ` Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 05/10] kthread: Add destroy_kthread_worker() Petr Mladek
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2016-05-30 14:59 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar, Peter Zijlstra
  Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
	Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
	linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek

flush_kthread_worker() returns when the currently queued works are proceed.
But some other works might have been queued in the meantime.

This patch adds drain_kthread_worker() that is inspired by
drain_workqueue(). It returns when the queue is completely
empty and warns when it takes too long.

The initial implementation does not block queuing new works when
draining. It makes things much easier. The blocking would be useful
to debug potential problems but it is not clear if it is worth
the complication at the moment.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/kthread.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 76364374ff98..6051aa9d93c6 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -818,3 +818,37 @@ void flush_kthread_worker(struct kthread_worker *worker)
 	wait_for_completion(&fwork.done);
 }
 EXPORT_SYMBOL_GPL(flush_kthread_worker);
+
+/**
+ * drain_kthread_worker - drain a kthread worker
+ * @worker: worker to be drained
+ *
+ * Wait until there is no work queued for the given kthread worker.
+ * @worker is flushed repeatedly until it becomes empty.  The number
+ * of flushing is determined by the depth of chaining and should
+ * be relatively short.  Whine if it takes too long.
+ *
+ * The caller is responsible for blocking all users of this kthread
+ * worker from queuing new works. Also it is responsible for blocking
+ * the already queued works from an infinite re-queuing!
+ */
+void drain_kthread_worker(struct kthread_worker *worker)
+{
+	int flush_cnt = 0;
+
+	spin_lock_irq(&worker->lock);
+
+	while (!list_empty(&worker->work_list)) {
+		spin_unlock_irq(&worker->lock);
+
+		flush_kthread_worker(worker);
+		WARN_ONCE(flush_cnt++ > 10,
+			  "kthread worker %s: drain_kthread_worker() isn't complete after %u tries\n",
+			  worker->task->comm, flush_cnt);
+
+		spin_lock_irq(&worker->lock);
+	}
+
+	spin_unlock_irq(&worker->lock);
+}
+EXPORT_SYMBOL(drain_kthread_worker);
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v7 05/10] kthread: Add destroy_kthread_worker()
  2016-05-30 14:59 [PATCH v7 00/10] kthread: Kthread worker API improvements Petr Mladek
                   ` (3 preceding siblings ...)
  2016-05-30 14:59 ` [PATCH v7 04/10] kthread: Add drain_kthread_worker() Petr Mladek
@ 2016-05-30 14:59 ` Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 06/10] kthread: Detect when a kthread work is used by more workers Petr Mladek
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2016-05-30 14:59 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar, Peter Zijlstra
  Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
	Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
	linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek

The current kthread worker users call flush() and stop() explicitly.
This function drains the worker, stops it, and frees the kthread_worker
struct in one call.

It is supposed to be used together with create_kthread_worker*() that
allocates struct kthread_worker.

Also note that drain() correctly handles self-queuing works in compare
with flush().

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 include/linux/kthread.h |  2 ++
 kernel/kthread.c        | 21 +++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 468011efa68d..a36604fa8aa2 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -136,4 +136,6 @@ bool queue_kthread_work(struct kthread_worker *worker,
 void flush_kthread_work(struct kthread_work *work);
 void flush_kthread_worker(struct kthread_worker *worker);
 
+void destroy_kthread_worker(struct kthread_worker *worker);
+
 #endif /* _LINUX_KTHREAD_H */
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 6051aa9d93c6..441651765f08 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -852,3 +852,24 @@ void drain_kthread_worker(struct kthread_worker *worker)
 	spin_unlock_irq(&worker->lock);
 }
 EXPORT_SYMBOL(drain_kthread_worker);
+
+/**
+ * destroy_kthread_worker - destroy a kthread worker
+ * @worker: worker to be destroyed
+ *
+ * Drain and destroy @worker.  It has the same conditions
+ * for use as drain_kthread_worker(), see above.
+ */
+void destroy_kthread_worker(struct kthread_worker *worker)
+{
+	struct task_struct *task;
+
+	task = worker->task;
+	if (WARN_ON(!task))
+		return;
+
+	drain_kthread_worker(worker);
+	kthread_stop(task);
+	kfree(worker);
+}
+EXPORT_SYMBOL(destroy_kthread_worker);
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v7 06/10] kthread: Detect when a kthread work is used by more workers
  2016-05-30 14:59 [PATCH v7 00/10] kthread: Kthread worker API improvements Petr Mladek
                   ` (4 preceding siblings ...)
  2016-05-30 14:59 ` [PATCH v7 05/10] kthread: Add destroy_kthread_worker() Petr Mladek
@ 2016-05-30 14:59 ` Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 07/10] kthread: Initial support for delayed kthread work Petr Mladek
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2016-05-30 14:59 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar, Peter Zijlstra
  Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
	Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
	linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek

Nothing currently prevents a work from queuing for a kthread worker
when it is already running on another one. This means that the work
might run in parallel on more workers. Also some operations, e.g.
flush or drain are not reliable.

This problem will be even more visible after we add cancel_kthread_work()
function. It will only have "work" as the parameter and will use
worker->lock to synchronize with others.

Well, normally this is not a problem because the API users are sane.
But bugs might happen and users also might be crazy.

This patch adds a warning when we try to insert the work for another
worker. It does not fully prevent the misuse because it would make the
code much more complicated without a big benefit.

It adds the same warning also into flush_kthread_work() instead of
the repeated attempts to get the right lock.

A side effect is that one needs to explicitly reinitialize the work
if it must be queued into another worker. This is needed, for example,
when the worker is stopped and started again. It is a bit inconvenient.
But it looks like a good compromise between the stability and complexity.

I have double checked all existing users of the kthread worker API
and they all seems to initialize the work after the worker gets
started.

Just for completeness, the patch adds a check that the work is not
already in a queue.

The patch also puts all the checks into a separate function. It will
be reused when implementing delayed works.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/kthread.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 441651765f08..385a7d6b4872 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -574,6 +574,9 @@ EXPORT_SYMBOL_GPL(__init_kthread_worker);
  * The works are not allowed to keep any locks, disable preemption or interrupts
  * when they finish. There is defined a safe point for freezing when one work
  * finishes and before a new one is started.
+ *
+ * Also the works must not be handled by more workers at the same time, see also
+ * queue_kthread_work().
  */
 int kthread_worker_fn(void *worker_ptr)
 {
@@ -710,12 +713,21 @@ create_kthread_worker_on_cpu(int cpu, const char namefmt[], ...)
 }
 EXPORT_SYMBOL(create_kthread_worker_on_cpu);
 
+static void insert_kthread_work_sanity_check(struct kthread_worker *worker,
+					       struct kthread_work *work)
+{
+	lockdep_assert_held(&worker->lock);
+	WARN_ON_ONCE(!list_empty(&work->node));
+	/* Do not use a work with more workers, see queue_kthread_work() */
+	WARN_ON_ONCE(work->worker && work->worker != worker);
+}
+
 /* insert @work before @pos in @worker */
 static void insert_kthread_work(struct kthread_worker *worker,
-			       struct kthread_work *work,
-			       struct list_head *pos)
+				struct kthread_work *work,
+				struct list_head *pos)
 {
-	lockdep_assert_held(&worker->lock);
+	insert_kthread_work_sanity_check(worker, work);
 
 	list_add_tail(&work->node, pos);
 	work->worker = worker;
@@ -731,6 +743,9 @@ static void insert_kthread_work(struct kthread_worker *worker,
  * Queue @work to work processor @task for async execution.  @task
  * must have been created with kthread_worker_create().  Returns %true
  * if @work was successfully queued, %false if it was already pending.
+ *
+ * Reinitialize the work if it needs to be used by another worker.
+ * For example, when the worker was stopped and started again.
  */
 bool queue_kthread_work(struct kthread_worker *worker,
 			struct kthread_work *work)
@@ -775,16 +790,13 @@ void flush_kthread_work(struct kthread_work *work)
 	struct kthread_worker *worker;
 	bool noop = false;
 
-retry:
 	worker = work->worker;
 	if (!worker)
 		return;
 
 	spin_lock_irq(&worker->lock);
-	if (work->worker != worker) {
-		spin_unlock_irq(&worker->lock);
-		goto retry;
-	}
+	/* Work must not be used with more workers, see queue_kthread_work(). */
+	WARN_ON_ONCE(work->worker != worker);
 
 	if (!list_empty(&work->node))
 		insert_kthread_work(worker, &fwork.work, work->node.next);
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v7 07/10] kthread: Initial support for delayed kthread work
  2016-05-30 14:59 [PATCH v7 00/10] kthread: Kthread worker API improvements Petr Mladek
                   ` (5 preceding siblings ...)
  2016-05-30 14:59 ` [PATCH v7 06/10] kthread: Detect when a kthread work is used by more workers Petr Mladek
@ 2016-05-30 14:59 ` Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 08/10] kthread: Allow to cancel " Petr Mladek
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2016-05-30 14:59 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar, Peter Zijlstra
  Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
	Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
	linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek

We are going to use kthread_worker more widely and delayed works
will be pretty useful.

The implementation is inspired by workqueues. It uses a timer to
queue the work after the requested delay. If the delay is zero,
the work is queued immediately.

In compare with workqueues, each work is associated with a single
worker (kthread). Therefore the implementation could be much easier.
In particular, we use the worker->lock to synchronize all the
operations with the work. We do not need any atomic operation
with a flags variable.

In fact, we do not need any state variable at all. Instead, we
add a list of delayed works into the worker. Then the pending
work is listed either in the list of queued or delayed works.
And the existing check of pending works is the same even for
the delayed ones.

A work must not be assigned to another worker unless reinitialized.
Therefore the timer handler might expect that dwork->work->worker
is valid and it could simply take the lock. We just add some
sanity checks to help with debugging a potential misuse.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 include/linux/kthread.h |  33 ++++++++++++++++
 kernel/kthread.c        | 102 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 135 insertions(+)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index a36604fa8aa2..b27be55cffa3 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -63,10 +63,12 @@ extern int tsk_fork_get_node(struct task_struct *tsk);
  */
 struct kthread_work;
 typedef void (*kthread_work_func_t)(struct kthread_work *work);
+void delayed_kthread_work_timer_fn(unsigned long __data);
 
 struct kthread_worker {
 	spinlock_t		lock;
 	struct list_head	work_list;
+	struct list_head	delayed_work_list;
 	struct task_struct	*task;
 	struct kthread_work	*current_work;
 };
@@ -77,9 +79,15 @@ struct kthread_work {
 	struct kthread_worker	*worker;
 };
 
+struct delayed_kthread_work {
+	struct kthread_work work;
+	struct timer_list timer;
+};
+
 #define KTHREAD_WORKER_INIT(worker)	{				\
 	.lock = __SPIN_LOCK_UNLOCKED((worker).lock),			\
 	.work_list = LIST_HEAD_INIT((worker).work_list),		\
+	.delayed_work_list = LIST_HEAD_INIT((worker).delayed_work_list),\
 	}
 
 #define KTHREAD_WORK_INIT(work, fn)	{				\
@@ -87,12 +95,23 @@ struct kthread_work {
 	.func = (fn),							\
 	}
 
+#define DELAYED_KTHREAD_WORK_INIT(dwork, fn) {				\
+	.work = KTHREAD_WORK_INIT((dwork).work, (fn)),			\
+	.timer = __TIMER_INITIALIZER(delayed_kthread_work_timer_fn,	\
+				     0, (unsigned long)&(dwork),	\
+				     TIMER_IRQSAFE),			\
+	}
+
 #define DEFINE_KTHREAD_WORKER(worker)					\
 	struct kthread_worker worker = KTHREAD_WORKER_INIT(worker)
 
 #define DEFINE_KTHREAD_WORK(work, fn)					\
 	struct kthread_work work = KTHREAD_WORK_INIT(work, fn)
 
+#define DEFINE_DELAYED_KTHREAD_WORK(dwork, fn)				\
+	struct delayed_kthread_work dwork =				\
+		DELAYED_KTHREAD_WORK_INIT(dwork, fn)
+
 /*
  * kthread_worker.lock needs its own lockdep class key when defined on
  * stack with lockdep enabled.  Use the following macros in such cases.
@@ -122,6 +141,15 @@ extern void __init_kthread_worker(struct kthread_worker *worker,
 		(work)->func = (fn);					\
 	} while (0)
 
+#define init_delayed_kthread_work(dwork, fn)				\
+	do {								\
+		init_kthread_work(&(dwork)->work, (fn));		\
+		__setup_timer(&(dwork)->timer,				\
+			      delayed_kthread_work_timer_fn,		\
+			      (unsigned long)(dwork),			\
+			      TIMER_IRQSAFE);				\
+	} while (0)
+
 int kthread_worker_fn(void *worker_ptr);
 
 __printf(1, 2)
@@ -133,6 +161,11 @@ create_kthread_worker_on_cpu(int cpu, const char namefmt[], ...);
 
 bool queue_kthread_work(struct kthread_worker *worker,
 			struct kthread_work *work);
+
+bool queue_delayed_kthread_work(struct kthread_worker *worker,
+				struct delayed_kthread_work *dwork,
+				unsigned long delay);
+
 void flush_kthread_work(struct kthread_work *work);
 void flush_kthread_worker(struct kthread_worker *worker);
 
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 385a7d6b4872..7655357065e1 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -559,6 +559,7 @@ void __init_kthread_worker(struct kthread_worker *worker,
 	spin_lock_init(&worker->lock);
 	lockdep_set_class_and_name(&worker->lock, key, name);
 	INIT_LIST_HEAD(&worker->work_list);
+	INIT_LIST_HEAD(&worker->delayed_work_list);
 	worker->task = NULL;
 }
 EXPORT_SYMBOL_GPL(__init_kthread_worker);
@@ -763,6 +764,107 @@ bool queue_kthread_work(struct kthread_worker *worker,
 }
 EXPORT_SYMBOL_GPL(queue_kthread_work);
 
+/**
+ * delayed_kthread_work_timer_fn - callback that queues the associated delayed
+ *	kthread work when the timer expires.
+ * @__data: pointer to the data associated with the timer
+ *
+ * The format of the function is defined by struct timer_list.
+ * It should have been called from irqsafe timer with irq already off.
+ */
+void delayed_kthread_work_timer_fn(unsigned long __data)
+{
+	struct delayed_kthread_work *dwork =
+		(struct delayed_kthread_work *)__data;
+	struct kthread_work *work = &dwork->work;
+	struct kthread_worker *worker = work->worker;
+
+	/*
+	 * This might happen when a pending work is reinitialized.
+	 * It means that it is used a wrong way.
+	 */
+	if (WARN_ON_ONCE(!worker))
+		return;
+
+	spin_lock(&worker->lock);
+	/* Work must not be used with more workers, see queue_kthread_work(). */
+	WARN_ON_ONCE(work->worker != worker);
+
+	/* Move the work from worker->delayed_work_list. */
+	WARN_ON_ONCE(list_empty(&work->node));
+	list_del_init(&work->node);
+	insert_kthread_work(worker, work, &worker->work_list);
+
+	spin_unlock(&worker->lock);
+}
+EXPORT_SYMBOL(delayed_kthread_work_timer_fn);
+
+void __queue_delayed_kthread_work(struct kthread_worker *worker,
+				struct delayed_kthread_work *dwork,
+				unsigned long delay)
+{
+	struct timer_list *timer = &dwork->timer;
+	struct kthread_work *work = &dwork->work;
+
+	WARN_ON_ONCE(timer->function != delayed_kthread_work_timer_fn ||
+		     timer->data != (unsigned long)dwork);
+
+	/*
+	 * If @delay is 0, queue @dwork->work immediately.  This is for
+	 * both optimization and correctness.  The earliest @timer can
+	 * expire is on the closest next tick and delayed_work users depend
+	 * on that there's no such delay when @delay is 0.
+	 */
+	if (!delay) {
+		insert_kthread_work(worker, work, &worker->work_list);
+		return;
+	}
+
+	/* Be paranoid and try to detect possible races already now. */
+	insert_kthread_work_sanity_check(worker, work);
+
+	list_add(&work->node, &worker->delayed_work_list);
+	work->worker = worker;
+	timer_stats_timer_set_start_info(&dwork->timer);
+	timer->expires = jiffies + delay;
+	add_timer(timer);
+}
+
+/**
+ * queue_delayed_kthread_work - queue the associated kthread work
+ *	after a delay.
+ * @worker: target kthread_worker
+ * @dwork: delayed_kthread_work to queue
+ * @delay: number of jiffies to wait before queuing
+ *
+ * If the work has not been pending it starts a timer that will queue
+ * the work after the given @delay. If @delay is zero, it queues the
+ * work immediately.
+ *
+ * Return: %false if the @work has already been pending. It means that
+ * either the timer was running or the work was queued. It returns %true
+ * otherwise.
+ */
+bool queue_delayed_kthread_work(struct kthread_worker *worker,
+				struct delayed_kthread_work *dwork,
+				unsigned long delay)
+{
+	struct kthread_work *work = &dwork->work;
+	unsigned long flags;
+	bool ret = false;
+
+	spin_lock_irqsave(&worker->lock, flags);
+
+	if (list_empty(&work->node)) {
+		__queue_delayed_kthread_work(worker, dwork, delay);
+		ret = true;
+	}
+
+	spin_unlock_irqrestore(&worker->lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(queue_delayed_kthread_work);
+
 struct kthread_flush_work {
 	struct kthread_work	work;
 	struct completion	done;
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v7 08/10] kthread: Allow to cancel kthread work
  2016-05-30 14:59 [PATCH v7 00/10] kthread: Kthread worker API improvements Petr Mladek
                   ` (6 preceding siblings ...)
  2016-05-30 14:59 ` [PATCH v7 07/10] kthread: Initial support for delayed kthread work Petr Mladek
@ 2016-05-30 14:59 ` Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 09/10] kthread: Allow to modify delayed " Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 10/10] kthread: Better support freezable kthread workers Petr Mladek
  9 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2016-05-30 14:59 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar, Peter Zijlstra
  Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
	Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
	linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek

We are going to use kthread workers more widely and sometimes we will need
to make sure that the work is neither pending nor running.

This patch implements cancel_*_sync() operations as inspired by
workqueues. Well, we are synchronized against the other operations
via the worker lock, we use del_timer_sync() and a counter to count
parallel cancel operations. Therefore the implementation might be easier.

First, we check if a worker is assigned. If not, the work has newer
been queued after it was initialized.

Second, we take the worker lock. It must be the right one. The work must
not be assigned to another worker unless it is initialized in between.

Third, we try to cancel the timer when it exists. The timer is deleted
synchronously to make sure that the timer call back is not running.
We need to temporary release the worker->lock to avoid a possible
deadlock with the callback. In the meantime, we set work->canceling
counter to avoid any queuing.

Fourth, we try to remove the work from a worker list. It might be
the list of either normal or delayed works.

Fifth, if the work is running, we call flush_kthread_work(). It might
take an arbitrary time. We need to release the worker-lock again.
In the meantime, we again block any queuing by the canceling counter.

As already mentioned, the check for a pending kthread work is done under
a lock. In compare with workqueues, we do not need to fight for a single
PENDING bit to block other operations. Therefore do not suffer from
the thundering storm problem and all parallel canceling jobs might use
kthread_work_flush(). Any queuing is blocked until the counter is zero.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 include/linux/kthread.h |   5 ++
 kernel/kthread.c        | 131 +++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 134 insertions(+), 2 deletions(-)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index b27be55cffa3..49f59b087b6b 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -77,6 +77,8 @@ struct kthread_work {
 	struct list_head	node;
 	kthread_work_func_t	func;
 	struct kthread_worker	*worker;
+	/* Number of canceling calls that are running at the moment. */
+	int			canceling;
 };
 
 struct delayed_kthread_work {
@@ -169,6 +171,9 @@ bool queue_delayed_kthread_work(struct kthread_worker *worker,
 void flush_kthread_work(struct kthread_work *work);
 void flush_kthread_worker(struct kthread_worker *worker);
 
+bool cancel_kthread_work_sync(struct kthread_work *work);
+bool cancel_delayed_kthread_work_sync(struct delayed_kthread_work *work);
+
 void destroy_kthread_worker(struct kthread_worker *worker);
 
 #endif /* _LINUX_KTHREAD_H */
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 7655357065e1..10129fdd4f3b 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -714,6 +714,18 @@ create_kthread_worker_on_cpu(int cpu, const char namefmt[], ...)
 }
 EXPORT_SYMBOL(create_kthread_worker_on_cpu);
 
+/*
+ * Returns true when the work could not be queued at the moment.
+ * It happens when it is already pending in a worker list
+ * or when it is being cancelled.
+ *
+ * This function must be called under work->worker->lock.
+ */
+static inline bool queuing_blocked(const struct kthread_work *work)
+{
+	return !list_empty(&work->node) || work->canceling;
+}
+
 static void insert_kthread_work_sanity_check(struct kthread_worker *worker,
 					       struct kthread_work *work)
 {
@@ -755,7 +767,7 @@ bool queue_kthread_work(struct kthread_worker *worker,
 	unsigned long flags;
 
 	spin_lock_irqsave(&worker->lock, flags);
-	if (list_empty(&work->node)) {
+	if (!queuing_blocked(work)) {
 		insert_kthread_work(worker, work, &worker->work_list);
 		ret = true;
 	}
@@ -855,7 +867,7 @@ bool queue_delayed_kthread_work(struct kthread_worker *worker,
 
 	spin_lock_irqsave(&worker->lock, flags);
 
-	if (list_empty(&work->node)) {
+	if (!queuing_blocked(work)) {
 		__queue_delayed_kthread_work(worker, dwork, delay);
 		ret = true;
 	}
@@ -914,6 +926,121 @@ void flush_kthread_work(struct kthread_work *work)
 }
 EXPORT_SYMBOL_GPL(flush_kthread_work);
 
+/*
+ * This function removes the work from the worker queue. Also it makes sure
+ * that it won't get queued later via the delayed work's timer.
+ *
+ * The work might still be in use when this function finishes. See the
+ * current_work proceed by the worker.
+ *
+ * Return: %true if @work was pending and successfully canceled,
+ *	%false if @work was not pending
+ */
+static bool __cancel_kthread_work(struct kthread_work *work, bool is_dwork,
+				  unsigned long *flags)
+{
+	/* Try to cancel the timer if exists. */
+	if (is_dwork) {
+		struct delayed_kthread_work *dwork =
+			container_of(work, struct delayed_kthread_work, work);
+		struct kthread_worker *worker = work->worker;
+
+		/*
+		 * del_timer_sync() must be called to make sure that the timer
+		 * callback is not running. The lock must be temporary released
+		 * to avoid a deadlock with the callback. In the meantime,
+		 * any queuing is blocked by setting the canceling counter.
+		 */
+		work->canceling++;
+		spin_unlock_irqrestore(&worker->lock, *flags);
+		del_timer_sync(&dwork->timer);
+		spin_lock_irqsave(&worker->lock, *flags);
+		work->canceling--;
+	}
+
+	/*
+	 * Try to remove the work from a worker list. It might either
+	 * be from worker->work_list or from worker->delayed_work_list.
+	 */
+	if (!list_empty(&work->node)) {
+		list_del_init(&work->node);
+		return true;
+	}
+
+	return false;
+}
+
+static bool __cancel_kthread_work_sync(struct kthread_work *work, bool is_dwork)
+{
+	struct kthread_worker *worker = work->worker;
+	unsigned long flags;
+	int ret = false;
+
+	if (!worker)
+		goto out;
+
+	spin_lock_irqsave(&worker->lock, flags);
+	/* Work must not be used with more workers, see queue_kthread_work(). */
+	WARN_ON_ONCE(work->worker != worker);
+
+	ret = __cancel_kthread_work(work, is_dwork, &flags);
+
+	if (worker->current_work != work)
+		goto out_fast;
+
+	/*
+	 * The work is in progress and we need to wait with the lock released.
+	 * In the meantime, block any queuing by setting the canceling counter.
+	 */
+	work->canceling++;
+	spin_unlock_irqrestore(&worker->lock, flags);
+	flush_kthread_work(work);
+	spin_lock_irqsave(&worker->lock, flags);
+	work->canceling--;
+
+out_fast:
+	spin_unlock_irqrestore(&worker->lock, flags);
+out:
+	return ret;
+}
+
+/**
+ * cancel_kthread_work_sync - cancel a kthread work and wait for it to finish
+ * @work: the kthread work to cancel
+ *
+ * Cancel @work and wait for its execution to finish.  This function
+ * can be used even if the work re-queues itself. On return from this
+ * function, @work is guaranteed to be not pending or executing on any CPU.
+ *
+ * cancel_kthread_work_sync(&delayed_work->work) must not be used for
+ * delayed_work's. Use cancel_delayed_kthread_work_sync() instead.
+ *
+ * The caller must ensure that the worker on which @work was last
+ * queued can't be destroyed before this function returns.
+ *
+ * Return: %true if @work was pending, %false otherwise.
+ */
+bool cancel_kthread_work_sync(struct kthread_work *work)
+{
+	return __cancel_kthread_work_sync(work, false);
+}
+EXPORT_SYMBOL_GPL(cancel_kthread_work_sync);
+
+/**
+ * cancel_delayed_kthread_work_sync - cancel a delayed kthread work and
+ *	wait for it to finish.
+ * @dwork: the delayed kthread work to cancel
+ *
+ * This is cancel_kthread_work_sync() for delayed works.
+ *
+ * Return: %true if @dwork was pending, %false otherwise.
+ */
+bool cancel_delayed_kthread_work_sync(struct delayed_kthread_work *dwork)
+{
+	return __cancel_kthread_work_sync(&dwork->work, true);
+}
+EXPORT_SYMBOL_GPL(cancel_delayed_kthread_work_sync);
+
 /**
  * flush_kthread_worker - flush all current works on a kthread_worker
  * @worker: worker to flush
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v7 09/10] kthread: Allow to modify delayed kthread work
  2016-05-30 14:59 [PATCH v7 00/10] kthread: Kthread worker API improvements Petr Mladek
                   ` (7 preceding siblings ...)
  2016-05-30 14:59 ` [PATCH v7 08/10] kthread: Allow to cancel " Petr Mladek
@ 2016-05-30 14:59 ` Petr Mladek
  2016-05-30 14:59 ` [PATCH v7 10/10] kthread: Better support freezable kthread workers Petr Mladek
  9 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2016-05-30 14:59 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar, Peter Zijlstra
  Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
	Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
	linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek

There are situations when we need to modify the delay of a delayed kthread
work. For example, when the work depends on an event and the initial delay
means a timeout. Then we want to queue the work immediately when the event
happens.

This patch implements mod_delayed_kthread_work() as inspired workqueues.
It cancels the timer, removes the work from any worker list and queues it
again with the given timeout.

A very special case is when the work is being canceled at the same time.
It might happen because of the regular cancel_delayed_kthread_work_sync()
or by another mod_delayed_kthread_work(). In this case, we do nothing and
let the other operation win. This should not normally happen as the caller
is supposed to synchronize these operations a reasonable way.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 include/linux/kthread.h |  4 ++++
 kernel/kthread.c        | 53 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 49f59b087b6b..1d5ca191562f 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -168,6 +168,10 @@ bool queue_delayed_kthread_work(struct kthread_worker *worker,
 				struct delayed_kthread_work *dwork,
 				unsigned long delay);
 
+bool mod_delayed_kthread_work(struct kthread_worker *worker,
+			      struct delayed_kthread_work *dwork,
+			      unsigned long delay);
+
 void flush_kthread_work(struct kthread_work *work);
 void flush_kthread_worker(struct kthread_worker *worker);
 
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 10129fdd4f3b..2cc32cad66ef 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -970,6 +970,59 @@ static bool __cancel_kthread_work(struct kthread_work *work, bool is_dwork,
 	return false;
 }
 
+/**
+ * mod_delayed_kthread_work - modify delay of or queue a delayed kthread work
+ * @worker: kthread worker to use
+ * @dwork: delayed kthread work to queue
+ * @delay: number of jiffies to wait before queuing
+ *
+ * If @dwork is idle, equivalent to queue_delayed_kthread work(). Otherwise,
+ * modify @dwork's timer so that it expires after @delay. If @delay is zero,
+ * @work is guaranteed to be queued immediately.
+ *
+ * Return: %true if @dwork was pending and its timer was modified,
+ * %false otherwise.
+ *
+ * A special case is when the work is being canceled in parallel.
+ * It might be caused either by the real cancel_delayed_kthread_work_sync()
+ * or yet another mod_delayed_kthread_work() call. We let the other command
+ * win and return %false here. The caller is supposed to synchronize these
+ * operations a reasonable way.
+ *
+ * This function is safe to call from any context including IRQ handler.
+ * See __cancel_kthread_work() and delayed_kthread_work_timer_fn()
+ * for details.
+ */
+bool mod_delayed_kthread_work(struct kthread_worker *worker,
+			      struct delayed_kthread_work *dwork,
+			      unsigned long delay)
+{
+	struct kthread_work *work = &dwork->work;
+	unsigned long flags;
+	int ret = false;
+
+	spin_lock_irqsave(&worker->lock, flags);
+
+	/* Do not bother with canceling when never queued. */
+	if (!work->worker)
+		goto fast_queue;
+
+	/* Work must not be used with more workers, see queue_kthread_work() */
+	WARN_ON_ONCE(work->worker != worker);
+
+	/* Do not fight with another command that is canceling this work. */
+	if (work->canceling)
+		goto out;
+
+	ret = __cancel_kthread_work(work, true, &flags);
+fast_queue:
+	__queue_delayed_kthread_work(worker, dwork, delay);
+out:
+	spin_unlock_irqrestore(&worker->lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(mod_delayed_kthread_work);
+
 static bool __cancel_kthread_work_sync(struct kthread_work *work, bool is_dwork)
 {
 	struct kthread_worker *worker = work->worker;
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v7 10/10] kthread: Better support freezable kthread workers
  2016-05-30 14:59 [PATCH v7 00/10] kthread: Kthread worker API improvements Petr Mladek
                   ` (8 preceding siblings ...)
  2016-05-30 14:59 ` [PATCH v7 09/10] kthread: Allow to modify delayed " Petr Mladek
@ 2016-05-30 14:59 ` Petr Mladek
  9 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2016-05-30 14:59 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar, Peter Zijlstra
  Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
	Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
	linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek

This patch allows to make kthread worker freezable via a new @flags
parameter. It will allow to avoid an init work in some kthreads.

It currently does not affect the function of kthread_worker_fn()
but it might help to do some optimization or fixes eventually.

I currently do not know about any other use for the @flags
parameter but I believe that we will want more flags
in the future.

Finally, I hope that it will not cause confusion with @flags member
in struct kthread. Well, I guess that we will want to rework the
basic kthreads implementation once all kthreads are converted into
kthread workers or workqueues. It is possible that we will merge
the two structures.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 include/linux/kthread.h | 12 +++++++++---
 kernel/kthread.c        | 21 +++++++++++++++------
 2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 1d5ca191562f..edad163b26d0 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -65,7 +65,12 @@ struct kthread_work;
 typedef void (*kthread_work_func_t)(struct kthread_work *work);
 void delayed_kthread_work_timer_fn(unsigned long __data);
 
+enum {
+	KTW_FREEZABLE		= 1 << 0,	/* freeze during suspend */
+};
+
 struct kthread_worker {
+	unsigned int		flags;
 	spinlock_t		lock;
 	struct list_head	work_list;
 	struct list_head	delayed_work_list;
@@ -154,12 +159,13 @@ extern void __init_kthread_worker(struct kthread_worker *worker,
 
 int kthread_worker_fn(void *worker_ptr);
 
-__printf(1, 2)
+__printf(2, 3)
 struct kthread_worker *
-create_kthread_worker(const char namefmt[], ...);
+create_kthread_worker(unsigned int flags, const char namefmt[], ...);
 
 struct kthread_worker *
-create_kthread_worker_on_cpu(int cpu, const char namefmt[], ...);
+create_kthread_worker_on_cpu(int cpu, unsigned int flags,
+			     const char namefmt[], ...);
 
 bool queue_kthread_work(struct kthread_worker *worker,
 			struct kthread_work *work);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 2cc32cad66ef..4ee4c05f8bf7 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -556,11 +556,11 @@ void __init_kthread_worker(struct kthread_worker *worker,
 				const char *name,
 				struct lock_class_key *key)
 {
+	memset(worker, 0, sizeof(struct kthread_worker));
 	spin_lock_init(&worker->lock);
 	lockdep_set_class_and_name(&worker->lock, key, name);
 	INIT_LIST_HEAD(&worker->work_list);
 	INIT_LIST_HEAD(&worker->delayed_work_list);
-	worker->task = NULL;
 }
 EXPORT_SYMBOL_GPL(__init_kthread_worker);
 
@@ -590,6 +590,10 @@ int kthread_worker_fn(void *worker_ptr)
 	 */
 	WARN_ON(worker->task && worker->task != current);
 	worker->task = current;
+
+	if (worker->flags & KTW_FREEZABLE)
+		set_freezable();
+
 repeat:
 	set_current_state(TASK_INTERRUPTIBLE);	/* mb paired w/ kthread_stop */
 
@@ -623,7 +627,8 @@ repeat:
 EXPORT_SYMBOL_GPL(kthread_worker_fn);
 
 static struct kthread_worker *
-__create_kthread_worker(int cpu, const char namefmt[], va_list args)
+__create_kthread_worker(int cpu, unsigned int flags,
+			const char namefmt[], va_list args)
 {
 	struct kthread_worker *worker;
 	struct task_struct *task;
@@ -653,6 +658,7 @@ __create_kthread_worker(int cpu, const char namefmt[], va_list args)
 	if (IS_ERR(task))
 		goto fail_task;
 
+	worker->flags = flags;
 	worker->task = task;
 	wake_up_process(task);
 	return worker;
@@ -664,6 +670,7 @@ fail_task:
 
 /**
  * create_kthread_worker - create a kthread worker
+ * @flags: flags modifying the default behavior of the worker
  * @namefmt: printf-style name for the kthread worker (task).
  *
  * Returns a pointer to the allocated worker on success, ERR_PTR(-ENOMEM)
@@ -671,13 +678,13 @@ fail_task:
  * when the worker was SIGKILLed.
  */
 struct kthread_worker *
-create_kthread_worker(const char namefmt[], ...)
+create_kthread_worker(unsigned int flags, const char namefmt[], ...)
 {
 	struct kthread_worker *worker;
 	va_list args;
 
 	va_start(args, namefmt);
-	worker = __create_kthread_worker(-1, namefmt, args);
+	worker = __create_kthread_worker(-1, flags, namefmt, args);
 	va_end(args);
 
 	return worker;
@@ -688,6 +695,7 @@ EXPORT_SYMBOL(create_kthread_worker);
  * create_kthread_worker_on_cpu - create a kthread worker and bind it
  *	it to a given CPU and the associated NUMA node.
  * @cpu: CPU number
+ * @flags: flags modifying the default behavior of the worker
  * @namefmt: printf-style name for the kthread worker (task).
  *
  * Use a valid CPU number if you want to bind the kthread worker
@@ -701,13 +709,14 @@ EXPORT_SYMBOL(create_kthread_worker);
  * when the worker was SIGKILLed.
  */
 struct kthread_worker *
-create_kthread_worker_on_cpu(int cpu, const char namefmt[], ...)
+create_kthread_worker_on_cpu(int cpu, unsigned int flags,
+			     const char namefmt[], ...)
 {
 	struct kthread_worker *worker;
 	va_list args;
 
 	va_start(args, namefmt);
-	worker = __create_kthread_worker(cpu, namefmt, args);
+	worker = __create_kthread_worker(cpu, flags, namefmt, args);
 	va_end(args);
 
 	return worker;
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v7 03/10] kthread: Add create_kthread_worker*()
  2016-05-30 14:59 ` [PATCH v7 03/10] kthread: Add create_kthread_worker*() Petr Mladek
@ 2016-06-01 19:36   ` Andrew Morton
  0 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2016-06-01 19:36 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Oleg Nesterov, Tejun Heo, Ingo Molnar, Peter Zijlstra,
	Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
	Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
	linux-mm, Vlastimil Babka, linux-api, linux-kernel

On Mon, 30 May 2016 16:59:24 +0200 Petr Mladek <pmladek@suse.com> wrote:

> Kthread workers are currently created using the classic kthread API,
> namely kthread_run(). kthread_worker_fn() is passed as the @threadfn
> parameter.
> 
> This patch defines create_kthread_worker() and
> create_kthread_worker_on_cpu() functions that hide implementation details.

I hate to nick pits, but the naming isn't good.

A good, disciplined and pretty common naming scheme is to lead the
overall identifier with the name of the relevant subsystem.  kthread
has done that *fairly* well:


Things we got right:

kthread_create_on_node
kthread_create
kthread_create_on_cpu
kthread_run
kthread_bind
kthread_bind_mask
kthread_stop
kthread_should_stop
kthread_should_park
kthread_freezable_should_stop
kthread_data
kthread_park
kthread_unpark
kthread_parkme
kthreadd
kthread_work_func_t
KTHREAD_WORKER_INIT
KTHREAD_WORK_INIT
KTHREAD_WORKER_INIT_ONSTACK
kthread_worker_fn

Things we didn't:

probe_kthread_data
DEFINE_KTHREAD_WORKER
DEFINE_KTHREAD_WORK
DEFINE_KTHREAD_WORKER_ONSTACK
DEFINE_KTHREAD_WORKER_ONSTACK
__init_kthread_worker
init_kthread_worker
init_kthread_work
queue_kthread_work
flush_kthread_work
flush_kthread_worker


So I suggest kthread_create_worker() and
kthread_create_worker_on_cpu(), please.

And this might be a suitable time to regularize some of the "things we
didn't" identifiers, if you're feeling keen.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-06-01 19:36 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-30 14:59 [PATCH v7 00/10] kthread: Kthread worker API improvements Petr Mladek
2016-05-30 14:59 ` [PATCH v7 01/10] kthread/smpboot: Do not park in kthread_create_on_cpu() Petr Mladek
2016-05-30 14:59 ` [PATCH v7 02/10] kthread: Allow to call __kthread_create_on_node() with va_list args Petr Mladek
2016-05-30 14:59 ` [PATCH v7 03/10] kthread: Add create_kthread_worker*() Petr Mladek
2016-06-01 19:36   ` Andrew Morton
2016-05-30 14:59 ` [PATCH v7 04/10] kthread: Add drain_kthread_worker() Petr Mladek
2016-05-30 14:59 ` [PATCH v7 05/10] kthread: Add destroy_kthread_worker() Petr Mladek
2016-05-30 14:59 ` [PATCH v7 06/10] kthread: Detect when a kthread work is used by more workers Petr Mladek
2016-05-30 14:59 ` [PATCH v7 07/10] kthread: Initial support for delayed kthread work Petr Mladek
2016-05-30 14:59 ` [PATCH v7 08/10] kthread: Allow to cancel " Petr Mladek
2016-05-30 14:59 ` [PATCH v7 09/10] kthread: Allow to modify delayed " Petr Mladek
2016-05-30 14:59 ` [PATCH v7 10/10] kthread: Better support freezable kthread workers Petr Mladek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).