From: Tejun Heo <tj@kernel.org>
To: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>, Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jiri Kosina <jkosina@suse.cz>, Borislav Petkov <bp@suse.de>,
	Michal Hocko <mhocko@suse.cz>,
	linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>,
	linux-api@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 09/22] kthread: Allow to cancel kthread work
Date: Mon, 25 Jan 2016 14:17:09 -0500
Message-ID: <20160125191709.GE3628@mtj.duckdns.org>
In-Reply-To: <1453736711-6703-10-git-send-email-pmladek@suse.com>

Hello,

On Mon, Jan 25, 2016 at 04:44:58PM +0100, Petr Mladek wrote:
> @@ -574,6 +575,7 @@ EXPORT_SYMBOL_GPL(__init_kthread_worker);
>  static inline bool kthread_work_pending(const struct kthread_work *work)
>  {
>  	return !list_empty(&work->node) ||
> +	       work->canceling ||
>  	       (work->timer && timer_active(work->timer));
>  }

So, the reason the ->canceling test is necessary is to ensure that
self-requeueing work items can be canceled reliably.  It's not to
block "further queueing" in general.  It's probably worthwhile to
make that clear in the description and comment.
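
E.g. for a self-requeueing work item (my_worker / do_something() are
made up for illustration):

	static struct kthread_worker *my_worker;

	static void my_work_fn(struct kthread_work *work)
	{
		do_something();
		/* with the ->canceling test this requeue becomes a
		 * no-op once cancelation has started */
		queue_kthread_work(my_worker, work);
	}

Without the test, the work item could keep requeueing itself behind
the canceller's back and cancelation would never terminate.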

> +/*
> + * Get the worker lock if any worker is associated with the work.
> + * Depending on @check_canceling, it might need to give up the busy
> + * wait when work->canceling gets set.
> + */

The above mentions @check_canceling but doesn't actually explain what
it does.
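
Something along the following lines would be clearer (just a sketch):

	/*
	 * Try to grab the lock of the worker associated with @work.
	 *
	 * When @check_canceling is set, give up the busy wait on a
	 * contended lock once work->canceling is set.  This is what
	 * lets the timer callback back off instead of deadlocking
	 * against a canceller which waits in del_timer_sync() while
	 * holding worker->lock.
	 */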

> +static bool try_lock_kthread_work(struct kthread_work *work,
> +				  bool check_canceling)
>  {
>  	struct kthread_worker *worker;
>  	int ret = false;
> @@ -790,7 +798,24 @@ try_again:
>  	if (!worker)
>  		goto out;
>  
> -	spin_lock(&worker->lock);
> +	if (check_canceling) {
> +		if (!spin_trylock(&worker->lock)) {
> +			/*
> +			 * Busy wait with spin_is_locked() to avoid
> +			 * cache bouncing. Break when canceling
> +			 * is set to avoid a deadlock.
> +			 */
> +			do {
> +				if (READ_ONCE(work->canceling))
> +					goto out;

Why READ_ONCE?

> +				cpu_relax();
> +			} while (spin_is_locked(&worker->lock));
> +			goto try_again;
> +		}
> +	} else {
> +		spin_lock(&worker->lock);
> +	}
> +
>  	if (worker != work->worker) {
>  		spin_unlock(&worker->lock);
>  		goto try_again;
> @@ -820,10 +845,13 @@ void delayed_kthread_work_timer_fn(unsigned long __data)
>  		(struct delayed_kthread_work *)__data;
>  	struct kthread_work *work = &dwork->work;
>  
> -	if (!try_lock_kthread_work(work))
> +	/* Give up when the work is being canceled. */
> +	if (!try_lock_kthread_work(work, true))

Again, this is the trickiest part of the whole thing.  Please add a
comment explaining why this is necessary.
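
Something like the following, maybe (a sketch):

	/*
	 * The canceller sets work->canceling and then waits in
	 * del_timer_sync() while holding worker->lock.  Spinning on
	 * the lock unconditionally here would deadlock, so give up
	 * when the work is being canceled.
	 */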

>  		return;
>  
> -	__queue_kthread_work(work->worker, work);
> +	if (!work->canceling)
> +		__queue_kthread_work(work->worker, work);
> +
...
> +static int
> +try_to_cancel_kthread_work(struct kthread_work *work,
> +				   spinlock_t *lock,
> +				   unsigned long *flags)

bool?

> +{
> +	int ret = 0;
> +
> +	/* Try to cancel the timer if pending. */
> +	if (work->timer && del_timer_sync(work->timer)) {
> +		ret = 1;
> +		goto out;
> +	}
> +
> +	/* Try to remove queued work before it is being executed. */
> +	if (!list_empty(&work->node)) {
> +		list_del_init(&work->node);
> +		ret = 1;
> +	}
> +
> +out:
> +	return ret;

Again, what's up with unnecessary goto exits?
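
i.e. together with the bool return suggested above, this can simply
be (untested):

	static bool
	try_to_cancel_kthread_work(struct kthread_work *work,
				   spinlock_t *lock,
				   unsigned long *flags)
	{
		/* Try to cancel the timer if pending. */
		if (work->timer && del_timer_sync(work->timer))
			return true;

		/* Try to remove queued work before it is being executed. */
		if (!list_empty(&work->node)) {
			list_del_init(&work->node);
			return true;
		}

		return false;
	}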

> +static bool __cancel_kthread_work_sync(struct kthread_work *work)
> +{
> +	struct kthread_worker *worker;
> +	unsigned long flags;
> +	int ret;
> +
> +	local_irq_save(flags);
> +	if (!try_lock_kthread_work(work, false)) {
> +		local_irq_restore(flags);

Can't try_lock_kthread_work() take &flags?
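
i.e. something like the following (a sketch) so that callers don't
have to do the local_irq_save() / local_irq_restore() dance
themselves:

	static bool try_lock_kthread_work(struct kthread_work *work,
					  unsigned long *flags,
					  bool check_canceling);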

> +		ret = 0;
> +		goto out;
> +	}
> +	worker = work->worker;
> +
> +	/*
> +	 * Block further queueing. It must be set before trying to cancel
> +	 * the kthread work. It avoids a possible deadlock between
> +	 * del_timer_sync() and the timer callback.
> +	 */

So, "blocking further queueing" and "a possible deadlock between
del_timer_sync() and the timer callback" don't have anything to do
with each other, do they?  Those are two separate things.  You need
the former to guarantee cancelation of self-requeueing work items and
the latter for deadlock avoidance, no?
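
IOW the comment should read something like (a sketch):

	/*
	 * Block further queueing: a self-requeueing work item could
	 * otherwise keep requeueing itself and cancelation would
	 * never terminate.
	 *
	 * Separately, ->canceling must be set before calling
	 * del_timer_sync() so that the timer callback can back off
	 * instead of deadlocking on worker->lock.
	 */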

> +	work->canceling++;
> +	ret = try_to_cancel_kthread_work(work, &worker->lock, &flags);
> +
> +	if (worker->current_work != work)
> +		goto out_fast;

If there are two racing cancellers, wouldn't this allow the losing one
to return while the work item is still running?

> +	spin_unlock_irqrestore(&worker->lock, flags);
> +	flush_kthread_work(work);
> +	/*
> +	 * Nobody is allowed to switch the worker or queue the work
> +	 * when .canceling is set.
> +	 */
> +	spin_lock_irqsave(&worker->lock, flags);
> +
> +out_fast:
> +	work->canceling--;
> +	spin_unlock_irqrestore(&worker->lock, flags);
> +out:
> +	return ret;
> +}

Thanks.

-- 
tejun
