linux-pci.vger.kernel.org archive mirror
From: Joel Fernandes <joel@joelfernandes.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>, Will Deacon <will@kernel.org>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Logan Gunthorpe <logang@deltatee.com>,
	Kurt Schwemmer <kurt.schwemmer@microsemi.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci@vger.kernel.org, Felipe Balbi <balbi@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-usb@vger.kernel.org, Kalle Valo <kvalo@codeaurora.org>,
	"David S. Miller" <davem@davemloft.net>,
	linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
	Oleg Nesterov <oleg@redhat.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [patch V2 11/15] completion: Use simple wait queues
Date: Wed, 18 Mar 2020 20:33:51 -0400
Message-ID: <20200319003351.GA211584@google.com>
In-Reply-To: <20200318204408.521507446@linutronix.de>

Hi Thomas,

On Wed, Mar 18, 2020 at 09:43:13PM +0100, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> completion uses a wait_queue_head_t to enqueue waiters.
> 
> wait_queue_head_t contains a spinlock_t to protect the list of waiters
> which excludes it from being used in truly atomic context on a PREEMPT_RT
> enabled kernel.
> 
> The spinlock in the wait queue head cannot be replaced by a raw_spinlock
> because:
> 
>   - wait queues can have custom wakeup callbacks, which acquire other
>     spinlock_t locks and have potentially long execution times

Cool, makes sense.

>   - wake_up() walks an unbounded number of list entries during the wake up
>     and may wake an unbounded number of waiters.

Just to clarify here: wake_up() will really wake up just one waiter if all the
waiters on the queue are exclusive, right? So in that scenario at least, the
"unbounded number of waiters" would not be an issue, as long as everything
waiting on the waitqueue is exclusive and wake_up() is used. Please correct me
if I'm wrong about that though.
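
For illustration only, the exclusive-wakeup behaviour asked about above can be
modelled in plain userspace C. This is a sketch, not kernel code:
model_waiter, model_wake_up() and MODEL_WQ_FLAG_EXCLUSIVE are hypothetical
names, and the loop only imitates how a regular waitqueue walk stops once
nr_exclusive exclusive waiters have been woken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; the names are made up for this sketch. */
#define MODEL_WQ_FLAG_EXCLUSIVE 0x01

struct model_waiter {
	unsigned int flags;	/* MODEL_WQ_FLAG_EXCLUSIVE or 0 */
	bool woken;
};

/*
 * Walk the queue in order and wake entries. Non-exclusive waiters are
 * always woken; the walk stops once nr_exclusive exclusive waiters have
 * been woken. nr_exclusive == 0 never reaches zero after the decrement,
 * so it means "wake everyone". Returns the number of entries visited.
 */
static size_t model_wake_up(struct model_waiter *q, size_t n, int nr_exclusive)
{
	size_t woken = 0;

	assert(q || n == 0);

	for (size_t i = 0; i < n; i++) {
		q[i].woken = true;
		woken++;
		if ((q[i].flags & MODEL_WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
			break;
	}
	return woken;
}
```

With three exclusive waiters and nr_exclusive == 1 the model wakes exactly
one. With two non-exclusive waiters queued ahead of two exclusive ones it
wakes three, so the length of the walk is bounded by nr_exclusive only when
every waiter is exclusive.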

So the main reasons to avoid waitqueue in favor of swait (as you mentioned)
would be the sleep-while-atomic issue in truly atomic context on RT, and the
fact that callbacks can take a long time.

> 
> For simplicity and performance reasons complete() should be usable on
> PREEMPT_RT enabled kernels.
> 
> completions do not use custom wakeup callbacks and are usually single
> waiter, except for a few corner cases.
> 
> Replace the wait queue in the completion with a simple wait queue (swait),
> which uses a raw_spinlock_t for protecting the waiter list and therefore is
> safe to use inside truly atomic regions on PREEMPT_RT.
> 
> There is no semantical or functional change:
> 
>   - completions use the exclusive wait mode which is what swait provides
> 
>   - complete() wakes one exclusive waiter
> 
>   - complete_all() wakes all waiters while holding the lock which protects
>     the wait queue against newly incoming waiters. The conversion to swait
>     preserves this behaviour.
> 
> complete_all() might cause unbound latencies with a large number of waiters
> being woken at once, but most complete_all() usage sites are either in
> testing or initialization code or have only a really small number of
> concurrent waiters which for now does not cause a latency problem. Keep it
> simple for now.
> 
> The fixup of the warning check in the USB gadget driver is just a straight
> forward conversion of the lockless waiter check from one waitqueue type to
> the other.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Arnd Bergmann <arnd@arndb.de>

Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>

thanks,

 - Joel


> ---
> V2: Split out the orinoco and usb gadget parts and amended change log
> ---
>  drivers/usb/gadget/function/f_fs.c |    2 +-
>  include/linux/completion.h         |    8 ++++----
>  kernel/sched/completion.c          |   36 +++++++++++++++++++-----------------
>  3 files changed, 24 insertions(+), 22 deletions(-)
> 
> --- a/drivers/usb/gadget/function/f_fs.c
> +++ b/drivers/usb/gadget/function/f_fs.c
> @@ -1703,7 +1703,7 @@ static void ffs_data_put(struct ffs_data
>  		pr_info("%s(): freeing\n", __func__);
>  		ffs_data_clear(ffs);
>  		BUG_ON(waitqueue_active(&ffs->ev.waitq) ||
> -		       waitqueue_active(&ffs->ep0req_completion.wait) ||
> +		       swait_active(&ffs->ep0req_completion.wait) ||
>  		       waitqueue_active(&ffs->wait));
>  		destroy_workqueue(ffs->io_completion_wq);
>  		kfree(ffs->dev_name);
> --- a/include/linux/completion.h
> +++ b/include/linux/completion.h
> @@ -9,7 +9,7 @@
>   * See kernel/sched/completion.c for details.
>   */
>  
> -#include <linux/wait.h>
> +#include <linux/swait.h>
>  
>  /*
>   * struct completion - structure used to maintain state for a "completion"
> @@ -25,7 +25,7 @@
>   */
>  struct completion {
>  	unsigned int done;
> -	wait_queue_head_t wait;
> +	struct swait_queue_head wait;
>  };
>  
>  #define init_completion_map(x, m) __init_completion(x)
> @@ -34,7 +34,7 @@ static inline void complete_acquire(stru
>  static inline void complete_release(struct completion *x) {}
>  
>  #define COMPLETION_INITIALIZER(work) \
> -	{ 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
> +	{ 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
>  
>  #define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \
>  	(*({ init_completion_map(&(work), &(map)); &(work); }))
> @@ -85,7 +85,7 @@ static inline void complete_release(stru
>  static inline void __init_completion(struct completion *x)
>  {
>  	x->done = 0;
> -	init_waitqueue_head(&x->wait);
> +	init_swait_queue_head(&x->wait);
>  }
>  
>  /**
> --- a/kernel/sched/completion.c
> +++ b/kernel/sched/completion.c
> @@ -29,12 +29,12 @@ void complete(struct completion *x)
>  {
>  	unsigned long flags;
>  
> -	spin_lock_irqsave(&x->wait.lock, flags);
> +	raw_spin_lock_irqsave(&x->wait.lock, flags);
>  
>  	if (x->done != UINT_MAX)
>  		x->done++;
> -	__wake_up_locked(&x->wait, TASK_NORMAL, 1);
> -	spin_unlock_irqrestore(&x->wait.lock, flags);
> +	swake_up_locked(&x->wait);
> +	raw_spin_unlock_irqrestore(&x->wait.lock, flags);
>  }
>  EXPORT_SYMBOL(complete);
>  
> @@ -58,10 +58,12 @@ void complete_all(struct completion *x)
>  {
>  	unsigned long flags;
>  
> -	spin_lock_irqsave(&x->wait.lock, flags);
> +	WARN_ON(irqs_disabled());
> +
> +	raw_spin_lock_irqsave(&x->wait.lock, flags);
>  	x->done = UINT_MAX;
> -	__wake_up_locked(&x->wait, TASK_NORMAL, 0);
> -	spin_unlock_irqrestore(&x->wait.lock, flags);
> +	swake_up_all_locked(&x->wait);
> +	raw_spin_unlock_irqrestore(&x->wait.lock, flags);
>  }
>  EXPORT_SYMBOL(complete_all);
>  
> @@ -70,20 +72,20 @@ do_wait_for_common(struct completion *x,
>  		   long (*action)(long), long timeout, int state)
>  {
>  	if (!x->done) {
> -		DECLARE_WAITQUEUE(wait, current);
> +		DECLARE_SWAITQUEUE(wait);
>  
> -		__add_wait_queue_entry_tail_exclusive(&x->wait, &wait);
>  		do {
>  			if (signal_pending_state(state, current)) {
>  				timeout = -ERESTARTSYS;
>  				break;
>  			}
> +			__prepare_to_swait(&x->wait, &wait);
>  			__set_current_state(state);
> -			spin_unlock_irq(&x->wait.lock);
> +			raw_spin_unlock_irq(&x->wait.lock);
>  			timeout = action(timeout);
> -			spin_lock_irq(&x->wait.lock);
> +			raw_spin_lock_irq(&x->wait.lock);
>  		} while (!x->done && timeout);
> -		__remove_wait_queue(&x->wait, &wait);
> +		__finish_swait(&x->wait, &wait);
>  		if (!x->done)
>  			return timeout;
>  	}
> @@ -100,9 +102,9 @@ static inline long __sched
>  
>  	complete_acquire(x);
>  
> -	spin_lock_irq(&x->wait.lock);
> +	raw_spin_lock_irq(&x->wait.lock);
>  	timeout = do_wait_for_common(x, action, timeout, state);
> -	spin_unlock_irq(&x->wait.lock);
> +	raw_spin_unlock_irq(&x->wait.lock);
>  
>  	complete_release(x);
>  
> @@ -291,12 +293,12 @@ bool try_wait_for_completion(struct comp
>  	if (!READ_ONCE(x->done))
>  		return false;
>  
> -	spin_lock_irqsave(&x->wait.lock, flags);
> +	raw_spin_lock_irqsave(&x->wait.lock, flags);
>  	if (!x->done)
>  		ret = false;
>  	else if (x->done != UINT_MAX)
>  		x->done--;
> -	spin_unlock_irqrestore(&x->wait.lock, flags);
> +	raw_spin_unlock_irqrestore(&x->wait.lock, flags);
>  	return ret;
>  }
>  EXPORT_SYMBOL(try_wait_for_completion);
> @@ -322,8 +324,8 @@ bool completion_done(struct completion *
>  	 * otherwise we can end up freeing the completion before complete()
>  	 * is done referencing it.
>  	 */
> -	spin_lock_irqsave(&x->wait.lock, flags);
> -	spin_unlock_irqrestore(&x->wait.lock, flags);
> +	raw_spin_lock_irqsave(&x->wait.lock, flags);
> +	raw_spin_unlock_irqrestore(&x->wait.lock, flags);
>  	return true;
>  }
>  EXPORT_SYMBOL(completion_done);
> 
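
As a reading aid for the hunks above, the handling of the done counter can be
modelled in userspace C. This is a minimal sketch under stated assumptions: it
copies only the counter arithmetic visible in the quoted diff, leaves out the
lock and the wait queue entirely, and the model_* names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Userspace model of the counter only; no locking, no waiters. */
struct model_completion {
	unsigned int done;
};

/* Mirrors complete(): one more pending completion, saturating at UINT_MAX. */
static void model_complete(struct model_completion *x)
{
	assert(x);
	if (x->done != UINT_MAX)
		x->done++;
}

/* Mirrors complete_all(): mark the completion as permanently done. */
static void model_complete_all(struct model_completion *x)
{
	assert(x);
	x->done = UINT_MAX;
}

/*
 * Mirrors try_wait_for_completion(): consume one pending completion if
 * there is one. After model_complete_all() the counter is left untouched,
 * so every later call succeeds.
 */
static bool model_try_wait(struct model_completion *x)
{
	assert(x);
	if (!x->done)
		return false;
	if (x->done != UINT_MAX)
		x->done--;
	return true;
}
```

Two model_complete() calls allow exactly two successful model_try_wait()
calls, while a single model_complete_all() lets any number of them succeed.
None of this arithmetic is changed by the switch from wait_queue_head_t to
swait_queue_head; only the lock type and the wake/enqueue helpers differ.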
