* [PATCH v2] Fix a lockup in wait_for_completion() and friends
@ 2019-05-08 20:57 minyard
  2019-05-09 16:19 ` [PATCH RT " Sebastian Andrzej Siewior
  0 siblings, 1 reply; 11+ messages in thread
From: minyard @ 2019-05-08 20:57 UTC (permalink / raw)
  To: linux-rt-users; +Cc: minyard, Corey Minyard

From: Corey Minyard <cminyard@mvista.com>

The function do_wait_for_common() has a race condition that
can result in lockups waiting for completions.  Adding the thread
to (and removing the thread from) the wait queue for the completion
is done outside the do loop in that function.  However, if the thread
is woken up, the swake_up_locked() function will delete the entry
from the wait queue.  If that happens and another thread sneaks
in and decrements the done count in the completion to zero, the
loop will go around again, but the thread will no longer be in the
wait queue, so there is no way to wake it up.

Fix it by adding/removing the thread to/from the wait queue inside
the do loop.

Fixes: a04ff6b4ec4ee7e ("completion: Use simple wait queues")
Signed-off-by: Corey Minyard <cminyard@mvista.com>
---
I sent the wrong version of this; I had spotted this before but didn't
fix it here.  Adding the thread to the wait queue needs to come after
the signal check.  Sorry about the noise.

 kernel/sched/completion.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index 755a58084978..4f9b4cc0c95a 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -70,20 +70,20 @@ do_wait_for_common(struct completion *x,
 		   long (*action)(long), long timeout, int state)
 {
 	if (!x->done) {
-		DECLARE_SWAITQUEUE(wait);
-
-		__prepare_to_swait(&x->wait, &wait);
 		do {
+			DECLARE_SWAITQUEUE(wait);
+
 			if (signal_pending_state(state, current)) {
 				timeout = -ERESTARTSYS;
 				break;
 			}
+			__prepare_to_swait(&x->wait, &wait);
 			__set_current_state(state);
 			raw_spin_unlock_irq(&x->wait.lock);
 			timeout = action(timeout);
 			raw_spin_lock_irq(&x->wait.lock);
+			__finish_swait(&x->wait, &wait);
 		} while (!x->done && timeout);
-		__finish_swait(&x->wait, &wait);
 		if (!x->done)
 			return timeout;
 	}
-- 
2.17.1



* Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
  2019-05-08 20:57 [PATCH v2] Fix a lockup in wait_for_completion() and friends minyard
@ 2019-05-09 16:19 ` Sebastian Andrzej Siewior
  2019-05-09 17:46   ` Corey Minyard
                     ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-05-09 16:19 UTC (permalink / raw)
  To: minyard
  Cc: linux-rt-users, Corey Minyard, Peter Zijlstra, linux-kernel,
	tglx, Steven Rostedt

Please:
 - add some RT developers on Cc:
 - add lkml
 - use [PATCH RT] instead of just [PATCH] so it is visible that you target
   the RT tree.

On 2019-05-08 15:57:28 [-0500], minyard@acm.org wrote:
> From: Corey Minyard <cminyard@mvista.com>
> 
> The function do_wait_for_common() has a race condition that
> can result in lockups waiting for completions.  Adding the thread
> to (and removing the thread from) the wait queue for the completion
> is done outside the do loop in that function.  However, if the thread
> is woken up, the swake_up_locked() function will delete the entry
> from the wait queue.  If that happens and another thread sneaks
> in and decrements the done count in the completion to zero, the
> loop will go around again, but the thread will no longer be in the
> wait queue, so there is no way to wake it up.
> 
> Fix it by adding/removing the thread to/from the wait queue inside
> the do loop.

So you are saying:
	T0			T1			    T2
	wait_for_completion()
	 do_wait_for_common()
	  __prepare_to_swait()
	   schedule()
	    		       complete()
			        x->done++ (0 -> 1)
				raw_spin_lock_irqsave()
				 swake_up_locked()           wait_for_completion()
				  wake_up_process(T0)
				  list_del_init()
				raw_spin_unlock_irqrestore()
	                                                      raw_spin_lock_irq(&x->wait.lock)
	 raw_spin_lock_irq(&x->wait.lock)                      x->done != UINT_MAX, 1 -> 0
							       return 1
							      raw_spin_unlock_irq(&x->wait.lock)
	 while (!x->done && timeout),
	 continue loop, not enqueued
	 on &x->wait

The difference compared to the non-swait based implementation is that
swake_up_locked() removes woken-up tasks from the list, while the other
implementation (wait_queue_entry based, default_wake_function()) does
not.
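
For reference, swake_up_locked() (kernel/sched/swait.c) is roughly the
below -- the tail of it also shows up as context in the patch further
down the thread:

	void swake_up_locked(struct swait_queue_head *q)
	{
		struct swait_queue *curr;

		if (list_empty(&q->task_list))
			return;

		curr = list_first_entry(&q->task_list, typeof(*curr), task_list);
		wake_up_process(curr->task);
		/* the woken waiter is dequeued here, before it runs again */
		list_del_init(&curr->task_list);
	}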

One question about the upstream completion implementation:
completion_done() returns true if there are no waiters. It acquires the
wait.lock to ensure that complete()/complete_all() is done. However,
once complete() releases the lock, it is only guaranteed that the
wake_up() (for the waiter) occurred. The waiter task still needs to
remove itself from the wait-queue before the completion can be freed.
Am I missing something?
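
(For context, upstream completion_done() is roughly:

	bool completion_done(struct completion *x)
	{
		unsigned long flags;

		if (!READ_ONCE(x->done))
			return false;

		/*
		 * If ->done, wait for complete() to release ->wait.lock,
		 * otherwise the completion could be freed while complete()
		 * still references it.
		 */
		raw_spin_lock_irqsave(&x->wait.lock, flags);
		raw_spin_unlock_irqrestore(&x->wait.lock, flags);
		return true;
	}

so it only synchronizes against the waker dropping wait.lock, not
against the waiter leaving the wait-queue.)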

> Fixes: a04ff6b4ec4ee7e ("completion: Use simple wait queues")
> Signed-off-by: Corey Minyard <cminyard@mvista.com>
> ---
> I sent the wrong version of this; I had spotted this before but didn't
> fix it here.  Adding the thread to the wait queue needs to come after
> the signal check.  Sorry about the noise.
> 
>  kernel/sched/completion.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
> index 755a58084978..4f9b4cc0c95a 100644
> --- a/kernel/sched/completion.c
> +++ b/kernel/sched/completion.c
> @@ -70,20 +70,20 @@ do_wait_for_common(struct completion *x,
>  		   long (*action)(long), long timeout, int state)
>  {
>  	if (!x->done) {
> -		DECLARE_SWAITQUEUE(wait);
> -
> -		__prepare_to_swait(&x->wait, &wait);

you can keep DECLARE_SWAITQUEUE and remove just __prepare_to_swait()

>  		do {
> +			DECLARE_SWAITQUEUE(wait);
> +
>  			if (signal_pending_state(state, current)) {
>  				timeout = -ERESTARTSYS;
>  				break;
>  			}
> +			__prepare_to_swait(&x->wait, &wait);

add this, yes and you are done.

>  			__set_current_state(state);
>  			raw_spin_unlock_irq(&x->wait.lock);
>  			timeout = action(timeout);
>  			raw_spin_lock_irq(&x->wait.lock);
> +			__finish_swait(&x->wait, &wait);
>  		} while (!x->done && timeout);
> -		__finish_swait(&x->wait, &wait);
>  		if (!x->done)
>  			return timeout;
>  	}

Sebastian


* Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
  2019-05-09 16:19 ` [PATCH RT " Sebastian Andrzej Siewior
@ 2019-05-09 17:46   ` Corey Minyard
  2019-05-14  8:43   ` Peter Zijlstra
  2019-06-26 10:35   ` Peter Zijlstra
  2 siblings, 0 replies; 11+ messages in thread
From: Corey Minyard @ 2019-05-09 17:46 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, Corey Minyard, Peter Zijlstra, linux-kernel,
	tglx, Steven Rostedt

On Thu, May 09, 2019 at 06:19:25PM +0200, Sebastian Andrzej Siewior wrote:
> Please:
>  - add some RT developers on Cc:
>  - add lkml
>  - use [PATCH RT] instead of just [PATCH] so it is visible that you target
>    the RT tree.

Will do.  I'll add your diagram below, too.

> 
> On 2019-05-08 15:57:28 [-0500], minyard@acm.org wrote:
> > From: Corey Minyard <cminyard@mvista.com>
> > 
> > The function do_wait_for_common() has a race condition that
> > can result in lockups waiting for completions.  Adding the thread
> > to (and removing the thread from) the wait queue for the completion
> > is done outside the do loop in that function.  However, if the thread
> > is woken up, the swake_up_locked() function will delete the entry
> > from the wait queue.  If that happens and another thread sneaks
> > in and decrements the done count in the completion to zero, the
> > loop will go around again, but the thread will no longer be in the
> > wait queue, so there is no way to wake it up.
> > 
> > Fix it by adding/removing the thread to/from the wait queue inside
> > the do loop.
> 
> So you are saying:
> 	T0			T1			    T2
> 	wait_for_completion()
> 	 do_wait_for_common()
> 	  __prepare_to_swait()
> 	   schedule()
> 	    		       complete()
> 			        x->done++ (0 -> 1)
> 				raw_spin_lock_irqsave()
> 				 swake_up_locked()           wait_for_completion()
> 				  wake_up_process(T0)
> 				  list_del_init()
> 				raw_spin_unlock_irqrestore()
> 	                                                      raw_spin_lock_irq(&x->wait.lock)
> 	 raw_spin_lock_irq(&x->wait.lock)                      x->done != UINT_MAX, 1 -> 0
> 							       return 1
> 							      raw_spin_unlock_irq(&x->wait.lock)
> 	 while (!x->done && timeout),
> 	 continue loop, not enqueued
> 	 on &x->wait
> 
> The difference compared to the non-swait based implementation is that
> swake_up_locked() removes woken-up tasks from the list, while the other
> implementation (wait_queue_entry based, default_wake_function()) does
> not.

Yes, exactly.  I was wondering if swait could be changed to not remove
the waiter, but that seemed like a bad idea.  It is an unusual semantic,
though.

I thought some more about this, wondering why everything isn't keeling
over because of this.  I'm guessing that just about everything using
completions has a single waiter, so it doesn't matter.  I just wrote
some code that has a bunch of waiters, so I hit it.
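
For the curious, a hypothetical sketch of that kind of load (not the
actual code that hit it) -- park several threads on one completion and
hand out one wake-up per waiter:

	#include <linux/completion.h>
	#include <linux/delay.h>
	#include <linux/kthread.h>
	#include <linux/module.h>

	static struct completion done;

	static int waiter_fn(void *arg)
	{
		/* several of these block on the same completion at once */
		wait_for_completion(&done);
		return 0;
	}

	static int __init repro_init(void)
	{
		int i;

		init_completion(&done);
		for (i = 0; i < 8; i++)
			kthread_run(waiter_fn, NULL, "cwait/%d", i);

		msleep(100);	/* let the waiters queue up */

		/*
		 * One complete() per waiter.  With the race, a waiter that
		 * was woken (and dequeued by swake_up_locked()) but lost
		 * x->done to another thread goes back to sleep while no
		 * longer on the wait-queue, and nothing ever wakes it.
		 */
		for (i = 0; i < 8; i++)
			complete(&done);
		return 0;
	}
	module_init(repro_init);
	MODULE_LICENSE("GPL");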

-corey

> 
> One question about the upstream completion implementation:
> completion_done() returns true if there are no waiters. It acquires the
> wait.lock to ensure that complete()/complete_all() is done. However,
> once complete() releases the lock, it is only guaranteed that the
> wake_up() (for the waiter) occurred. The waiter task still needs to
> remove itself from the wait-queue before the completion can be freed.
> Am I missing something?
> 
> > Fixes: a04ff6b4ec4ee7e ("completion: Use simple wait queues")
> > Signed-off-by: Corey Minyard <cminyard@mvista.com>
> > ---
> > I sent the wrong version of this; I had spotted this before but didn't
> > fix it here.  Adding the thread to the wait queue needs to come after
> > the signal check.  Sorry about the noise.
> > 
> >  kernel/sched/completion.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
> > index 755a58084978..4f9b4cc0c95a 100644
> > --- a/kernel/sched/completion.c
> > +++ b/kernel/sched/completion.c
> > @@ -70,20 +70,20 @@ do_wait_for_common(struct completion *x,
> >  		   long (*action)(long), long timeout, int state)
> >  {
> >  	if (!x->done) {
> > -		DECLARE_SWAITQUEUE(wait);
> > -
> > -		__prepare_to_swait(&x->wait, &wait);
> 
> you can keep DECLARE_SWAITQUEUE and remove just __prepare_to_swait()
> 
> >  		do {
> > +			DECLARE_SWAITQUEUE(wait);
> > +
> >  			if (signal_pending_state(state, current)) {
> >  				timeout = -ERESTARTSYS;
> >  				break;
> >  			}
> > +			__prepare_to_swait(&x->wait, &wait);
> 
> add this, yes and you are done.
> 
> >  			__set_current_state(state);
> >  			raw_spin_unlock_irq(&x->wait.lock);
> >  			timeout = action(timeout);
> >  			raw_spin_lock_irq(&x->wait.lock);
> > +			__finish_swait(&x->wait, &wait);
> >  		} while (!x->done && timeout);
> > -		__finish_swait(&x->wait, &wait);
> >  		if (!x->done)
> >  			return timeout;
> >  	}
> 
> Sebastian


* Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
  2019-05-09 16:19 ` [PATCH RT " Sebastian Andrzej Siewior
  2019-05-09 17:46   ` Corey Minyard
@ 2019-05-14  8:43   ` Peter Zijlstra
  2019-05-14  9:12     ` Sebastian Andrzej Siewior
  2019-06-26 10:35   ` Peter Zijlstra
  2 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2019-05-14  8:43 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: minyard, linux-rt-users, Corey Minyard, linux-kernel, tglx,
	Steven Rostedt, Ingo Molnar

On Thu, May 09, 2019 at 06:19:25PM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-05-08 15:57:28 [-0500], minyard@acm.org wrote:

> >  kernel/sched/completion.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
> > index 755a58084978..4f9b4cc0c95a 100644
> > --- a/kernel/sched/completion.c
> > +++ b/kernel/sched/completion.c
> > @@ -70,20 +70,20 @@ do_wait_for_common(struct completion *x,
> >  		   long (*action)(long), long timeout, int state)
> >  {
> >  	if (!x->done) {
> > -		DECLARE_SWAITQUEUE(wait);
> > -
> > -		__prepare_to_swait(&x->wait, &wait);
> 
> you can keep DECLARE_SWAITQUEUE and remove just __prepare_to_swait()
> 
> >  		do {
> > +			DECLARE_SWAITQUEUE(wait);
> > +
> >  			if (signal_pending_state(state, current)) {
> >  				timeout = -ERESTARTSYS;
> >  				break;
> >  			}
> > +			__prepare_to_swait(&x->wait, &wait);
> 
> add this, yes and you are done.
> 
> >  			__set_current_state(state);
> >  			raw_spin_unlock_irq(&x->wait.lock);
> >  			timeout = action(timeout);
> >  			raw_spin_lock_irq(&x->wait.lock);
> > +			__finish_swait(&x->wait, &wait);
> >  		} while (!x->done && timeout);
> > -		__finish_swait(&x->wait, &wait);
> >  		if (!x->done)
> >  			return timeout;
> >  	}

Now.. that will fix it, but I think it is also wrong.

The problem being that it violates FIFO, something that might be more
important on -RT than elsewhere.

The regular wait API seems confused/inconsistent when it uses
autoremove_wake_function and default_wake_function, which doesn't help,
but we can easily support this with swait -- the problematic thing is
the custom wake functions, we mustn't do that.

(also, mingo went and renamed a whole bunch of wait_* crap and didn't do
the same to swait_ so now it's named all different :/)

Something like the below perhaps.

---
diff --git a/include/linux/swait.h b/include/linux/swait.h
index 73e06e9986d4..f194437ae7d2 100644
--- a/include/linux/swait.h
+++ b/include/linux/swait.h
@@ -61,11 +61,13 @@ struct swait_queue_head {
 struct swait_queue {
 	struct task_struct	*task;
 	struct list_head	task_list;
+	unsigned int		remove;
 };
 
 #define __SWAITQUEUE_INITIALIZER(name) {				\
 	.task		= current,					\
 	.task_list	= LIST_HEAD_INIT((name).task_list),		\
+	.remove		= 1,						\
 }
 
 #define DECLARE_SWAITQUEUE(name)					\
diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
index e83a3f8449f6..86974ecbabfc 100644
--- a/kernel/sched/swait.c
+++ b/kernel/sched/swait.c
@@ -28,7 +28,8 @@ void swake_up_locked(struct swait_queue_head *q)
 
 	curr = list_first_entry(&q->task_list, typeof(*curr), task_list);
 	wake_up_process(curr->task);
-	list_del_init(&curr->task_list);
+	if (curr->remove)
+		list_del_init(&curr->task_list);
 }
 EXPORT_SYMBOL(swake_up_locked);
 
@@ -57,7 +58,8 @@ void swake_up_all(struct swait_queue_head *q)
 		curr = list_first_entry(&tmp, typeof(*curr), task_list);
 
 		wake_up_state(curr->task, TASK_NORMAL);
-		list_del_init(&curr->task_list);
+		if (curr->remove)
+			list_del_init(&curr->task_list);
 
 		if (list_empty(&tmp))
 			break;


* Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
  2019-05-14  8:43   ` Peter Zijlstra
@ 2019-05-14  9:12     ` Sebastian Andrzej Siewior
  2019-05-14 11:35       ` Peter Zijlstra
  2019-05-14 12:13       ` Corey Minyard
  0 siblings, 2 replies; 11+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-05-14  9:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: minyard, linux-rt-users, Corey Minyard, linux-kernel, tglx,
	Steven Rostedt, Ingo Molnar

On 2019-05-14 10:43:56 [+0200], Peter Zijlstra wrote:
> Now.. that will fix it, but I think it is also wrong.
> 
> The problem being that it violates FIFO, something that might be more
> important on -RT than elsewhere.

Wouldn't -RT be more about waking the task with the highest priority
instead of the one that waited the longest?

> The regular wait API seems confused/inconsistent when it uses
> autoremove_wake_function and default_wake_function, which doesn't help,
> but we can easily support this with swait -- the problematic thing is
> the custom wake functions, we mustn't do that.
> 
> (also, mingo went and renamed a whole bunch of wait_* crap and didn't do
> the same to swait_ so now it's named all different :/)
> 
> Something like the below perhaps.

This still violates FIFO because a task can do wait_for_completion(),
not enqueue itself on the list because it noticed a pending wake, and
leave. The list order is preserved, we have that.
But this is a completion list. We probably have multiple workers waiting
for something to do, so all of those should be of equal priority, maybe
one for each core or so. So it shouldn't matter which one we wake up.

Corey, would it make any difference which waiter is going to be woken up?

Sebastian


* Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
  2019-05-14  9:12     ` Sebastian Andrzej Siewior
@ 2019-05-14 11:35       ` Peter Zijlstra
  2019-05-14 15:25         ` Sebastian Andrzej Siewior
  2019-05-14 12:13       ` Corey Minyard
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2019-05-14 11:35 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: minyard, linux-rt-users, Corey Minyard, linux-kernel, tglx,
	Steven Rostedt, Ingo Molnar

On Tue, May 14, 2019 at 11:12:19AM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-05-14 10:43:56 [+0200], Peter Zijlstra wrote:
> > Now.. that will fix it, but I think it is also wrong.
> > 
> > The problem being that it violates FIFO, something that might be more
> > important on -RT than elsewhere.
> 
> Wouldn't -RT be more about waking the task with the highest priority
> instead of the one that waited the longest?

Possibly, but that's a far larger patch. Also, even with that
completions do not avoid inversions and thus don't really make nice RT
primitives anyway.


* Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
  2019-05-14  9:12     ` Sebastian Andrzej Siewior
  2019-05-14 11:35       ` Peter Zijlstra
@ 2019-05-14 12:13       ` Corey Minyard
  2019-05-14 15:36         ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 11+ messages in thread
From: Corey Minyard @ 2019-05-14 12:13 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Peter Zijlstra, minyard, linux-rt-users, linux-kernel, tglx,
	Steven Rostedt, Ingo Molnar

On Tue, May 14, 2019 at 11:12:19AM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-05-14 10:43:56 [+0200], Peter Zijlstra wrote:
> > Now.. that will fix it, but I think it is also wrong.
> > 
> > The problem being that it violates FIFO, something that might be more
> > important on -RT than elsewhere.
> 
> Wouldn't -RT be more about waking the task with the highest priority
> instead of the one that waited the longest?
> 
> > The regular wait API seems confused/inconsistent when it uses
> > autoremove_wake_function and default_wake_function, which doesn't help,
> > but we can easily support this with swait -- the problematic thing is
> > the custom wake functions, we mustn't do that.
> > 
> > (also, mingo went and renamed a whole bunch of wait_* crap and didn't do
> > the same to swait_ so now it's named all different :/)
> > 
> > Something like the below perhaps.
> 
> This still violates FIFO because a task can do wait_for_completion(),
> not enqueue itself on the list because it noticed a pending wake, and
> leave. The list order is preserved, we have that.
> But this is a completion list. We probably have multiple workers waiting
> for something to do, so all of those should be of equal priority, maybe
> one for each core or so. So it shouldn't matter which one we wake up.
> 
> Corey, would it make any difference which waiter is going to be woken up?

In the application that found this, the wake order probably isn't
relevant.

For other applications, I really doubt that very many are using multiple
waiters.  If so, this bug would have been reported sooner, I think.

As you mention, for RT you would want waiters woken by priority and FIFO
within priority.  I don't think POSIX says anything about FIFO within
priority, but that's probably a good idea.  That's no longer a simple
wait queue.  The way it is now is probably closer to that than what Peter
suggested, but not really that close.

This is heavily used in drivers and fs code, where it probably doesn't
matter.  I looked through a few users in mm and kernel, and they had
one waiter or were init/shutdown type things where order is not important.

So I'm not sure it's important.

-corey

> 
> Sebastian


* Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
  2019-05-14 11:35       ` Peter Zijlstra
@ 2019-05-14 15:25         ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 11+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-05-14 15:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: minyard, linux-rt-users, Corey Minyard, linux-kernel, tglx,
	Steven Rostedt, Ingo Molnar

On 2019-05-14 13:35:38 [+0200], Peter Zijlstra wrote:
> On Tue, May 14, 2019 at 11:12:19AM +0200, Sebastian Andrzej Siewior wrote:
> > On 2019-05-14 10:43:56 [+0200], Peter Zijlstra wrote:
> > > Now.. that will fix it, but I think it is also wrong.
> > > 
> > > The problem being that it violates FIFO, something that might be more
> > > important on -RT than elsewhere.
> > 
> > Wouldn't -RT be more about waking the task with the highest priority
> > instead of the one that waited the longest?
> 
> Possibly, but that's a far larger patch. Also, even with that
> completions do not avoid inversions and thus don't really make nice RT
> primitives anyway.

See. So it does not really matter if a particular waiter ends up at the
end of the queue.
Anyway, I don't really think we need this, but if you want it, let me add
it.
What about the other question I had regarding completion_done()?

Sebastian


* Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
  2019-05-14 12:13       ` Corey Minyard
@ 2019-05-14 15:36         ` Sebastian Andrzej Siewior
  2019-05-15 16:22           ` Corey Minyard
  0 siblings, 1 reply; 11+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-05-14 15:36 UTC (permalink / raw)
  To: Corey Minyard
  Cc: Peter Zijlstra, minyard, linux-rt-users, linux-kernel, tglx,
	Steven Rostedt, Ingo Molnar

On 2019-05-14 07:13:50 [-0500], Corey Minyard wrote:
> > Corey, would it make any difference which waiter is going to be woken up?
> 
> In the application that found this, the wake order probably isn't
> relevant.

what I expected.

> For other applications, I really doubt that very many are using multiple
> waiters.  If so, this bug would have been reported sooner, I think.

most others do either one waiter/waker pair or one waker and multiple
waiters. And then reinit_completion() is used for the next round.
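
That pattern looks roughly like (setup_done being just an example name):

	init_completion(&setup_done);

	/* each waiter */
	wait_for_completion(&setup_done);

	/* the single waker, once per round */
	complete_all(&setup_done);	/* releases all current waiters */
	...
	reinit_completion(&setup_done);	/* re-arm only after everyone is through */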

> > As you mention, for RT you would want waiters woken by priority and FIFO
> > within priority.  I don't think POSIX says anything about FIFO within
> > priority, but that's probably a good idea.  That's no longer a simple
> > wait queue.  The way it is now is probably closer to that than what Peter
> > suggested, but not really that close.
> 
> This is heavily used in drivers and fs code, where it probably doesn't
> matter.  I looked through a few users in mm and kernel, and they had
> one waiter or were init/shutdown type things where order is not important.
> 
> So I'm not sure it's important.

Why did you bring POSIX into this? This isn't an API exported to
userland which would fall into that category.

Peter's suggestion for FIFO is that we probably don't want to starve one
thread/waiter if it is always enqueued at the end of the list. As you
said, in your case it does not matter because (I assume) each waiter is
equal and the outcome would be the same.

> -corey

Sebastian


* Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
  2019-05-14 15:36         ` Sebastian Andrzej Siewior
@ 2019-05-15 16:22           ` Corey Minyard
  0 siblings, 0 replies; 11+ messages in thread
From: Corey Minyard @ 2019-05-15 16:22 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Peter Zijlstra, minyard, linux-rt-users, linux-kernel, tglx,
	Steven Rostedt, Ingo Molnar

On Tue, May 14, 2019 at 05:36:47PM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-05-14 07:13:50 [-0500], Corey Minyard wrote:
> > > Corey, would it make any difference which waiter is going to be woken up?
> > 
> > In the application that found this, the wake order probably isn't
> > relevant.
> 
> what I expected.
> 
> > For other applications, I really doubt that very many are using multiple
> > waiters.  If so, this bug would have been reported sooner, I think.
> 
> most others do either one waiter/waker pair or one waker and multiple
> waiters. And then reinit_completion() is used for the next round.
> 
> > As you mention, for RT you would want waiters woken by priority and FIFO
> > within priority.  I don't think POSIX says anything about FIFO within
> > priority, but that's probably a good idea.  That's no longer a simple
> > wait queue.  The way it is now is probably closer to that than what Peter
> > suggested, but not really that close.
> > 
> > This is heavily used in drivers and fs code, where it probably doesn't
> > matter.  I looked through a few users in mm and kernel, and they had
> > one waiter or were init/shutdown type things where order is not important.
> > 
> > So I'm not sure it's important.
> 
> Why did you bring POSIX into this? This isn't an API exported to
> userland which would fall into that category.

My understanding is that POSIX.1b scheduling classes affect everything.
So if you have two processes blocked on the same thing for any reason,
you need to wake up the higher priority one first.  In reality, it
probably doesn't matter unless something more critical uses completions. 

> 
> Peter's suggestion for FIFO is that we probably don't want to starve one
> thread/waiter if it is always enqueued at the end of the list. As you
> said, in your case it does not matter because (I assume) each waiter is
> equal and the outcome would be the same.

Yeah, my case is not relevant to this, I'm more concerned with the cases
that are.

-corey

> 
> > -corey
> 
> Sebastian


* Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
  2019-05-09 16:19 ` [PATCH RT " Sebastian Andrzej Siewior
  2019-05-09 17:46   ` Corey Minyard
  2019-05-14  8:43   ` Peter Zijlstra
@ 2019-06-26 10:35   ` Peter Zijlstra
  2 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2019-06-26 10:35 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: minyard, linux-rt-users, Corey Minyard, linux-kernel, tglx,
	Steven Rostedt

On Thu, May 09, 2019 at 06:19:25PM +0200, Sebastian Andrzej Siewior wrote:
> One question about the upstream completion implementation:
> completion_done() returns true if there are no waiters. It acquires the
> wait.lock to ensure that complete()/complete_all() is done. However,
> once complete() releases the lock, it is only guaranteed that the
> wake_up() (for the waiter) occurred. The waiter task still needs to
> remove itself from the wait-queue before the completion can be freed.
> Am I missing something?

So you mean:

	init_completion(&done);


	wait_for_completion(&done)
	  spin_lock()
	   __add_wait_queue()
	  spin_unlock()
	  schedule()

					complete()
								completion_done()
	  spin_lock()
	  __remove_wait_queue()
	  spin_unlock()

Right?

I think that boils down to this: whenever you have multiple waiters,
someone needs to be in charge of @done's lifetime.

The case that matters is:

	DECLARE_COMPLETION_ONSTACK(done)

	while (!completion_done(&done))
		cpu_relax();

Where there is but a single waiter, and that waiter is
completion_done(). In that case it must not return early.

Now, I've also seen a ton of code do:

	if (!completion_done(done))
		complete(done);

And that makes me itch... but I've not bothered to look into it hard
enough.

