* [PATCH v2] Fix a lockup in wait_for_completion() and friends
@ 2019-05-08 20:57 minyard
  2019-05-09 16:19 ` [PATCH RT " Sebastian Andrzej Siewior
  0 siblings, 1 reply; 28+ messages in thread
From: minyard @ 2019-05-08 20:57 UTC
  To: linux-rt-users; +Cc: minyard, Corey Minyard

From: Corey Minyard <cminyard@mvista.com>

do_wait_for_common() has a race condition that can cause lockups
when waiting for completions.  The thread is added to (and removed
from) the completion's wait queue outside the do loop in that
function.  However, when the thread is woken up, swake_up_locked()
deletes its entry from the wait queue.  If that happens and another
thread sneaks in and decrements the completion's done count to zero,
the loop goes around again, but the thread is no longer on the wait
queue, so there is no way to wake it up.

Fix it by adding/removing the thread to/from the wait queue inside
the do loop.

Fixes: a04ff6b4ec4ee7e ("completion: Use simple wait queues")
Signed-off-by: Corey Minyard <cminyard@mvista.com>
---
I sent the wrong version of this.  I had spotted the problem before
but didn't fix it in the version I sent.  Adding the thread to the
wait queue needs to come after the signal check.  Sorry about the noise.

 kernel/sched/completion.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index 755a58084978..4f9b4cc0c95a 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -70,20 +70,20 @@ do_wait_for_common(struct completion *x,
 		   long (*action)(long), long timeout, int state)
 {
 	if (!x->done) {
-		DECLARE_SWAITQUEUE(wait);
-
-		__prepare_to_swait(&x->wait, &wait);
 		do {
+			DECLARE_SWAITQUEUE(wait);
+
 			if (signal_pending_state(state, current)) {
 				timeout = -ERESTARTSYS;
 				break;
 			}
+			__prepare_to_swait(&x->wait, &wait);
 			__set_current_state(state);
 			raw_spin_unlock_irq(&x->wait.lock);
 			timeout = action(timeout);
 			raw_spin_lock_irq(&x->wait.lock);
+			__finish_swait(&x->wait, &wait);
 		} while (!x->done && timeout);
-		__finish_swait(&x->wait, &wait);
 		if (!x->done)
 			return timeout;
 	}
-- 
2.17.1
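
The race described in the commit message can be replayed
deterministically in userspace.  The sketch below is a simplified,
single-threaded model, not kernel code: struct node, struct
model_completion, the model_*() helpers and replay_race() are all
hypothetical stand-ins for list_head, struct completion, complete()
and the enqueue step, and locking and scheduling are left out
entirely.  It only illustrates that once the wakeup has removed the
waiter's entry, a later completion has nothing to wake unless the
waiter enqueues itself again on the next pass through the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }
static bool node_unlinked(const struct node *n) { return n->next == n; }

static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

/* Hypothetical stand-in for struct completion (no lock in this model). */
struct model_completion {
	unsigned int done;
	struct node waiters;
};

/* Models the enqueue step: add the entry only if it is not queued yet. */
static void model_prepare(struct model_completion *x, struct node *wait)
{
	assert(wait != &x->waiters);
	if (node_unlinked(wait))
		node_add_tail(wait, &x->waiters);
}

/*
 * Models complete(): bump done, then wake the first waiter, which also
 * removes its entry from the queue (the behaviour the commit message
 * attributes to swake_up_locked()).  Returns true if a waiter was woken.
 */
static bool model_complete(struct model_completion *x)
{
	x->done++;
	if (node_unlinked(&x->waiters))
		return false;		/* nobody queued, nobody woken */
	node_del_init(x->waiters.next);
	return true;
}

/* Models a second waiter that sees done != 0 and consumes it without sleeping. */
static bool model_try_consume(struct model_completion *x)
{
	if (!x->done)
		return false;
	x->done--;
	return true;
}

/*
 * Replays the interleaving from the commit message.  Returns true if a
 * second completion is still able to find and wake the first waiter.
 */
bool replay_race(bool enqueue_inside_loop)
{
	struct model_completion x = { .done = 0 };
	struct node t0_wait;

	node_init(&x.waiters);
	node_init(&t0_wait);

	model_prepare(&x, &t0_wait);	/* waiter: first pass, goes to sleep */
	model_complete(&x);		/* done 0 -> 1, waiter dequeued and woken */
	model_try_consume(&x);		/* other thread: done 1 -> 0 */

	/* The waiter runs again, sees done == 0 and loops around. */
	if (enqueue_inside_loop)
		model_prepare(&x, &t0_wait);

	return model_complete(&x);	/* can the next completion wake it? */
}
```

Under these assumptions, replay_race(false) returns false because the
second completion finds an empty queue, while replay_race(true)
returns true because the waiter is back on the queue.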


* [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
@ 2019-05-09 19:33 minyard
  2019-05-09 19:51 ` Steven Rostedt
  2019-05-10 10:33 ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 28+ messages in thread
From: minyard @ 2019-05-09 19:33 UTC
  To: linux-rt-users
  Cc: minyard, linux-kernel, Sebastian Andrzej Siewior, Peter Zijlstra,
	tglx, Steven Rostedt, Corey Minyard

From: Corey Minyard <cminyard@mvista.com>

do_wait_for_common() has a race condition that can cause lockups
when waiting for completions.  The thread is added to (and removed
from) the completion's wait queue outside the do loop in that
function.  However, when the thread is woken up, swake_up_locked()
deletes its entry from the wait queue.  If that happens and another
thread sneaks in and decrements the completion's done count to zero,
the loop goes around again, but the thread is no longer on the wait
queue, so there is no way to wake it up.

Visually, here's a diagram from Sebastian Andrzej Siewior:
  T0                    T1                       T2
  wait_for_completion()
   do_wait_for_common()
    __prepare_to_swait()
     schedule()
                        complete()
                         x->done++ (0 -> 1)
                         raw_spin_lock_irqsave()
                         swake_up_locked()       wait_for_completion()
                          wake_up_process(T0)
                          list_del_init()
                         raw_spin_unlock_irqrestore()
                                                  raw_spin_lock_irq(&x->wait.lock)
  raw_spin_lock_irq(&x->wait.lock)                x->done != UINT_MAX, 1 -> 0
                                                  raw_spin_unlock_irq(&x->wait.lock)
                                                  return 1
   while (!x->done && timeout),
   continue loop, not enqueued
   on &x->wait

Basically, the problem is that the original wait queues used in
completions did not remove the item from the queue in the wakeup
function, but swake_up_locked() does.

Fix it by adding the thread to the wait queue inside the do loop.
__prepare_to_swait() detects whether the entry is already on the list
and does not add it again.

Fixes: a04ff6b4ec4ee7e ("completion: Use simple wait queues")
Signed-off-by: Corey Minyard <cminyard@mvista.com>
---
Changes since v1:
* Only move __prepare_to_swait() into the loop.  __prepare_to_swait()
  copes with being called when the entry is already on the list, and
  of course __finish_swait() copes with the entry not being queued on
  the list.

 kernel/sched/completion.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index 755a58084978..49c14137988e 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -72,12 +72,12 @@ do_wait_for_common(struct completion *x,
 	if (!x->done) {
 		DECLARE_SWAITQUEUE(wait);
 
-		__prepare_to_swait(&x->wait, &wait);
 		do {
 			if (signal_pending_state(state, current)) {
 				timeout = -ERESTARTSYS;
 				break;
 			}
+			__prepare_to_swait(&x->wait, &wait);
 			__set_current_state(state);
 			raw_spin_unlock_irq(&x->wait.lock);
 			timeout = action(timeout);
-- 
2.17.1
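
The v2 change relies on two properties stated in the notes above:
enqueueing an entry that is already queued is a no-op, and finishing
an entry that is not queued is harmless.  A minimal userspace sketch
of that behaviour follows, assuming a list_head-style circular list.
struct node, node_init(), sketch_prepare(), sketch_finish() and
queue_length() are illustrative names only, not the kernel's API, and
the real helpers also deal with task state and run under the wait
queue lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Circular doubly linked list node; an unlinked node points at itself. */
struct node { struct node *prev, *next; };

void node_init(struct node *n) { n->prev = n->next = n; }

static bool node_unlinked(const struct node *n) { return n->next == n; }

/* Enqueue at the tail, but only if the entry is not already queued. */
void sketch_prepare(struct node *head, struct node *wait)
{
	assert(head != wait);
	if (!node_unlinked(wait))
		return;			/* already on the list: nothing to do */
	wait->prev = head->prev;
	wait->next = head;
	head->prev->next = wait;
	head->prev = wait;
}

/* Dequeue, tolerating an entry that a wakeup has already removed. */
void sketch_finish(struct node *wait)
{
	if (node_unlinked(wait))
		return;			/* not queued: nothing to do */
	wait->prev->next = wait->next;
	wait->next->prev = wait->prev;
	node_init(wait);
}

/* Count the entries on the queue (the head itself is not counted). */
size_t queue_length(const struct node *head)
{
	size_t n = 0;

	for (const struct node *p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```

With both checks in place, calling the enqueue step on every loop
iteration and the dequeue step once after the loop is safe regardless
of whether a wakeup removed the entry in between.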



End of thread; newest message 2019-07-02 11:53 UTC.

Thread overview: 28+ messages
2019-05-08 20:57 [PATCH v2] Fix a lockup in wait_for_completion() and friends minyard
2019-05-09 16:19 ` [PATCH RT " Sebastian Andrzej Siewior
2019-05-09 17:46   ` Corey Minyard
2019-05-14  8:43   ` Peter Zijlstra
2019-05-14  9:12     ` Sebastian Andrzej Siewior
2019-05-14 11:35       ` Peter Zijlstra
2019-05-14 15:25         ` Sebastian Andrzej Siewior
2019-05-14 12:13       ` Corey Minyard
2019-05-14 15:36         ` Sebastian Andrzej Siewior
2019-05-15 16:22           ` Corey Minyard
2019-06-26 10:35   ` Peter Zijlstra
2019-05-09 19:33 minyard
2019-05-09 19:51 ` Steven Rostedt
2019-05-10 10:33 ` Sebastian Andrzej Siewior
2019-05-10 12:08   ` Corey Minyard
2019-05-10 12:26     ` Sebastian Andrzej Siewior
2019-06-29  1:49   ` Steven Rostedt
2019-07-01 19:09     ` Corey Minyard
2019-07-01 20:18       ` Steven Rostedt
2019-07-01 20:43         ` Corey Minyard
2019-07-01 21:06           ` Steven Rostedt
2019-07-01 21:13             ` Steven Rostedt
2019-07-01 21:28               ` Steven Rostedt
2019-07-01 21:34                 ` Corey Minyard
2019-07-02  7:04                 ` Kurt Kanzenbach
2019-07-02  8:35                   ` Sebastian Andrzej Siewior
2019-07-02 11:40                     ` Corey Minyard
2019-07-02 11:53                       ` Sebastian Andrzej Siewior
