* [PATCH] locking/qrwlock: Give priority to readers with irqs disabled to prevent deadlock
@ 2018-04-04 15:24 Kirill Tkhai
  2018-04-04 15:35 ` Peter Zijlstra
  2018-04-04 15:43 ` Greg KH
  0 siblings, 2 replies; 9+ messages in thread
From: Kirill Tkhai @ 2018-04-04 15:24 UTC (permalink / raw)
  To: peterz, mingo, stable, linux-kernel, ktkhai

The following situation leads to deadlock:

[task 1]                          [task 2]                         [task 3]
kill_fasync()                     mm_update_next_owner()           copy_process()
 spin_lock_irqsave(&fa->fa_lock)   read_lock(&tasklist_lock)        write_lock_irq(&tasklist_lock)
  send_sigio()                    <IRQ>                             ...
   read_lock(&fown->lock)         kill_fasync()                     ...
    read_lock(&tasklist_lock)      spin_lock_irqsave(&fa->fa_lock)  ...

Task 1 can't acquire the read-locked tasklist_lock, since task 3 has
already expressed its wish to take the lock exclusively.
Task 2 holds tasklist_lock for read, but it can't take the spin lock.

The patch makes queued_read_lock_slowpath() give task 1 the same
priority as an interrupt handler, so it takes the lock despite
task 3 waiting for it, and this prevents the deadlock.
There seems to be no better way to detect such situations; also,
in general it's not good to make readers wait so long with
interrupts disabled, since read_lock may nest with other locks
and delay the system.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 kernel/locking/qrwlock.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index c7471c3fb798..d15df85de8f5 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -32,7 +32,7 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
 	/*
 	 * Readers come here when they cannot get the lock without waiting
 	 */
-	if (unlikely(in_interrupt())) {
+	if (unlikely(irqs_disabled())) {
 		/*
 		 * Readers in interrupt context will get the lock immediately
 		 * if the writer is just waiting (not holding the lock yet),

* Re: [PATCH] locking/qrwlock: Give priority to readers with irqs disabled to prevent deadlock
  2018-04-04 15:24 [PATCH] locking/qrwlock: Give priority to readers with irqs disabled to prevent deadlock Kirill Tkhai
@ 2018-04-04 15:35 ` Peter Zijlstra
  2018-04-04 15:51   ` Kirill Tkhai
  2018-04-04 15:43 ` Greg KH
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2018-04-04 15:35 UTC (permalink / raw)
  To: Kirill Tkhai; +Cc: mingo, stable, linux-kernel, Waiman Long, Boqun Feng

On Wed, Apr 04, 2018 at 06:24:39PM +0300, Kirill Tkhai wrote:
> The following situation leads to deadlock:
> 
> [task 1]                          [task 2]                         [task 3]
> kill_fasync()                     mm_update_next_owner()           copy_process()
>  spin_lock_irqsave(&fa->fa_lock)   read_lock(&tasklist_lock)        write_lock_irq(&tasklist_lock)
>   send_sigio()                    <IRQ>                             ...
>    read_lock(&fown->lock)         kill_fasync()                     ...
>     read_lock(&tasklist_lock)      spin_lock_irqsave(&fa->fa_lock)  ...
> 
> Task 1 can't acquire the read-locked tasklist_lock, since task 3 has
> already expressed its wish to take the lock exclusively.
> Task 2 holds tasklist_lock for read, but it can't take the spin lock.
> 
> The patch makes queued_read_lock_slowpath() give task 1 the same
> priority as an interrupt handler, and take the lock

That re-introduces starvation scenarios. And the above looks like a
proper deadlock that should be sorted by fixing the locking order.

* Re: [PATCH] locking/qrwlock: Give priority to readers with irqs disabled to prevent deadlock
  2018-04-04 15:24 [PATCH] locking/qrwlock: Give priority to readers with irqs disabled to prevent deadlock Kirill Tkhai
  2018-04-04 15:35 ` Peter Zijlstra
@ 2018-04-04 15:43 ` Greg KH
  1 sibling, 0 replies; 9+ messages in thread
From: Greg KH @ 2018-04-04 15:43 UTC (permalink / raw)
  To: Kirill Tkhai; +Cc: peterz, mingo, stable, linux-kernel

On Wed, Apr 04, 2018 at 06:24:39PM +0300, Kirill Tkhai wrote:
> The following situation leads to deadlock:
> 
> [task 1]                          [task 2]                         [task 3]
> kill_fasync()                     mm_update_next_owner()           copy_process()
>  spin_lock_irqsave(&fa->fa_lock)   read_lock(&tasklist_lock)        write_lock_irq(&tasklist_lock)
>   send_sigio()                    <IRQ>                             ...
>    read_lock(&fown->lock)         kill_fasync()                     ...
>     read_lock(&tasklist_lock)      spin_lock_irqsave(&fa->fa_lock)  ...
> 
> Task 1 can't acquire the read-locked tasklist_lock, since task 3 has
> already expressed its wish to take the lock exclusively.
> Task 2 holds tasklist_lock for read, but it can't take the spin lock.
> 
> The patch makes queued_read_lock_slowpath() give task 1 the same
> priority as an interrupt handler, so it takes the lock despite
> task 3 waiting for it, and this prevents the deadlock.
> There seems to be no better way to detect such situations; also,
> in general it's not good to make readers wait so long with
> interrupts disabled, since read_lock may nest with other locks
> and delay the system.
> 
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> ---
>  kernel/locking/qrwlock.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>

* Re: [PATCH] locking/qrwlock: Give priority to readers with irqs disabled to prevent deadlock
  2018-04-04 15:35 ` Peter Zijlstra
@ 2018-04-04 15:51   ` Kirill Tkhai
  2018-04-04 15:55     ` Kirill Tkhai
  2018-04-04 16:18     ` Peter Zijlstra
  0 siblings, 2 replies; 9+ messages in thread
From: Kirill Tkhai @ 2018-04-04 15:51 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: mingo, stable, linux-kernel, Waiman Long, Boqun Feng

On 04.04.2018 18:35, Peter Zijlstra wrote:
> On Wed, Apr 04, 2018 at 06:24:39PM +0300, Kirill Tkhai wrote:
>> The following situation leads to deadlock:
>>
>> [task 1]                          [task 2]                         [task 3]
>> kill_fasync()                     mm_update_next_owner()           copy_process()
>>  spin_lock_irqsave(&fa->fa_lock)   read_lock(&tasklist_lock)        write_lock_irq(&tasklist_lock)
>>   send_sigio()                    <IRQ>                             ...
>>    read_lock(&fown->lock)         kill_fasync()                     ...
>>     read_lock(&tasklist_lock)      spin_lock_irqsave(&fa->fa_lock)  ...
>>
>> Task 1 can't acquire the read-locked tasklist_lock, since task 3 has
>> already expressed its wish to take the lock exclusively.
>> Task 2 holds tasklist_lock for read, but it can't take the spin lock.
>>
>> The patch makes queued_read_lock_slowpath() give task 1 the same
>> priority as an interrupt handler, and take the lock
> 
> That re-introduces starvation scenarios. And the above looks like a
> proper deadlock that should be sorted by fixing the locking order.

We can move tasklist_lock out of send_sigio(), but I'm not sure
it's possible for read_lock(&fown->lock).

Is there another solution? Is there a reliable way to iterate do_each_pid_task()
with rcu_read_lock()?

* Re: [PATCH] locking/qrwlock: Give priority to readers with irqs disabled to prevent deadlock
  2018-04-04 15:51   ` Kirill Tkhai
@ 2018-04-04 15:55     ` Kirill Tkhai
  2018-04-04 16:25       ` Waiman Long
  2018-04-04 16:18     ` Peter Zijlstra
  1 sibling, 1 reply; 9+ messages in thread
From: Kirill Tkhai @ 2018-04-04 15:55 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: mingo, stable, linux-kernel, Waiman Long, Boqun Feng

On 04.04.2018 18:51, Kirill Tkhai wrote:
> On 04.04.2018 18:35, Peter Zijlstra wrote:
>> On Wed, Apr 04, 2018 at 06:24:39PM +0300, Kirill Tkhai wrote:
>>> The following situation leads to deadlock:
>>>
>>> [task 1]                          [task 2]                         [task 3]
>>> kill_fasync()                     mm_update_next_owner()           copy_process()
>>>  spin_lock_irqsave(&fa->fa_lock)   read_lock(&tasklist_lock)        write_lock_irq(&tasklist_lock)
>>>   send_sigio()                    <IRQ>                             ...
>>>    read_lock(&fown->lock)         kill_fasync()                     ...
>>>     read_lock(&tasklist_lock)      spin_lock_irqsave(&fa->fa_lock)  ...
>>>
>>> Task 1 can't acquire the read-locked tasklist_lock, since task 3 has
>>> already expressed its wish to take the lock exclusively.
>>> Task 2 holds tasklist_lock for read, but it can't take the spin lock.
>>>
>>> The patch makes queued_read_lock_slowpath() give task 1 the same
>>> priority as an interrupt handler, and take the lock
>>
>> That re-introduces starvation scenarios. And the above looks like a
>> proper deadlock that should be sorted by fixing the locking order.
> 
> We can move tasklist_lock out of send_sigio(), but I'm not sure
> it's possible for read_lock(&fown->lock).
> 
> Is there another solution? Is there a reliable way to iterate do_each_pid_task()
> with rcu_read_lock()?

In the case of &fown->lock we may always disable irqs in all the places where it's
taken for read, i.e. use read_lock_irqsave(&fown->lock). This seems to fix the problem
for this lock.

* Re: [PATCH] locking/qrwlock: Give priority to readers with irqs disabled to prevent deadlock
  2018-04-04 15:51   ` Kirill Tkhai
  2018-04-04 15:55     ` Kirill Tkhai
@ 2018-04-04 16:18     ` Peter Zijlstra
  2018-04-05  9:43       ` Kirill Tkhai
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2018-04-04 16:18 UTC (permalink / raw)
  To: Kirill Tkhai; +Cc: mingo, stable, linux-kernel, Waiman Long, Boqun Feng

On Wed, Apr 04, 2018 at 06:51:08PM +0300, Kirill Tkhai wrote:
> On 04.04.2018 18:35, Peter Zijlstra wrote:
> > On Wed, Apr 04, 2018 at 06:24:39PM +0300, Kirill Tkhai wrote:
> >> The following situation leads to deadlock:
> >>
> >> [task 1]                          [task 2]                         [task 3]
> >> kill_fasync()                     mm_update_next_owner()           copy_process()
> >>  spin_lock_irqsave(&fa->fa_lock)   read_lock(&tasklist_lock)        write_lock_irq(&tasklist_lock)
> >>   send_sigio()                    <IRQ>                             ...
> >>    read_lock(&fown->lock)         kill_fasync()                     ...
> >>     read_lock(&tasklist_lock)      spin_lock_irqsave(&fa->fa_lock)  ...
> >>
> >> Task 1 can't acquire the read-locked tasklist_lock, since task 3 has
> >> already expressed its wish to take the lock exclusively.
> >> Task 2 holds tasklist_lock for read, but it can't take the spin lock.
> >>
> >> The patch makes queued_read_lock_slowpath() give task 1 the same
> >> priority as an interrupt handler, and take the lock
> > 
> > That re-introduces starvation scenarios. And the above looks like a
> > proper deadlock that should be sorted by fixing the locking order.
> 
> We can move tasklist_lock out of send_sigio(), but I'm not sure
> it's possible for read_lock(&fown->lock).

So the scenario is:

CPU0			CPU1				CPU2

spin_lock_irqsave(&fa->fa_lock);
			read_lock(&tasklist_lock)
							write_lock_irq(&tasklist_lock)
read_lock(&tasklist_lock)
			<IRQ>
			  spin_lock_irqsave(&fa->fa_lock);


Right? (where the row now signifies time)

That doesn't seem to include fown->lock, you're saying it has an
identical issue?

> Is there another solution? Is there a reliable way to iterate do_each_pid_task()
> with rcu_read_lock()?

Depends on what you call reliable :-). Yes, you can use
do_each_pid_task() with RCU, but as always you're prone to see tasks
that are dead and to miss tasks that just came in.

If that is sufficient for the signal muck, dunno :/ Typically signals
use the sighand lock, not tasklist_lock.

* Re: [PATCH] locking/qrwlock: Give priority to readers with irqs disabled to prevent deadlock
  2018-04-04 15:55     ` Kirill Tkhai
@ 2018-04-04 16:25       ` Waiman Long
  2018-04-05  9:57         ` Kirill Tkhai
  0 siblings, 1 reply; 9+ messages in thread
From: Waiman Long @ 2018-04-04 16:25 UTC (permalink / raw)
  To: Kirill Tkhai, Peter Zijlstra; +Cc: mingo, stable, linux-kernel, Boqun Feng

On 04/04/2018 11:55 AM, Kirill Tkhai wrote:
> On 04.04.2018 18:51, Kirill Tkhai wrote:
>> On 04.04.2018 18:35, Peter Zijlstra wrote:
>>> On Wed, Apr 04, 2018 at 06:24:39PM +0300, Kirill Tkhai wrote:
>>>> The following situation leads to deadlock:
>>>>
>>>> [task 1]                          [task 2]                         [task 3]
>>>> kill_fasync()                     mm_update_next_owner()           copy_process()
>>>>  spin_lock_irqsave(&fa->fa_lock)   read_lock(&tasklist_lock)        write_lock_irq(&tasklist_lock)
>>>>   send_sigio()                    <IRQ>                             ...
>>>>    read_lock(&fown->lock)         kill_fasync()                     ...
>>>>     read_lock(&tasklist_lock)      spin_lock_irqsave(&fa->fa_lock)  ...
>>>>
>>>> Task 1 can't acquire the read-locked tasklist_lock, since task 3 has
>>>> already expressed its wish to take the lock exclusively.
>>>> Task 2 holds tasklist_lock for read, but it can't take the spin lock.
>>>>
>>>> The patch makes queued_read_lock_slowpath() give task 1 the same
>>>> priority as an interrupt handler, and take the lock
>>> That re-introduces starvation scenarios. And the above looks like a
>>> proper deadlock that should be sorted by fixing the locking order.
>> We can move tasklist_lock out of send_sigio(), but I'm not sure
>> it's possible for read_lock(&fown->lock).
>>
>> Is there another solution? Is there a reliable way to iterate do_each_pid_task()
>> with rcu_read_lock()?
>> In the case of &fown->lock we may always disable irqs in all the places where it's
>> taken for read, i.e. use read_lock_irqsave(&fown->lock). This seems to fix the problem
>> for this lock.

One possible solution is to add a flag in send_sigio() to use
read_trylock(&tasklist_lock) instead of read_lock(). If the trylock
fails, return an error and have the caller (kill_fasync) release
fa->fa_lock and retry. Task 1 has 3 levels of nested locking, so
it should be the one that retries if the innermost locking fails.
A warning can be printed if the retry count is too large.

-Longman

* Re: [PATCH] locking/qrwlock: Give priority to readers with irqs disabled to prevent deadlock
  2018-04-04 16:18     ` Peter Zijlstra
@ 2018-04-05  9:43       ` Kirill Tkhai
  0 siblings, 0 replies; 9+ messages in thread
From: Kirill Tkhai @ 2018-04-05  9:43 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: mingo, stable, linux-kernel, Waiman Long, Boqun Feng

On 04.04.2018 19:18, Peter Zijlstra wrote:
> On Wed, Apr 04, 2018 at 06:51:08PM +0300, Kirill Tkhai wrote:
>> On 04.04.2018 18:35, Peter Zijlstra wrote:
>>> On Wed, Apr 04, 2018 at 06:24:39PM +0300, Kirill Tkhai wrote:
>>>> The following situation leads to deadlock:
>>>>
>>>> [task 1]                          [task 2]                         [task 3]
>>>> kill_fasync()                     mm_update_next_owner()           copy_process()
>>>>  spin_lock_irqsave(&fa->fa_lock)   read_lock(&tasklist_lock)        write_lock_irq(&tasklist_lock)
>>>>   send_sigio()                    <IRQ>                             ...
>>>>    read_lock(&fown->lock)         kill_fasync()                     ...
>>>>     read_lock(&tasklist_lock)      spin_lock_irqsave(&fa->fa_lock)  ...
>>>>
>>>> Task 1 can't acquire the read-locked tasklist_lock, since task 3 has
>>>> already expressed its wish to take the lock exclusively.
>>>> Task 2 holds the read-locked lock, but it can't take the spin lock.
>>>>
>>>> The patch makes queued_read_lock_slowpath() give task 1 the same
>>>> priority as an interrupt handler, and take the lock
>>>
>>> That re-introduces starvation scenarios. And the above looks like a
>>> proper deadlock that should be sorted by fixing the locking order.
>>
>> We can move tasklist_lock out of send_sigio(), but I'm not sure
>> it's possible for read_lock(&fown->lock).
> 
> So the scenario is:
> 
> CPU0			CPU1				CPU2
> 
> spin_lock_irqsave(&fa->fa_lock);
> 			read_lock(&tasklist_lock)
> 							write_lock_irq(&tasklist_lock)
> read_lock(&tasklist_lock)
> 			<IRQ>
> 			  spin_lock_irqsave(&fa->fa_lock);
> 
> 
> Right? (where the row now signifies time)
> 
> That doesn't seem to include fown->lock, you're saying it has an
> identical issue?

There is also read_lock(), which can be taken from interrupt:

CPU0                                          CPU1                                     CPU2

f_getown()                                    kill_fasync()                               
  read_lock(&f_own->lock)                      spin_lock_irqsave(&fa->fa_lock, flags)    
  <IRQ>                                        send_sigio()                            write_lock_irq(&f_own->lock);
    kill_fasync()                                 read_lock(&fown->lock)
      spin_lock_irqsave(&fa->fa_lock, flags)

To prevent deadlock, this requires that &f_own->lock always be taken via read_lock_irqsave().

This may be formalized as a lockdep rule: if a spinlock nests inside a read_lock(), and both
can be taken from interrupt context, then every read_lock() of that lock must disable interrupts.

>> Is there another solution? Is there a reliable way to iterate do_each_pid_task()
>> with rcu_read_lock()?
> 
> Depends on what you call reliable :-), Yes you can use
> do_each_pid_task() with RCU, but as always you're prone to see tasks
> that are dead and miss tasks that just came in.
>
> If that is sufficient for the signal muck, dunno :/ Typically signals
> use sighand lock, not tasklist_lock.

Seeing dead tasks is not a problem, but missing a newly moved task is not acceptable.
So, it seems it's not reliable...

Kirill

* Re: [PATCH] locking/qrwlock: Give priority to readers with irqs disabled to prevent deadlock
  2018-04-04 16:25       ` Waiman Long
@ 2018-04-05  9:57         ` Kirill Tkhai
  0 siblings, 0 replies; 9+ messages in thread
From: Kirill Tkhai @ 2018-04-05  9:57 UTC (permalink / raw)
  To: Waiman Long, Peter Zijlstra; +Cc: mingo, stable, linux-kernel, Boqun Feng

On 04.04.2018 19:25, Waiman Long wrote:
> On 04/04/2018 11:55 AM, Kirill Tkhai wrote:
>> On 04.04.2018 18:51, Kirill Tkhai wrote:
>>> On 04.04.2018 18:35, Peter Zijlstra wrote:
>>>> On Wed, Apr 04, 2018 at 06:24:39PM +0300, Kirill Tkhai wrote:
>>>>> The following situation leads to deadlock:
>>>>>
>>>>> [task 1]                          [task 2]                         [task 3]
>>>>> kill_fasync()                     mm_update_next_owner()           copy_process()
>>>>>  spin_lock_irqsave(&fa->fa_lock)   read_lock(&tasklist_lock)        write_lock_irq(&tasklist_lock)
>>>>>   send_sigio()                    <IRQ>                             ...
>>>>>    read_lock(&fown->lock)         kill_fasync()                     ...
>>>>>     read_lock(&tasklist_lock)      spin_lock_irqsave(&fa->fa_lock)  ...
>>>>>
>>>>> Task 1 can't acquire the read-locked tasklist_lock, since task 3 has
>>>>> already expressed its wish to take the lock exclusively.
>>>>> Task 2 holds the read-locked lock, but it can't take the spin lock.
>>>>>
>>>>> The patch makes queued_read_lock_slowpath() give task 1 the same
>>>>> priority as an interrupt handler, and take the lock
>>>> That re-introduces starvation scenarios. And the above looks like a
>>>> proper deadlock that should be sorted by fixing the locking order.
>>> We can move tasklist_lock out of send_sigio(), but I'm not sure
>>> it's possible for read_lock(&fown->lock).
>>>
>>> Is there another solution? Is there a reliable way to iterate do_each_pid_task()
>>> with rcu_read_lock()?
>> In the case of &fown->lock we may always disable irqs in all the places where it's
>> taken for read, i.e. use read_lock_irqsave(&fown->lock). This seems to fix the problem
>> for this lock.
> 
> One possible solution is to add a flag in send_sigio() to use
> read_trylock(&tasklist_lock) instead of read_lock(). If the trylock
> fails, return an error and have the caller (kill_fasync) release
> fa->fa_lock and retry. Task 1 has 3 levels of nested locking, so
> it should be the one that retries if the innermost locking fails.
> A warning can be printed if the retry count is too large.

send_sigio() is called from several places, and the calling context is
in general unknown. In the case of dnotify_handle_event(), which calls
it under a spinlock, the trylock would act as a busy loop. This may block
some smp/stop-machine primitives, and I'm not sure this can't lead to
some currently invisible deadlocks.

There is a solution I'm analysing at the moment: we can convert
fasync_struct::fa_lock to a read/write lock; then we can take it from
interrupt without problems.

Kirill
