linux-kernel.vger.kernel.org archive mirror
* [PATCH] locking/rwlocks: do not starve writers
@ 2022-06-17  9:10 Eric Dumazet
  2022-06-17 12:07 ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2022-06-17  9:10 UTC (permalink / raw)
  To: Peter Zijlstra, Linus Torvalds
  Cc: linux-kernel, Eric Dumazet, Eric Dumazet, Ingo Molnar,
	Waiman Long, Boqun Feng, Will Deacon, Roman Penyaev,
	Shakeel Butt

From: Eric Dumazet <edumazet@google.com>

Networking still uses rwlocks, and read_lock() is called
from softirq context, potentially from many CPUs.

In this (soft)irq context, the rwlock code is unfair to writers
and can cause soft lockups.

We first noticed an issue with epoll after commit a218cc491420
("epoll: use rwlock in order to reduce ep_poll_callback() contention"),
but it is trivial to brick a host using this repro:

for i in {1..48}
do
 ping -f -n -q 127.0.0.1 &
 sleep 0.1
done

If an unfair version of rwlocks is really needed, we should introduce
a new read_lock_unfair().

[  673.678717][   C34] watchdog: BUG: soft lockup - CPU#34 stuck for 82s! [ping:17794]
[  673.700713][   C45] watchdog: BUG: soft lockup - CPU#45 stuck for 82s! [ping:17796]
[  673.702712][   C46] watchdog: BUG: soft lockup - CPU#46 stuck for 78s! [ping:17802]
[  673.704712][   C47] watchdog: BUG: soft lockup - CPU#47 stuck for 82s! [ping:17798]
[  677.636023][   C13] watchdog: BUG: soft lockup - CPU#13 stuck for 82s! [ping:17804]
[  677.638022][   C14] watchdog: BUG: soft lockup - CPU#14 stuck for 75s! [ping:17825]
[  677.644021][   C17] watchdog: BUG: soft lockup - CPU#17 stuck for 75s! [ping:17821]
[  677.650020][   C20] watchdog: BUG: soft lockup - CPU#20 stuck for 82s! [ping:17800]
[  677.686014][   C38] watchdog: BUG: soft lockup - CPU#38 stuck for 75s! [ping:17819]
[  681.691318][   C41] watchdog: BUG: soft lockup - CPU#41 stuck for 74s! [ping:17823]
[  684.657807][   C46] rcu: INFO: rcu_sched self-detected stall on CPU
[  684.664075][   C46] rcu: 	46-....: (1 GPs behind) idle=529/1/0x4000000000000000 softirq=22717/22717 fqs=20200
[  705.633252][   C14] watchdog: BUG: soft lockup - CPU#14 stuck for 101s! [ping:17825]
[  706.999058][  T309] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 14-... 41-... } 88575 jiffies s: 2325 root: 0x5/.
[  706.999069][  T309] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x4000/. l=1:32-47:0x200/.
[  709.686574][   C41] watchdog: BUG: soft lockup - CPU#41 stuck for 100s! [ping:17823]
[  714.457782][   C41] rcu: INFO: rcu_sched self-detected stall on CPU
[  714.464047][   C41] rcu: 	41-....: (1 GPs behind) idle=403/1/0x4000000000000000 softirq=24654/24655 fqs=4653

Fixes: 70af2f8a4f48 ("locking/rwlocks: Introduce 'qrwlocks' - fair, queued rwlocks")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Roman Penyaev <rpenyaev@suse.de>
Cc: Shakeel Butt <shakeelb@google.com>
---
 kernel/locking/qrwlock.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index 2e1600906c9f5cd868415d20e2d7024c5b1e0531..bf64d14f0fc88613363c3c007bca8c0918709123 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
 	/*
 	 * Readers come here when they cannot get the lock without waiting
 	 */
-	if (unlikely(in_interrupt())) {
-		/*
-		 * Readers in interrupt context will get the lock immediately
-		 * if the writer is just waiting (not holding the lock yet),
-		 * so spin with ACQUIRE semantics until the lock is available
-		 * without waiting in the queue.
-		 */
-		atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
-		return;
-	}
 	atomic_sub(_QR_BIAS, &lock->cnts);
 
 	trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);
-- 
2.36.1.476.g0c4daa206d-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17  9:10 [PATCH] locking/rwlocks: do not starve writers Eric Dumazet
@ 2022-06-17 12:07 ` Peter Zijlstra
  2022-06-17 14:43   ` Waiman Long
  0 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2022-06-17 12:07 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, Eric Dumazet, Ingo Molnar,
	Waiman Long, Boqun Feng, Will Deacon, Roman Penyaev,
	Shakeel Butt

On Fri, Jun 17, 2022 at 02:10:39AM -0700, Eric Dumazet wrote:
> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c
> @@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
>  	/*
>  	 * Readers come here when they cannot get the lock without waiting
>  	 */
> -	if (unlikely(in_interrupt())) {
> -		/*
> -		 * Readers in interrupt context will get the lock immediately
> -		 * if the writer is just waiting (not holding the lock yet),
> -		 * so spin with ACQUIRE semantics until the lock is available
> -		 * without waiting in the queue.
> -		 */
> -		atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
> -		return;
> -	}
>  	atomic_sub(_QR_BIAS, &lock->cnts);
>  
>  	trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);

This is known to break tasklist_lock.


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 12:07 ` Peter Zijlstra
@ 2022-06-17 14:43   ` Waiman Long
  2022-06-17 14:57     ` Shakeel Butt
  0 siblings, 1 reply; 22+ messages in thread
From: Waiman Long @ 2022-06-17 14:43 UTC (permalink / raw)
  To: Peter Zijlstra, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, Eric Dumazet, Ingo Molnar,
	Boqun Feng, Will Deacon, Roman Penyaev, Shakeel Butt

On 6/17/22 08:07, Peter Zijlstra wrote:
> On Fri, Jun 17, 2022 at 02:10:39AM -0700, Eric Dumazet wrote:
>> --- a/kernel/locking/qrwlock.c
>> +++ b/kernel/locking/qrwlock.c
>> @@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
>>   	/*
>>   	 * Readers come here when they cannot get the lock without waiting
>>   	 */
>> -	if (unlikely(in_interrupt())) {
>> -		/*
>> -		 * Readers in interrupt context will get the lock immediately
>> -		 * if the writer is just waiting (not holding the lock yet),
>> -		 * so spin with ACQUIRE semantics until the lock is available
>> -		 * without waiting in the queue.
>> -		 */
>> -		atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
>> -		return;
>> -	}
>>   	atomic_sub(_QR_BIAS, &lock->cnts);
>>   
>>   	trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);
> This is known to break tasklist_lock.
>
We certainly can't break the current usage of tasklist_lock.

I am aware of this problem with the networking code and am thinking about
either relaxing the check to exclude softirq or providing a
read_lock_unfair() variant for networking use. I think tasklist_lock
isn't taken from softirq context, but I may be wrong. Providing a
read_lock_unfair() will also require quite a bit of work in the
supporting infrastructure.

Cheers,
Longman



* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 14:43   ` Waiman Long
@ 2022-06-17 14:57     ` Shakeel Butt
  2022-06-17 15:00       ` Waiman Long
  0 siblings, 1 reply; 22+ messages in thread
From: Shakeel Butt @ 2022-06-17 14:57 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Eric Dumazet, Linus Torvalds, linux-kernel,
	Eric Dumazet, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 7:43 AM Waiman Long <longman@redhat.com> wrote:
>
> On 6/17/22 08:07, Peter Zijlstra wrote:
> > On Fri, Jun 17, 2022 at 02:10:39AM -0700, Eric Dumazet wrote:
> >> --- a/kernel/locking/qrwlock.c
> >> +++ b/kernel/locking/qrwlock.c
> >> @@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
> >>      /*
> >>       * Readers come here when they cannot get the lock without waiting
> >>       */
> >> -    if (unlikely(in_interrupt())) {
> >> -            /*
> >> -             * Readers in interrupt context will get the lock immediately
> >> -             * if the writer is just waiting (not holding the lock yet),
> >> -             * so spin with ACQUIRE semantics until the lock is available
> >> -             * without waiting in the queue.
> >> -             */
> >> -            atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
> >> -            return;
> >> -    }
> >>      atomic_sub(_QR_BIAS, &lock->cnts);
> >>
> >>      trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);
> > This is known to break tasklist_lock.
> >
> We certainly can't break the current usage of tasklist_lock.
>
> I am aware of this problem with networking code and is thinking about
> either relaxing the check to exclude softirq or provide a
> read_lock_unfair() variant for networking use.

read_lock_unfair() for networking use or tasklist_lock use?

> I think tasklist_lock
> isn't taken from softirq context, but I may be wrong. Providing a
> read_lock_unfair() will require quite a bit of work in the supporting
> infrastructure as well.
>
> Cheers,
> Longman
>


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 14:57     ` Shakeel Butt
@ 2022-06-17 15:00       ` Waiman Long
  2022-06-17 15:24         ` Eric Dumazet
  0 siblings, 1 reply; 22+ messages in thread
From: Waiman Long @ 2022-06-17 15:00 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Peter Zijlstra, Eric Dumazet, Linus Torvalds, linux-kernel,
	Eric Dumazet, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On 6/17/22 10:57, Shakeel Butt wrote:
> On Fri, Jun 17, 2022 at 7:43 AM Waiman Long <longman@redhat.com> wrote:
>> On 6/17/22 08:07, Peter Zijlstra wrote:
>>> On Fri, Jun 17, 2022 at 02:10:39AM -0700, Eric Dumazet wrote:
>>>> --- a/kernel/locking/qrwlock.c
>>>> +++ b/kernel/locking/qrwlock.c
>>>> @@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
>>>>       /*
>>>>        * Readers come here when they cannot get the lock without waiting
>>>>        */
>>>> -    if (unlikely(in_interrupt())) {
>>>> -            /*
>>>> -             * Readers in interrupt context will get the lock immediately
>>>> -             * if the writer is just waiting (not holding the lock yet),
>>>> -             * so spin with ACQUIRE semantics until the lock is available
>>>> -             * without waiting in the queue.
>>>> -             */
>>>> -            atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
>>>> -            return;
>>>> -    }
>>>>       atomic_sub(_QR_BIAS, &lock->cnts);
>>>>
>>>>       trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);
>>> This is known to break tasklist_lock.
>>>
>> We certainly can't break the current usage of tasklist_lock.
>>
>> I am aware of this problem with networking code and is thinking about
>> either relaxing the check to exclude softirq or provide a
>> read_lock_unfair() variant for networking use.
> read_lock_unfair() for networking use or tasklist_lock use?

I meant to say read_lock_fair(), but it could also be the other way
around. Thanks for spotting that.

Cheers,
Longman



* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 15:00       ` Waiman Long
@ 2022-06-17 15:24         ` Eric Dumazet
  2022-06-17 15:56           ` Peter Zijlstra
  2022-06-17 17:41           ` Waiman Long
  0 siblings, 2 replies; 22+ messages in thread
From: Eric Dumazet @ 2022-06-17 15:24 UTC (permalink / raw)
  To: Waiman Long
  Cc: Shakeel Butt, Peter Zijlstra, Eric Dumazet, Linus Torvalds,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 5:00 PM Waiman Long <longman@redhat.com> wrote:
>
> On 6/17/22 10:57, Shakeel Butt wrote:
> > On Fri, Jun 17, 2022 at 7:43 AM Waiman Long <longman@redhat.com> wrote:
> >> On 6/17/22 08:07, Peter Zijlstra wrote:
> >>> On Fri, Jun 17, 2022 at 02:10:39AM -0700, Eric Dumazet wrote:
> >>>> --- a/kernel/locking/qrwlock.c
> >>>> +++ b/kernel/locking/qrwlock.c
> >>>> @@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
> >>>>       /*
> >>>>        * Readers come here when they cannot get the lock without waiting
> >>>>        */
> >>>> -    if (unlikely(in_interrupt())) {
> >>>> -            /*
> >>>> -             * Readers in interrupt context will get the lock immediately
> >>>> -             * if the writer is just waiting (not holding the lock yet),
> >>>> -             * so spin with ACQUIRE semantics until the lock is available
> >>>> -             * without waiting in the queue.
> >>>> -             */
> >>>> -            atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
> >>>> -            return;
> >>>> -    }
> >>>>       atomic_sub(_QR_BIAS, &lock->cnts);
> >>>>
> >>>>       trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);
> >>> This is known to break tasklist_lock.
> >>>
> >> We certainly can't break the current usage of tasklist_lock.
> >>
> >> I am aware of this problem with networking code and is thinking about
> >> either relaxing the check to exclude softirq or provide a
> >> read_lock_unfair() variant for networking use.
> > read_lock_unfair() for networking use or tasklist_lock use?
>
> I mean to say read_lock_fair(), but it could also be the other way
> around. Thanks for spotting that.
>

If only tasklist_lock is problematic and needs the unfair variant,
then changing the few read_lock() calls on tasklist_lock will be less
invasive than changing ~1000 read_lock() calls elsewhere....


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 15:24         ` Eric Dumazet
@ 2022-06-17 15:56           ` Peter Zijlstra
  2022-06-17 17:41           ` Waiman Long
  1 sibling, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2022-06-17 15:56 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Waiman Long, Shakeel Butt, Eric Dumazet, Linus Torvalds,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 05:24:58PM +0200, Eric Dumazet wrote:

> If only tasklist_lock is problematic and needs the unfair variant,
> then changing a few read_lock() for tasklist_lock will be less
> invasive than ~1000 read_lock() elsewhere....

This is unknown. tasklist_lock was the obvious one; there might be more.
Lockdep should be able to tell you.


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 15:24         ` Eric Dumazet
  2022-06-17 15:56           ` Peter Zijlstra
@ 2022-06-17 17:41           ` Waiman Long
  2022-06-17 17:45             ` Eric Dumazet
  1 sibling, 1 reply; 22+ messages in thread
From: Waiman Long @ 2022-06-17 17:41 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Shakeel Butt, Peter Zijlstra, Eric Dumazet, Linus Torvalds,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On 6/17/22 11:24, Eric Dumazet wrote:
> On Fri, Jun 17, 2022 at 5:00 PM Waiman Long <longman@redhat.com> wrote:
>> On 6/17/22 10:57, Shakeel Butt wrote:
>>> On Fri, Jun 17, 2022 at 7:43 AM Waiman Long <longman@redhat.com> wrote:
>>>> On 6/17/22 08:07, Peter Zijlstra wrote:
>>>>> On Fri, Jun 17, 2022 at 02:10:39AM -0700, Eric Dumazet wrote:
>>>>>> --- a/kernel/locking/qrwlock.c
>>>>>> +++ b/kernel/locking/qrwlock.c
>>>>>> @@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
>>>>>>        /*
>>>>>>         * Readers come here when they cannot get the lock without waiting
>>>>>>         */
>>>>>> -    if (unlikely(in_interrupt())) {
>>>>>> -            /*
>>>>>> -             * Readers in interrupt context will get the lock immediately
>>>>>> -             * if the writer is just waiting (not holding the lock yet),
>>>>>> -             * so spin with ACQUIRE semantics until the lock is available
>>>>>> -             * without waiting in the queue.
>>>>>> -             */
>>>>>> -            atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
>>>>>> -            return;
>>>>>> -    }
>>>>>>        atomic_sub(_QR_BIAS, &lock->cnts);
>>>>>>
>>>>>>        trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);
>>>>> This is known to break tasklist_lock.
>>>>>
>>>> We certainly can't break the current usage of tasklist_lock.
>>>>
>>>> I am aware of this problem with networking code and is thinking about
>>>> either relaxing the check to exclude softirq or provide a
>>>> read_lock_unfair() variant for networking use.
>>> read_lock_unfair() for networking use or tasklist_lock use?
>> I mean to say read_lock_fair(), but it could also be the other way
>> around. Thanks for spotting that.
>>
> If only tasklist_lock is problematic and needs the unfair variant,
> then changing a few read_lock() for tasklist_lock will be less
> invasive than ~1000 read_lock() elsewhere....

On second thought, I think the right way is to introduce a fair
variant, if needed. If an arch isn't using qrwlock, the native rwlock
implementation will be unfair. In that sense, the unfair rwlock is the
default. We would only need to change the relevant network read_lock()
calls to use the fair variant, which will still be unfair if qrwlock
isn't used. We are not going to touch other read_lock() calls that
don't care about fairness.

Cheers,
Longman



* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 17:41           ` Waiman Long
@ 2022-06-17 17:45             ` Eric Dumazet
  2022-06-17 18:57               ` Waiman Long
                                 ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Eric Dumazet @ 2022-06-17 17:45 UTC (permalink / raw)
  To: Waiman Long
  Cc: Shakeel Butt, Peter Zijlstra, Eric Dumazet, Linus Torvalds,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 7:42 PM Waiman Long <longman@redhat.com> wrote:
>
> On 6/17/22 11:24, Eric Dumazet wrote:
> > On Fri, Jun 17, 2022 at 5:00 PM Waiman Long <longman@redhat.com> wrote:
> >> On 6/17/22 10:57, Shakeel Butt wrote:
> >>> On Fri, Jun 17, 2022 at 7:43 AM Waiman Long <longman@redhat.com> wrote:
> >>>> On 6/17/22 08:07, Peter Zijlstra wrote:
> >>>>> On Fri, Jun 17, 2022 at 02:10:39AM -0700, Eric Dumazet wrote:
> >>>>>> --- a/kernel/locking/qrwlock.c
> >>>>>> +++ b/kernel/locking/qrwlock.c
> >>>>>> @@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
> >>>>>>        /*
> >>>>>>         * Readers come here when they cannot get the lock without waiting
> >>>>>>         */
> >>>>>> -    if (unlikely(in_interrupt())) {
> >>>>>> -            /*
> >>>>>> -             * Readers in interrupt context will get the lock immediately
> >>>>>> -             * if the writer is just waiting (not holding the lock yet),
> >>>>>> -             * so spin with ACQUIRE semantics until the lock is available
> >>>>>> -             * without waiting in the queue.
> >>>>>> -             */
> >>>>>> -            atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
> >>>>>> -            return;
> >>>>>> -    }
> >>>>>>        atomic_sub(_QR_BIAS, &lock->cnts);
> >>>>>>
> >>>>>>        trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);
> >>>>> This is known to break tasklist_lock.
> >>>>>
> >>>> We certainly can't break the current usage of tasklist_lock.
> >>>>
> >>>> I am aware of this problem with networking code and is thinking about
> >>>> either relaxing the check to exclude softirq or provide a
> >>>> read_lock_unfair() variant for networking use.
> >>> read_lock_unfair() for networking use or tasklist_lock use?
> >> I mean to say read_lock_fair(), but it could also be the other way
> >> around. Thanks for spotting that.
> >>
> > If only tasklist_lock is problematic and needs the unfair variant,
> > then changing a few read_lock() for tasklist_lock will be less
> > invasive than ~1000 read_lock() elsewhere....
>
> After a second thought, I think the right way is to introduce a fair
> variant, if needed. If an arch isn't using qrwlock, the native rwlock
> implementation will be unfair. In that sense, unfair rwlock is the
> default. We will only need to change the relevant network read_lock()
> calls to use the fair variant which will still be unfair if qrwlock
> isn't used. We are not going to touch other read_lock call that don't
> care about fair or unfair.
>

Hmm... backporting this kind of invasive change to stable kernels will
be a daunting task.

Were rwlocks always unfair, and have we just been lucky?


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 17:45             ` Eric Dumazet
@ 2022-06-17 18:57               ` Waiman Long
  2022-06-17 19:04               ` Peter Zijlstra
  2022-06-17 19:08               ` Linus Torvalds
  2 siblings, 0 replies; 22+ messages in thread
From: Waiman Long @ 2022-06-17 18:57 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Shakeel Butt, Peter Zijlstra, Eric Dumazet, Linus Torvalds,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On 6/17/22 13:45, Eric Dumazet wrote:
> On Fri, Jun 17, 2022 at 7:42 PM Waiman Long <longman@redhat.com> wrote:
>> On 6/17/22 11:24, Eric Dumazet wrote:
>>> On Fri, Jun 17, 2022 at 5:00 PM Waiman Long <longman@redhat.com> wrote:
>>>> On 6/17/22 10:57, Shakeel Butt wrote:
>>>>> On Fri, Jun 17, 2022 at 7:43 AM Waiman Long <longman@redhat.com> wrote:
>>>>>> On 6/17/22 08:07, Peter Zijlstra wrote:
>>>>>>> On Fri, Jun 17, 2022 at 02:10:39AM -0700, Eric Dumazet wrote:
>>>>>>>> --- a/kernel/locking/qrwlock.c
>>>>>>>> +++ b/kernel/locking/qrwlock.c
>>>>>>>> @@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
>>>>>>>>         /*
>>>>>>>>          * Readers come here when they cannot get the lock without waiting
>>>>>>>>          */
>>>>>>>> -    if (unlikely(in_interrupt())) {
>>>>>>>> -            /*
>>>>>>>> -             * Readers in interrupt context will get the lock immediately
>>>>>>>> -             * if the writer is just waiting (not holding the lock yet),
>>>>>>>> -             * so spin with ACQUIRE semantics until the lock is available
>>>>>>>> -             * without waiting in the queue.
>>>>>>>> -             */
>>>>>>>> -            atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
>>>>>>>> -            return;
>>>>>>>> -    }
>>>>>>>>         atomic_sub(_QR_BIAS, &lock->cnts);
>>>>>>>>
>>>>>>>>         trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);
>>>>>>> This is known to break tasklist_lock.
>>>>>>>
>>>>>> We certainly can't break the current usage of tasklist_lock.
>>>>>>
>>>>>> I am aware of this problem with networking code and is thinking about
>>>>>> either relaxing the check to exclude softirq or provide a
>>>>>> read_lock_unfair() variant for networking use.
>>>>> read_lock_unfair() for networking use or tasklist_lock use?
>>>> I mean to say read_lock_fair(), but it could also be the other way
>>>> around. Thanks for spotting that.
>>>>
>>> If only tasklist_lock is problematic and needs the unfair variant,
>>> then changing a few read_lock() for tasklist_lock will be less
>>> invasive than ~1000 read_lock() elsewhere....
>> After a second thought, I think the right way is to introduce a fair
>> variant, if needed. If an arch isn't using qrwlock, the native rwlock
>> implementation will be unfair. In that sense, unfair rwlock is the
>> default. We will only need to change the relevant network read_lock()
>> calls to use the fair variant which will still be unfair if qrwlock
>> isn't used. We are not going to touch other read_lock call that don't
>> care about fair or unfair.
>>
> Hmm... backporting this kind of invasive change to stable kernels will
> be a daunting task.
>
> Were rwlocks always unfair, and we have been lucky ?
>
Yes, rwlocks were always unfair, and they always had this kind of soft
lockup problem, as well as a scalability problem due to cacheline
bouncing. That was the reason for creating qrwlock, which can at least
provide a fair rwlock in task context. Now we have systems with more
and more CPUs, and that is why you are seeing it all over again with
the networking code.

Cheers,
Longman



* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 17:45             ` Eric Dumazet
  2022-06-17 18:57               ` Waiman Long
@ 2022-06-17 19:04               ` Peter Zijlstra
  2022-06-17 19:10                 ` Eric Dumazet
  2022-06-17 19:08               ` Linus Torvalds
  2 siblings, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2022-06-17 19:04 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Waiman Long, Shakeel Butt, Eric Dumazet, Linus Torvalds,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 07:45:14PM +0200, Eric Dumazet wrote:

> Were rwlocks always unfair, and we have been lucky ?

Yes.


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 17:45             ` Eric Dumazet
  2022-06-17 18:57               ` Waiman Long
  2022-06-17 19:04               ` Peter Zijlstra
@ 2022-06-17 19:08               ` Linus Torvalds
  2 siblings, 0 replies; 22+ messages in thread
From: Linus Torvalds @ 2022-06-17 19:08 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Waiman Long, Shakeel Butt, Peter Zijlstra, Eric Dumazet,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 12:45 PM Eric Dumazet <edumazet@google.com> wrote:
>
> Were rwlocks always unfair, and we have been lucky ?

Yes, rwlocks have always been unfair.

And the unfairness is very fundamental: it is what allows you to take
a read-lock without disabling interrupts (or softirqs), while still
allowing interrupts to *also* take the lock for reading.

That's actually a big deal for some things, with the tasklist lock
being the traditional thing, but there could be others.

Because with a fair lock, you'd have deadlocks when CPU1 gets the lock
for reading, then another CPU blocks on writing (with interrupts
disabled), and then CPU1 takes an interrupt and wants to read again.

This is not going to change.

If you want fair rwlocks, you have a couple of options:

 - add a new explicitly fair version.

 - don't use rwlocks from irq or softirq context

 - use another lock entirely (spinlocks are fair, and often perform better)

You do *not* get to change behavior that has been there since day#1
and that very core code very much depends on.

In fact, your use of read_lock() from softirq context makes me suspect
that you yourself might be in danger of deadlocking due to that "look,
another CPU is trying to get the write lock" deadlock situation.

Because the only way to fix that deadlock is

 (a) unfair rwlocks like we have now

 (b) having to disable interrupts around all read-locked regions

and (b) often basically means that the whole point of using a rwlock
goes away, because it is now much more expensive.

             Linus


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 19:04               ` Peter Zijlstra
@ 2022-06-17 19:10                 ` Eric Dumazet
  2022-06-17 19:19                   ` Linus Torvalds
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2022-06-17 19:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Shakeel Butt, Eric Dumazet, Linus Torvalds,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 9:04 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Jun 17, 2022 at 07:45:14PM +0200, Eric Dumazet wrote:
>
> > Were rwlocks always unfair, and we have been lucky ?
>
> Yes.

So I wonder why we replaced the eventpoll spinlock with an rwlock.

Writing a repro is probably more difficult for eventpoll, though.

We definitely had production issues after commit a218cc491420.


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 19:10                 ` Eric Dumazet
@ 2022-06-17 19:19                   ` Linus Torvalds
  2022-06-17 19:25                     ` Eric Dumazet
  2022-06-17 19:34                     ` Shakeel Butt
  0 siblings, 2 replies; 22+ messages in thread
From: Linus Torvalds @ 2022-06-17 19:19 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Peter Zijlstra, Waiman Long, Shakeel Butt, Eric Dumazet,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 2:10 PM Eric Dumazet <edumazet@google.com> wrote:
>
> So I wonder why we replaced eventpoll spinlock with an rwlock.

Yeah, usually we've actually gone the other way.

Spinning rwlocks are seldom a big win, unless you can get some
secondary indirect win out of them.

That secondary win is often:

 (a) unfairness is usually very good for throughput (iow, the very
unfairness that you hit may *be* the reason why it looked good in some
benchmark, and people decided "ok, let's do this").

 (b) the special case of "interrupts take the lock for reading only"
thing that allows other readers to not disable interrupts

IOW, the win of a spinning rwlock is not necessarily the "we allow
multiple concurrent readers" that you'd expect, because if you have
small sections of code you protect, that just isn't a big deal, and
the costs are in the lock bouncing etc.

It's also worth pointing out that rwlocks are only unfair *if* they
hit that "reader from (soft)interrupt" case. Which means that such
cases *really* had better either have very very short locked regions
(with interrupts disabled), or they really need that (b) part above.

And yes, the tasklist lock really needs the (b) part above. Disabling
interrupts for task traversal would be completely and entirely
unacceptable, because the traversal can actually be fairly expensive
(lots and lots of threads).

I suspect eventpoll just did the wrong thing.

              Linus


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 19:19                   ` Linus Torvalds
@ 2022-06-17 19:25                     ` Eric Dumazet
  2022-06-17 19:34                       ` Linus Torvalds
  2022-06-17 19:34                     ` Shakeel Butt
  1 sibling, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2022-06-17 19:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Waiman Long, Shakeel Butt, Eric Dumazet,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 9:19 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, Jun 17, 2022 at 2:10 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > So I wonder why we replaced eventpoll spinlock with an rwlock.
>
> Yeah, usually we've actually gone the other way.
>
> Spinning rwlocks are seldom a big win, unless you can get some
> secondary indirect win out of them.
>
> That secondary win is often:
>
>  (a) unfairness is usually very good for throughput (iow, the very
> unfairness that you hit may *be* the reason why it looked good in some
> benchmark, and people decided "ok, let's do this").
>
>  (b) the special case of "interrupts take the lock for reading only"
> thing that allows other readers to not disable interrupts
>
> IOW, the win of a spinning rwlock is not necessarily the "we allow
> multiple concurrent readers" that you'd expect, because if you have
> small sections of code you protect, that just isn't a big deal, and
> the costs are in the lock bouncing etc.
>
> It's also worth pointing out that rwlocks are only unfair *if* they
> hit that "reader from (soft)interrupt" case. Which means that such
> cases *really* had better either have very very short locked regions
> (with interrupts disabled), or they really need that (b) part above.
>
> And yes, the tasklist lock really needs the (b) part above. Disabling
> interrupts for task traversal would be completely and entirely
> unacceptable, because the traversal can actually be fairly expensive
> (lots and lots of threads).

Interesting...

 I think getrusage(RUSAGE_SELF) is blocking interrupts in a
possibly long loop:

do {
      accumulate_thread_rusage(t, r);
} while_each_thread(p, t);
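
For context, the surrounding locking has roughly this shape (a simplified sketch of the k_getrusage() path, from memory; error handling and the other RUSAGE_* cases are elided, so do not treat this as a verbatim copy of kernel/sys.c):

```c
/* Simplified sketch: the whole thread traversal runs with IRQs off. */
unsigned long flags;

if (lock_task_sighand(p, &flags)) {      /* takes siglock, disables IRQs */
	t = p;
	do {
		accumulate_thread_rusage(t, r);
	} while_each_thread(p, t);       /* O(threads), IRQs still off */
	unlock_task_sighand(p, &flags);  /* IRQs re-enabled only here */
}
```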



>
> I suspect eventpoll just did the wrong thing.
>
>               Linus


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 19:19                   ` Linus Torvalds
  2022-06-17 19:25                     ` Eric Dumazet
@ 2022-06-17 19:34                     ` Shakeel Butt
  1 sibling, 0 replies; 22+ messages in thread
From: Shakeel Butt @ 2022-06-17 19:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric Dumazet, Peter Zijlstra, Waiman Long, Eric Dumazet,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 12:19 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, Jun 17, 2022 at 2:10 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > So I wonder why we replaced eventpoll spinlock with an rwlock.
>
> Yeah, usually we've actually gone the other way.
>
> Spinning rwlocks are seldom a big win, unless you can get some
> secondary indirect win out of them.
>
> That secondary win is often:
>
>  (a) unfairness is usually very good for throughput (iow, the very
> unfairness that you hit may *be* the reason why it looked good in some
> benchmark, and people decided "ok, let's do this").
>
>  (b) the special case of "interrupts take the lock for reading only"
> thing that allows other readers to not disable interrupts
>
> IOW, the win of a spinning rwlock is not necessarily the "we allow
> multiple concurrent readers" that you'd expect, because if you have
> small sections of code you protect, that just isn't a big deal, and
> the costs are in the lock bouncing etc.
>
> It's also worth pointing out that rwlocks are only unfair *if* they
> hit that "reader from (soft)interrupt" case. Which means that such
> cases *really* had better either have very very short locked regions
> (with interrupts disabled), or they really need that (b) part above.
>
> And yes, the tasklist lock really needs the (b) part above. Disabling
> interrupts for task traversal would be completely and entirely
> unacceptable, because the traversal can actually be fairly expensive
> (lots and lots of threads).
>
> I suspect eventpoll just did the wrong thing.
>

In addition, the commit log of commit a218cc491420 has this comment:
"(I assume that write side of a rwlock does not starve, it seems
qrwlock implementation has these guarantees)."

Since this assumption seems incorrect, is there any objection to
reverting commit a218cc491420? Or do we need more
evidence/arguments/repro?


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 19:25                     ` Eric Dumazet
@ 2022-06-17 19:34                       ` Linus Torvalds
  2022-06-17 19:39                         ` Eric Dumazet
  0 siblings, 1 reply; 22+ messages in thread
From: Linus Torvalds @ 2022-06-17 19:34 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Peter Zijlstra, Waiman Long, Shakeel Butt, Eric Dumazet,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 2:25 PM Eric Dumazet <edumazet@google.com> wrote:
>
> Interesting...
>
>  I think getrusage(RUSAGE_SELF) is blocking interrupts in a
> possibly long loop:

Yeah, that looks bad.

It needs that interrupt disable due to sighand->siglock, but normally
we would expect to *not* have a big loop inside the siglock.

Nasty.

I wonder if this is possibly a situation where we should actually make
siglock be a rwlock.

But considering that this RUSAGE_SELF is hopefully a special case,
maybe we could write it differently.

Instead of taking the sighand lock, we might be able to iterate just
over the regular thread list (using the tasklist lock), and then do
the "does sighand match" as a one-off check in
accumulate_thread_rusage().

It's not like we even really need that strict locking there, I suspect.
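
A hypothetical shape of that alternative (an untested sketch; the exact locking rules and whether this check suffices are assumptions, not a reviewed patch):

```c
/* Hypothetical: walk the thread list under tasklist_lock instead of
 * holding siglock with IRQs disabled for the whole traversal. */
read_lock(&tasklist_lock);
for_each_thread(p, t) {
	/* one-off "does sighand match" check, as suggested above */
	if (t->sighand == p->sighand)
		accumulate_thread_rusage(t, r);
}
read_unlock(&tasklist_lock);
```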

Anyway, I should have noted in my previous email that my "rwlock is
often not the win you'd think it is" comment is only true for this
*spinning* rwlock.

For the actual sleeping reader-writer lock (down_read/down_write and
friends), the whole "you can have multiple readers" is often a *huge*
deal and very central to using a rwlock. It's literally just the
spinning one that is often better as a spinlock unless you have those
magical reasons to use it.
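
Concretely, the two APIs being contrasted look like this (illustrative fragments only, not complete critical sections):

```c
/* Spinning rwlock_t: contended CPUs busy-wait; meant for short
 * critical sections and usable from interrupt context. */
read_lock(&rwlock);
/* ... short section ... */
read_unlock(&rwlock);

/* Sleeping rw_semaphore: contended tasks block instead of spinning,
 * so longer reader-side sections with many concurrent readers can
 * be a genuine win. */
down_read(&rwsem);
/* ... potentially longer section ... */
up_read(&rwsem);
```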

              Linus


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 19:34                       ` Linus Torvalds
@ 2022-06-17 19:39                         ` Eric Dumazet
  2022-06-17 19:48                           ` Linus Torvalds
  2022-06-17 21:58                           ` David Laight
  0 siblings, 2 replies; 22+ messages in thread
From: Eric Dumazet @ 2022-06-17 19:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Waiman Long, Shakeel Butt, Eric Dumazet,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 9:35 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, Jun 17, 2022 at 2:25 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > Interesting...
> >
>  I think getrusage(RUSAGE_SELF) is blocking interrupts in a
> possibly long loop:
>
> Yeah, that looks bad.
>
> It needs that interrupt disable due to sighand->siglock, but normally
> we would expect to *not* have a big loop inside the siglock.
>
> Nasty.
>
> I wonder if this is possibly a situation where we should actually make
> siglock be a rwlock.
>
> But considering that this RUSAGE_SELF is hopefully a special case,
> maybe we could write it differently.
>
> Instead of taking the sighand lock, we might be able to iterate just
> over the regular thread list (using the tasklist lock), and then do
> the "does sighand match" as a one-off check in
> accumulate_thread_rusage().
>
> It's not like we even really need that strict locking there, I suspect.
>
> Anyway, I should have noted in my previous email that my "rwlock is
> often not the win you'd think it is" comment is only true for this
> *spinning* rwlock.
>
> For the actual sleeping reader-writer lock (down_read/down_write and
> friends), the whole "you can have multiple readers" is often a *huge*
> deal and very central to using a rwlock. It's literally just the
> spinning one that is often better as a spinlock unless you have those
> magical reasons to use it.
>

I am converting RAW sockets to RCU.

We will likely need to use RCU in place of rwlock in most networking code.


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 19:39                         ` Eric Dumazet
@ 2022-06-17 19:48                           ` Linus Torvalds
  2022-06-20  7:25                             ` Peter Zijlstra
  2022-06-21 16:55                             ` Eric W. Biederman
  2022-06-17 21:58                           ` David Laight
  1 sibling, 2 replies; 22+ messages in thread
From: Linus Torvalds @ 2022-06-17 19:48 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Peter Zijlstra, Waiman Long, Shakeel Butt, Eric Dumazet,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 2:39 PM Eric Dumazet <edumazet@google.com> wrote:
>
> I am converting RAW sockets to RCU.

RCU is usually absolutely the best approach. Use regular spinlocks for
writers, and RCU for readers.
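
The usual shape of that pattern in kernel code (a generic sketch; the list and identifier names are placeholders, not code from any particular subsystem):

```c
/* Writers: serialized by a plain spinlock. */
spin_lock(&table_lock);
hlist_add_head_rcu(&item->node, &head);
spin_unlock(&table_lock);

/* Readers (including softirq context): no lock taken at all, so a
 * flood of readers can no longer starve a writer. */
rcu_read_lock();
hlist_for_each_entry_rcu(item, &head, node) {
	if (match(item, key))
		break;
}
rcu_read_unlock();
```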

I'd love to see the tasklist_lock be converted to RCU too.  But that
lock predates RCU (and probably 99% of all kernel code), and it's
messy, so nobody sane has ever willingly tried to do that afaik.

               Linus


* RE: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 19:39                         ` Eric Dumazet
  2022-06-17 19:48                           ` Linus Torvalds
@ 2022-06-17 21:58                           ` David Laight
  1 sibling, 0 replies; 22+ messages in thread
From: David Laight @ 2022-06-17 21:58 UTC (permalink / raw)
  To: 'Eric Dumazet', Linus Torvalds
  Cc: Peter Zijlstra, Waiman Long, Shakeel Butt, Eric Dumazet,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

From: Eric Dumazet
> Sent: 17 June 2022 20:39
...
> I am converting RAW sockets to RCU.

I presume that is because you are shoving a lot of packets
through them?

Will that remove the horrid delayed free of the routing
table entry that gets created for IPv4 with 'hdrinc'
because the address in the packet might not match that
in the address buffer?

I've seen the softirq code spend ages doing the final frees.

We have good reasons for using raw sockets to send a lot of UDP.
But 'hdrinc' has to be used to get the UDP checksum right.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 19:48                           ` Linus Torvalds
@ 2022-06-20  7:25                             ` Peter Zijlstra
  2022-06-21 16:55                             ` Eric W. Biederman
  1 sibling, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2022-06-20  7:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric Dumazet, Waiman Long, Shakeel Butt, Eric Dumazet,
	linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

On Fri, Jun 17, 2022 at 02:48:08PM -0500, Linus Torvalds wrote:

> I'd love to see the tasklist_lock be converted to RCU too.  But that
> lock predates RCU (and probably 99% of all kernel code), and it's
> messy, so nobody sane has ever willingly tried to do that afaik.

Thomas actually tried a few years ago (for RT, which also hates on RW
type locks). At the time he converted most users to RCU (since the
tasklist itself is also RCU protected), but there's a bunch of users
(more than you'd like) that really need to be exclusive vs fork.

IIRC the most prominent problem is that RCU iteration can miss incoming
tasks and they'll miss getting updated. But like I said, it was a few
years ago, so I'm a bit hazy on the details.


* Re: [PATCH] locking/rwlocks: do not starve writers
  2022-06-17 19:48                           ` Linus Torvalds
  2022-06-20  7:25                             ` Peter Zijlstra
@ 2022-06-21 16:55                             ` Eric W. Biederman
  1 sibling, 0 replies; 22+ messages in thread
From: Eric W. Biederman @ 2022-06-21 16:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric Dumazet, Peter Zijlstra, Waiman Long, Shakeel Butt,
	Eric Dumazet, linux-kernel, Ingo Molnar, Boqun Feng, Will Deacon,
	Roman Penyaev

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Fri, Jun 17, 2022 at 2:39 PM Eric Dumazet <edumazet@google.com> wrote:
>>
>> I am converting RAW sockets to RCU.
>
> RCU is usually absolutely the best approach. Use regular spinlocks for
> writers, and RCU for readers.
>
> I'd love to see the tasklist_lock be converted to RCU too.  But that
> lock predates RCU (and probably 99% of all kernel code), and it's
> messy, so nobody sane has ever willingly tried to do that afaik.

Well sort of.  I converted proc many many years ago.

Like Peter mentioned the big obvious challenge for converting
signal delivery to something else is the atomic delivery aspect.

I am playing with it, and I think I see how to convert signal delivery.
Something like a quick grab of a lock that updates struct pid and
creates a list of signals that are pending delivery.  Plus code that
forces clone to deliver the pending signal before clone creates a new
task.

Plus something like a generation counter so I can see when pulling the
signal in clone if the signal has already been delivered.

I think tasks exiting before getting a signal is ok, and does not need
any code.


I have some patches that are almost working that can use siglock to
protect the parent/child/ptrace relationship for SIGCHLD processing.
Which will remove the pressure on tasklist_lock when I get them sorted.


Not that any of this will kill tasklist_lock, but with a little luck we
can get to short, deterministic hold times.

Eric


end of thread, other threads:[~2022-06-21 16:55 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-17  9:10 [PATCH] locking/rwlocks: do not starve writers Eric Dumazet
2022-06-17 12:07 ` Peter Zijlstra
2022-06-17 14:43   ` Waiman Long
2022-06-17 14:57     ` Shakeel Butt
2022-06-17 15:00       ` Waiman Long
2022-06-17 15:24         ` Eric Dumazet
2022-06-17 15:56           ` Peter Zijlstra
2022-06-17 17:41           ` Waiman Long
2022-06-17 17:45             ` Eric Dumazet
2022-06-17 18:57               ` Waiman Long
2022-06-17 19:04               ` Peter Zijlstra
2022-06-17 19:10                 ` Eric Dumazet
2022-06-17 19:19                   ` Linus Torvalds
2022-06-17 19:25                     ` Eric Dumazet
2022-06-17 19:34                       ` Linus Torvalds
2022-06-17 19:39                         ` Eric Dumazet
2022-06-17 19:48                           ` Linus Torvalds
2022-06-20  7:25                             ` Peter Zijlstra
2022-06-21 16:55                             ` Eric W. Biederman
2022-06-17 21:58                           ` David Laight
2022-06-17 19:34                     ` Shakeel Butt
2022-06-17 19:08               ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).