linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running
@ 2015-03-07  7:45 Jason Low
  2015-03-07  9:21 ` Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Jason Low @ 2015-03-07  7:45 UTC (permalink / raw)
  To: Linus Torvalds, Ingo Molnar, Peter Zijlstra
  Cc: Davidlohr Bueso, Tim Chen, Paul E. McKenney, Michel Lespinasse,
	Sasha Levin, LKML, Dave Jones, Ming Lei, Jason Low

Fixes tip commit b3fd4f03ca0b (locking/rwsem: Avoid deceiving lock spinners).

Ming reported soft lockups occurring when running xfstest due to
commit b3fd4f03ca0b.

When doing optimistic spinning in rwsem, threads should stop spinning when
the lock owner is not running. While a thread is spinning on owner, if
the owner reschedules, owner->on_cpu returns false and we stop spinning.

However, commit b3fd4f03ca0b essentially caused the check to get ignored
because when we break out of the spin loop due to !on_cpu, we continue
spinning if sem->owner != NULL.

This patch fixes this by making sure we stop spinning if the owner is not
running. Furthermore, just like with mutexes, refactor the code such that
we don't have separate checks for owner_running(). This makes it more
straightforward in terms of why we exit the spin on owner loop and we
would also avoid needing to "guess" why we broke out of the loop to make
this more readable.

Reported-and-tested-by: Ming Lei <ming.lei@canonical.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Jason Low <jason.low2@hp.com>
---
 kernel/locking/rwsem-xadd.c |   31 +++++++++++--------------------
 1 files changed, 11 insertions(+), 20 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 06e2214..3417d01 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -324,32 +324,23 @@ done:
 	return ret;
 }
 
-static inline bool owner_running(struct rw_semaphore *sem,
-				 struct task_struct *owner)
-{
-	if (sem->owner != owner)
-		return false;
-
-	/*
-	 * Ensure we emit the owner->on_cpu, dereference _after_ checking
-	 * sem->owner still matches owner, if that fails, owner might
-	 * point to free()d memory, if it still matches, the rcu_read_lock()
-	 * ensures the memory stays valid.
-	 */
-	barrier();
-
-	return owner->on_cpu;
-}
-
 static noinline
 bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
 {
 	long count;
 
 	rcu_read_lock();
-	while (owner_running(sem, owner)) {
-		/* abort spinning when need_resched */
-		if (need_resched()) {
+	while (sem->owner == owner) {
+		/*
+		 * Ensure we emit the owner->on_cpu, dereference _after_
+		 * checking sem->owner still matches owner, if that fails,
+		 * owner might point to free()d memory, if it still matches,
+		 * the rcu_read_lock() ensures the memory stays valid.
+		 */
+		barrier();
+
+		/* abort spinning when need_resched or owner is not running */
+		if (!owner->on_cpu || need_resched()) {
 			rcu_read_unlock();
 			return false;
 		}
-- 
1.7.2.5




^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running
  2015-03-07  7:45 [PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running Jason Low
@ 2015-03-07  9:21 ` Peter Zijlstra
  2015-03-09 17:42   ` Jason Low
  2015-03-07 16:43 ` [tip:locking/core] " tip-bot for Jason Low
  2015-03-07 18:17 ` [PATCH] " Sasha Levin
  2 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2015-03-07  9:21 UTC (permalink / raw)
  To: Jason Low
  Cc: Linus Torvalds, Ingo Molnar, Davidlohr Bueso, Tim Chen,
	Paul E. McKenney, Michel Lespinasse, Sasha Levin, LKML,
	Dave Jones, Ming Lei

On Fri, Mar 06, 2015 at 11:45:31PM -0800, Jason Low wrote:
>  static noinline
>  bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
>  {
>  	long count;
>  
>  	rcu_read_lock();
> +	while (sem->owner == owner) {
> +		/*
> +		 * Ensure we emit the owner->on_cpu, dereference _after_
> +		 * checking sem->owner still matches owner, if that fails,
> +		 * owner might point to free()d memory, if it still matches,
> +		 * the rcu_read_lock() ensures the memory stays valid.
> +		 */
> +		barrier();
> +
> +		/* abort spinning when need_resched or owner is not running */
> +		if (!owner->on_cpu || need_resched()) {
>  			rcu_read_unlock();
>  			return false;
>  		}


Thanks, looks good; do we want to change the mutex code (again) to more
closely resemble this too? It still has the while (true) instead of the
while(lock->owner != owner).



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tip:locking/core] locking/rwsem: Fix lock optimistic spinning when owner is not running
  2015-03-07  7:45 [PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running Jason Low
  2015-03-07  9:21 ` Peter Zijlstra
@ 2015-03-07 16:43 ` tip-bot for Jason Low
  2015-03-07 17:13   ` Oleg Nesterov
  2015-03-07 18:17 ` [PATCH] " Sasha Levin
  2 siblings, 1 reply; 13+ messages in thread
From: tip-bot for Jason Low @ 2015-03-07 16:43 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dave, hpa, sasha.levin, oleg, tglx, linux-kernel, jason.low2,
	walken, akpm, mingo, paulmck, davej, ming.lei, peterz, torvalds,
	tim.c.chen

Commit-ID:  9198f6edfd9ced74fd90b238d5a354aeac89bdfa
Gitweb:     http://git.kernel.org/tip/9198f6edfd9ced74fd90b238d5a354aeac89bdfa
Author:     Jason Low <jason.low2@hp.com>
AuthorDate: Fri, 6 Mar 2015 23:45:31 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 7 Mar 2015 09:50:49 +0100

locking/rwsem: Fix lock optimistic spinning when owner is not running

Ming reported soft lockups occurring when running xfstest due to
the following tip:locking/core commit:

  b3fd4f03ca0b ("locking/rwsem: Avoid deceiving lock spinners")

When doing optimistic spinning in rwsem, threads should stop
spinning when the lock owner is not running. While a thread is
spinning on owner, if the owner reschedules, owner->on_cpu
returns false and we stop spinning.

However, this commit essentially caused the check to get
ignored because when we break out of the spin loop due to
!on_cpu, we continue spinning if sem->owner != NULL.

This patch fixes this by making sure we stop spinning if the
owner is not running. Furthermore, just like with mutexes,
refactor the code such that we don't have separate checks for
owner_running(). This makes it more straightforward in terms of
why we exit the spin on owner loop and we would also avoid
needing to "guess" why we broke out of the loop to make this
more readable.

Reported-and-tested-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jason Low <jason.low2@hp.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/1425714331.2475.388.camel@j-VirtualBox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/locking/rwsem-xadd.c | 31 +++++++++++--------------------
 1 file changed, 11 insertions(+), 20 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 06e2214..3417d01 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -324,32 +324,23 @@ done:
 	return ret;
 }
 
-static inline bool owner_running(struct rw_semaphore *sem,
-				 struct task_struct *owner)
-{
-	if (sem->owner != owner)
-		return false;
-
-	/*
-	 * Ensure we emit the owner->on_cpu, dereference _after_ checking
-	 * sem->owner still matches owner, if that fails, owner might
-	 * point to free()d memory, if it still matches, the rcu_read_lock()
-	 * ensures the memory stays valid.
-	 */
-	barrier();
-
-	return owner->on_cpu;
-}
-
 static noinline
 bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
 {
 	long count;
 
 	rcu_read_lock();
-	while (owner_running(sem, owner)) {
-		/* abort spinning when need_resched */
-		if (need_resched()) {
+	while (sem->owner == owner) {
+		/*
+		 * Ensure we emit the owner->on_cpu, dereference _after_
+		 * checking sem->owner still matches owner, if that fails,
+		 * owner might point to free()d memory, if it still matches,
+		 * the rcu_read_lock() ensures the memory stays valid.
+		 */
+		barrier();
+
+		/* abort spinning when need_resched or owner is not running */
+		if (!owner->on_cpu || need_resched()) {
 			rcu_read_unlock();
 			return false;
 		}

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [tip:locking/core] locking/rwsem: Fix lock optimistic spinning when owner is not running
  2015-03-07 16:43 ` [tip:locking/core] " tip-bot for Jason Low
@ 2015-03-07 17:13   ` Oleg Nesterov
  2015-03-10 10:59     ` Peter Zijlstra
  2015-03-10 16:04     ` Linus Torvalds
  0 siblings, 2 replies; 13+ messages in thread
From: Oleg Nesterov @ 2015-03-07 17:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-tip-commits, dave, hpa, sasha.levin, tglx, linux-kernel,
	jason.low2, walken, akpm, mingo, paulmck, davej, ming.lei,
	peterz, torvalds, tim.c.chen

I think the patch is fine, but this reminds me...

On 03/07, tip-bot for Jason Low wrote:
>
>  bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
>  {
>  	long count;
>
>  	rcu_read_lock();
> -	while (owner_running(sem, owner)) {
> -		/* abort spinning when need_resched */
> -		if (need_resched()) {
> +	while (sem->owner == owner) {
> +		/*
> +		 * Ensure we emit the owner->on_cpu, dereference _after_
> +		 * checking sem->owner still matches owner, if that fails,
> +		 * owner might point to free()d memory, if it still matches,
> +		 * the rcu_read_lock() ensures the memory stays valid.
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Yes, this is another case when we wrongly assume this.

Peter, should I resend

	[PATCH 3/3] introduce task_rcu_dereference()
	http://marc.info/?l=linux-kernel&m=141443631413914

? or should we add another call_rcu() in finish_task_switch() (like -rt does)
to make this true?

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running
  2015-03-07  7:45 [PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running Jason Low
  2015-03-07  9:21 ` Peter Zijlstra
  2015-03-07 16:43 ` [tip:locking/core] " tip-bot for Jason Low
@ 2015-03-07 18:17 ` Sasha Levin
  2015-03-09 17:37   ` Jason Low
  2 siblings, 1 reply; 13+ messages in thread
From: Sasha Levin @ 2015-03-07 18:17 UTC (permalink / raw)
  To: Jason Low, Linus Torvalds, Ingo Molnar, Peter Zijlstra
  Cc: Davidlohr Bueso, Tim Chen, Paul E. McKenney, Michel Lespinasse,
	LKML, Dave Jones, Ming Lei

On 03/07/2015 02:45 AM, Jason Low wrote:
> Fixes tip commit b3fd4f03ca0b (locking/rwsem: Avoid deceiving lock spinners).
> 
> Ming reported soft lockups occurring when running xfstest due to
> commit b3fd4f03ca0b.
> 
> When doing optimistic spinning in rwsem, threads should stop spinning when
> the lock owner is not running. While a thread is spinning on owner, if
> the owner reschedules, owner->on_cpu returns false and we stop spinning.
> 
> However, commit b3fd4f03ca0b essentially caused the check to get ignored
> because when we break out of the spin loop due to !on_cpu, we continue
> spinning if sem->owner != NULL.
> 
> This patch fixes this by making sure we stop spinning if the owner is not
> running. Furthermore, just like with mutexes, refactor the code such that
> we don't have separate checks for owner_running(). This makes it more
> straightforward in terms of why we exit the spin on owner loop and we
> would also avoid needing to "guess" why we broke out of the loop to make
> this more readable.

That seems to solve the hangs I'm seeing as well.


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running
  2015-03-07 18:17 ` [PATCH] " Sasha Levin
@ 2015-03-09 17:37   ` Jason Low
  0 siblings, 0 replies; 13+ messages in thread
From: Jason Low @ 2015-03-09 17:37 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Linus Torvalds, Ingo Molnar, Peter Zijlstra, Davidlohr Bueso,
	Tim Chen, Paul E. McKenney, Michel Lespinasse, LKML, Dave Jones,
	Ming Lei, jason.low2

On Sat, 2015-03-07 at 13:17 -0500, Sasha Levin wrote:
> On 03/07/2015 02:45 AM, Jason Low wrote:
> > Fixes tip commit b3fd4f03ca0b (locking/rwsem: Avoid deceiving lock spinners).
> > 
> > Ming reported soft lockups occurring when running xfstest due to
> > commit b3fd4f03ca0b.
> > 
> > When doing optimistic spinning in rwsem, threads should stop spinning when
> > the lock owner is not running. While a thread is spinning on owner, if
> > the owner reschedules, owner->on_cpu returns false and we stop spinning.
> > 
> > However, commit b3fd4f03ca0b essentially caused the check to get ignored
> > because when we break out of the spin loop due to !on_cpu, we continue
> > spinning if sem->owner != NULL.
> > 
> > This patch fixes this by making sure we stop spinning if the owner is not
> > running. Furthermore, just like with mutexes, refactor the code such that
> > we don't have separate checks for owner_running(). This makes it more
> > straightforward in terms of why we exit the spin on owner loop and we
> > would also avoid needing to "guess" why we broke out of the loop to make
> > this more readable.
> 
> That seems to solve the hangs I'm seeing as well.

Great, thanks for confirming this.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running
  2015-03-07  9:21 ` Peter Zijlstra
@ 2015-03-09 17:42   ` Jason Low
  0 siblings, 0 replies; 13+ messages in thread
From: Jason Low @ 2015-03-09 17:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linus Torvalds, Ingo Molnar, Davidlohr Bueso, Tim Chen,
	Paul E. McKenney, Michel Lespinasse, Sasha Levin, LKML,
	Dave Jones, Ming Lei, jason.low2

On Sat, 2015-03-07 at 10:21 +0100, Peter Zijlstra wrote:
> On Fri, Mar 06, 2015 at 11:45:31PM -0800, Jason Low wrote:
> >  static noinline
> >  bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
> >  {
> >  	long count;
> >  
> >  	rcu_read_lock();
> > +	while (sem->owner == owner) {
> > +		/*
> > +		 * Ensure we emit the owner->on_cpu, dereference _after_
> > +		 * checking sem->owner still matches owner, if that fails,
> > +		 * owner might point to free()d memory, if it still matches,
> > +		 * the rcu_read_lock() ensures the memory stays valid.
> > +		 */
> > +		barrier();
> > +
> > +		/* abort spinning when need_resched or owner is not running */
> > +		if (!owner->on_cpu || need_resched()) {
> >  			rcu_read_unlock();
> >  			return false;
> >  		}
> 
> Thanks, looks good; do we want to change the mutex code (again) to more
> closely resemble this too? It still has the while (true) instead of the
> while(lock->owner != owner).

Yeah, I wondered about the same thing, though I wasn't sure if you
wanted an additional patch just for that change.

Guess you answered it for me :)



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [tip:locking/core] locking/rwsem: Fix lock optimistic spinning when owner is not running
  2015-03-07 17:13   ` Oleg Nesterov
@ 2015-03-10 10:59     ` Peter Zijlstra
  2015-03-10 16:04     ` Linus Torvalds
  1 sibling, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2015-03-10 10:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-tip-commits, dave, hpa, sasha.levin, tglx, linux-kernel,
	jason.low2, walken, akpm, mingo, paulmck, davej, ming.lei,
	torvalds, tim.c.chen

On Sat, Mar 07, 2015 at 06:13:47PM +0100, Oleg Nesterov wrote:
> I think the patch is fine, but this reminds me...
> 
> On 03/07, tip-bot for Jason Low wrote:
> >
> >  bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
> >  {
> >  	long count;
> >
> >  	rcu_read_lock();
> > -	while (owner_running(sem, owner)) {
> > -		/* abort spinning when need_resched */
> > -		if (need_resched()) {
> > +	while (sem->owner == owner) {
> > +		/*
> > +		 * Ensure we emit the owner->on_cpu, dereference _after_
> > +		 * checking sem->owner still matches owner, if that fails,
> > +		 * owner might point to free()d memory, if it still matches,
> > +		 * the rcu_read_lock() ensures the memory stays valid.
>                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Yes, this is another case when we wrongly assume this.
> 
> Peter, should I resend
> 
> 	[PATCH 3/3] introduce task_rcu_dereference()
> 	http://marc.info/?l=linux-kernel&m=141443631413914
> 
> ? or should we add another call_rcu() in finish_task_switch() (like -rt does)
> to make this true?

Yeah, I think the extra call_rcu() makes most sense.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [tip:locking/core] locking/rwsem: Fix lock optimistic spinning when owner is not running
  2015-03-07 17:13   ` Oleg Nesterov
  2015-03-10 10:59     ` Peter Zijlstra
@ 2015-03-10 16:04     ` Linus Torvalds
  2015-03-10 16:16       ` Peter Zijlstra
  2015-03-10 17:28       ` Oleg Nesterov
  1 sibling, 2 replies; 13+ messages in thread
From: Linus Torvalds @ 2015-03-10 16:04 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, linux-tip-commits, Davidlohr Bueso, Peter Anvin,
	Sasha Levin, Thomas Gleixner, Linux Kernel Mailing List,
	Jason Low, Michel Lespinasse, Andrew Morton, Ingo Molnar,
	Paul McKenney, Dave Jones, Ming Lei, Tim Chen

On Sat, Mar 7, 2015 at 9:13 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> +             /*
>> +              * Ensure we emit the owner->on_cpu, dereference _after_
>> +              * checking sem->owner still matches owner, if that fails,
>> +              * owner might point to free()d memory, if it still matches,
>> +              * the rcu_read_lock() ensures the memory stays valid.
>                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Yes, this is another case when we wrongly assume this.
>
> Peter, should I resend
>
>         [PATCH 3/3] introduce task_rcu_dereference()
>         http://marc.info/?l=linux-kernel&m=141443631413914
>
> ? or should we add another call_rcu() in finish_task_switch() (like -rt does)
> to make this true?

I think we should just make 'task_struct_cachep' have SLAB_DESTROY_BY_RCU.

It has almost no overhead - since it only affects the final page
freeing, not the actual slab alloc/free paths - and it means that you
can do the "access for reading under rcu lock" without worrying.

Of course, with SLAB_DESTROY_BY_RCU, a task may be re-allocated (to
*another* task) even under RCU, so the results from unlocked RCU read
accesses aren't "guaranteed" in that sense, but it's fine for
optimistic spinning where the code also ends up re-checking that the
task pointer itself matches. Getting the occasional race with "oops,
task went away" is fine, as long as we don't keep looping over a dead
task.

                           Linus

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [tip:locking/core] locking/rwsem: Fix lock optimistic spinning when owner is not running
  2015-03-10 16:04     ` Linus Torvalds
@ 2015-03-10 16:16       ` Peter Zijlstra
  2015-03-10 17:28       ` Oleg Nesterov
  1 sibling, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2015-03-10 16:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Oleg Nesterov, linux-tip-commits, Davidlohr Bueso, Peter Anvin,
	Sasha Levin, Thomas Gleixner, Linux Kernel Mailing List,
	Jason Low, Michel Lespinasse, Andrew Morton, Ingo Molnar,
	Paul McKenney, Dave Jones, Ming Lei, Tim Chen

On Tue, Mar 10, 2015 at 09:04:23AM -0700, Linus Torvalds wrote:
> 
> I think we should just make 'task_struct_cachep' have SLAB_DESTROY_BY_RCU.

Yes, that'll work. Also for the other cases Oleg found previously I
think.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [tip:locking/core] locking/rwsem: Fix lock optimistic spinning when owner is not running
  2015-03-10 16:04     ` Linus Torvalds
  2015-03-10 16:16       ` Peter Zijlstra
@ 2015-03-10 17:28       ` Oleg Nesterov
  2015-03-10 17:45         ` Linus Torvalds
  1 sibling, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2015-03-10 17:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, linux-tip-commits, Davidlohr Bueso, Peter Anvin,
	Sasha Levin, Thomas Gleixner, Linux Kernel Mailing List,
	Jason Low, Michel Lespinasse, Andrew Morton, Ingo Molnar,
	Paul McKenney, Dave Jones, Ming Lei, Tim Chen, Kirill Tkhai

On 03/10, Linus Torvalds wrote:
>
> On Sat, Mar 7, 2015 at 9:13 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >> +             /*
> >> +              * Ensure we emit the owner->on_cpu, dereference _after_
> >> +              * checking sem->owner still matches owner, if that fails,
> >> +              * owner might point to free()d memory, if it still matches,
> >> +              * the rcu_read_lock() ensures the memory stays valid.
> >                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > Yes, this is another case when we wrongly assume this.
> >
> > Peter, should I resend
> >
> >         [PATCH 3/3] introduce task_rcu_dereference()
> >         http://marc.info/?l=linux-kernel&m=141443631413914
> >
> > ? or should we add another call_rcu() in finish_task_switch() (like -rt does)
> > to make this true?
>
> I think we should just make 'task_struct_cachep' have SLAB_DESTROY_BY_RCU.

This is what I initially suggested too, but then tried to argue with.
But it seems that I lost if you too prefer SLAB_DESTROY_BY_RCU.

Yes, SLAB_DESTROY_BY_RCU will work in this case because we recheck
->owner in a loop. And because task->on_cpu is just a word we can
safely read.

But this won't fix other problems we might have. For example, suppose
that we will need get_task_struct(owner) in this code, this won't work.

Or, as Kirill pointed out, lets look at "tsk = ACCESS_ONCE(cpu_rq(cpu)->curr)"
in task_numa_group(). Even if this will be "fixed" by SLAB_DESTROY_BY_RCU,
this code won't be correct anyway. Even if (I think) it will be safe to
dereference ->numa_group as well.

But OK, I won't argue.

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [tip:locking/core] locking/rwsem: Fix lock optimistic spinning when owner is not running
  2015-03-10 17:28       ` Oleg Nesterov
@ 2015-03-10 17:45         ` Linus Torvalds
  2015-03-10 18:33           ` Oleg Nesterov
  0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2015-03-10 17:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, linux-tip-commits, Davidlohr Bueso, Peter Anvin,
	Sasha Levin, Thomas Gleixner, Linux Kernel Mailing List,
	Jason Low, Michel Lespinasse, Andrew Morton, Ingo Molnar,
	Paul McKenney, Dave Jones, Ming Lei, Tim Chen, Kirill Tkhai

On Tue, Mar 10, 2015 at 10:28 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>
> But this won't fix other problems we might have. For example, suppose
> that we will need get_task_struct(owner) in this code, this won't work.

I agree that SLAB_DESTROY_BY_RCU in general is fairly fragile, and
leads to subtle issues. If speculatively taking locks, for example,
they might need to be initialized in the constructor function, and not
ever touched at actual allocation time. Etc etc.

So I'm not a huge fan of SLAB_DESTROY_BY_RCU in general, but for
really core data structures like this, I think it's worth it. The
overhead of "call_rcu()" can be quite noticeable, and the other
alternative solutions (like that suggested task_rcu_dereference()) are
even *more* complex and subtle and generally perform worse.

I'm not claiming it's perfect, but I think the "access the task
pointer without locking" is kind of special and worth making work in
reasonable ways.

                         Linus

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [tip:locking/core] locking/rwsem: Fix lock optimistic spinning when owner is not running
  2015-03-10 17:45         ` Linus Torvalds
@ 2015-03-10 18:33           ` Oleg Nesterov
  0 siblings, 0 replies; 13+ messages in thread
From: Oleg Nesterov @ 2015-03-10 18:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, linux-tip-commits, Davidlohr Bueso, Peter Anvin,
	Sasha Levin, Thomas Gleixner, Linux Kernel Mailing List,
	Jason Low, Michel Lespinasse, Andrew Morton, Ingo Molnar,
	Paul McKenney, Dave Jones, Ming Lei, Tim Chen, Kirill Tkhai

On 03/10, Linus Torvalds wrote:
>
> So I'm not a huge fan of SLAB_DESTROY_BY_RCU in general, but for
> really core data structures like this, I think it's worth it.

I agree, but we have other users which can't be fixed if we just add
SLAB_DESTROY_BY_RCU to task_struct_cachep. So we need something else
anyway.

> The
> overhead of "call_rcu()" can be quite noticeable,

Yes, this was my concern too,

> and the other
> alternative solutions (like that suggested task_rcu_dereference()) are
> even *more* complex and subtle

Yes, but at least the ugliness is hidden inside task_rcu_dereference().
The usage is simple.

> and generally perform worse.

Well, task_rcu_dereference() should be cheap enough. probe_slab_address()
is a plain LOAD unless CONFIG_DEBUG_PAGEALLOC.

But!

> I'm not claiming it's perfect,

Same here, it is not that I think task_rcu_dereference() is very nice.


So I leave this to you and Peter ;)

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-03-10 18:35 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-07  7:45 [PATCH] locking/rwsem: Fix lock optimistic spinning when owner is not running Jason Low
2015-03-07  9:21 ` Peter Zijlstra
2015-03-09 17:42   ` Jason Low
2015-03-07 16:43 ` [tip:locking/core] " tip-bot for Jason Low
2015-03-07 17:13   ` Oleg Nesterov
2015-03-10 10:59     ` Peter Zijlstra
2015-03-10 16:04     ` Linus Torvalds
2015-03-10 16:16       ` Peter Zijlstra
2015-03-10 17:28       ` Oleg Nesterov
2015-03-10 17:45         ` Linus Torvalds
2015-03-10 18:33           ` Oleg Nesterov
2015-03-07 18:17 ` [PATCH] " Sasha Levin
2015-03-09 17:37   ` Jason Low

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).