[-next] locking/percpu-rwsem: fix a task_struct refcount

Message ID 20200327031057.10866-1-cai@lca.pw
State New

Commit Message

Qian Cai March 27, 2020, 3:10 a.m. UTC
There are some memory leaks due to a missing put_task_struct().

Fixes: 7f26482a872c ("locking/percpu-rwsem: Remove the embedded rwsem")
Signed-off-by: Qian Cai <cai@lca.pw>
---
 kernel/locking/percpu-rwsem.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Peter Zijlstra March 27, 2020, 9:37 a.m. UTC | #1
On Thu, Mar 26, 2020 at 11:10:57PM -0400, Qian Cai wrote:
> There are some memory leaks due to a missing put_task_struct().

This is an absolutely inadequate changelog. There is no explanation of
what the actual race is or why this patch is correct.

> Fixes: 7f26482a872c ("locking/percpu-rwsem: Remove the embedded rwsem")
> Signed-off-by: Qian Cai <cai@lca.pw>
> ---
>  kernel/locking/percpu-rwsem.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
> index a008a1ba21a7..6f487e5d923f 100644
> --- a/kernel/locking/percpu-rwsem.c
> +++ b/kernel/locking/percpu-rwsem.c
> @@ -123,8 +123,10 @@ static int percpu_rwsem_wake_function(struct wait_queue_entry *wq_entry,
>  	struct percpu_rw_semaphore *sem = key;
>  
>  	/* concurrent against percpu_down_write(), can get stolen */
> -	if (!__percpu_rwsem_trylock(sem, reader))
> +	if (!__percpu_rwsem_trylock(sem, reader)) {
> +		put_task_struct(p);
>  		return 1;
> +	}


If the trylock fails, someone else got the lock and we remain on the
waitqueue. It seems like a very bad idea to put the task while it
remains on the waitqueue, no?

>  
>  	list_del_init(&wq_entry->entry);
>  	smp_store_release(&wq_entry->private, NULL);
> -- 
> 2.21.0 (Apple Git-122.2)
>
Qian Cai March 27, 2020, 10:19 a.m. UTC | #2
> On Mar 27, 2020, at 5:37 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> If the trylock fails, someone else got the lock and we remain on the
> waitqueue. It seems like a very bad idea to put the task while it
> remains on the waitqueue, no?

Interesting, I thought this was more straightforward to see, but I may
be wrong as always. At the beginning of percpu_rwsem_wake_function()
it calls get_task_struct(), but if the trylock failed, it will remain
in the waitqueue. However, it will run percpu_rwsem_wake_function()
again with get_task_struct() to increase the refcount. Can you
enlighten me where it will call put_task_struct() in waitqueue or
elsewhere to balance the refcount in this case?
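
The imbalance described above can be sketched in userspace. This is a toy model, not kernel code: every `toy_` name is hypothetical, the trylock result is passed in as a parameter instead of being computed, and the success path is assumed to end with a wake-up followed by a put, which the quoted hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct task_struct: only the reference count matters. */
struct toy_task {
	int usage;
};

static struct toy_task *toy_get_task(struct toy_task *t)
{
	t->usage++;
	return t;
}

static void toy_put_task(struct toy_task *t)
{
	assert(t->usage > 0);
	t->usage--;
}

/* Toy stand-in for the waitqueue entry; 'private' points at the waiter. */
struct toy_wq_entry {
	struct toy_task *private;
	bool queued;
};

/*
 * Shape of the wake function before the fix: the reference is taken
 * first, and the early return on a failed trylock never drops it.
 */
static int toy_wake_buggy(struct toy_wq_entry *wq, bool trylock_ok)
{
	struct toy_task *p = toy_get_task(wq->private);

	if (!trylock_ok)
		return 1;	/* entry stays queued, reference is leaked */

	wq->queued = false;	/* models list_del_init() */
	wq->private = NULL;	/* models smp_store_release(..., NULL) */
	toy_put_task(p);	/* assumed: wake-up, then put */
	return 0;
}

/* References left over after one failed and one successful wake attempt. */
static int toy_leaked_refs(void)
{
	struct toy_task task = { .usage = 1 };
	struct toy_wq_entry wq = { .private = &task, .queued = true };

	toy_wake_buggy(&wq, false);	/* lock stolen by a writer */
	toy_wake_buggy(&wq, true);	/* a later wake-up succeeds */

	return task.usage - 1;
}
```

In this model each failed trylock leaves one extra reference behind, so the task could never be freed once the waiter exits.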
Peter Zijlstra March 30, 2020, 11:18 a.m. UTC | #3
On Fri, Mar 27, 2020 at 06:19:37AM -0400, Qian Cai wrote:
> 
> 
> > On Mar 27, 2020, at 5:37 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > If the trylock fails, someone else got the lock and we remain on the
> > waitqueue. It seems like a very bad idea to put the task while it
> > remains on the waitqueue, no?
> 
> Interesting, I thought this was more straightforward to see,

It is indeed as straightforward as you explain; but when doing 10
things at once, and having just dug through some low-level arch assembly
code for the previous email, even obvious things might sometimes need
a little explaining :/

So please, always try and err on the side of being a little verbose when
writing Changelogs, esp. when concerning locking / concurrency, you
really can't be clear enough.

> but I may
> be wrong as always. At the beginning of percpu_rwsem_wake_function()
> it calls get_task_struct(), but if the trylock failed, it will remain
> in the waitqueue. However, it will run percpu_rwsem_wake_function()
> again with get_task_struct() to increase the refcount. Can you
> enlighten me where it will call put_task_struct() in waitqueue or
> elsewhere to balance the refcount in this case?

See, had that explanation been part of the Changelog, my brain would've
probably been able to kick itself in gear and actually spot the problem.

Yes, you're right.

That said, I wonder if we can just move the get_task_struct() call like
below; after all the race we're guarding against is percpu_rwsem_wait()
observing !private, terminating the wait and doing a quick exit() while
percpu_rwsem_wake_function() then does wake_up_process(p) as a
use-after-free.

Hmm?

diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index a008a1ba21a7..8bbafe3e5203 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -118,14 +118,15 @@ static int percpu_rwsem_wake_function(struct wait_queue_entry *wq_entry,
 				      unsigned int mode, int wake_flags,
 				      void *key)
 {
-	struct task_struct *p = get_task_struct(wq_entry->private);
 	bool reader = wq_entry->flags & WQ_FLAG_CUSTOM;
 	struct percpu_rw_semaphore *sem = key;
+	struct task_struct *p;
 
 	/* concurrent against percpu_down_write(), can get stolen */
 	if (!__percpu_rwsem_trylock(sem, reader))
 		return 1;
 
+	p = get_task_struct(wq_entry->private);
 	list_del_init(&wq_entry->entry);
 	smp_store_release(&wq_entry->private, NULL);
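
The reordering in the diff above can be modelled the same way in userspace. This is only a sketch under assumptions: the `toy_` names are hypothetical, the trylock outcome is a parameter, and the wake-up followed by a put on the success path is assumed because the hunk does not show the rest of the function. It also says nothing about the exit() race itself, only about reference balance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct task_struct: only the reference count matters. */
struct toy_task {
	int usage;
};

static struct toy_task *toy_get_task(struct toy_task *t)
{
	t->usage++;
	return t;
}

static void toy_put_task(struct toy_task *t)
{
	assert(t->usage > 0);
	t->usage--;
}

/* Toy stand-in for the waitqueue entry; 'private' points at the waiter. */
struct toy_wq_entry {
	struct toy_task *private;
	bool queued;
};

/*
 * Shape of the wake function with the get moved below the trylock:
 * the failure path returns before any reference has been taken.
 */
static int toy_wake_reordered(struct toy_wq_entry *wq, bool trylock_ok)
{
	struct toy_task *p;

	if (!trylock_ok)
		return 1;	/* nothing taken yet, so nothing to drop */

	p = toy_get_task(wq->private);	/* pin before clearing 'private' */
	wq->queued = false;		/* models list_del_init() */
	wq->private = NULL;		/* models smp_store_release(..., NULL) */
	toy_put_task(p);		/* assumed: wake-up, then put */
	return 0;
}

/* References left over after one failed and one successful wake attempt. */
static int toy_leftover_refs(void)
{
	struct toy_task task = { .usage = 1 };
	struct toy_wq_entry wq = { .private = &task, .queued = true };

	toy_wake_reordered(&wq, false);	/* lock stolen by a writer */
	toy_wake_reordered(&wq, true);	/* a later wake-up succeeds */

	return task.usage - 1;
}
```

Here the get and the put are both confined to the success path, so repeated failed attempts leave the count unchanged.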
Qian Cai March 30, 2020, 1:18 p.m. UTC | #4
> On Mar 30, 2020, at 7:18 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Fri, Mar 27, 2020 at 06:19:37AM -0400, Qian Cai wrote:
>> 
>> 
>>> On Mar 27, 2020, at 5:37 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>>> 
>>> If the trylock fails, someone else got the lock and we remain on the
>>> waitqueue. It seems like a very bad idea to put the task while it
>>> remains on the waitqueue, no?
>> 
>> Interesting, I thought this was more straightforward to see,
> 
> It is indeed as straightforward as you explain; but when doing 10
> things at once, and having just dug through some low-level arch assembly
> code for the previous email, even obvious things might sometimes need
> a little explaining :/
> 
> So please, always try and err on the side of being a little verbose when
> writing Changelogs, esp. when concerning locking / concurrency, you
> really can't be clear enough.
> 
>> but I may
>> be wrong as always. At the beginning of percpu_rwsem_wake_function()
>> it calls get_task_struct(), but if the trylock failed, it will remain
>> in the waitqueue. However, it will run percpu_rwsem_wake_function()
>> again with get_task_struct() to increase the refcount. Can you
>> enlighten me where it will call put_task_struct() in waitqueue or
>> elsewhere to balance the refcount in this case?
> 
> See, had that explanation been part of the Changelog, my brain would've
> probably been able to kick itself in gear and actually spot the problem.
> 
> Yes, you're right.
> 
> That said, I wonder if we can just move the get_task_struct() call like
> below; after all the race we're guarding against is percpu_rwsem_wait()
> observing !private, terminating the wait and doing a quick exit() while
> percpu_rwsem_wake_function() then does wake_up_process(p) as a
> use-after-free.

Looks good to me. If no one has any objection, I'll tidy up the commit log
and send out a v2 for it.

> 
> Hmm?
> 
> diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
> index a008a1ba21a7..8bbafe3e5203 100644
> --- a/kernel/locking/percpu-rwsem.c
> +++ b/kernel/locking/percpu-rwsem.c
> @@ -118,14 +118,15 @@ static int percpu_rwsem_wake_function(struct wait_queue_entry *wq_entry,
> 				      unsigned int mode, int wake_flags,
> 				      void *key)
> {
> -	struct task_struct *p = get_task_struct(wq_entry->private);
> 	bool reader = wq_entry->flags & WQ_FLAG_CUSTOM;
> 	struct percpu_rw_semaphore *sem = key;
> +	struct task_struct *p;
> 
> 	/* concurrent against percpu_down_write(), can get stolen */
> 	if (!__percpu_rwsem_trylock(sem, reader))
> 		return 1;
> 
> +	p = get_task_struct(wq_entry->private);
> 	list_del_init(&wq_entry->entry);
> 	smp_store_release(&wq_entry->private, NULL);
>

Patch

diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index a008a1ba21a7..6f487e5d923f 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -123,8 +123,10 @@  static int percpu_rwsem_wake_function(struct wait_queue_entry *wq_entry,
 	struct percpu_rw_semaphore *sem = key;
 
 	/* concurrent against percpu_down_write(), can get stolen */
-	if (!__percpu_rwsem_trylock(sem, reader))
+	if (!__percpu_rwsem_trylock(sem, reader)) {
+		put_task_struct(p);
 		return 1;
+	}
 
 	list_del_init(&wq_entry->entry);
 	smp_store_release(&wq_entry->private, NULL);