All of lore.kernel.org
 help / color / mirror / Atom feed
* [syzbot] memory leak in __send_signal
@ 2021-06-06 14:32 syzbot
  2021-06-21 23:08 ` [PATCH] signal: Prevent sigqueue caching after task got released Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: syzbot @ 2021-06-06 14:32 UTC (permalink / raw)
  To: axboe, christian, ebiederm, elver, linux-kernel, oleg, pcc,
	peterz, syzkaller-bugs, tglx

Hello,

syzbot found the following issue on:

HEAD commit:    9d32fa5d Merge tag 'net-5.13-rc5' of git://git.kernel.org/..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10fd97dfd00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=de8efb0998945e75
dashboard link: https://syzkaller.appspot.com/bug?extid=0bac5fec63d4f399ba98
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16029ce0300000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+0bac5fec63d4f399ba98@syzkaller.appspotmail.com

2021/06/05 21:42:36 executed programs: 303
2021/06/05 21:42:42 executed programs: 312
2021/06/05 21:42:48 executed programs: 319
2021/06/05 21:42:54 executed programs: 331
BUG: memory leak
unreferenced object 0xffff8881278e3c80 (size 80):
  comm "syz-executor.4", pid 12851, jiffies 4295068441 (age 14.610s)
  hex dump (first 32 bytes):
    80 3c 8e 27 81 88 ff ff 80 3c 8e 27 81 88 ff ff  .<.'.....<.'....
    00 00 00 00 00 00 00 00 05 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff812450d6>] __sigqueue_alloc+0xd6/0x240 kernel/signal.c:441
    [<ffffffff81247d31>] __send_signal+0x231/0x600 kernel/signal.c:1155
    [<ffffffff8124b123>] do_send_sig_info+0x63/0xc0 kernel/signal.c:1333
    [<ffffffff8124b4f9>] do_send_specific+0xc9/0xf0 kernel/signal.c:3881
    [<ffffffff8124b5ab>] do_tkill+0x8b/0xb0 kernel/signal.c:3907
    [<ffffffff8124e811>] __do_sys_tkill kernel/signal.c:3942 [inline]
    [<ffffffff8124e811>] __se_sys_tkill kernel/signal.c:3936 [inline]
    [<ffffffff8124e811>] __x64_sys_tkill+0x31/0x50 kernel/signal.c:3936
    [<ffffffff843540da>] do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47
    [<ffffffff84400068>] entry_SYSCALL_64_after_hwframe+0x44/0xae



---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] signal: Prevent sigqueue caching after task got released
  2021-06-06 14:32 [syzbot] memory leak in __send_signal syzbot
@ 2021-06-21 23:08 ` Thomas Gleixner
  2021-06-22  6:34   ` Oleg Nesterov
                     ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Thomas Gleixner @ 2021-06-21 23:08 UTC (permalink / raw)
  To: syzbot
  Cc: axboe, christian, ebiederm, elver, linux-kernel, oleg, pcc,
	peterz, syzkaller-bugs

syzbot reported a memory leak related to sigqueue caching. This happens
when a thread group leader with child tasks is reaped.

The group leader's sigqueue_cache is correctly freed. The group leader then
reaps the child tasks and if any of them has a signal pending it caches
that signal. That's obviously bogus because nothing will free the cached
signal of the reaped group leader anymore.

Prevent this by setting tsk::sigqueue_cache to an error pointer value in
exit_task_sigqueue_cache().

Add comments to all relevant places.

Fixes: 4bad58ebc8bc ("signal: Allow tasks to cache one sigqueue struct")
Reported-by: syzbot+0bac5fec63d4f399ba98@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/signal.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -435,6 +435,12 @@ static struct sigqueue *
 		 * Preallocation does not hold sighand::siglock so it can't
 		 * use the cache. The lockless caching requires that only
 		 * one consumer and only one producer run at a time.
+		 *
+		 * For the regular allocation case it is sufficient to
+		 * check @q for NULL because this code can only be called
+		 * if the target task @t has not been reaped yet; which
+		 * means this code can never observe the error pointer which is
+		 * written to @t->sigqueue_cache in exit_task_sigqueue_cache().
 		 */
 		q = READ_ONCE(t->sigqueue_cache);
 		if (!q || sigqueue_flags)
@@ -463,13 +469,18 @@ void exit_task_sigqueue_cache(struct tas
 	struct sigqueue *q = tsk->sigqueue_cache;
 
 	if (q) {
-		tsk->sigqueue_cache = NULL;
 		/*
 		 * Hand it back to the cache as the task might
 		 * be self reaping which would leak the object.
 		 */
 		 kmem_cache_free(sigqueue_cachep, q);
 	}
+
+	/*
+	 * Set an error pointer to ensure that @tsk will not cache a
+	 * sigqueue when it is reaping it's child tasks
+	 */
+	tsk->sigqueue_cache = ERR_PTR(-1);
 }
 
 static void sigqueue_cache_or_free(struct sigqueue *q)
@@ -481,6 +492,10 @@ static void sigqueue_cache_or_free(struc
 	 * is intentional when run without holding current->sighand->siglock,
 	 * which is fine as current obviously cannot run __sigqueue_free()
 	 * concurrently.
+	 *
+	 * The NULL check is safe even if current has been reaped already,
+	 * in which case exit_task_sigqueue_cache() wrote an error pointer
+	 * into current->sigqueue_cache.
 	 */
 	if (!READ_ONCE(current->sigqueue_cache))
 		WRITE_ONCE(current->sigqueue_cache, q);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] signal: Prevent sigqueue caching after task got released
  2021-06-21 23:08 ` [PATCH] signal: Prevent sigqueue caching after task got released Thomas Gleixner
@ 2021-06-22  6:34   ` Oleg Nesterov
  2021-06-22  7:59     ` Thomas Gleixner
  2021-06-22  8:06   ` Christian Brauner
  2021-06-22 13:59   ` [tip: core/urgent] " tip-bot2 for Thomas Gleixner
  2 siblings, 1 reply; 6+ messages in thread
From: Oleg Nesterov @ 2021-06-22  6:34 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: syzbot, axboe, christian, ebiederm, elver, linux-kernel, pcc,
	peterz, syzkaller-bugs

On 06/22, Thomas Gleixner wrote:
>
> syzbot reported a memory leak related to sigqueue caching. This happens
> when a thread group leader with child tasks is reaped.
>
> The group leader's sigqueue_cache is correctly freed. The group leader then
> reaps the child tasks and if any of them has a signal pending it caches
> that signal.

I guess you mean the race with exit_notify() ? Could you spell please?
I am just curious how exactly this problem was found.

This doesn't really matter, because damn yes, a task T can call
release_task(another_task)->sigqueue_cache_or_free() after
exit_task_sigqueue_cache(T) was already called. For example, a last non-leader
thread exits and reaps a zombie leader.

Somehow I thought that exit_task_sigqueue_cache() at the end of __exit_signal()
should fix this problem, but this is obviously wrong.


> @@ -463,13 +469,18 @@ void exit_task_sigqueue_cache(struct tas
>  	struct sigqueue *q = tsk->sigqueue_cache;
>  
>  	if (q) {
> -		tsk->sigqueue_cache = NULL;
>  		/*
>  		 * Hand it back to the cache as the task might
>  		 * be self reaping which would leak the object.
>  		 */
>  		 kmem_cache_free(sigqueue_cachep, q);
>  	}
> +
> +	/*
> +	 * Set an error pointer to ensure that @tsk will not cache a
> +	 * sigqueue when it is reaping it's child tasks
> +	 */
> +	tsk->sigqueue_cache = ERR_PTR(-1);
>  }


Reviewed-by: Oleg Nesterov <oleg@redhat.com>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] signal: Prevent sigqueue caching after task got released
  2021-06-22  6:34   ` Oleg Nesterov
@ 2021-06-22  7:59     ` Thomas Gleixner
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2021-06-22  7:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: syzbot, axboe, christian, ebiederm, elver, linux-kernel, pcc,
	peterz, syzkaller-bugs

On Tue, Jun 22 2021 at 08:34, Oleg Nesterov wrote:
> On 06/22, Thomas Gleixner wrote:
>
> I guess you mean the race with exit_notify() ? Could you spell please?

Yes let me rephrase that.

> I am just curious how exactly this problem was found.

I was looking at that syzbot report

  https://lore.kernel.org/r/000000000000148b4b05c419cbbb@google.com

and analyzed it how this ends up leaking memory.

> This doesn't really matter, because damn yes, a task T can call
> release_task(another_task)->sigqueue_cache_or_free() after
> exit_task_sigqueue_cache(T) was already called. For example, a last non-leader
> thread exits and reaps a zombie leader.
>
> Somehow I thought that exit_task_sigqueue_cache() at the end of __exit_signal()
> should fix this problem, but this is obviously wrong.
>
>
>> @@ -463,13 +469,18 @@ void exit_task_sigqueue_cache(struct tas
>>  	struct sigqueue *q = tsk->sigqueue_cache;
>>  
>>  	if (q) {
>> -		tsk->sigqueue_cache = NULL;
>>  		/*
>>  		 * Hand it back to the cache as the task might
>>  		 * be self reaping which would leak the object.
>>  		 */
>>  		 kmem_cache_free(sigqueue_cachep, q);
>>  	}
>> +
>> +	/*
>> +	 * Set an error pointer to ensure that @tsk will not cache a
>> +	 * sigqueue when it is reaping it's child tasks
>> +	 */
>> +	tsk->sigqueue_cache = ERR_PTR(-1);
>>  }
>
>
> Reviewed-by: Oleg Nesterov <oleg@redhat.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] signal: Prevent sigqueue caching after task got released
  2021-06-21 23:08 ` [PATCH] signal: Prevent sigqueue caching after task got released Thomas Gleixner
  2021-06-22  6:34   ` Oleg Nesterov
@ 2021-06-22  8:06   ` Christian Brauner
  2021-06-22 13:59   ` [tip: core/urgent] " tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 6+ messages in thread
From: Christian Brauner @ 2021-06-22  8:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: syzbot, axboe, christian, ebiederm, elver, linux-kernel, oleg,
	pcc, peterz, syzkaller-bugs

On Tue, Jun 22, 2021 at 01:08:30AM +0200, Thomas Gleixner wrote:
> syzbot reported a memory leak related to sigqueue caching. This happens
> when a thread group leader with child tasks is reaped.
> 
> The group leader's sigqueue_cache is correctly freed. The group leader then
> reaps the child tasks and if any of them has a signal pending it caches
> that signal. That's obviously bogus because nothing will free the cached
> signal of the reaped group leader anymore.
> 
> Prevent this by setting tsk::sigqueue_cache to an error pointer value in
> exit_task_sigqueue_cache().
> 
> Add comments to all relevant places.
> 
> Fixes: 4bad58ebc8bc ("signal: Allow tasks to cache one sigqueue struct")
> Reported-by: syzbot+0bac5fec63d4f399ba98@syzkaller.appspotmail.com
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [tip: core/urgent] signal: Prevent sigqueue caching after task got released
  2021-06-21 23:08 ` [PATCH] signal: Prevent sigqueue caching after task got released Thomas Gleixner
  2021-06-22  6:34   ` Oleg Nesterov
  2021-06-22  8:06   ` Christian Brauner
@ 2021-06-22 13:59   ` tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 6+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-22 13:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: syzbot+0bac5fec63d4f399ba98, Thomas Gleixner, Oleg Nesterov,
	Christian Brauner, x86, linux-kernel

The following commit has been merged into the core/urgent branch of tip:

Commit-ID:     399f8dd9a866e107639eabd3c1979cd526ca3a98
Gitweb:        https://git.kernel.org/tip/399f8dd9a866e107639eabd3c1979cd526ca3a98
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 22 Jun 2021 01:08:30 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 22 Jun 2021 15:55:41 +02:00

signal: Prevent sigqueue caching after task got released

syzbot reported a memory leak related to sigqueue caching.

The assumption that a task cannot cache a sigqueue after the signal handler
has been dropped and exit_task_sigqueue_cache() has been invoked turns out
to be wrong.

Such a task can still invoke release_task(other_task), which cleans up the
signals of 'other_task' and ends up in sigqueue_cache_or_free(), which in
turn will cache the signal because task->sigqueue_cache is NULL. That's
obviously bogus because nothing will free the cached signal of that task
anymore, so the cached item is leaked.

This happens when e.g. the last non-leader thread exits and reaps the
zombie leader.

Prevent this by setting tsk::sigqueue_cache to an error pointer value in
exit_task_sigqueue_cache() which forces any subsequent invocation of
sigqueue_cache_or_free() from that task to hand the sigqueue back to the
kmemcache.

Add comments to all relevant places.

Fixes: 4bad58ebc8bc ("signal: Allow tasks to cache one sigqueue struct")
Reported-by: syzbot+0bac5fec63d4f399ba98@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/878s32g6j5.ffs@nanos.tec.linutronix.de

---
 kernel/signal.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index f7c6ffc..f1ecd8f 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -435,6 +435,12 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t gfp_flags,
 		 * Preallocation does not hold sighand::siglock so it can't
 		 * use the cache. The lockless caching requires that only
 		 * one consumer and only one producer run at a time.
+		 *
+		 * For the regular allocation case it is sufficient to
+		 * check @q for NULL because this code can only be called
+		 * if the target task @t has not been reaped yet; which
+		 * means this code can never observe the error pointer which is
+		 * written to @t->sigqueue_cache in exit_task_sigqueue_cache().
 		 */
 		q = READ_ONCE(t->sigqueue_cache);
 		if (!q || sigqueue_flags)
@@ -463,13 +469,18 @@ void exit_task_sigqueue_cache(struct task_struct *tsk)
 	struct sigqueue *q = tsk->sigqueue_cache;
 
 	if (q) {
-		tsk->sigqueue_cache = NULL;
 		/*
 		 * Hand it back to the cache as the task might
 		 * be self reaping which would leak the object.
 		 */
 		 kmem_cache_free(sigqueue_cachep, q);
 	}
+
+	/*
+	 * Set an error pointer to ensure that @tsk will not cache a
+	 * sigqueue when it is reaping it's child tasks
+	 */
+	tsk->sigqueue_cache = ERR_PTR(-1);
 }
 
 static void sigqueue_cache_or_free(struct sigqueue *q)
@@ -481,6 +492,10 @@ static void sigqueue_cache_or_free(struct sigqueue *q)
 	 * is intentional when run without holding current->sighand->siglock,
 	 * which is fine as current obviously cannot run __sigqueue_free()
 	 * concurrently.
+	 *
+	 * The NULL check is safe even if current has been reaped already,
+	 * in which case exit_task_sigqueue_cache() wrote an error pointer
+	 * into current->sigqueue_cache.
 	 */
 	if (!READ_ONCE(current->sigqueue_cache))
 		WRITE_ONCE(current->sigqueue_cache, q);

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-06-22 13:59 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-06 14:32 [syzbot] memory leak in __send_signal syzbot
2021-06-21 23:08 ` [PATCH] signal: Prevent sigqueue caching after task got released Thomas Gleixner
2021-06-22  6:34   ` Oleg Nesterov
2021-06-22  7:59     ` Thomas Gleixner
2021-06-22  8:06   ` Christian Brauner
2021-06-22 13:59   ` [tip: core/urgent] " tip-bot2 for Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.