linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [syzbot] memory leak in __send_signal
@ 2021-06-06 14:32 syzbot
  2021-06-21 23:08 ` [PATCH] signal: Prevent sigqueue caching after task got released Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: syzbot @ 2021-06-06 14:32 UTC (permalink / raw)
  To: axboe, christian, ebiederm, elver, linux-kernel, oleg, pcc,
	peterz, syzkaller-bugs, tglx

Hello,

syzbot found the following issue on:

HEAD commit:    9d32fa5d Merge tag 'net-5.13-rc5' of git://git.kernel.org/..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10fd97dfd00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=de8efb0998945e75
dashboard link: https://syzkaller.appspot.com/bug?extid=0bac5fec63d4f399ba98
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16029ce0300000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+0bac5fec63d4f399ba98@syzkaller.appspotmail.com

2021/06/05 21:42:36 executed programs: 303
2021/06/05 21:42:42 executed programs: 312
2021/06/05 21:42:48 executed programs: 319
2021/06/05 21:42:54 executed programs: 331
BUG: memory leak
unreferenced object 0xffff8881278e3c80 (size 80):
  comm "syz-executor.4", pid 12851, jiffies 4295068441 (age 14.610s)
  hex dump (first 32 bytes):
    80 3c 8e 27 81 88 ff ff 80 3c 8e 27 81 88 ff ff  .<.'.....<.'....
    00 00 00 00 00 00 00 00 05 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff812450d6>] __sigqueue_alloc+0xd6/0x240 kernel/signal.c:441
    [<ffffffff81247d31>] __send_signal+0x231/0x600 kernel/signal.c:1155
    [<ffffffff8124b123>] do_send_sig_info+0x63/0xc0 kernel/signal.c:1333
    [<ffffffff8124b4f9>] do_send_specific+0xc9/0xf0 kernel/signal.c:3881
    [<ffffffff8124b5ab>] do_tkill+0x8b/0xb0 kernel/signal.c:3907
    [<ffffffff8124e811>] __do_sys_tkill kernel/signal.c:3942 [inline]
    [<ffffffff8124e811>] __se_sys_tkill kernel/signal.c:3936 [inline]
    [<ffffffff8124e811>] __x64_sys_tkill+0x31/0x50 kernel/signal.c:3936
    [<ffffffff843540da>] do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47
    [<ffffffff84400068>] entry_SYSCALL_64_after_hwframe+0x44/0xae



---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] signal: Prevent sigqueue caching after task got released
  2021-06-06 14:32 [syzbot] memory leak in __send_signal syzbot
@ 2021-06-21 23:08 ` Thomas Gleixner
  2021-06-22  6:34   ` Oleg Nesterov
                     ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Thomas Gleixner @ 2021-06-21 23:08 UTC (permalink / raw)
  To: syzbot
  Cc: axboe, christian, ebiederm, elver, linux-kernel, oleg, pcc,
	peterz, syzkaller-bugs

syzbot reported a memory leak related to sigqueue caching. This happens
when a thread group leader with child tasks is reaped.

The group leader's sigqueue_cache is correctly freed. The group leader then
reaps the child tasks and if any of them has a signal pending it caches
that signal. That's obviously bogus because nothing will free the cached
signal of the reaped group leader anymore.

Prevent this by setting tsk::sigqueue_cache to an error pointer value in
exit_task_sigqueue_cache().

Add comments to all relevant places.

Fixes: 4bad58ebc8bc ("signal: Allow tasks to cache one sigqueue struct")
Reported-by: syzbot+0bac5fec63d4f399ba98@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/signal.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -435,6 +435,12 @@ static struct sigqueue *
 		 * Preallocation does not hold sighand::siglock so it can't
 		 * use the cache. The lockless caching requires that only
 		 * one consumer and only one producer run at a time.
+		 *
+		 * For the regular allocation case it is sufficient to
+		 * check @q for NULL because this code can only be called
+		 * if the target task @t has not been reaped yet; which
+		 * means this code can never observe the error pointer which is
+		 * written to @t->sigqueue_cache in exit_task_sigqueue_cache().
 		 */
 		q = READ_ONCE(t->sigqueue_cache);
 		if (!q || sigqueue_flags)
@@ -463,13 +469,18 @@ void exit_task_sigqueue_cache(struct tas
 	struct sigqueue *q = tsk->sigqueue_cache;
 
 	if (q) {
-		tsk->sigqueue_cache = NULL;
 		/*
 		 * Hand it back to the cache as the task might
 		 * be self reaping which would leak the object.
 		 */
 		 kmem_cache_free(sigqueue_cachep, q);
 	}
+
+	/*
+	 * Set an error pointer to ensure that @tsk will not cache a
+	 * sigqueue when it is reaping it's child tasks
+	 */
+	tsk->sigqueue_cache = ERR_PTR(-1);
 }
 
 static void sigqueue_cache_or_free(struct sigqueue *q)
@@ -481,6 +492,10 @@ static void sigqueue_cache_or_free(struc
 	 * is intentional when run without holding current->sighand->siglock,
 	 * which is fine as current obviously cannot run __sigqueue_free()
 	 * concurrently.
+	 *
+	 * The NULL check is safe even if current has been reaped already,
+	 * in which case exit_task_sigqueue_cache() wrote an error pointer
+	 * into current->sigqueue_cache.
 	 */
 	if (!READ_ONCE(current->sigqueue_cache))
 		WRITE_ONCE(current->sigqueue_cache, q);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] signal: Prevent sigqueue caching after task got released
  2021-06-21 23:08 ` [PATCH] signal: Prevent sigqueue caching after task got released Thomas Gleixner
@ 2021-06-22  6:34   ` Oleg Nesterov
  2021-06-22  7:59     ` Thomas Gleixner
  2021-06-22  8:06   ` Christian Brauner
  2021-06-22 13:59   ` [tip: core/urgent] " tip-bot2 for Thomas Gleixner
  2 siblings, 1 reply; 6+ messages in thread
From: Oleg Nesterov @ 2021-06-22  6:34 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: syzbot, axboe, christian, ebiederm, elver, linux-kernel, pcc,
	peterz, syzkaller-bugs

On 06/22, Thomas Gleixner wrote:
>
> syzbot reported a memory leak related to sigqueue caching. This happens
> when a thread group leader with child tasks is reaped.
>
> The group leader's sigqueue_cache is correctly freed. The group leader then
> reaps the child tasks and if any of them has a signal pending it caches
> that signal.

I guess you mean the race with exit_notify() ? Could you spell please?
I am just curious how exactly this problem was found.

This doesn't really matter, because damn yes, a task T can call
release_task(another_task)->sigqueue_cache_or_free() after
exit_task_sigqueue_cache(T) was already called. For example, a last non-leader
thread exits and reaps a zombie leader.

Somehow I thought that exit_task_sigqueue_cache() at the end of __exit_signal()
should fix this problem, but this is obviously wrong.


> @@ -463,13 +469,18 @@ void exit_task_sigqueue_cache(struct tas
>  	struct sigqueue *q = tsk->sigqueue_cache;
>  
>  	if (q) {
> -		tsk->sigqueue_cache = NULL;
>  		/*
>  		 * Hand it back to the cache as the task might
>  		 * be self reaping which would leak the object.
>  		 */
>  		 kmem_cache_free(sigqueue_cachep, q);
>  	}
> +
> +	/*
> +	 * Set an error pointer to ensure that @tsk will not cache a
> +	 * sigqueue when it is reaping it's child tasks
> +	 */
> +	tsk->sigqueue_cache = ERR_PTR(-1);
>  }


Reviewed-by: Oleg Nesterov <oleg@redhat.com>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] signal: Prevent sigqueue caching after task got released
  2021-06-22  6:34   ` Oleg Nesterov
@ 2021-06-22  7:59     ` Thomas Gleixner
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2021-06-22  7:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: syzbot, axboe, christian, ebiederm, elver, linux-kernel, pcc,
	peterz, syzkaller-bugs

On Tue, Jun 22 2021 at 08:34, Oleg Nesterov wrote:
> On 06/22, Thomas Gleixner wrote:
>
> I guess you mean the race with exit_notify() ? Could you spell please?

Yes let me rephrase that.

> I am just curious how exactly this problem was found.

I was looking at that syzbot report

  https://lore.kernel.org/r/000000000000148b4b05c419cbbb@google.com

and analyzed it how this ends up leaking memory.

> This doesn't really matter, because damn yes, a task T can call
> release_task(another_task)->sigqueue_cache_or_free() after
> exit_task_sigqueue_cache(T) was already called. For example, a last non-leader
> thread exits and reaps a zombie leader.
>
> Somehow I thought that exit_task_sigqueue_cache() at the end of __exit_signal()
> should fix this problem, but this is obviously wrong.
>
>
>> @@ -463,13 +469,18 @@ void exit_task_sigqueue_cache(struct tas
>>  	struct sigqueue *q = tsk->sigqueue_cache;
>>  
>>  	if (q) {
>> -		tsk->sigqueue_cache = NULL;
>>  		/*
>>  		 * Hand it back to the cache as the task might
>>  		 * be self reaping which would leak the object.
>>  		 */
>>  		 kmem_cache_free(sigqueue_cachep, q);
>>  	}
>> +
>> +	/*
>> +	 * Set an error pointer to ensure that @tsk will not cache a
>> +	 * sigqueue when it is reaping it's child tasks
>> +	 */
>> +	tsk->sigqueue_cache = ERR_PTR(-1);
>>  }
>
>
> Reviewed-by: Oleg Nesterov <oleg@redhat.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] signal: Prevent sigqueue caching after task got released
  2021-06-21 23:08 ` [PATCH] signal: Prevent sigqueue caching after task got released Thomas Gleixner
  2021-06-22  6:34   ` Oleg Nesterov
@ 2021-06-22  8:06   ` Christian Brauner
  2021-06-22 13:59   ` [tip: core/urgent] " tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 6+ messages in thread
From: Christian Brauner @ 2021-06-22  8:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: syzbot, axboe, christian, ebiederm, elver, linux-kernel, oleg,
	pcc, peterz, syzkaller-bugs

On Tue, Jun 22, 2021 at 01:08:30AM +0200, Thomas Gleixner wrote:
> syzbot reported a memory leak related to sigqueue caching. This happens
> when a thread group leader with child tasks is reaped.
> 
> The group leader's sigqueue_cache is correctly freed. The group leader then
> reaps the child tasks and if any of them has a signal pending it caches
> that signal. That's obviously bogus because nothing will free the cached
> signal of the reaped group leader anymore.
> 
> Prevent this by setting tsk::sigqueue_cache to an error pointer value in
> exit_task_sigqueue_cache().
> 
> Add comments to all relevant places.
> 
> Fixes: 4bad58ebc8bc ("signal: Allow tasks to cache one sigqueue struct")
> Reported-by: syzbot+0bac5fec63d4f399ba98@syzkaller.appspotmail.com
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [tip: core/urgent] signal: Prevent sigqueue caching after task got released
  2021-06-21 23:08 ` [PATCH] signal: Prevent sigqueue caching after task got released Thomas Gleixner
  2021-06-22  6:34   ` Oleg Nesterov
  2021-06-22  8:06   ` Christian Brauner
@ 2021-06-22 13:59   ` tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 6+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-22 13:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: syzbot+0bac5fec63d4f399ba98, Thomas Gleixner, Oleg Nesterov,
	Christian Brauner, x86, linux-kernel

The following commit has been merged into the core/urgent branch of tip:

Commit-ID:     399f8dd9a866e107639eabd3c1979cd526ca3a98
Gitweb:        https://git.kernel.org/tip/399f8dd9a866e107639eabd3c1979cd526ca3a98
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 22 Jun 2021 01:08:30 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 22 Jun 2021 15:55:41 +02:00

signal: Prevent sigqueue caching after task got released

syzbot reported a memory leak related to sigqueue caching.

The assumption that a task cannot cache a sigqueue after the signal handler
has been dropped and exit_task_sigqueue_cache() has been invoked turns out
to be wrong.

Such a task can still invoke release_task(other_task), which cleans up the
signals of 'other_task' and ends up in sigqueue_cache_or_free(), which in
turn will cache the signal because task->sigqueue_cache is NULL. That's
obviously bogus because nothing will free the cached signal of that task
anymore, so the cached item is leaked.

This happens when e.g. the last non-leader thread exits and reaps the
zombie leader.

Prevent this by setting tsk::sigqueue_cache to an error pointer value in
exit_task_sigqueue_cache() which forces any subsequent invocation of
sigqueue_cache_or_free() from that task to hand the sigqueue back to the
kmemcache.

Add comments to all relevant places.

Fixes: 4bad58ebc8bc ("signal: Allow tasks to cache one sigqueue struct")
Reported-by: syzbot+0bac5fec63d4f399ba98@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/878s32g6j5.ffs@nanos.tec.linutronix.de

---
 kernel/signal.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index f7c6ffc..f1ecd8f 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -435,6 +435,12 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t gfp_flags,
 		 * Preallocation does not hold sighand::siglock so it can't
 		 * use the cache. The lockless caching requires that only
 		 * one consumer and only one producer run at a time.
+		 *
+		 * For the regular allocation case it is sufficient to
+		 * check @q for NULL because this code can only be called
+		 * if the target task @t has not been reaped yet; which
+		 * means this code can never observe the error pointer which is
+		 * written to @t->sigqueue_cache in exit_task_sigqueue_cache().
 		 */
 		q = READ_ONCE(t->sigqueue_cache);
 		if (!q || sigqueue_flags)
@@ -463,13 +469,18 @@ void exit_task_sigqueue_cache(struct task_struct *tsk)
 	struct sigqueue *q = tsk->sigqueue_cache;
 
 	if (q) {
-		tsk->sigqueue_cache = NULL;
 		/*
 		 * Hand it back to the cache as the task might
 		 * be self reaping which would leak the object.
 		 */
 		 kmem_cache_free(sigqueue_cachep, q);
 	}
+
+	/*
+	 * Set an error pointer to ensure that @tsk will not cache a
+	 * sigqueue when it is reaping it's child tasks
+	 */
+	tsk->sigqueue_cache = ERR_PTR(-1);
 }
 
 static void sigqueue_cache_or_free(struct sigqueue *q)
@@ -481,6 +492,10 @@ static void sigqueue_cache_or_free(struct sigqueue *q)
 	 * is intentional when run without holding current->sighand->siglock,
 	 * which is fine as current obviously cannot run __sigqueue_free()
 	 * concurrently.
+	 *
+	 * The NULL check is safe even if current has been reaped already,
+	 * in which case exit_task_sigqueue_cache() wrote an error pointer
+	 * into current->sigqueue_cache.
 	 */
 	if (!READ_ONCE(current->sigqueue_cache))
 		WRITE_ONCE(current->sigqueue_cache, q);

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-06-22 13:59 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-06 14:32 [syzbot] memory leak in __send_signal syzbot
2021-06-21 23:08 ` [PATCH] signal: Prevent sigqueue caching after task got released Thomas Gleixner
2021-06-22  6:34   ` Oleg Nesterov
2021-06-22  7:59     ` Thomas Gleixner
2021-06-22  8:06   ` Christian Brauner
2021-06-22 13:59   ` [tip: core/urgent] " tip-bot2 for Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).