All of lore.kernel.org
 help / color / mirror / Atom feed
* KCSAN: data-race in taskstats_exit / taskstats_exit
@ 2019-10-05  4:26 syzbot
  2019-10-05  4:29 ` Dmitry Vyukov
                   ` (2 more replies)
  0 siblings, 3 replies; 62+ messages in thread
From: syzbot @ 2019-10-05  4:26 UTC (permalink / raw)
  To: bsingharora, elver, linux-kernel, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    b4bd9343 x86, kcsan: Enable KCSAN for x86
git tree:       https://github.com/google/ktsan.git kcsan
console output: https://syzkaller.appspot.com/x/log.txt?x=125329db600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
dashboard link: https://syzkaller.appspot.com/bug?extid=c5d03165a1bd1dead0c1
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in taskstats_exit / taskstats_exit

write to 0xffff8881157bbe10 of 8 bytes by task 7951 on cpu 0:
  taskstats_tgid_alloc kernel/taskstats.c:567 [inline]
  taskstats_exit+0x6b7/0x717 kernel/taskstats.c:596
  do_exit+0x2c2/0x18e0 kernel/exit.c:864
  do_group_exit+0xb4/0x1c0 kernel/exit.c:983
  get_signal+0x2a2/0x1320 kernel/signal.c:2734
  do_signal+0x3b/0xc00 arch/x86/kernel/signal.c:815
  exit_to_usermode_loop+0x250/0x2c0 arch/x86/entry/common.c:159
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x2d7/0x2f0 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

read to 0xffff8881157bbe10 of 8 bytes by task 7949 on cpu 1:
  taskstats_tgid_alloc kernel/taskstats.c:559 [inline]
  taskstats_exit+0xb2/0x717 kernel/taskstats.c:596
  do_exit+0x2c2/0x18e0 kernel/exit.c:864
  do_group_exit+0xb4/0x1c0 kernel/exit.c:983
  __do_sys_exit_group kernel/exit.c:994 [inline]
  __se_sys_exit_group kernel/exit.c:992 [inline]
  __x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
  do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 7949 Comm: syz-executor.3 Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
==================================================================
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 7949 Comm: syz-executor.3 Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xf5/0x159 lib/dump_stack.c:113
  panic+0x209/0x639 kernel/panic.c:219
  end_report kernel/kcsan/report.c:135 [inline]
  kcsan_report.cold+0x57/0xeb kernel/kcsan/report.c:283
  __kcsan_setup_watchpoint+0x342/0x500 kernel/kcsan/core.c:456
  __tsan_read8 kernel/kcsan/kcsan.c:31 [inline]
  __tsan_read8+0x2c/0x30 kernel/kcsan/kcsan.c:31
  taskstats_tgid_alloc kernel/taskstats.c:559 [inline]
  taskstats_exit+0xb2/0x717 kernel/taskstats.c:596
  do_exit+0x2c2/0x18e0 kernel/exit.c:864
  do_group_exit+0xb4/0x1c0 kernel/exit.c:983
  __do_sys_exit_group kernel/exit.c:994 [inline]
  __se_sys_exit_group kernel/exit.c:992 [inline]
  __x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
  do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x459a59
Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc4ccf3408 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000459a59
RDX: 0000000000413741 RSI: fffffffffffffff7 RDI: 0000000000000000
RBP: 0000000000000000 R08: 000000007f8e4506 R09: 00007ffc4ccf3460
R10: ffffffff81007108 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffc4ccf3460 R14: 0000000000000000 R15: 00007ffc4ccf3470
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: KCSAN: data-race in taskstats_exit / taskstats_exit
  2019-10-05  4:26 KCSAN: data-race in taskstats_exit / taskstats_exit syzbot
@ 2019-10-05  4:29 ` Dmitry Vyukov
  2019-10-05 11:29   ` Christian Brauner
  2019-10-05 11:28 ` [PATCH] taskstats: fix data-race Christian Brauner
  2019-11-06  0:09 ` KCSAN: data-race in taskstats_exit / taskstats_exit Balbir Singh
  2 siblings, 1 reply; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-05  4:29 UTC (permalink / raw)
  To: syzbot, Christian Brauner; +Cc: bsingharora, Marco Elver, LKML, syzkaller-bugs

On Sat, Oct 5, 2019 at 6:26 AM syzbot
<syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    b4bd9343 x86, kcsan: Enable KCSAN for x86
> git tree:       https://github.com/google/ktsan.git kcsan
> console output: https://syzkaller.appspot.com/x/log.txt?x=125329db600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> dashboard link: https://syzkaller.appspot.com/bug?extid=c5d03165a1bd1dead0c1
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com

+Christian, you wanted races in process mgmt ;)


> ==================================================================
> BUG: KCSAN: data-race in taskstats_exit / taskstats_exit
>
> write to 0xffff8881157bbe10 of 8 bytes by task 7951 on cpu 0:
>   taskstats_tgid_alloc kernel/taskstats.c:567 [inline]
>   taskstats_exit+0x6b7/0x717 kernel/taskstats.c:596
>   do_exit+0x2c2/0x18e0 kernel/exit.c:864
>   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
>   get_signal+0x2a2/0x1320 kernel/signal.c:2734
>   do_signal+0x3b/0xc00 arch/x86/kernel/signal.c:815
>   exit_to_usermode_loop+0x250/0x2c0 arch/x86/entry/common.c:159
>   prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
>   syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
>   do_syscall_64+0x2d7/0x2f0 arch/x86/entry/common.c:299
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> read to 0xffff8881157bbe10 of 8 bytes by task 7949 on cpu 1:
>   taskstats_tgid_alloc kernel/taskstats.c:559 [inline]
>   taskstats_exit+0xb2/0x717 kernel/taskstats.c:596
>   do_exit+0x2c2/0x18e0 kernel/exit.c:864
>   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
>   __do_sys_exit_group kernel/exit.c:994 [inline]
>   __se_sys_exit_group kernel/exit.c:992 [inline]
>   __x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
>   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 PID: 7949 Comm: syz-executor.3 Not tainted 5.3.0+ #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> ==================================================================
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 1 PID: 7949 Comm: syz-executor.3 Not tainted 5.3.0+ #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>   __dump_stack lib/dump_stack.c:77 [inline]
>   dump_stack+0xf5/0x159 lib/dump_stack.c:113
>   panic+0x209/0x639 kernel/panic.c:219
>   end_report kernel/kcsan/report.c:135 [inline]
>   kcsan_report.cold+0x57/0xeb kernel/kcsan/report.c:283
>   __kcsan_setup_watchpoint+0x342/0x500 kernel/kcsan/core.c:456
>   __tsan_read8 kernel/kcsan/kcsan.c:31 [inline]
>   __tsan_read8+0x2c/0x30 kernel/kcsan/kcsan.c:31
>   taskstats_tgid_alloc kernel/taskstats.c:559 [inline]
>   taskstats_exit+0xb2/0x717 kernel/taskstats.c:596
>   do_exit+0x2c2/0x18e0 kernel/exit.c:864
>   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
>   __do_sys_exit_group kernel/exit.c:994 [inline]
>   __se_sys_exit_group kernel/exit.c:992 [inline]
>   __x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
>   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x459a59
> Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
> ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:00007ffc4ccf3408 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000459a59
> RDX: 0000000000413741 RSI: fffffffffffffff7 RDI: 0000000000000000
> RBP: 0000000000000000 R08: 000000007f8e4506 R09: 00007ffc4ccf3460
> R10: ffffffff81007108 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007ffc4ccf3460 R14: 0000000000000000 R15: 00007ffc4ccf3470
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/0000000000009b403005942237bf%40google.com.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH] taskstats: fix data-race
  2019-10-05  4:26 KCSAN: data-race in taskstats_exit / taskstats_exit syzbot
  2019-10-05  4:29 ` Dmitry Vyukov
@ 2019-10-05 11:28 ` Christian Brauner
  2019-10-05 13:33   ` Marco Elver
                     ` (3 more replies)
  2019-11-06  0:09 ` KCSAN: data-race in taskstats_exit / taskstats_exit Balbir Singh
  2 siblings, 4 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-05 11:28 UTC (permalink / raw)
  To: syzbot+c5d03165a1bd1dead0c1
  Cc: bsingharora, elver, linux-kernel, syzkaller-bugs, Christian Brauner

When assiging and testing taskstats in taskstats
taskstats_exit() there's a race around writing and reading sig->stats.

cpu0:
task calls exit()
do_exit()
	-> taskstats_exit()
		-> taskstats_tgid_alloc()
The task takes sighand lock and assigns new stats to sig->stats.

cpu1:
task catches signal
do_exit()
	-> taskstats_tgid_alloc()
		-> taskstats_exit()
The tasks reads sig->stats __without__ holding sighand lock seeing
garbage.

Fix this by taking sighand lock when reading sig->stats.

Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 kernel/taskstats.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 13a0f2e6ebc2..58b145234c4a 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -553,26 +553,32 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
 
 static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
 {
+	int empty;
+	struct taskstats *stats_new, *stats = NULL;
 	struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
-
-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
 
 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+
+	empty = thread_group_empty(tsk);
 
 	spin_lock_irq(&tsk->sighand->siglock);
+	if (sig->stats || empty) {
+		stats = sig->stats;
+		spin_unlock_irq(&tsk->sighand->siglock);
+		goto free_cache;
+	}
+
 	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+		sig->stats = stats_new;
+		spin_unlock_irq(&tsk->sighand->siglock);
+		return stats_new;
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);
 
-	if (stats)
-		kmem_cache_free(taskstats_cache, stats);
-ret:
-	return sig->stats;
+free_cache:
+	kmem_cache_free(taskstats_cache, stats_new);
+	return stats;
 }
 
 /* Send pid data out on exit */
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: KCSAN: data-race in taskstats_exit / taskstats_exit
  2019-10-05  4:29 ` Dmitry Vyukov
@ 2019-10-05 11:29   ` Christian Brauner
  0 siblings, 0 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-05 11:29 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: syzbot, bsingharora, Marco Elver, LKML, syzkaller-bugs

On Sat, Oct 05, 2019 at 06:29:39AM +0200, Dmitry Vyukov wrote:
> On Sat, Oct 5, 2019 at 6:26 AM syzbot
> <syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    b4bd9343 x86, kcsan: Enable KCSAN for x86
> > git tree:       https://github.com/google/ktsan.git kcsan
> > console output: https://syzkaller.appspot.com/x/log.txt?x=125329db600000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> > dashboard link: https://syzkaller.appspot.com/bug?extid=c5d03165a1bd1dead0c1
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> 
> +Christian, you wanted races in process mgmt ;)

Yes, indeed. :) Sent a fix for this one just now. I'll put it in my
fixes tree for rc3.

Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-05 11:28 ` [PATCH] taskstats: fix data-race Christian Brauner
@ 2019-10-05 13:33   ` Marco Elver
  2019-10-05 14:15     ` Christian Brauner
  2019-10-06 10:00   ` Dmitry Vyukov
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 62+ messages in thread
From: Marco Elver @ 2019-10-05 13:33 UTC (permalink / raw)
  To: Christian Brauner
  Cc: syzbot+c5d03165a1bd1dead0c1, bsingharora, LKML, syzkaller-bugs

On Sat, 5 Oct 2019 at 13:28, Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> When assiging and testing taskstats in taskstats
> taskstats_exit() there's a race around writing and reading sig->stats.
>
> cpu0:
> task calls exit()
> do_exit()
>         -> taskstats_exit()
>                 -> taskstats_tgid_alloc()
> The task takes sighand lock and assigns new stats to sig->stats.
>
> cpu1:
> task catches signal
> do_exit()
>         -> taskstats_tgid_alloc()
>                 -> taskstats_exit()
> The tasks reads sig->stats __without__ holding sighand lock seeing
> garbage.

Is the task seeing garbage reading the data pointed to by stats, or is
this just the pointer that would be garbage?

My only observation here is that the previous version was trying to do
double-checked locking, to avoid taking the lock if sig->stats was
already set. The obvious problem with the previous version is plain
read/write and missing memory ordering: the write inside the critical
section should be smp_store_release and there should only be one
smp_load_acquire at the start.

Maybe I missed something somewhere, but maybe my suggestion below
would be an equivalent fix without always having to take the lock to
assign the pointer? If performance is not critical here, then it's
probably not worth it.

Thanks,
-- Marco

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 13a0f2e6ebc2..f58dd285a44b 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -554,25 +554,31 @@ static int taskstats_user_cmd(struct sk_buff
*skb, struct genl_info *info)
 static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
 {
  struct signal_struct *sig = tsk->signal;
- struct taskstats *stats;
+ struct taskstats *stats_new, *stats;

- if (sig->stats || thread_group_empty(tsk))
+ /* acquire load to make pointed-to data visible */
+ stats = smp_load_acquire(&sig->stats);
+ if (stats || thread_group_empty(tsk))
  goto ret;

  /* No problem if kmem_cache_zalloc() fails */
- stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+ stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

  spin_lock_irq(&tsk->sighand->siglock);
- if (!sig->stats) {
- sig->stats = stats;
- stats = NULL;
+ stats = sig->stats;
+ if (!stats) {
+ stats = stats_new;
+ /* release store to order zalloc before */
+ smp_store_release(&sig->stats, stats_new);
+ stats_new = NULL;
  }
  spin_unlock_irq(&tsk->sighand->siglock);

- if (stats)
- kmem_cache_free(taskstats_cache, stats);
+ if (stats_new)
+ kmem_cache_free(taskstats_cache, stats_new);
+
 ret:
- return sig->stats;
+ return stats;
 }

 /* Send pid data out on exit */

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-05 13:33   ` Marco Elver
@ 2019-10-05 14:15     ` Christian Brauner
  2019-10-05 14:34       ` Marco Elver
  0 siblings, 1 reply; 62+ messages in thread
From: Christian Brauner @ 2019-10-05 14:15 UTC (permalink / raw)
  To: Marco Elver
  Cc: syzbot+c5d03165a1bd1dead0c1, bsingharora, LKML, syzkaller-bugs

On Sat, Oct 05, 2019 at 03:33:07PM +0200, Marco Elver wrote:
> On Sat, 5 Oct 2019 at 13:28, Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> >
> > When assiging and testing taskstats in taskstats
> > taskstats_exit() there's a race around writing and reading sig->stats.
> >
> > cpu0:
> > task calls exit()
> > do_exit()
> >         -> taskstats_exit()
> >                 -> taskstats_tgid_alloc()
> > The task takes sighand lock and assigns new stats to sig->stats.
> >
> > cpu1:
> > task catches signal
> > do_exit()
> >         -> taskstats_tgid_alloc()
> >                 -> taskstats_exit()
> > The tasks reads sig->stats __without__ holding sighand lock seeing
> > garbage.
> 
> Is the task seeing garbage reading the data pointed to by stats, or is
> this just the pointer that would be garbage?

I expect the pointer to be garbage.

> 
> My only observation here is that the previous version was trying to do
> double-checked locking, to avoid taking the lock if sig->stats was
> already set. The obvious problem with the previous version is plain
> read/write and missing memory ordering: the write inside the critical
> section should be smp_store_release and there should only be one
> smp_load_acquire at the start.
> 
> Maybe I missed something somewhere, but maybe my suggestion below
> would be an equivalent fix without always having to take the lock to
> assign the pointer? If performance is not critical here, then it's
> probably not worth it.

The only point of contention is when the whole thread-group exits (e.g.
via exit_group(2) since threads in a thread-group share signal struct).
The reason I didn't do memory barriers was because we need to take the
spinlock for the actual list manipulation anyway.
But I don't mind incorporating the acquire/release.

Christian

> 
> Thanks,
> -- Marco
> 
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index 13a0f2e6ebc2..f58dd285a44b 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -554,25 +554,31 @@ static int taskstats_user_cmd(struct sk_buff
> *skb, struct genl_info *info)
>  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
>  {
>   struct signal_struct *sig = tsk->signal;
> - struct taskstats *stats;
> + struct taskstats *stats_new, *stats;
> 
> - if (sig->stats || thread_group_empty(tsk))
> + /* acquire load to make pointed-to data visible */
> + stats = smp_load_acquire(&sig->stats);
> + if (stats || thread_group_empty(tsk))
>   goto ret;
> 
>   /* No problem if kmem_cache_zalloc() fails */
> - stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> + stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> 
>   spin_lock_irq(&tsk->sighand->siglock);
> - if (!sig->stats) {
> - sig->stats = stats;
> - stats = NULL;
> + stats = sig->stats;
> + if (!stats) {
> + stats = stats_new;
> + /* release store to order zalloc before */
> + smp_store_release(&sig->stats, stats_new);
> + stats_new = NULL;
>   }
>   spin_unlock_irq(&tsk->sighand->siglock);
> 
> - if (stats)
> - kmem_cache_free(taskstats_cache, stats);
> + if (stats_new)
> + kmem_cache_free(taskstats_cache, stats_new);
> +
>  ret:
> - return sig->stats;
> + return stats;
>  }
> 
>  /* Send pid data out on exit */

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-05 14:15     ` Christian Brauner
@ 2019-10-05 14:34       ` Marco Elver
  0 siblings, 0 replies; 62+ messages in thread
From: Marco Elver @ 2019-10-05 14:34 UTC (permalink / raw)
  To: Christian Brauner
  Cc: syzbot+c5d03165a1bd1dead0c1, bsingharora, LKML, syzkaller-bugs

On Sat, 5 Oct 2019 at 16:15, Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> On Sat, Oct 05, 2019 at 03:33:07PM +0200, Marco Elver wrote:
> > On Sat, 5 Oct 2019 at 13:28, Christian Brauner
> > <christian.brauner@ubuntu.com> wrote:
> > >
> > > When assiging and testing taskstats in taskstats
> > > taskstats_exit() there's a race around writing and reading sig->stats.
> > >
> > > cpu0:
> > > task calls exit()
> > > do_exit()
> > >         -> taskstats_exit()
> > >                 -> taskstats_tgid_alloc()
> > > The task takes sighand lock and assigns new stats to sig->stats.
> > >
> > > cpu1:
> > > task catches signal
> > > do_exit()
> > >         -> taskstats_tgid_alloc()
> > >                 -> taskstats_exit()
> > > The tasks reads sig->stats __without__ holding sighand lock seeing
> > > garbage.
> >
> > Is the task seeing garbage reading the data pointed to by stats, or is
> > this just the pointer that would be garbage?
>
> I expect the pointer to be garbage.
>
> >
> > My only observation here is that the previous version was trying to do
> > double-checked locking, to avoid taking the lock if sig->stats was
> > already set. The obvious problem with the previous version is plain
> > read/write and missing memory ordering: the write inside the critical
> > section should be smp_store_release and there should only be one
> > smp_load_acquire at the start.
> >
> > Maybe I missed something somewhere, but maybe my suggestion below
> > would be an equivalent fix without always having to take the lock to
> > assign the pointer? If performance is not critical here, then it's
> > probably not worth it.
>
> The only point of contention is when the whole thread-group exits (e.g.
> via exit_group(2) since threads in a thread-group share signal struct).
> The reason I didn't do memory barriers was because we need to take the
> spinlock for the actual list manipulation anyway.
> But I don't mind incorporating the acquire/release.

Thanks for the clarification. Since you no longer do double-checked
locking, explicit memory barriers shouldn't be needed because
spin_lock/unlock already provides acquire/release ordering.

-- Marco

> Christian
>
> >
> > Thanks,
> > -- Marco
> >
> > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > index 13a0f2e6ebc2..f58dd285a44b 100644
> > --- a/kernel/taskstats.c
> > +++ b/kernel/taskstats.c
> > @@ -554,25 +554,31 @@ static int taskstats_user_cmd(struct sk_buff
> > *skb, struct genl_info *info)
> >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> >  {
> >   struct signal_struct *sig = tsk->signal;
> > - struct taskstats *stats;
> > + struct taskstats *stats_new, *stats;
> >
> > - if (sig->stats || thread_group_empty(tsk))
> > + /* acquire load to make pointed-to data visible */
> > + stats = smp_load_acquire(&sig->stats);
> > + if (stats || thread_group_empty(tsk))
> >   goto ret;
> >
> >   /* No problem if kmem_cache_zalloc() fails */
> > - stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > + stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> >
> >   spin_lock_irq(&tsk->sighand->siglock);
> > - if (!sig->stats) {
> > - sig->stats = stats;
> > - stats = NULL;
> > + stats = sig->stats;
> > + if (!stats) {
> > + stats = stats_new;
> > + /* release store to order zalloc before */
> > + smp_store_release(&sig->stats, stats_new);
> > + stats_new = NULL;
> >   }
> >   spin_unlock_irq(&tsk->sighand->siglock);
> >
> > - if (stats)
> > - kmem_cache_free(taskstats_cache, stats);
> > + if (stats_new)
> > + kmem_cache_free(taskstats_cache, stats_new);
> > +
> >  ret:
> > - return sig->stats;
> > + return stats;
> >  }
> >
> >  /* Send pid data out on exit */

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-05 11:28 ` [PATCH] taskstats: fix data-race Christian Brauner
  2019-10-05 13:33   ` Marco Elver
@ 2019-10-06 10:00   ` Dmitry Vyukov
  2019-10-06 10:59     ` Christian Brauner
  2019-10-06 23:52   ` Christian Brauner
  2019-11-06  0:27   ` Balbir Singh
  3 siblings, 1 reply; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-06 10:00 UTC (permalink / raw)
  To: Christian Brauner; +Cc: syzbot, bsingharora, Marco Elver, LKML, syzkaller-bugs

On Sat, Oct 5, 2019 at 1:28 PM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> When assiging and testing taskstats in taskstats
> taskstats_exit() there's a race around writing and reading sig->stats.
>
> cpu0:
> task calls exit()
> do_exit()
>         -> taskstats_exit()
>                 -> taskstats_tgid_alloc()
> The task takes sighand lock and assigns new stats to sig->stats.
>
> cpu1:
> task catches signal
> do_exit()
>         -> taskstats_tgid_alloc()
>                 -> taskstats_exit()
> The tasks reads sig->stats __without__ holding sighand lock seeing
> garbage.
>
> Fix this by taking sighand lock when reading sig->stats.
>
> Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> ---
>  kernel/taskstats.c | 28 +++++++++++++++++-----------
>  1 file changed, 17 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index 13a0f2e6ebc2..58b145234c4a 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -553,26 +553,32 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
>
>  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
>  {
> +       int empty;
> +       struct taskstats *stats_new, *stats = NULL;
>         struct signal_struct *sig = tsk->signal;
> -       struct taskstats *stats;
> -
> -       if (sig->stats || thread_group_empty(tsk))
> -               goto ret;
>
>         /* No problem if kmem_cache_zalloc() fails */
> -       stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> +       stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

This seems to be over-pessimistic wrt performance b/c:
1. We always allocate the object and free it on every call, even if
the stats are already allocated, whereas currently we don't.
2. We always allocate the object and free it if thread_group_empty,
whereas currently we don't.
3. We do lock/unlock on every call.

I would suggest to fix the double-checked locking properly.
Locking is not the only correct way to synchronize things. Lock-free
synchronization is also possible. It's more tricky, but it can be
correct and it's supported by KCSAN/KTSAN. It just needs to be
properly implemented and expressed. For some cases we may decide to
switch to locking instead, but it needs to be an explicit decision.

We can fix the current code by doing READ_ONCE on sig->stats (which
implies smp_read_barrier_depends since 4.15), and storing to it with
smp_store_release.

> +       empty = thread_group_empty(tsk);
>
>         spin_lock_irq(&tsk->sighand->siglock);
> +       if (sig->stats || empty) {
> +               stats = sig->stats;
> +               spin_unlock_irq(&tsk->sighand->siglock);
> +               goto free_cache;
> +       }
> +
>         if (!sig->stats) {
> -               sig->stats = stats;
> -               stats = NULL;
> +               sig->stats = stats_new;
> +               spin_unlock_irq(&tsk->sighand->siglock);
> +               return stats_new;
>         }
>         spin_unlock_irq(&tsk->sighand->siglock);
>
> -       if (stats)
> -               kmem_cache_free(taskstats_cache, stats);
> -ret:
> -       return sig->stats;
> +free_cache:
> +       kmem_cache_free(taskstats_cache, stats_new);
> +       return stats;
>  }
>
>  /* Send pid data out on exit */
> --
> 2.23.0
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/20191005112806.13960-1-christian.brauner%40ubuntu.com.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-06 10:00   ` Dmitry Vyukov
@ 2019-10-06 10:59     ` Christian Brauner
  0 siblings, 0 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-06 10:59 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: syzbot, bsingharora, Marco Elver, LKML, syzkaller-bugs

On Sun, Oct 06, 2019 at 12:00:32PM +0200, Dmitry Vyukov wrote:
> On Sat, Oct 5, 2019 at 1:28 PM Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> >
> > When assiging and testing taskstats in taskstats
> > taskstats_exit() there's a race around writing and reading sig->stats.
> >
> > cpu0:
> > task calls exit()
> > do_exit()
> >         -> taskstats_exit()
> >                 -> taskstats_tgid_alloc()
> > The task takes sighand lock and assigns new stats to sig->stats.
> >
> > cpu1:
> > task catches signal
> > do_exit()
> >         -> taskstats_tgid_alloc()
> >                 -> taskstats_exit()
> > The tasks reads sig->stats __without__ holding sighand lock seeing
> > garbage.
> >
> > Fix this by taking sighand lock when reading sig->stats.
> >
> > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > ---
> >  kernel/taskstats.c | 28 +++++++++++++++++-----------
> >  1 file changed, 17 insertions(+), 11 deletions(-)
> >
> > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > index 13a0f2e6ebc2..58b145234c4a 100644
> > --- a/kernel/taskstats.c
> > +++ b/kernel/taskstats.c
> > @@ -553,26 +553,32 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> >
> >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> >  {
> > +       int empty;
> > +       struct taskstats *stats_new, *stats = NULL;
> >         struct signal_struct *sig = tsk->signal;
> > -       struct taskstats *stats;
> > -
> > -       if (sig->stats || thread_group_empty(tsk))
> > -               goto ret;
> >
> >         /* No problem if kmem_cache_zalloc() fails */
> > -       stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > +       stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> 
> This seems to be over-pessimistic wrt performance b/c:
> 1. We always allocate the object and free it on every call, even if
> the stats are already allocated, whereas currently we don't.
> 2. We always allocate the object and free it if thread_group_empty,
> whereas currently we don't.
> 3. We do lock/unlock on every call.
> 
> I would suggest to fix the double-checked locking properly.

As I said in my reply to Marco: I'm integrating this. I just haven't
had time to update the patch.

Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH] taskstats: fix data-race
  2019-10-05 11:28 ` [PATCH] taskstats: fix data-race Christian Brauner
  2019-10-05 13:33   ` Marco Elver
  2019-10-06 10:00   ` Dmitry Vyukov
@ 2019-10-06 23:52   ` Christian Brauner
  2019-10-07  7:31     ` Dmitry Vyukov
  2019-10-07 10:40     ` Andrea Parri
  2019-11-06  0:27   ` Balbir Singh
  3 siblings, 2 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-06 23:52 UTC (permalink / raw)
  To: christian.brauner
  Cc: bsingharora, elver, linux-kernel, syzbot+c5d03165a1bd1dead0c1,
	syzkaller-bugs, Dmitry Vyukov

When assiging and testing taskstats in taskstats_exit() there's a race
when writing and reading sig->stats when a thread-group with more than
one thread exits:

cpu0:
thread catches fatal signal and whole thread-group gets taken down
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The tasks reads sig->stats holding sighand lock seeing garbage.

cpu1:
task calls exit_group()
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The task takes sighand lock and assigns new stats to sig->stats.

Fix this by using READ_ONCE() and smp_store_release().

Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v1 */
Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com

/* v2 */
- Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
  - fix the original double-checked locking using memory barriers
---
 kernel/taskstats.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 13a0f2e6ebc2..8ee046e8a792 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -554,24 +554,25 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
 static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
 {
 	struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;
 
-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	stats = READ_ONCE(sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;
 
 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
 
 	spin_lock_irq(&tsk->sighand->siglock);
 	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+		smp_store_release(&sig->stats, stats_new);
+		stats_new = NULL;
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);
 
-	if (stats)
-		kmem_cache_free(taskstats_cache, stats);
-ret:
+	if (stats_new)
+		kmem_cache_free(taskstats_cache, stats_new);
+
 	return sig->stats;
 }
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-06 23:52   ` Christian Brauner
@ 2019-10-07  7:31     ` Dmitry Vyukov
  2019-10-07  9:29       ` Christian Brauner
  2019-10-07 10:40     ` Andrea Parri
  1 sibling, 1 reply; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-07  7:31 UTC (permalink / raw)
  To: Christian Brauner; +Cc: bsingharora, Marco Elver, LKML, syzbot, syzkaller-bugs

On Mon, Oct 7, 2019 at 1:52 AM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> When assiging and testing taskstats in taskstats_exit() there's a race
> when writing and reading sig->stats when a thread-group with more than
> one thread exits:
>
> cpu0:
> thread catches fatal signal and whole thread-group gets taken down
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The tasks reads sig->stats holding sighand lock seeing garbage.
>
> cpu1:
> task calls exit_group()
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The task takes sighand lock and assigns new stats to sig->stats.
>
> Fix this by using READ_ONCE() and smp_store_release().
>
> Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> ---
> /* v1 */
> Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com
>
> /* v2 */
> - Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
>   - fix the original double-checked locking using memory barriers
> ---
>  kernel/taskstats.c | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)

Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

Thanks!

> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index 13a0f2e6ebc2..8ee046e8a792 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -554,24 +554,25 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
>  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
>  {
>         struct signal_struct *sig = tsk->signal;
> -       struct taskstats *stats;
> +       struct taskstats *stats_new, *stats;
>
> -       if (sig->stats || thread_group_empty(tsk))
> -               goto ret;
> +       stats = READ_ONCE(sig->stats);
> +       if (stats || thread_group_empty(tsk))
> +               return stats;
>
>         /* No problem if kmem_cache_zalloc() fails */
> -       stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> +       stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
>
>         spin_lock_irq(&tsk->sighand->siglock);
>         if (!sig->stats) {
> -               sig->stats = stats;
> -               stats = NULL;
> +               smp_store_release(&sig->stats, stats_new);
> +               stats_new = NULL;
>         }
>         spin_unlock_irq(&tsk->sighand->siglock);
>
> -       if (stats)
> -               kmem_cache_free(taskstats_cache, stats);
> -ret:
> +       if (stats_new)
> +               kmem_cache_free(taskstats_cache, stats_new);
> +
>         return sig->stats;
>  }
>
> --
> 2.23.0
>

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-07  7:31     ` Dmitry Vyukov
@ 2019-10-07  9:29       ` Christian Brauner
  0 siblings, 0 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-07  9:29 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: bsingharora, Marco Elver, LKML, syzbot, syzkaller-bugs

On Mon, Oct 07, 2019 at 09:31:16AM +0200, Dmitry Vyukov wrote:
> On Mon, Oct 7, 2019 at 1:52 AM Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> >
> > When assiging and testing taskstats in taskstats_exit() there's a race
> > when writing and reading sig->stats when a thread-group with more than
> > one thread exits:
> >
> > cpu0:
> > thread catches fatal signal and whole thread-group gets taken down
> >  do_exit()
> >  do_group_exit()
> >  taskstats_exit()
> >  taskstats_tgid_alloc()
> > The tasks reads sig->stats holding sighand lock seeing garbage.
> >
> > cpu1:
> > task calls exit_group()
> >  do_exit()
> >  do_group_exit()
> >  taskstats_exit()
> >  taskstats_tgid_alloc()
> > The task takes sighand lock and assigns new stats to sig->stats.
> >
> > Fix this by using READ_ONCE() and smp_store_release().
> >
> > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > Cc: Dmitry Vyukov <dvyukov@google.com>
> > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > ---
> > /* v1 */
> > Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com
> >
> > /* v2 */
> > - Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
> >   - fix the original double-checked locking using memory barriers
> > ---
> >  kernel/taskstats.c | 19 ++++++++++---------
> >  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

Applied to:
https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=fixes

Should show up in linux-next tomorrow.
Targeting v5.4-rc3.
Cced stable.

Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-06 23:52   ` Christian Brauner
  2019-10-07  7:31     ` Dmitry Vyukov
@ 2019-10-07 10:40     ` Andrea Parri
  2019-10-07 10:50       ` Christian Brauner
  2019-10-07 11:01       ` [PATCH v2] " Christian Brauner
  1 sibling, 2 replies; 62+ messages in thread
From: Andrea Parri @ 2019-10-07 10:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: bsingharora, elver, linux-kernel, syzbot+c5d03165a1bd1dead0c1,
	syzkaller-bugs, Dmitry Vyukov

Hi Christian,

On Mon, Oct 07, 2019 at 01:52:16AM +0200, Christian Brauner wrote:
> When assiging and testing taskstats in taskstats_exit() there's a race
> when writing and reading sig->stats when a thread-group with more than
> one thread exits:
> 
> cpu0:
> thread catches fatal signal and whole thread-group gets taken down
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The tasks reads sig->stats holding sighand lock seeing garbage.
> 
> cpu1:
> task calls exit_group()
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The task takes sighand lock and assigns new stats to sig->stats.
> 
> Fix this by using READ_ONCE() and smp_store_release().
> 
> Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>

FYI, checkpatch.pl says:

WARNING: memory barrier without comment
#62: FILE: kernel/taskstats.c:568:
+		smp_store_release(&sig->stats, stats_new);

Maybe you can make checkpatch.pl happy  ;-) and add a comment to stress
the 'pairing' between this barrier and the added READ_ONCE() (as Dmitry
was alluding to in a previous post)?

Thanks,
  Andrea


> ---
> /* v1 */
> Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com
> 
> /* v2 */
> - Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
>   - fix the original double-checked locking using memory barriers
> ---
>  kernel/taskstats.c | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index 13a0f2e6ebc2..8ee046e8a792 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -554,24 +554,25 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
>  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
>  {
>  	struct signal_struct *sig = tsk->signal;
> -	struct taskstats *stats;
> +	struct taskstats *stats_new, *stats;
>  
> -	if (sig->stats || thread_group_empty(tsk))
> -		goto ret;
> +	stats = READ_ONCE(sig->stats);
> +	if (stats || thread_group_empty(tsk))
> +		return stats;
>  
>  	/* No problem if kmem_cache_zalloc() fails */
> -	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> +	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
>  
>  	spin_lock_irq(&tsk->sighand->siglock);
>  	if (!sig->stats) {
> -		sig->stats = stats;
> -		stats = NULL;
> +		smp_store_release(&sig->stats, stats_new);




> +		stats_new = NULL;
>  	}
>  	spin_unlock_irq(&tsk->sighand->siglock);
>  
> -	if (stats)
> -		kmem_cache_free(taskstats_cache, stats);
> -ret:
> +	if (stats_new)
> +		kmem_cache_free(taskstats_cache, stats_new);
> +
>  	return sig->stats;
>  }
>  
> -- 
> 2.23.0
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-07 10:40     ` Andrea Parri
@ 2019-10-07 10:50       ` Christian Brauner
  2019-10-07 11:01       ` [PATCH v2] " Christian Brauner
  1 sibling, 0 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-07 10:50 UTC (permalink / raw)
  To: Andrea Parri
  Cc: bsingharora, elver, linux-kernel, syzbot+c5d03165a1bd1dead0c1,
	syzkaller-bugs, Dmitry Vyukov

On Mon, Oct 07, 2019 at 12:40:39PM +0200, Andrea Parri wrote:
> Hi Christian,
> 
> On Mon, Oct 07, 2019 at 01:52:16AM +0200, Christian Brauner wrote:
> > When assiging and testing taskstats in taskstats_exit() there's a race
> > when writing and reading sig->stats when a thread-group with more than
> > one thread exits:
> > 
> > cpu0:
> > thread catches fatal signal and whole thread-group gets taken down
> >  do_exit()
> >  do_group_exit()
> >  taskstats_exit()
> >  taskstats_tgid_alloc()
> > The tasks reads sig->stats holding sighand lock seeing garbage.
> > 
> > cpu1:
> > task calls exit_group()
> >  do_exit()
> >  do_group_exit()
> >  taskstats_exit()
> >  taskstats_tgid_alloc()
> > The task takes sighand lock and assigns new stats to sig->stats.
> > 
> > Fix this by using READ_ONCE() and smp_store_release().
> > 
> > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > Cc: Dmitry Vyukov <dvyukov@google.com>
> > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> 
> FYI, checkpatch.pl says:
> 
> WARNING: memory barrier without comment
> #62: FILE: kernel/taskstats.c:568:
> +		smp_store_release(&sig->stats, stats_new);
> 
> Maybe you can make checkpatch.pl happy  ;-) and add a comment to stress
> the 'pairing' between this barrier and the added READ_ONCE() (as Dmitry
> was alluding to in a previous post)?

Of course. I totally forgot the memory barrier documentation requirement.

Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2] taskstats: fix data-race
  2019-10-07 10:40     ` Andrea Parri
  2019-10-07 10:50       ` Christian Brauner
@ 2019-10-07 11:01       ` Christian Brauner
  2019-10-07 13:18         ` Andrea Parri
  1 sibling, 1 reply; 62+ messages in thread
From: Christian Brauner @ 2019-10-07 11:01 UTC (permalink / raw)
  To: parri.andrea
  Cc: bsingharora, christian.brauner, dvyukov, elver, linux-kernel,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs, stable

When assiging and testing taskstats in taskstats_exit() there's a race
when writing and reading sig->stats when a thread-group with more than
one thread exits:

cpu0:
thread catches fatal signal and whole thread-group gets taken down
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The tasks reads sig->stats holding sighand lock seeing garbage.

cpu1:
task calls exit_group()
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The task takes sighand lock and assigns new stats to sig->stats.

Fix this by using READ_ONCE() and smp_store_release().

Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
---
/* v1 */
Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com

/* v2 */
- Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
  - fix the original double-checked locking using memory barriers

/* v3 */
- Andrea Parri <parri.andrea@gmail.com>:
  - document memory barriers to make checkpatch happy
---
 kernel/taskstats.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 13a0f2e6ebc2..978d7931fb65 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
 static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
 {
 	struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;
 
-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with smp_store_release() below. */
+	stats = READ_ONCE(sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;
 
 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
 
 	spin_lock_irq(&tsk->sighand->siglock);
 	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+		/* Pairs with READ_ONCE() above. */
+		smp_store_release(&sig->stats, stats_new);
+		stats_new = NULL;
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);
 
-	if (stats)
-		kmem_cache_free(taskstats_cache, stats);
-ret:
+	if (stats_new)
+		kmem_cache_free(taskstats_cache, stats_new);
+
 	return sig->stats;
 }
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-07 11:01       ` [PATCH v2] " Christian Brauner
@ 2019-10-07 13:18         ` Andrea Parri
  2019-10-07 13:28           ` Christian Brauner
  2019-10-07 13:50           ` Dmitry Vyukov
  0 siblings, 2 replies; 62+ messages in thread
From: Andrea Parri @ 2019-10-07 13:18 UTC (permalink / raw)
  To: Christian Brauner
  Cc: bsingharora, dvyukov, elver, linux-kernel,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs, stable

On Mon, Oct 07, 2019 at 01:01:17PM +0200, Christian Brauner wrote:
> When assiging and testing taskstats in taskstats_exit() there's a race
> when writing and reading sig->stats when a thread-group with more than
> one thread exits:
> 
> cpu0:
> thread catches fatal signal and whole thread-group gets taken down
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The tasks reads sig->stats holding sighand lock seeing garbage.

You meant "without holding sighand lock" here, right?


> 
> cpu1:
> task calls exit_group()
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The task takes sighand lock and assigns new stats to sig->stats.
> 
> Fix this by using READ_ONCE() and smp_store_release().
> 
> Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
> Cc: stable@vger.kernel.org
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
> ---
> /* v1 */
> Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com
> 
> /* v2 */
> - Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
>   - fix the original double-checked locking using memory barriers
> 
> /* v3 */
> - Andrea Parri <parri.andrea@gmail.com>:
>   - document memory barriers to make checkpatch happy
> ---
>  kernel/taskstats.c | 21 ++++++++++++---------
>  1 file changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index 13a0f2e6ebc2..978d7931fb65 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
>  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
>  {
>  	struct signal_struct *sig = tsk->signal;
> -	struct taskstats *stats;
> +	struct taskstats *stats_new, *stats;
>  
> -	if (sig->stats || thread_group_empty(tsk))
> -		goto ret;
> +	/* Pairs with smp_store_release() below. */
> +	stats = READ_ONCE(sig->stats);

This pairing suggests that the READ_ONCE() is heading an address
dependency, but I fail to identify it: what is the target memory
access of such a (putative) dependency?


> +	if (stats || thread_group_empty(tsk))
> +		return stats;
>  
>  	/* No problem if kmem_cache_zalloc() fails */
> -	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> +	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
>  
>  	spin_lock_irq(&tsk->sighand->siglock);
>  	if (!sig->stats) {
> -		sig->stats = stats;
> -		stats = NULL;
> +		/* Pairs with READ_ONCE() above. */
> +		smp_store_release(&sig->stats, stats_new);

This is intended to 'order' the _zalloc()  (zero initializazion)
before the update of sig->stats, right?  what else am I missing?

Thanks,
  Andrea


> +		stats_new = NULL;
>  	}
>  	spin_unlock_irq(&tsk->sighand->siglock);
>  
> -	if (stats)
> -		kmem_cache_free(taskstats_cache, stats);
> -ret:
> +	if (stats_new)
> +		kmem_cache_free(taskstats_cache, stats_new);
> +
>  	return sig->stats;
>  }
>  
> -- 
> 2.23.0
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-07 13:18         ` Andrea Parri
@ 2019-10-07 13:28           ` Christian Brauner
  2019-10-07 13:50           ` Dmitry Vyukov
  1 sibling, 0 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-07 13:28 UTC (permalink / raw)
  To: Andrea Parri
  Cc: bsingharora, dvyukov, elver, linux-kernel,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs, stable

On Mon, Oct 07, 2019 at 03:18:04PM +0200, Andrea Parri wrote:
> On Mon, Oct 07, 2019 at 01:01:17PM +0200, Christian Brauner wrote:
> > When assiging and testing taskstats in taskstats_exit() there's a race
> > when writing and reading sig->stats when a thread-group with more than
> > one thread exits:
> > 
> > cpu0:
> > thread catches fatal signal and whole thread-group gets taken down
> >  do_exit()
> >  do_group_exit()
> >  taskstats_exit()
> >  taskstats_tgid_alloc()
> > The tasks reads sig->stats holding sighand lock seeing garbage.
> 
> You meant "without holding sighand lock" here, right?

Correct, thanks for noticing!

> 
> 
> > 
> > cpu1:
> > task calls exit_group()
> >  do_exit()
> >  do_group_exit()
> >  taskstats_exit()
> >  taskstats_tgid_alloc()
> > The task takes sighand lock and assigns new stats to sig->stats.
> > 
> > Fix this by using READ_ONCE() and smp_store_release().
> > 
> > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> > Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
> > ---
> > /* v1 */
> > Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com
> > 
> > /* v2 */
> > - Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
> >   - fix the original double-checked locking using memory barriers
> > 
> > /* v3 */
> > - Andrea Parri <parri.andrea@gmail.com>:
> >   - document memory barriers to make checkpatch happy
> > ---
> >  kernel/taskstats.c | 21 ++++++++++++---------
> >  1 file changed, 12 insertions(+), 9 deletions(-)
> > 
> > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > index 13a0f2e6ebc2..978d7931fb65 100644
> > --- a/kernel/taskstats.c
> > +++ b/kernel/taskstats.c
> > @@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> >  {
> >  	struct signal_struct *sig = tsk->signal;
> > -	struct taskstats *stats;
> > +	struct taskstats *stats_new, *stats;
> >  
> > -	if (sig->stats || thread_group_empty(tsk))
> > -		goto ret;
> > +	/* Pairs with smp_store_release() below. */
> > +	stats = READ_ONCE(sig->stats);
> 
> This pairing suggests that the READ_ONCE() is heading an address
> dependency, but I fail to identify it: what is the target memory
> access of such a (putative) dependency?
> 
> 
> > +	if (stats || thread_group_empty(tsk))
> > +		return stats;
> >  
> >  	/* No problem if kmem_cache_zalloc() fails */
> > -	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > +	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> >  
> >  	spin_lock_irq(&tsk->sighand->siglock);
> >  	if (!sig->stats) {
> > -		sig->stats = stats;
> > -		stats = NULL;
> > +		/* Pairs with READ_ONCE() above. */
> > +		smp_store_release(&sig->stats, stats_new);
> 
> This is intended to 'order' the _zalloc()  (zero initializazion)
> before the update of sig->stats, right?  what else am I missing?

Right, I should've mentioned that. I'll change the comment.
But I thought this also paired with smp_read_barrier_depends() that's
placed alongside READ_ONCE()?

Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-07 13:18         ` Andrea Parri
  2019-10-07 13:28           ` Christian Brauner
@ 2019-10-07 13:50           ` Dmitry Vyukov
  2019-10-07 13:55             ` Christian Brauner
  2019-10-07 14:14             ` Andrea Parri
  1 sibling, 2 replies; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-07 13:50 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Christian Brauner, bsingharora, Marco Elver, LKML, syzbot,
	syzkaller-bugs, stable

On Mon, Oct 7, 2019 at 3:18 PM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> On Mon, Oct 07, 2019 at 01:01:17PM +0200, Christian Brauner wrote:
> > When assiging and testing taskstats in taskstats_exit() there's a race
> > when writing and reading sig->stats when a thread-group with more than
> > one thread exits:
> >
> > cpu0:
> > thread catches fatal signal and whole thread-group gets taken down
> >  do_exit()
> >  do_group_exit()
> >  taskstats_exit()
> >  taskstats_tgid_alloc()
> > The tasks reads sig->stats holding sighand lock seeing garbage.
>
> You meant "without holding sighand lock" here, right?
>
>
> >
> > cpu1:
> > task calls exit_group()
> >  do_exit()
> >  do_group_exit()
> >  taskstats_exit()
> >  taskstats_tgid_alloc()
> > The task takes sighand lock and assigns new stats to sig->stats.
> >
> > Fix this by using READ_ONCE() and smp_store_release().
> >
> > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> > Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
> > ---
> > /* v1 */
> > Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com
> >
> > /* v2 */
> > - Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
> >   - fix the original double-checked locking using memory barriers
> >
> > /* v3 */
> > - Andrea Parri <parri.andrea@gmail.com>:
> >   - document memory barriers to make checkpatch happy
> > ---
> >  kernel/taskstats.c | 21 ++++++++++++---------
> >  1 file changed, 12 insertions(+), 9 deletions(-)
> >
> > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > index 13a0f2e6ebc2..978d7931fb65 100644
> > --- a/kernel/taskstats.c
> > +++ b/kernel/taskstats.c
> > @@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> >  {
> >       struct signal_struct *sig = tsk->signal;
> > -     struct taskstats *stats;
> > +     struct taskstats *stats_new, *stats;
> >
> > -     if (sig->stats || thread_group_empty(tsk))
> > -             goto ret;
> > +     /* Pairs with smp_store_release() below. */
> > +     stats = READ_ONCE(sig->stats);
>
> This pairing suggests that the READ_ONCE() is heading an address
> dependency, but I fail to identify it: what is the target memory
> access of such a (putative) dependency?

I would assume callers of this function access *stats. So the
dependency is between loading stats and accessing *stats.

> > +     if (stats || thread_group_empty(tsk))
> > +             return stats;
> >
> >       /* No problem if kmem_cache_zalloc() fails */
> > -     stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > +     stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> >
> >       spin_lock_irq(&tsk->sighand->siglock);
> >       if (!sig->stats) {
> > -             sig->stats = stats;
> > -             stats = NULL;
> > +             /* Pairs with READ_ONCE() above. */
> > +             smp_store_release(&sig->stats, stats_new);
>
> This is intended to 'order' the _zalloc()  (zero initializazion)
> before the update of sig->stats, right?  what else am I missing?
>
> Thanks,
>   Andrea
>
>
> > +             stats_new = NULL;
> >       }
> >       spin_unlock_irq(&tsk->sighand->siglock);
> >
> > -     if (stats)
> > -             kmem_cache_free(taskstats_cache, stats);
> > -ret:
> > +     if (stats_new)
> > +             kmem_cache_free(taskstats_cache, stats_new);
> > +
> >       return sig->stats;
> >  }
> >
> > --
> > 2.23.0
> >

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-07 13:50           ` Dmitry Vyukov
@ 2019-10-07 13:55             ` Christian Brauner
  2019-10-07 14:08               ` Dmitry Vyukov
  2019-10-07 14:14             ` Andrea Parri
  1 sibling, 1 reply; 62+ messages in thread
From: Christian Brauner @ 2019-10-07 13:55 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Andrea Parri, bsingharora, Marco Elver, LKML, syzbot,
	syzkaller-bugs, stable

On Mon, Oct 07, 2019 at 03:50:47PM +0200, Dmitry Vyukov wrote:
> On Mon, Oct 7, 2019 at 3:18 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> >
> > On Mon, Oct 07, 2019 at 01:01:17PM +0200, Christian Brauner wrote:
> > > When assiging and testing taskstats in taskstats_exit() there's a race
> > > when writing and reading sig->stats when a thread-group with more than
> > > one thread exits:
> > >
> > > cpu0:
> > > thread catches fatal signal and whole thread-group gets taken down
> > >  do_exit()
> > >  do_group_exit()
> > >  taskstats_exit()
> > >  taskstats_tgid_alloc()
> > > The tasks reads sig->stats holding sighand lock seeing garbage.
> >
> > You meant "without holding sighand lock" here, right?
> >
> >
> > >
> > > cpu1:
> > > task calls exit_group()
> > >  do_exit()
> > >  do_group_exit()
> > >  taskstats_exit()
> > >  taskstats_tgid_alloc()
> > > The task takes sighand lock and assigns new stats to sig->stats.
> > >
> > > Fix this by using READ_ONCE() and smp_store_release().
> > >
> > > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > > Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> > > Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
> > > ---
> > > /* v1 */
> > > Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com
> > >
> > > /* v2 */
> > > - Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
> > >   - fix the original double-checked locking using memory barriers
> > >
> > > /* v3 */
> > > - Andrea Parri <parri.andrea@gmail.com>:
> > >   - document memory barriers to make checkpatch happy
> > > ---
> > >  kernel/taskstats.c | 21 ++++++++++++---------
> > >  1 file changed, 12 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > > index 13a0f2e6ebc2..978d7931fb65 100644
> > > --- a/kernel/taskstats.c
> > > +++ b/kernel/taskstats.c
> > > @@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> > >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > >  {
> > >       struct signal_struct *sig = tsk->signal;
> > > -     struct taskstats *stats;
> > > +     struct taskstats *stats_new, *stats;
> > >
> > > -     if (sig->stats || thread_group_empty(tsk))
> > > -             goto ret;
> > > +     /* Pairs with smp_store_release() below. */
> > > +     stats = READ_ONCE(sig->stats);
> >
> > This pairing suggests that the READ_ONCE() is heading an address
> > dependency, but I fail to identify it: what is the target memory
> > access of such a (putative) dependency?
> 
> I would assume callers of this function access *stats. So the
> dependency is between loading stats and accessing *stats.

Right, but why READ_ONCE() and not smp_load_acquire here?

Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-07 13:55             ` Christian Brauner
@ 2019-10-07 14:08               ` Dmitry Vyukov
  2019-10-07 14:10                 ` Christian Brauner
  0 siblings, 1 reply; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-07 14:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Andrea Parri, bsingharora, Marco Elver, LKML, syzbot,
	syzkaller-bugs, stable

On Mon, Oct 7, 2019 at 3:55 PM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> On Mon, Oct 07, 2019 at 03:50:47PM +0200, Dmitry Vyukov wrote:
> > On Mon, Oct 7, 2019 at 3:18 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> > >
> > > On Mon, Oct 07, 2019 at 01:01:17PM +0200, Christian Brauner wrote:
> > > > When assiging and testing taskstats in taskstats_exit() there's a race
> > > > when writing and reading sig->stats when a thread-group with more than
> > > > one thread exits:
> > > >
> > > > cpu0:
> > > > thread catches fatal signal and whole thread-group gets taken down
> > > >  do_exit()
> > > >  do_group_exit()
> > > >  taskstats_exit()
> > > >  taskstats_tgid_alloc()
> > > > The tasks reads sig->stats holding sighand lock seeing garbage.
> > >
> > > You meant "without holding sighand lock" here, right?
> > >
> > >
> > > >
> > > > cpu1:
> > > > task calls exit_group()
> > > >  do_exit()
> > > >  do_group_exit()
> > > >  taskstats_exit()
> > > >  taskstats_tgid_alloc()
> > > > The task takes sighand lock and assigns new stats to sig->stats.
> > > >
> > > > Fix this by using READ_ONCE() and smp_store_release().
> > > >
> > > > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > > > Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > > > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> > > > Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
> > > > ---
> > > > /* v1 */
> > > > Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com
> > > >
> > > > /* v2 */
> > > > - Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
> > > >   - fix the original double-checked locking using memory barriers
> > > >
> > > > /* v3 */
> > > > - Andrea Parri <parri.andrea@gmail.com>:
> > > >   - document memory barriers to make checkpatch happy
> > > > ---
> > > >  kernel/taskstats.c | 21 ++++++++++++---------
> > > >  1 file changed, 12 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > > > index 13a0f2e6ebc2..978d7931fb65 100644
> > > > --- a/kernel/taskstats.c
> > > > +++ b/kernel/taskstats.c
> > > > @@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> > > >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > > >  {
> > > >       struct signal_struct *sig = tsk->signal;
> > > > -     struct taskstats *stats;
> > > > +     struct taskstats *stats_new, *stats;
> > > >
> > > > -     if (sig->stats || thread_group_empty(tsk))
> > > > -             goto ret;
> > > > +     /* Pairs with smp_store_release() below. */
> > > > +     stats = READ_ONCE(sig->stats);
> > >
> > > This pairing suggests that the READ_ONCE() is heading an address
> > > dependency, but I fail to identify it: what is the target memory
> > > access of such a (putative) dependency?
> >
> > I would assume callers of this function access *stats. So the
> > dependency is between loading stats and accessing *stats.
>
> Right, but why READ_ONCE() and not smp_load_acquire here?

Because if all memory accesses we need to order have data dependency
between them, READ_ONCE is enough and is cheaper on some archs (e.g.
ARM).
In our case there is a data dependency between loading of stats and
accessing *stats (only Alpha could reorder that, other arches can't
load via a pointer before loading the pointer itself (sic!)).

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-07 14:08               ` Dmitry Vyukov
@ 2019-10-07 14:10                 ` Christian Brauner
  0 siblings, 0 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-07 14:10 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Andrea Parri, bsingharora, Marco Elver, LKML, syzbot,
	syzkaller-bugs, stable

On Mon, Oct 07, 2019 at 04:08:41PM +0200, Dmitry Vyukov wrote:
> On Mon, Oct 7, 2019 at 3:55 PM Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> >
> > On Mon, Oct 07, 2019 at 03:50:47PM +0200, Dmitry Vyukov wrote:
> > > On Mon, Oct 7, 2019 at 3:18 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> > > >
> > > > On Mon, Oct 07, 2019 at 01:01:17PM +0200, Christian Brauner wrote:
> > > > > When assiging and testing taskstats in taskstats_exit() there's a race
> > > > > when writing and reading sig->stats when a thread-group with more than
> > > > > one thread exits:
> > > > >
> > > > > cpu0:
> > > > > thread catches fatal signal and whole thread-group gets taken down
> > > > >  do_exit()
> > > > >  do_group_exit()
> > > > >  taskstats_exit()
> > > > >  taskstats_tgid_alloc()
> > > > > The tasks reads sig->stats holding sighand lock seeing garbage.
> > > >
> > > > You meant "without holding sighand lock" here, right?
> > > >
> > > >
> > > > >
> > > > > cpu1:
> > > > > task calls exit_group()
> > > > >  do_exit()
> > > > >  do_group_exit()
> > > > >  taskstats_exit()
> > > > >  taskstats_tgid_alloc()
> > > > > The task takes sighand lock and assigns new stats to sig->stats.
> > > > >
> > > > > Fix this by using READ_ONCE() and smp_store_release().
> > > > >
> > > > > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > > > > Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
> > > > > Cc: stable@vger.kernel.org
> > > > > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > > > > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> > > > > Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
> > > > > ---
> > > > > /* v1 */
> > > > > Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com
> > > > >
> > > > > /* v2 */
> > > > > - Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
> > > > >   - fix the original double-checked locking using memory barriers
> > > > >
> > > > > /* v3 */
> > > > > - Andrea Parri <parri.andrea@gmail.com>:
> > > > >   - document memory barriers to make checkpatch happy
> > > > > ---
> > > > >  kernel/taskstats.c | 21 ++++++++++++---------
> > > > >  1 file changed, 12 insertions(+), 9 deletions(-)
> > > > >
> > > > > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > > > > index 13a0f2e6ebc2..978d7931fb65 100644
> > > > > --- a/kernel/taskstats.c
> > > > > +++ b/kernel/taskstats.c
> > > > > @@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> > > > >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > > > >  {
> > > > >       struct signal_struct *sig = tsk->signal;
> > > > > -     struct taskstats *stats;
> > > > > +     struct taskstats *stats_new, *stats;
> > > > >
> > > > > -     if (sig->stats || thread_group_empty(tsk))
> > > > > -             goto ret;
> > > > > +     /* Pairs with smp_store_release() below. */
> > > > > +     stats = READ_ONCE(sig->stats);
> > > >
> > > > This pairing suggests that the READ_ONCE() is heading an address
> > > > dependency, but I fail to identify it: what is the target memory
> > > > access of such a (putative) dependency?
> > >
> > > I would assume callers of this function access *stats. So the
> > > dependency is between loading stats and accessing *stats.
> >
> > Right, but why READ_ONCE() and not smp_load_acquire here?
> 
> Because if all memory accesses we need to order have data dependency
> between them, READ_ONCE is enough and is cheaper on some archs (e.g.
> ARM).
> In our case there is a data dependency between loading of stats and
> accessing *stats (only Alpha could reorder that, other arches can't
> load via a pointer before loading the pointer itself (sic!)).

Right, the except-Alpha-clause is well-known...

Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-07 13:50           ` Dmitry Vyukov
  2019-10-07 13:55             ` Christian Brauner
@ 2019-10-07 14:14             ` Andrea Parri
  2019-10-07 14:18               ` Dmitry Vyukov
  1 sibling, 1 reply; 62+ messages in thread
From: Andrea Parri @ 2019-10-07 14:14 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Christian Brauner, bsingharora, Marco Elver, LKML, syzbot,
	syzkaller-bugs, stable

> > >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > >  {
> > >       struct signal_struct *sig = tsk->signal;
> > > -     struct taskstats *stats;
> > > +     struct taskstats *stats_new, *stats;
> > >
> > > -     if (sig->stats || thread_group_empty(tsk))
> > > -             goto ret;
> > > +     /* Pairs with smp_store_release() below. */
> > > +     stats = READ_ONCE(sig->stats);
> >
> > This pairing suggests that the READ_ONCE() is heading an address
> > dependency, but I fail to identify it: what is the target memory
> > access of such a (putative) dependency?
> 
> I would assume callers of this function access *stats. So the
> dependency is between loading stats and accessing *stats.

AFAICT, the only caller of the function in 5.4-rc2 is taskstats_exit(),
which 'casts' the return value to a boolean (so I really don't see how
any address dependency could be carried over/relied upon here).

  Andrea

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-07 14:14             ` Andrea Parri
@ 2019-10-07 14:18               ` Dmitry Vyukov
  2019-10-08 14:20                 ` Andrea Parri
  0 siblings, 1 reply; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-07 14:18 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Christian Brauner, bsingharora, Marco Elver, LKML, syzbot,
	syzkaller-bugs, stable

On Mon, Oct 7, 2019 at 4:14 PM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> > > >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > > >  {
> > > >       struct signal_struct *sig = tsk->signal;
> > > > -     struct taskstats *stats;
> > > > +     struct taskstats *stats_new, *stats;
> > > >
> > > > -     if (sig->stats || thread_group_empty(tsk))
> > > > -             goto ret;
> > > > +     /* Pairs with smp_store_release() below. */
> > > > +     stats = READ_ONCE(sig->stats);
> > >
> > > This pairing suggests that the READ_ONCE() is heading an address
> > > dependency, but I fail to identify it: what is the target memory
> > > access of such a (putative) dependency?
> >
> > I would assume callers of this function access *stats. So the
> > dependency is between loading stats and accessing *stats.
>
> AFAICT, the only caller of the function in 5.4-rc2 is taskstats_exit(),
> which 'casts' the return value to a boolean (so I really don't see how
> any address dependency could be carried over/relied upon here).

This does not make sense.

But later taskstats_exit does:

memcpy(stats, tsk->signal->stats, sizeof(*stats));

Perhaps it's supposed to use stats returned by taskstats_tgid_alloc?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-07 14:18               ` Dmitry Vyukov
@ 2019-10-08 14:20                 ` Andrea Parri
  2019-10-08 14:24                   ` Christian Brauner
  0 siblings, 1 reply; 62+ messages in thread
From: Andrea Parri @ 2019-10-08 14:20 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Christian Brauner, bsingharora, Marco Elver, LKML, syzbot,
	syzkaller-bugs, stable

On Mon, Oct 07, 2019 at 04:18:26PM +0200, Dmitry Vyukov wrote:
> On Mon, Oct 7, 2019 at 4:14 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> >
> > > > >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > > > >  {
> > > > >       struct signal_struct *sig = tsk->signal;
> > > > > -     struct taskstats *stats;
> > > > > +     struct taskstats *stats_new, *stats;
> > > > >
> > > > > -     if (sig->stats || thread_group_empty(tsk))
> > > > > -             goto ret;
> > > > > +     /* Pairs with smp_store_release() below. */
> > > > > +     stats = READ_ONCE(sig->stats);
> > > >
> > > > This pairing suggests that the READ_ONCE() is heading an address
> > > > dependency, but I fail to identify it: what is the target memory
> > > > access of such a (putative) dependency?
> > >
> > > I would assume callers of this function access *stats. So the
> > > dependency is between loading stats and accessing *stats.
> >
> > AFAICT, the only caller of the function in 5.4-rc2 is taskstats_exit(),
> > which 'casts' the return value to a boolean (so I really don't see how
> > any address dependency could be carried over/relied upon here).
> 
> This does not make sense.
> 
> But later taskstats_exit does:
> 
> memcpy(stats, tsk->signal->stats, sizeof(*stats));
> 
> Perhaps it's supposed to use stats returned by taskstats_tgid_alloc?

Seems reasonable to me.  If so, replacing the READ_ONCE() in question
with an smp_load_acquire() might be the solution.  Thoughts?

  Andrea

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-08 14:20                 ` Andrea Parri
@ 2019-10-08 14:24                   ` Christian Brauner
  2019-10-08 15:26                     ` Andrea Parri
  0 siblings, 1 reply; 62+ messages in thread
From: Christian Brauner @ 2019-10-08 14:24 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Dmitry Vyukov, bsingharora, Marco Elver, LKML, syzbot,
	syzkaller-bugs, stable

On Tue, Oct 08, 2019 at 04:20:35PM +0200, Andrea Parri wrote:
> On Mon, Oct 07, 2019 at 04:18:26PM +0200, Dmitry Vyukov wrote:
> > On Mon, Oct 7, 2019 at 4:14 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> > >
> > > > > >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > > > > >  {
> > > > > >       struct signal_struct *sig = tsk->signal;
> > > > > > -     struct taskstats *stats;
> > > > > > +     struct taskstats *stats_new, *stats;
> > > > > >
> > > > > > -     if (sig->stats || thread_group_empty(tsk))
> > > > > > -             goto ret;
> > > > > > +     /* Pairs with smp_store_release() below. */
> > > > > > +     stats = READ_ONCE(sig->stats);
> > > > >
> > > > > This pairing suggests that the READ_ONCE() is heading an address
> > > > > dependency, but I fail to identify it: what is the target memory
> > > > > access of such a (putative) dependency?
> > > >
> > > > I would assume callers of this function access *stats. So the
> > > > dependency is between loading stats and accessing *stats.
> > >
> > > AFAICT, the only caller of the function in 5.4-rc2 is taskstats_exit(),
> > > which 'casts' the return value to a boolean (so I really don't see how
> > > any address dependency could be carried over/relied upon here).
> > 
> > This does not make sense.
> > 
> > But later taskstats_exit does:
> > 
> > memcpy(stats, tsk->signal->stats, sizeof(*stats));
> > 
> > Perhaps it's supposed to use stats returned by taskstats_tgid_alloc?
> 
> Seems reasonable to me.  If so, replacing the READ_ONCE() in question
> with an smp_load_acquire() might be the solution.  Thoughts?

I've done that already in my tree yesterday. I can resend for another
review if you'd prefer.

Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-08 14:24                   ` Christian Brauner
@ 2019-10-08 15:26                     ` Andrea Parri
  2019-10-08 15:35                       ` Christian Brauner
  0 siblings, 1 reply; 62+ messages in thread
From: Andrea Parri @ 2019-10-08 15:26 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Dmitry Vyukov, bsingharora, Marco Elver, LKML, syzbot,
	syzkaller-bugs, stable

On Tue, Oct 08, 2019 at 04:24:14PM +0200, Christian Brauner wrote:
> On Tue, Oct 08, 2019 at 04:20:35PM +0200, Andrea Parri wrote:
> > On Mon, Oct 07, 2019 at 04:18:26PM +0200, Dmitry Vyukov wrote:
> > > On Mon, Oct 7, 2019 at 4:14 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> > > >
> > > > > > >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > > > > > >  {
> > > > > > >       struct signal_struct *sig = tsk->signal;
> > > > > > > -     struct taskstats *stats;
> > > > > > > +     struct taskstats *stats_new, *stats;
> > > > > > >
> > > > > > > -     if (sig->stats || thread_group_empty(tsk))
> > > > > > > -             goto ret;
> > > > > > > +     /* Pairs with smp_store_release() below. */
> > > > > > > +     stats = READ_ONCE(sig->stats);
> > > > > >
> > > > > > This pairing suggests that the READ_ONCE() is heading an address
> > > > > > dependency, but I fail to identify it: what is the target memory
> > > > > > access of such a (putative) dependency?
> > > > >
> > > > > I would assume callers of this function access *stats. So the
> > > > > dependency is between loading stats and accessing *stats.
> > > >
> > > > AFAICT, the only caller of the function in 5.4-rc2 is taskstats_exit(),
> > > > which 'casts' the return value to a boolean (so I really don't see how
> > > > any address dependency could be carried over/relied upon here).
> > > 
> > > This does not make sense.
> > > 
> > > But later taskstats_exit does:
> > > 
> > > memcpy(stats, tsk->signal->stats, sizeof(*stats));
> > > 
> > > Perhaps it's supposed to use stats returned by taskstats_tgid_alloc?
> > 
> > Seems reasonable to me.  If so, replacing the READ_ONCE() in question
> > with an smp_load_acquire() might be the solution.  Thoughts?
> 
> I've done that already in my tree yesterday. I can resend for another
> review if you'd prefer.

Oh nice!  No need to resend of course.  ;D FWIW, I can check it if you
let me know the particular branch/commit (guessing that's somewhere in
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git, yes?).

Thanks,
  Andrea

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-08 15:26                     ` Andrea Parri
@ 2019-10-08 15:35                       ` Christian Brauner
  2019-10-08 15:44                         ` Andrea Parri
  0 siblings, 1 reply; 62+ messages in thread
From: Christian Brauner @ 2019-10-08 15:35 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Dmitry Vyukov, bsingharora, Marco Elver, LKML, syzbot,
	syzkaller-bugs, stable

On Tue, Oct 08, 2019 at 05:26:59PM +0200, Andrea Parri wrote:
> On Tue, Oct 08, 2019 at 04:24:14PM +0200, Christian Brauner wrote:
> > On Tue, Oct 08, 2019 at 04:20:35PM +0200, Andrea Parri wrote:
> > > On Mon, Oct 07, 2019 at 04:18:26PM +0200, Dmitry Vyukov wrote:
> > > > On Mon, Oct 7, 2019 at 4:14 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> > > > >
> > > > > > > >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > > > > > > >  {
> > > > > > > >       struct signal_struct *sig = tsk->signal;
> > > > > > > > -     struct taskstats *stats;
> > > > > > > > +     struct taskstats *stats_new, *stats;
> > > > > > > >
> > > > > > > > -     if (sig->stats || thread_group_empty(tsk))
> > > > > > > > -             goto ret;
> > > > > > > > +     /* Pairs with smp_store_release() below. */
> > > > > > > > +     stats = READ_ONCE(sig->stats);
> > > > > > >
> > > > > > > This pairing suggests that the READ_ONCE() is heading an address
> > > > > > > dependency, but I fail to identify it: what is the target memory
> > > > > > > access of such a (putative) dependency?
> > > > > >
> > > > > > I would assume callers of this function access *stats. So the
> > > > > > dependency is between loading stats and accessing *stats.
> > > > >
> > > > > AFAICT, the only caller of the function in 5.4-rc2 is taskstats_exit(),
> > > > > which 'casts' the return value to a boolean (so I really don't see how
> > > > > any address dependency could be carried over/relied upon here).
> > > > 
> > > > This does not make sense.
> > > > 
> > > > But later taskstats_exit does:
> > > > 
> > > > memcpy(stats, tsk->signal->stats, sizeof(*stats));
> > > > 
> > > > Perhaps it's supposed to use stats returned by taskstats_tgid_alloc?
> > > 
> > > Seems reasonable to me.  If so, replacing the READ_ONCE() in question
> > > with an smp_load_acquire() might be the solution.  Thoughts?
> > 
> > I've done that already in my tree yesterday. I can resend for another
> > review if you'd prefer.
> 
> Oh nice!  No need to resend of course.  ;D FWIW, I can check it if you
> let me know the particular branch/commit (guessing that's somewhere in
> git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git, yes?).

Oh ups, yeah of course :)
https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=taskstats_syzbot

Thanks!
Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] taskstats: fix data-race
  2019-10-08 15:35                       ` Christian Brauner
@ 2019-10-08 15:44                         ` Andrea Parri
  2019-10-09 11:31                           ` [PATCH] " Christian Brauner
  0 siblings, 1 reply; 62+ messages in thread
From: Andrea Parri @ 2019-10-08 15:44 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Dmitry Vyukov, bsingharora, Marco Elver, LKML, syzbot,
	syzkaller-bugs, stable

> Oh ups, yeah of course :)
> https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=taskstats_syzbot

You forgot to update the commit msg.  It looks good to me modulo that.

Thanks,
  Andrea

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH] taskstats: fix data-race
  2019-10-08 15:44                         ` Andrea Parri
@ 2019-10-09 11:31                           ` Christian Brauner
  2019-10-09 11:40                             ` Christian Brauner
                                               ` (2 more replies)
  0 siblings, 3 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-09 11:31 UTC (permalink / raw)
  To: parri.andrea
  Cc: bsingharora, christian.brauner, dvyukov, elver, linux-kernel,
	stable, syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs

When assiging and testing taskstats in taskstats_exit() there's a race
when writing and reading sig->stats when a thread-group with more than
one thread exits:

cpu0:
thread catches fatal signal and whole thread-group gets taken down
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The tasks reads sig->stats without holding sighand lock seeing garbage.

cpu1:
task calls exit_group()
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The task takes sighand lock and assigns new stats to sig->stats.

Fix this by using smp_load_acquire() and smp_store_release().

Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
---
/* v1 */
Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com

/* v2 */
Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
- Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
  - fix the original double-checked locking using memory barriers

/* v3 */
Link: https://lore.kernel.org/r/20191007110117.1096-1-christian.brauner@ubuntu.com/
- Andrea Parri <parri.andrea@gmail.com>:
  - document memory barriers to make checkpatch happy

/* v4 */
- Andrea Parri <parri.andrea@gmail.com>:
  - use smp_load_acquire(), not READ_ONCE()
  - update commit message
---
 kernel/taskstats.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 13a0f2e6ebc2..e6b45315607a 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -554,24 +554,30 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
 static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
 {
 	struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;
 
-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with smp_store_release() below. */
+	stats = smp_load_acquire(sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;
 
 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
 
 	spin_lock_irq(&tsk->sighand->siglock);
 	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+		/*
+		 * Pairs with smp_store_release() above and order the
+		 * kmem_cache_zalloc().
+		 */
+		smp_store_release(&sig->stats, stats_new);
+		stats_new = NULL;
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);
 
-	if (stats)
-		kmem_cache_free(taskstats_cache, stats);
-ret:
+	if (stats_new)
+		kmem_cache_free(taskstats_cache, stats_new);
+
 	return sig->stats;
 }
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-09 11:31                           ` [PATCH] " Christian Brauner
@ 2019-10-09 11:40                             ` Christian Brauner
  2019-10-09 11:48                             ` [PATCH v5] " Christian Brauner
  2019-10-09 11:48                             ` [PATCH] " Marco Elver
  2 siblings, 0 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-09 11:40 UTC (permalink / raw)
  To: parri.andrea
  Cc: bsingharora, dvyukov, elver, linux-kernel, stable,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs

On Wed, Oct 09, 2019 at 01:31:34PM +0200, Christian Brauner wrote:
> When assiging and testing taskstats in taskstats_exit() there's a race
> when writing and reading sig->stats when a thread-group with more than
> one thread exits:
> 
> cpu0:
> thread catches fatal signal and whole thread-group gets taken down
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The tasks reads sig->stats without holding sighand lock seeing garbage.
> 
> cpu1:
> task calls exit_group()
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The task takes sighand lock and assigns new stats to sig->stats.
> 
> Fix this by using smp_load_acquire() and smp_store_release().
> 
> Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
> Cc: stable@vger.kernel.org
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

_sigh_, let me resend since i fcked this one up.

Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v5] taskstats: fix data-race
  2019-10-09 11:31                           ` [PATCH] " Christian Brauner
  2019-10-09 11:40                             ` Christian Brauner
@ 2019-10-09 11:48                             ` Christian Brauner
  2019-10-09 12:08                               ` Andrea Parri
                                                 ` (2 more replies)
  2019-10-09 11:48                             ` [PATCH] " Marco Elver
  2 siblings, 3 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-09 11:48 UTC (permalink / raw)
  To: christian.brauner
  Cc: bsingharora, dvyukov, elver, linux-kernel, parri.andrea, stable,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs

When assiging and testing taskstats in taskstats_exit() there's a race
when writing and reading sig->stats when a thread-group with more than
one thread exits:

cpu0:
thread catches fatal signal and whole thread-group gets taken down
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The tasks reads sig->stats without holding sighand lock seeing garbage.

cpu1:
task calls exit_group()
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The task takes sighand lock and assigns new stats to sig->stats.

Fix this by using smp_load_acquire() and smp_store_release().

Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
---
/* v1 */
Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com

/* v2 */
Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
- Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
  - fix the original double-checked locking using memory barriers

/* v3 */
Link: https://lore.kernel.org/r/20191007110117.1096-1-christian.brauner@ubuntu.com
- Andrea Parri <parri.andrea@gmail.com>:
  - document memory barriers to make checkpatch happy

/* v4 */
Link: https://lore.kernel.org/r/20191009113134.5171-1-christian.brauner@ubuntu.com
- Andrea Parri <parri.andrea@gmail.com>:
  - use smp_load_acquire(), not READ_ONCE()
  - update commit message

/* v5 */
- Andrea Parri <parri.andrea@gmail.com>:
  - fix typo in smp_load_acquire()
---
 kernel/taskstats.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 13a0f2e6ebc2..6e18fdc4f7c8 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -554,24 +554,30 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
 static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
 {
 	struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;
 
-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with smp_store_release() below. */
+	stats = smp_load_acquire(&sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;
 
 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
 
 	spin_lock_irq(&tsk->sighand->siglock);
 	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+		/*
+		 * Pairs with smp_store_release() above and order the
+		 * kmem_cache_zalloc().
+		 */
+		smp_store_release(&sig->stats, stats_new);
+		stats_new = NULL;
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);
 
-	if (stats)
-		kmem_cache_free(taskstats_cache, stats);
-ret:
+	if (stats_new)
+		kmem_cache_free(taskstats_cache, stats_new);
+
 	return sig->stats;
 }
 
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-09 11:31                           ` [PATCH] " Christian Brauner
  2019-10-09 11:40                             ` Christian Brauner
  2019-10-09 11:48                             ` [PATCH v5] " Christian Brauner
@ 2019-10-09 11:48                             ` Marco Elver
  2019-10-09 11:53                               ` Christian Brauner
  2 siblings, 1 reply; 62+ messages in thread
From: Marco Elver @ 2019-10-09 11:48 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Andrea Parri, bsingharora, Dmitry Vyukov, LKML, stable,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs

On Wed, 9 Oct 2019 at 13:31, Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> When assiging and testing taskstats in taskstats_exit() there's a race
> when writing and reading sig->stats when a thread-group with more than
> one thread exits:
>
> cpu0:
> thread catches fatal signal and whole thread-group gets taken down
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The tasks reads sig->stats without holding sighand lock seeing garbage.
>
> cpu1:
> task calls exit_group()
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The task takes sighand lock and assigns new stats to sig->stats.
>
> Fix this by using smp_load_acquire() and smp_store_release().
>
> Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
> Cc: stable@vger.kernel.org
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> ---
> /* v1 */
> Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com
>
> /* v2 */
> Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
> - Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
>   - fix the original double-checked locking using memory barriers
>
> /* v3 */
> Link: https://lore.kernel.org/r/20191007110117.1096-1-christian.brauner@ubuntu.com/
> - Andrea Parri <parri.andrea@gmail.com>:
>   - document memory barriers to make checkpatch happy
>
> /* v4 */
> - Andrea Parri <parri.andrea@gmail.com>:
>   - use smp_load_acquire(), not READ_ONCE()
>   - update commit message

Acked-by: Marco Elver <elver@google.com>

Note that this now looks almost like what I suggested, except the
return at the end of the function is accessing sig->stats again. In
this case, it seems it's fine assuming sig->stats cannot be written
elsewhere. Just wanted to point it out to make sure it's considered.

Thanks!

> ---
>  kernel/taskstats.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index 13a0f2e6ebc2..e6b45315607a 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -554,24 +554,30 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
>  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
>  {
>         struct signal_struct *sig = tsk->signal;
> -       struct taskstats *stats;
> +       struct taskstats *stats_new, *stats;
>
> -       if (sig->stats || thread_group_empty(tsk))
> -               goto ret;
> +       /* Pairs with smp_store_release() below. */
> +       stats = smp_load_acquire(sig->stats);
> +       if (stats || thread_group_empty(tsk))
> +               return stats;
>
>         /* No problem if kmem_cache_zalloc() fails */
> -       stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> +       stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
>
>         spin_lock_irq(&tsk->sighand->siglock);
>         if (!sig->stats) {
> -               sig->stats = stats;
> -               stats = NULL;
> +               /*
> +                * Pairs with smp_store_release() above and order the
> +                * kmem_cache_zalloc().
> +                */
> +               smp_store_release(&sig->stats, stats_new);
> +               stats_new = NULL;
>         }
>         spin_unlock_irq(&tsk->sighand->siglock);
>
> -       if (stats)
> -               kmem_cache_free(taskstats_cache, stats);
> -ret:
> +       if (stats_new)
> +               kmem_cache_free(taskstats_cache, stats_new);
> +
>         return sig->stats;
>  }
>
> --
> 2.23.0
>

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-09 11:48                             ` [PATCH] " Marco Elver
@ 2019-10-09 11:53                               ` Christian Brauner
  0 siblings, 0 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-09 11:53 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrea Parri, bsingharora, Dmitry Vyukov, LKML, stable,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs

On Wed, Oct 09, 2019 at 01:48:27PM +0200, Marco Elver wrote:
> On Wed, 9 Oct 2019 at 13:31, Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> >
> > When assiging and testing taskstats in taskstats_exit() there's a race
> > when writing and reading sig->stats when a thread-group with more than
> > one thread exits:
> >
> > cpu0:
> > thread catches fatal signal and whole thread-group gets taken down
> >  do_exit()
> >  do_group_exit()
> >  taskstats_exit()
> >  taskstats_tgid_alloc()
> > The tasks reads sig->stats without holding sighand lock seeing garbage.
> >
> > cpu1:
> > task calls exit_group()
> >  do_exit()
> >  do_group_exit()
> >  taskstats_exit()
> >  taskstats_tgid_alloc()
> > The task takes sighand lock and assigns new stats to sig->stats.
> >
> > Fix this by using smp_load_acquire() and smp_store_release().
> >
> > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> > ---
> > /* v1 */
> > Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com
> >
> > /* v2 */
> > Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
> > - Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
> >   - fix the original double-checked locking using memory barriers
> >
> > /* v3 */
> > Link: https://lore.kernel.org/r/20191007110117.1096-1-christian.brauner@ubuntu.com/
> > - Andrea Parri <parri.andrea@gmail.com>:
> >   - document memory barriers to make checkpatch happy
> >
> > /* v4 */
> > - Andrea Parri <parri.andrea@gmail.com>:
> >   - use smp_load_acquire(), not READ_ONCE()
> >   - update commit message
> 
> Acked-by: Marco Elver <elver@google.com>
> 
> Note that this now looks almost like what I suggested, except the

Right, I think we all just needed to get our heads clear about what
exactly is happening here. This codepath is not a very prominent one. :)

> return at the end of the function is accessing sig->stats again. In
> this case, it seems it's fine assuming sig->stats cannot be written
> elsewhere. Just wanted to point it out to make sure it's considered.

Yes, I considered that but thanks for mentioning it.

Note that this patch has a bug. It should be
smp_load_acquire(&sig->stats) and not smp_load_acquire(sig->stats).
I accidently didn't automatically recompile the patchset after the last
change I made. Andrea thankfully caught this.

Thanks!
Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v5] taskstats: fix data-race
  2019-10-09 11:48                             ` [PATCH v5] " Christian Brauner
@ 2019-10-09 12:08                               ` Andrea Parri
  2019-10-09 13:26                               ` Christian Brauner
  2019-10-21 11:33                               ` [PATCH v6] " Christian Brauner
  2 siblings, 0 replies; 62+ messages in thread
From: Andrea Parri @ 2019-10-09 12:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: bsingharora, dvyukov, elver, linux-kernel, stable,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs

On Wed, Oct 09, 2019 at 01:48:09PM +0200, Christian Brauner wrote:
> When assiging and testing taskstats in taskstats_exit() there's a race
> when writing and reading sig->stats when a thread-group with more than
> one thread exits:
> 
> cpu0:
> thread catches fatal signal and whole thread-group gets taken down
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The tasks reads sig->stats without holding sighand lock seeing garbage.
> 
> cpu1:
> task calls exit_group()
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The task takes sighand lock and assigns new stats to sig->stats.
> 
> Fix this by using smp_load_acquire() and smp_store_release().
> 
> Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
> Cc: stable@vger.kernel.org
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

Reviewed-by: Andrea Parri <parri.andrea@gmail.com>

Thanks,
  Andrea


> ---
> /* v1 */
> Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com
> 
> /* v2 */
> Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
> - Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
>   - fix the original double-checked locking using memory barriers
> 
> /* v3 */
> Link: https://lore.kernel.org/r/20191007110117.1096-1-christian.brauner@ubuntu.com
> - Andrea Parri <parri.andrea@gmail.com>:
>   - document memory barriers to make checkpatch happy
> 
> /* v4 */
> Link: https://lore.kernel.org/r/20191009113134.5171-1-christian.brauner@ubuntu.com
> - Andrea Parri <parri.andrea@gmail.com>:
>   - use smp_load_acquire(), not READ_ONCE()
>   - update commit message
> 
> /* v5 */
> - Andrea Parri <parri.andrea@gmail.com>:
>   - fix typo in smp_load_acquire()
> ---
>  kernel/taskstats.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index 13a0f2e6ebc2..6e18fdc4f7c8 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -554,24 +554,30 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
>  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
>  {
>  	struct signal_struct *sig = tsk->signal;
> -	struct taskstats *stats;
> +	struct taskstats *stats_new, *stats;
>  
> -	if (sig->stats || thread_group_empty(tsk))
> -		goto ret;
> +	/* Pairs with smp_store_release() below. */
> +	stats = smp_load_acquire(&sig->stats);
> +	if (stats || thread_group_empty(tsk))
> +		return stats;
>  
>  	/* No problem if kmem_cache_zalloc() fails */
> -	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> +	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
>  
>  	spin_lock_irq(&tsk->sighand->siglock);
>  	if (!sig->stats) {
> -		sig->stats = stats;
> -		stats = NULL;
> +		/*
> +		 * Pairs with smp_store_release() above and order the
> +		 * kmem_cache_zalloc().
> +		 */
> +		smp_store_release(&sig->stats, stats_new);
> +		stats_new = NULL;
>  	}
>  	spin_unlock_irq(&tsk->sighand->siglock);
>  
> -	if (stats)
> -		kmem_cache_free(taskstats_cache, stats);
> -ret:
> +	if (stats_new)
> +		kmem_cache_free(taskstats_cache, stats_new);
> +
>  	return sig->stats;
>  }
>  
> -- 
> 2.23.0
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v5] taskstats: fix data-race
  2019-10-09 11:48                             ` [PATCH v5] " Christian Brauner
  2019-10-09 12:08                               ` Andrea Parri
@ 2019-10-09 13:26                               ` Christian Brauner
  2019-10-21 11:33                               ` [PATCH v6] " Christian Brauner
  2 siblings, 0 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-09 13:26 UTC (permalink / raw)
  To: bsingharora, dvyukov, elver, linux-kernel, parri.andrea, stable,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs

On Wed, Oct 09, 2019 at 01:48:09PM +0200, Christian Brauner wrote:
> When assiging and testing taskstats in taskstats_exit() there's a race
> when writing and reading sig->stats when a thread-group with more than
> one thread exits:
> 
> cpu0:
> thread catches fatal signal and whole thread-group gets taken down
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The tasks reads sig->stats without holding sighand lock seeing garbage.
> 
> cpu1:
> task calls exit_group()
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The task takes sighand lock and assigns new stats to sig->stats.
> 
> Fix this by using smp_load_acquire() and smp_store_release().
> 
> Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
> Cc: stable@vger.kernel.org
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

Applied to:
https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=taskstats_syzbot

Merged into:
https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=fixes

Targeting v5.4-rc3.

Thanks!
Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v6] taskstats: fix data-race
  2019-10-09 11:48                             ` [PATCH v5] " Christian Brauner
  2019-10-09 12:08                               ` Andrea Parri
  2019-10-09 13:26                               ` Christian Brauner
@ 2019-10-21 11:33                               ` Christian Brauner
  2019-10-21 12:19                                 ` Rasmus Villemoes
                                                   ` (2 more replies)
  2 siblings, 3 replies; 62+ messages in thread
From: Christian Brauner @ 2019-10-21 11:33 UTC (permalink / raw)
  To: Will Deacon, linux-kernel, christian.brauner
  Cc: bsingharora, dvyukov, elver, parri.andrea, stable,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs

When assiging and testing taskstats in taskstats_exit() there's a race
when writing and reading sig->stats when a thread-group with more than
one thread exits:

cpu0:
thread catches fatal signal and whole thread-group gets taken down
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The tasks reads sig->stats without holding sighand lock.

cpu1:
task calls exit_group()
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The task takes sighand lock and assigns new stats to sig->stats.

The first approach used smp_load_acquire() and smp_store_release().
However, after having discussed this it seems that the data dependency
for kmem_cache_alloc() would be fixed by WRITE_ONCE().
Furthermore, the smp_load_acquire() would only manage to order the stats
check before the thread_group_empty() check. So it seems just using
READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this
up for discussion at least.

Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
Cc: Will Deacon <will@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v1 */
Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.com

/* v2 */
Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
- Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
  - fix the original double-checked locking using memory barriers

/* v3 */
Link: https://lore.kernel.org/r/20191007110117.1096-1-christian.brauner@ubuntu.com
- Andrea Parri <parri.andrea@gmail.com>:
  - document memory barriers to make checkpatch happy

/* v4 */
Link: https://lore.kernel.org/r/20191009113134.5171-1-christian.brauner@ubuntu.com
- Andrea Parri <parri.andrea@gmail.com>:
  - use smp_load_acquire(), not READ_ONCE()
  - update commit message

/* v5 */
Link: https://lore.kernel.org/r/20191009114809.8643-1-christian.brauner@ubuntu.com
- Andrea Parri <parri.andrea@gmail.com>:
  - fix typo in smp_load_acquire()

/* v6 */
- Christian Brauner <christian.brauner@ubuntu.com>:
  - bring up READ_ONCE()/WRITE_ONCE() approach for discussion
---
 kernel/taskstats.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 13a0f2e6ebc2..111bb4139aa2 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -554,25 +554,29 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
 static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
 {
 	struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;
 
-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with WRITE_ONCE() below. */
+	stats = READ_ONCE(sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;
 
 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
 
 	spin_lock_irq(&tsk->sighand->siglock);
-	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+	if (!stats) {
+		stats = stats_new;
+		/* Pairs with READ_ONCE() above. */
+		WRITE_ONCE(sig->stats, stats_new);
+		stats_new = NULL;
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);
 
-	if (stats)
-		kmem_cache_free(taskstats_cache, stats);
-ret:
-	return sig->stats;
+	if (stats_new)
+		kmem_cache_free(taskstats_cache, stats_new);
+
+	return stats;
 }
 
 /* Send pid data out on exit */
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-21 11:33                               ` [PATCH v6] " Christian Brauner
@ 2019-10-21 12:19                                 ` Rasmus Villemoes
  2019-10-21 13:04                                   ` Christian Brauner
  2019-10-23 12:16                                 ` Andrea Parri
  2019-11-29 17:57                                 ` Will Deacon
  2 siblings, 1 reply; 62+ messages in thread
From: Rasmus Villemoes @ 2019-10-21 12:19 UTC (permalink / raw)
  To: Christian Brauner, Will Deacon, linux-kernel
  Cc: bsingharora, dvyukov, elver, parri.andrea, stable,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs

On 21/10/2019 13.33, Christian Brauner wrote:
> The first approach used smp_load_acquire() and smp_store_release().
> However, after having discussed this it seems that the data dependency
> for kmem_cache_alloc() would be fixed by WRITE_ONCE().
> Furthermore, the smp_load_acquire() would only manage to order the stats
> check before the thread_group_empty() check. So it seems just using
> READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this
> up for discussion at least.
> 
> /* v6 */
> - Christian Brauner <christian.brauner@ubuntu.com>:
>   - bring up READ_ONCE()/WRITE_ONCE() approach for discussion
> ---
>  kernel/taskstats.c | 26 +++++++++++++++-----------
>  1 file changed, 15 insertions(+), 11 deletions(-)
> 
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index 13a0f2e6ebc2..111bb4139aa2 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -554,25 +554,29 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
>  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
>  {
>  	struct signal_struct *sig = tsk->signal;
> -	struct taskstats *stats;
> +	struct taskstats *stats_new, *stats;
>  
> -	if (sig->stats || thread_group_empty(tsk))
> -		goto ret;
> +	/* Pairs with WRITE_ONCE() below. */
> +	stats = READ_ONCE(sig->stats);
> +	if (stats || thread_group_empty(tsk))
> +		return stats;
>  
>  	/* No problem if kmem_cache_zalloc() fails */
> -	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> +	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
>  
>  	spin_lock_irq(&tsk->sighand->siglock);
> -	if (!sig->stats) {
> -		sig->stats = stats;
> -		stats = NULL;
> +	if (!stats) {
> +		stats = stats_new;
> +		/* Pairs with READ_ONCE() above. */
> +		WRITE_ONCE(sig->stats, stats_new);
> +		stats_new = NULL;

No idea about the memory ordering issues, but don't you need to
load/check sig->stats again? Otherwise it seems that two threads might
both see !sig->stats, both allocate a stats_new, and both
unconditionally in turn assign their stats_new to sig->stats. Then the
first assignment ends up becoming a memory leak (and any writes through
that pointer done by the caller end up in /dev/null...)

Rasmus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-21 12:19                                 ` Rasmus Villemoes
@ 2019-10-21 13:04                                   ` Christian Brauner
  2019-11-29 17:56                                     ` Will Deacon
  0 siblings, 1 reply; 62+ messages in thread
From: Christian Brauner @ 2019-10-21 13:04 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Will Deacon, linux-kernel, bsingharora, dvyukov, elver,
	parri.andrea, stable, syzbot+c5d03165a1bd1dead0c1,
	syzkaller-bugs

On Mon, Oct 21, 2019 at 02:19:01PM +0200, Rasmus Villemoes wrote:
> On 21/10/2019 13.33, Christian Brauner wrote:
> > The first approach used smp_load_acquire() and smp_store_release().
> > However, after having discussed this it seems that the data dependency
> > for kmem_cache_alloc() would be fixed by WRITE_ONCE().
> > Furthermore, the smp_load_acquire() would only manage to order the stats
> > check before the thread_group_empty() check. So it seems just using
> > READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this
> > up for discussion at least.
> > 
> > /* v6 */
> > - Christian Brauner <christian.brauner@ubuntu.com>:
> >   - bring up READ_ONCE()/WRITE_ONCE() approach for discussion
> > ---
> >  kernel/taskstats.c | 26 +++++++++++++++-----------
> >  1 file changed, 15 insertions(+), 11 deletions(-)
> > 
> > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > index 13a0f2e6ebc2..111bb4139aa2 100644
> > --- a/kernel/taskstats.c
> > +++ b/kernel/taskstats.c
> > @@ -554,25 +554,29 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> >  {
> >  	struct signal_struct *sig = tsk->signal;
> > -	struct taskstats *stats;
> > +	struct taskstats *stats_new, *stats;
> >  
> > -	if (sig->stats || thread_group_empty(tsk))
> > -		goto ret;
> > +	/* Pairs with WRITE_ONCE() below. */
> > +	stats = READ_ONCE(sig->stats);
> > +	if (stats || thread_group_empty(tsk))
> > +		return stats;
> >  
> >  	/* No problem if kmem_cache_zalloc() fails */
> > -	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > +	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> >  
> >  	spin_lock_irq(&tsk->sighand->siglock);
> > -	if (!sig->stats) {
> > -		sig->stats = stats;
> > -		stats = NULL;
> > +	if (!stats) {
> > +		stats = stats_new;
> > +		/* Pairs with READ_ONCE() above. */
> > +		WRITE_ONCE(sig->stats, stats_new);
> > +		stats_new = NULL;
> 
> No idea about the memory ordering issues, but don't you need to
> load/check sig->stats again? Otherwise it seems that two threads might
> both see !sig->stats, both allocate a stats_new, and both
> unconditionally in turn assign their stats_new to sig->stats. Then the
> first assignment ends up becoming a memory leak (and any writes through
> that pointer done by the caller end up in /dev/null...)

Trigger hand too fast. I guess you're thinking sm like:

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 13a0f2e6ebc2..c4e1ed11e785 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -554,25 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
 static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
 {
 	struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;
 
-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	stats = READ_ONCE(sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;
 
-	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
 
 	spin_lock_irq(&tsk->sighand->siglock);
-	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+	stats = READ_ONCE(sig->stats);
+	if (!stats) {
+		stats = stats_new;
+		WRITE_ONCE(sig->stats, stats_new);
+		stats_new = NULL;
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);
 
-	if (stats)
-		kmem_cache_free(taskstats_cache, stats);
-ret:
-	return sig->stats;
+	if (stats_new)
+		kmem_cache_free(taskstats_cache, stats_new);
+
+	return stats;
 }

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-21 11:33                               ` [PATCH v6] " Christian Brauner
  2019-10-21 12:19                                 ` Rasmus Villemoes
@ 2019-10-23 12:16                                 ` Andrea Parri
  2019-10-23 12:39                                   ` Dmitry Vyukov
  2019-11-29 17:57                                 ` Will Deacon
  2 siblings, 1 reply; 62+ messages in thread
From: Andrea Parri @ 2019-10-23 12:16 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Will Deacon, linux-kernel, bsingharora, dvyukov, elver, stable,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs

On Mon, Oct 21, 2019 at 01:33:27PM +0200, Christian Brauner wrote:
> When assiging and testing taskstats in taskstats_exit() there's a race
> when writing and reading sig->stats when a thread-group with more than
> one thread exits:
> 
> cpu0:
> thread catches fatal signal and whole thread-group gets taken down
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The tasks reads sig->stats without holding sighand lock.
> 
> cpu1:
> task calls exit_group()
>  do_exit()
>  do_group_exit()
>  taskstats_exit()
>  taskstats_tgid_alloc()
> The task takes sighand lock and assigns new stats to sig->stats.
> 
> The first approach used smp_load_acquire() and smp_store_release().
> However, after having discussed this it seems that the data dependency
> for kmem_cache_alloc() would be fixed by WRITE_ONCE().
> Furthermore, the smp_load_acquire() would only manage to order the stats
> check before the thread_group_empty() check. So it seems just using
> READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this
> up for discussion at least.

Mmh, the RELEASE was intended to order the memory initialization in
kmem_cache_zalloc() with the later ->stats pointer assignment; AFAICT,
there is no data dependency between such memory accesses.

Correspondingly, the ACQUIRE was intended to order the ->stats pointer
load with later, _independent dereferences of the same pointer; the
latter are, e.g., in taskstats_exit() (but not thread_group_empty()).

Looking again, I see that fill_tgid_exit()'s dereferences of ->stats
are protected by ->siglock: maybe you meant to rely on such a critical
section pairing with the critical section in taskstats_tgid_alloc()?

That memcpy(-, tsk->signal->stats, -) at the end of taskstats_exit()
also bugs me: could these dereferences of ->stats happen concurrently
with other stores to the same memory locations?

Thanks,
  Andrea

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-23 12:16                                 ` Andrea Parri
@ 2019-10-23 12:39                                   ` Dmitry Vyukov
  2019-10-23 13:11                                     ` Christian Brauner
  2019-10-24 11:31                                     ` Andrea Parri
  0 siblings, 2 replies; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-23 12:39 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Christian Brauner, Will Deacon, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

On Wed, Oct 23, 2019 at 2:16 PM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> On Mon, Oct 21, 2019 at 01:33:27PM +0200, Christian Brauner wrote:
> > When assiging and testing taskstats in taskstats_exit() there's a race
> > when writing and reading sig->stats when a thread-group with more than
> > one thread exits:
> >
> > cpu0:
> > thread catches fatal signal and whole thread-group gets taken down
> >  do_exit()
> >  do_group_exit()
> >  taskstats_exit()
> >  taskstats_tgid_alloc()
> > The tasks reads sig->stats without holding sighand lock.
> >
> > cpu1:
> > task calls exit_group()
> >  do_exit()
> >  do_group_exit()
> >  taskstats_exit()
> >  taskstats_tgid_alloc()
> > The task takes sighand lock and assigns new stats to sig->stats.
> >
> > The first approach used smp_load_acquire() and smp_store_release().
> > However, after having discussed this it seems that the data dependency
> > for kmem_cache_alloc() would be fixed by WRITE_ONCE().
> > Furthermore, the smp_load_acquire() would only manage to order the stats
> > check before the thread_group_empty() check. So it seems just using
> > READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this
> > up for discussion at least.
>
> Mmh, the RELEASE was intended to order the memory initialization in
> kmem_cache_zalloc() with the later ->stats pointer assignment; AFAICT,
> there is no data dependency between such memory accesses.

I agree. This needs smp_store_release. The latest version that I
looked at contained:
smp_store_release(&sig->stats, stats_new);

> Correspondingly, the ACQUIRE was intended to order the ->stats pointer
> load with later, _independent dereferences of the same pointer; the
> latter are, e.g., in taskstats_exit() (but not thread_group_empty()).

How these later loads can be completely independent of the pointer
value? They need to obtain the pointer value from somewhere. And this
can only be done by loaded it. And if a thread loads a pointer and
then dereferences that pointer, that's a data/address dependency and
we assume this is now covered by READ_ONCE.
Or these later loads of the pointer can also race with the store? If
so, I think they also need to use READ_ONCE (rather than turn this earlier
pointer load into acquire).


> Looking again, I see that fill_tgid_exit()'s dereferences of ->stats
> are protected by ->siglock: maybe you meant to rely on such a critical
> section pairing with the critical section in taskstats_tgid_alloc()?
>
> That memcpy(-, tsk->signal->stats, -) at the end of taskstats_exit()
> also bugs me: could these dereferences of ->stats happen concurrently
> with other stores to the same memory locations?
>
> Thanks,
>   Andrea

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-23 12:39                                   ` Dmitry Vyukov
@ 2019-10-23 13:11                                     ` Christian Brauner
  2019-10-23 13:20                                       ` Dmitry Vyukov
  2019-10-24 11:31                                     ` Andrea Parri
  1 sibling, 1 reply; 62+ messages in thread
From: Christian Brauner @ 2019-10-23 13:11 UTC (permalink / raw)
  To: Dmitry Vyukov, Will Deacon
  Cc: Andrea Parri, Will Deacon, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

On Wed, Oct 23, 2019 at 02:39:55PM +0200, Dmitry Vyukov wrote:
> On Wed, Oct 23, 2019 at 2:16 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> >
> > On Mon, Oct 21, 2019 at 01:33:27PM +0200, Christian Brauner wrote:
> > > When assiging and testing taskstats in taskstats_exit() there's a race
> > > when writing and reading sig->stats when a thread-group with more than
> > > one thread exits:
> > >
> > > cpu0:
> > > thread catches fatal signal and whole thread-group gets taken down
> > >  do_exit()
> > >  do_group_exit()
> > >  taskstats_exit()
> > >  taskstats_tgid_alloc()
> > > The tasks reads sig->stats without holding sighand lock.
> > >
> > > cpu1:
> > > task calls exit_group()
> > >  do_exit()
> > >  do_group_exit()
> > >  taskstats_exit()
> > >  taskstats_tgid_alloc()
> > > The task takes sighand lock and assigns new stats to sig->stats.
> > >
> > > The first approach used smp_load_acquire() and smp_store_release().
> > > However, after having discussed this it seems that the data dependency
> > > for kmem_cache_alloc() would be fixed by WRITE_ONCE().
> > > Furthermore, the smp_load_acquire() would only manage to order the stats
> > > check before the thread_group_empty() check. So it seems just using
> > > READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this
> > > up for discussion at least.
> >
> > Mmh, the RELEASE was intended to order the memory initialization in
> > kmem_cache_zalloc() with the later ->stats pointer assignment; AFAICT,
> > there is no data dependency between such memory accesses.
> 
> I agree. This needs smp_store_release. The latest version that I
> looked at contained:
> smp_store_release(&sig->stats, stats_new);

This is what really makes me wonder. Can the compiler really re-order
the kmem_cache_zalloc() call with the assignment. If that's really the
case then shouldn't all allocation functions have compiler barriers in
them? This then seems like a very generic problem.

> 
> > Correspondingly, the ACQUIRE was intended to order the ->stats pointer
> > load with later, _independent dereferences of the same pointer; the
> > latter are, e.g., in taskstats_exit() (but not thread_group_empty()).
> 
> How these later loads can be completely independent of the pointer
> value? They need to obtain the pointer value from somewhere. And this
> can only be done by loaded it. And if a thread loads a pointer and
> then dereferences that pointer, that's a data/address dependency and
> we assume this is now covered by READ_ONCE.
> Or these later loads of the pointer can also race with the store? If

To clarify, later loads as in taskstats_exit() and thread_group_empty(),
not the later load in the double-checked locking case.

> so, I think they also need to use READ_ONCE (rather than turn this earlier
> pointer load into acquire).

Using READ_ONCE() in the alloc, taskstat_exit(), and
thread_group_empty() case.

Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-23 13:11                                     ` Christian Brauner
@ 2019-10-23 13:20                                       ` Dmitry Vyukov
  0 siblings, 0 replies; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-23 13:20 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Will Deacon, Andrea Parri, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

On Wed, Oct 23, 2019 at 3:11 PM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> On Wed, Oct 23, 2019 at 02:39:55PM +0200, Dmitry Vyukov wrote:
> > On Wed, Oct 23, 2019 at 2:16 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> > >
> > > On Mon, Oct 21, 2019 at 01:33:27PM +0200, Christian Brauner wrote:
> > > > When assiging and testing taskstats in taskstats_exit() there's a race
> > > > when writing and reading sig->stats when a thread-group with more than
> > > > one thread exits:
> > > >
> > > > cpu0:
> > > > thread catches fatal signal and whole thread-group gets taken down
> > > >  do_exit()
> > > >  do_group_exit()
> > > >  taskstats_exit()
> > > >  taskstats_tgid_alloc()
> > > > The tasks reads sig->stats without holding sighand lock.
> > > >
> > > > cpu1:
> > > > task calls exit_group()
> > > >  do_exit()
> > > >  do_group_exit()
> > > >  taskstats_exit()
> > > >  taskstats_tgid_alloc()
> > > > The task takes sighand lock and assigns new stats to sig->stats.
> > > >
> > > > The first approach used smp_load_acquire() and smp_store_release().
> > > > However, after having discussed this it seems that the data dependency
> > > > for kmem_cache_alloc() would be fixed by WRITE_ONCE().
> > > > Furthermore, the smp_load_acquire() would only manage to order the stats
> > > > check before the thread_group_empty() check. So it seems just using
> > > > READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this
> > > > up for discussion at least.
> > >
> > > Mmh, the RELEASE was intended to order the memory initialization in
> > > kmem_cache_zalloc() with the later ->stats pointer assignment; AFAICT,
> > > there is no data dependency between such memory accesses.
> >
> > I agree. This needs smp_store_release. The latest version that I
> > looked at contained:
> > smp_store_release(&sig->stats, stats_new);
>
> This is what really makes me wonder. Can the compiler really re-order
> the kmem_cache_zalloc() call with the assignment.

Yes.
Not sure about compiler, but hardware definitely can. And generally
one does not care if it's compiler or hardware.

> If that's really the
> case then shouldn't all allocation functions have compiler barriers in
> them? This then seems like a very generic problem.

No.
One puts memory barriers into synchronization primitives.
This equally affects memset's, memcpy's and in fact all normal stores.
Adding a memory barrier to every normal store is not the solution to
this. The memory barrier is done before publication of the memory. And
we already have smp_store_release for this. So if one doesn't publish
objects with a plain store (which breaks all possible rules anyways)
and uses a proper primitive, there is no problem.

> > > Correspondingly, the ACQUIRE was intended to order the ->stats pointer
> > > load with later, _independent dereferences of the same pointer; the
> > > latter are, e.g., in taskstats_exit() (but not thread_group_empty()).
> >
> > How these later loads can be completely independent of the pointer
> > value? They need to obtain the pointer value from somewhere. And this
> > can only be done by loaded it. And if a thread loads a pointer and
> > then dereferences that pointer, that's a data/address dependency and
> > we assume this is now covered by READ_ONCE.
> > Or these later loads of the pointer can also race with the store? If
>
> To clarify, later loads as in taskstats_exit() and thread_group_empty(),
> not the later load in the double-checked locking case.
>
> > so, I think they also need to use READ_ONCE (rather than turn this earlier
> > pointer load into acquire).
>
> Using READ_ONCE() in the alloc, taskstat_exit(), and
> thread_group_empty() case.
>
> Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-23 12:39                                   ` Dmitry Vyukov
  2019-10-23 13:11                                     ` Christian Brauner
@ 2019-10-24 11:31                                     ` Andrea Parri
  2019-10-24 11:51                                       ` Dmitry Vyukov
  1 sibling, 1 reply; 62+ messages in thread
From: Andrea Parri @ 2019-10-24 11:31 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Christian Brauner, Will Deacon, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

> How these later loads can be completely independent of the pointer
> value? They need to obtain the pointer value from somewhere. And this
> can only be done by loaded it. And if a thread loads a pointer and
> then dereferences that pointer, that's a data/address dependency and
> we assume this is now covered by READ_ONCE.

The "dependency" I was considering here is a dependency _between the
load of sig->stats in taskstats_tgid_alloc() and the (program-order)
later loads of *(sig->stats) in taskstats_exit().  Roughly speaking,
such a dependency should correspond to a dependency chain at the asm
or registers level from the first load to the later loads; e.g., in:

  Thread [register r0 contains the address of sig->stats]

  A: LOAD r1,[r0]	// LOAD_ACQUIRE sig->stats
     ...
  B: LOAD r2,[r0]	// LOAD *(sig->stats)
  C: LOAD r3,[r2]

there would be no such dependency from A to C.  Compare, e.g., with:

  Thread [register r0 contains the address of sig->stats]

  A: LOAD r1,[r0]	// LOAD_ACQUIRE sig->stats
     ...
  C: LOAD r3,[r1]	// LOAD *(sig->stats)

AFAICT, there's no guarantee that the compilers will generate such a
dependency from the code under discussion.


> Or these later loads of the pointer can also race with the store? If
> so, I think they also need to use READ_ONCE (rather than turn this earlier
> pointer load into acquire).

AFAICT, _if the LOAD_ACQUIRE reads from the mentioned STORE_RELEASE,
then the former must induce enough synchronization to eliminate data
races (as well as any undesired re-ordering).

TBH, I am not familiar enough with the underlying logic of this code
to say whether that "if .. reads from .." pre-condition holds by the
time those *(sig->stats) execute.

Thanks,
  Andrea

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-24 11:31                                     ` Andrea Parri
@ 2019-10-24 11:51                                       ` Dmitry Vyukov
  2019-10-24 13:05                                         ` Andrea Parri
  0 siblings, 1 reply; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-24 11:51 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Christian Brauner, Will Deacon, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

On Thu, Oct 24, 2019 at 1:32 PM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> > How these later loads can be completely independent of the pointer
> > value? They need to obtain the pointer value from somewhere. And this
> > can only be done by loaded it. And if a thread loads a pointer and
> > then dereferences that pointer, that's a data/address dependency and
> > we assume this is now covered by READ_ONCE.
>
> The "dependency" I was considering here is a dependency _between the
> load of sig->stats in taskstats_tgid_alloc() and the (program-order)
> later loads of *(sig->stats) in taskstats_exit().  Roughly speaking,
> such a dependency should correspond to a dependency chain at the asm
> or registers level from the first load to the later loads; e.g., in:
>
>   Thread [register r0 contains the address of sig->stats]
>
>   A: LOAD r1,[r0]       // LOAD_ACQUIRE sig->stats
>      ...
>   B: LOAD r2,[r0]       // LOAD *(sig->stats)
>   C: LOAD r3,[r2]
>
> there would be no such dependency from A to C.  Compare, e.g., with:
>
>   Thread [register r0 contains the address of sig->stats]
>
>   A: LOAD r1,[r0]       // LOAD_ACQUIRE sig->stats
>      ...
>   C: LOAD r3,[r1]       // LOAD *(sig->stats)
>
> AFAICT, there's no guarantee that the compilers will generate such a
> dependency from the code under discussion.

Fixing this by making A ACQUIRE looks like somewhat weird code pattern
to me (though correct). B is what loads the address used to read
indirect data, so B ought to be ACQUIRE (or LOAD-DEPENDS which we get
from READ_ONCE).

What you are suggesting is:

addr = ptr.load(memory_order_acquire);
if (addr) {
  addr = ptr.load(memory_order_relaxed);
  data = *addr;
}

whereas the canonical/non-convoluted form of this pattern is:

addr = ptr.load(memory_order_consume);
if (addr)
  data = *addr;

Moreover the second load of ptr is not even atomic in our case, so it
is a subject to another data race?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-24 11:51                                       ` Dmitry Vyukov
@ 2019-10-24 13:05                                         ` Andrea Parri
  2019-10-24 13:13                                           ` Dmitry Vyukov
  0 siblings, 1 reply; 62+ messages in thread
From: Andrea Parri @ 2019-10-24 13:05 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Christian Brauner, Will Deacon, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

On Thu, Oct 24, 2019 at 01:51:20PM +0200, Dmitry Vyukov wrote:
> On Thu, Oct 24, 2019 at 1:32 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> >
> > > How these later loads can be completely independent of the pointer
> > > value? They need to obtain the pointer value from somewhere. And this
> > > can only be done by loaded it. And if a thread loads a pointer and
> > > then dereferences that pointer, that's a data/address dependency and
> > > we assume this is now covered by READ_ONCE.
> >
> > The "dependency" I was considering here is a dependency _between the
> > load of sig->stats in taskstats_tgid_alloc() and the (program-order)
> > later loads of *(sig->stats) in taskstats_exit().  Roughly speaking,
> > such a dependency should correspond to a dependency chain at the asm
> > or registers level from the first load to the later loads; e.g., in:
> >
> >   Thread [register r0 contains the address of sig->stats]
> >
> >   A: LOAD r1,[r0]       // LOAD_ACQUIRE sig->stats
> >      ...
> >   B: LOAD r2,[r0]       // LOAD *(sig->stats)
> >   C: LOAD r3,[r2]
> >
> > there would be no such dependency from A to C.  Compare, e.g., with:
> >
> >   Thread [register r0 contains the address of sig->stats]
> >
> >   A: LOAD r1,[r0]       // LOAD_ACQUIRE sig->stats
> >      ...
> >   C: LOAD r3,[r1]       // LOAD *(sig->stats)
> >
> > AFAICT, there's no guarantee that the compilers will generate such a
> > dependency from the code under discussion.
> 
> Fixing this by making A ACQUIRE looks like somewhat weird code pattern
> to me (though correct). B is what loads the address used to read
> indirect data, so B ought to be ACQUIRE (or LOAD-DEPENDS which we get
> from READ_ONCE).
> 
> What you are suggesting is:
> 
> addr = ptr.load(memory_order_acquire);
> if (addr) {
>   addr = ptr.load(memory_order_relaxed);
>   data = *addr;
> }
> 
> whereas the canonical/non-convoluted form of this pattern is:
> 
> addr = ptr.load(memory_order_consume);
> if (addr)
>   data = *addr;

No, I'd rather be suggesting:

  addr = ptr.load(memory_order_acquire);
  if (addr)
    data = *addr;

since I'd not expect any form of encouragement to rely on "consume" or
on "READ_ONCE() + true-address-dependency" from myself.  ;-)

IAC, v6 looks more like:

  addr = ptr.load(memory_order_consume);
  if (!!addr)
    *ptr = 1;
  data = *ptr;

to me (hence my comments/questions ...).

Thanks,
  Andrea

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-24 13:05                                         ` Andrea Parri
@ 2019-10-24 13:13                                           ` Dmitry Vyukov
  2019-10-24 13:21                                             ` Christian Brauner
  2019-10-24 13:43                                             ` Andrea Parri
  0 siblings, 2 replies; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-24 13:13 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Christian Brauner, Will Deacon, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

On Thu, Oct 24, 2019 at 3:05 PM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> On Thu, Oct 24, 2019 at 01:51:20PM +0200, Dmitry Vyukov wrote:
> > On Thu, Oct 24, 2019 at 1:32 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> > >
> > > > How these later loads can be completely independent of the pointer
> > > > value? They need to obtain the pointer value from somewhere. And this
> > > > can only be done by loaded it. And if a thread loads a pointer and
> > > > then dereferences that pointer, that's a data/address dependency and
> > > > we assume this is now covered by READ_ONCE.
> > >
> > > The "dependency" I was considering here is a dependency _between the
> > > load of sig->stats in taskstats_tgid_alloc() and the (program-order)
> > > later loads of *(sig->stats) in taskstats_exit().  Roughly speaking,
> > > such a dependency should correspond to a dependency chain at the asm
> > > or registers level from the first load to the later loads; e.g., in:
> > >
> > >   Thread [register r0 contains the address of sig->stats]
> > >
> > >   A: LOAD r1,[r0]       // LOAD_ACQUIRE sig->stats
> > >      ...
> > >   B: LOAD r2,[r0]       // LOAD *(sig->stats)
> > >   C: LOAD r3,[r2]
> > >
> > > there would be no such dependency from A to C.  Compare, e.g., with:
> > >
> > >   Thread [register r0 contains the address of sig->stats]
> > >
> > >   A: LOAD r1,[r0]       // LOAD_ACQUIRE sig->stats
> > >      ...
> > >   C: LOAD r3,[r1]       // LOAD *(sig->stats)
> > >
> > > AFAICT, there's no guarantee that the compilers will generate such a
> > > dependency from the code under discussion.
> >
> > Fixing this by making A ACQUIRE looks like somewhat weird code pattern
> > to me (though correct). B is what loads the address used to read
> > indirect data, so B ought to be ACQUIRE (or LOAD-DEPENDS which we get
> > from READ_ONCE).
> >
> > What you are suggesting is:
> >
> > addr = ptr.load(memory_order_acquire);
> > if (addr) {
> >   addr = ptr.load(memory_order_relaxed);
> >   data = *addr;
> > }
> >
> > whereas the canonical/non-convoluted form of this pattern is:
> >
> > addr = ptr.load(memory_order_consume);
> > if (addr)
> >   data = *addr;
>
> No, I'd rather be suggesting:
>
>   addr = ptr.load(memory_order_acquire);
>   if (addr)
>     data = *addr;
>
> since I'd not expect any form of encouragement to rely on "consume" or
> on "READ_ONCE() + true-address-dependency" from myself.  ;-)

But why? I think kernel contains lots of such cases and it seems to be
officially documented by the LKMM:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt
address dependencies and ppo
Am I missing something? Why do we need acquire with address dependency?

> IAC, v6 looks more like:
>
>   addr = ptr.load(memory_order_consume);
>   if (!!addr)
>     *ptr = 1;
>   data = *ptr;
>
> to me (hence my comments/questions ...).
>
> Thanks,
>   Andrea

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-24 13:13                                           ` Dmitry Vyukov
@ 2019-10-24 13:21                                             ` Christian Brauner
  2019-10-24 13:34                                               ` Dmitry Vyukov
  2019-10-24 13:43                                             ` Andrea Parri
  1 sibling, 1 reply; 62+ messages in thread
From: Christian Brauner @ 2019-10-24 13:21 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Andrea Parri, Will Deacon, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

On Thu, Oct 24, 2019 at 03:13:48PM +0200, Dmitry Vyukov wrote:
> On Thu, Oct 24, 2019 at 3:05 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> >
> > On Thu, Oct 24, 2019 at 01:51:20PM +0200, Dmitry Vyukov wrote:
> > > On Thu, Oct 24, 2019 at 1:32 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> > > >
> > > > > How these later loads can be completely independent of the pointer
> > > > > value? They need to obtain the pointer value from somewhere. And this
> > > > > can only be done by loaded it. And if a thread loads a pointer and
> > > > > then dereferences that pointer, that's a data/address dependency and
> > > > > we assume this is now covered by READ_ONCE.
> > > >
> > > > The "dependency" I was considering here is a dependency _between the
> > > > load of sig->stats in taskstats_tgid_alloc() and the (program-order)
> > > > later loads of *(sig->stats) in taskstats_exit().  Roughly speaking,
> > > > such a dependency should correspond to a dependency chain at the asm
> > > > or registers level from the first load to the later loads; e.g., in:
> > > >
> > > >   Thread [register r0 contains the address of sig->stats]
> > > >
> > > >   A: LOAD r1,[r0]       // LOAD_ACQUIRE sig->stats
> > > >      ...
> > > >   B: LOAD r2,[r0]       // LOAD *(sig->stats)
> > > >   C: LOAD r3,[r2]
> > > >
> > > > there would be no such dependency from A to C.  Compare, e.g., with:
> > > >
> > > >   Thread [register r0 contains the address of sig->stats]
> > > >
> > > >   A: LOAD r1,[r0]       // LOAD_ACQUIRE sig->stats
> > > >      ...
> > > >   C: LOAD r3,[r1]       // LOAD *(sig->stats)
> > > >
> > > > AFAICT, there's no guarantee that the compilers will generate such a
> > > > dependency from the code under discussion.
> > >
> > > Fixing this by making A ACQUIRE looks like somewhat weird code pattern
> > > to me (though correct). B is what loads the address used to read
> > > indirect data, so B ought to be ACQUIRE (or LOAD-DEPENDS which we get
> > > from READ_ONCE).
> > >
> > > What you are suggesting is:
> > >
> > > addr = ptr.load(memory_order_acquire);
> > > if (addr) {
> > >   addr = ptr.load(memory_order_relaxed);
> > >   data = *addr;
> > > }
> > >
> > > whereas the canonical/non-convoluted form of this pattern is:
> > >
> > > addr = ptr.load(memory_order_consume);
> > > if (addr)
> > >   data = *addr;
> >
> > No, I'd rather be suggesting:
> >
> >   addr = ptr.load(memory_order_acquire);
> >   if (addr)
> >     data = *addr;
> >
> > since I'd not expect any form of encouragement to rely on "consume" or
> > on "READ_ONCE() + true-address-dependency" from myself.  ;-)
> 
> But why? I think kernel contains lots of such cases and it seems to be
> officially documented by the LKMM:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt
> address dependencies and ppo

You mean this section:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt#n955
and specifically:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt#n982
?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-24 13:21                                             ` Christian Brauner
@ 2019-10-24 13:34                                               ` Dmitry Vyukov
  0 siblings, 0 replies; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-24 13:34 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Andrea Parri, Will Deacon, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

On Thu, Oct 24, 2019 at 3:21 PM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> On Thu, Oct 24, 2019 at 03:13:48PM +0200, Dmitry Vyukov wrote:
> > On Thu, Oct 24, 2019 at 3:05 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> > >
> > > On Thu, Oct 24, 2019 at 01:51:20PM +0200, Dmitry Vyukov wrote:
> > > > On Thu, Oct 24, 2019 at 1:32 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> > > > >
> > > > > > How these later loads can be completely independent of the pointer
> > > > > > value? They need to obtain the pointer value from somewhere. And this
> > > > > > can only be done by loaded it. And if a thread loads a pointer and
> > > > > > then dereferences that pointer, that's a data/address dependency and
> > > > > > we assume this is now covered by READ_ONCE.
> > > > >
> > > > > The "dependency" I was considering here is a dependency _between the
> > > > > load of sig->stats in taskstats_tgid_alloc() and the (program-order)
> > > > > later loads of *(sig->stats) in taskstats_exit().  Roughly speaking,
> > > > > such a dependency should correspond to a dependency chain at the asm
> > > > > or registers level from the first load to the later loads; e.g., in:
> > > > >
> > > > >   Thread [register r0 contains the address of sig->stats]
> > > > >
> > > > >   A: LOAD r1,[r0]       // LOAD_ACQUIRE sig->stats
> > > > >      ...
> > > > >   B: LOAD r2,[r0]       // LOAD *(sig->stats)
> > > > >   C: LOAD r3,[r2]
> > > > >
> > > > > there would be no such dependency from A to C.  Compare, e.g., with:
> > > > >
> > > > >   Thread [register r0 contains the address of sig->stats]
> > > > >
> > > > >   A: LOAD r1,[r0]       // LOAD_ACQUIRE sig->stats
> > > > >      ...
> > > > >   C: LOAD r3,[r1]       // LOAD *(sig->stats)
> > > > >
> > > > > AFAICT, there's no guarantee that the compilers will generate such a
> > > > > dependency from the code under discussion.
> > > >
> > > > Fixing this by making A ACQUIRE looks like somewhat weird code pattern
> > > > to me (though correct). B is what loads the address used to read
> > > > indirect data, so B ought to be ACQUIRE (or LOAD-DEPENDS which we get
> > > > from READ_ONCE).
> > > >
> > > > What you are suggesting is:
> > > >
> > > > addr = ptr.load(memory_order_acquire);
> > > > if (addr) {
> > > >   addr = ptr.load(memory_order_relaxed);
> > > >   data = *addr;
> > > > }
> > > >
> > > > whereas the canonical/non-convoluted form of this pattern is:
> > > >
> > > > addr = ptr.load(memory_order_consume);
> > > > if (addr)
> > > >   data = *addr;
> > >
> > > No, I'd rather be suggesting:
> > >
> > >   addr = ptr.load(memory_order_acquire);
> > >   if (addr)
> > >     data = *addr;
> > >
> > > since I'd not expect any form of encouragement to rely on "consume" or
> > > on "READ_ONCE() + true-address-dependency" from myself.  ;-)
> >
> > But why? I think kernel contains lots of such cases and it seems to be
> > officially documented by the LKMM:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt
> > address dependencies and ppo
>
> You mean this section:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt#n955
> and specifically:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt#n982
> ?

Yes, and also this:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt#n450

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-24 13:13                                           ` Dmitry Vyukov
  2019-10-24 13:21                                             ` Christian Brauner
@ 2019-10-24 13:43                                             ` Andrea Parri
  2019-10-24 13:58                                               ` Dmitry Vyukov
  1 sibling, 1 reply; 62+ messages in thread
From: Andrea Parri @ 2019-10-24 13:43 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Christian Brauner, Will Deacon, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

> But why? I think kernel contains lots of such cases and it seems to be
> officially documented by the LKMM:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt
> address dependencies and ppo

Well, that same documentation also alerts about some of the pitfalls
developers can incur while relying on dependencies.  I'm sure you're
more than aware of some of the debate surrounding these issues.

  Andrea

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-24 13:43                                             ` Andrea Parri
@ 2019-10-24 13:58                                               ` Dmitry Vyukov
  2019-10-24 14:40                                                 ` Andrea Parri
  0 siblings, 1 reply; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-24 13:58 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Christian Brauner, Will Deacon, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

On Thu, Oct 24, 2019 at 3:43 PM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> > But why? I think kernel contains lots of such cases and it seems to be
> > officially documented by the LKMM:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt
> > address dependencies and ppo
>
> Well, that same documentation also alerts about some of the pitfalls
> developers can incur while relying on dependencies.  I'm sure you're
> more than aware of some of the debate surrounding these issues.

I thought that LKMM is finally supposed to stop all these
centi-threads around subtle details of ordering. And not we finally
have it. And it says that using address-dependencies is legal. And you
are one of the authors. And now you are arguing here that we better
not use it :) Can we have some black/white yes/no for code correctness
reflected in LKMM please :) If we are banning address dependencies,
don't we need to fix all of rcu uses?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-24 13:58                                               ` Dmitry Vyukov
@ 2019-10-24 14:40                                                 ` Andrea Parri
  2019-10-24 14:49                                                   ` Dmitry Vyukov
  0 siblings, 1 reply; 62+ messages in thread
From: Andrea Parri @ 2019-10-24 14:40 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Christian Brauner, Will Deacon, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

On Thu, Oct 24, 2019 at 03:58:40PM +0200, Dmitry Vyukov wrote:
> On Thu, Oct 24, 2019 at 3:43 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> >
> > > But why? I think kernel contains lots of such cases and it seems to be
> > > officially documented by the LKMM:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt
> > > address dependencies and ppo
> >
> > Well, that same documentation also alerts about some of the pitfalls
> > developers can incur while relying on dependencies.  I'm sure you're
> > more than aware of some of the debate surrounding these issues.
> 
> I thought that LKMM is finally supposed to stop all these
> centi-threads around subtle details of ordering. And not we finally
> have it. And it says that using address-dependencies is legal. And you
> are one of the authors. And now you are arguing here that we better
> not use it :) Can we have some black/white yes/no for code correctness
> reflected in LKMM please :) If we are banning address dependencies,
> don't we need to fix all of rcu uses?

Current limitations of the LKMM are listed in tools/memory-model/README
(and I myself discussed a number of them at LPC recently); the relevant
point here seems to be:

1.	Compiler optimizations are not accurately modeled.  Of course,
	the use of READ_ONCE() and WRITE_ONCE() limits the compiler's
	ability to optimize, but under some circumstances it is possible
	for the compiler to undermine the memory model.  [...]

	Note that this limitation in turn limits LKMM's ability to
	accurately model address, control, and data dependencies.

A less elegant, but hopefully more effective, way to phrase such point
is maybe "feel free to rely on dependencies, but then do not blame the
LKMM authors please".  ;-)

Thanks,
  Andrea

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-24 14:40                                                 ` Andrea Parri
@ 2019-10-24 14:49                                                   ` Dmitry Vyukov
  0 siblings, 0 replies; 62+ messages in thread
From: Dmitry Vyukov @ 2019-10-24 14:49 UTC (permalink / raw)
  To: Andrea Parri
  Cc: Christian Brauner, Will Deacon, LKML, bsingharora, Marco Elver,
	stable, syzbot, syzkaller-bugs

On Thu, Oct 24, 2019 at 4:40 PM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> On Thu, Oct 24, 2019 at 03:58:40PM +0200, Dmitry Vyukov wrote:
> > On Thu, Oct 24, 2019 at 3:43 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> > >
> > > > But why? I think kernel contains lots of such cases and it seems to be
> > > > officially documented by the LKMM:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt
> > > > address dependencies and ppo
> > >
> > > Well, that same documentation also alerts about some of the pitfalls
> > > developers can incur while relying on dependencies.  I'm sure you're
> > > more than aware of some of the debate surrounding these issues.
> >
> > I thought that LKMM is finally supposed to stop all these
> > centi-threads around subtle details of ordering. And not we finally
> > have it. And it says that using address-dependencies is legal. And you
> > are one of the authors. And now you are arguing here that we better
> > not use it :) Can we have some black/white yes/no for code correctness
> > reflected in LKMM please :) If we are banning address dependencies,
> > don't we need to fix all of rcu uses?
>
> Current limitations of the LKMM are listed in tools/memory-model/README
> (and I myself discussed a number of them at LPC recently); the relevant
> point here seems to be:
>
> 1.      Compiler optimizations are not accurately modeled.  Of course,
>         the use of READ_ONCE() and WRITE_ONCE() limits the compiler's
>         ability to optimize, but under some circumstances it is possible
>         for the compiler to undermine the memory model.  [...]
>
>         Note that this limitation in turn limits LKMM's ability to
>         accurately model address, control, and data dependencies.
>
> A less elegant, but hopefully more effective, way to phrase such point
> is maybe "feel free to rely on dependencies, but then do not blame the
> LKMM authors please".  ;-)

We are not going to blame LKMM authors :)

Acquire will introduce actual hardware barrier on arm/power/etc. Maybe
it does not matter here. But I feel if we start replacing all
load-depends/rcu with acquire, it will be noticeable overhead. So what
do we do in the context of the whole kernel?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: KCSAN: data-race in taskstats_exit / taskstats_exit
  2019-10-05  4:26 KCSAN: data-race in taskstats_exit / taskstats_exit syzbot
  2019-10-05  4:29 ` Dmitry Vyukov
  2019-10-05 11:28 ` [PATCH] taskstats: fix data-race Christian Brauner
@ 2019-11-06  0:09 ` Balbir Singh
  2019-11-06 10:23   ` Marco Elver
  2 siblings, 1 reply; 62+ messages in thread
From: Balbir Singh @ 2019-11-06  0:09 UTC (permalink / raw)
  To: syzbot, elver, linux-kernel, syzkaller-bugs

On Fri, 2019-10-04 at 21:26 -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    b4bd9343 x86, kcsan: Enable KCSAN for x86
> git tree:       https://github.com/google/ktsan.git kcsan
> console output: https://syzkaller.appspot.com/x/log.txt?x=125329db600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> dashboard link: https://syzkaller.appspot.com/bug?extid=c5d03165a1bd1dead0c1
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> 
> Unfortunately, I don't have any reproducer for this crash yet.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> 
> ==================================================================
> BUG: KCSAN: data-race in taskstats_exit / taskstats_exit
> 
> write to 0xffff8881157bbe10 of 8 bytes by task 7951 on cpu 0:
>   taskstats_tgid_alloc kernel/taskstats.c:567 [inline]
>   taskstats_exit+0x6b7/0x717 kernel/taskstats.c:596
>   do_exit+0x2c2/0x18e0 kernel/exit.c:864
>   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
>   get_signal+0x2a2/0x1320 kernel/signal.c:2734
>   do_signal+0x3b/0xc00 arch/x86/kernel/signal.c:815
>   exit_to_usermode_loop+0x250/0x2c0 arch/x86/entry/common.c:159
>   prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
>   syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
>   do_syscall_64+0x2d7/0x2f0 arch/x86/entry/common.c:299
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> read to 0xffff8881157bbe10 of 8 bytes by task 7949 on cpu 1:
>   taskstats_tgid_alloc kernel/taskstats.c:559 [inline]
>   taskstats_exit+0xb2/0x717 kernel/taskstats.c:596
>   do_exit+0x2c2/0x18e0 kernel/exit.c:864
>   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
>   __do_sys_exit_group kernel/exit.c:994 [inline]
>   __se_sys_exit_group kernel/exit.c:992 [inline]
>   __x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
>   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 

Sorry I've been away and just catching up with email

I don't think this is a bug, if I interpret the report correctly it shows a
race

static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
{
	struct signal_struct *sig = tsk->signal;
	struct taskstats *stats;

#1	if (sig->stats || thread_group_empty(tsk)) <- the check of sig->stats
		goto ret;

	/* No problem if kmem_cache_zalloc() fails */
	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

	spin_lock_irq(&tsk->sighand->siglock);
	if (!sig->stats) {
#2		sig->stats = stats; <- here in setting sig->stats
		stats = NULL;
	}
	spin_unlock_irq(&tsk->sighand->siglock);

	if (stats)
		kmem_cache_free(taskstats_cache, stats);
ret:
	return sig->stats;
}

The worst case scenario is that we might see sig->stats as being NULL when two
threads belonging to the same tgid. We do free up stats if we got that wrong

Am I misinterpreting the report?

Balbir Singh.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] taskstats: fix data-race
  2019-10-05 11:28 ` [PATCH] taskstats: fix data-race Christian Brauner
                     ` (2 preceding siblings ...)
  2019-10-06 23:52   ` Christian Brauner
@ 2019-11-06  0:27   ` Balbir Singh
  3 siblings, 0 replies; 62+ messages in thread
From: Balbir Singh @ 2019-11-06  0:27 UTC (permalink / raw)
  To: Christian Brauner, syzbot+c5d03165a1bd1dead0c1
  Cc: elver, linux-kernel, syzkaller-bugs

On Sat, 2019-10-05 at 13:28 +0200, Christian Brauner wrote:
> When assiging and testing taskstats in taskstats
> taskstats_exit() there's a race around writing and reading sig->stats.
> 
> cpu0:
> task calls exit()
> do_exit()
> 	-> taskstats_exit()
> 		-> taskstats_tgid_alloc()
> The task takes sighand lock and assigns new stats to sig->stats.
> 
> cpu1:
> task catches signal
> do_exit()
> 	-> taskstats_tgid_alloc()
> 		-> taskstats_exit()
> The tasks reads sig->stats __without__ holding sighand lock seeing
> garbage.
> 
> Fix this by taking sighand lock when reading sig->stats.
> 

Hi

Sorry I am just catching up with my inbox, why is sig->stats garbage? I'll
need to go back and look, unsigned long (pointer) reads are atomic and the
expectation is that they would be NULL initially and then non NULL for thread
groups when first allocated. May be I am just being dense in the morning -
what am I missing?

Balbir Singh.




^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: KCSAN: data-race in taskstats_exit / taskstats_exit
  2019-11-06  0:09 ` KCSAN: data-race in taskstats_exit / taskstats_exit Balbir Singh
@ 2019-11-06 10:23   ` Marco Elver
  2019-11-07 10:39     ` Balbir Singh
  2019-11-08  0:54     ` Balbir Singh
  0 siblings, 2 replies; 62+ messages in thread
From: Marco Elver @ 2019-11-06 10:23 UTC (permalink / raw)
  To: Balbir Singh; +Cc: syzbot, LKML, syzkaller-bugs

On Wed, 6 Nov 2019 at 01:10, Balbir Singh <bsingharora@gmail.com> wrote:
>
> On Fri, 2019-10-04 at 21:26 -0700, syzbot wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    b4bd9343 x86, kcsan: Enable KCSAN for x86
> > git tree:       https://github.com/google/ktsan.git kcsan
> > console output: https://syzkaller.appspot.com/x/log.txt?x=125329db600000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> > dashboard link: https://syzkaller.appspot.com/bug?extid=c5d03165a1bd1dead0c1
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> >
> > ==================================================================
> > BUG: KCSAN: data-race in taskstats_exit / taskstats_exit
> >
> > write to 0xffff8881157bbe10 of 8 bytes by task 7951 on cpu 0:
> >   taskstats_tgid_alloc kernel/taskstats.c:567 [inline]
> >   taskstats_exit+0x6b7/0x717 kernel/taskstats.c:596
> >   do_exit+0x2c2/0x18e0 kernel/exit.c:864
> >   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
> >   get_signal+0x2a2/0x1320 kernel/signal.c:2734
> >   do_signal+0x3b/0xc00 arch/x86/kernel/signal.c:815
> >   exit_to_usermode_loop+0x250/0x2c0 arch/x86/entry/common.c:159
> >   prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
> >   syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
> >   do_syscall_64+0x2d7/0x2f0 arch/x86/entry/common.c:299
> >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > read to 0xffff8881157bbe10 of 8 bytes by task 7949 on cpu 1:
> >   taskstats_tgid_alloc kernel/taskstats.c:559 [inline]
> >   taskstats_exit+0xb2/0x717 kernel/taskstats.c:596
> >   do_exit+0x2c2/0x18e0 kernel/exit.c:864
> >   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
> >   __do_sys_exit_group kernel/exit.c:994 [inline]
> >   __se_sys_exit_group kernel/exit.c:992 [inline]
> >   __x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
> >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
>
> Sorry I've been away and just catching up with email
>
> I don't think this is a bug, if I interpret the report correctly it shows a
> race
>
> static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> {
>         struct signal_struct *sig = tsk->signal;
>         struct taskstats *stats;
>
> #1      if (sig->stats || thread_group_empty(tsk)) <- the check of sig->stats
>                 goto ret;
>
>         /* No problem if kmem_cache_zalloc() fails */
>         stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
>
>         spin_lock_irq(&tsk->sighand->siglock);
>         if (!sig->stats) {
> #2              sig->stats = stats; <- here in setting sig->stats
>                 stats = NULL;
>         }
>         spin_unlock_irq(&tsk->sighand->siglock);
>
>         if (stats)
>                 kmem_cache_free(taskstats_cache, stats);
> ret:
>         return sig->stats;
> }
>
> The worst case scenario is that we might see sig->stats as being NULL when two
> threads belonging to the same tgid. We do free up stats if we got that wrong
>
> Am I misinterpreting the report?
>
> Balbir Singh.

The plain concurrent reads/writes are a data race, which may manifest
in various undefined behaviour due to compiler optimizations [1, 2].
Note that, "data race" does not necessarily imply "race condition";
some data races are race conditions (usually the more interesting
bugs) -- but not *all* data races are race conditions (sometimes
referred to as "benign races"). KCSAN reports data races according to
the LKMM.
[1] https://lwn.net/Articles/793253/
[2] https://lwn.net/Articles/799218/

If there is no race condition here that warrants heavier
synchronization (locking etc.), we still have the data race which
needs fixing by using marked atomic operations (READ_ONCE, WRITE_ONCE,
atomic_t, etc.). We also need to consider memory ordering requirements
(do we need smp_*mb(), smp_load_acquire/smp_store_release, ..)?

In the case here, the pattern is double-checked locking, which is
incorrect without atomic operations and the correct memory ordering.
There is a lengthy discussion here:
https://lore.kernel.org/lkml/20191021113327.22365-1-christian.brauner@ubuntu.com/

-- Marco

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: KCSAN: data-race in taskstats_exit / taskstats_exit
  2019-11-06 10:23   ` Marco Elver
@ 2019-11-07 10:39     ` Balbir Singh
  2019-11-08  0:54     ` Balbir Singh
  1 sibling, 0 replies; 62+ messages in thread
From: Balbir Singh @ 2019-11-07 10:39 UTC (permalink / raw)
  To: Marco Elver; +Cc: syzbot, LKML, syzkaller-bugs

On Wed, Nov 06, 2019 at 11:23:52AM +0100, Marco Elver wrote:
> On Wed, 6 Nov 2019 at 01:10, Balbir Singh <bsingharora@gmail.com> wrote:
> >
> > On Fri, 2019-10-04 at 21:26 -0700, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit:    b4bd9343 x86, kcsan: Enable KCSAN for x86
> > > git tree:       https://github.com/google/ktsan.git kcsan
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=125329db600000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=c5d03165a1bd1dead0c1
> > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > >
> > > ==================================================================
> > > BUG: KCSAN: data-race in taskstats_exit / taskstats_exit
> > >
> > > write to 0xffff8881157bbe10 of 8 bytes by task 7951 on cpu 0:
> > >   taskstats_tgid_alloc kernel/taskstats.c:567 [inline]
> > >   taskstats_exit+0x6b7/0x717 kernel/taskstats.c:596
> > >   do_exit+0x2c2/0x18e0 kernel/exit.c:864
> > >   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
> > >   get_signal+0x2a2/0x1320 kernel/signal.c:2734
> > >   do_signal+0x3b/0xc00 arch/x86/kernel/signal.c:815
> > >   exit_to_usermode_loop+0x250/0x2c0 arch/x86/entry/common.c:159
> > >   prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
> > >   syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
> > >   do_syscall_64+0x2d7/0x2f0 arch/x86/entry/common.c:299
> > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > >
> > > read to 0xffff8881157bbe10 of 8 bytes by task 7949 on cpu 1:
> > >   taskstats_tgid_alloc kernel/taskstats.c:559 [inline]
> > >   taskstats_exit+0xb2/0x717 kernel/taskstats.c:596
> > >   do_exit+0x2c2/0x18e0 kernel/exit.c:864
> > >   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
> > >   __do_sys_exit_group kernel/exit.c:994 [inline]
> > >   __se_sys_exit_group kernel/exit.c:992 [inline]
> > >   __x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
> > >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > >
> >
> > Sorry I've been away and just catching up with email
> >
> > I don't think this is a bug, if I interpret the report correctly it shows a
> > race
> >
> > static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > {
> >         struct signal_struct *sig = tsk->signal;
> >         struct taskstats *stats;
> >
> > #1      if (sig->stats || thread_group_empty(tsk)) <- the check of sig->stats
> >                 goto ret;
> >
> >         /* No problem if kmem_cache_zalloc() fails */
> >         stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> >
> >         spin_lock_irq(&tsk->sighand->siglock);
> >         if (!sig->stats) {
> > #2              sig->stats = stats; <- here in setting sig->stats
> >                 stats = NULL;
> >         }
> >         spin_unlock_irq(&tsk->sighand->siglock);
> >
> >         if (stats)
> >                 kmem_cache_free(taskstats_cache, stats);
> > ret:
> >         return sig->stats;
> > }
> >
> > The worst case scenario is that we might see sig->stats as being NULL when two
> > threads belonging to the same tgid. We do free up stats if we got that wrong
> >
> > Am I misinterpreting the report?
> >
> > Balbir Singh.
> 
> The plain concurrent reads/writes are a data race, which may manifest
> in various undefined behaviour due to compiler optimizations [1, 2].
> Note that, "data race" does not necessarily imply "race condition";
> some data races are race conditions (usually the more interesting
> bugs) -- but not *all* data races are race conditions (sometimes
> referred to as "benign races"). KCSAN reports data races according to
> the LKMM.
> [1] https://lwn.net/Articles/793253/
> [2] https://lwn.net/Articles/799218/
> 
> If there is no race condition here that warrants heavier
> synchronization (locking etc.), we still have the data race which
> needs fixing by using marked atomic operations (READ_ONCE, WRITE_ONCE,
> atomic_t, etc.). We also need to consider memory ordering requirements
> (do we need smp_*mb(), smp_load_acquire/smp_store_release, ..)?
> 
> In the case here, the pattern is double-checked locking, which is
> incorrect without atomic operations and the correct memory ordering.
> There is a lengthy discussion here:
> https://lore.kernel.org/lkml/20191021113327.22365-1-christian.brauner@ubuntu.com/
> 

Thanks for the explanation

I presume or presumed that accessing a pointer is atomic, in this case
sig->stats, I'll double check my assumption

Balbir


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: KCSAN: data-race in taskstats_exit / taskstats_exit
  2019-11-06 10:23   ` Marco Elver
  2019-11-07 10:39     ` Balbir Singh
@ 2019-11-08  0:54     ` Balbir Singh
  2019-11-08  8:55       ` Dmitry Vyukov
  1 sibling, 1 reply; 62+ messages in thread
From: Balbir Singh @ 2019-11-08  0:54 UTC (permalink / raw)
  To: Marco Elver; +Cc: syzbot, LKML, syzkaller-bugs

On Wed, 2019-11-06 at 11:23 +0100, Marco Elver wrote:
> On Wed, 6 Nov 2019 at 01:10, Balbir Singh <bsingharora@gmail.com> wrote:
> > 
> > On Fri, 2019-10-04 at 21:26 -0700, syzbot wrote:
> > > Hello,
> > > 
> > > syzbot found the following crash on:
> > > 
> > > HEAD commit:    b4bd9343 x86, kcsan: Enable KCSAN for x86
> > > git tree:       https://github.com/google/ktsan.git kcsan
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=125329db600000
> > > kernel config:  
> > > https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> > > dashboard link: 
> > > https://syzkaller.appspot.com/bug?extid=c5d03165a1bd1dead0c1
> > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > 
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > > 
> > > IMPORTANT: if you fix the bug, please add the following tag to the
> > > commit:
> > > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > > 
> > > ==================================================================
> > > BUG: KCSAN: data-race in taskstats_exit / taskstats_exit
> > > 
> > > write to 0xffff8881157bbe10 of 8 bytes by task 7951 on cpu 0:
> > >   taskstats_tgid_alloc kernel/taskstats.c:567 [inline]
> > >   taskstats_exit+0x6b7/0x717 kernel/taskstats.c:596
> > >   do_exit+0x2c2/0x18e0 kernel/exit.c:864
> > >   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
> > >   get_signal+0x2a2/0x1320 kernel/signal.c:2734
> > >   do_signal+0x3b/0xc00 arch/x86/kernel/signal.c:815
> > >   exit_to_usermode_loop+0x250/0x2c0 arch/x86/entry/common.c:159
> > >   prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
> > >   syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
> > >   do_syscall_64+0x2d7/0x2f0 arch/x86/entry/common.c:299
> > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > 
> > > read to 0xffff8881157bbe10 of 8 bytes by task 7949 on cpu 1:
> > >   taskstats_tgid_alloc kernel/taskstats.c:559 [inline]
> > >   taskstats_exit+0xb2/0x717 kernel/taskstats.c:596
> > >   do_exit+0x2c2/0x18e0 kernel/exit.c:864
> > >   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
> > >   __do_sys_exit_group kernel/exit.c:994 [inline]
> > >   __se_sys_exit_group kernel/exit.c:992 [inline]
> > >   __x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
> > >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > 
> > 
> > Sorry I've been away and just catching up with email
> > 
> > I don't think this is a bug, if I interpret the report correctly it shows
> > a
> > race
> > 
> > static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > {
> >         struct signal_struct *sig = tsk->signal;
> >         struct taskstats *stats;
> > 
> > #1      if (sig->stats || thread_group_empty(tsk)) <- the check of sig-
> > >stats
> >                 goto ret;
> > 
> >         /* No problem if kmem_cache_zalloc() fails */
> >         stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > 
> >         spin_lock_irq(&tsk->sighand->siglock);
> >         if (!sig->stats) {
> > #2              sig->stats = stats; <- here in setting sig->stats
> >                 stats = NULL;
> >         }
> >         spin_unlock_irq(&tsk->sighand->siglock);
> > 
> >         if (stats)
> >                 kmem_cache_free(taskstats_cache, stats);
> > ret:
> >         return sig->stats;
> > }
> > 
> > The worst case scenario is that we might see sig->stats as being NULL when
> > two
> > threads belonging to the same tgid. We do free up stats if we got that
> > wrong
> > 
> > Am I misinterpreting the report?
> > 
> > Balbir Singh.
> 
> The plain concurrent reads/writes are a data race, which may manifest
> in various undefined behaviour due to compiler optimizations [1, 2].
> Note that, "data race" does not necessarily imply "race condition";
> some data races are race conditions (usually the more interesting
> bugs) -- but not *all* data races are race conditions (sometimes
> referred to as "benign races"). KCSAN reports data races according to
> the LKMM.
> [1] https://lwn.net/Articles/793253/
> [2] https://lwn.net/Articles/799218/
> 
> If there is no race condition here that warrants heavier
> synchronization (locking etc.), we still have the data race which
> needs fixing by using marked atomic operations (READ_ONCE, WRITE_ONCE,
> atomic_t, etc.). We also need to consider memory ordering requirements
> (do we need smp_*mb(), smp_load_acquire/smp_store_release, ..)?
> 
> In the case here, the pattern is double-checked locking, which is
> incorrect without atomic operations and the correct memory ordering.
> There is a lengthy discussion here:
> 
https://lore.kernel.org/lkml/20191021113327.22365-1-christian.brauner@ubuntu.com/
> 
> 

I am still not convinced unless someone can prove that unsigned long reads are
non-atomic, acquire/release and barriers semantics don't matter because the
code deals with the race inside of a lock if the read was spurious, The
assumption is based on the face that sig->stats can be NULL or the kzalloc'ed
value in all cases.

Balbir Singh.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: KCSAN: data-race in taskstats_exit / taskstats_exit
  2019-11-08  0:54     ` Balbir Singh
@ 2019-11-08  8:55       ` Dmitry Vyukov
  2019-11-09  3:42         ` Balbir Singh
  0 siblings, 1 reply; 62+ messages in thread
From: Dmitry Vyukov @ 2019-11-08  8:55 UTC (permalink / raw)
  To: Balbir Singh; +Cc: Marco Elver, syzbot, LKML, syzkaller-bugs

On Fri, Nov 8, 2019 at 1:54 AM Balbir Singh <bsingharora@gmail.com> wrote:
>
> On Wed, 2019-11-06 at 11:23 +0100, Marco Elver wrote:
> > On Wed, 6 Nov 2019 at 01:10, Balbir Singh <bsingharora@gmail.com> wrote:
> > >
> > > On Fri, 2019-10-04 at 21:26 -0700, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following crash on:
> > > >
> > > > HEAD commit:    b4bd9343 x86, kcsan: Enable KCSAN for x86
> > > > git tree:       https://github.com/google/ktsan.git kcsan
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=125329db600000
> > > > kernel config:
> > > > https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> > > > dashboard link:
> > > > https://syzkaller.appspot.com/bug?extid=c5d03165a1bd1dead0c1
> > > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > >
> > > > Unfortunately, I don't have any reproducer for this crash yet.
> > > >
> > > > IMPORTANT: if you fix the bug, please add the following tag to the
> > > > commit:
> > > > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > > >
> > > > ==================================================================
> > > > BUG: KCSAN: data-race in taskstats_exit / taskstats_exit
> > > >
> > > > write to 0xffff8881157bbe10 of 8 bytes by task 7951 on cpu 0:
> > > >   taskstats_tgid_alloc kernel/taskstats.c:567 [inline]
> > > >   taskstats_exit+0x6b7/0x717 kernel/taskstats.c:596
> > > >   do_exit+0x2c2/0x18e0 kernel/exit.c:864
> > > >   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
> > > >   get_signal+0x2a2/0x1320 kernel/signal.c:2734
> > > >   do_signal+0x3b/0xc00 arch/x86/kernel/signal.c:815
> > > >   exit_to_usermode_loop+0x250/0x2c0 arch/x86/entry/common.c:159
> > > >   prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
> > > >   syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
> > > >   do_syscall_64+0x2d7/0x2f0 arch/x86/entry/common.c:299
> > > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > >
> > > > read to 0xffff8881157bbe10 of 8 bytes by task 7949 on cpu 1:
> > > >   taskstats_tgid_alloc kernel/taskstats.c:559 [inline]
> > > >   taskstats_exit+0xb2/0x717 kernel/taskstats.c:596
> > > >   do_exit+0x2c2/0x18e0 kernel/exit.c:864
> > > >   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
> > > >   __do_sys_exit_group kernel/exit.c:994 [inline]
> > > >   __se_sys_exit_group kernel/exit.c:992 [inline]
> > > >   __x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
> > > >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> > > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > >
> > >
> > > Sorry I've been away and just catching up with email
> > >
> > > I don't think this is a bug, if I interpret the report correctly it shows
> > > a
> > > race
> > >
> > > static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > > {
> > >         struct signal_struct *sig = tsk->signal;
> > >         struct taskstats *stats;
> > >
> > > #1      if (sig->stats || thread_group_empty(tsk)) <- the check of sig-
> > > >stats
> > >                 goto ret;
> > >
> > >         /* No problem if kmem_cache_zalloc() fails */
> > >         stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > >
> > >         spin_lock_irq(&tsk->sighand->siglock);
> > >         if (!sig->stats) {
> > > #2              sig->stats = stats; <- here in setting sig->stats
> > >                 stats = NULL;
> > >         }
> > >         spin_unlock_irq(&tsk->sighand->siglock);
> > >
> > >         if (stats)
> > >                 kmem_cache_free(taskstats_cache, stats);
> > > ret:
> > >         return sig->stats;
> > > }
> > >
> > > The worst case scenario is that we might see sig->stats as being NULL when
> > > two
> > > threads belonging to the same tgid. We do free up stats if we got that
> > > wrong
> > >
> > > Am I misinterpreting the report?
> > >
> > > Balbir Singh.
> >
> > The plain concurrent reads/writes are a data race, which may manifest
> > in various undefined behaviour due to compiler optimizations [1, 2].
> > Note that, "data race" does not necessarily imply "race condition";
> > some data races are race conditions (usually the more interesting
> > bugs) -- but not *all* data races are race conditions (sometimes
> > referred to as "benign races"). KCSAN reports data races according to
> > the LKMM.
> > [1] https://lwn.net/Articles/793253/
> > [2] https://lwn.net/Articles/799218/
> >
> > If there is no race condition here that warrants heavier
> > synchronization (locking etc.), we still have the data race which
> > needs fixing by using marked atomic operations (READ_ONCE, WRITE_ONCE,
> > atomic_t, etc.). We also need to consider memory ordering requirements
> > (do we need smp_*mb(), smp_load_acquire/smp_store_release, ..)?
> >
> > In the case here, the pattern is double-checked locking, which is
> > incorrect without atomic operations and the correct memory ordering.
> > There is a lengthy discussion here:
> >
> https://lore.kernel.org/lkml/20191021113327.22365-1-christian.brauner@ubuntu.com/
> >
> >
>
> I am still not convinced unless someone can prove that unsigned long reads are
> non-atomic,

None of the relevant standards guarantee this. C standard very
explicitly states the opposite - any data race renders behavior of the
program as undefined. LKMM does not give any defined behavior to data
races either. QED ;)

> acquire/release and barriers semantics don't matter because the
> code deals with the race inside of a lock if the read was spurious, The
> assumption is based on the face that sig->stats can be NULL or the kzalloc'ed
> value in all cases.
>
> Balbir Singh.
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/acd6b0d98a7ebcb4ead9b263ec5c568c5a747166.camel%40gmail.com.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: KCSAN: data-race in taskstats_exit / taskstats_exit
  2019-11-08  8:55       ` Dmitry Vyukov
@ 2019-11-09  3:42         ` Balbir Singh
  0 siblings, 0 replies; 62+ messages in thread
From: Balbir Singh @ 2019-11-09  3:42 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: Marco Elver, syzbot, LKML, syzkaller-bugs

On Fri, Nov 8, 2019 at 7:55 PM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Fri, Nov 8, 2019 at 1:54 AM Balbir Singh <bsingharora@gmail.com> wrote:
> >
> > On Wed, 2019-11-06 at 11:23 +0100, Marco Elver wrote:
> > > On Wed, 6 Nov 2019 at 01:10, Balbir Singh <bsingharora@gmail.com> wrote:
> > > >
> > > > On Fri, 2019-10-04 at 21:26 -0700, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot found the following crash on:
> > > > >
> > > > > HEAD commit:    b4bd9343 x86, kcsan: Enable KCSAN for x86
> > > > > git tree:       https://github.com/google/ktsan.git kcsan
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=125329db600000
> > > > > kernel config:
> > > > > https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> > > > > dashboard link:
> > > > > https://syzkaller.appspot.com/bug?extid=c5d03165a1bd1dead0c1
> > > > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > > >
> > > > > Unfortunately, I don't have any reproducer for this crash yet.
> > > > >
> > > > > IMPORTANT: if you fix the bug, please add the following tag to the
> > > > > commit:
> > > > > Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
> > > > >
> > > > > ==================================================================
> > > > > BUG: KCSAN: data-race in taskstats_exit / taskstats_exit
> > > > >
> > > > > write to 0xffff8881157bbe10 of 8 bytes by task 7951 on cpu 0:
> > > > >   taskstats_tgid_alloc kernel/taskstats.c:567 [inline]
> > > > >   taskstats_exit+0x6b7/0x717 kernel/taskstats.c:596
> > > > >   do_exit+0x2c2/0x18e0 kernel/exit.c:864
> > > > >   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
> > > > >   get_signal+0x2a2/0x1320 kernel/signal.c:2734
> > > > >   do_signal+0x3b/0xc00 arch/x86/kernel/signal.c:815
> > > > >   exit_to_usermode_loop+0x250/0x2c0 arch/x86/entry/common.c:159
> > > > >   prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
> > > > >   syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
> > > > >   do_syscall_64+0x2d7/0x2f0 arch/x86/entry/common.c:299
> > > > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > > >
> > > > > read to 0xffff8881157bbe10 of 8 bytes by task 7949 on cpu 1:
> > > > >   taskstats_tgid_alloc kernel/taskstats.c:559 [inline]
> > > > >   taskstats_exit+0xb2/0x717 kernel/taskstats.c:596
> > > > >   do_exit+0x2c2/0x18e0 kernel/exit.c:864
> > > > >   do_group_exit+0xb4/0x1c0 kernel/exit.c:983
> > > > >   __do_sys_exit_group kernel/exit.c:994 [inline]
> > > > >   __se_sys_exit_group kernel/exit.c:992 [inline]
> > > > >   __x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
> > > > >   do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
> > > > >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > > >
> > > >
> > > > Sorry I've been away and just catching up with email
> > > >
> > > > I don't think this is a bug, if I interpret the report correctly it shows
> > > > a
> > > > race
> > > >
> > > > static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > > > {
> > > >         struct signal_struct *sig = tsk->signal;
> > > >         struct taskstats *stats;
> > > >
> > > > #1      if (sig->stats || thread_group_empty(tsk)) <- the check of sig-
> > > > >stats
> > > >                 goto ret;
> > > >
> > > >         /* No problem if kmem_cache_zalloc() fails */
> > > >         stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > > >
> > > >         spin_lock_irq(&tsk->sighand->siglock);
> > > >         if (!sig->stats) {
> > > > #2              sig->stats = stats; <- here in setting sig->stats
> > > >                 stats = NULL;
> > > >         }
> > > >         spin_unlock_irq(&tsk->sighand->siglock);
> > > >
> > > >         if (stats)
> > > >                 kmem_cache_free(taskstats_cache, stats);
> > > > ret:
> > > >         return sig->stats;
> > > > }
> > > >
> > > > The worst case scenario is that we might see sig->stats as being NULL when
> > > > two
> > > > threads belonging to the same tgid. We do free up stats if we got that
> > > > wrong
> > > >
> > > > Am I misinterpreting the report?
> > > >
> > > > Balbir Singh.
> > >
> > > The plain concurrent reads/writes are a data race, which may manifest
> > > in various undefined behaviour due to compiler optimizations [1, 2].
> > > Note that, "data race" does not necessarily imply "race condition";
> > > some data races are race conditions (usually the more interesting
> > > bugs) -- but not *all* data races are race conditions (sometimes
> > > referred to as "benign races"). KCSAN reports data races according to
> > > the LKMM.
> > > [1] https://lwn.net/Articles/793253/
> > > [2] https://lwn.net/Articles/799218/
> > >
> > > If there is no race condition here that warrants heavier
> > > synchronization (locking etc.), we still have the data race which
> > > needs fixing by using marked atomic operations (READ_ONCE, WRITE_ONCE,
> > > atomic_t, etc.). We also need to consider memory ordering requirements
> > > (do we need smp_*mb(), smp_load_acquire/smp_store_release, ..)?
> > >
> > > In the case here, the pattern is double-checked locking, which is
> > > incorrect without atomic operations and the correct memory ordering.
> > > There is a lengthy discussion here:
> > >
> > https://lore.kernel.org/lkml/20191021113327.22365-1-christian.brauner@ubuntu.com/
> > >
> > >
> >
> > I am still not convinced unless someone can prove that unsigned long reads are
> > non-atomic,
>
> None of the relevant standards guarantee this. C standard very
> explicitly states the opposite - any data race renders behavior of the
> program as undefined. LKMM does not give any defined behavior to data
> races either. QED ;)
>

Fair enough! I am going to have a read the specification, in the
meanwhile, I agree changes are needed based on what you've just stated
above

Balbir

> > acquire/release and barriers semantics don't matter because the
> > code deals with the race inside of a lock if the read was spurious, The
> > assumption is based on the face that sig->stats can be NULL or the kzalloc'ed
> > value in all cases.
> >
> > Balbir Singh.
> >
> > --
> > You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> > To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/acd6b0d98a7ebcb4ead9b263ec5c568c5a747166.camel%40gmail.com.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-21 13:04                                   ` Christian Brauner
@ 2019-11-29 17:56                                     ` Will Deacon
  2019-11-30 15:08                                       ` Christian Brauner
  0 siblings, 1 reply; 62+ messages in thread
From: Will Deacon @ 2019-11-29 17:56 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Rasmus Villemoes, linux-kernel, bsingharora, dvyukov, elver,
	parri.andrea, stable, syzbot+c5d03165a1bd1dead0c1,
	syzkaller-bugs

On Mon, Oct 21, 2019 at 03:04:18PM +0200, Christian Brauner wrote:
> On Mon, Oct 21, 2019 at 02:19:01PM +0200, Rasmus Villemoes wrote:
> > On 21/10/2019 13.33, Christian Brauner wrote:
> > > The first approach used smp_load_acquire() and smp_store_release().
> > > However, after having discussed this it seems that the data dependency
> > > for kmem_cache_alloc() would be fixed by WRITE_ONCE().
> > > Furthermore, the smp_load_acquire() would only manage to order the stats
> > > check before the thread_group_empty() check. So it seems just using
> > > READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this
> > > up for discussion at least.
> > > 
> > > /* v6 */
> > > - Christian Brauner <christian.brauner@ubuntu.com>:
> > >   - bring up READ_ONCE()/WRITE_ONCE() approach for discussion
> > > ---
> > >  kernel/taskstats.c | 26 +++++++++++++++-----------
> > >  1 file changed, 15 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > > index 13a0f2e6ebc2..111bb4139aa2 100644
> > > --- a/kernel/taskstats.c
> > > +++ b/kernel/taskstats.c
> > > @@ -554,25 +554,29 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> > >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > >  {
> > >  	struct signal_struct *sig = tsk->signal;
> > > -	struct taskstats *stats;
> > > +	struct taskstats *stats_new, *stats;
> > >  
> > > -	if (sig->stats || thread_group_empty(tsk))
> > > -		goto ret;
> > > +	/* Pairs with WRITE_ONCE() below. */
> > > +	stats = READ_ONCE(sig->stats);
> > > +	if (stats || thread_group_empty(tsk))
> > > +		return stats;
> > >  
> > >  	/* No problem if kmem_cache_zalloc() fails */
> > > -	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > > +	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > >  
> > >  	spin_lock_irq(&tsk->sighand->siglock);
> > > -	if (!sig->stats) {
> > > -		sig->stats = stats;
> > > -		stats = NULL;
> > > +	if (!stats) {
> > > +		stats = stats_new;
> > > +		/* Pairs with READ_ONCE() above. */
> > > +		WRITE_ONCE(sig->stats, stats_new);
> > > +		stats_new = NULL;
> > 
> > No idea about the memory ordering issues, but don't you need to
> > load/check sig->stats again? Otherwise it seems that two threads might
> > both see !sig->stats, both allocate a stats_new, and both
> > unconditionally in turn assign their stats_new to sig->stats. Then the
> > first assignment ends up becoming a memory leak (and any writes through
> > that pointer done by the caller end up in /dev/null...)
> 
> Trigger hand too fast. I guess you're thinking sm like:
> 
> diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> index 13a0f2e6ebc2..c4e1ed11e785 100644
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -554,25 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
>  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
>  {
>  	struct signal_struct *sig = tsk->signal;
> -	struct taskstats *stats;
> +	struct taskstats *stats_new, *stats;
>  
> -	if (sig->stats || thread_group_empty(tsk))
> -		goto ret;
> +	stats = READ_ONCE(sig->stats);

This probably wants to be an acquire, since both the memcpy() later on
in taskstats_exit() and the accesses in {b,x}acct_add_tsk() appear to
read from the taskstats structure without the sighand->siglock held and
therefore may miss zeroed allocation from the zalloc() below, I think.

> +	if (stats || thread_group_empty(tsk))
> +		return stats;
>  
> -	/* No problem if kmem_cache_zalloc() fails */
> -	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> +	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
>  
>  	spin_lock_irq(&tsk->sighand->siglock);
> -	if (!sig->stats) {
> -		sig->stats = stats;
> -		stats = NULL;
> +	stats = READ_ONCE(sig->stats);

You hold the spinlock here, so I don't think you need the READ_ONCE().

> +	if (!stats) {
> +		stats = stats_new;
> +		WRITE_ONCE(sig->stats, stats_new);

You probably want a release here to publish the zeroes from the zalloc()
(back to my first comment). With those changes:

Reviewed-by: Will Deacon <will@kernel.org>

However, this caused me to look at do_group_exit() and we appear to have
racy accesses on sig->flags there thanks to signal_group_exit(). I worry
that might run quite deep, and can probably be looked at separately.

Will

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-10-21 11:33                               ` [PATCH v6] " Christian Brauner
  2019-10-21 12:19                                 ` Rasmus Villemoes
  2019-10-23 12:16                                 ` Andrea Parri
@ 2019-11-29 17:57                                 ` Will Deacon
  2 siblings, 0 replies; 62+ messages in thread
From: Will Deacon @ 2019-11-29 17:57 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-kernel, bsingharora, dvyukov, elver, parri.andrea, stable,
	syzbot+c5d03165a1bd1dead0c1, syzkaller-bugs

On Mon, Oct 21, 2019 at 01:33:27PM +0200, Christian Brauner wrote:
> When assiging and testing taskstats in taskstats_exit() there's a race
> when writing and reading sig->stats when a thread-group with more than
> one thread exits:
> 
> cpu0:
> thread catches fatal signal and whole thread-group gets taken down
>  do_exit()
>  do_group_exit()

Nit: I don't think this is the signal-handling path.

>  taskstats_exit()
>  taskstats_tgid_alloc()
> The tasks reads sig->stats without holding sighand lock.
> 
> cpu1:
> task calls exit_group()
>  do_exit()
>  do_group_exit()

Nit: These ^^ seem to be the wrong way round.

Will

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v6] taskstats: fix data-race
  2019-11-29 17:56                                     ` Will Deacon
@ 2019-11-30 15:08                                       ` Christian Brauner
  0 siblings, 0 replies; 62+ messages in thread
From: Christian Brauner @ 2019-11-30 15:08 UTC (permalink / raw)
  To: Will Deacon
  Cc: Rasmus Villemoes, linux-kernel, bsingharora, dvyukov, elver,
	parri.andrea, stable, syzbot+c5d03165a1bd1dead0c1,
	syzkaller-bugs

On Fri, Nov 29, 2019 at 05:56:05PM +0000, Will Deacon wrote:
> On Mon, Oct 21, 2019 at 03:04:18PM +0200, Christian Brauner wrote:
> > On Mon, Oct 21, 2019 at 02:19:01PM +0200, Rasmus Villemoes wrote:
> > > On 21/10/2019 13.33, Christian Brauner wrote:
> > > > The first approach used smp_load_acquire() and smp_store_release().
> > > > However, after having discussed this it seems that the data dependency
> > > > for kmem_cache_alloc() would be fixed by WRITE_ONCE().
> > > > Furthermore, the smp_load_acquire() would only manage to order the stats
> > > > check before the thread_group_empty() check. So it seems just using
> > > > READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this
> > > > up for discussion at least.
> > > > 
> > > > /* v6 */
> > > > - Christian Brauner <christian.brauner@ubuntu.com>:
> > > >   - bring up READ_ONCE()/WRITE_ONCE() approach for discussion
> > > > ---
> > > >  kernel/taskstats.c | 26 +++++++++++++++-----------
> > > >  1 file changed, 15 insertions(+), 11 deletions(-)
> > > > 
> > > > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > > > index 13a0f2e6ebc2..111bb4139aa2 100644
> > > > --- a/kernel/taskstats.c
> > > > +++ b/kernel/taskstats.c
> > > > @@ -554,25 +554,29 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> > > >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > > >  {
> > > >  	struct signal_struct *sig = tsk->signal;
> > > > -	struct taskstats *stats;
> > > > +	struct taskstats *stats_new, *stats;
> > > >  
> > > > -	if (sig->stats || thread_group_empty(tsk))
> > > > -		goto ret;
> > > > +	/* Pairs with WRITE_ONCE() below. */
> > > > +	stats = READ_ONCE(sig->stats);
> > > > +	if (stats || thread_group_empty(tsk))
> > > > +		return stats;
> > > >  
> > > >  	/* No problem if kmem_cache_zalloc() fails */
> > > > -	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > > > +	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > > >  
> > > >  	spin_lock_irq(&tsk->sighand->siglock);
> > > > -	if (!sig->stats) {
> > > > -		sig->stats = stats;
> > > > -		stats = NULL;
> > > > +	if (!stats) {
> > > > +		stats = stats_new;
> > > > +		/* Pairs with READ_ONCE() above. */
> > > > +		WRITE_ONCE(sig->stats, stats_new);
> > > > +		stats_new = NULL;
> > > 
> > > No idea about the memory ordering issues, but don't you need to
> > > load/check sig->stats again? Otherwise it seems that two threads might
> > > both see !sig->stats, both allocate a stats_new, and both
> > > unconditionally in turn assign their stats_new to sig->stats. Then the
> > > first assignment ends up becoming a memory leak (and any writes through
> > > that pointer done by the caller end up in /dev/null...)
> > 
> > Trigger hand too fast. I guess you're thinking sm like:
> > 
> > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > index 13a0f2e6ebc2..c4e1ed11e785 100644
> > --- a/kernel/taskstats.c
> > +++ b/kernel/taskstats.c
> > @@ -554,25 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
> >  static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> >  {
> >  	struct signal_struct *sig = tsk->signal;
> > -	struct taskstats *stats;
> > +	struct taskstats *stats_new, *stats;
> >  
> > -	if (sig->stats || thread_group_empty(tsk))
> > -		goto ret;
> > +	stats = READ_ONCE(sig->stats);
> 
> This probably wants to be an acquire, since both the memcpy() later on
> in taskstats_exit() and the accesses in {b,x}acct_add_tsk() appear to
> read from the taskstats structure without the sighand->siglock held and
> therefore may miss zeroed allocation from the zalloc() below, I think.
> 
> > +	if (stats || thread_group_empty(tsk))
> > +		return stats;
> >  
> > -	/* No problem if kmem_cache_zalloc() fails */
> > -	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> > +	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
> >  
> >  	spin_lock_irq(&tsk->sighand->siglock);
> > -	if (!sig->stats) {
> > -		sig->stats = stats;
> > -		stats = NULL;
> > +	stats = READ_ONCE(sig->stats);
> 
> You hold the spinlock here, so I don't think you need the READ_ONCE().
> 
> > +	if (!stats) {
> > +		stats = stats_new;
> > +		WRITE_ONCE(sig->stats, stats_new);
> 
> You probably want a release here to publish the zeroes from the zalloc()
> (back to my first comment). With those changes:
> 
> Reviewed-by: Will Deacon <will@kernel.org>

Thanks, this is basically what we had in v5. I'll rework and send this
after the merge window closes.

> 
> However, this caused me to look at do_group_exit() and we appear to have
> racy accesses on sig->flags there thanks to signal_group_exit(). I worry
> that might run quite deep, and can probably be looked at separately.

Yeah, we should look into this but separate from this patch.

Thanks for taking a look at this! Much appreciated!
Christian

^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2019-11-30 15:09 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-05  4:26 KCSAN: data-race in taskstats_exit / taskstats_exit syzbot
2019-10-05  4:29 ` Dmitry Vyukov
2019-10-05 11:29   ` Christian Brauner
2019-10-05 11:28 ` [PATCH] taskstats: fix data-race Christian Brauner
2019-10-05 13:33   ` Marco Elver
2019-10-05 14:15     ` Christian Brauner
2019-10-05 14:34       ` Marco Elver
2019-10-06 10:00   ` Dmitry Vyukov
2019-10-06 10:59     ` Christian Brauner
2019-10-06 23:52   ` Christian Brauner
2019-10-07  7:31     ` Dmitry Vyukov
2019-10-07  9:29       ` Christian Brauner
2019-10-07 10:40     ` Andrea Parri
2019-10-07 10:50       ` Christian Brauner
2019-10-07 11:01       ` [PATCH v2] " Christian Brauner
2019-10-07 13:18         ` Andrea Parri
2019-10-07 13:28           ` Christian Brauner
2019-10-07 13:50           ` Dmitry Vyukov
2019-10-07 13:55             ` Christian Brauner
2019-10-07 14:08               ` Dmitry Vyukov
2019-10-07 14:10                 ` Christian Brauner
2019-10-07 14:14             ` Andrea Parri
2019-10-07 14:18               ` Dmitry Vyukov
2019-10-08 14:20                 ` Andrea Parri
2019-10-08 14:24                   ` Christian Brauner
2019-10-08 15:26                     ` Andrea Parri
2019-10-08 15:35                       ` Christian Brauner
2019-10-08 15:44                         ` Andrea Parri
2019-10-09 11:31                           ` [PATCH] " Christian Brauner
2019-10-09 11:40                             ` Christian Brauner
2019-10-09 11:48                             ` [PATCH v5] " Christian Brauner
2019-10-09 12:08                               ` Andrea Parri
2019-10-09 13:26                               ` Christian Brauner
2019-10-21 11:33                               ` [PATCH v6] " Christian Brauner
2019-10-21 12:19                                 ` Rasmus Villemoes
2019-10-21 13:04                                   ` Christian Brauner
2019-11-29 17:56                                     ` Will Deacon
2019-11-30 15:08                                       ` Christian Brauner
2019-10-23 12:16                                 ` Andrea Parri
2019-10-23 12:39                                   ` Dmitry Vyukov
2019-10-23 13:11                                     ` Christian Brauner
2019-10-23 13:20                                       ` Dmitry Vyukov
2019-10-24 11:31                                     ` Andrea Parri
2019-10-24 11:51                                       ` Dmitry Vyukov
2019-10-24 13:05                                         ` Andrea Parri
2019-10-24 13:13                                           ` Dmitry Vyukov
2019-10-24 13:21                                             ` Christian Brauner
2019-10-24 13:34                                               ` Dmitry Vyukov
2019-10-24 13:43                                             ` Andrea Parri
2019-10-24 13:58                                               ` Dmitry Vyukov
2019-10-24 14:40                                                 ` Andrea Parri
2019-10-24 14:49                                                   ` Dmitry Vyukov
2019-11-29 17:57                                 ` Will Deacon
2019-10-09 11:48                             ` [PATCH] " Marco Elver
2019-10-09 11:53                               ` Christian Brauner
2019-11-06  0:27   ` Balbir Singh
2019-11-06  0:09 ` KCSAN: data-race in taskstats_exit / taskstats_exit Balbir Singh
2019-11-06 10:23   ` Marco Elver
2019-11-07 10:39     ` Balbir Singh
2019-11-08  0:54     ` Balbir Singh
2019-11-08  8:55       ` Dmitry Vyukov
2019-11-09  3:42         ` Balbir Singh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.