linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* KCSAN: data-race in __rcu_read_unlock / sync_rcu_exp_select_cpus
@ 2019-11-04 11:30 syzbot
  2019-11-04 11:32 ` Marco Elver
  0 siblings, 1 reply; 5+ messages in thread
From: syzbot @ 2019-11-04 11:30 UTC (permalink / raw)
  To: akpm, amir73il, darrick.wong, elver, hannes, jack, josef,
	linux-kernel, linux-mm, songliubraving, syzkaller-bugs, willy

Hello,

syzbot found the following crash on:

HEAD commit:    05f22368 x86, kcsan: Enable KCSAN for x86
git tree:       https://github.com/google/ktsan.git kcsan
console output: https://syzkaller.appspot.com/x/log.txt?x=1296ff50e00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=87d111955f40591f
dashboard link: https://syzkaller.appspot.com/bug?extid=99f4ddade3c22ab0cf23
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in __rcu_read_unlock / sync_rcu_exp_select_cpus

read to 0xffffffff85a7d440 of 8 bytes by task 7624 on cpu 0:
  rcu_read_unlock_special kernel/rcu/tree_plugin.h:615 [inline]
  __rcu_read_unlock+0x381/0x3c0 kernel/rcu/tree_plugin.h:383
  rcu_read_unlock include/linux/rcupdate.h:652 [inline]
  filemap_map_pages+0x5b3/0x990 mm/filemap.c:2687
  do_fault_around mm/memory.c:3450 [inline]
  do_read_fault mm/memory.c:3484 [inline]
  do_fault mm/memory.c:3618 [inline]
  handle_pte_fault mm/memory.c:3849 [inline]
  __handle_mm_fault+0x2554/0x2cb0 mm/memory.c:3973
  handle_mm_fault+0x21b/0x530 mm/memory.c:4010
  do_user_addr_fault arch/x86/mm/fault.c:1441 [inline]
  __do_page_fault+0x3fb/0x9e0 arch/x86/mm/fault.c:1506
  do_page_fault+0x54/0x233 arch/x86/mm/fault.c:1530
  page_fault+0x34/0x40 arch/x86/entry/entry_64.S:1202

write to 0xffffffff85a7d440 of 8 bytes by task 7353 on cpu 1:
  sync_exp_reset_tree kernel/rcu/tree_exp.h:137 [inline]
  sync_rcu_exp_select_cpus+0xd5/0x590 kernel/rcu/tree_exp.h:427
  rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:575 [inline]
  wait_rcu_exp_gp+0x25/0x40 kernel/rcu/tree_exp.h:589
  process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
  worker_thread+0xa0/0x800 kernel/workqueue.c:2415
  kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 7353 Comm: kworker/1:0 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Workqueue: rcu_gp wait_rcu_exp_gp
==================================================================
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 7353 Comm: kworker/1:0 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Workqueue: rcu_gp wait_rcu_exp_gp
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xf5/0x159 lib/dump_stack.c:113
  panic+0x210/0x640 kernel/panic.c:221
  kcsan_report.cold+0xc/0x10 kernel/kcsan/report.c:302
  __kcsan_setup_watchpoint+0x32e/0x4a0 kernel/kcsan/core.c:411
  __tsan_write8 kernel/kcsan/kcsan.c:36 [inline]
  __tsan_write8+0x32/0x40 kernel/kcsan/kcsan.c:36
  sync_exp_reset_tree kernel/rcu/tree_exp.h:137 [inline]
  sync_rcu_exp_select_cpus+0xd5/0x590 kernel/rcu/tree_exp.h:427
  rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:575 [inline]
  wait_rcu_exp_gp+0x25/0x40 kernel/rcu/tree_exp.h:589
  process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
  worker_thread+0xa0/0x800 kernel/workqueue.c:2415
  kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
Shutting down cpus with NMI
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: KCSAN: data-race in __rcu_read_unlock / sync_rcu_exp_select_cpus
  2019-11-04 11:30 KCSAN: data-race in __rcu_read_unlock / sync_rcu_exp_select_cpus syzbot
@ 2019-11-04 11:32 ` Marco Elver
  2019-11-04 16:26   ` Paul E. McKenney
  0 siblings, 1 reply; 5+ messages in thread
From: Marco Elver @ 2019-11-04 11:32 UTC (permalink / raw)
  To: syzbot, Lai Jiangshan, Paul E. McKenney, Josh Triplett, rcu
  Cc: Andrew Morton, amir73il, darrick.wong, Johannes Weiner, jack,
	josef, LKML, Linux Memory Management List, songliubraving,
	syzkaller-bugs, willy

+RCU folks

On Mon, 4 Nov 2019 at 12:30, syzbot
<syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    05f22368 x86, kcsan: Enable KCSAN for x86
> git tree:       https://github.com/google/ktsan.git kcsan
> console output: https://syzkaller.appspot.com/x/log.txt?x=1296ff50e00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=87d111955f40591f
> dashboard link: https://syzkaller.appspot.com/bug?extid=99f4ddade3c22ab0cf23
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com
>
> ==================================================================
> BUG: KCSAN: data-race in __rcu_read_unlock / sync_rcu_exp_select_cpus
>
> read to 0xffffffff85a7d440 of 8 bytes by task 7624 on cpu 0:
>   rcu_read_unlock_special kernel/rcu/tree_plugin.h:615 [inline]
>   __rcu_read_unlock+0x381/0x3c0 kernel/rcu/tree_plugin.h:383
>   rcu_read_unlock include/linux/rcupdate.h:652 [inline]
>   filemap_map_pages+0x5b3/0x990 mm/filemap.c:2687
>   do_fault_around mm/memory.c:3450 [inline]
>   do_read_fault mm/memory.c:3484 [inline]
>   do_fault mm/memory.c:3618 [inline]
>   handle_pte_fault mm/memory.c:3849 [inline]
>   __handle_mm_fault+0x2554/0x2cb0 mm/memory.c:3973
>   handle_mm_fault+0x21b/0x530 mm/memory.c:4010
>   do_user_addr_fault arch/x86/mm/fault.c:1441 [inline]
>   __do_page_fault+0x3fb/0x9e0 arch/x86/mm/fault.c:1506
>   do_page_fault+0x54/0x233 arch/x86/mm/fault.c:1530
>   page_fault+0x34/0x40 arch/x86/entry/entry_64.S:1202
>
> write to 0xffffffff85a7d440 of 8 bytes by task 7353 on cpu 1:
>   sync_exp_reset_tree kernel/rcu/tree_exp.h:137 [inline]
>   sync_rcu_exp_select_cpus+0xd5/0x590 kernel/rcu/tree_exp.h:427
>   rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:575 [inline]
>   wait_rcu_exp_gp+0x25/0x40 kernel/rcu/tree_exp.h:589
>   process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
>   worker_thread+0xa0/0x800 kernel/workqueue.c:2415
>   kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
>   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 PID: 7353 Comm: kworker/1:0 Not tainted 5.4.0-rc3+ #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Workqueue: rcu_gp wait_rcu_exp_gp
> ==================================================================
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 1 PID: 7353 Comm: kworker/1:0 Not tainted 5.4.0-rc3+ #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Workqueue: rcu_gp wait_rcu_exp_gp
> Call Trace:
>   __dump_stack lib/dump_stack.c:77 [inline]
>   dump_stack+0xf5/0x159 lib/dump_stack.c:113
>   panic+0x210/0x640 kernel/panic.c:221
>   kcsan_report.cold+0xc/0x10 kernel/kcsan/report.c:302
>   __kcsan_setup_watchpoint+0x32e/0x4a0 kernel/kcsan/core.c:411
>   __tsan_write8 kernel/kcsan/kcsan.c:36 [inline]
>   __tsan_write8+0x32/0x40 kernel/kcsan/kcsan.c:36
>   sync_exp_reset_tree kernel/rcu/tree_exp.h:137 [inline]
>   sync_rcu_exp_select_cpus+0xd5/0x590 kernel/rcu/tree_exp.h:427
>   rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:575 [inline]
>   wait_rcu_exp_gp+0x25/0x40 kernel/rcu/tree_exp.h:589
>   process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
>   worker_thread+0xa0/0x800 kernel/workqueue.c:2415
>   kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
>   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
> Shutting down cpus with NMI
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: KCSAN: data-race in __rcu_read_unlock / sync_rcu_exp_select_cpus
  2019-11-04 11:32 ` Marco Elver
@ 2019-11-04 16:26   ` Paul E. McKenney
  2019-11-04 17:09     ` Marco Elver
  0 siblings, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2019-11-04 16:26 UTC (permalink / raw)
  To: Marco Elver
  Cc: syzbot, Lai Jiangshan, Josh Triplett, rcu, Andrew Morton,
	amir73il, darrick.wong, Johannes Weiner, jack, josef, LKML,
	Linux Memory Management List, songliubraving, syzkaller-bugs,
	willy

On Mon, Nov 04, 2019 at 12:32:57PM +0100, Marco Elver wrote:
> +RCU folks
> 
> On Mon, 4 Nov 2019 at 12:30, syzbot
> <syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    05f22368 x86, kcsan: Enable KCSAN for x86
> > git tree:       https://github.com/google/ktsan.git kcsan
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1296ff50e00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=87d111955f40591f
> > dashboard link: https://syzkaller.appspot.com/bug?extid=99f4ddade3c22ab0cf23
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com
> >
> > ==================================================================
> > BUG: KCSAN: data-race in __rcu_read_unlock / sync_rcu_exp_select_cpus
> >
> > read to 0xffffffff85a7d440 of 8 bytes by task 7624 on cpu 0:
> >   rcu_read_unlock_special kernel/rcu/tree_plugin.h:615 [inline]
> >   __rcu_read_unlock+0x381/0x3c0 kernel/rcu/tree_plugin.h:383
> >   rcu_read_unlock include/linux/rcupdate.h:652 [inline]
> >   filemap_map_pages+0x5b3/0x990 mm/filemap.c:2687
> >   do_fault_around mm/memory.c:3450 [inline]
> >   do_read_fault mm/memory.c:3484 [inline]
> >   do_fault mm/memory.c:3618 [inline]
> >   handle_pte_fault mm/memory.c:3849 [inline]
> >   __handle_mm_fault+0x2554/0x2cb0 mm/memory.c:3973
> >   handle_mm_fault+0x21b/0x530 mm/memory.c:4010
> >   do_user_addr_fault arch/x86/mm/fault.c:1441 [inline]
> >   __do_page_fault+0x3fb/0x9e0 arch/x86/mm/fault.c:1506
> >   do_page_fault+0x54/0x233 arch/x86/mm/fault.c:1530
> >   page_fault+0x34/0x40 arch/x86/entry/entry_64.S:1202
> >
> > write to 0xffffffff85a7d440 of 8 bytes by task 7353 on cpu 1:
> >   sync_exp_reset_tree kernel/rcu/tree_exp.h:137 [inline]
> >   sync_rcu_exp_select_cpus+0xd5/0x590 kernel/rcu/tree_exp.h:427
> >   rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:575 [inline]
> >   wait_rcu_exp_gp+0x25/0x40 kernel/rcu/tree_exp.h:589
> >   process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
> >   worker_thread+0xa0/0x800 kernel/workqueue.c:2415
> >   kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
> >   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
> >
> > Reported by Kernel Concurrency Sanitizer on:
> > CPU: 1 PID: 7353 Comm: kworker/1:0 Not tainted 5.4.0-rc3+ #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Workqueue: rcu_gp wait_rcu_exp_gp
> > ==================================================================
> > Kernel panic - not syncing: panic_on_warn set ...
> > CPU: 1 PID: 7353 Comm: kworker/1:0 Not tainted 5.4.0-rc3+ #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Workqueue: rcu_gp wait_rcu_exp_gp
> > Call Trace:
> >   __dump_stack lib/dump_stack.c:77 [inline]
> >   dump_stack+0xf5/0x159 lib/dump_stack.c:113
> >   panic+0x210/0x640 kernel/panic.c:221
> >   kcsan_report.cold+0xc/0x10 kernel/kcsan/report.c:302
> >   __kcsan_setup_watchpoint+0x32e/0x4a0 kernel/kcsan/core.c:411
> >   __tsan_write8 kernel/kcsan/kcsan.c:36 [inline]
> >   __tsan_write8+0x32/0x40 kernel/kcsan/kcsan.c:36
> >   sync_exp_reset_tree kernel/rcu/tree_exp.h:137 [inline]
> >   sync_rcu_exp_select_cpus+0xd5/0x590 kernel/rcu/tree_exp.h:427
> >   rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:575 [inline]
> >   wait_rcu_exp_gp+0x25/0x40 kernel/rcu/tree_exp.h:589
> >   process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
> >   worker_thread+0xa0/0x800 kernel/workqueue.c:2415
> >   kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
> >   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
> > Shutting down cpus with NMI
> > Kernel Offset: disabled
> > Rebooting in 86400 seconds..
> >
> >
> > ---
> > This bug is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkaller@googlegroups.com.
> >
> > syzbot will keep track of this bug report. See:
> > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Again, good catch!  Does the patch shown below fix this?

							Thanx, Paul

------------------------------------------------------------------------

commit c3c182aa501c13faccf5f2c168004a5c98a89d74
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Mon Nov 4 08:22:45 2019 -0800

    rcu: Use READ_ONCE() for ->expmask in rcu_read_unlock_special()
    
    The rcu_node structure's ->expmask field is updated only when holding the
    ->lock, but is also accessed locklessly.  This means that all ->expmask
    updates must use WRITE_ONCE() and all reads carried out without holding
    ->lock must use READ_ONCE().  This commit therefore changes the lockless
    ->expmask read in rcu_read_unlock_special() to use READ_ONCE().
    
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 6d54fd4..9a1b4b2 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -601,7 +601,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
 		struct rcu_node *rnp = rdp->mynode;
 
 		exp = (t->rcu_blocked_node && t->rcu_blocked_node->exp_tasks) ||
-		      (rdp->grpmask & rnp->expmask) ||
+		      (rdp->grpmask & READ_ONCE(rnp->expmask)) ||
 		      tick_nohz_full_cpu(rdp->cpu);
 		// Need to defer quiescent state until everything is enabled.
 		if (irqs_were_disabled && use_softirq &&

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: KCSAN: data-race in __rcu_read_unlock / sync_rcu_exp_select_cpus
  2019-11-04 16:26   ` Paul E. McKenney
@ 2019-11-04 17:09     ` Marco Elver
  2019-11-04 17:19       ` Paul E. McKenney
  0 siblings, 1 reply; 5+ messages in thread
From: Marco Elver @ 2019-11-04 17:09 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: syzbot, Lai Jiangshan, Josh Triplett, rcu, Andrew Morton,
	amir73il, darrick.wong, Johannes Weiner, jack, Josef Bacik, LKML,
	Linux Memory Management List, songliubraving, syzkaller-bugs,
	willy

On Mon, 4 Nov 2019 at 17:26, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Mon, Nov 04, 2019 at 12:32:57PM +0100, Marco Elver wrote:
> > +RCU folks
> >
> > On Mon, 4 Nov 2019 at 12:30, syzbot
> > <syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit:    05f22368 x86, kcsan: Enable KCSAN for x86
> > > git tree:       https://github.com/google/ktsan.git kcsan
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1296ff50e00000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=87d111955f40591f
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=99f4ddade3c22ab0cf23
> > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com
> > >
> > > ==================================================================
> > > BUG: KCSAN: data-race in __rcu_read_unlock / sync_rcu_exp_select_cpus
> > >
> > > read to 0xffffffff85a7d440 of 8 bytes by task 7624 on cpu 0:
> > >   rcu_read_unlock_special kernel/rcu/tree_plugin.h:615 [inline]
> > >   __rcu_read_unlock+0x381/0x3c0 kernel/rcu/tree_plugin.h:383
> > >   rcu_read_unlock include/linux/rcupdate.h:652 [inline]
> > >   filemap_map_pages+0x5b3/0x990 mm/filemap.c:2687
> > >   do_fault_around mm/memory.c:3450 [inline]
> > >   do_read_fault mm/memory.c:3484 [inline]
> > >   do_fault mm/memory.c:3618 [inline]
> > >   handle_pte_fault mm/memory.c:3849 [inline]
> > >   __handle_mm_fault+0x2554/0x2cb0 mm/memory.c:3973
> > >   handle_mm_fault+0x21b/0x530 mm/memory.c:4010
> > >   do_user_addr_fault arch/x86/mm/fault.c:1441 [inline]
> > >   __do_page_fault+0x3fb/0x9e0 arch/x86/mm/fault.c:1506
> > >   do_page_fault+0x54/0x233 arch/x86/mm/fault.c:1530
> > >   page_fault+0x34/0x40 arch/x86/entry/entry_64.S:1202
> > >
> > > write to 0xffffffff85a7d440 of 8 bytes by task 7353 on cpu 1:
> > >   sync_exp_reset_tree kernel/rcu/tree_exp.h:137 [inline]
> > >   sync_rcu_exp_select_cpus+0xd5/0x590 kernel/rcu/tree_exp.h:427
> > >   rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:575 [inline]
> > >   wait_rcu_exp_gp+0x25/0x40 kernel/rcu/tree_exp.h:589
> > >   process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
> > >   worker_thread+0xa0/0x800 kernel/workqueue.c:2415
> > >   kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
> > >   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
> > >
> > > Reported by Kernel Concurrency Sanitizer on:
> > > CPU: 1 PID: 7353 Comm: kworker/1:0 Not tainted 5.4.0-rc3+ #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > Google 01/01/2011
> > > Workqueue: rcu_gp wait_rcu_exp_gp
> > > ==================================================================
> > > Kernel panic - not syncing: panic_on_warn set ...
> > > CPU: 1 PID: 7353 Comm: kworker/1:0 Not tainted 5.4.0-rc3+ #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > Google 01/01/2011
> > > Workqueue: rcu_gp wait_rcu_exp_gp
> > > Call Trace:
> > >   __dump_stack lib/dump_stack.c:77 [inline]
> > >   dump_stack+0xf5/0x159 lib/dump_stack.c:113
> > >   panic+0x210/0x640 kernel/panic.c:221
> > >   kcsan_report.cold+0xc/0x10 kernel/kcsan/report.c:302
> > >   __kcsan_setup_watchpoint+0x32e/0x4a0 kernel/kcsan/core.c:411
> > >   __tsan_write8 kernel/kcsan/kcsan.c:36 [inline]
> > >   __tsan_write8+0x32/0x40 kernel/kcsan/kcsan.c:36
> > >   sync_exp_reset_tree kernel/rcu/tree_exp.h:137 [inline]
> > >   sync_rcu_exp_select_cpus+0xd5/0x590 kernel/rcu/tree_exp.h:427
> > >   rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:575 [inline]
> > >   wait_rcu_exp_gp+0x25/0x40 kernel/rcu/tree_exp.h:589
> > >   process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
> > >   worker_thread+0xa0/0x800 kernel/workqueue.c:2415
> > >   kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
> > >   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
> > > Shutting down cpus with NMI
> > > Kernel Offset: disabled
> > > Rebooting in 86400 seconds..
> > >
> > >
> > > ---
> > > This bug is generated by a bot. It may contain errors.
> > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > syzbot engineers can be reached at syzkaller@googlegroups.com.
> > >
> > > syzbot will keep track of this bug report. See:
> > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> Again, good catch!  Does the patch shown below fix this?

Assuming writers will follow in another patch:

Acked-by: Marco Elver <elver@google.com>

Thanks,
-- Marco


>                                                         Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit c3c182aa501c13faccf5f2c168004a5c98a89d74
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Mon Nov 4 08:22:45 2019 -0800
>
>     rcu: Use READ_ONCE() for ->expmask in rcu_read_unlock_special()
>
>     The rcu_node structure's ->expmask field is updated only when holding the
>     ->lock, but is also accessed locklessly.  This means that all ->expmask
>     updates must use WRITE_ONCE() and all reads carried out without holding
>     ->lock must use READ_ONCE().  This commit therefore changes the lockless
>     ->expmask read in rcu_read_unlock_special() to use READ_ONCE().
>
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 6d54fd4..9a1b4b2 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -601,7 +601,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
>                 struct rcu_node *rnp = rdp->mynode;
>
>                 exp = (t->rcu_blocked_node && t->rcu_blocked_node->exp_tasks) ||
> -                     (rdp->grpmask & rnp->expmask) ||
> +                     (rdp->grpmask & READ_ONCE(rnp->expmask)) ||
>                       tick_nohz_full_cpu(rdp->cpu);
>                 // Need to defer quiescent state until everything is enabled.
>                 if (irqs_were_disabled && use_softirq &&

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: KCSAN: data-race in __rcu_read_unlock / sync_rcu_exp_select_cpus
  2019-11-04 17:09     ` Marco Elver
@ 2019-11-04 17:19       ` Paul E. McKenney
  0 siblings, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2019-11-04 17:19 UTC (permalink / raw)
  To: Marco Elver
  Cc: syzbot, Lai Jiangshan, Josh Triplett, rcu, Andrew Morton,
	amir73il, darrick.wong, Johannes Weiner, jack, Josef Bacik, LKML,
	Linux Memory Management List, songliubraving, syzkaller-bugs,
	willy

On Mon, Nov 04, 2019 at 06:09:07PM +0100, Marco Elver wrote:
> On Mon, 4 Nov 2019 at 17:26, Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Mon, Nov 04, 2019 at 12:32:57PM +0100, Marco Elver wrote:
> > > +RCU folks
> > >
> > > On Mon, 4 Nov 2019 at 12:30, syzbot
> > > <syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com> wrote:
> > > >
> > > > Hello,
> > > >
> > > > syzbot found the following crash on:
> > > >
> > > > HEAD commit:    05f22368 x86, kcsan: Enable KCSAN for x86
> > > > git tree:       https://github.com/google/ktsan.git kcsan
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1296ff50e00000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=87d111955f40591f
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=99f4ddade3c22ab0cf23
> > > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > >
> > > > Unfortunately, I don't have any reproducer for this crash yet.
> > > >
> > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > Reported-by: syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com
> > > >
> > > > ==================================================================
> > > > BUG: KCSAN: data-race in __rcu_read_unlock / sync_rcu_exp_select_cpus
> > > >
> > > > read to 0xffffffff85a7d440 of 8 bytes by task 7624 on cpu 0:
> > > >   rcu_read_unlock_special kernel/rcu/tree_plugin.h:615 [inline]
> > > >   __rcu_read_unlock+0x381/0x3c0 kernel/rcu/tree_plugin.h:383
> > > >   rcu_read_unlock include/linux/rcupdate.h:652 [inline]
> > > >   filemap_map_pages+0x5b3/0x990 mm/filemap.c:2687
> > > >   do_fault_around mm/memory.c:3450 [inline]
> > > >   do_read_fault mm/memory.c:3484 [inline]
> > > >   do_fault mm/memory.c:3618 [inline]
> > > >   handle_pte_fault mm/memory.c:3849 [inline]
> > > >   __handle_mm_fault+0x2554/0x2cb0 mm/memory.c:3973
> > > >   handle_mm_fault+0x21b/0x530 mm/memory.c:4010
> > > >   do_user_addr_fault arch/x86/mm/fault.c:1441 [inline]
> > > >   __do_page_fault+0x3fb/0x9e0 arch/x86/mm/fault.c:1506
> > > >   do_page_fault+0x54/0x233 arch/x86/mm/fault.c:1530
> > > >   page_fault+0x34/0x40 arch/x86/entry/entry_64.S:1202
> > > >
> > > > write to 0xffffffff85a7d440 of 8 bytes by task 7353 on cpu 1:
> > > >   sync_exp_reset_tree kernel/rcu/tree_exp.h:137 [inline]
> > > >   sync_rcu_exp_select_cpus+0xd5/0x590 kernel/rcu/tree_exp.h:427
> > > >   rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:575 [inline]
> > > >   wait_rcu_exp_gp+0x25/0x40 kernel/rcu/tree_exp.h:589
> > > >   process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
> > > >   worker_thread+0xa0/0x800 kernel/workqueue.c:2415
> > > >   kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
> > > >   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
> > > >
> > > > Reported by Kernel Concurrency Sanitizer on:
> > > > CPU: 1 PID: 7353 Comm: kworker/1:0 Not tainted 5.4.0-rc3+ #0
> > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > > Google 01/01/2011
> > > > Workqueue: rcu_gp wait_rcu_exp_gp
> > > > ==================================================================
> > > > Kernel panic - not syncing: panic_on_warn set ...
> > > > CPU: 1 PID: 7353 Comm: kworker/1:0 Not tainted 5.4.0-rc3+ #0
> > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > > Google 01/01/2011
> > > > Workqueue: rcu_gp wait_rcu_exp_gp
> > > > Call Trace:
> > > >   __dump_stack lib/dump_stack.c:77 [inline]
> > > >   dump_stack+0xf5/0x159 lib/dump_stack.c:113
> > > >   panic+0x210/0x640 kernel/panic.c:221
> > > >   kcsan_report.cold+0xc/0x10 kernel/kcsan/report.c:302
> > > >   __kcsan_setup_watchpoint+0x32e/0x4a0 kernel/kcsan/core.c:411
> > > >   __tsan_write8 kernel/kcsan/kcsan.c:36 [inline]
> > > >   __tsan_write8+0x32/0x40 kernel/kcsan/kcsan.c:36
> > > >   sync_exp_reset_tree kernel/rcu/tree_exp.h:137 [inline]
> > > >   sync_rcu_exp_select_cpus+0xd5/0x590 kernel/rcu/tree_exp.h:427
> > > >   rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:575 [inline]
> > > >   wait_rcu_exp_gp+0x25/0x40 kernel/rcu/tree_exp.h:589
> > > >   process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
> > > >   worker_thread+0xa0/0x800 kernel/workqueue.c:2415
> > > >   kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
> > > >   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
> > > > Shutting down cpus with NMI
> > > > Kernel Offset: disabled
> > > > Rebooting in 86400 seconds..
> > > >
> > > >
> > > > ---
> > > > This bug is generated by a bot. It may contain errors.
> > > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > > syzbot engineers can be reached at syzkaller@googlegroups.com.
> > > >
> > > > syzbot will keep track of this bug report. See:
> > > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> >
> > Again, good catch!  Does the patch shown below fix this?
> 
> Assuming writers will follow in another patch:
> 
> Acked-by: Marco Elver <elver@google.com>

Applied, thank you!

And this time actually remembered to include the Reported-by.

							Thanx, Paul

> Thanks,
> -- Marco
> 
> 
> >                                                         Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > commit c3c182aa501c13faccf5f2c168004a5c98a89d74
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > Date:   Mon Nov 4 08:22:45 2019 -0800
> >
> >     rcu: Use READ_ONCE() for ->expmask in rcu_read_unlock_special()
> >
> >     The rcu_node structure's ->expmask field is updated only when holding the
> >     ->lock, but is also accessed locklessly.  This means that all ->expmask
> >     updates must use WRITE_ONCE() and all reads carried out without holding
> >     ->lock must use READ_ONCE().  This commit therefore changes the lockless
> >     ->expmask read in rcu_read_unlock_special() to use READ_ONCE().
> >
> >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 6d54fd4..9a1b4b2 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -601,7 +601,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
> >                 struct rcu_node *rnp = rdp->mynode;
> >
> >                 exp = (t->rcu_blocked_node && t->rcu_blocked_node->exp_tasks) ||
> > -                     (rdp->grpmask & rnp->expmask) ||
> > +                     (rdp->grpmask & READ_ONCE(rnp->expmask)) ||
> >                       tick_nohz_full_cpu(rdp->cpu);
> >                 // Need to defer quiescent state until everything is enabled.
> >                 if (irqs_were_disabled && use_softirq &&

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-11-04 17:19 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-04 11:30 KCSAN: data-race in __rcu_read_unlock / sync_rcu_exp_select_cpus syzbot
2019-11-04 11:32 ` Marco Elver
2019-11-04 16:26   ` Paul E. McKenney
2019-11-04 17:09     ` Marco Elver
2019-11-04 17:19       ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).