* [PATCH] kcsan: Add option to allow watcher interruptions
@ 2020-02-20 14:15 Marco Elver
  2020-02-20 18:58 ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Marco Elver @ 2020-02-20 14:15 UTC (permalink / raw)
  To: elver; +Cc: paulmck, andreyknvl, glider, dvyukov, kasan-dev, linux-kernel

Add option to allow interrupts while a watchpoint is set up. This can be
enabled either via CONFIG_KCSAN_INTERRUPT_WATCHER or via the boot
parameter 'kcsan.interrupt_watcher=1'.

Note that currently not all safe per-CPU access primitives and patterns
are accounted for, which could result in false positives. For example,
asm-generic/percpu.h uses plain operations, which by default are
instrumented. If an interrupt then accesses the same per-CPU variable
as the interrupted code, KCSAN would currently report a data race with
this option enabled.

Therefore, this option should currently remain disabled by default, but
may be enabled for specific test scenarios.

Signed-off-by: Marco Elver <elver@google.com>
---

As an example, the first data race that this found:

write to 0xffff88806b3324b8 of 4 bytes by interrupt on cpu 0:
 rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]
 __rcu_read_lock+0x3c/0x50 kernel/rcu/tree_plugin.h:373
 rcu_read_lock include/linux/rcupdate.h:599 [inline]
 cpuacct_charge+0x36/0x80 kernel/sched/cpuacct.c:347
 cgroup_account_cputime include/linux/cgroup.h:773 [inline]
 update_curr+0xe2/0x1d0 kernel/sched/fair.c:860
 enqueue_entity+0x130/0x5d0 kernel/sched/fair.c:4005
 enqueue_task_fair+0xb0/0x420 kernel/sched/fair.c:5260
 enqueue_task kernel/sched/core.c:1302 [inline]
 activate_task+0x6d/0x110 kernel/sched/core.c:1324
 ttwu_do_activate.isra.0+0x40/0x60 kernel/sched/core.c:2266
 ttwu_queue kernel/sched/core.c:2411 [inline]
 try_to_wake_up+0x3be/0x6c0 kernel/sched/core.c:2645
 wake_up_process+0x10/0x20 kernel/sched/core.c:2669
 hrtimer_wakeup+0x4c/0x60 kernel/time/hrtimer.c:1769
 __run_hrtimer kernel/time/hrtimer.c:1517 [inline]
 __hrtimer_run_queues+0x274/0x5f0 kernel/time/hrtimer.c:1579
 hrtimer_interrupt+0x22d/0x490 kernel/time/hrtimer.c:1641
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1119 [inline]
 smp_apic_timer_interrupt+0xdc/0x280 arch/x86/kernel/apic/apic.c:1144
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
 delay_tsc+0x38/0xc0 arch/x86/lib/delay.c:68                   <--- interrupt while delayed
 __delay arch/x86/lib/delay.c:161 [inline]
 __const_udelay+0x33/0x40 arch/x86/lib/delay.c:175
 __udelay+0x10/0x20 arch/x86/lib/delay.c:181
 kcsan_setup_watchpoint+0x17f/0x400 kernel/kcsan/core.c:428
 check_access kernel/kcsan/core.c:550 [inline]
 __tsan_read4+0xc6/0x100 kernel/kcsan/core.c:685               <--- Enter KCSAN runtime
 rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]  <---+
 __rcu_read_lock+0x2a/0x50 kernel/rcu/tree_plugin.h:373            |
 rcu_read_lock include/linux/rcupdate.h:599 [inline]               |
 lock_page_memcg+0x31/0x110 mm/memcontrol.c:1972                   |
                                                                   |
read to 0xffff88806b3324b8 of 4 bytes by task 6131 on cpu 0:       |
 rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]  ----+
 __rcu_read_lock+0x2a/0x50 kernel/rcu/tree_plugin.h:373
 rcu_read_lock include/linux/rcupdate.h:599 [inline]
 lock_page_memcg+0x31/0x110 mm/memcontrol.c:1972

The writer is doing 'current->rcu_read_lock_nesting++'. The read is a
plain access that is equally vulnerable to compiler optimizations, so
KCSAN concludes this is a valid data race.
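
Both stack traces end in rcu_preempt_read_enter(); simplified, the
racing code is just this plain, unmarked read-modify-write of the
per-task counter (sketch of kernel/rcu/tree_plugin.h:353):

	static void rcu_preempt_read_enter(void)
	{
		/*
		 * Plain ++: the interrupt's increment can land between the
		 * load and the store of the task-level increment, and the
		 * compiler is free to transform either access.
		 */
		current->rcu_read_lock_nesting++;
	}
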
---
 kernel/kcsan/core.c | 30 ++++++++----------------------
 lib/Kconfig.kcsan   | 11 +++++++++++
 2 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c
index 589b1e7f0f253..43eb5f850c68e 100644
--- a/kernel/kcsan/core.c
+++ b/kernel/kcsan/core.c
@@ -21,6 +21,7 @@ static bool kcsan_early_enable = IS_ENABLED(CONFIG_KCSAN_EARLY_ENABLE);
 static unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK;
 static unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT;
 static long kcsan_skip_watch = CONFIG_KCSAN_SKIP_WATCH;
+static bool kcsan_interrupt_watcher = IS_ENABLED(CONFIG_KCSAN_INTERRUPT_WATCHER);
 
 #ifdef MODULE_PARAM_PREFIX
 #undef MODULE_PARAM_PREFIX
@@ -30,6 +31,7 @@ module_param_named(early_enable, kcsan_early_enable, bool, 0);
 module_param_named(udelay_task, kcsan_udelay_task, uint, 0644);
 module_param_named(udelay_interrupt, kcsan_udelay_interrupt, uint, 0644);
 module_param_named(skip_watch, kcsan_skip_watch, long, 0644);
+module_param_named(interrupt_watcher, kcsan_interrupt_watcher, bool, 0444);
 
 bool kcsan_enabled;
 
@@ -354,7 +356,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
 	unsigned long access_mask;
 	enum kcsan_value_change value_change = KCSAN_VALUE_CHANGE_MAYBE;
 	unsigned long ua_flags = user_access_save();
-	unsigned long irq_flags;
+	unsigned long irq_flags = 0;
 
 	/*
 	 * Always reset kcsan_skip counter in slow-path to avoid underflow; see
@@ -370,26 +372,9 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
 		goto out;
 	}
 
-	/*
-	 * Disable interrupts & preemptions to avoid another thread on the same
-	 * CPU accessing memory locations for the set up watchpoint; this is to
-	 * avoid reporting races to e.g. CPU-local data.
-	 *
-	 * An alternative would be adding the source CPU to the watchpoint
-	 * encoding, and checking that watchpoint-CPU != this-CPU. There are
-	 * several problems with this:
-	 *   1. we should avoid stealing more bits from the watchpoint encoding
-	 *      as it would affect accuracy, as well as increase performance
-	 *      overhead in the fast-path;
-	 *   2. if we are preempted, but there *is* a genuine data race, we
-	 *      would *not* report it -- since this is the common case (vs.
-	 *      CPU-local data accesses), it makes more sense (from a data race
-	 *      detection point of view) to simply disable preemptions to ensure
-	 *      as many tasks as possible run on other CPUs.
-	 *
-	 * Use raw versions, to avoid lockdep recursion via IRQ flags tracing.
-	 */
-	raw_local_irq_save(irq_flags);
+	if (!kcsan_interrupt_watcher)
+		/* Use raw to avoid lockdep recursion via IRQ flags tracing. */
+		raw_local_irq_save(irq_flags);
 
 	watchpoint = insert_watchpoint((unsigned long)ptr, size, is_write);
 	if (watchpoint == NULL) {
@@ -524,7 +509,8 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
 
 	kcsan_counter_dec(KCSAN_COUNTER_USED_WATCHPOINTS);
 out_unlock:
-	raw_local_irq_restore(irq_flags);
+	if (!kcsan_interrupt_watcher)
+		raw_local_irq_restore(irq_flags);
 out:
 	user_access_restore(ua_flags);
 }
diff --git a/lib/Kconfig.kcsan b/lib/Kconfig.kcsan
index ba9268076cfbc..0f1447ff8f558 100644
--- a/lib/Kconfig.kcsan
+++ b/lib/Kconfig.kcsan
@@ -101,6 +101,17 @@ config KCSAN_SKIP_WATCH_RANDOMIZE
 	  KCSAN_WATCH_SKIP. If false, the chosen value is always
 	  KCSAN_WATCH_SKIP.
 
+config KCSAN_INTERRUPT_WATCHER
+	bool "Interruptible watchers"
+	help
+	  If enabled, a task that set up a watchpoint may be interrupted while
+	  delayed. This option will allow KCSAN to detect races between
+	  interrupted tasks and other threads of execution on the same CPU.
+
+	  Currently disabled by default, because not all safe per-CPU access
+	  primitives and patterns may be accounted for, and therefore could
+	  result in false positives.
+
 config KCSAN_REPORT_ONCE_IN_MS
 	int "Duration in milliseconds, in which any given race is only reported once"
 	default 3000
-- 
2.25.0.265.gbab2e86ba0-goog



* Re: [PATCH] kcsan: Add option to allow watcher interruptions
  2020-02-20 14:15 [PATCH] kcsan: Add option to allow watcher interruptions Marco Elver
@ 2020-02-20 18:58 ` Paul E. McKenney
  2020-02-20 21:33   ` Marco Elver
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2020-02-20 18:58 UTC (permalink / raw)
  To: Marco Elver; +Cc: andreyknvl, glider, dvyukov, kasan-dev, linux-kernel

On Thu, Feb 20, 2020 at 03:15:51PM +0100, Marco Elver wrote:
> Add option to allow interrupts while a watchpoint is set up. This can be
> enabled either via CONFIG_KCSAN_INTERRUPT_WATCHER or via the boot
> parameter 'kcsan.interrupt_watcher=1'.
> 
> Note that, currently not all safe per-CPU access primitives and patterns
> are accounted for, which could result in false positives. For example,
> asm-generic/percpu.h uses plain operations, which by default are
> instrumented. On interrupts and subsequent accesses to the same
> variable, KCSAN would currently report a data race with this option.
> 
> Therefore, this option should currently remain disabled by default, but
> may be enabled for specific test scenarios.
> 
> Signed-off-by: Marco Elver <elver@google.com>

Queued for review and testing, thank you!

> ---
> 
> As an example, the first data race that this found:
> 
> write to 0xffff88806b3324b8 of 4 bytes by interrupt on cpu 0:
>  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]
>  __rcu_read_lock+0x3c/0x50 kernel/rcu/tree_plugin.h:373
>  rcu_read_lock include/linux/rcupdate.h:599 [inline]
>  cpuacct_charge+0x36/0x80 kernel/sched/cpuacct.c:347
>  cgroup_account_cputime include/linux/cgroup.h:773 [inline]
>  update_curr+0xe2/0x1d0 kernel/sched/fair.c:860
>  enqueue_entity+0x130/0x5d0 kernel/sched/fair.c:4005
>  enqueue_task_fair+0xb0/0x420 kernel/sched/fair.c:5260
>  enqueue_task kernel/sched/core.c:1302 [inline]
>  activate_task+0x6d/0x110 kernel/sched/core.c:1324
>  ttwu_do_activate.isra.0+0x40/0x60 kernel/sched/core.c:2266
>  ttwu_queue kernel/sched/core.c:2411 [inline]
>  try_to_wake_up+0x3be/0x6c0 kernel/sched/core.c:2645
>  wake_up_process+0x10/0x20 kernel/sched/core.c:2669
>  hrtimer_wakeup+0x4c/0x60 kernel/time/hrtimer.c:1769
>  __run_hrtimer kernel/time/hrtimer.c:1517 [inline]
>  __hrtimer_run_queues+0x274/0x5f0 kernel/time/hrtimer.c:1579
>  hrtimer_interrupt+0x22d/0x490 kernel/time/hrtimer.c:1641
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1119 [inline]
>  smp_apic_timer_interrupt+0xdc/0x280 arch/x86/kernel/apic/apic.c:1144
>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
>  delay_tsc+0x38/0xc0 arch/x86/lib/delay.c:68                   <--- interrupt while delayed
>  __delay arch/x86/lib/delay.c:161 [inline]
>  __const_udelay+0x33/0x40 arch/x86/lib/delay.c:175
>  __udelay+0x10/0x20 arch/x86/lib/delay.c:181
>  kcsan_setup_watchpoint+0x17f/0x400 kernel/kcsan/core.c:428
>  check_access kernel/kcsan/core.c:550 [inline]
>  __tsan_read4+0xc6/0x100 kernel/kcsan/core.c:685               <--- Enter KCSAN runtime
>  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]  <---+
>  __rcu_read_lock+0x2a/0x50 kernel/rcu/tree_plugin.h:373            |
>  rcu_read_lock include/linux/rcupdate.h:599 [inline]               |
>  lock_page_memcg+0x31/0x110 mm/memcontrol.c:1972                   |
>                                                                    |
> read to 0xffff88806b3324b8 of 4 bytes by task 6131 on cpu 0:       |
>  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]  ----+
>  __rcu_read_lock+0x2a/0x50 kernel/rcu/tree_plugin.h:373
>  rcu_read_lock include/linux/rcupdate.h:599 [inline]
>  lock_page_memcg+0x31/0x110 mm/memcontrol.c:1972
> 
> The writer is doing 'current->rcu_read_lock_nesting++'. The read is as
> vulnerable to compiler optimizations and would therefore conclude this
> is a valid data race.

Heh!  That one is a fun one!  It is on a very hot fastpath.  READ_ONCE()
and WRITE_ONCE() are likely to be measurable at the system level.

Thoughts on other options?

							Thanx, Paul

> ---
>  kernel/kcsan/core.c | 30 ++++++++----------------------
>  lib/Kconfig.kcsan   | 11 +++++++++++
>  2 files changed, 19 insertions(+), 22 deletions(-)
> 
> diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c
> index 589b1e7f0f253..43eb5f850c68e 100644
> --- a/kernel/kcsan/core.c
> +++ b/kernel/kcsan/core.c
> @@ -21,6 +21,7 @@ static bool kcsan_early_enable = IS_ENABLED(CONFIG_KCSAN_EARLY_ENABLE);
>  static unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK;
>  static unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT;
>  static long kcsan_skip_watch = CONFIG_KCSAN_SKIP_WATCH;
> +static bool kcsan_interrupt_watcher = IS_ENABLED(CONFIG_KCSAN_INTERRUPT_WATCHER);
>  
>  #ifdef MODULE_PARAM_PREFIX
>  #undef MODULE_PARAM_PREFIX
> @@ -30,6 +31,7 @@ module_param_named(early_enable, kcsan_early_enable, bool, 0);
>  module_param_named(udelay_task, kcsan_udelay_task, uint, 0644);
>  module_param_named(udelay_interrupt, kcsan_udelay_interrupt, uint, 0644);
>  module_param_named(skip_watch, kcsan_skip_watch, long, 0644);
> +module_param_named(interrupt_watcher, kcsan_interrupt_watcher, bool, 0444);
>  
>  bool kcsan_enabled;
>  
> @@ -354,7 +356,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
>  	unsigned long access_mask;
>  	enum kcsan_value_change value_change = KCSAN_VALUE_CHANGE_MAYBE;
>  	unsigned long ua_flags = user_access_save();
> -	unsigned long irq_flags;
> +	unsigned long irq_flags = 0;
>  
>  	/*
>  	 * Always reset kcsan_skip counter in slow-path to avoid underflow; see
> @@ -370,26 +372,9 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
>  		goto out;
>  	}
>  
> -	/*
> -	 * Disable interrupts & preemptions to avoid another thread on the same
> -	 * CPU accessing memory locations for the set up watchpoint; this is to
> -	 * avoid reporting races to e.g. CPU-local data.
> -	 *
> -	 * An alternative would be adding the source CPU to the watchpoint
> -	 * encoding, and checking that watchpoint-CPU != this-CPU. There are
> -	 * several problems with this:
> -	 *   1. we should avoid stealing more bits from the watchpoint encoding
> -	 *      as it would affect accuracy, as well as increase performance
> -	 *      overhead in the fast-path;
> -	 *   2. if we are preempted, but there *is* a genuine data race, we
> -	 *      would *not* report it -- since this is the common case (vs.
> -	 *      CPU-local data accesses), it makes more sense (from a data race
> -	 *      detection point of view) to simply disable preemptions to ensure
> -	 *      as many tasks as possible run on other CPUs.
> -	 *
> -	 * Use raw versions, to avoid lockdep recursion via IRQ flags tracing.
> -	 */
> -	raw_local_irq_save(irq_flags);
> +	if (!kcsan_interrupt_watcher)
> +		/* Use raw to avoid lockdep recursion via IRQ flags tracing. */
> +		raw_local_irq_save(irq_flags);
>  
>  	watchpoint = insert_watchpoint((unsigned long)ptr, size, is_write);
>  	if (watchpoint == NULL) {
> @@ -524,7 +509,8 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
>  
>  	kcsan_counter_dec(KCSAN_COUNTER_USED_WATCHPOINTS);
>  out_unlock:
> -	raw_local_irq_restore(irq_flags);
> +	if (!kcsan_interrupt_watcher)
> +		raw_local_irq_restore(irq_flags);
>  out:
>  	user_access_restore(ua_flags);
>  }
> diff --git a/lib/Kconfig.kcsan b/lib/Kconfig.kcsan
> index ba9268076cfbc..0f1447ff8f558 100644
> --- a/lib/Kconfig.kcsan
> +++ b/lib/Kconfig.kcsan
> @@ -101,6 +101,17 @@ config KCSAN_SKIP_WATCH_RANDOMIZE
>  	  KCSAN_WATCH_SKIP. If false, the chosen value is always
>  	  KCSAN_WATCH_SKIP.
>  
> +config KCSAN_INTERRUPT_WATCHER
> +	bool "Interruptible watchers"
> +	help
> +	  If enabled, a task that set up a watchpoint may be interrupted while
> +	  delayed. This option will allow KCSAN to detect races between
> +	  interrupted tasks and other threads of execution on the same CPU.
> +
> +	  Currently disabled by default, because not all safe per-CPU access
> +	  primitives and patterns may be accounted for, and therefore could
> +	  result in false positives.
> +
>  config KCSAN_REPORT_ONCE_IN_MS
>  	int "Duration in milliseconds, in which any given race is only reported once"
>  	default 3000
> -- 
> 2.25.0.265.gbab2e86ba0-goog
> 


* Re: [PATCH] kcsan: Add option to allow watcher interruptions
  2020-02-20 18:58 ` Paul E. McKenney
@ 2020-02-20 21:33   ` Marco Elver
  2020-07-25 14:56     ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Marco Elver @ 2020-02-20 21:33 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: andreyknvl, glider, dvyukov, kasan-dev, linux-kernel



On Thu, 20 Feb 2020, Paul E. McKenney wrote:

> On Thu, Feb 20, 2020 at 03:15:51PM +0100, Marco Elver wrote:
> > Add option to allow interrupts while a watchpoint is set up. This can be
> > enabled either via CONFIG_KCSAN_INTERRUPT_WATCHER or via the boot
> > parameter 'kcsan.interrupt_watcher=1'.
> > 
> > Note that, currently not all safe per-CPU access primitives and patterns
> > are accounted for, which could result in false positives. For example,
> > asm-generic/percpu.h uses plain operations, which by default are
> > instrumented. On interrupts and subsequent accesses to the same
> > variable, KCSAN would currently report a data race with this option.
> > 
> > Therefore, this option should currently remain disabled by default, but
> > may be enabled for specific test scenarios.
> > 
> > Signed-off-by: Marco Elver <elver@google.com>
> 
> Queued for review and testing, thank you!
> 
> > ---
> > 
> > As an example, the first data race that this found:
> > 
> > write to 0xffff88806b3324b8 of 4 bytes by interrupt on cpu 0:
> >  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]
> >  __rcu_read_lock+0x3c/0x50 kernel/rcu/tree_plugin.h:373
> >  rcu_read_lock include/linux/rcupdate.h:599 [inline]
> >  cpuacct_charge+0x36/0x80 kernel/sched/cpuacct.c:347
> >  cgroup_account_cputime include/linux/cgroup.h:773 [inline]
> >  update_curr+0xe2/0x1d0 kernel/sched/fair.c:860
> >  enqueue_entity+0x130/0x5d0 kernel/sched/fair.c:4005
> >  enqueue_task_fair+0xb0/0x420 kernel/sched/fair.c:5260
> >  enqueue_task kernel/sched/core.c:1302 [inline]
> >  activate_task+0x6d/0x110 kernel/sched/core.c:1324
> >  ttwu_do_activate.isra.0+0x40/0x60 kernel/sched/core.c:2266
> >  ttwu_queue kernel/sched/core.c:2411 [inline]
> >  try_to_wake_up+0x3be/0x6c0 kernel/sched/core.c:2645
> >  wake_up_process+0x10/0x20 kernel/sched/core.c:2669
> >  hrtimer_wakeup+0x4c/0x60 kernel/time/hrtimer.c:1769
> >  __run_hrtimer kernel/time/hrtimer.c:1517 [inline]
> >  __hrtimer_run_queues+0x274/0x5f0 kernel/time/hrtimer.c:1579
> >  hrtimer_interrupt+0x22d/0x490 kernel/time/hrtimer.c:1641
> >  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1119 [inline]
> >  smp_apic_timer_interrupt+0xdc/0x280 arch/x86/kernel/apic/apic.c:1144
> >  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
> >  delay_tsc+0x38/0xc0 arch/x86/lib/delay.c:68                   <--- interrupt while delayed
> >  __delay arch/x86/lib/delay.c:161 [inline]
> >  __const_udelay+0x33/0x40 arch/x86/lib/delay.c:175
> >  __udelay+0x10/0x20 arch/x86/lib/delay.c:181
> >  kcsan_setup_watchpoint+0x17f/0x400 kernel/kcsan/core.c:428
> >  check_access kernel/kcsan/core.c:550 [inline]
> >  __tsan_read4+0xc6/0x100 kernel/kcsan/core.c:685               <--- Enter KCSAN runtime
> >  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]  <---+
> >  __rcu_read_lock+0x2a/0x50 kernel/rcu/tree_plugin.h:373            |
> >  rcu_read_lock include/linux/rcupdate.h:599 [inline]               |
> >  lock_page_memcg+0x31/0x110 mm/memcontrol.c:1972                   |
> >                                                                    |
> > read to 0xffff88806b3324b8 of 4 bytes by task 6131 on cpu 0:       |
> >  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]  ----+
> >  __rcu_read_lock+0x2a/0x50 kernel/rcu/tree_plugin.h:373
> >  rcu_read_lock include/linux/rcupdate.h:599 [inline]
> >  lock_page_memcg+0x31/0x110 mm/memcontrol.c:1972
> > 
> > The writer is doing 'current->rcu_read_lock_nesting++'. The read is as
> > vulnerable to compiler optimizations and would therefore conclude this
> > is a valid data race.
> 
> Heh!  That one is a fun one!  It is on a very hot fastpath.  READ_ONCE()
> and WRITE_ONCE() are likely to be measurable at the system level.
> 
> Thoughts on other options?

Would this be a use-case for local_t? Don't think this_cpu ops work
here.

See below idea. This would avoid the data race (KCSAN stopped
complaining) and seems to generate reasonable code.

Version before:

 <__rcu_read_lock>:
     130	mov    %gs:0x0,%rax
     137
     139	addl   $0x1,0x370(%rax)
     140	retq   
     141	data16 nopw %cs:0x0(%rax,%rax,1)
     148
     14c	nopl   0x0(%rax)

Version after:

 <__rcu_read_lock>:
     130	mov    %gs:0x0,%rax
     137
     139	incq   0x370(%rax)
     140	retq   
     141	data16 nopw %cs:0x0(%rax,%rax,1)
     148
     14c	nopl   0x0(%rax)

I haven't checked the other places where it is used, though.
(Can send it as a patch if you think this might work.)

Thanks,
-- Marco

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 2678a37c31696..3d8586ee7ae64 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -50,7 +50,7 @@ void __rcu_read_unlock(void);
  * nesting depth, but makes sense only if CONFIG_PREEMPT_RCU -- in other
  * types of kernel builds, the rcu_read_lock() nesting depth is unknowable.
  */
-#define rcu_preempt_depth() (current->rcu_read_lock_nesting)
+#define rcu_preempt_depth() local_read(&current->rcu_read_lock_nesting)
 
 #else /* #ifdef CONFIG_PREEMPT_RCU */
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0918904c939d2..70d7e3257feed 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -10,6 +10,7 @@
 #include <uapi/linux/sched.h>
 
 #include <asm/current.h>
+#include <asm/local.h>
 
 #include <linux/pid.h>
 #include <linux/sem.h>
@@ -708,7 +709,7 @@ struct task_struct {
 	cpumask_t			cpus_mask;
 
 #ifdef CONFIG_PREEMPT_RCU
-	int				rcu_read_lock_nesting;
+	local_t				rcu_read_lock_nesting;
 	union rcu_special		rcu_read_unlock_special;
 	struct list_head		rcu_node_entry;
 	struct rcu_node			*rcu_blocked_node;
diff --git a/init/init_task.c b/init/init_task.c
index 096191d177d5c..941777fce11e5 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -130,7 +130,7 @@ struct task_struct init_task
 	.perf_event_list = LIST_HEAD_INIT(init_task.perf_event_list),
 #endif
 #ifdef CONFIG_PREEMPT_RCU
-	.rcu_read_lock_nesting = 0,
+	.rcu_read_lock_nesting = LOCAL_INIT(0),
 	.rcu_read_unlock_special.s = 0,
 	.rcu_node_entry = LIST_HEAD_INIT(init_task.rcu_node_entry),
 	.rcu_blocked_node = NULL,
diff --git a/kernel/fork.c b/kernel/fork.c
index 60a1295f43843..43af326081b06 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1669,7 +1669,7 @@ init_task_pid(struct task_struct *task, enum pid_type type, struct pid *pid)
 static inline void rcu_copy_process(struct task_struct *p)
 {
 #ifdef CONFIG_PREEMPT_RCU
-	p->rcu_read_lock_nesting = 0;
+	local_set(&p->rcu_read_lock_nesting, 0);
 	p->rcu_read_unlock_special.s = 0;
 	p->rcu_blocked_node = NULL;
 	INIT_LIST_HEAD(&p->rcu_node_entry);
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index c6ea81cd41890..e0595abd50c0f 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -350,17 +350,17 @@ static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp)
 
 static void rcu_preempt_read_enter(void)
 {
-	current->rcu_read_lock_nesting++;
+	local_inc(&current->rcu_read_lock_nesting);
 }
 
 static void rcu_preempt_read_exit(void)
 {
-	current->rcu_read_lock_nesting--;
+	local_dec(&current->rcu_read_lock_nesting);
 }
 
 static void rcu_preempt_depth_set(int val)
 {
-	current->rcu_read_lock_nesting = val;
+	local_set(&current->rcu_read_lock_nesting, val);
 }
 
 /*


* Re: [PATCH] kcsan: Add option to allow watcher interruptions
  2020-02-20 21:33   ` Marco Elver
@ 2020-07-25 14:56     ` Paul E. McKenney
  2020-07-25 15:17       ` Marco Elver
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2020-07-25 14:56 UTC (permalink / raw)
  To: Marco Elver; +Cc: andreyknvl, glider, dvyukov, kasan-dev, linux-kernel

On Thu, Feb 20, 2020 at 10:33:17PM +0100, Marco Elver wrote:
> On Thu, 20 Feb 2020, Paul E. McKenney wrote:

I am clearly not keeping up...  :-/

> > On Thu, Feb 20, 2020 at 03:15:51PM +0100, Marco Elver wrote:
> > > Add option to allow interrupts while a watchpoint is set up. This can be
> > > enabled either via CONFIG_KCSAN_INTERRUPT_WATCHER or via the boot
> > > parameter 'kcsan.interrupt_watcher=1'.
> > > 
> > > Note that, currently not all safe per-CPU access primitives and patterns
> > > are accounted for, which could result in false positives. For example,
> > > asm-generic/percpu.h uses plain operations, which by default are
> > > instrumented. On interrupts and subsequent accesses to the same
> > > variable, KCSAN would currently report a data race with this option.
> > > 
> > > Therefore, this option should currently remain disabled by default, but
> > > may be enabled for specific test scenarios.
> > > 
> > > Signed-off-by: Marco Elver <elver@google.com>
> > 
> > Queued for review and testing, thank you!
> > 
> > > ---
> > > 
> > > As an example, the first data race that this found:
> > > 
> > > write to 0xffff88806b3324b8 of 4 bytes by interrupt on cpu 0:
> > >  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]
> > >  __rcu_read_lock+0x3c/0x50 kernel/rcu/tree_plugin.h:373
> > >  rcu_read_lock include/linux/rcupdate.h:599 [inline]
> > >  cpuacct_charge+0x36/0x80 kernel/sched/cpuacct.c:347
> > >  cgroup_account_cputime include/linux/cgroup.h:773 [inline]
> > >  update_curr+0xe2/0x1d0 kernel/sched/fair.c:860
> > >  enqueue_entity+0x130/0x5d0 kernel/sched/fair.c:4005
> > >  enqueue_task_fair+0xb0/0x420 kernel/sched/fair.c:5260
> > >  enqueue_task kernel/sched/core.c:1302 [inline]
> > >  activate_task+0x6d/0x110 kernel/sched/core.c:1324
> > >  ttwu_do_activate.isra.0+0x40/0x60 kernel/sched/core.c:2266
> > >  ttwu_queue kernel/sched/core.c:2411 [inline]
> > >  try_to_wake_up+0x3be/0x6c0 kernel/sched/core.c:2645
> > >  wake_up_process+0x10/0x20 kernel/sched/core.c:2669
> > >  hrtimer_wakeup+0x4c/0x60 kernel/time/hrtimer.c:1769
> > >  __run_hrtimer kernel/time/hrtimer.c:1517 [inline]
> > >  __hrtimer_run_queues+0x274/0x5f0 kernel/time/hrtimer.c:1579
> > >  hrtimer_interrupt+0x22d/0x490 kernel/time/hrtimer.c:1641
> > >  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1119 [inline]
> > >  smp_apic_timer_interrupt+0xdc/0x280 arch/x86/kernel/apic/apic.c:1144
> > >  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
> > >  delay_tsc+0x38/0xc0 arch/x86/lib/delay.c:68                   <--- interrupt while delayed
> > >  __delay arch/x86/lib/delay.c:161 [inline]
> > >  __const_udelay+0x33/0x40 arch/x86/lib/delay.c:175
> > >  __udelay+0x10/0x20 arch/x86/lib/delay.c:181
> > >  kcsan_setup_watchpoint+0x17f/0x400 kernel/kcsan/core.c:428
> > >  check_access kernel/kcsan/core.c:550 [inline]
> > >  __tsan_read4+0xc6/0x100 kernel/kcsan/core.c:685               <--- Enter KCSAN runtime
> > >  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]  <---+
> > >  __rcu_read_lock+0x2a/0x50 kernel/rcu/tree_plugin.h:373            |
> > >  rcu_read_lock include/linux/rcupdate.h:599 [inline]               |
> > >  lock_page_memcg+0x31/0x110 mm/memcontrol.c:1972                   |
> > >                                                                    |
> > > read to 0xffff88806b3324b8 of 4 bytes by task 6131 on cpu 0:       |
> > >  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]  ----+
> > >  __rcu_read_lock+0x2a/0x50 kernel/rcu/tree_plugin.h:373
> > >  rcu_read_lock include/linux/rcupdate.h:599 [inline]
> > >  lock_page_memcg+0x31/0x110 mm/memcontrol.c:1972
> > > 
> > > The writer is doing 'current->rcu_read_lock_nesting++'. The read is as
> > > vulnerable to compiler optimizations and would therefore conclude this
> > > is a valid data race.
> > 
> > Heh!  That one is a fun one!  It is on a very hot fastpath.  READ_ONCE()
> > and WRITE_ONCE() are likely to be measurable at the system level.
> > 
> > Thoughts on other options?
> 
> Would this be a use-case for local_t? Don't think this_cpu ops work
> here.
> 
> See below idea. This would avoid the data race (KCSAN stopped
> complaining) and seems to generate reasonable code.
> 
> Version before:
> 
>  <__rcu_read_lock>:
>      130	mov    %gs:0x0,%rax
>      137
>      139	addl   $0x1,0x370(%rax)
>      140	retq   
>      141	data16 nopw %cs:0x0(%rax,%rax,1)
>      148
>      14c	nopl   0x0(%rax)
> 
> Version after:
> 
>  <__rcu_read_lock>:
>      130	mov    %gs:0x0,%rax
>      137
>      139	incq   0x370(%rax)
>      140	retq   
>      141	data16 nopw %cs:0x0(%rax,%rax,1)
>      148
>      14c	nopl   0x0(%rax)
> 
> I haven't checked the other places where it is used, though.
> (Can send it as a patch if you think this might work.)
> 
> Thanks,
> -- Marco
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 2678a37c31696..3d8586ee7ae64 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -50,7 +50,7 @@ void __rcu_read_unlock(void);
>   * nesting depth, but makes sense only if CONFIG_PREEMPT_RCU -- in other
>   * types of kernel builds, the rcu_read_lock() nesting depth is unknowable.
>   */
> -#define rcu_preempt_depth() (current->rcu_read_lock_nesting)
> +#define rcu_preempt_depth() local_read(&current->rcu_read_lock_nesting)
>  
>  #else /* #ifdef CONFIG_PREEMPT_RCU */
>  
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 0918904c939d2..70d7e3257feed 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -10,6 +10,7 @@
>  #include <uapi/linux/sched.h>
>  
>  #include <asm/current.h>
> +#include <asm/local.h>
>  
>  #include <linux/pid.h>
>  #include <linux/sem.h>
> @@ -708,7 +709,7 @@ struct task_struct {
>  	cpumask_t			cpus_mask;
>  
>  #ifdef CONFIG_PREEMPT_RCU
> -	int				rcu_read_lock_nesting;
> +	local_t				rcu_read_lock_nesting;
>  	union rcu_special		rcu_read_unlock_special;
>  	struct list_head		rcu_node_entry;
>  	struct rcu_node			*rcu_blocked_node;
> diff --git a/init/init_task.c b/init/init_task.c
> index 096191d177d5c..941777fce11e5 100644
> --- a/init/init_task.c
> +++ b/init/init_task.c
> @@ -130,7 +130,7 @@ struct task_struct init_task
>  	.perf_event_list = LIST_HEAD_INIT(init_task.perf_event_list),
>  #endif
>  #ifdef CONFIG_PREEMPT_RCU
> -	.rcu_read_lock_nesting = 0,
> +	.rcu_read_lock_nesting = LOCAL_INIT(0),
>  	.rcu_read_unlock_special.s = 0,
>  	.rcu_node_entry = LIST_HEAD_INIT(init_task.rcu_node_entry),
>  	.rcu_blocked_node = NULL,
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 60a1295f43843..43af326081b06 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1669,7 +1669,7 @@ init_task_pid(struct task_struct *task, enum pid_type type, struct pid *pid)
>  static inline void rcu_copy_process(struct task_struct *p)
>  {
>  #ifdef CONFIG_PREEMPT_RCU
> -	p->rcu_read_lock_nesting = 0;
> +	local_set(&p->rcu_read_lock_nesting, 0);
>  	p->rcu_read_unlock_special.s = 0;
>  	p->rcu_blocked_node = NULL;
>  	INIT_LIST_HEAD(&p->rcu_node_entry);
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index c6ea81cd41890..e0595abd50c0f 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -350,17 +350,17 @@ static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp)
>  
>  static void rcu_preempt_read_enter(void)
>  {
> -	current->rcu_read_lock_nesting++;
> +	local_inc(&current->rcu_read_lock_nesting);
>  }
>  
>  static void rcu_preempt_read_exit(void)
>  {
> -	current->rcu_read_lock_nesting--;
> +	local_dec(&current->rcu_read_lock_nesting);
>  }
>  
>  static void rcu_preempt_depth_set(int val)
>  {
> -	current->rcu_read_lock_nesting = val;
> +	local_set(&current->rcu_read_lock_nesting, val);

I agree that this removes the data races, and that the code for x86 is
quite nice, but aren't rcu_read_lock() and rcu_read_unlock() going to
have heavyweight atomic operations on many CPUs?

Maybe I am stuck with arch-specific code in rcu_read_lock() and
rcu_preempt_read_exit().  I suppose worse things could happen.
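
For reference, the generic fallback is roughly (sketch of
include/asm-generic/local.h from memory, so double-check the details):

	typedef struct { atomic_long_t a; } local_t;

	#define local_inc(l)	atomic_long_inc(&(l)->a)
	#define local_dec(l)	atomic_long_dec(&(l)->a)

so on architectures without an optimized local_t, every rcu_read_lock()
and rcu_read_unlock() would pick up a full atomic RMW.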

							Thanx, Paul


* Re: [PATCH] kcsan: Add option to allow watcher interruptions
  2020-07-25 14:56     ` Paul E. McKenney
@ 2020-07-25 15:17       ` Marco Elver
  2020-07-25 17:44         ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Marco Elver @ 2020-07-25 15:17 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Andrey Konovalov, Alexander Potapenko, Dmitry Vyukov, kasan-dev,
	LKML, Peter Zijlstra

[+Peter]

On Sat, 25 Jul 2020 at 16:56, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Feb 20, 2020 at 10:33:17PM +0100, Marco Elver wrote:
> > On Thu, 20 Feb 2020, Paul E. McKenney wrote:
>
> I am clearly not keeping up...  :-/

Not to worry, I think the local_t idea was discarded at one point
anyway, based on Peter's feedback.

> > > On Thu, Feb 20, 2020 at 03:15:51PM +0100, Marco Elver wrote:
> > > > Add option to allow interrupts while a watchpoint is set up. This can be
> > > > enabled either via CONFIG_KCSAN_INTERRUPT_WATCHER or via the boot
> > > > parameter 'kcsan.interrupt_watcher=1'.
[...]
> > > > As an example, the first data race that this found:
> > > >
> > > > write to 0xffff88806b3324b8 of 4 bytes by interrupt on cpu 0:
> > > >  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]
> > > >  __rcu_read_lock+0x3c/0x50 kernel/rcu/tree_plugin.h:373
[...]
> > > > read to 0xffff88806b3324b8 of 4 bytes by task 6131 on cpu 0:       |
> > > >  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]  ----+
[...]
> > > >
> > > > The writer is doing 'current->rcu_read_lock_nesting++'. The read is as
> > > > vulnerable to compiler optimizations and would therefore conclude this
> > > > is a valid data race.
> > >
> > > Heh!  That one is a fun one!  It is on a very hot fastpath.  READ_ONCE()
> > > and WRITE_ONCE() are likely to be measurable at the system level.
> > >
> > > Thoughts on other options?
> >
> > Would this be a use-case for local_t? Don't think this_cpu ops work
> > here.
> >
> > See below idea. This would avoid the data race (KCSAN stopped
> > complaining) and seems to generate reasonable code.
> >
> > Version before:
> >
> >  <__rcu_read_lock>:
> >      130      mov    %gs:0x0,%rax
> >      137
> >      139      addl   $0x1,0x370(%rax)
> >      140      retq
> >      141      data16 nopw %cs:0x0(%rax,%rax,1)
> >      148
> >      14c      nopl   0x0(%rax)
> >
> > Version after:
> >
> >  <__rcu_read_lock>:
> >      130      mov    %gs:0x0,%rax
> >      137
> >      139      incq   0x370(%rax)
> >      140      retq
> >      141      data16 nopw %cs:0x0(%rax,%rax,1)
> >      148
> >      14c      nopl   0x0(%rax)
> >
> > I haven't checked the other places where it is used, though.
> > (Can send it as a patch if you think this might work.)
> >
> > Thanks,
> > -- Marco
> >
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 2678a37c31696..3d8586ee7ae64 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -50,7 +50,7 @@ void __rcu_read_unlock(void);
> >   * nesting depth, but makes sense only if CONFIG_PREEMPT_RCU -- in other
> >   * types of kernel builds, the rcu_read_lock() nesting depth is unknowable.
> >   */
> > -#define rcu_preempt_depth() (current->rcu_read_lock_nesting)
> > +#define rcu_preempt_depth() local_read(&current->rcu_read_lock_nesting)
> >
> >  #else /* #ifdef CONFIG_PREEMPT_RCU */
> >
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 0918904c939d2..70d7e3257feed 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -10,6 +10,7 @@
> >  #include <uapi/linux/sched.h>
> >
> >  #include <asm/current.h>
> > +#include <asm/local.h>
> >
> >  #include <linux/pid.h>
> >  #include <linux/sem.h>
> > @@ -708,7 +709,7 @@ struct task_struct {
> >       cpumask_t                       cpus_mask;
> >
> >  #ifdef CONFIG_PREEMPT_RCU
> > -     int                             rcu_read_lock_nesting;
> > +     local_t                         rcu_read_lock_nesting;
> >       union rcu_special               rcu_read_unlock_special;
> >       struct list_head                rcu_node_entry;
> >       struct rcu_node                 *rcu_blocked_node;
> > diff --git a/init/init_task.c b/init/init_task.c
> > index 096191d177d5c..941777fce11e5 100644
> > --- a/init/init_task.c
> > +++ b/init/init_task.c
> > @@ -130,7 +130,7 @@ struct task_struct init_task
> >       .perf_event_list = LIST_HEAD_INIT(init_task.perf_event_list),
> >  #endif
> >  #ifdef CONFIG_PREEMPT_RCU
> > -     .rcu_read_lock_nesting = 0,
> > +     .rcu_read_lock_nesting = LOCAL_INIT(0),
> >       .rcu_read_unlock_special.s = 0,
> >       .rcu_node_entry = LIST_HEAD_INIT(init_task.rcu_node_entry),
> >       .rcu_blocked_node = NULL,
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 60a1295f43843..43af326081b06 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -1669,7 +1669,7 @@ init_task_pid(struct task_struct *task, enum pid_type type, struct pid *pid)
> >  static inline void rcu_copy_process(struct task_struct *p)
> >  {
> >  #ifdef CONFIG_PREEMPT_RCU
> > -     p->rcu_read_lock_nesting = 0;
> > +     local_set(&p->rcu_read_lock_nesting, 0);
> >       p->rcu_read_unlock_special.s = 0;
> >       p->rcu_blocked_node = NULL;
> >       INIT_LIST_HEAD(&p->rcu_node_entry);
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index c6ea81cd41890..e0595abd50c0f 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -350,17 +350,17 @@ static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp)
> >
> >  static void rcu_preempt_read_enter(void)
> >  {
> > -     current->rcu_read_lock_nesting++;
> > +     local_inc(&current->rcu_read_lock_nesting);
> >  }
> >
> >  static void rcu_preempt_read_exit(void)
> >  {
> > -     current->rcu_read_lock_nesting--;
> > +     local_dec(&current->rcu_read_lock_nesting);
> >  }
> >
> >  static void rcu_preempt_depth_set(int val)
> >  {
> > -     current->rcu_read_lock_nesting = val;
> > +     local_set(&current->rcu_read_lock_nesting, val);

> I agree that this removes the data races, and that the code for x86 is
> quite nice, but aren't rcu_read_lock() and rcu_read_unlock() going to
> have heavyweight atomic operations on many CPUs?
>
> Maybe I am stuck with arch-specific code in rcu_read_lock() and
> rcu_preempt_read_exit().  I suppose worse things could happen.

Peter also mentioned to me that while local_t on x86 generates
reasonable code, on other architectures it's terrible. So I think
something else is needed, and feel free to discard the above idea.
With sufficient reasoning, how bad would a 'data_race(..)' be?
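
For concreteness, the data_race() variant would just be (sketch only,
assuming we can sufficiently justify why the plain increment is safe):

	static void rcu_preempt_read_enter(void)
	{
		/* Plain increment; data_race() only tells KCSAN it is intentional. */
		data_race(current->rcu_read_lock_nesting++);
	}

and in non-KCSAN builds this should compile to exactly the plain
increment.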

Thanks,
-- Marco


* Re: [PATCH] kcsan: Add option to allow watcher interruptions
  2020-07-25 15:17       ` Marco Elver
@ 2020-07-25 17:44         ` Peter Zijlstra
  2020-07-25 19:39           ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2020-07-25 17:44 UTC (permalink / raw)
  To: Marco Elver
  Cc: Paul E. McKenney, Andrey Konovalov, Alexander Potapenko,
	Dmitry Vyukov, kasan-dev, LKML

On Sat, Jul 25, 2020 at 05:17:43PM +0200, Marco Elver wrote:
> On Sat, 25 Jul 2020 at 16:56, Paul E. McKenney <paulmck@kernel.org> wrote:
> > On Thu, Feb 20, 2020 at 10:33:17PM +0100, Marco Elver wrote:
> > > On Thu, 20 Feb 2020, Paul E. McKenney wrote:
> > > > On Thu, Feb 20, 2020 at 03:15:51PM +0100, Marco Elver wrote:
> > > > > Add option to allow interrupts while a watchpoint is set up. This can be
> > > > > enabled either via CONFIG_KCSAN_INTERRUPT_WATCHER or via the boot
> > > > > parameter 'kcsan.interrupt_watcher=1'.
> [...]
> > > > > As an example, the first data race that this found:
> > > > >
> > > > > write to 0xffff88806b3324b8 of 4 bytes by interrupt on cpu 0:
> > > > >  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]
> > > > >  __rcu_read_lock+0x3c/0x50 kernel/rcu/tree_plugin.h:373
> [...]
> > > > > read to 0xffff88806b3324b8 of 4 bytes by task 6131 on cpu 0:       |
> > > > >  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]  ----+
> [...]
> > > > >
> > > > > The writer is doing 'current->rcu_read_lock_nesting++'. The read is as
> > > > > vulnerable to compiler optimizations and would therefore conclude this
> > > > > is a valid data race.
> > > >
> > > > Heh!  That one is a fun one!  It is on a very hot fastpath.  READ_ONCE()
> > > > and WRITE_ONCE() are likely to be measurable at the system level.
> > > >
> > > > Thoughts on other options?

> > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > > index c6ea81cd41890..e0595abd50c0f 100644
> > > --- a/kernel/rcu/tree_plugin.h
> > > +++ b/kernel/rcu/tree_plugin.h
> > > @@ -350,17 +350,17 @@ static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp)
> > >
> > >  static void rcu_preempt_read_enter(void)
> > >  {
> > > -     current->rcu_read_lock_nesting++;
> > > +     local_inc(&current->rcu_read_lock_nesting);
> > >  }
> > >
> > >  static void rcu_preempt_read_exit(void)
> > >  {
> > > -     current->rcu_read_lock_nesting--;
> > > +     local_dec(&current->rcu_read_lock_nesting);
> > >  }
> > >
> > >  static void rcu_preempt_depth_set(int val)
> > >  {
> > > -     current->rcu_read_lock_nesting = val;
> > > +     local_set(&current->rcu_read_lock_nesting, val);
> 
> > I agree that this removes the data races, and that the code for x86 is
> > quite nice, but aren't rcu_read_lock() and rcu_read_unlock() going to
> > have heavyweight atomic operations on many CPUs?
> >
> > Maybe I am stuck with arch-specific code in rcu_read_lock() and
> > rcu_preempt_read_exit().  I suppose worse things could happen.
> 
> Peter also mentioned to me that while local_t on x86 generates
> reasonable code, on other architectures it's terrible. So I think
> something else is needed, and feel free to discard the above idea.
> With sufficient enough reasoning, how bad would a 'data_race(..)' be?

Right, so local_t is atrocious on many architectures; they fall back to
atomic_t.

Even architectures that have optimized variants (eg. Power), they're
quite a lot more expensive than what we actually need here.

Only architectures like x86 that have single instruction memops can
generate anywhere near the code that we'd want here.

So the thing is, since RCU count is 0 per context (an IRQ must have an
equal amount of rcu_read_unlock() as it has rcu_read_lock()), interrupts
are not in fact a problem, even on load-store (RISC) architectures
(preempt_count has the same thing).

So the addition/subtraction in rcu_preempt_read_{enter,exit}() doesn't
need to be atomic vs interrupts. The only thing we really do need is
them being single-copy-atomic.

The problem with READ/WRITE_ONCE is that if we were to use it, we'd end
up with a load-store, even on x86, which is sub-optimal.

I suppose the 'correct' code here would be something like:

	*((volatile int *)&current->rcu_read_lock_nesting)++;

then the compiler can either do a single memop (x86 and the like) or a
load-store that is free from tearing.
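
For contrast, the marked variant we would rather avoid looks something
like this (sketch only; READ_ONCE()/WRITE_ONCE() force the compiler to
emit a separate load and store, even on x86):

	static void rcu_preempt_read_enter(void)
	{
		/* No tearing, but never a single read-modify-write memop. */
		WRITE_ONCE(current->rcu_read_lock_nesting,
			   READ_ONCE(current->rcu_read_lock_nesting) + 1);
	}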




* Re: [PATCH] kcsan: Add option to allow watcher interruptions
  2020-07-25 17:44         ` Peter Zijlstra
@ 2020-07-25 19:39           ` Paul E. McKenney
  2020-07-25 20:10             ` peterz
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2020-07-25 19:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Marco Elver, Andrey Konovalov, Alexander Potapenko,
	Dmitry Vyukov, kasan-dev, LKML

On Sat, Jul 25, 2020 at 07:44:30PM +0200, Peter Zijlstra wrote:
> On Sat, Jul 25, 2020 at 05:17:43PM +0200, Marco Elver wrote:
> > On Sat, 25 Jul 2020 at 16:56, Paul E. McKenney <paulmck@kernel.org> wrote:
> > > On Thu, Feb 20, 2020 at 10:33:17PM +0100, Marco Elver wrote:
> > > > On Thu, 20 Feb 2020, Paul E. McKenney wrote:
> > > > > On Thu, Feb 20, 2020 at 03:15:51PM +0100, Marco Elver wrote:
> > > > > > Add option to allow interrupts while a watchpoint is set up. This can be
> > > > > > enabled either via CONFIG_KCSAN_INTERRUPT_WATCHER or via the boot
> > > > > > parameter 'kcsan.interrupt_watcher=1'.
> > [...]
> > > > > > As an example, the first data race that this found:
> > > > > >
> > > > > > write to 0xffff88806b3324b8 of 4 bytes by interrupt on cpu 0:
> > > > > >  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]
> > > > > >  __rcu_read_lock+0x3c/0x50 kernel/rcu/tree_plugin.h:373
> > [...]
> > > > > > read to 0xffff88806b3324b8 of 4 bytes by task 6131 on cpu 0:       |
> > > > > >  rcu_preempt_read_enter kernel/rcu/tree_plugin.h:353 [inline]  ----+
> > [...]
> > > > > >
> > > > > > The writer is doing 'current->rcu_read_lock_nesting++'. The read is as
> > > > > > vulnerable to compiler optimizations and would therefore conclude this
> > > > > > is a valid data race.
> > > > >
> > > > > Heh!  That one is a fun one!  It is on a very hot fastpath.  READ_ONCE()
> > > > > and WRITE_ONCE() are likely to be measurable at the system level.
> > > > >
> > > > > Thoughts on other options?
> 
> > > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > > > index c6ea81cd41890..e0595abd50c0f 100644
> > > > --- a/kernel/rcu/tree_plugin.h
> > > > +++ b/kernel/rcu/tree_plugin.h
> > > > @@ -350,17 +350,17 @@ static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp)
> > > >
> > > >  static void rcu_preempt_read_enter(void)
> > > >  {
> > > > -     current->rcu_read_lock_nesting++;
> > > > +     local_inc(&current->rcu_read_lock_nesting);
> > > >  }
> > > >
> > > >  static void rcu_preempt_read_exit(void)
> > > >  {
> > > > -     current->rcu_read_lock_nesting--;
> > > > +     local_dec(&current->rcu_read_lock_nesting);
> > > >  }
> > > >
> > > >  static void rcu_preempt_depth_set(int val)
> > > >  {
> > > > -     current->rcu_read_lock_nesting = val;
> > > > +     local_set(&current->rcu_read_lock_nesting, val);
> > 
> > > I agree that this removes the data races, and that the code for x86 is
> > > quite nice, but aren't rcu_read_lock() and rcu_read_unlock() going to
> > > have heavyweight atomic operations on many CPUs?
> > >
> > > Maybe I am stuck with arch-specific code in rcu_read_lock() and
> > > rcu_preempt_read_exit().  I suppose worse things could happen.
> > 
> > Peter also mentioned to me that while local_t on x86 generates
> > reasonable code, on other architectures it's terrible. So I think
> > something else is needed, and feel free to discard the above idea.
> > With sufficient enough reasoning, how bad would a 'data_race(..)' be?
> 
> Right, so local_t it atrocious on many architectures, they fall back to
> atomic_t.
> 
> Even architectures that have optimized variants (eg. Power), they're
> quite a lot more expensive than what we actually need here.
> 
> Only architectures like x86 that have single instruction memops can
> generate anywhere near the code that we'd want here.
> 
> So the thing is, since RCU count is 0 per context (an IRQ must have an
> equal amount of rcu_read_unlock() as it has rcu_read_lock()), interrupts
> are not in fact a problem, even on load-store (RISC) architectures
> (preempt_count has the same thing).

True enough!

> So the addition/subtraction in rcu_preempt_read_{enter,exit}() doesn't
> need to be atomic vs interrupts. The only thing we really do need is
> them being single-copy-atomic.
> 
> The problem with READ/WRITE_ONCE is that if we were to use it, we'd end
> up with a load-store, even on x86, which is sub-optimal.

Agreed.

> I suppose the 'correct' code here would be something like:
> 
> 	*((volatile int *)&current->rcu_read_lock_nesting)++;
> 
> then the compiler can either do a single memop (x86 and the like) or a
> load-store that is free from tearing.

Hah!!!  That is the original ACCESS_ONCE(), isn't it?  ;-)

	ACCESS_ONCE(current->rcu_read_lock_nesting)++;

But open-coding makes sense unless a lot of other places need something
similar.  Besides, open-coding allows me to defer bikeshedding on the
name, given that there are actually two accesses.  :-/

Ah, but that gets compiler warnings:

kernel/rcu/tree_plugin.h:354:52: error: lvalue required as increment operand
  *((volatile int *)&current->rcu_read_lock_nesting)++;

Let's try the old ACCESS_ONCE().  Dialing back to v3.0:

#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

So:
	(*(volatile int *)&(current->rcu_read_lock_nesting))++;

This gets me the following for __rcu_read_lock():

00000000000000e0 <__rcu_read_lock>:
      e0:	48 8b 14 25 00 00 00 	mov    0x0,%rdx
      e7:	00 
      e8:	8b 82 e0 02 00 00    	mov    0x2e0(%rdx),%eax
      ee:	83 c0 01             	add    $0x1,%eax
      f1:	89 82 e0 02 00 00    	mov    %eax,0x2e0(%rdx)
      f7:	c3                   	retq   
      f8:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
      ff:	00 

One might hope for an inc instruction, but this isn't bad.  We do lose
a few instructions compared to the C-language case due to differences
in address calculation:

00000000000000e0 <__rcu_read_lock>:
      e0:	48 8b 04 25 00 00 00 	mov    0x0,%rax
      e7:	00 
      e8:	83 80 e0 02 00 00 01 	addl   $0x1,0x2e0(%rax)
      ef:	c3                   	retq   

For the relevant portion of __rcu_read_unlock(), this gets us
the following:

00000000000027f0 <__rcu_read_unlock>:
    27f0:	48 8b 3c 25 00 00 00 	mov    0x0,%rdi
    27f7:	00 
    27f8:	53                   	push   %rbx
    27f9:	8b 87 e0 02 00 00    	mov    0x2e0(%rdi),%eax
    27ff:	8d 50 ff             	lea    -0x1(%rax),%edx
    2802:	85 c0                	test   %eax,%eax
    2804:	89 97 e0 02 00 00    	mov    %edx,0x2e0(%rdi)
    280a:	75 0a                	jne    2816 <__rcu_read_unlock+0x26>

Here we have a load-subtract-store, but given that we need to test
the value, this seems reasonable to me.  We again lose a few instructions
compared to the C-language case, and again due to address calculation:

00000000000027e0 <__rcu_read_unlock>:
    27e0:	53                   	push   %rbx
    27e1:	48 8b 3c 25 00 00 00 	mov    0x0,%rdi
    27e8:	00 
    27e9:	83 af e0 02 00 00 01 	subl   $0x1,0x2e0(%rdi)
    27f0:	75 0a                	jne    27fc <__rcu_read_unlock+0x1c>

Thoughts?

							Thanx, Paul


* Re: [PATCH] kcsan: Add option to allow watcher interruptions
  2020-07-25 19:39           ` Paul E. McKenney
@ 2020-07-25 20:10             ` peterz
  2020-07-25 20:21               ` peterz
  0 siblings, 1 reply; 13+ messages in thread
From: peterz @ 2020-07-25 20:10 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Marco Elver, Andrey Konovalov, Alexander Potapenko,
	Dmitry Vyukov, kasan-dev, LKML

On Sat, Jul 25, 2020 at 12:39:09PM -0700, Paul E. McKenney wrote:
> On Sat, Jul 25, 2020 at 07:44:30PM +0200, Peter Zijlstra wrote:

> > So the thing is, since RCU count is 0 per context (an IRQ must have an
> > equal amount of rcu_read_unlock() as it has rcu_read_lock()), interrupts
> > are not in fact a problem, even on load-store (RISC) architectures
> > (preempt_count has the same thing).
> 
> True enough!
> 
> > So the addition/subtraction in rcu_preempt_read_{enter,exit}() doesn't
> > need to be atomic vs interrupts. The only thing we really do need is
> > them being single-copy-atomic.
> > 
> > The problem with READ/WRITE_ONCE is that if we were to use it, we'd end
> > up with a load-store, even on x86, which is sub-optimal.
> 
> Agreed.
> 
> > I suppose the 'correct' code here would be something like:
> > 
> > 	*((volatile int *)&current->rcu_read_lock_nesting)++;
> > 
> > then the compiler can either do a single memop (x86 and the like) or a
> > load-store that is free from tearing.
> 
> Hah!!!  That is the original ACCESS_ONCE(), isn't it?  ;-)
> 
> 	ACCESS_ONCE(current->rcu_read_lock_nesting)++;

Indeed :-)

> But open-coding makes sense unless a lot of other places need something
> similar.  Besides, open-coding allows me to defer bikeshedding on the
> name, given that there are actually two accesses.  :-/

Yeah, ISTR that being one of the reasons we got rid of it.

> So:
> 	(*(volatile int *)&(current->rcu_read_lock_nesting))++;

Urgh, sorry for messing that up.

> This gets me the following for __rcu_read_lock():
> 
> 00000000000000e0 <__rcu_read_lock>:
>       e0:	48 8b 14 25 00 00 00 	mov    0x0,%rdx
>       e7:	00 
>       e8:	8b 82 e0 02 00 00    	mov    0x2e0(%rdx),%eax
>       ee:	83 c0 01             	add    $0x1,%eax
>       f1:	89 82 e0 02 00 00    	mov    %eax,0x2e0(%rdx)
>       f7:	c3                   	retq   
>       f8:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
>       ff:	00 
> 
> One might hope for a dec instruction, but this isn't bad.  We do lose
> a few instructions compared to the C-language case due to differences
> in address calculation:
> 
> 00000000000000e0 <__rcu_read_lock>:
>       e0:	48 8b 04 25 00 00 00 	mov    0x0,%rax
>       e7:	00 
>       e8:	83 80 e0 02 00 00 01 	addl   $0x1,0x2e0(%rax)
>       ef:	c3                   	retq   

Sheesh, that's daft... I think this is one of the cases where GCC is
perhaps overly cautious when presented with 'volatile'.

It has a history of generating excessively crap code around volatile,
and while it has improved somewhat, this seems to show there's still
room for improvement...

I suppose this is the point where we go bug a friendly compiler person.

Alternatively we can employ data_race() and trust the compiler not to be
daft about tearing... which is what we've been relying on with this code
anyway.


* Re: [PATCH] kcsan: Add option to allow watcher interruptions
  2020-07-25 20:10             ` peterz
@ 2020-07-25 20:21               ` peterz
  2020-07-25 22:07                 ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: peterz @ 2020-07-25 20:21 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Marco Elver, Andrey Konovalov, Alexander Potapenko,
	Dmitry Vyukov, kasan-dev, LKML

On Sat, Jul 25, 2020 at 10:10:13PM +0200, peterz@infradead.org wrote:
> On Sat, Jul 25, 2020 at 12:39:09PM -0700, Paul E. McKenney wrote:

> > This gets me the following for __rcu_read_lock():
> > 
> > 00000000000000e0 <__rcu_read_lock>:
> >       e0:	48 8b 14 25 00 00 00 	mov    0x0,%rdx
> >       e7:	00 
> >       e8:	8b 82 e0 02 00 00    	mov    0x2e0(%rdx),%eax
> >       ee:	83 c0 01             	add    $0x1,%eax
> >       f1:	89 82 e0 02 00 00    	mov    %eax,0x2e0(%rdx)
> >       f7:	c3                   	retq   
> >       f8:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
> >       ff:	00 
> > 
> > One might hope for a dec instruction, but this isn't bad.  We do lose
> > a few instructions compared to the C-language case due to differences
> > in address calculation:
> > 
> > 00000000000000e0 <__rcu_read_lock>:
> >       e0:	48 8b 04 25 00 00 00 	mov    0x0,%rax
> >       e7:	00 
> >       e8:	83 80 e0 02 00 00 01 	addl   $0x1,0x2e0(%rax)
> >       ef:	c3                   	retq   
> 
> Shees, that's daft... I think this is one of the cases where GCC is
> perhaps overly cautious when presented with 'volatile'.
> 
> It has a history of generating excessively crap code around volatile,
> and while it has improved somewhat, this seems to show there's still
> room for improvement...
> 
> I suppose this is the point where we go bug a friendly compiler person.

Having had a play with godbolt.org, it seems clang isn't affected by
this particular flavour of crazy, but GCC does indeed refuse to fuse the
address calculation and the addition.
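
A stand-alone snippet along these lines is enough to see the difference
(hypothetical sketch; the struct layout and the 0x2e0 offset are only
illustrative, copied from the disassembly above):

	struct task {
		char	pad[0x2e0];
		int	rcu_read_lock_nesting;
	};

	void inc_plain(struct task *t)
	{
		t->rcu_read_lock_nesting++;	/* single addl to memory */
	}

	void inc_volatile(struct task *t)
	{
		/* GCC reportedly keeps a separate load/add/store; clang fuses it. */
		(*(volatile int *)&t->rcu_read_lock_nesting)++;
	}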


* Re: [PATCH] kcsan: Add option to allow watcher interruptions
  2020-07-25 20:21               ` peterz
@ 2020-07-25 22:07                 ` Paul E. McKenney
  2020-07-26 11:52                   ` peterz
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2020-07-25 22:07 UTC (permalink / raw)
  To: peterz
  Cc: Marco Elver, Andrey Konovalov, Alexander Potapenko,
	Dmitry Vyukov, kasan-dev, LKML

On Sat, Jul 25, 2020 at 10:21:31PM +0200, peterz@infradead.org wrote:
> On Sat, Jul 25, 2020 at 10:10:13PM +0200, peterz@infradead.org wrote:
> > On Sat, Jul 25, 2020 at 12:39:09PM -0700, Paul E. McKenney wrote:
> 
> > > This gets me the following for __rcu_read_lock():
> > > 
> > > 00000000000000e0 <__rcu_read_lock>:
> > >       e0:	48 8b 14 25 00 00 00 	mov    0x0,%rdx
> > >       e7:	00 
> > >       e8:	8b 82 e0 02 00 00    	mov    0x2e0(%rdx),%eax
> > >       ee:	83 c0 01             	add    $0x1,%eax
> > >       f1:	89 82 e0 02 00 00    	mov    %eax,0x2e0(%rdx)
> > >       f7:	c3                   	retq   
> > >       f8:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
> > >       ff:	00 
> > > 
> > > One might hope for a dec instruction, but this isn't bad.  We do lose
> > > a few instructions compared to the C-language case due to differences
> > > in address calculation:
> > > 
> > > 00000000000000e0 <__rcu_read_lock>:
> > >       e0:	48 8b 04 25 00 00 00 	mov    0x0,%rax
> > >       e7:	00 
> > >       e8:	83 80 e0 02 00 00 01 	addl   $0x1,0x2e0(%rax)
> > >       ef:	c3                   	retq   
> > 
> > Shees, that's daft... I think this is one of the cases where GCC is
> > perhaps overly cautious when presented with 'volatile'.
> > 
> > It has a history of generating excessively crap code around volatile,
> > and while it has improved somewhat, this seems to show there's still
> > room for improvement...
> > 
> > I suppose this is the point where we go bug a friendly compiler person.

Sounds very good!  Do you have someone specific in mind?

> Having had a play with godbolt.org, it seems clang isn't affected by
> this particular flavour of crazy, but GCC does indeed refuse to fuse the
> address calculation and the addition.

So there is hope, then!

And even GCC's current state is an improvement.  Last I messed with this,
the ACCESS_ONCE()++ approach generated a load, a register increment,
and a store.
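
For reference, the old macro was essentially just a volatile cast, so the
increment could not be fused either (definition reconstructed from memory):

	#define ACCESS_ONCE(x)	(*(volatile typeof(x) *)&(x))

	ACCESS_ONCE(current->rcu_read_lock_nesting)++;
	/* volatile load, register increment, volatile store */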

Do you still have the godbolt.org URLs?  I would be happy to file
a bugzilla.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kcsan: Add option to allow watcher interruptions
  2020-07-25 22:07                 ` Paul E. McKenney
@ 2020-07-26 11:52                   ` peterz
  2020-07-26 18:09                     ` Paul E. McKenney
  2020-07-27  1:48                     ` Paul E. McKenney
  0 siblings, 2 replies; 13+ messages in thread
From: peterz @ 2020-07-26 11:52 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Marco Elver, Andrey Konovalov, Alexander Potapenko,
	Dmitry Vyukov, kasan-dev, LKML, jakub, hjl.tools

On Sat, Jul 25, 2020 at 03:07:50PM -0700, Paul E. McKenney wrote:
> On Sat, Jul 25, 2020 at 10:21:31PM +0200, peterz@infradead.org wrote:
> > On Sat, Jul 25, 2020 at 10:10:13PM +0200, peterz@infradead.org wrote:
> > > On Sat, Jul 25, 2020 at 12:39:09PM -0700, Paul E. McKenney wrote:
> > 
> > > > This gets me the following for __rcu_read_lock():
> > > > 
> > > > 00000000000000e0 <__rcu_read_lock>:
> > > >       e0:	48 8b 14 25 00 00 00 	mov    0x0,%rdx
> > > >       e7:	00 
> > > >       e8:	8b 82 e0 02 00 00    	mov    0x2e0(%rdx),%eax
> > > >       ee:	83 c0 01             	add    $0x1,%eax
> > > >       f1:	89 82 e0 02 00 00    	mov    %eax,0x2e0(%rdx)
> > > >       f7:	c3                   	retq   
> > > >       f8:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
> > > >       ff:	00 
> > > > 
> > > > One might hope for a dec instruction, but this isn't bad.  We do lose
> > > > a few instructions compared to the C-language case due to differences
> > > > in address calculation:
> > > > 
> > > > 00000000000000e0 <__rcu_read_lock>:
> > > >       e0:	48 8b 04 25 00 00 00 	mov    0x0,%rax
> > > >       e7:	00 
> > > >       e8:	83 80 e0 02 00 00 01 	addl   $0x1,0x2e0(%rax)
> > > >       ef:	c3                   	retq   
> > > 
> > > Shees, that's daft... I think this is one of the cases where GCC is
> > > perhaps overly cautious when presented with 'volatile'.
> > > 
> > > It has a history of generating excessively crap code around volatile,
> > > and while it has improved somewhat, this seems to show there's still
> > > room for improvement...
> > > 
> > > I suppose this is the point where we go bug a friendly compiler person.
> 
> Sounds very good!  Do you have someone specific in mind?

Jakub, perhaps? Cc'ed.

> > Having had a play with godbolt.org, it seems clang isn't affected by
> > this particular flavour of crazy, but GCC does indeed refuse to fuse the
> > address calculation and the addition.
> 
> So there is hope, then!
> 
> And even GCC's current state is an improvement.  Last I messed with this,
> the ACCESS_ONCE()++ approach generated a load, a register increment,
> and a store.
> 
> Do you still have the godbolt.org URLs?  I would be happy to file
> a bugzilla.

https://godbolt.org/z/rP8rYM

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kcsan: Add option to allow watcher interruptions
  2020-07-26 11:52                   ` peterz
@ 2020-07-26 18:09                     ` Paul E. McKenney
  2020-07-27  1:48                     ` Paul E. McKenney
  1 sibling, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2020-07-26 18:09 UTC (permalink / raw)
  To: peterz
  Cc: Marco Elver, Andrey Konovalov, Alexander Potapenko,
	Dmitry Vyukov, kasan-dev, LKML, jakub, hjl.tools

On Sun, Jul 26, 2020 at 01:52:42PM +0200, peterz@infradead.org wrote:
> On Sat, Jul 25, 2020 at 03:07:50PM -0700, Paul E. McKenney wrote:
> > On Sat, Jul 25, 2020 at 10:21:31PM +0200, peterz@infradead.org wrote:
> > > On Sat, Jul 25, 2020 at 10:10:13PM +0200, peterz@infradead.org wrote:
> > > > On Sat, Jul 25, 2020 at 12:39:09PM -0700, Paul E. McKenney wrote:
> > > 
> > > > > This gets me the following for __rcu_read_lock():
> > > > > 
> > > > > 00000000000000e0 <__rcu_read_lock>:
> > > > >       e0:	48 8b 14 25 00 00 00 	mov    0x0,%rdx
> > > > >       e7:	00 
> > > > >       e8:	8b 82 e0 02 00 00    	mov    0x2e0(%rdx),%eax
> > > > >       ee:	83 c0 01             	add    $0x1,%eax
> > > > >       f1:	89 82 e0 02 00 00    	mov    %eax,0x2e0(%rdx)
> > > > >       f7:	c3                   	retq   
> > > > >       f8:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
> > > > >       ff:	00 
> > > > > 
> > > > > One might hope for a dec instruction, but this isn't bad.  We do lose
> > > > > a few instructions compared to the C-language case due to differences
> > > > > in address calculation:
> > > > > 
> > > > > 00000000000000e0 <__rcu_read_lock>:
> > > > >       e0:	48 8b 04 25 00 00 00 	mov    0x0,%rax
> > > > >       e7:	00 
> > > > >       e8:	83 80 e0 02 00 00 01 	addl   $0x1,0x2e0(%rax)
> > > > >       ef:	c3                   	retq   
> > > > 
> > > > Shees, that's daft... I think this is one of the cases where GCC is
> > > > perhaps overly cautious when presented with 'volatile'.
> > > > 
> > > > It has a history of generating excessively crap code around volatile,
> > > > and while it has improved somewhat, this seems to show there's still
> > > > room for improvement...
> > > > 
> > > > I suppose this is the point where we go bug a friendly compiler person.
> > 
> > Sounds very good!  Do you have someone specific in mind?
> 
> Jakub, perhaps? Cc'ed.
> 
> > > Having had a play with godbolt.org, it seems clang isn't affected by
> > > this particular flavour of crazy, but GCC does indeed refuse to fuse the
> > > address calculation and the addition.
> > 
> > So there is hope, then!
> > 
> > And even GCC's current state is an improvement.  Last I messed with this,
> > the ACCESS_ONCE()++ approach generated a load, a register increment,
> > and a store.
> > 
> > Do you still have the godbolt.org URLs?  I would be happy to file
> > a bugzilla.
> 
> https://godbolt.org/z/rP8rYM

Thank you!

Now creating a GCC bugzilla account.  For some strange reason, my old
ibm.com account no longer functions.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] kcsan: Add option to allow watcher interruptions
  2020-07-26 11:52                   ` peterz
  2020-07-26 18:09                     ` Paul E. McKenney
@ 2020-07-27  1:48                     ` Paul E. McKenney
  1 sibling, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2020-07-27  1:48 UTC (permalink / raw)
  To: peterz
  Cc: Marco Elver, Andrey Konovalov, Alexander Potapenko,
	Dmitry Vyukov, kasan-dev, LKML, jakub, hjl.tools

On Sun, Jul 26, 2020 at 01:52:42PM +0200, peterz@infradead.org wrote:
> On Sat, Jul 25, 2020 at 03:07:50PM -0700, Paul E. McKenney wrote:
> > On Sat, Jul 25, 2020 at 10:21:31PM +0200, peterz@infradead.org wrote:
> > > On Sat, Jul 25, 2020 at 10:10:13PM +0200, peterz@infradead.org wrote:
> > > > On Sat, Jul 25, 2020 at 12:39:09PM -0700, Paul E. McKenney wrote:
> > > 
> > > > > This gets me the following for __rcu_read_lock():
> > > > > 
> > > > > 00000000000000e0 <__rcu_read_lock>:
> > > > >       e0:	48 8b 14 25 00 00 00 	mov    0x0,%rdx
> > > > >       e7:	00 
> > > > >       e8:	8b 82 e0 02 00 00    	mov    0x2e0(%rdx),%eax
> > > > >       ee:	83 c0 01             	add    $0x1,%eax
> > > > >       f1:	89 82 e0 02 00 00    	mov    %eax,0x2e0(%rdx)
> > > > >       f7:	c3                   	retq   
> > > > >       f8:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
> > > > >       ff:	00 
> > > > > 
> > > > > One might hope for a dec instruction, but this isn't bad.  We do lose
> > > > > a few instructions compared to the C-language case due to differences
> > > > > in address calculation:
> > > > > 
> > > > > 00000000000000e0 <__rcu_read_lock>:
> > > > >       e0:	48 8b 04 25 00 00 00 	mov    0x0,%rax
> > > > >       e7:	00 
> > > > >       e8:	83 80 e0 02 00 00 01 	addl   $0x1,0x2e0(%rax)
> > > > >       ef:	c3                   	retq   
> > > > 
> > > > Shees, that's daft... I think this is one of the cases where GCC is
> > > > perhaps overly cautious when presented with 'volatile'.
> > > > 
> > > > It has a history of generating excessively crap code around volatile,
> > > > and while it has improved somewhat, this seems to show there's still
> > > > room for improvement...
> > > > 
> > > > I suppose this is the point where we go bug a friendly compiler person.
> > 
> > Sounds very good!  Do you have someone specific in mind?
> 
> Jakub, perhaps? Cc'ed.
> 
> > > Having had a play with godbolt.org, it seems clang isn't affected by
> > > this particular flavour of crazy, but GCC does indeed refuse to fuse the
> > > address calculation and the addition.
> > 
> > So there is hope, then!
> > 
> > And even GCC's current state is an improvement.  Last I messed with this,
> > the ACCESS_ONCE()++ approach generated a load, a register increment,
> > and a store.
> > 
> > Do you still have the godbolt.org URLs?  I would be happy to file
> > a bugzilla.
> 
> https://godbolt.org/z/rP8rYM

Here you go!  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96327

							Thanx, Paul

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-07-27  1:48 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-20 14:15 [PATCH] kcsan: Add option to allow watcher interruptions Marco Elver
2020-02-20 18:58 ` Paul E. McKenney
2020-02-20 21:33   ` Marco Elver
2020-07-25 14:56     ` Paul E. McKenney
2020-07-25 15:17       ` Marco Elver
2020-07-25 17:44         ` Peter Zijlstra
2020-07-25 19:39           ` Paul E. McKenney
2020-07-25 20:10             ` peterz
2020-07-25 20:21               ` peterz
2020-07-25 22:07                 ` Paul E. McKenney
2020-07-26 11:52                   ` peterz
2020-07-26 18:09                     ` Paul E. McKenney
2020-07-27  1:48                     ` Paul E. McKenney
