linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] stop_machine: avoid potential race behaviour
@ 2019-10-07 10:45 Mark Rutland
  2019-10-07 11:01 ` Marco Elver
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Mark Rutland @ 2019-10-07 10:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: Mark Rutland, Marco Elver, Thomas Gleixner, Peter Zijlstra

Both multi_cpu_stop() and set_state() access multi_stop_data::state
racily using plain accesses. These are subject to compiler
transformations which could break the intended behaviour of the code,
and this situation is detected by KCSAN on both arm64 and x86 (splats
below).

Let's improve matters by using READ_ONCE() and WRITE_ONCE() to ensure
that the compiler cannot elide, replay, or tear loads and stores. In
multi_cpu_stop() we expect the two loads of multi_stop_data::state to be
a consistent value, so we snapshot the value into a temporary variable
to ensure this.

The state transitions are serialized by atomic manipulation of
multi_stop_data::num_threads, and other fields in multi_stop_data are
not modified while subject to concurrent reads.

KCSAN splat on arm64:

| BUG: KCSAN: data-race in multi_cpu_stop+0xa8/0x198 and set_state+0x80/0xb0
|
| write to 0xffff00001003bd00 of 4 bytes by task 24 on cpu 3:
|  set_state+0x80/0xb0
|  multi_cpu_stop+0x16c/0x198
|  cpu_stopper_thread+0x170/0x298
|  smpboot_thread_fn+0x40c/0x560
|  kthread+0x1a8/0x1b0
|  ret_from_fork+0x10/0x18
|
| read to 0xffff00001003bd00 of 4 bytes by task 14 on cpu 1:
|  multi_cpu_stop+0xa8/0x198
|  cpu_stopper_thread+0x170/0x298
|  smpboot_thread_fn+0x40c/0x560
|  kthread+0x1a8/0x1b0
|  ret_from_fork+0x10/0x18
|
| Reported by Kernel Concurrency Sanitizer on:
| CPU: 1 PID: 14 Comm: migration/1 Not tainted 5.3.0-00007-g67ab35a199f4-dirty #3
| Hardware name: linux,dummy-virt (DT)

KCSAN splat on x86:

| write to 0xffffb0bac0013e18 of 4 bytes by task 19 on cpu 2:
|  set_state kernel/stop_machine.c:170 [inline]
|  ack_state kernel/stop_machine.c:177 [inline]
|  multi_cpu_stop+0x1a4/0x220 kernel/stop_machine.c:227
|  cpu_stopper_thread+0x19e/0x280 kernel/stop_machine.c:516
|  smpboot_thread_fn+0x1a8/0x300 kernel/smpboot.c:165
|  kthread+0x1b5/0x200 kernel/kthread.c:255
|  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352
|
| read to 0xffffb0bac0013e18 of 4 bytes by task 44 on cpu 7:
|  multi_cpu_stop+0xb4/0x220 kernel/stop_machine.c:213
|  cpu_stopper_thread+0x19e/0x280 kernel/stop_machine.c:516
|  smpboot_thread_fn+0x1a8/0x300 kernel/smpboot.c:165
|  kthread+0x1b5/0x200 kernel/kthread.c:255
|  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352
|
| Reported by Kernel Concurrency Sanitizer on:
| CPU: 7 PID: 44 Comm: migration/7 Not tainted 5.3.0+ #1
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 kernel/stop_machine.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index c7031a22aa7b..998d50ee2d9b 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -7,6 +7,7 @@
  * Copyright (C) 2010		SUSE Linux Products GmbH
  * Copyright (C) 2010		Tejun Heo <tj@kernel.org>
  */
+#include <linux/compiler.h>
 #include <linux/completion.h>
 #include <linux/cpu.h>
 #include <linux/init.h>
@@ -167,7 +168,7 @@ static void set_state(struct multi_stop_data *msdata,
 	/* Reset ack counter. */
 	atomic_set(&msdata->thread_ack, msdata->num_threads);
 	smp_wmb();
-	msdata->state = newstate;
+	WRITE_ONCE(msdata->state, newstate);
 }
 
 /* Last one to ack a state moves to the next state. */
@@ -186,7 +187,7 @@ void __weak stop_machine_yield(const struct cpumask *cpumask)
 static int multi_cpu_stop(void *data)
 {
 	struct multi_stop_data *msdata = data;
-	enum multi_stop_state curstate = MULTI_STOP_NONE;
+	enum multi_stop_state newstate, curstate = MULTI_STOP_NONE;
 	int cpu = smp_processor_id(), err = 0;
 	const struct cpumask *cpumask;
 	unsigned long flags;
@@ -210,8 +211,9 @@ static int multi_cpu_stop(void *data)
 	do {
 		/* Chill out and ensure we re-read multi_stop_state. */
 		stop_machine_yield(cpumask);
-		if (msdata->state != curstate) {
-			curstate = msdata->state;
+		newstate = READ_ONCE(msdata->state);
+		if (newstate != curstate) {
+			curstate = newstate;
 			switch (curstate) {
 			case MULTI_STOP_DISABLE_IRQ:
 				local_irq_disable();
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] stop_machine: avoid potential race behaviour
  2019-10-07 10:45 [PATCH] stop_machine: avoid potential race behaviour Mark Rutland
@ 2019-10-07 11:01 ` Marco Elver
  2019-10-08  8:36 ` Peter Zijlstra
  2019-10-17 10:49 ` [tip: core/urgent] stop_machine: Avoid " tip-bot2 for Mark Rutland
  2 siblings, 0 replies; 5+ messages in thread
From: Marco Elver @ 2019-10-07 11:01 UTC (permalink / raw)
  To: Mark Rutland; +Cc: LKML, Thomas Gleixner, Peter Zijlstra

On Mon, 7 Oct 2019 at 12:45, Mark Rutland <mark.rutland@arm.com> wrote:
>
> Both multi_cpu_stop() and set_state() access multi_stop_data::state
> racily using plain accesses. These are subject to compiler
> transformations which could break the intended behaviour of the code,
> and this situation is detected by KCSAN on both arm64 and x86 (splats
> below).
>
> Let's improve matters by using READ_ONCE() and WRITE_ONCE() to ensure
> that the compiler cannot elide, replay, or tear loads and stores. In
> multi_cpu_stop() we expect the two loads of multi_stop_data::state to be
> a consistent value, so we snapshot the value into a temporary variable
> to ensure this.
>
> The state transitions are serialized by atomic manipulation of
> multi_stop_data::num_threads, and other fields in multi_stop_data are
> not modified while subject to concurrent reads.
>
> KCSAN splat on arm64:
>
> | BUG: KCSAN: data-race in multi_cpu_stop+0xa8/0x198 and set_state+0x80/0xb0
> |
> | write to 0xffff00001003bd00 of 4 bytes by task 24 on cpu 3:
> |  set_state+0x80/0xb0
> |  multi_cpu_stop+0x16c/0x198
> |  cpu_stopper_thread+0x170/0x298
> |  smpboot_thread_fn+0x40c/0x560
> |  kthread+0x1a8/0x1b0
> |  ret_from_fork+0x10/0x18
> |
> | read to 0xffff00001003bd00 of 4 bytes by task 14 on cpu 1:
> |  multi_cpu_stop+0xa8/0x198
> |  cpu_stopper_thread+0x170/0x298
> |  smpboot_thread_fn+0x40c/0x560
> |  kthread+0x1a8/0x1b0
> |  ret_from_fork+0x10/0x18
> |
> | Reported by Kernel Concurrency Sanitizer on:
> | CPU: 1 PID: 14 Comm: migration/1 Not tainted 5.3.0-00007-g67ab35a199f4-dirty #3
> | Hardware name: linux,dummy-virt (DT)
>
> KCSAN splat on x86:
>
> | write to 0xffffb0bac0013e18 of 4 bytes by task 19 on cpu 2:
> |  set_state kernel/stop_machine.c:170 [inline]
> |  ack_state kernel/stop_machine.c:177 [inline]
> |  multi_cpu_stop+0x1a4/0x220 kernel/stop_machine.c:227
> |  cpu_stopper_thread+0x19e/0x280 kernel/stop_machine.c:516
> |  smpboot_thread_fn+0x1a8/0x300 kernel/smpboot.c:165
> |  kthread+0x1b5/0x200 kernel/kthread.c:255
> |  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352
> |
> | read to 0xffffb0bac0013e18 of 4 bytes by task 44 on cpu 7:
> |  multi_cpu_stop+0xb4/0x220 kernel/stop_machine.c:213
> |  cpu_stopper_thread+0x19e/0x280 kernel/stop_machine.c:516
> |  smpboot_thread_fn+0x1a8/0x300 kernel/smpboot.c:165
> |  kthread+0x1b5/0x200 kernel/kthread.c:255
> |  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352
> |
> | Reported by Kernel Concurrency Sanitizer on:
> | CPU: 7 PID: 44 Comm: migration/7 Not tainted 5.3.0+ #1
> | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Marco Elver <elver@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>

Thanks for fixing this!

Acked-by: Marco Elver <elver@google.com>

> ---
>  kernel/stop_machine.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index c7031a22aa7b..998d50ee2d9b 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -7,6 +7,7 @@
>   * Copyright (C) 2010          SUSE Linux Products GmbH
>   * Copyright (C) 2010          Tejun Heo <tj@kernel.org>
>   */
> +#include <linux/compiler.h>
>  #include <linux/completion.h>
>  #include <linux/cpu.h>
>  #include <linux/init.h>
> @@ -167,7 +168,7 @@ static void set_state(struct multi_stop_data *msdata,
>         /* Reset ack counter. */
>         atomic_set(&msdata->thread_ack, msdata->num_threads);
>         smp_wmb();
> -       msdata->state = newstate;
> +       WRITE_ONCE(msdata->state, newstate);
>  }
>
>  /* Last one to ack a state moves to the next state. */
> @@ -186,7 +187,7 @@ void __weak stop_machine_yield(const struct cpumask *cpumask)
>  static int multi_cpu_stop(void *data)
>  {
>         struct multi_stop_data *msdata = data;
> -       enum multi_stop_state curstate = MULTI_STOP_NONE;
> +       enum multi_stop_state newstate, curstate = MULTI_STOP_NONE;
>         int cpu = smp_processor_id(), err = 0;
>         const struct cpumask *cpumask;
>         unsigned long flags;
> @@ -210,8 +211,9 @@ static int multi_cpu_stop(void *data)
>         do {
>                 /* Chill out and ensure we re-read multi_stop_state. */
>                 stop_machine_yield(cpumask);
> -               if (msdata->state != curstate) {
> -                       curstate = msdata->state;
> +               newstate = READ_ONCE(msdata->state);
> +               if (newstate != curstate) {
> +                       curstate = newstate;
>                         switch (curstate) {
>                         case MULTI_STOP_DISABLE_IRQ:
>                                 local_irq_disable();
> --
> 2.11.0
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] stop_machine: avoid potential race behaviour
  2019-10-07 10:45 [PATCH] stop_machine: avoid potential race behaviour Mark Rutland
  2019-10-07 11:01 ` Marco Elver
@ 2019-10-08  8:36 ` Peter Zijlstra
  2019-10-14 10:09   ` Mark Rutland
  2019-10-17 10:49 ` [tip: core/urgent] stop_machine: Avoid " tip-bot2 for Mark Rutland
  2 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2019-10-08  8:36 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-kernel, Marco Elver, Thomas Gleixner

On Mon, Oct 07, 2019 at 11:45:36AM +0100, Mark Rutland wrote:
> Both multi_cpu_stop() and set_state() access multi_stop_data::state
> racily using plain accesses. These are subject to compiler
> transformations which could break the intended behaviour of the code,
> and this situation is detected by KCSAN on both arm64 and x86 (splats
> below).

I really don't think there is anything the compiler can do wrong here.

That is, I'm thinking I'd like to get this called out as a false-positive.

That said, the patch looks obviously fine and will help with the
validation effort so no real objection there.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] stop_machine: avoid potential race behaviour
  2019-10-08  8:36 ` Peter Zijlstra
@ 2019-10-14 10:09   ` Mark Rutland
  0 siblings, 0 replies; 5+ messages in thread
From: Mark Rutland @ 2019-10-14 10:09 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, Marco Elver, Thomas Gleixner

On Tue, Oct 08, 2019 at 10:36:37AM +0200, Peter Zijlstra wrote:
> On Mon, Oct 07, 2019 at 11:45:36AM +0100, Mark Rutland wrote:
> > Both multi_cpu_stop() and set_state() access multi_stop_data::state
> > racily using plain accesses. These are subject to compiler
> > transformations which could break the intended behaviour of the code,
> > and this situation is detected by KCSAN on both arm64 and x86 (splats
> > below).
> 
> I really don't think there is anything the compiler can do wrong here.
> 
> That is, I'm thinking I'd like to get this called out as a false-positive.

I agree that in practice, it's very unlikely this would go wrong.

There are some things the compiler could do, e.g. with re-ordering of
volatile and plain reads of the same variable:

  https://lore.kernel.org/lkml/20191003161233.GB38140@lakrids.cambridge.arm.com/

... and while I agree that's vanishingly unlikely to happen here, I
couldn't say how to rule that out without ruling out cases that would
actually blow up in practice.

> That said, the patch looks obviously fine and will help with the
> validation effort so no real objection there.

Great! Can I take that as an Acked-by?

I assume this should go via the tip tree.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [tip: core/urgent] stop_machine: Avoid potential race behaviour
  2019-10-07 10:45 [PATCH] stop_machine: avoid potential race behaviour Mark Rutland
  2019-10-07 11:01 ` Marco Elver
  2019-10-08  8:36 ` Peter Zijlstra
@ 2019-10-17 10:49 ` tip-bot2 for Mark Rutland
  2 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Mark Rutland @ 2019-10-17 10:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Mark Rutland, Thomas Gleixner, Marco Elver, Ingo Molnar,
	Borislav Petkov, linux-kernel

The following commit has been merged into the core/urgent branch of tip:

Commit-ID:     b1fc5833357524d5d342737913dbe32ff3557bc5
Gitweb:        https://git.kernel.org/tip/b1fc5833357524d5d342737913dbe32ff3557bc5
Author:        Mark Rutland <mark.rutland@arm.com>
AuthorDate:    Mon, 07 Oct 2019 11:45:36 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 17 Oct 2019 12:47:12 +02:00

stop_machine: Avoid potential race behaviour

Both multi_cpu_stop() and set_state() access multi_stop_data::state
racily using plain accesses. These are subject to compiler
transformations which could break the intended behaviour of the code,
and this situation is detected by KCSAN on both arm64 and x86 (splats
below).

Improve matters by using READ_ONCE() and WRITE_ONCE() to ensure that the
compiler cannot elide, replay, or tear loads and stores.

In multi_cpu_stop() the two loads of multi_stop_data::state are expected to
be a consistent value, so snapshot the value into a temporary variable to
ensure this.

The state transitions are serialized by atomic manipulation of
multi_stop_data::num_threads, and other fields in multi_stop_data are not
modified while subject to concurrent reads.

KCSAN splat on arm64:

| BUG: KCSAN: data-race in multi_cpu_stop+0xa8/0x198 and set_state+0x80/0xb0
|
| write to 0xffff00001003bd00 of 4 bytes by task 24 on cpu 3:
|  set_state+0x80/0xb0
|  multi_cpu_stop+0x16c/0x198
|  cpu_stopper_thread+0x170/0x298
|  smpboot_thread_fn+0x40c/0x560
|  kthread+0x1a8/0x1b0
|  ret_from_fork+0x10/0x18
|
| read to 0xffff00001003bd00 of 4 bytes by task 14 on cpu 1:
|  multi_cpu_stop+0xa8/0x198
|  cpu_stopper_thread+0x170/0x298
|  smpboot_thread_fn+0x40c/0x560
|  kthread+0x1a8/0x1b0
|  ret_from_fork+0x10/0x18
|
| Reported by Kernel Concurrency Sanitizer on:
| CPU: 1 PID: 14 Comm: migration/1 Not tainted 5.3.0-00007-g67ab35a199f4-dirty #3
| Hardware name: linux,dummy-virt (DT)

KCSAN splat on x86:

| write to 0xffffb0bac0013e18 of 4 bytes by task 19 on cpu 2:
|  set_state kernel/stop_machine.c:170 [inline]
|  ack_state kernel/stop_machine.c:177 [inline]
|  multi_cpu_stop+0x1a4/0x220 kernel/stop_machine.c:227
|  cpu_stopper_thread+0x19e/0x280 kernel/stop_machine.c:516
|  smpboot_thread_fn+0x1a8/0x300 kernel/smpboot.c:165
|  kthread+0x1b5/0x200 kernel/kthread.c:255
|  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352
|
| read to 0xffffb0bac0013e18 of 4 bytes by task 44 on cpu 7:
|  multi_cpu_stop+0xb4/0x220 kernel/stop_machine.c:213
|  cpu_stopper_thread+0x19e/0x280 kernel/stop_machine.c:516
|  smpboot_thread_fn+0x1a8/0x300 kernel/smpboot.c:165
|  kthread+0x1b5/0x200 kernel/kthread.c:255
|  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:352
|
| Reported by Kernel Concurrency Sanitizer on:
| CPU: 7 PID: 44 Comm: migration/7 Not tainted 5.3.0+ #1
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20191007104536.27276-1-mark.rutland@arm.com

---
 kernel/stop_machine.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index c7031a2..998d50e 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -7,6 +7,7 @@
  * Copyright (C) 2010		SUSE Linux Products GmbH
  * Copyright (C) 2010		Tejun Heo <tj@kernel.org>
  */
+#include <linux/compiler.h>
 #include <linux/completion.h>
 #include <linux/cpu.h>
 #include <linux/init.h>
@@ -167,7 +168,7 @@ static void set_state(struct multi_stop_data *msdata,
 	/* Reset ack counter. */
 	atomic_set(&msdata->thread_ack, msdata->num_threads);
 	smp_wmb();
-	msdata->state = newstate;
+	WRITE_ONCE(msdata->state, newstate);
 }
 
 /* Last one to ack a state moves to the next state. */
@@ -186,7 +187,7 @@ void __weak stop_machine_yield(const struct cpumask *cpumask)
 static int multi_cpu_stop(void *data)
 {
 	struct multi_stop_data *msdata = data;
-	enum multi_stop_state curstate = MULTI_STOP_NONE;
+	enum multi_stop_state newstate, curstate = MULTI_STOP_NONE;
 	int cpu = smp_processor_id(), err = 0;
 	const struct cpumask *cpumask;
 	unsigned long flags;
@@ -210,8 +211,9 @@ static int multi_cpu_stop(void *data)
 	do {
 		/* Chill out and ensure we re-read multi_stop_state. */
 		stop_machine_yield(cpumask);
-		if (msdata->state != curstate) {
-			curstate = msdata->state;
+		newstate = READ_ONCE(msdata->state);
+		if (newstate != curstate) {
+			curstate = newstate;
 			switch (curstate) {
 			case MULTI_STOP_DISABLE_IRQ:
 				local_irq_disable();

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-10-17 10:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-07 10:45 [PATCH] stop_machine: avoid potential race behaviour Mark Rutland
2019-10-07 11:01 ` Marco Elver
2019-10-08  8:36 ` Peter Zijlstra
2019-10-14 10:09   ` Mark Rutland
2019-10-17 10:49 ` [tip: core/urgent] stop_machine: Avoid " tip-bot2 for Mark Rutland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).