linux-kernel.vger.kernel.org archive mirror
* [PATCH] sched: Fix 32bit race in sched_clock_remote()
From: Peter Zijlstra @ 2013-04-05 16:36 UTC
  To: tglx, Steven Rostedt, mingo; +Cc: LKML

Thomas spotted a nasty 32bit race in sched_clock_remote() after way too
many hours of debugging weirdness.

What happens is that sched_clock_remote() does regular machine word
reads of sched_clock_data::clock; this appears safe since we use
cmpxchg64() to update the variable and any half-read value would
trigger a retry.

Except we don't validate the new value 'val' in the same way! Thus we
can propagate non-atomic read errors into the clock value.

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Debugged-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/sched/clock.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
index c685e31..7042ef7 100644
--- a/kernel/sched/clock.c
+++ b/kernel/sched/clock.c
@@ -170,6 +170,21 @@ static u64 sched_clock_local(struct sched_clock_data *scd)
 	return clock;
 }
 
+#ifndef CONFIG_64BIT
+/*
+ * 32bit machines can't atomically read a u64 except using cmpxchg64()
+ */
+static inline u64 scd_read_clock(struct sched_clock_data *scd)
+{
+	return cmpxchg64(&scd->clock, 0, 0);
+}
+#else
+static inline u64 scd_read_clock(struct sched_clock_data *scd)
+{
+	return scd->clock;
+}
+#endif
+
 static u64 sched_clock_remote(struct sched_clock_data *scd)
 {
 	struct sched_clock_data *my_scd = this_scd();
@@ -178,8 +193,8 @@ static u64 sched_clock_remote(struct sched_clock_data *scd)
 
 	sched_clock_local(my_scd);
 again:
-	this_clock = my_scd->clock;
-	remote_clock = scd->clock;
+	this_clock = scd_clock_read(my_scd);
+	remote_clock = scd_clock_read(scd);
 
 	/*
 	 * Use the opportunity that we have both locks




* Re: [PATCH] sched: Fix 32bit race in sched_clock_remote()
From: Steven Rostedt @ 2013-04-08 21:16 UTC
  To: Peter Zijlstra; +Cc: tglx, mingo, LKML

On Fri, 2013-04-05 at 18:36 +0200, Peter Zijlstra wrote:
> Thomas spotted a nasty 32bit race in sched_clock_remote() after way too
> many hours of debugging weirdness.
> 
> What happens is that sched_clock_remote() does regular machine word
> reads of sched_clock_data::clock; this appears safe since we use
> cmpxchg64() to update the variable and any half-read value would
> trigger a retry.
> 
> Except we don't validate the new value 'val' in the same way! Thus we
> can propagate non-atomic read errors into the clock value.
> 
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Debugged-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  kernel/sched/clock.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> index c685e31..7042ef7 100644
> --- a/kernel/sched/clock.c
> +++ b/kernel/sched/clock.c
> @@ -170,6 +170,21 @@ static u64 sched_clock_local(struct sched_clock_data *scd)
>  	return clock;
>  }
>  
> +#ifndef CONFIG_64BIT

We need to add a Kconfig:

config 32BIT
	depends on BROKEN

Acked-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

> +/*
> + * 32bit machines can't atomically read a u64 except using cmpxchg64()
> + */
> +static inline u64 scd_read_clock(struct sched_clock_data *scd)
> +{
> +	return cmpxchg64(&scd->clock, 0, 0);
> +}
> +#else
> +static inline u64 scd_read_clock(struct sched_clock_data *scd)
> +{
> +	return scd->clock;
> +}
> +#endif
> +
>  static u64 sched_clock_remote(struct sched_clock_data *scd)
>  {
>  	struct sched_clock_data *my_scd = this_scd();
> @@ -178,8 +193,8 @@ static u64 sched_clock_remote(struct sched_clock_data *scd)
>  
>  	sched_clock_local(my_scd);
>  again:
> -	this_clock = my_scd->clock;
> -	remote_clock = scd->clock;
> +	this_clock = scd_clock_read(my_scd);
> +	remote_clock = scd_clock_read(scd);
>  
>  	/*
>  	 * Use the opportunity that we have both locks
> 




* Re: [PATCH] sched: Fix 32bit race in sched_clock_remote()
From: Yong Zhang @ 2013-04-09 14:55 UTC
  To: Peter Zijlstra; +Cc: tglx, Steven Rostedt, mingo, LKML

On Fri, Apr 05, 2013 at 06:36:40PM +0200, Peter Zijlstra wrote:
> Thomas spotted a nasty 32bit race in sched_clock_remote() after way too
> many hours of debugging weirdness.
> 
> What happens is that sched_clock_remote() does regular machine word
> reads of sched_clock_data::clock; this appears safe since we use
> cmpxchg64() to update the variable and any half-read value would
> trigger a retry.
> 
> Except we don't validate the new value 'val' in the same way! Thus we
> can propagate non-atomic read errors into the clock value.
> 
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Debugged-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  kernel/sched/clock.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> index c685e31..7042ef7 100644
> --- a/kernel/sched/clock.c
> +++ b/kernel/sched/clock.c
> @@ -170,6 +170,21 @@ static u64 sched_clock_local(struct sched_clock_data *scd)
>  	return clock;
>  }
>  
> +#ifndef CONFIG_64BIT
> +/*
> + * 32bit machines can't atomically read a u64 except using cmpxchg64()
> + */
> +static inline u64 scd_read_clock(struct sched_clock_data *scd)
> +{
> +	return cmpxchg64(&scd->clock, 0, 0);
> +}
> +#else
> +static inline u64 scd_read_clock(struct sched_clock_data *scd)
> +{
> +	return scd->clock;
> +}
> +#endif
> +
>  static u64 sched_clock_remote(struct sched_clock_data *scd)
>  {
>  	struct sched_clock_data *my_scd = this_scd();
> @@ -178,8 +193,8 @@ static u64 sched_clock_remote(struct sched_clock_data *scd)
>  
>  	sched_clock_local(my_scd);
>  again:
> -	this_clock = my_scd->clock;
> -	remote_clock = scd->clock;
> +	this_clock = scd_clock_read(my_scd);
> +	remote_clock = scd_clock_read(scd);
		       ^^^^^^^^^^^^^^
		       it doesn't match the declaration: scd_read_clock().

Thanks,
Yong

>  
>  	/*
>  	 * Use the opportunity that we have both locks
> 


* Re: [PATCH] sched: Fix 32bit race in sched_clock_remote()
From: Peter Zijlstra @ 2013-04-10  7:14 UTC
  To: Yong Zhang; +Cc: tglx, Steven Rostedt, mingo, LKML

On Tue, 2013-04-09 at 22:55 +0800, Yong Zhang wrote:
> > +     this_clock = scd_clock_read(my_scd);
> > +     remote_clock = scd_clock_read(scd);
>                        ^^^^^^^^^^^^^^
>                        it doesn't match the declaration: scd_read_clock().

Yeah, I'm a moron and forgot to compile test or somesuch :-)

Anyway, Thomas wrote a much better patch which made it in; see
a1cbcaa9ea87b87a96b9fc465951dcf36e459ca2.


