* Additional compiler barrier required in sched_preempt_enable_no_resched?
From: Vikram Mulukutla @ 2016-05-13  6:39 UTC
  To: peterz, tglx; +Cc: linux-kernel

Hi,

I came across a piece of engineering code that looked like:

preempt_disable();
/* --cut, lots of code-- */
preempt_enable_no_resched();
put_user()
preempt_disable();

(If you wish to seriously question the usage of the preempt API in this 
manner, I unfortunately have no comment since I didn't write the code.)

This particular block of code was causing lockups and crashes on a 
certain ARM64 device. The generated assembly revealed that the compiler 
was simply optimizing out the increment and decrement of the preempt 
count, allowing put_user to run without preemption enabled, causing all 
sorts of badness. Since put_user doesn't actually access the preempt 
count and translates to just a few instructions without any branching, I 
suppose that the compiler figured it was OK to optimize.
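
To make the failure mode concrete, this is roughly what the compiler is 
entitled to do with that sequence (an illustrative sketch; 'val' and 
'uptr' are made-up names standing in for the elided driver code):

	/* what the source says */
	preempt_count_dec();	/* preempt_enable_no_resched()	     */
	put_user(val, uptr);	/* straight-line, no count access    */
	preempt_count_inc();	/* the following preempt_disable()   */

	/* what may be emitted: the dec and inc cancel in program order
	 * and nothing between them is a compiler barrier, so both
	 * updates can be dropped:
	 */
	put_user(val, uptr);	/* runs with the count still raised  */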

The immediate solution is to add a compiler barrier to the code above, 
but should sched_preempt_enable_no_resched have an additional compiler 
barrier after the preempt-count decrement (it already has one before 
it) to prevent this sort of thing?
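
For reference, the immediate workaround looks something like this (a 
sketch; the driver code in between is elided as before):

	preempt_disable();
	/* --cut, lots of code-- */
	preempt_enable_no_resched();
	barrier();	/* forces the decrement to be emitted */
	put_user(val, uptr);
	preempt_disable();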

Thanks,
Vikram

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora 
Forum, a Linux Foundation Collaborative Project


* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
From: Thomas Gleixner @ 2016-05-13 14:21 UTC
  To: Vikram Mulukutla; +Cc: peterz, linux-kernel

On Thu, 12 May 2016, Vikram Mulukutla wrote:
> preempt_disable();
> /* --cut, lots of code-- */
> preempt_enable_no_resched();
> put_user()
> preempt_disable();
> 
> (If you wish to seriously question the usage of the preempt API in this
> manner, I unfortunately have no comment since I didn't write the code.)
> 
> This particular block of code was causing lockups and crashes on a certain
> ARM64 device. The generated assembly revealed that the compiler was simply
> optimizing out the increment and decrement of the preempt count, allowing
> put_user to run without preemption enabled, causing all sorts of badness.
> Since put_user doesn't actually access the preempt count and translates to
> just a few instructions without any branching, I suppose that the compiler
> figured it was OK to optimize.
> 
> The immediate solution is to add a compiler barrier to the code above, but
> should sched_preempt_enable_no_resched have an additional compiler barrier

preempt_enable_no_resched() should not be used at all. Use preempt_enable().
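
I.e. the sequence above should simply be (sketch):

	preempt_disable();
	/* --cut, lots of code-- */
	preempt_enable();	/* barrier + dec + resched check */
	put_user(val, uptr);
	preempt_disable();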

Thanks,

	tglx


* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
From: Peter Zijlstra @ 2016-05-13 14:58 UTC
  To: Vikram Mulukutla; +Cc: tglx, linux-kernel

On Thu, May 12, 2016 at 11:39:47PM -0700, Vikram Mulukutla wrote:
> Hi,
> 
> I came across a piece of engineering code that looked like:
> 
> preempt_disable();
> /* --cut, lots of code-- */
> preempt_enable_no_resched();
> put_user()
> preempt_disable();
> 
> (If you wish to seriously question the usage of the preempt API in this
> manner, I unfortunately have no comment since I didn't write the code.)

I'm with Thomas here, that's broken and should not be done.

> This particular block of code was causing lockups and crashes on a certain
> ARM64 device. The generated assembly revealed that the compiler was simply
> optimizing out the increment and decrement of the preempt count, allowing
> put_user to run without preemption enabled, causing all sorts of badness.
> Since put_user doesn't actually access the preempt count and translates to
> just a few instructions without any branching, I suppose that the compiler
> figured it was OK to optimize.
> 
> The immediate solution is to add a compiler barrier to the code above, but
> should sched_preempt_enable_no_resched have an additional compiler barrier
> after the preempt-count decrement (it already has one before it) to prevent
> this sort of thing?

I think the below would be sufficient; IIRC the compiler may not combine
or elide volatile operations.

---
 include/asm-generic/preempt.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
index 5d8ffa3e6f8c..c1cde3577551 100644
--- a/include/asm-generic/preempt.h
+++ b/include/asm-generic/preempt.h
@@ -7,10 +7,10 @@
 
 static __always_inline int preempt_count(void)
 {
-	return current_thread_info()->preempt_count;
+	return READ_ONCE(current_thread_info()->preempt_count);
 }
 
-static __always_inline int *preempt_count_ptr(void)
+static __always_inline volatile int *preempt_count_ptr(void)
 {
 	return &current_thread_info()->preempt_count;
 }
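
(For context: READ_ONCE() boils down to a volatile access -- roughly the 
below, simplified from the kernel's more elaborate actual definition:

	#define READ_ONCE(x)	(*(const volatile typeof(x) *)&(x))

and the compiler must perform volatile accesses exactly as written, so 
it may neither combine nor elide them.)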


* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
From: Vikram Mulukutla @ 2016-05-13 22:44 UTC
  To: Peter Zijlstra; +Cc: tglx, linux-kernel

On 5/13/2016 7:58 AM, Peter Zijlstra wrote:
> On Thu, May 12, 2016 at 11:39:47PM -0700, Vikram Mulukutla wrote:
>> Hi,
>>
>> I came across a piece of engineering code that looked like:
>>
>> preempt_disable();
>> /* --cut, lots of code-- */
>> preempt_enable_no_resched();
>> put_user()
>> preempt_disable();
>>
>> (If you wish to seriously question the usage of the preempt API in this
>> manner, I unfortunately have no comment since I didn't write the code.)
>
> I'm with Thomas here, that's broken and should not be done.

Ok. I did in fact zero in on this code by replacing each instance of 
preempt_enable_no_resched with preempt_enable one by one (there were 
several uses in the driver). I will ask the original developer to 
consider using preempt_enable.

>
>> This particular block of code was causing lockups and crashes on a certain
>> ARM64 device. The generated assembly revealed that the compiler was simply
>> optimizing out the increment and decrement of the preempt count, allowing
>> put_user to run without preemption enabled, causing all sorts of badness.
>> Since put_user doesn't actually access the preempt count and translates to
>> just a few instructions without any branching, I suppose that the compiler
>> figured it was OK to optimize.
>>
>> The immediate solution is to add a compiler barrier to the code above, but
>> should sched_preempt_enable_no_resched have an additional compiler barrier
>> after the preempt-count decrement (it already has one before it) to prevent
>> this sort of thing?
>
> I think the below would be sufficient; IIRC the compiler may not combine
> or elide volatile operations.
>
> ---
>   include/asm-generic/preempt.h | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
> index 5d8ffa3e6f8c..c1cde3577551 100644
> --- a/include/asm-generic/preempt.h
> +++ b/include/asm-generic/preempt.h
> @@ -7,10 +7,10 @@
>
>   static __always_inline int preempt_count(void)
>   {
> -	return current_thread_info()->preempt_count;
> +	return READ_ONCE(current_thread_info()->preempt_count);
>   }
>
> -static __always_inline int *preempt_count_ptr(void)
> +static __always_inline volatile int *preempt_count_ptr(void)
>   {
>   	return &current_thread_info()->preempt_count;
>   }
>

Thanks Peter, this patch worked for me. The compiler no longer optimizes 
out the increment/decrement of the preempt_count.

Thanks,
Vikram


* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
From: Thomas Gleixner @ 2016-05-14 15:39 UTC
  To: Vikram Mulukutla; +Cc: Peter Zijlstra, linux-kernel

On Fri, 13 May 2016, Vikram Mulukutla wrote:
> On 5/13/2016 7:58 AM, Peter Zijlstra wrote:
> > diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
> > index 5d8ffa3e6f8c..c1cde3577551 100644
> > --- a/include/asm-generic/preempt.h
> > +++ b/include/asm-generic/preempt.h
> > @@ -7,10 +7,10 @@
> > 
> >   static __always_inline int preempt_count(void)
> >   {
> > -	return current_thread_info()->preempt_count;
> > +	return READ_ONCE(current_thread_info()->preempt_count);
> >   }
> > 
> > -static __always_inline int *preempt_count_ptr(void)
> > +static __always_inline volatile int *preempt_count_ptr(void)
> >   {
> >   	return &current_thread_info()->preempt_count;
> >   }
> > 
> 
> Thanks Peter, this patch worked for me. The compiler no longer optimizes out
> the increment/decrement of the preempt_count.

I have a hard time understanding why the compiler optimizes out stuff w/o
that patch.

We already have:

#define preempt_disable() \
do { \
        preempt_count_inc(); \
        barrier(); \
} while (0)

#define sched_preempt_enable_no_resched() \
do { \
        barrier(); \
        preempt_count_dec(); \
} while (0)

#define preempt_enable() \
do { \
        barrier(); \
        if (unlikely(preempt_count_dec_and_test())) \
                __preempt_schedule(); \
} while (0)

So the barriers already forbid the compiler from reordering code. How on
earth is the compiler supposed to optimize the dec/inc out?

There is more code than the preempt stuff depending on barrier() and
expecting that the compiler does not optimize things out at will. Are we
supposed to hunt down all occurrences and amend them with READ_ONCE or
whatever, one by one?

Thanks,

	tglx


* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
From: Vikram Mulukutla @ 2016-05-14 18:28 UTC
  To: Thomas Gleixner; +Cc: Peter Zijlstra, linux-kernel

On 5/14/2016 8:39 AM, Thomas Gleixner wrote:
> On Fri, 13 May 2016, Vikram Mulukutla wrote:
>> On 5/13/2016 7:58 AM, Peter Zijlstra wrote:
>>> diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
>>> index 5d8ffa3e6f8c..c1cde3577551 100644
>>> --- a/include/asm-generic/preempt.h
>>> +++ b/include/asm-generic/preempt.h
>>> @@ -7,10 +7,10 @@
>>>
>>>    static __always_inline int preempt_count(void)
>>>    {
>>> -	return current_thread_info()->preempt_count;
>>> +	return READ_ONCE(current_thread_info()->preempt_count);
>>>    }
>>>
>>> -static __always_inline int *preempt_count_ptr(void)
>>> +static __always_inline volatile int *preempt_count_ptr(void)
>>>    {
>>>    	return &current_thread_info()->preempt_count;
>>>    }
>>>
>>
>> Thanks Peter, this patch worked for me. The compiler no longer optimizes out
>> the increment/decrement of the preempt_count.
>
> I have a hard time understanding why the compiler optimizes out stuff w/o
> that patch.
>
> We already have:
>
> #define preempt_disable() \
> do { \
>          preempt_count_inc(); \
>          barrier(); \
> } while (0)
>
> #define sched_preempt_enable_no_resched() \
> do { \
>          barrier(); \
>          preempt_count_dec(); \
> } while (0)
>
> #define preempt_enable() \
> do { \
>          barrier(); \
>          if (unlikely(preempt_count_dec_and_test())) \
>                  __preempt_schedule(); \
> } while (0)
>
> So the barriers already forbid the compiler from reordering code. How on
> earth is the compiler supposed to optimize the dec/inc out?
>

While I cannot claim more than a very rudimentary knowledge of 
compilers, this was the initial reaction that I had as well. But then 
the barriers are in the wrong spots for the way the code was used in the 
driver in question. preempt_disable has the barrier() _after_ the 
increment, and sched_preempt_enable_no_resched has it _before_ the 
decrement. Therefore, if one were to use preempt_enable_no_resched 
followed by preempt_disable, there is no compiler barrier between the 
decrement and the increment statements. Whether that should permit such 
a seemingly aggressive optimization, I'm not completely sure.
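
Spelled out, that back-to-back sequence expands to:

	barrier();
	preempt_count_dec();	/* sched_preempt_enable_no_resched() */
				/* <-- no compiler barrier here      */
	preempt_count_inc();	/* preempt_disable()                 */
	barrier();

with nothing between the dec and the inc that the compiler must order 
against.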

> There is more code than the preempt stuff depending on barrier() and
> expecting that the compiler does not optimize things out at will. Are we
> supposed to hunt down all occurrences and amend them with READ_ONCE or
> whatever, one by one?
>

I think the barrier is working as intended for the sequence of 
preempt_disable followed by preempt_enable_no_resched.

> Thanks,
>
> 	tglx
>
>
>

As a simple experiment I used gcc on x86 on the following C program. 
This was really to convince myself rather than you and Peter!

#include <stdio.h>

#define barrier() __asm__ __volatile__("": : :"memory")

struct thread_info {
         int pc;
};

#define preempt_count() \
         ti_ptr->pc

#define preempt_disable() \
do \
{ \
         preempt_count() += 1; \
         barrier(); \
} \
while (0)

#define preempt_enable() \
do \
{ \
         barrier(); \
         preempt_count() -=  1; \
} \
while (0)

struct thread_info *ti_ptr;

int main(void)
{
         struct thread_info ti;
         int b = 0;  /* initialized; reading an indeterminate value is UB */
         ti_ptr = &ti;

         preempt_enable();
         b = b + 500;
         preempt_disable();

         printf("b = %d\n", b);

         return 0;
}
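
To reproduce, compile at -O2 and disassemble, e.g. (objdump is just one 
choice of tool; the file name is arbitrary):

	gcc -O2 -o test test.c
	objdump -d test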

With gcc -O2 I get this (the ARM compiler behaves similarly):

0000000000400470 <main>:
   400470:       48 83 ec 18             sub    $0x18,%rsp
   400474:       48 89 25 cd 0b 20 00    mov    %rsp,0x200bcd(%rip)
   40047b:       ba f4 01 00 00          mov    $0x1f4,%edx
   400480:       be 14 06 40 00          mov    $0x400614,%esi
   400485:       bf 01 00 00 00          mov    $0x1,%edi
   40048a:       31 c0                   xor    %eax,%eax
   40048c:       e8 cf ff ff ff          callq  400460 <__printf_chk@plt>
   400491:       31 c0                   xor    %eax,%eax
   400493:       48 83 c4 18             add    $0x18,%rsp
   400497:       c3                      retq

If I swap preempt_enable and preempt_disable I get this:

0000000000400470 <main>:
   400470:       48 83 ec 18             sub    $0x18,%rsp
   400474:       48 89 25 cd 0b 20 00    mov    %rsp,0x200bcd(%rip)
   40047b:       83 04 24 01             addl   $0x1,(%rsp)
   40047f:       48 8b 05 c2 0b 20 00    mov    0x200bc2(%rip),%rax
   400486:       ba f4 01 00 00          mov    $0x1f4,%edx
   40048b:       be 14 06 40 00          mov    $0x400614,%esi
   400490:       bf 01 00 00 00          mov    $0x1,%edi
   400495:       83 28 01                subl   $0x1,(%rax)
   400498:       31 c0                   xor    %eax,%eax
   40049a:       e8 c1 ff ff ff          callq  400460 <__printf_chk@plt>
   40049f:       31 c0                   xor    %eax,%eax
   4004a1:       48 83 c4 18             add    $0x18,%rsp
   4004a5:       c3                      retq

Note: if I make ti_ptr a local variable on the stack, the 
ordering/barriers no longer matter; the inc/dec is always optimized out. 
I suppose the compiler does treat current_thread_info as coming from a 
different memory location rather than the current stack frame. In any 
case, IMHO the volatile preempt_count patch is necessary.

Thanks,
Vikram

-- 


* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
From: Peter Zijlstra @ 2016-05-16 10:55 UTC
  To: Thomas Gleixner; +Cc: Vikram Mulukutla, linux-kernel

On Sat, May 14, 2016 at 05:39:37PM +0200, Thomas Gleixner wrote:
> I have a hard time understanding why the compiler optimizes out stuff w/o
> that patch.
> 
> We already have:
> 
> #define preempt_disable() \
> do { \
>         preempt_count_inc(); \
>         barrier(); \
> } while (0)
> 
> #define sched_preempt_enable_no_resched() \
> do { \
>         barrier(); \
>         preempt_count_dec(); \
> } while (0)
> 
> #define preempt_enable() \
> do { \
>         barrier(); \
>         if (unlikely(preempt_count_dec_and_test())) \
>                 __preempt_schedule(); \
> } while (0)
> 
> So the barriers already forbid the compiler from reordering code. How on
> earth is the compiler supposed to optimize the dec/inc out?

Order things like:

> #define sched_preempt_enable_no_resched() \
> do { \
>         barrier(); \
>         preempt_count_dec(); \
> } while (0)

> #define preempt_disable() \
> do { \
>         preempt_count_inc(); \
>         barrier(); \
> } while (0)

And there is no barrier between the dec and inc, and a smarty pants
compiler could just decide to forgo the update, since in program order
there is no observable difference either way.

Making the thing volatile tells the compiler there can be external
observations of the memory and it cannot assume things like that and
must emit the operations.
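
A minimal userspace illustration of that rule (a sketch; 'ti' stands in 
for whatever current_thread_info() returns):

	volatile int *p = &ti->preempt_count;
	(*p)--;		/* volatile accesses are observable side effects, */
	(*p)++;		/* so the compiler must emit both, in order       */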

You're right in that the 'proper' sequence:

> #define preempt_enable() \
> do { \
>         barrier(); \
>         if (unlikely(preempt_count_dec_and_test())) \
>                 __preempt_schedule(); \
> } while (0)

> #define preempt_disable() \
> do { \
>         preempt_count_inc(); \
>         barrier(); \
> } while (0)

Has a higher chance of actually emitting the operations to memory; but
an even smarter pants compiler might figure that doing something like:

	if (preempt_count() == 1)
		__preempt_schedule();

is equivalent and emits that instead, not bothering to modify the actual
variable at all -- the program as specified cannot tell the difference
etc.

Also, in the case of !PREEMPT && PREEMPT_COUNT, the normal:

	preempt_disable();
	preempt_enable();

sequence turns into the first case again.
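
(With !PREEMPT && PREEMPT_COUNT, preempt_enable() is -- if I remember the 
generic code correctly -- just:

	#define preempt_enable() \
	do { \
		barrier(); \
		preempt_count_dec(); \
	} while (0)

i.e. identical to sched_preempt_enable_no_resched(), so even the 'proper' 
pairing has the same exposure in that configuration.)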

So I'll go write a proper changelog for the volatile thing and get it
merged with a Cc to stable.


* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
From: Peter Zijlstra @ 2016-05-16 11:00 UTC
  To: Thomas Gleixner; +Cc: Vikram Mulukutla, linux-kernel

On Mon, May 16, 2016 at 12:55:57PM +0200, Peter Zijlstra wrote:
> You're right in that the 'proper' sequence:
> 
> > #define preempt_enable() \
> > do { \
> >         barrier(); \
> >         if (unlikely(preempt_count_dec_and_test())) \
> >                 __preempt_schedule(); \
> > } while (0)
> 
> > #define preempt_disable() \
> > do { \
> >         preempt_count_inc(); \
> >         barrier(); \
> > } while (0)
> 
> Has a higher chance of actually emitting the operations to memory; but
> an even smarter pants compiler might figure that doing something like:
> 
> 	if (preempt_count() == 1)
> 		__preempt_schedule();
> 
> is equivalent and emits that instead, not bothering to modify the actual
> variable at all -- the program as specified cannot tell the difference
> etc.

For this to work, though, the call to __preempt_schedule() must be
obfuscated; but I think the thing we do for x86, which turns it into:

	asm("call ___preempt_schedule;");

might just be enough for the compiler to get confused about that.

A normal function call would act as a compiler barrier and force the
compiler to emit the memory ops.

Then again, it could maybe do:

	if (preempt_count() == 1) {
		preempt_count_dec();
		__preempt_schedule();
		preempt_count_inc();
	}

and think it did us a service by 'optimizing' away that memory reference
in the 'fast' path.

Who knows what these compilers get up to ;-)


* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
From: Peter Zijlstra @ 2016-05-16 13:17 UTC
  To: Thomas Gleixner; +Cc: Vikram Mulukutla, linux-kernel, Ingo Molnar

On Mon, May 16, 2016 at 12:55:57PM +0200, Peter Zijlstra wrote:
> > So the barriers already forbid the compiler from reordering code. How on
> > earth is the compiler supposed to optimize the dec/inc out?
> 
> Order things like:
> 
> > #define sched_preempt_enable_no_resched() \
> > do { \
> >         barrier(); \
> >         preempt_count_dec(); \
> > } while (0)
> 
> > #define preempt_disable() \
> > do { \
> >         preempt_count_inc(); \
> >         barrier(); \
> > } while (0)
> 
> And there is no barrier between the dec and inc, and a smarty pants
> compiler could just decide to forgo the update, since in program order
> there is no observable difference either way.
> 
> Making the thing volatile tells the compiler there can be external
> observations of the memory and it cannot assume things like that and
> must emit the operations.

...

> So I'll go write a proper changelog for the volatile thing and get it
> merged with a Cc to stable.


---
Subject: sched,preempt: Fix preempt_count manipulations
From: Peter Zijlstra <peterz@infradead.org>
Date: Mon May 16 15:01:11 CEST 2016

Vikram reported that his ARM64 compiler managed to 'optimize' away the
preempt_count manipulations in code like:

	preempt_enable_no_resched();
	put_user();
	preempt_disable();

Irrespective of the fact that this is horrible code that should be
fixed for many reasons, it does highlight a deficiency in the generic
preempt_count manipulators, as it is never right to combine/elide
preempt_count manipulations like this.

Therefore sprinkle some volatile in the two generic accessors to
ensure the compiler is aware of the fact that the preempt_count is
observed outside of the regular program-order view and thus cannot be
optimized away like this.

x86, the only arch not using the generic code, is not affected as we
do all this in asm in order to use the segment base per-cpu stuff.

Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: a787870924db ("sched, arch: Create asm/preempt.h")
Reported-by: Vikram Mulukutla <markivx@codeaurora.org>
Tested-by: Vikram Mulukutla <markivx@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/preempt.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/include/asm-generic/preempt.h
+++ b/include/asm-generic/preempt.h
@@ -7,10 +7,10 @@
 
 static __always_inline int preempt_count(void)
 {
-	return current_thread_info()->preempt_count;
+	return READ_ONCE(current_thread_info()->preempt_count);
 }
 
-static __always_inline int *preempt_count_ptr(void)
+static __always_inline volatile int *preempt_count_ptr(void)
 {
 	return &current_thread_info()->preempt_count;
 }


* [tip:sched/urgent] sched/preempt: Fix preempt_count manipulations
From: tip-bot for Peter Zijlstra @ 2016-05-17 14:21 UTC
  To: linux-tip-commits
  Cc: mingo, linux-kernel, markivx, peterz, hpa, tglx, torvalds

Commit-ID:  2e636d5e66c35dfcbaf617aa8fa963f6847478fe
Gitweb:     http://git.kernel.org/tip/2e636d5e66c35dfcbaf617aa8fa963f6847478fe
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Mon, 16 May 2016 15:01:11 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 17 May 2016 12:24:21 +0200

sched/preempt: Fix preempt_count manipulations

Vikram reported that his ARM64 compiler managed to 'optimize' away the
preempt_count manipulations in code like:

	preempt_enable_no_resched();
	put_user();
	preempt_disable();

Irrespective of the fact that this is horrible code that should be
fixed for many reasons, it does highlight a deficiency in the generic
preempt_count manipulators, as it is never right to combine/elide
preempt_count manipulations like this.

Therefore sprinkle some volatile in the two generic accessors to
ensure the compiler is aware of the fact that the preempt_count is
observed outside of the regular program-order view and thus cannot be
optimized away like this.

x86, the only arch not using the generic code, is not affected as we
do all this in asm in order to use the segment base per-cpu stuff.

Reported-by: Vikram Mulukutla <markivx@codeaurora.org>
Tested-by: Vikram Mulukutla <markivx@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: a787870924db ("sched, arch: Create asm/preempt.h")
Link: http://lkml.kernel.org/r/20160516131751.GH3205@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/asm-generic/preempt.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
index 5d8ffa3..c1cde35 100644
--- a/include/asm-generic/preempt.h
+++ b/include/asm-generic/preempt.h
@@ -7,10 +7,10 @@
 
 static __always_inline int preempt_count(void)
 {
-	return current_thread_info()->preempt_count;
+	return READ_ONCE(current_thread_info()->preempt_count);
 }
 
-static __always_inline int *preempt_count_ptr(void)
+static __always_inline volatile int *preempt_count_ptr(void)
 {
 	return &current_thread_info()->preempt_count;
 }

