* Additional compiler barrier required in sched_preempt_enable_no_resched?
@ 2016-05-13 6:39 Vikram Mulukutla
2016-05-13 14:21 ` Thomas Gleixner
2016-05-13 14:58 ` Peter Zijlstra
0 siblings, 2 replies; 10+ messages in thread
From: Vikram Mulukutla @ 2016-05-13 6:39 UTC (permalink / raw)
To: peterz, tglx; +Cc: linux-kernel
Hi,
I came across a piece of engineering code that looked like:
preempt_disable();
/* --cut, lots of code-- */
preempt_enable_no_resched();
put_user();
preempt_disable();
(If you wish to seriously question the usage of the preempt API in this
manner, I unfortunately have no comment since I didn't write the code.)
This particular block of code was causing lockups and crashes on a
certain ARM64 device. The generated assembly revealed that the compiler
was simply optimizing out the increment and decrement of the preempt
count, allowing put_user to run without preemption enabled, causing all
sorts of badness. Since put_user doesn't actually access the preempt
count and translates to just a few instructions without any branching, I
suppose that the compiler figured it was OK to optimize.
The immediate solution is to add a compiler barrier to the code above,
but should sched_preempt_enable_no_resched have an additional compiler
barrier after (has one before already) the preempt-count decrement to
prevent this sort of thing?
Thanks,
Vikram
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
2016-05-13 6:39 Additional compiler barrier required in sched_preempt_enable_no_resched? Vikram Mulukutla
@ 2016-05-13 14:21 ` Thomas Gleixner
2016-05-13 14:58 ` Peter Zijlstra
1 sibling, 0 replies; 10+ messages in thread
From: Thomas Gleixner @ 2016-05-13 14:21 UTC (permalink / raw)
To: Vikram Mulukutla; +Cc: peterz, linux-kernel
On Thu, 12 May 2016, Vikram Mulukutla wrote:
> preempt_disable();
> /* --cut, lots of code-- */
> preempt_enable_no_resched();
> put_user();
> preempt_disable();
>
> (If you wish to seriously question the usage of the preempt API in this
> manner, I unfortunately have no comment since I didn't write the code.)
>
> This particular block of code was causing lockups and crashes on a certain
> ARM64 device. The generated assembly revealed that the compiler was simply
> optimizing out the increment and decrement of the preempt count, allowing
> put_user to run without preemption enabled, causing all sorts of badness.
> Since put_user doesn't actually access the preempt count and translates to
> just a few instructions without any branching, I suppose that the compiler
> figured it was OK to optimize.
>
> The immediate solution is to add a compiler barrier to the code above, but
> should sched_preempt_enable_no_resched have an additional compiler barrier
preempt_enable_no_resched() should not be used at all. Use preempt_enable().
Thanks,
tglx
* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
2016-05-13 6:39 Additional compiler barrier required in sched_preempt_enable_no_resched? Vikram Mulukutla
2016-05-13 14:21 ` Thomas Gleixner
@ 2016-05-13 14:58 ` Peter Zijlstra
2016-05-13 22:44 ` Vikram Mulukutla
1 sibling, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2016-05-13 14:58 UTC (permalink / raw)
To: Vikram Mulukutla; +Cc: tglx, linux-kernel
On Thu, May 12, 2016 at 11:39:47PM -0700, Vikram Mulukutla wrote:
> Hi,
>
> I came across a piece of engineering code that looked like:
>
> preempt_disable();
> /* --cut, lots of code-- */
> preempt_enable_no_resched();
> put_user();
> preempt_disable();
>
> (If you wish to seriously question the usage of the preempt API in this
> manner, I unfortunately have no comment since I didn't write the code.)
I'm with Thomas here, that's broken and should not be done.
> This particular block of code was causing lockups and crashes on a certain
> ARM64 device. The generated assembly revealed that the compiler was simply
> optimizing out the increment and decrement of the preempt count, allowing
> put_user to run without preemption enabled, causing all sorts of badness.
> Since put_user doesn't actually access the preempt count and translates to
> just a few instructions without any branching, I suppose that the compiler
> figured it was OK to optimize.
>
> The immediate solution is to add a compiler barrier to the code above, but
> should sched_preempt_enable_no_resched have an additional compiler barrier
> after (has one before already) the preempt-count decrement to prevent this
> sort of thing?
I think the below would be sufficient; IIRC the compiler may not combine
or elide volatile operations.
---
include/asm-generic/preempt.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
index 5d8ffa3e6f8c..c1cde3577551 100644
--- a/include/asm-generic/preempt.h
+++ b/include/asm-generic/preempt.h
@@ -7,10 +7,10 @@
static __always_inline int preempt_count(void)
{
- return current_thread_info()->preempt_count;
+ return READ_ONCE(current_thread_info()->preempt_count);
}
-static __always_inline int *preempt_count_ptr(void)
+static __always_inline volatile int *preempt_count_ptr(void)
{
return &current_thread_info()->preempt_count;
}
* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
2016-05-13 14:58 ` Peter Zijlstra
@ 2016-05-13 22:44 ` Vikram Mulukutla
2016-05-14 15:39 ` Thomas Gleixner
0 siblings, 1 reply; 10+ messages in thread
From: Vikram Mulukutla @ 2016-05-13 22:44 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: tglx, linux-kernel
On 5/13/2016 7:58 AM, Peter Zijlstra wrote:
> On Thu, May 12, 2016 at 11:39:47PM -0700, Vikram Mulukutla wrote:
>> Hi,
>>
>> I came across a piece of engineering code that looked like:
>>
>> preempt_disable();
>> /* --cut, lots of code-- */
>> preempt_enable_no_resched();
>> put_user();
>> preempt_disable();
>>
>> (If you wish to seriously question the usage of the preempt API in this
>> manner, I unfortunately have no comment since I didn't write the code.)
>
> I'm with Thomas here, that's broken and should not be done.
Ok. I did in fact zero in on this code by replacing each instance of
preempt_enable_no_resched with preempt_enable one by one (there were
several uses in the driver). I will ask the original developer to
consider using preempt_enable.
>
>> This particular block of code was causing lockups and crashes on a certain
>> ARM64 device. The generated assembly revealed that the compiler was simply
>> optimizing out the increment and decrement of the preempt count, allowing
>> put_user to run without preemption enabled, causing all sorts of badness.
>> Since put_user doesn't actually access the preempt count and translates to
>> just a few instructions without any branching, I suppose that the compiler
>> figured it was OK to optimize.
>>
>> The immediate solution is to add a compiler barrier to the code above, but
>> should sched_preempt_enable_no_resched have an additional compiler barrier
>> after (has one before already) the preempt-count decrement to prevent this
>> sort of thing?
>
> I think the below would be sufficient; IIRC the compiler may not combine
> or elide volatile operations.
>
> ---
> include/asm-generic/preempt.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
> index 5d8ffa3e6f8c..c1cde3577551 100644
> --- a/include/asm-generic/preempt.h
> +++ b/include/asm-generic/preempt.h
> @@ -7,10 +7,10 @@
>
> static __always_inline int preempt_count(void)
> {
> - return current_thread_info()->preempt_count;
> + return READ_ONCE(current_thread_info()->preempt_count);
> }
>
> -static __always_inline int *preempt_count_ptr(void)
> +static __always_inline volatile int *preempt_count_ptr(void)
> {
> return &current_thread_info()->preempt_count;
> }
>
Thanks Peter, this patch worked for me. The compiler no longer optimizes
out the increment/decrement of the preempt_count.
Thanks,
Vikram
* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
2016-05-13 22:44 ` Vikram Mulukutla
@ 2016-05-14 15:39 ` Thomas Gleixner
2016-05-14 18:28 ` Vikram Mulukutla
2016-05-16 10:55 ` Peter Zijlstra
0 siblings, 2 replies; 10+ messages in thread
From: Thomas Gleixner @ 2016-05-14 15:39 UTC (permalink / raw)
To: Vikram Mulukutla; +Cc: Peter Zijlstra, linux-kernel
On Fri, 13 May 2016, Vikram Mulukutla wrote:
> On 5/13/2016 7:58 AM, Peter Zijlstra wrote:
> > diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
> > index 5d8ffa3e6f8c..c1cde3577551 100644
> > --- a/include/asm-generic/preempt.h
> > +++ b/include/asm-generic/preempt.h
> > @@ -7,10 +7,10 @@
> >
> > static __always_inline int preempt_count(void)
> > {
> > - return current_thread_info()->preempt_count;
> > + return READ_ONCE(current_thread_info()->preempt_count);
> > }
> >
> > -static __always_inline int *preempt_count_ptr(void)
> > +static __always_inline volatile int *preempt_count_ptr(void)
> > {
> > return &current_thread_info()->preempt_count;
> > }
> >
>
> Thanks Peter, this patch worked for me. The compiler no longer optimizes out
> the increment/decrement of the preempt_count.
I have a hard time understanding why the compiler optimizes stuff out without
that patch.
We already have:
#define preempt_disable() \
do { \
        preempt_count_inc(); \
        barrier(); \
} while (0)

#define sched_preempt_enable_no_resched() \
do { \
        barrier(); \
        preempt_count_dec(); \
} while (0)

#define preempt_enable() \
do { \
        barrier(); \
        if (unlikely(preempt_count_dec_and_test())) \
                __preempt_schedule(); \
} while (0)
So the barriers already prevent the compiler from reordering code. How on
earth is the compiler supposed to optimize the dec/inc out?
There is more code than the preempt stuff depending on barrier() and expecting
that the compiler does not optimize things out at will. Are we supposed to
hunt all occurrences and amend them with READ_ONCE or whatever, one by one?
Thanks,
tglx
* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
2016-05-14 15:39 ` Thomas Gleixner
@ 2016-05-14 18:28 ` Vikram Mulukutla
2016-05-16 10:55 ` Peter Zijlstra
1 sibling, 0 replies; 10+ messages in thread
From: Vikram Mulukutla @ 2016-05-14 18:28 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Peter Zijlstra, linux-kernel
On 5/14/2016 8:39 AM, Thomas Gleixner wrote:
> On Fri, 13 May 2016, Vikram Mulukutla wrote:
>> On 5/13/2016 7:58 AM, Peter Zijlstra wrote:
>>> diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
>>> index 5d8ffa3e6f8c..c1cde3577551 100644
>>> --- a/include/asm-generic/preempt.h
>>> +++ b/include/asm-generic/preempt.h
>>> @@ -7,10 +7,10 @@
>>>
>>> static __always_inline int preempt_count(void)
>>> {
>>> - return current_thread_info()->preempt_count;
>>> + return READ_ONCE(current_thread_info()->preempt_count);
>>> }
>>>
>>> -static __always_inline int *preempt_count_ptr(void)
>>> +static __always_inline volatile int *preempt_count_ptr(void)
>>> {
>>> return &current_thread_info()->preempt_count;
>>> }
>>>
>>
>> Thanks Peter, this patch worked for me. The compiler no longer optimizes out
>> the increment/decrement of the preempt_count.
>
> I have a hard time understanding why the compiler optimizes stuff out
> without that patch.
>
> We already have:
>
> #define preempt_disable() \
> do { \
>         preempt_count_inc(); \
>         barrier(); \
> } while (0)
>
> #define sched_preempt_enable_no_resched() \
> do { \
>         barrier(); \
>         preempt_count_dec(); \
> } while (0)
>
> #define preempt_enable() \
> do { \
>         barrier(); \
>         if (unlikely(preempt_count_dec_and_test())) \
>                 __preempt_schedule(); \
> } while (0)
>
> So the barriers already prevent the compiler from reordering code. How on
> earth is the compiler supposed to optimize the dec/inc out?
>
While I cannot claim more than a very rudimentary knowledge of
compilers, this was the initial reaction that I had as well. But then
the barriers are in the wrong spots for the way the code was used in the
driver in question. preempt_disable has the barrier() _after_ the
increment, and sched_preempt_enable_no_resched has it _before_ the
decrement. Therefore, if one were to use preempt_enable_no_resched
followed by preempt_disable, there is no compiler barrier between the
decrement and the increment statements. Whether this should translate to
such a seemingly aggressive optimization is something I'm not completely
sure of.
> There is more code than the preempt stuff depending on barrier() and expecting
> that the compiler does not optimize things out at will. Are we supposed to
> hunt all occurrences and amend them with READ_ONCE or whatever, one by one?
>
I think the barrier is working as intended for the sequence of
preempt_disable followed by preempt_enable_no_resched.
> Thanks,
>
> tglx
>
>
>
As a simple experiment I used gcc on x86 on the following C program.
This was really to convince myself rather than you and Peter!
#include <stdio.h>

#define barrier() __asm__ __volatile__("": : :"memory")

struct thread_info {
        int pc;
};

#define preempt_count() \
        ti_ptr->pc

#define preempt_disable() \
do \
{ \
        preempt_count() += 1; \
        barrier(); \
} \
while (0)

#define preempt_enable() \
do \
{ \
        barrier(); \
        preempt_count() -= 1; \
} \
while (0)

struct thread_info *ti_ptr;

int main(void)
{
        struct thread_info ti;
        int b = 0;

        ti_ptr = &ti;

        preempt_enable();
        b = b + 500;
        preempt_disable();

        printf("b = %d\n", b);

        return 0;
}
With gcc -O2 I get this (the ARM compiler behaves similarly):
0000000000400470 <main>:
400470: 48 83 ec 18 sub $0x18,%rsp
400474: 48 89 25 cd 0b 20 00 mov %rsp,0x200bcd(%rip)
40047b: ba f4 01 00 00 mov $0x1f4,%edx
400480: be 14 06 40 00 mov $0x400614,%esi
400485: bf 01 00 00 00 mov $0x1,%edi
40048a: 31 c0 xor %eax,%eax
40048c: e8 cf ff ff ff callq 400460 <__printf_chk@plt>
400491: 31 c0 xor %eax,%eax
400493: 48 83 c4 18 add $0x18,%rsp
400497: c3 retq
If I swap preempt_enable and preempt_disable I get this:
0000000000400470 <main>:
400470: 48 83 ec 18 sub $0x18,%rsp
400474: 48 89 25 cd 0b 20 00 mov %rsp,0x200bcd(%rip)
40047b: 83 04 24 01 addl $0x1,(%rsp)
40047f: 48 8b 05 c2 0b 20 00 mov 0x200bc2(%rip),%rax
400486: ba f4 01 00 00 mov $0x1f4,%edx
40048b: be 14 06 40 00 mov $0x400614,%esi
400490: bf 01 00 00 00 mov $0x1,%edi
400495: 83 28 01 subl $0x1,(%rax)
400498: 31 c0 xor %eax,%eax
40049a: e8 c1 ff ff ff callq 400460 <__printf_chk@plt>
40049f: 31 c0 xor %eax,%eax
4004a1: 48 83 c4 18 add $0x18,%rsp
4004a5: c3 retq
Note: If I place ti_ptr on the stack, the ordering/barriers no longer
matter; the inc/dec is always optimized out. I suppose the compiler treats
current_thread_info as referring to a different memory location rather
than the current stack frame. In any case, IMHO the volatile
preempt_count patch is necessary.
Thanks,
Vikram
--
* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
2016-05-14 15:39 ` Thomas Gleixner
2016-05-14 18:28 ` Vikram Mulukutla
@ 2016-05-16 10:55 ` Peter Zijlstra
2016-05-16 11:00 ` Peter Zijlstra
2016-05-16 13:17 ` Peter Zijlstra
1 sibling, 2 replies; 10+ messages in thread
From: Peter Zijlstra @ 2016-05-16 10:55 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Vikram Mulukutla, linux-kernel
On Sat, May 14, 2016 at 05:39:37PM +0200, Thomas Gleixner wrote:
> I have a hard time understanding why the compiler optimizes stuff out
> without that patch.
>
> We already have:
>
> #define preempt_disable() \
> do { \
>         preempt_count_inc(); \
>         barrier(); \
> } while (0)
>
> #define sched_preempt_enable_no_resched() \
> do { \
>         barrier(); \
>         preempt_count_dec(); \
> } while (0)
>
> #define preempt_enable() \
> do { \
>         barrier(); \
>         if (unlikely(preempt_count_dec_and_test())) \
>                 __preempt_schedule(); \
> } while (0)
>
> So the barriers already prevent the compiler from reordering code. How on
> earth is the compiler supposed to optimize the dec/inc out?
Order things like:
> #define sched_preempt_enable_no_resched() \
> do { \
>         barrier(); \
>         preempt_count_dec(); \
> } while (0)
>
> #define preempt_disable() \
> do { \
>         preempt_count_inc(); \
>         barrier(); \
> } while (0)
And there is no barrier between the dec and inc, and a smarty pants
compiler could just decide to forgo the update, since in program order
there is no observable difference either way.
Making the thing volatile tells the compiler there can be external
observations of the memory and it cannot assume things like that and
must emit the operations.
You're right in that the 'proper' sequence:
> #define preempt_enable() \
> do { \
>         barrier(); \
>         if (unlikely(preempt_count_dec_and_test())) \
>                 __preempt_schedule(); \
> } while (0)
>
> #define preempt_disable() \
> do { \
>         preempt_count_inc(); \
>         barrier(); \
> } while (0)
Has a higher chance of actually emitting the operations to memory; but
an even smarter pants compiler might figure doing something like:
        if (preempt_count() == 1)
                __preempt_schedule();
is equivalent and emits that instead, not bothering to modify the actual
variable at all -- the program as specified cannot tell the difference
etc..
Also; in the case of !PREEMPT && PREEMPT_COUNT, the normal:
        preempt_disable();
        preempt_enable();
sequence turns into the first case again.
So I'll go write a proper changelog for the volatile thing and get it
merged with a Cc to stable.
* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
2016-05-16 10:55 ` Peter Zijlstra
@ 2016-05-16 11:00 ` Peter Zijlstra
2016-05-16 13:17 ` Peter Zijlstra
1 sibling, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2016-05-16 11:00 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Vikram Mulukutla, linux-kernel
On Mon, May 16, 2016 at 12:55:57PM +0200, Peter Zijlstra wrote:
> You're right in that the 'proper' sequence:
>
> > #define preempt_enable() \
> > do { \
> >         barrier(); \
> >         if (unlikely(preempt_count_dec_and_test())) \
> >                 __preempt_schedule(); \
> > } while (0)
> >
> > #define preempt_disable() \
> > do { \
> >         preempt_count_inc(); \
> >         barrier(); \
> > } while (0)
>
> Has a higher chance of actually emitting the operations to memory; but
> an even smarter pants compiler might figure doing something like:
>
>         if (preempt_count() == 1)
>                 __preempt_schedule();
>
> is equivalent and emits that instead, not bothering to modify the actual
> variable at all -- the program as specified cannot tell the difference
> etc..
For this to work the call __preempt_schedule() must be obfuscated
though; but I think the thing we do for x86 which turns it into:
asm("call ___preempt_schedule;");
might just be enough for the compiler to get confused about that.
A normal function call would act as a compiler barrier and force the
compiler to emit the memory ops.
Then again, it could maybe do:
        if (preempt_count() == 1) {
                preempt_count_dec();
                __preempt_schedule();
                preempt_count_inc();
        }
and think it did us a service by 'optimizing' away that memory reference
in the 'fast' path.
Who knows what these compilers get up to ;-)
* Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
2016-05-16 10:55 ` Peter Zijlstra
2016-05-16 11:00 ` Peter Zijlstra
@ 2016-05-16 13:17 ` Peter Zijlstra
2016-05-17 14:21 ` [tip:sched/urgent] sched/preempt: Fix preempt_count manipulations tip-bot for Peter Zijlstra
1 sibling, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2016-05-16 13:17 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Vikram Mulukutla, linux-kernel, Ingo Molnar
On Mon, May 16, 2016 at 12:55:57PM +0200, Peter Zijlstra wrote:
> > So the barriers already prevent the compiler from reordering code. How on
> > earth is the compiler supposed to optimize the dec/inc out?
>
> Order things like:
>
> > #define sched_preempt_enable_no_resched() \
> > do { \
> >         barrier(); \
> >         preempt_count_dec(); \
> > } while (0)
> >
> > #define preempt_disable() \
> > do { \
> >         preempt_count_inc(); \
> >         barrier(); \
> > } while (0)
>
> And there is no barrier between the dec and inc, and a smarty pants
> compiler could just decide to forgo the update, since in program order
> there is no observable difference either way.
>
> Making the thing volatile tells the compiler there can be external
> observations of the memory and it cannot assume things like that and
> must emit the operations.
...
> So I'll go write a proper changelog for the volatile thing and get it
> merged with a Cc to stable.
---
Subject: sched,preempt: Fix preempt_count manipulations
From: Peter Zijlstra <peterz@infradead.org>
Date: Mon May 16 15:01:11 CEST 2016
Vikram reported that his ARM64 compiler managed to 'optimize' away the
preempt_count manipulations in code like:
        preempt_enable_no_resched();
        put_user();
        preempt_disable();
Irrespective of the fact that this is horrible code that should be
fixed for many reasons, it does highlight a deficiency in the generic
preempt_count manipulators, as it is never right to combine/elide
preempt_count manipulations like this.
Therefore sprinkle some volatile in the two generic accessors to
ensure the compiler is aware of the fact that the preempt_count is
observed outside of the regular program-order view and thus cannot be
optimized away like this.
x86, the only arch not using the generic code, is not affected, as we
do all this in asm in order to use the segment-based per-cpu stuff.
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: a787870924db ("sched, arch: Create asm/preempt.h")
Reported-by: Vikram Mulukutla <markivx@codeaurora.org>
Tested-by: Vikram Mulukutla <markivx@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/asm-generic/preempt.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/asm-generic/preempt.h
+++ b/include/asm-generic/preempt.h
@@ -7,10 +7,10 @@
static __always_inline int preempt_count(void)
{
- return current_thread_info()->preempt_count;
+ return READ_ONCE(current_thread_info()->preempt_count);
}
-static __always_inline int *preempt_count_ptr(void)
+static __always_inline volatile int *preempt_count_ptr(void)
{
return &current_thread_info()->preempt_count;
}
* [tip:sched/urgent] sched/preempt: Fix preempt_count manipulations
2016-05-16 13:17 ` Peter Zijlstra
@ 2016-05-17 14:21 ` tip-bot for Peter Zijlstra
0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Peter Zijlstra @ 2016-05-17 14:21 UTC (permalink / raw)
To: linux-tip-commits
Cc: mingo, linux-kernel, markivx, peterz, hpa, tglx, torvalds
Commit-ID: 2e636d5e66c35dfcbaf617aa8fa963f6847478fe
Gitweb: http://git.kernel.org/tip/2e636d5e66c35dfcbaf617aa8fa963f6847478fe
Author: Peter Zijlstra <peterz@infradead.org>
AuthorDate: Mon, 16 May 2016 15:01:11 +0200
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 17 May 2016 12:24:21 +0200
sched/preempt: Fix preempt_count manipulations
Vikram reported that his ARM64 compiler managed to 'optimize' away the
preempt_count manipulations in code like:
        preempt_enable_no_resched();
        put_user();
        preempt_disable();
Irrespective of the fact that this is horrible code that should be
fixed for many reasons, it does highlight a deficiency in the generic
preempt_count manipulators, as it is never right to combine/elide
preempt_count manipulations like this.
Therefore sprinkle some volatile in the two generic accessors to
ensure the compiler is aware of the fact that the preempt_count is
observed outside of the regular program-order view and thus cannot be
optimized away like this.
x86, the only arch not using the generic code, is not affected, as we
do all this in asm in order to use the segment-based per-cpu stuff.
Reported-by: Vikram Mulukutla <markivx@codeaurora.org>
Tested-by: Vikram Mulukutla <markivx@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: a787870924db ("sched, arch: Create asm/preempt.h")
Link: http://lkml.kernel.org/r/20160516131751.GH3205@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
include/asm-generic/preempt.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
index 5d8ffa3..c1cde35 100644
--- a/include/asm-generic/preempt.h
+++ b/include/asm-generic/preempt.h
@@ -7,10 +7,10 @@
static __always_inline int preempt_count(void)
{
- return current_thread_info()->preempt_count;
+ return READ_ONCE(current_thread_info()->preempt_count);
}
-static __always_inline int *preempt_count_ptr(void)
+static __always_inline volatile int *preempt_count_ptr(void)
{
return &current_thread_info()->preempt_count;
}
2016-05-13 6:39 Additional compiler barrier required in sched_preempt_enable_no_resched? Vikram Mulukutla
2016-05-13 14:21 ` Thomas Gleixner
2016-05-13 14:58 ` Peter Zijlstra
2016-05-13 22:44 ` Vikram Mulukutla
2016-05-14 15:39 ` Thomas Gleixner
2016-05-14 18:28 ` Vikram Mulukutla
2016-05-16 10:55 ` Peter Zijlstra
2016-05-16 11:00 ` Peter Zijlstra
2016-05-16 13:17 ` Peter Zijlstra
2016-05-17 14:21 ` [tip:sched/urgent] sched/preempt: Fix preempt_count manipulations tip-bot for Peter Zijlstra