linux-kernel.vger.kernel.org archive mirror
* freezer: should barriers be smp ?
@ 2011-04-13  6:14 Mike Frysinger
  2011-04-13 20:58 ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Mike Frysinger @ 2011-04-13  6:14 UTC (permalink / raw)
  To: Pavel Machek, Rafael J. Wysocki
  Cc: linux-kernel, uclinux-dist-devel, linux-pm

when we suspend/resume Blackfin SMP systems, we notice that the
freezer code runs on multiple cores.  this is of course what you want
-- freeze processes in parallel.  however, the code only uses non-smp
based barriers which causes us problems ... our cores need software
support to keep caches in sync, so our smp barriers do just that.  but
the non-smp barriers do not, and so the frozen/thawed processes
randomly get stuck in the wrong task state.

thinking about it, shouldn't the freezer code be using smp barriers ?
the point is to make sure that the updates are seen across all cores,
not just the current one ?
-mike

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: freezer: should barriers be smp ?
  2011-04-13  6:14 freezer: should barriers be smp ? Mike Frysinger
@ 2011-04-13 20:58 ` Rafael J. Wysocki
  2011-04-13 21:02   ` Mike Frysinger
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2011-04-13 20:58 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: Pavel Machek, linux-kernel, uclinux-dist-devel, linux-pm

On Wednesday, April 13, 2011, Mike Frysinger wrote:
> when we suspend/resume Blackfin SMP systems, we notice that the
> freezer code runs on multiple cores.  this is of course what you want
> -- freeze processes in parallel.  however, the code only uses non-smp
> based barriers which causes us problems ... our cores need software
> support to keep caches in sync, so our smp barriers do just that.  but
> the non-smp barriers do not, and so the frozen/thawed processes
> randomly get stuck in the wrong task state.
> 
> thinking about it, shouldnt the freezer code be using smp barriers ?

Yes, it should, but rmb() and wmb() are supposed to be SMP barriers.

Or do you mean something different?

Rafael


* Re: freezer: should barriers be smp ?
  2011-04-13 20:58 ` Rafael J. Wysocki
@ 2011-04-13 21:02   ` Mike Frysinger
  2011-04-13 21:05     ` Pavel Machek
  0 siblings, 1 reply; 25+ messages in thread
From: Mike Frysinger @ 2011-04-13 21:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, linux-kernel, uclinux-dist-devel, linux-pm

On Wed, Apr 13, 2011 at 16:58, Rafael J. Wysocki wrote:
> On Wednesday, April 13, 2011, Mike Frysinger wrote:
>> when we suspend/resume Blackfin SMP systems, we notice that the
>> freezer code runs on multiple cores.  this is of course what you want
>> -- freeze processes in parallel.  however, the code only uses non-smp
>> based barriers which causes us problems ... our cores need software
>> support to keep caches in sync, so our smp barriers do just that.  but
>> the non-smp barriers do not, and so the frozen/thawed processes
>> randomly get stuck in the wrong task state.
>>
>> thinking about it, shouldnt the freezer code be using smp barriers ?
>
> Yes, it should, but rmb() and wmb() are supposed to be SMP barriers.
>
> Or do you mean something different?

then what's the diff between smp_rmb() and rmb() ?

this is what i'm proposing:
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -17,7 +17,7 @@ static inline void frozen_process(void)
 {
    if (!unlikely(current->flags & PF_NOFREEZE)) {
        current->flags |= PF_FROZEN;
-       wmb();
+       smp_wmb();
    }
    clear_freeze_flag(current);
 }
@@ -93,7 +93,7 @@ bool freeze_task(struct task_struct *p, bool sig_only)
     * the task as frozen and next clears its TIF_FREEZE.
     */
    if (!freezing(p)) {
-       rmb();
+       smp_rmb();
        if (frozen(p))
            return false;

-mike


* Re: freezer: should barriers be smp ?
  2011-04-13 21:02   ` Mike Frysinger
@ 2011-04-13 21:05     ` Pavel Machek
  2011-04-13 21:11       ` [uclinux-dist-devel] " Mike Frysinger
  0 siblings, 1 reply; 25+ messages in thread
From: Pavel Machek @ 2011-04-13 21:05 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Rafael J. Wysocki, linux-kernel, uclinux-dist-devel, linux-pm

On Wed 2011-04-13 17:02:45, Mike Frysinger wrote:
> On Wed, Apr 13, 2011 at 16:58, Rafael J. Wysocki wrote:
> > On Wednesday, April 13, 2011, Mike Frysinger wrote:
> >> when we suspend/resume Blackfin SMP systems, we notice that the
> >> freezer code runs on multiple cores.  this is of course what you want
> >> -- freeze processes in parallel.  however, the code only uses non-smp
> >> based barriers which causes us problems ... our cores need software
> >> support to keep caches in sync, so our smp barriers do just that.  but
> >> the non-smp barriers do not, and so the frozen/thawed processes
> >> randomly get stuck in the wrong task state.
> >>
> >> thinking about it, shouldnt the freezer code be using smp barriers ?
> >
> > Yes, it should, but rmb() and wmb() are supposed to be SMP barriers.
> >
> > Or do you mean something different?
> 
> then what's the diff between smp_rmb() and rmb() ?
> 
> this is what i'm proposing:
> --- a/kernel/freezer.c
> +++ b/kernel/freezer.c
> @@ -17,7 +17,7 @@ static inline void frozen_process(void)
>  {
>     if (!unlikely(current->flags & PF_NOFREEZE)) {
>         current->flags |= PF_FROZEN;
> -       wmb();
> +       smp_wmb();
>     }
>     clear_freeze_flag(current);
>  }
> @@ -93,7 +93,7 @@ bool freeze_task(struct task_struct *p, bool sig_only)
>      * the task as frozen and next clears its TIF_FREEZE.
>      */
>     if (!freezing(p)) {
> -       rmb();
> +       smp_rmb();
>         if (frozen(p))
>             return false;

smp_rmb() is NOP on uniprocessor.

I believe the code is correct as is.
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [uclinux-dist-devel] freezer: should barriers be smp ?
  2011-04-13 21:05     ` Pavel Machek
@ 2011-04-13 21:11       ` Mike Frysinger
  2011-04-13 21:53         ` Rafael J. Wysocki
  2011-04-13 22:04         ` [linux-pm] [uclinux-dist-devel] " Alan Stern
  0 siblings, 2 replies; 25+ messages in thread
From: Mike Frysinger @ 2011-04-13 21:11 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Rafael J. Wysocki, uclinux-dist-devel, linux-pm, linux-kernel

On Wed, Apr 13, 2011 at 17:05, Pavel Machek wrote:
> On Wed 2011-04-13 17:02:45, Mike Frysinger wrote:
>> On Wed, Apr 13, 2011 at 16:58, Rafael J. Wysocki wrote:
>> > On Wednesday, April 13, 2011, Mike Frysinger wrote:
>> >> when we suspend/resume Blackfin SMP systems, we notice that the
>> >> freezer code runs on multiple cores.  this is of course what you want
>> >> -- freeze processes in parallel.  however, the code only uses non-smp
>> >> based barriers which causes us problems ... our cores need software
>> >> support to keep caches in sync, so our smp barriers do just that.  but
>> >> the non-smp barriers do not, and so the frozen/thawed processes
>> >> randomly get stuck in the wrong task state.
>> >>
>> >> thinking about it, shouldnt the freezer code be using smp barriers ?
>> >
>> > Yes, it should, but rmb() and wmb() are supposed to be SMP barriers.
>> >
>> > Or do you mean something different?
>>
>> then what's the diff between smp_rmb() and rmb() ?
>>
>> this is what i'm proposing:
>> --- a/kernel/freezer.c
>> +++ b/kernel/freezer.c
>> @@ -17,7 +17,7 @@ static inline void frozen_process(void)
>>  {
>>     if (!unlikely(current->flags & PF_NOFREEZE)) {
>>         current->flags |= PF_FROZEN;
>> -       wmb();
>> +       smp_wmb();
>>     }
>>     clear_freeze_flag(current);
>>  }
>> @@ -93,7 +93,7 @@ bool freeze_task(struct task_struct *p, bool sig_only)
>>      * the task as frozen and next clears its TIF_FREEZE.
>>      */
>>     if (!freezing(p)) {
>> -       rmb();
>> +       smp_rmb();
>>         if (frozen(p))
>>             return false;
>
> smp_rmb() is NOP on uniprocessor.
>
> I believe the code is correct as is.

that isn't what the code / documentation says.  unless i'm reading them
wrong, both seem to indicate that the proposed patch is what we
actually want.

include/linux/compiler-gcc.h:
#define barrier() __asm__ __volatile__("": : :"memory")

include/asm-generic/system.h:
#define mb()    asm volatile ("": : :"memory")
#define rmb()   mb()
#define wmb()   asm volatile ("": : :"memory")

#ifdef CONFIG_SMP
#define smp_mb()    mb()
#define smp_rmb()   rmb()
#define smp_wmb()   wmb()
#else
#define smp_mb()    barrier()
#define smp_rmb()   barrier()
#define smp_wmb()   barrier()
#endif

Documentation/memory-barriers.txt:
SMP memory barriers are reduced to compiler barriers on uniprocessor compiled
systems because it is assumed that a CPU will appear to be self-consistent,
and will order overlapping accesses correctly with respect to itself.

[!] Note that SMP memory barriers _must_ be used to control the ordering of
references to shared memory on SMP systems, though the use of locking instead
is sufficient.
-mike


* Re: [uclinux-dist-devel] freezer: should barriers be smp ?
  2011-04-13 21:11       ` [uclinux-dist-devel] " Mike Frysinger
@ 2011-04-13 21:53         ` Rafael J. Wysocki
  2011-04-13 22:11           ` [linux-pm] " Alan Stern
  2011-04-13 22:22           ` [uclinux-dist-devel] freezer: should barriers be smp ? Mike Frysinger
  2011-04-13 22:04         ` [linux-pm] [uclinux-dist-devel] " Alan Stern
  1 sibling, 2 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2011-04-13 21:53 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: Pavel Machek, uclinux-dist-devel, linux-pm, linux-kernel

On Wednesday, April 13, 2011, Mike Frysinger wrote:
> On Wed, Apr 13, 2011 at 17:05, Pavel Machek wrote:
> > On Wed 2011-04-13 17:02:45, Mike Frysinger wrote:
> >> On Wed, Apr 13, 2011 at 16:58, Rafael J. Wysocki wrote:
> >> > On Wednesday, April 13, 2011, Mike Frysinger wrote:
> >> >> when we suspend/resume Blackfin SMP systems, we notice that the
> >> >> freezer code runs on multiple cores.  this is of course what you want
> >> >> -- freeze processes in parallel.  however, the code only uses non-smp
> >> >> based barriers which causes us problems ... our cores need software
> >> >> support to keep caches in sync, so our smp barriers do just that.  but
> >> >> the non-smp barriers do not, and so the frozen/thawed processes
> >> >> randomly get stuck in the wrong task state.
> >> >>
> >> >> thinking about it, shouldnt the freezer code be using smp barriers ?
> >> >
> >> > Yes, it should, but rmb() and wmb() are supposed to be SMP barriers.
> >> >
> >> > Or do you mean something different?
> >>
> >> then what's the diff between smp_rmb() and rmb() ?
> >>
> >> this is what i'm proposing:
> >> --- a/kernel/freezer.c
> >> +++ b/kernel/freezer.c
> >> @@ -17,7 +17,7 @@ static inline void frozen_process(void)
> >>  {
> >>     if (!unlikely(current->flags & PF_NOFREEZE)) {
> >>         current->flags |= PF_FROZEN;
> >> -       wmb();
> >> +       smp_wmb();
> >>     }
> >>     clear_freeze_flag(current);
> >>  }
> >> @@ -93,7 +93,7 @@ bool freeze_task(struct task_struct *p, bool sig_only)
> >>      * the task as frozen and next clears its TIF_FREEZE.
> >>      */
> >>     if (!freezing(p)) {
> >> -       rmb();
> >> +       smp_rmb();
> >>         if (frozen(p))
> >>             return false;
> >
> > smp_rmb() is NOP on uniprocessor.
> >
> > I believe the code is correct as is.
> 
> that isnt what the code / documentation says.  unless i'm reading them
> wrong, both seem to indicate that the proposed patch is what we
> actually want.

Not really.

> include/linux/compiler-gcc.h:
> #define barrier() __asm__ __volatile__("": : :"memory")
> 
> include/asm-generic/system.h:
> #define mb()    asm volatile ("": : :"memory")
> #define rmb()   mb()
> #define wmb()   asm volatile ("": : :"memory")
> 
> #ifdef CONFIG_SMP
> #define smp_mb()    mb()
> #define smp_rmb()   rmb()
> #define smp_wmb()   wmb()
> #else
> #define smp_mb()    barrier()
> #define smp_rmb()   barrier()
> #define smp_wmb()   barrier()
> #endif

The above means that smp_*mb() are defined as *mb() if CONFIG_SMP is set,
which basically means that *mb() are more restrictive than the corresponding
smp_*mb().  More precisely, they also cover the cases in which the CPU
reorders instructions on uniprocessor, which we definitely want to cover.

IOW, your patch would break things on uniprocessor where the CPU reorders
instructions.

> Documentation/memory-barriers.txt:
> SMP memory barriers are reduced to compiler barriers on uniprocessor compiled
> systems because it is assumed that a CPU will appear to be self-consistent,
> and will order overlapping accesses correctly with respect to itself.

Exactly, which is not guaranteed in general (e.g. on Alpha).  That is, some
CPUs can reorder instructions in such a way that a compiler barrier is not
sufficient to prevent breakage.

The code _may_ be wrong for a different reason, though.  I need to check.

Thanks,
Rafael


* Re: [linux-pm] [uclinux-dist-devel] freezer: should barriers be smp ?
  2011-04-13 21:11       ` [uclinux-dist-devel] " Mike Frysinger
  2011-04-13 21:53         ` Rafael J. Wysocki
@ 2011-04-13 22:04         ` Alan Stern
  2011-04-15 16:29           ` Pavel Machek
  1 sibling, 1 reply; 25+ messages in thread
From: Alan Stern @ 2011-04-13 22:04 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: Pavel Machek, uclinux-dist-devel, linux-pm, linux-kernel

On Wed, 13 Apr 2011, Mike Frysinger wrote:

> On Wed, Apr 13, 2011 at 17:05, Pavel Machek wrote:
> > On Wed 2011-04-13 17:02:45, Mike Frysinger wrote:
> 
> >> then what's the diff between smp_rmb() and rmb() ?
> >>
> >> this is what i'm proposing:
> >> --- a/kernel/freezer.c
> >> +++ b/kernel/freezer.c
> >> @@ -17,7 +17,7 @@ static inline void frozen_process(void)
> >>  {
> >>     if (!unlikely(current->flags & PF_NOFREEZE)) {
> >>         current->flags |= PF_FROZEN;
> >> -       wmb();
> >> +       smp_wmb();
> >>     }
> >>     clear_freeze_flag(current);
> >>  }
> >> @@ -93,7 +93,7 @@ bool freeze_task(struct task_struct *p, bool sig_only)
> >>      * the task as frozen and next clears its TIF_FREEZE.
> >>      */
> >>     if (!freezing(p)) {
> >> -       rmb();
> >> +       smp_rmb();
> >>         if (frozen(p))
> >>             return false;
> >
> > smp_rmb() is NOP on uniprocessor.
> >
> > I believe the code is correct as is.
> 
> that isnt what the code / documentation says.  unless i'm reading them
> wrong, both seem to indicate that the proposed patch is what we
> actually want.

The existing code is correct but it isn't optimal.

wmb() and rmb() are heavy-duty operations, and you don't want to call
them when they aren't needed.  That's exactly what smp_wmb() and
smp_rmb() are for -- they call wmb() and rmb(), but only in SMP 
kernels.

Unless you need to synchronize with another processor (not necessarily 
a CPU, it could be something embedded within a device), you should 
always use smp_wmb() and smp_rmb() rather than wmb() and rmb().

Alan Stern



* Re: [linux-pm] [uclinux-dist-devel] freezer: should barriers be smp ?
  2011-04-13 21:53         ` Rafael J. Wysocki
@ 2011-04-13 22:11           ` Alan Stern
  2011-04-13 22:34             ` [linux-pm] [uclinux-dist-devel] freezer: should barriers be smp? Rafael J. Wysocki
  2011-04-13 22:22           ` [uclinux-dist-devel] freezer: should barriers be smp ? Mike Frysinger
  1 sibling, 1 reply; 25+ messages in thread
From: Alan Stern @ 2011-04-13 22:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Mike Frysinger, uclinux-dist-devel, linux-pm, linux-kernel

On Wed, 13 Apr 2011, Rafael J. Wysocki wrote:

> The above means that smp_*mb() are defined as *mb() if CONFIG_SMP is set,
> which basically means that *mb() are more restrictive than the corresponding
> smp_*mb().  More precisely, they also cover the cases in which the CPU
> reorders instructions on uniprocessor, which we definitely want to cover.
> 
> IOW, your patch would break things on uniprocessor where the CPU reorders
> instructions.

How could anything break on a UP system?  CPUs don't reorder 
instructions that drastically.  For example, no CPU will ever violate
this assertion:

	x = 0;
	y = x;
	x = 1;
	assert(y == 0);

even if it does reorder the second and third statements internally.  
This is guaranteed by the C language specification.

> > Documentation/memory-barriers.txt:
> > SMP memory barriers are reduced to compiler barriers on uniprocessor compiled
> > systems because it is assumed that a CPU will appear to be self-consistent,
> > and will order overlapping accesses correctly with respect to itself.
> 
> Exactly, which is not guaranteed in general (e.g. on Alpha).  That is, some
> CPUs can reorder instructions in such a way that a compiler barrier is not
> sufficient to prevent breakage.

I don't think this is right.  You _can_ assume that Alphas appear to be
self-consistent.  If they didn't, you wouldn't be able to use them at
all.

Alan Stern



* Re: [uclinux-dist-devel] freezer: should barriers be smp ?
  2011-04-13 21:53         ` Rafael J. Wysocki
  2011-04-13 22:11           ` [linux-pm] " Alan Stern
@ 2011-04-13 22:22           ` Mike Frysinger
  2011-04-13 22:49             ` Rafael J. Wysocki
  1 sibling, 1 reply; 25+ messages in thread
From: Mike Frysinger @ 2011-04-13 22:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, uclinux-dist-devel, linux-pm, linux-kernel

On Wed, Apr 13, 2011 at 17:53, Rafael J. Wysocki wrote:
> On Wednesday, April 13, 2011, Mike Frysinger wrote:
>> On Wed, Apr 13, 2011 at 17:05, Pavel Machek wrote:
>> > On Wed 2011-04-13 17:02:45, Mike Frysinger wrote:
>> >> On Wed, Apr 13, 2011 at 16:58, Rafael J. Wysocki wrote:
>> >> > On Wednesday, April 13, 2011, Mike Frysinger wrote:
>> >> >> when we suspend/resume Blackfin SMP systems, we notice that the
>> >> >> freezer code runs on multiple cores.  this is of course what you want
>> >> >> -- freeze processes in parallel.  however, the code only uses non-smp
>> >> >> based barriers which causes us problems ... our cores need software
>> >> >> support to keep caches in sync, so our smp barriers do just that.  but
>> >> >> the non-smp barriers do not, and so the frozen/thawed processes
>> >> >> randomly get stuck in the wrong task state.
>> >> >>
>> >> >> thinking about it, shouldnt the freezer code be using smp barriers ?
>> >> >
>> >> > Yes, it should, but rmb() and wmb() are supposed to be SMP barriers.
>> >> >
>> >> > Or do you mean something different?
>> >>
>> >> then what's the diff between smp_rmb() and rmb() ?
>> >>
>> >> this is what i'm proposing:
>> >> --- a/kernel/freezer.c
>> >> +++ b/kernel/freezer.c
>> >> @@ -17,7 +17,7 @@ static inline void frozen_process(void)
>> >>  {
>> >>     if (!unlikely(current->flags & PF_NOFREEZE)) {
>> >>         current->flags |= PF_FROZEN;
>> >> -       wmb();
>> >> +       smp_wmb();
>> >>     }
>> >>     clear_freeze_flag(current);
>> >>  }
>> >> @@ -93,7 +93,7 @@ bool freeze_task(struct task_struct *p, bool sig_only)
>> >>      * the task as frozen and next clears its TIF_FREEZE.
>> >>      */
>> >>     if (!freezing(p)) {
>> >> -       rmb();
>> >> +       smp_rmb();
>> >>         if (frozen(p))
>> >>             return false;
>> >
>> > smp_rmb() is NOP on uniprocessor.
>> >
>> > I believe the code is correct as is.
>>
>> that isnt what the code / documentation says.  unless i'm reading them
>> wrong, both seem to indicate that the proposed patch is what we
>> actually want.
>
> Not really.
>
>> include/linux/compiler-gcc.h:
>> #define barrier() __asm__ __volatile__("": : :"memory")
>>
>> include/asm-generic/system.h:
>> #define mb()    asm volatile ("": : :"memory")
>> #define rmb()   mb()
>> #define wmb()   asm volatile ("": : :"memory")
>>
>> #ifdef CONFIG_SMP
>> #define smp_mb()    mb()
>> #define smp_rmb()   rmb()
>> #define smp_wmb()   wmb()
>> #else
>> #define smp_mb()    barrier()
>> #define smp_rmb()   barrier()
>> #define smp_wmb()   barrier()
>> #endif
>
> The above means that smp_*mb() are defined as *mb() if CONFIG_SMP is set,
> which basically means that *mb() are more restrictive than the corresponding
> smp_*mb().  More precisely, they also cover the cases in which the CPU
> reorders instructions on uniprocessor, which we definitely want to cover.
>
> IOW, your patch would break things on uniprocessor where the CPU reorders
> instructions.
>
>> Documentation/memory-barriers.txt:
>> SMP memory barriers are reduced to compiler barriers on uniprocessor compiled
>> systems because it is assumed that a CPU will appear to be self-consistent,
>> and will order overlapping accesses correctly with respect to itself.
>
> Exactly, which is not guaranteed in general (e.g. on Alpha).  That is, some
> CPUs can reorder instructions in such a way that a compiler barrier is not
> sufficient to prevent breakage.
>
> The code _may_ be wrong for a different reason, though.  I need to check.

so the current code is protecting against a UP system swapping in/out
freezer threads for processes, and the barriers are to make sure that
the updated flags variable is posted by the time another swapped in
thread gets to that point.

i guess the trouble for us is that you have one CPU posting writes to
task->flags (and doing so by grabbing the task's spinlock), but the
other CPU is simply reading those flags.  there are no SMP barriers in
between the read and write steps, nor is the reading CPU grabbing any
locks which would be an implicit SMP barrier.  since the Blackfin SMP
port lacks hardware cache coherency, there is no way for us to know
"we've got to sync the caches before we can do this read".  by using
the patch i posted above, we have that signal and so things work
correctly.
-mike


* Re: [linux-pm] [uclinux-dist-devel] freezer: should barriers be smp?
  2011-04-13 22:11           ` [linux-pm] " Alan Stern
@ 2011-04-13 22:34             ` Rafael J. Wysocki
  2011-04-14 14:55               ` Alan Stern
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2011-04-13 22:34 UTC (permalink / raw)
  To: Alan Stern; +Cc: Mike Frysinger, uclinux-dist-devel, linux-pm, linux-kernel

On Thursday, April 14, 2011, Alan Stern wrote:
> On Wed, 13 Apr 2011, Rafael J. Wysocki wrote:
> 
> > The above means that smp_*mb() are defined as *mb() if CONFIG_SMP is set,
> > which basically means that *mb() are more restrictive than the corresponding
> > smp_*mb().  More precisely, they also cover the cases in which the CPU
> > reorders instructions on uniprocessor, which we definitely want to cover.
> > 
> > IOW, your patch would break things on uniprocessor where the CPU reorders
> > instructions.
> 
> How could anything break on a UP system?  CPUs don't reorder 
> instructions that drastically.  For example, no CPU will ever violate
> this assertion:
> 
> 	x = 0;
> 	y = x;
> 	x = 1;
> 	assert(y == 0);
> 
> even if it does reorder the second and third statements internally.  
> This is guaranteed by the C language specification.

Well, you conveniently removed the patch from your reply. :-)

For example, there's no reason why the CPU cannot reorder things so that
the "if (frozen(p))" is (speculatively) done before the "if (!freezing(p))"
if there's only a compiler barrier between them.

> > > Documentation/memory-barriers.txt:
> > > SMP memory barriers are reduced to compiler barriers on uniprocessor compiled
> > > systems because it is assumed that a CPU will appear to be self-consistent,
> > > and will order overlapping accesses correctly with respect to itself.
> > 
> > Exactly, which is not guaranteed in general (e.g. on Alpha).  That is, some
> > CPUs can reorder instructions in such a way that a compiler barrier is not
> > sufficient to prevent breakage.
> 
> I don't think this is right.  You _can_ assume that Alphas appear to be
> self-consistent.  If they didn't, you wouldn't be able to use them at
> all.

I'm quite convinced that the statement "some CPUs can reorder instructions in
such a way that a compiler barrier is not sufficient to prevent breakage" is
correct.

Thanks,
Rafael


* Re: [uclinux-dist-devel] freezer: should barriers be smp ?
  2011-04-13 22:22           ` [uclinux-dist-devel] freezer: should barriers be smp ? Mike Frysinger
@ 2011-04-13 22:49             ` Rafael J. Wysocki
  2011-04-13 22:53               ` Rafael J. Wysocki
  2011-04-13 22:57               ` Mike Frysinger
  0 siblings, 2 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2011-04-13 22:49 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: Pavel Machek, uclinux-dist-devel, linux-pm, linux-kernel

On Thursday, April 14, 2011, Mike Frysinger wrote:
> On Wed, Apr 13, 2011 at 17:53, Rafael J. Wysocki wrote:
> > On Wednesday, April 13, 2011, Mike Frysinger wrote:
> >> On Wed, Apr 13, 2011 at 17:05, Pavel Machek wrote:
> >> > On Wed 2011-04-13 17:02:45, Mike Frysinger wrote:
> >> >> On Wed, Apr 13, 2011 at 16:58, Rafael J. Wysocki wrote:
> >> >> > On Wednesday, April 13, 2011, Mike Frysinger wrote:
> >> >> >> when we suspend/resume Blackfin SMP systems, we notice that the
> >> >> >> freezer code runs on multiple cores.  this is of course what you want
> >> >> >> -- freeze processes in parallel.  however, the code only uses non-smp
> >> >> >> based barriers which causes us problems ... our cores need software
> >> >> >> support to keep caches in sync, so our smp barriers do just that.  but
> >> >> >> the non-smp barriers do not, and so the frozen/thawed processes
> >> >> >> randomly get stuck in the wrong task state.
> >> >> >>
> >> >> >> thinking about it, shouldnt the freezer code be using smp barriers ?
> >> >> >
> >> >> > Yes, it should, but rmb() and wmb() are supposed to be SMP barriers.
> >> >> >
> >> >> > Or do you mean something different?
> >> >>
> >> >> then what's the diff between smp_rmb() and rmb() ?
> >> >>
> >> >> this is what i'm proposing:
> >> >> --- a/kernel/freezer.c
> >> >> +++ b/kernel/freezer.c
> >> >> @@ -17,7 +17,7 @@ static inline void frozen_process(void)
> >> >>  {
> >> >>     if (!unlikely(current->flags & PF_NOFREEZE)) {
> >> >>         current->flags |= PF_FROZEN;
> >> >> -       wmb();
> >> >> +       smp_wmb();
> >> >>     }
> >> >>     clear_freeze_flag(current);
> >> >>  }
> >> >> @@ -93,7 +93,7 @@ bool freeze_task(struct task_struct *p, bool sig_only)
> >> >>      * the task as frozen and next clears its TIF_FREEZE.
> >> >>      */
> >> >>     if (!freezing(p)) {
> >> >> -       rmb();
> >> >> +       smp_rmb();
> >> >>         if (frozen(p))
> >> >>             return false;
> >> >
> >> > smp_rmb() is NOP on uniprocessor.
> >> >
> >> > I believe the code is correct as is.
> >>
> >> that isnt what the code / documentation says.  unless i'm reading them
> >> wrong, both seem to indicate that the proposed patch is what we
> >> actually want.
> >
> > Not really.
> >
> >> include/linux/compiler-gcc.h:
> >> #define barrier() __asm__ __volatile__("": : :"memory")
> >>
> >> include/asm-generic/system.h:
> >> #define mb()    asm volatile ("": : :"memory")
> >> #define rmb()   mb()
> >> #define wmb()   asm volatile ("": : :"memory")
> >>
> >> #ifdef CONFIG_SMP
> >> #define smp_mb()    mb()
> >> #define smp_rmb()   rmb()
> >> #define smp_wmb()   wmb()
> >> #else
> >> #define smp_mb()    barrier()
> >> #define smp_rmb()   barrier()
> >> #define smp_wmb()   barrier()
> >> #endif
> >
> > The above means that smp_*mb() are defined as *mb() if CONFIG_SMP is set,
> > which basically means that *mb() are more restrictive than the corresponding
> > smp_*mb().  More precisely, they also cover the cases in which the CPU
> > reorders instructions on uniprocessor, which we definitely want to cover.
> >
> > IOW, your patch would break things on uniprocessor where the CPU reorders
> > instructions.
> >
> >> Documentation/memory-barriers.txt:
> >> SMP memory barriers are reduced to compiler barriers on uniprocessor compiled
> >> systems because it is assumed that a CPU will appear to be self-consistent,
> >> and will order overlapping accesses correctly with respect to itself.
> >
> > Exactly, which is not guaranteed in general (e.g. on Alpha).  That is, some
> > CPUs can reorder instructions in such a way that a compiler barrier is not
> > sufficient to prevent breakage.
> >
> > The code _may_ be wrong for a different reason, though.  I need to check.
> 
> so the current code is protecting against a UP system swapping in/out
> freezer threads for processes, and the barriers are to make sure that
> the updated flags variable is posted by the time another swapped in
> thread gets to that point.

The existing memory barriers are SMP barriers too, but they are more than
_just_ SMP barriers.  At least that's how it is _supposed_ to be (eg.
rmb() is supposed to be stronger than smp_rmb()).

> i guess the trouble for us is that you have one CPU posting writes to
> task->flags (and doing so by grabbing the task's spinlock), but the
> other CPU is simply reading those flags.  there are no SMP barriers in
> between the read and write steps, nor is the reading CPU grabbing any
> locks which would be an implicit SMP barrier.  since the Blackfin SMP
> port lacks hardware cache coherency, there is no way for us to know
> "we've got to sync the caches before we can do this read".  by using
> the patch i posted above, we have that signal and so things work
> correctly.,

In theory I wouldn't expect the patch to work correctly, because it replaces
_stronger_ memory barriers with _weaker_ SMP barriers.  However, looking at
the blackfin's definitions of SMP barriers I see that it uses extra stuff that
should _also_ be used in the definitions of the mandatory barriers.

In my opinion this is an architecture problem, not a freezer code problem.

Thanks,
Rafael


* Re: [uclinux-dist-devel] freezer: should barriers be smp ?
  2011-04-13 22:49             ` Rafael J. Wysocki
@ 2011-04-13 22:53               ` Rafael J. Wysocki
  2011-04-13 22:57               ` Mike Frysinger
  1 sibling, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2011-04-13 22:53 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: Pavel Machek, uclinux-dist-devel, linux-pm, linux-kernel

On Thursday, April 14, 2011, Rafael J. Wysocki wrote:
> On Thursday, April 14, 2011, Mike Frysinger wrote:
> > On Wed, Apr 13, 2011 at 17:53, Rafael J. Wysocki wrote:
> > > On Wednesday, April 13, 2011, Mike Frysinger wrote:
> > >> On Wed, Apr 13, 2011 at 17:05, Pavel Machek wrote:
> > >> > On Wed 2011-04-13 17:02:45, Mike Frysinger wrote:
> > >> >> On Wed, Apr 13, 2011 at 16:58, Rafael J. Wysocki wrote:
> > >> >> > On Wednesday, April 13, 2011, Mike Frysinger wrote:
> > >> >> >> when we suspend/resume Blackfin SMP systems, we notice that the
> > >> >> >> freezer code runs on multiple cores.  this is of course what you want
> > >> >> >> -- freeze processes in parallel.  however, the code only uses non-smp
> > >> >> >> based barriers which causes us problems ... our cores need software
> > >> >> >> support to keep caches in sync, so our smp barriers do just that.  but
> > >> >> >> the non-smp barriers do not, and so the frozen/thawed processes
> > >> >> >> randomly get stuck in the wrong task state.
> > >> >> >>
> > >> >> >> thinking about it, shouldnt the freezer code be using smp barriers ?
> > >> >> >
> > >> >> > Yes, it should, but rmb() and wmb() are supposed to be SMP barriers.
> > >> >> >
> > >> >> > Or do you mean something different?
> > >> >>
> > >> >> then what's the diff between smp_rmb() and rmb() ?
> > >> >>
> > >> >> this is what i'm proposing:
> > >> >> --- a/kernel/freezer.c
> > >> >> +++ b/kernel/freezer.c
> > >> >> @@ -17,7 +17,7 @@ static inline void frozen_process(void)
> > >> >>  {
> > >> >>     if (!unlikely(current->flags & PF_NOFREEZE)) {
> > >> >>         current->flags |= PF_FROZEN;
> > >> >> -       wmb();
> > >> >> +       smp_wmb();
> > >> >>     }
> > >> >>     clear_freeze_flag(current);
> > >> >>  }
> > >> >> @@ -93,7 +93,7 @@ bool freeze_task(struct task_struct *p, bool sig_only)
> > >> >>      * the task as frozen and next clears its TIF_FREEZE.
> > >> >>      */
> > >> >>     if (!freezing(p)) {
> > >> >> -       rmb();
> > >> >> +       smp_rmb();
> > >> >>         if (frozen(p))
> > >> >>             return false;
> > >> >
> > >> > smp_rmb() is NOP on uniprocessor.
> > >> >
> > >> > I believe the code is correct as is.
> > >>
> > >> that isnt what the code / documentation says.  unless i'm reading them
> > >> wrong, both seem to indicate that the proposed patch is what we
> > >> actually want.
> > >
> > > Not really.
> > >
> > >> include/linux/compiler-gcc.h:
> > >> #define barrier() __asm__ __volatile__("": : :"memory")
> > >>
> > >> include/asm-generic/system.h:
> > >> #define mb()    asm volatile ("": : :"memory")
> > >> #define rmb()   mb()
> > >> #define wmb()   asm volatile ("": : :"memory")
> > >>
> > >> #ifdef CONFIG_SMP
> > >> #define smp_mb()    mb()
> > >> #define smp_rmb()   rmb()
> > >> #define smp_wmb()   wmb()
> > >> #else
> > >> #define smp_mb()    barrier()
> > >> #define smp_rmb()   barrier()
> > >> #define smp_wmb()   barrier()
> > >> #endif
> > >
> > > The above means that smp_*mb() are defined as *mb() if CONFIG_SMP is set,
> > > which basically means that *mb() are more restrictive than the corresponding
> > > smp_*mb().  More precisely, they also cover the cases in which the CPU
> > > reorders instructions on uniprocessor, which we definitely want to cover.
> > >
> > > IOW, your patch would break things on uniprocessor where the CPU reorders
> > > instructions.
> > >
> > >> Documentation/memory-barriers.txt:
> > >> SMP memory barriers are reduced to compiler barriers on uniprocessor compiled
> > >> systems because it is assumed that a CPU will appear to be self-consistent,
> > >> and will order overlapping accesses correctly with respect to itself.
> > >
> > > Exactly, which is not guaranteed in general (e.g. on Alpha).  That is, some
> > > CPUs can reorder instructions in such a way that a compiler barrier is not
> > > sufficient to prevent breakage.
> > >
> > > The code _may_ be wrong for a different reason, though.  I need to check.
> > 
> > so the current code is protecting against a UP system swapping in/out
> > freezer threads for processes, and the barriers are to make sure that
> > the updated flags variable is posted by the time another swapped in
> > thread gets to that point.
> 
> The existing memory barriers are SMP barriers too, but they are more than
> _just_ SMP barriers.  At least that's how it is _supposed_ to be (eg.
> rmb() is supposed to be stronger than smp_rmb()).
> 
> > i guess the trouble for us is that you have one CPU posting writes to
> > task->flags (and doing so by grabbing the task's spinlock), but the
> > other CPU is simply reading those flags.  there are no SMP barriers in
> > between the read and write steps, nor is the reading CPU grabbing any
> > locks which would be an implicit SMP barrier.  since the Blackfin SMP
> > port lacks hardware cache coherency, there is no way for us to know
> > "we've got to sync the caches before we can do this read".  by using
> > the patch i posted above, we have that signal and so things work
> > correctly.
> 
> In theory I wouldn't expect the patch to work correctly, because it replaces
> _stronger_ memory barriers with _weaker_ SMP barriers.  However, looking at
> the blackfin's definitions of SMP barriers I see that it uses extra stuff that
> should _also_ be used in the definitions of the mandatory barriers.
> 
> In my opinion this is an architecture problem, not a freezer code problem.

If I wasn't clear enough, which is very likely, a mandatory memory barrier
is supposed to imply a corresponding SMP barrier, although not necessarily
the other way around (eg. rmb() is supposed to imply smp_rmb() etc.).  This
doesn't seem to be the case on Blackfin, however.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: freezer: should barriers be smp ?
  2011-04-13 22:49             ` Rafael J. Wysocki
  2011-04-13 22:53               ` Rafael J. Wysocki
@ 2011-04-13 22:57               ` Mike Frysinger
  2011-04-13 23:12                 ` Rafael J. Wysocki
  2011-04-14 15:13                 ` [linux-pm] " Alan Stern
  1 sibling, 2 replies; 25+ messages in thread
From: Mike Frysinger @ 2011-04-13 22:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, uclinux-dist-devel, linux-pm, linux-kernel

On Wed, Apr 13, 2011 at 18:49, Rafael J. Wysocki wrote:
> On Thursday, April 14, 2011, Mike Frysinger wrote:
>> i guess the trouble for us is that you have one CPU posting writes to
>> task->flags (and doing so by grabbing the task's spinlock), but the
>> other CPU is simply reading those flags.  there are no SMP barriers in
>> between the read and write steps, nor is the reading CPU grabbing any
>> locks which would be an implicit SMP barrier.  since the Blackfin SMP
>> port lacks hardware cache coherency, there is no way for us to know
>> "we've got to sync the caches before we can do this read".  by using
>> the patch i posted above, we have that signal and so things work
>> correctly.
>
> In theory I wouldn't expect the patch to work correctly, because it replaces
> _stronger_ memory barriers with _weaker_ SMP barriers.  However, looking at
> the blackfin's definitions of SMP barriers I see that it uses extra stuff that
> should _also_ be used in the definitions of the mandatory barriers.
>
> In my opinion this is an architecture problem, not a freezer code problem.

OK, we have a patch pending locally which populates all barriers with
this logic, but based on my understanding of things, that didnt seem
correct.  i guess i'm reading too much into the names ... i'd expect
the opposite behavior where "rmb" is only for UP needs while "smp_rmb"
is a rmb which additionally covers SMP.
-mike

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: freezer: should barriers be smp ?
  2011-04-13 22:57               ` Mike Frysinger
@ 2011-04-13 23:12                 ` Rafael J. Wysocki
  2011-04-14 15:13                 ` [linux-pm] " Alan Stern
  1 sibling, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2011-04-13 23:12 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: Pavel Machek, uclinux-dist-devel, linux-pm, linux-kernel

On Thursday, April 14, 2011, Mike Frysinger wrote:
> On Wed, Apr 13, 2011 at 18:49, Rafael J. Wysocki wrote:
> > On Thursday, April 14, 2011, Mike Frysinger wrote:
> >> i guess the trouble for us is that you have one CPU posting writes to
> >> task->flags (and doing so by grabbing the task's spinlock), but the
> >> other CPU is simply reading those flags.  there are no SMP barriers in
> >> between the read and write steps, nor is the reading CPU grabbing any
> >> locks which would be an implicit SMP barrier.  since the Blackfin SMP
> >> port lacks hardware cache coherency, there is no way for us to know
> >> "we've got to sync the caches before we can do this read".  by using
> >> the patch i posted above, we have that signal and so things work
> >> correctly.
> >
> > In theory I wouldn't expect the patch to work correctly, because it replaces
> > _stronger_ memory barriers with _weaker_ SMP barriers.  However, looking at
> > the blackfin's definitions of SMP barriers I see that it uses extra stuff that
> > should _also_ be used in the definitions of the mandatory barriers.
> >
> > In my opinion this is an architecture problem, not a freezer code problem.
> 
> OK, we have a patch pending locally which populates all barriers with
> this logic, but based on my understanding of things, that didnt seem
> correct.  i guess i'm reading too much into the names ... i'd expect
> the opposite behavior where "rmb" is only for UP needs while "smp_rmb"
> is a rmb which additionally covers SMP.

Well, I guess the naming is for historical reasons, ie. mb(), rmb() and wmb()
were there first and it probably was regarded cleaner to use new names for the
optimized smp_ variants than to rename all instances already in the code and
then repurpose the old names.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [linux-pm] [uclinux-dist-devel] freezer: should barriers be smp?
  2011-04-13 22:34             ` [linux-pm] [uclinux-dist-devel] freezer: should barriers be smp? Rafael J. Wysocki
@ 2011-04-14 14:55               ` Alan Stern
  2011-04-14 22:34                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Alan Stern @ 2011-04-14 14:55 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Mike Frysinger, uclinux-dist-devel, linux-pm, linux-kernel

On Thu, 14 Apr 2011, Rafael J. Wysocki wrote:

> On Thursday, April 14, 2011, Alan Stern wrote:
> > On Wed, 13 Apr 2011, Rafael J. Wysocki wrote:
> > 
> > > The above means that smp_*mb() are defined as *mb() if CONFIG_SMP is set,
> > > which basically means that *mb() are more restrictive than the corresponding
> > > smp_*mb().  More precisely, they also cover the cases in which the CPU
> > > reorders instructions on uniprocessor, which we definitely want to cover.
> > > 
> > > IOW, your patch would break things on uniprocessor where the CPU reorders
> > > instructions.
> > 
> > How could anything break on a UP system?  CPUs don't reorder 
> > instructions that drastically.  For example, no CPU will ever violate
> > this assertion:
> > 
> > 	x = 0;
> > 	y = x;
> > 	x = 1;
> > 	assert(y == 0);
> > 
> > even if it does reorder the second and third statements internally.  
> > This is guaranteed by the C language specification.
> 
> Well, you conveniently removed the patch from your reply. :-)

All the patch does is replace an instance of wmb() with smp_wmb() and 
an instance of rmb() with smp_rmb().

> For example, there's no reason why the CPU cannot reorder things so that
> the "if (frozen(p))" is (speculatively) done before the "if (!freezing(p))"
> if there's only a compiler barrier between them.

That's true.  On an SMP system, smp_wmb() is identical to wmb(), so
there will be a true memory barrier when it is needed.  On a UP system,
reordering the instructions in this way will not change the final
result -- in particular, it won't break anything.

In your example, the two tests look at different flags in *p.  
Speculative reordering of the tests won't make any difference unless
one of the flags gets changed in between.  On a UP system, the only way
the flag can be changed is for the CPU to change it, in which case
the CPU would obviously know that the speculative result had to be
invalidated.

> > > > Documentation/memory-barriers.txt:
> > > > SMP memory barriers are reduced to compiler barriers on uniprocessor compiled
> > > > systems because it is assumed that a CPU will appear to be self-consistent,
> > > > and will order overlapping accesses correctly with respect to itself.
> > > 
> > > Exactly, which is not guaranteed in general (e.g. on Alpha).  That is, some
> > > CPUs can reorder instructions in such a way that a compiler barrier is not
> > > sufficient to prevent breakage.
> > 
> > I don't think this is right.  You _can_ assume that Alphas appear to be
> > self-consistent.  If they didn't, you wouldn't be able to use them at
> > all.
> 
> I'm quite convinced that the statement "some CPUs can reorder instructions in
> such a way that a compiler barrier is not sufficient to prevent breakage" is
> correct.

No.  The correct statement is "Some CPUs can reorder instructions in 
such a way that a compiler barrier is not sufficient to prevent 
breakage on SMP systems."

Just for kicks...  Which was added to the kernel first: SMP support or 
memory barriers?  I don't know the answer; it would take a fair amount 
of digging to find out.

Alan Stern


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [linux-pm] freezer: should barriers be smp ?
  2011-04-13 22:57               ` Mike Frysinger
  2011-04-13 23:12                 ` Rafael J. Wysocki
@ 2011-04-14 15:13                 ` Alan Stern
  2011-04-14 22:40                   ` Rafael J. Wysocki
  1 sibling, 1 reply; 25+ messages in thread
From: Alan Stern @ 2011-04-14 15:13 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Rafael J. Wysocki, uclinux-dist-devel, linux-pm, linux-kernel

On Wed, 13 Apr 2011, Mike Frysinger wrote:

> > On Wed, Apr 13, 2011 at 18:49, Rafael J. Wysocki wrote:

> > In my opinion this is an architecture problem, not a freezer code problem.
> 
> OK, we have a patch pending locally which populates all barriers with
> this logic, but based on my understanding of things, that didnt seem
> correct.  i guess i'm reading too much into the names ... i'd expect
> the opposite behavior where "rmb" is only for UP needs while "smp_rmb"
> is a rmb which additionally covers SMP.

You are misinterpreting the names and the concepts, both.

First, you need to understand that memory barriers are needed only for
purposes of synchronizing between two different entities capable of
accessing memory (obviously it's not necessary to synchronize an entity
with itself).  One of those entities is always a CPU, of course; the
other entity could be a DMA-capable device or it could be another CPU.

A device driver might need to use memory barriers even on a UP
platform, because it might need to synchronize the CPU with the device
it is driving.

But core kernel code is concerned only with CPUs.  Therefore on UP 
systems, core kernel code (such as the freezer) never needs to use 
memory barriers.

That's the difference between rmb() and smp_rmb().  rmb() _always_ 
generates a memory barrier, so it should be used only in device 
drivers.  smp_rmb() generates a memory barrier only if CONFIG_SMP is 
enabled; otherwise it merely generates a compiler barrier.

In the freezer, there is no reason to use rmb() and wmb().  It should 
use smp_rmb() and smp_wmb().

Alan Stern


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [linux-pm] [uclinux-dist-devel] freezer: should barriers be smp?
  2011-04-14 14:55               ` Alan Stern
@ 2011-04-14 22:34                 ` Rafael J. Wysocki
  2011-04-15 14:32                   ` Alan Stern
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2011-04-14 22:34 UTC (permalink / raw)
  To: Alan Stern; +Cc: Mike Frysinger, uclinux-dist-devel, linux-pm, linux-kernel

On Thursday, April 14, 2011, Alan Stern wrote:
> On Thu, 14 Apr 2011, Rafael J. Wysocki wrote:
> 
> > On Thursday, April 14, 2011, Alan Stern wrote:
> > > On Wed, 13 Apr 2011, Rafael J. Wysocki wrote:
> > > 
> > > > The above means that smp_*mb() are defined as *mb() if CONFIG_SMP is set,
> > > > which basically means that *mb() are more restrictive than the corresponding
> > > > smp_*mb().  More precisely, they also cover the cases in which the CPU
> > > > reorders instructions on uniprocessor, which we definitely want to cover.
> > > > 
> > > > IOW, your patch would break things on uniprocessor where the CPU reorders
> > > > instructions.
> > > 
> > > How could anything break on a UP system?  CPUs don't reorder 
> > > instructions that drastically.  For example, no CPU will ever violate
> > > this assertion:
> > > 
> > > 	x = 0;
> > > 	y = x;
> > > 	x = 1;
> > > 	assert(y == 0);
> > > 
> > > even if it does reorder the second and third statements internally.  
> > > This is guaranteed by the C language specification.
> > 
> > Well, you conveniently removed the patch from your reply. :-)
> 
> All the patch does is replace an instance of wmb() with smp_wmb() and 
> an instance of rmb() with smp_rmb().
> 
> > For example, there's no reason why the CPU cannot reorder things so that
> > the "if (frozen(p))" is (speculatively) done before the "if (!freezing(p))"
> > if there's only a compiler barrier between them.
> 
> That's true.  On an SMP system, smp_wmb() is identical to wmb(), so
> there will be a true memory barrier when it is needed.  On a UP system,
> reordering the instructions in this way will not change the final
> result -- in particular, it won't break anything.
> 
> In your example, the two tests look at different flags in *p.  
> Speculative reordering of the tests won't make any difference unless
> one of the flags gets changed in between.  On a UP system, the only way
> the flag can be changed is for the CPU to change it, in which case
> the CPU would obviously know that the speculative result had to be
> invalidated.

Note, however, that preemption may happen basically at any time, so the
task that executes the two "if" statements can be preempted after it has
loaded p->flags into a register and before it checks the TIF_FREEZE (if
they are reordered).  In that case the p->flags (in memory) may be
changed by another task in the meantime.

> > > > > Documentation/memory-barriers.txt:
> > > > > SMP memory barriers are reduced to compiler barriers on uniprocessor compiled
> > > > > systems because it is assumed that a CPU will appear to be self-consistent,
> > > > > and will order overlapping accesses correctly with respect to itself.
> > > > 
> > > > Exactly, which is not guaranteed in general (e.g. on Alpha).  That is, some
> > > > CPUs can reorder instructions in such a way that a compiler barrier is not
> > > > sufficient to prevent breakage.
> > > 
> > > I don't think this is right.  You _can_ assume that Alphas appear to be
> > > self-consistent.  If they didn't, you wouldn't be able to use them at
> > > all.
> > 
> > I'm quite convinced that the statement "some CPUs can reorder instructions in
> > such a way that a compiler barrier is not sufficient to prevent breakage" is
> > correct.
> 
> No.  The correct statement is "Some CPUs can reorder instructions in 
> such a way that a compiler barrier is not sufficient to prevent 
> breakage on SMP systems."

That's if preemption is not taken into account.

> Just for kicks...  Which was added to the kernel first: SMP support or 
> memory barriers?  I don't know the answer; it would take a fair amount 
> of digging to find out.

I have no idea. :-)

Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [linux-pm] freezer: should barriers be smp ?
  2011-04-14 15:13                 ` [linux-pm] " Alan Stern
@ 2011-04-14 22:40                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2011-04-14 22:40 UTC (permalink / raw)
  To: Alan Stern; +Cc: Mike Frysinger, uclinux-dist-devel, linux-pm, linux-kernel

On Thursday, April 14, 2011, Alan Stern wrote:
> On Wed, 13 Apr 2011, Mike Frysinger wrote:
> 
> > > On Wed, Apr 13, 2011 at 18:49, Rafael J. Wysocki wrote:
> 
> > > In my opinion this is an architecture problem, not a freezer code problem.
> > 
> > OK, we have a patch pending locally which populates all barriers with
> > this logic, but based on my understanding of things, that didnt seem
> > correct.  i guess i'm reading too much into the names ... i'd expect
> > the opposite behavior where "rmb" is only for UP needs while "smp_rmb"
> > is a rmb which additionally covers SMP.
> 
> You are misinterpreting the names and the concepts, both.
> 
> First, you need to understand that memory barriers are needed only for
> purposes of synchronizing between two different entities capable of
> accessing memory (obviously it's not necessary to synchronize an entity
> with itself).  One of those entities is always a CPU, of course; the
> other entity could be a DMA-capable device or it could be another CPU.
> 
> A device driver might need to use memory barriers even on a UP
> platform, because it might need to synchronize the CPU with the device
> it is driving.
> 
> But core kernel code is concerned only with CPUs.  Therefore on UP 
> systems, core kernel code (such as the freezer) never needs to use 
> memory barriers.
> 
> That's the difference between rmb() and smp_rmb().  rmb() _always_ 
> generates a memory barrier, so it should be used only in device 
> drivers.  smp_rmb() generates a memory barrier only if CONFIG_SMP is 
> enabled; otherwise it merely generates a compiler barrier.
> 
> In the freezer, there is no reason to use rmb() and wmb().  It should 
> use smp_rmb() and smp_wmb().

OK, I think you're right, but that's because rmb() and wmb() cause too much
unnecessary overhead, not because the current code is incorrect.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [linux-pm] [uclinux-dist-devel] freezer: should barriers be smp?
  2011-04-14 22:34                 ` Rafael J. Wysocki
@ 2011-04-15 14:32                   ` Alan Stern
  0 siblings, 0 replies; 25+ messages in thread
From: Alan Stern @ 2011-04-15 14:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Mike Frysinger, uclinux-dist-devel, linux-pm, linux-kernel

On Fri, 15 Apr 2011, Rafael J. Wysocki wrote:

> > > For example, there's no reason why the CPU cannot reorder things so that
> > > the "if (frozen(p))" is (speculatively) done before the "if (!freezing(p))"
> > > if there's only a compiler barrier between them.
> > 
> > That's true.  On an SMP system, smp_wmb() is identical to wmb(), so
> > there will be a true memory barrier when it is needed.  On a UP system,
> > reordering the instructions in this way will not change the final
> > result -- in particular, it won't break anything.
> > 
> > In your example, the two tests look at different flags in *p.  
> > Speculative reordering of the tests won't make any difference unless
> > one of the flags gets changed in between.  On a UP system, the only way
> > the flag can be changed is for the CPU to change it, in which case
> > the CPU would obviously know that the speculative result had to be
> > invalidated.
> 
> Note, however, that preemption may happen basically at any time, so the
> task that executes the two "if" statements can be preempted after it has
> loaded p->flags into a register and before it checks the TIF_FREEZE (if
> they are reordered).  In that case the p->flags (in memory) may be
> changed by another task in the meantime.

That's okay, because on a UP system there's only one CPU involved.  

When the other task changes p->flags, the CPU will realize that the
register containing the speculative p->flags value is now invalid.  
This will force it to re-read p->flags before testing TIF_FREEZE.

In principle, the sequence of events is no different from single-line
code doing:

	p->a = 1;
	if (p->b)
		...
	if (p->a)
		...

No CPU is ever going to mess this up, even if it does speculatively
load p->a before executing any of the other statements.

A problem can arise only when the "p->a = 1" statement is executed by a 
_different_ CPU.  In that case the original CPU has no way to tell that 
the speculative value is invalid, so a memory barrier is needed to 
prevent the speculative load.

> > > I'm quite convinced that the statement "some CPUs can reorder instructions in
> > > such a way that a compiler barrier is not sufficient to prevent breakage" is
> > > correct.
> > 
> > No.  The correct statement is "Some CPUs can reorder instructions in 
> > such a way that a compiler barrier is not sufficient to prevent 
> > breakage on SMP systems."
> 
> That's if preemption is not taken into account.

It's true even with preemption.

Besides, if this helps to convince you, recall that the task-switch
code (i.e., the scheduler) contains its own memory barriers.  If one
task gets preempted by another, these barriers will defeat instruction
reordering across the preemption boundary.

Alan Stern


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [linux-pm] [uclinux-dist-devel] freezer: should barriers be smp ?
  2011-04-13 22:04         ` [linux-pm] [uclinux-dist-devel] " Alan Stern
@ 2011-04-15 16:29           ` Pavel Machek
  2011-04-15 16:33             ` [uclinux-dist-devel] [linux-pm] " Mike Frysinger
  0 siblings, 1 reply; 25+ messages in thread
From: Pavel Machek @ 2011-04-15 16:29 UTC (permalink / raw)
  To: Alan Stern; +Cc: Mike Frysinger, uclinux-dist-devel, linux-pm, linux-kernel

Hi!

> > > I believe the code is correct as is.
> > 
> > that isnt what the code / documentation says.  unless i'm reading them
> > wrong, both seem to indicate that the proposed patch is what we
> > actually want.
> 
> The existing code is correct but it isn't optimal.
> 
> wmb() and rmb() are heavy-duty operations, and you don't want to call
> them when they aren't needed.  That's exactly what smp_wmb() and
> smp_rmb() are for -- they call wmb() and rmb(), but only in SMP 
> kernels.
> 
> Unless you need to synchronize with another processor (not necessarily 
> a CPU, it could be something embedded within a device), you should 
> always use smp_wmb() and smp_rmb() rather than wmb() and rmb().

Maybe; but this code is not performance critical and I believe being
obvious here is better...
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [uclinux-dist-devel] [linux-pm] freezer: should barriers be smp ?
  2011-04-15 16:29           ` Pavel Machek
@ 2011-04-15 16:33             ` Mike Frysinger
  2011-04-15 16:57               ` Pavel Machek
  2011-04-15 23:11               ` Rafael J. Wysocki
  0 siblings, 2 replies; 25+ messages in thread
From: Mike Frysinger @ 2011-04-15 16:33 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Alan Stern, uclinux-dist-devel, linux-pm, linux-kernel

On Fri, Apr 15, 2011 at 12:29, Pavel Machek wrote:
>> > > I believe the code is correct as is.
>> >
>> > that isnt what the code / documentation says.  unless i'm reading them
>> > wrong, both seem to indicate that the proposed patch is what we
>> > actually want.
>>
>> The existing code is correct but it isn't optimal.
>>
>> wmb() and rmb() are heavy-duty operations, and you don't want to call
>> them when they aren't needed.  That's exactly what smp_wmb() and
>> smp_rmb() are for -- they call wmb() and rmb(), but only in SMP
>> kernels.
>>
>> Unless you need to synchronize with another processor (not necessarily
>> a CPU, it could be something embedded within a device), you should
>> always use smp_wmb() and smp_rmb() rather than wmb() and rmb().
>
> Maybe; but this code is not performance critical and I believe being
> obvious here is better...

isnt it though ?  especially when we talk about suspending/resuming on
embedded systems to get more savings over just cpu idle ?  we want
that latency to be as low as possible.
-mike

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [uclinux-dist-devel] [linux-pm] freezer: should barriers be smp ?
  2011-04-15 16:33             ` [uclinux-dist-devel] [linux-pm] " Mike Frysinger
@ 2011-04-15 16:57               ` Pavel Machek
  2011-04-15 23:11               ` Rafael J. Wysocki
  1 sibling, 0 replies; 25+ messages in thread
From: Pavel Machek @ 2011-04-15 16:57 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: Alan Stern, uclinux-dist-devel, linux-pm, linux-kernel

> >> The existing code is correct but it isn't optimal.
> >>
> >> wmb() and rmb() are heavy-duty operations, and you don't want to call
> >> them when they aren't needed.  That's exactly what smp_wmb() and
> >> smp_rmb() are for -- they call wmb() and rmb(), but only in SMP
> >> kernels.
> >>
> >> Unless you need to synchronize with another processor (not necessarily
> >> a CPU, it could be something embedded within a device), you should
> >> always use smp_wmb() and smp_rmb() rather than wmb() and rmb().
> >
> > Maybe; but this code is not performance critical and I believe being
> > obvious here is better...
> 
> isnt it though ?  especially when we talk about suspending/resuming on
> embedded systems to get more savings over just cpu idle ?  we want
> that latency to be as low as possible.

Feel free to measure the difference. I bet it is lost in noise,
freezer just is not _that_ hot path.

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [uclinux-dist-devel] [linux-pm] freezer: should barriers be smp ?
  2011-04-15 16:33             ` [uclinux-dist-devel] [linux-pm] " Mike Frysinger
  2011-04-15 16:57               ` Pavel Machek
@ 2011-04-15 23:11               ` Rafael J. Wysocki
  2011-04-15 23:24                 ` Mike Frysinger
  1 sibling, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2011-04-15 23:11 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Pavel Machek, Alan Stern, uclinux-dist-devel, linux-pm, linux-kernel

On Friday, April 15, 2011, Mike Frysinger wrote:
> On Fri, Apr 15, 2011 at 12:29, Pavel Machek wrote:
> >> > > I believe the code is correct as is.
> >> >
> >> > that isnt what the code / documentation says.  unless i'm reading them
> >> > wrong, both seem to indicate that the proposed patch is what we
> >> > actually want.
> >>
> >> The existing code is correct but it isn't optimal.
> >>
> >> wmb() and rmb() are heavy-duty operations, and you don't want to call
> >> them when they aren't needed.  That's exactly what smp_wmb() and
> >> smp_rmb() are for -- they call wmb() and rmb(), but only in SMP
> >> kernels.
> >>
> >> Unless you need to synchronize with another processor (not necessarily
> >> a CPU, it could be something embedded within a device), you should
> >> always use smp_wmb() and smp_rmb() rather than wmb() and rmb().
> >
> > Maybe; but this code is not performance critical and I believe being
> > obvious here is better...
> 
> isnt it though ?  especially when we talk about suspending/resuming on
> embedded systems to get more savings over just cpu idle ?  we want
> that latency to be as low as possible.

I agree, we can switch the freezer to smp_ barriers, but not for the reason
you gave before. :-)

Care to repost the patch with a suitable changelog?

Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [uclinux-dist-devel] [linux-pm] freezer: should barriers be smp ?
  2011-04-15 23:11               ` Rafael J. Wysocki
@ 2011-04-15 23:24                 ` Mike Frysinger
  2011-04-15 23:30                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Mike Frysinger @ 2011-04-15 23:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Alan Stern, uclinux-dist-devel, linux-pm, linux-kernel

On Fri, Apr 15, 2011 at 19:11, Rafael J. Wysocki wrote:
> On Friday, April 15, 2011, Mike Frysinger wrote:
>> On Fri, Apr 15, 2011 at 12:29, Pavel Machek wrote:
>> >> > > I believe the code is correct as is.
>> >> >
>> >> > that isnt what the code / documentation says.  unless i'm reading them
>> >> > wrong, both seem to indicate that the proposed patch is what we
>> >> > actually want.
>> >>
>> >> The existing code is correct but it isn't optimal.
>> >>
>> >> wmb() and rmb() are heavy-duty operations, and you don't want to call
>> >> them when they aren't needed.  That's exactly what smp_wmb() and
>> >> smp_rmb() are for -- they call wmb() and rmb(), but only in SMP
>> >> kernels.
>> >>
>> >> Unless you need to synchronize with another processor (not necessarily
>> >> a CPU, it could be something embedded within a device), you should
>> >> always use smp_wmb() and smp_rmb() rather than wmb() and rmb().
>> >
>> > Maybe; but this code is not performance critical and I believe being
>> > obvious here is better...
>>
>> isnt it though ?  especially when we talk about suspending/resuming on
>> embedded systems to get more savings over just cpu idle ?  we want
>> that latency to be as low as possible.
>
> I agree, we can switch the freezer to smp_ barriers, but not for the reason
> you gave before. :-)
>
> Care to repost the patch with a suitable changelog?

np

to be clear, what you said wrt the Blackfin smp barriers still holds
true right ?  so this changeset i merged doesnt need any tweaking ...
http://git.kernel.org/?p=linux/kernel/git/vapier/blackfin.git;a=commitdiff;h=943aee0c685d0563228d5a2ad9c8394ad0300fb5
-mike

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [uclinux-dist-devel] [linux-pm] freezer: should barriers be smp ?
  2011-04-15 23:24                 ` Mike Frysinger
@ 2011-04-15 23:30                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2011-04-15 23:30 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Pavel Machek, Alan Stern, uclinux-dist-devel, linux-pm, linux-kernel

On Saturday, April 16, 2011, Mike Frysinger wrote:
> On Fri, Apr 15, 2011 at 19:11, Rafael J. Wysocki wrote:
> > On Friday, April 15, 2011, Mike Frysinger wrote:
> >> On Fri, Apr 15, 2011 at 12:29, Pavel Machek wrote:
> >> >> > > I believe the code is correct as is.
> >> >> >
> >> >> > that isnt what the code / documentation says.  unless i'm reading them
> >> >> > wrong, both seem to indicate that the proposed patch is what we
> >> >> > actually want.
> >> >>
> >> >> The existing code is correct but it isn't optimal.
> >> >>
> >> >> wmb() and rmb() are heavy-duty operations, and you don't want to call
> >> >> them when they aren't needed.  That's exactly what smp_wmb() and
> >> >> smp_rmb() are for -- they call wmb() and rmb(), but only in SMP
> >> >> kernels.
> >> >>
> >> >> Unless you need to synchronize with another processor (not necessarily
> >> >> a CPU, it could be something embedded within a device), you should
> >> >> always use smp_wmb() and smp_rmb() rather than wmb() and rmb().
> >> >
> >> > Maybe; but this code is not performance critical and I believe being
> >> > obvious here is better...
> >>
> >> isnt it though ?  especially when we talk about suspending/resuming on
> >> embedded systems to get more savings over just cpu idle ?  we want
> >> that latency to be as low as possible.
> >
> > I agree, we can switch the freezer to smp_ barriers, but not for the reason
> > you gave before. :-)
> >
> > Care to repost the patch with a suitable changelog?
> 
> np
> 
> to be clear, what you said wrt the Blackfin smp barriers still holds
> true right ?

I think so.

> so this changeset i merged doesnt need any tweaking ...
> http://git.kernel.org/?p=linux/kernel/git/vapier/blackfin.git;a=commitdiff;h=943aee0c685d0563228d5a2ad9c8394ad0300fb5

Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2011-04-15 23:30 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-04-13  6:14 freezer: should barriers be smp ? Mike Frysinger
2011-04-13 20:58 ` Rafael J. Wysocki
2011-04-13 21:02   ` Mike Frysinger
2011-04-13 21:05     ` Pavel Machek
2011-04-13 21:11       ` [uclinux-dist-devel] " Mike Frysinger
2011-04-13 21:53         ` Rafael J. Wysocki
2011-04-13 22:11           ` [linux-pm] " Alan Stern
2011-04-13 22:34             ` [linux-pm] [uclinux-dist-devel] freezer: should barriers be smp? Rafael J. Wysocki
2011-04-14 14:55               ` Alan Stern
2011-04-14 22:34                 ` Rafael J. Wysocki
2011-04-15 14:32                   ` Alan Stern
2011-04-13 22:22           ` [uclinux-dist-devel] freezer: should barriers be smp ? Mike Frysinger
2011-04-13 22:49             ` Rafael J. Wysocki
2011-04-13 22:53               ` Rafael J. Wysocki
2011-04-13 22:57               ` Mike Frysinger
2011-04-13 23:12                 ` Rafael J. Wysocki
2011-04-14 15:13                 ` [linux-pm] " Alan Stern
2011-04-14 22:40                   ` Rafael J. Wysocki
2011-04-13 22:04         ` [linux-pm] [uclinux-dist-devel] " Alan Stern
2011-04-15 16:29           ` Pavel Machek
2011-04-15 16:33             ` [uclinux-dist-devel] [linux-pm] " Mike Frysinger
2011-04-15 16:57               ` Pavel Machek
2011-04-15 23:11               ` Rafael J. Wysocki
2011-04-15 23:24                 ` Mike Frysinger
2011-04-15 23:30                   ` Rafael J. Wysocki
