* Documentation/memory-barriers.txt
@ 2011-09-12 14:15 Benjamin Poirier
From: Benjamin Poirier @ 2011-09-12 14:15 UTC
  To: David Howells, Paul E. McKenney; +Cc: linux-kernel

Hello David, Paul,

Thank you for this great piece on memory barriers. I think it made a
complex topic approachable. I have two questions:
1)
I had a hard time understanding the second part of the example in the
section "Sleep and wake-up functions".

> 	set_current_state(TASK_INTERRUPTIBLE);
> 	if (event_indicated)
> 		break;
> 	__set_current_state(TASK_RUNNING);
> 	do_something(my_data);

I understand the need for memory barriers, but I don't understand what
the code is trying to achieve.
Where have the for (;;) loop and the schedule() call gone?

> 	set_current_state(TASK_INTERRUPTIBLE);
> 	if (event_indicated) {
> 		smp_rmb();
> 		do_something(my_data);
> 	}

Isn't a break; missing here? How come do_something() has moved inside
the condition?

I'm thinking these final example code bits should look like this
(without and with the smp_rmb), no?:

for (;;) {
	set_current_state(TASK_INTERRUPTIBLE);
	if (event_indicated) {
		smp_rmb();
		do_something(my_data);
		break;
	}
	schedule();
}
__set_current_state(TASK_RUNNING);

2)
On a more general note, why is there a read_barrier_depends() but not a
write_barrier_depends()?

l=7
"write_barrier_depends()"
g=&l

---

l=g
read_barrier_depends()
t=*l

Most processors do not reorder dependent loads but do reorder loads
after loads. I'm guessing there's no processor that does not reorder
dependent stores but that does reorder stores after stores. So there's
no point in having write_barrier_depends(), it would always be defined
to wmb()?

Thanks,
-Ben


* Re: Documentation/memory-barriers.txt
@ 2011-09-12 16:33 ` Paul E. McKenney
From: Paul E. McKenney @ 2011-09-12 16:33 UTC
  To: Benjamin Poirier; +Cc: David Howells, linux-kernel

On Mon, Sep 12, 2011 at 10:15:20AM -0400, Benjamin Poirier wrote:
> Hello David, Paul,
> 
> Thank you for this great piece on memory barriers. I think it made a
> complex topic approachable. I have two questions:
> 1)
> I had a hard time understanding the second part of the example in the
> section "Sleep and wake-up functions".
> 
> > 	set_current_state(TASK_INTERRUPTIBLE);
> > 	if (event_indicated)
> > 		break;
> > 	__set_current_state(TASK_RUNNING);
> > 	do_something(my_data);
> 
> I understand the need for memory barriers, but I don't understand what
> the code is trying to achieve.
> Where have the for (;;) loop and the schedule() call gone?

This is a discussion of memory barriers, which handle communication
between multiple CPUs.  So, yes, in many cases the for-loop is required,
but the actual communication will occur on a particular iteration of
the for-loop.  But there are other use cases, for example, involving
prepare_to_wait()/schedule()/finish_wait(), that do not need an enclosing
loop.  See for example the use in bsg_io_schedule().

Nevertheless, all of the wait/wakeup examples need to enforce proper
memory ordering, and we therefore take the least common denominator
for the more detailed examples.

> > 	set_current_state(TASK_INTERRUPTIBLE);
> > 	if (event_indicated) {
> > 		smp_rmb();
> > 		do_something(my_data);
> > 	}
> 
> Isn't a break; missing here? How come do_something() has moved inside
> the condition?

Again, it depends on the enclosing use case.  Keep in mind that even
in cases involving a loop, there is only one pass through the loop that
actually does anything.

> I'm thinking these final example code bits should look like this
> (without and with the smp_rmb), no?:
> 
> for (;;) {
> 	set_current_state(TASK_INTERRUPTIBLE);
> 	if (event_indicated) {
> 		smp_rmb();
> 		do_something(my_data);
> 		break;
> 	}
> 	schedule();
> }
> __set_current_state(TASK_RUNNING);

This example would be correct for a looping case, but is more ornate than
required for illustrating the effects of memory barriers.  So we took the
simpler case without the loop.
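
For experimenting outside the kernel, the looping form can be approximated
in userspace with C11 atomics and pthreads.  This is only a sketch under
stated assumptions: sched_yield() stands in for schedule(), an acquire fence
stands in for smp_rmb(), a release store stands in for the write barrier on
the waker side, and run_wait_demo() is a hypothetical helper name.  It does
not model task states or the kernel's sleep/wake-up machinery.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins for the names used in the thread. */
static int my_data;                 /* payload written before the flag */
static atomic_int event_indicated;  /* flag polled by the sleeper */

/* Waker: store the data, then publish the flag.  The release store
 * orders the my_data write before the flag becomes visible. */
static void *waker(void *arg)
{
	(void)arg;
	my_data = 42;
	atomic_store_explicit(&event_indicated, 1, memory_order_release);
	return NULL;
}

/* Sleeper, looping form: only the final pass through the loop does work. */
static int sleeper(void)
{
	int seen;

	for (;;) {
		if (atomic_load_explicit(&event_indicated,
					 memory_order_relaxed)) {
			/* Analogue of smp_rmb(): order the flag load
			 * before the data load. */
			atomic_thread_fence(memory_order_acquire);
			seen = my_data;	/* analogue of do_something(my_data) */
			break;
		}
		sched_yield();	/* assumed stand-in for schedule() */
	}
	return seen;
}

/* Runs one waker against one sleeper; returns what the sleeper read. */
int run_wait_demo(void)
{
	pthread_t t;
	int rc, seen;

	my_data = 0;
	atomic_store(&event_indicated, 0);
	rc = pthread_create(&t, NULL, waker, NULL);
	assert(rc == 0);	/* sketch-level error handling */
	(void)rc;
	seen = sleeper();
	pthread_join(t, NULL);
	return seen;
}
```

Calling run_wait_demo() returns 42: once the sleeper sees the flag, the
fence guarantees it also sees the data stored before the flag was set.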

> 2)
> On a more general note, why is there a read_barrier_depends() but not a
> write_barrier_depends()?

You use rcu_assign_pointer() for write_barrier_depends().  An alternative,
extremely expensive definition of write_barrier_depends() is to force
a memory barrier on all CPUs.  This was debated quite some time ago and
was rejected.

However, you can get this effect by calling one of the synchronize_rcu()
or synchronize_rcu_expedited() family of functions.  Please be aware that
synchronize_rcu() will impose several milliseconds of latency but minimal
CPU overhead, while synchronize_rcu_expedited() will impose only a few
tens of microseconds of latency, but will IPI each and every CPU.  So
both of these are expensive in different ways.

Another way to get this effect is to use smp_call_function().  Like
synchronize_rcu_expedited(), this will IPI each and every CPU.

But before going down any of these paths other than rcu_assign_pointer(),
you really need to look very carefully at why you need smp_mb() on each
and every CPU.  Normally, this is a way bigger hammer than you need.

To reiterate, if you think you need write_barrier_depends(), please
carefully revisit your design.  The odds are that you really do not
need it.

> l=7
> "write_barrier_depends()"
> g=&l
> 
> ---
> 
> l=g
> read_barrier_depends()
> t=*l
> 
> Most processors do not reorder dependent loads but do reorder loads
> after loads. I'm guessing there's no processor that does not reorder
> dependent stores but that does reorder stores after stores. So there's
> no point in having write_barrier_depends(), it would always be defined
> to wmb()?

Yes, exactly -- rcu_assign_pointer() does use smp_wmb().

The only CPU that could make good generic use of write_barrier_depends()
is DEC Alpha.  What we do instead is make Alpha's read_barrier_depends()
use smp_rmb().

								Thanx, Paul
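
The l=7 / g=&l example from the question can be tried in userspace with
C11 atomics.  This is a sketch, not the kernel implementation: the release
store is an assumed analogue of the smp_wmb() that rcu_assign_pointer()
uses, the acquire load is a stronger substitute for the dependency ordering
that read_barrier_depends() provides, and run_publish_demo() is a
hypothetical helper name.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

static int l;             /* payload: "l = 7" in the question */
static _Atomic(int *) g;  /* published pointer: "g = &l" */

/* Writer: initialize the payload, then publish a pointer to it. */
static void *publisher(void *arg)
{
	(void)arg;
	l = 7;
	/* Release store: the payload write is ordered before the
	 * pointer becomes visible to readers. */
	atomic_store_explicit(&g, &l, memory_order_release);
	return NULL;
}

/* Reader: load the pointer, then dereference it. */
static int subscriber(void)
{
	int *p;

	/* Acquire load: stronger than a data-dependency barrier, used
	 * here because it is portable C11. */
	while ((p = atomic_load_explicit(&g, memory_order_acquire)) == NULL)
		sched_yield();	/* wait until the pointer is published */
	return *p;
}

/* Runs one publisher against one subscriber; returns the value read. */
int run_publish_demo(void)
{
	pthread_t t;
	int rc, val;

	l = 0;
	atomic_store(&g, NULL);
	rc = pthread_create(&t, NULL, publisher, NULL);
	assert(rc == 0);	/* sketch-level error handling */
	(void)rc;
	val = subscriber();
	pthread_join(t, NULL);
	return val;
}
```

Calling run_publish_demo() returns 7: a reader that observes the non-NULL
pointer is guaranteed to observe the initialized payload behind it.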



* Re: Documentation/memory-barriers.txt
@ 2011-09-12 23:56   ` Benjamin Poirier
From: Benjamin Poirier @ 2011-09-12 23:56 UTC
  To: Paul E. McKenney; +Cc: David Howells, linux-kernel

On 11-09-12 09:33, Paul E. McKenney wrote:
> On Mon, Sep 12, 2011 at 10:15:20AM -0400, Benjamin Poirier wrote:
> > Hello David, Paul,
> > 
[snip]
> 
> This example would be correct for a looping case, but is more ornate than
> required for illustrating the effects of memory barriers.  So we took the
> simpler case without the loop.

Thanks for the clarification. I was under the false impression that all
of the code examples in this section represented the same code segment
that was modified from one example to the next. In any case, these
examples do demonstrate the use of memory barriers clearly.

-Ben

