All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@redhat.com>,
	Felipe Balbi <felipe.balbi@linux.intel.com>,
	USB list <linux-usb@vger.kernel.org>,
	Kernel development list <linux-kernel@vger.kernel.org>,
	Will Deacon <will.deacon@arm.com>
Subject: Re: Memory barrier needed with wake_up_process()?
Date: Tue, 6 Sep 2016 13:36:05 +0200	[thread overview]
Message-ID: <20160906113605.GL10153@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <Pine.LNX.4.44L0.1609051105440.25234-100000@netrider.rowland.org>

On Mon, Sep 05, 2016 at 11:29:26AM -0400, Alan Stern wrote:
> On Mon, 5 Sep 2016, Peter Zijlstra wrote:
> 
> > > Actually, seeing it written out like this, one realizes that it really 
> > > ought to be:
> > > 
> > > 	CPU 0				CPU 1
> > >         -----				-----
> > >         Start DMA			Handle DMA-complete irq
> > >         Sleep until bh->state		smp_mb()
> > > 					set bh->state
> > > 					Wake up CPU 0
> > > 	smp_mb()
> > > 	Compute rc based on contents of the DMA buffer
> > > 
> > > (Bear in mind also that on some platforms, the I/O operation is carried 
> > > out by PIO rather than DMA.)

> > Also, smp_mb() doesn't necessarily interact with MMIO/DMA at all IIRC.
> > Its only defined to do CPU/CPU interactions.
> 
> Suppose the DMA master finishes filling up an input buffer and issues a
> completion irq to CPU1.  Presumably the data would then be visible to
> CPU1 if the interrupt handler looked at it.  So if CPU1 executes smp_mb
> before setting bh->state and waking up CPU0, and if CPU0 executes
> smp_mb after testing bh->state and before reading the data buffer,
> wouldn't CPU0 then see the correct data in the buffer?  Even if CPU0 
> never did go to sleep?

Couple of notes here; I would expect the DMA master to make its stores
_globally_ visible on 'completion'. Because I'm not sure our smp_mb()
would help make it globally visible, since its only defined on CPU/CPU
interactions, not external.

Note that for example ARM has the distinction where smp_mb() uses
DMB-ISH barrier, dma_[rw]mb() uses DMB-OSH{LD,ST} and mb() uses DSB-SY.

The ISH domain is the Inner-SHarable (IIRC) and only includes CPUs. The
OSH is Outer-SHarable and adds external agents like DMA (also includes
CPUs). The DSB-SY thing is even heavier and syncs world or something; I
always forget these details.

> Or would something more be needed?

The thing is, this is x86 (TSO). Most everything is globally
ordered/visible and full barriers.

The only reorder allowed by TSO is a later read can happen before a
prior store. That is, only:

	X = 1;
	smp_mb();
	r = Y;

actually needs a full barrier. All the other variants are no-ops.

Note that x86's dma_[rw]mb() are no-ops (like the regular smp_[rw]mb()).
Which seems to imply DMA is coherent and globally visible.

> > I would very much expect the device IO stuff to order things for us in
> > this case. "Start DMA" should very much include sufficient fences to
> > ensure the data under DMA is visible to the DMA engine, this would very
> > much include things like flushing store buffers and maybe even writeback
> > caches, depending on platform needs.
> > 
> > At the same time, I would expect "Handle DMA-complete irq", even if it
> > were done with a PIO polling loop, to guarantee ordering against later
> > operations such that 'complete' really means that.
> 
> That's what I would expect too.
> 
> Back in the original email thread where the problem was first reported, 
> Felipe says that the problem appears to be something else.  Here's what 
> it looks like now, in schematic form:
> 
> 	CPU0
> 	----
> 	get_next_command():
> 	  while (bh->state != BUF_STATE_EMPTY)
> 		sleep_thread();
> 	  start an input request for bh
> 	  while (bh->state != BUF_STATE_FULL)
> 		sleep_thread();
> 
> As mentioned above, the input involves DMA and is terminated by an irq.
> The request's completion handler is bulk_out_complete():
> 
> 	CPU1
> 	----
> 	bulk_out_complete():
> 	  bh->state = BUF_STATE_FULL;
> 	  wakeup_thread();
> 
> According to Felipe, when CPU0 wakes up and checks bh->state, it sees a 
> value different from BUF_STATE_FULL.  So it goes back to sleep again 
> and doesn't make any forward progress.
> 
> It's possible that something else is changing bh->state when it 
> shouldn't.  But if this were the explanation, why would Felipe see that 
> the problem goes away when he changes the memory barriers in 
> sleep_thread() and wakeup_thread() to smp_mb()?

Good question, let me stare at the code more.

Could you confirm that bulk_{in,out}_complete() work on different
usb_request structures, and they can not, at any time, get called on the
_same_ request?

  reply	other threads:[~2016-09-06 11:37 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-02 18:10 Memory barrier needed with wake_up_process()? Alan Stern
2016-09-02 18:47 ` Paul E. McKenney
2016-09-02 20:29   ` Alan Stern
2016-09-03  9:07     ` Paul E. McKenney
2016-09-03 12:31     ` Peter Zijlstra
2016-09-03 14:26       ` Alan Stern
2016-09-03 14:49         ` Alan Stern
2016-09-05  8:33           ` Peter Zijlstra
2016-09-05 15:29             ` Alan Stern
2016-09-06 11:36               ` Peter Zijlstra [this message]
2016-09-06 11:43                 ` Felipe Balbi
2016-09-06 11:49                   ` Peter Zijlstra
2016-09-06 12:20                     ` Peter Zijlstra
2016-09-06 14:46                       ` Alan Stern
2016-09-06 15:05                         ` Peter Zijlstra
2016-09-07 10:12                         ` Felipe Balbi
2016-09-09 10:36                           ` Felipe Balbi
2016-09-09 16:12                             ` Alan Stern
2016-09-19 11:11                               ` Felipe Balbi
2016-09-19 17:35                                 ` Alan Stern
2016-09-20 10:12                                   ` Felipe Balbi
2016-09-20 12:53                                     ` Felipe Balbi
2016-09-20 14:40                                     ` Alan Stern
2017-01-16 11:12                                       ` Felipe Balbi
2017-01-16 17:09                                         ` Alan Stern
2017-01-16 19:04                                           ` Felipe Balbi
2017-01-16 19:19                                             ` Felipe Balbi
2016-09-02 19:18 ` Peter Zijlstra
2016-09-02 20:16   ` Alan Stern
2016-09-02 22:14     ` Peter Zijlstra
2016-09-02 22:16       ` Peter Zijlstra
2016-09-05  9:43         ` Will Deacon
2016-09-06 11:10           ` Peter Zijlstra
2016-09-02 22:16     ` Peter Zijlstra
2016-09-03  6:58       ` Felipe Balbi
2016-09-03 12:19         ` Peter Zijlstra
2016-09-03 13:51           ` Felipe Balbi
2016-09-05  8:09             ` Peter Zijlstra
2016-09-03 14:16           ` Alan Stern
2016-09-05  8:08             ` Peter Zijlstra
2016-09-05 14:33               ` Alan Stern

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160906113605.GL10153@twins.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=felipe.balbi@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=stern@rowland.harvard.edu \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.