Date: Tue, 6 Sep 2016 10:46:55 -0400 (EDT)
From: Alan Stern
To: Peter Zijlstra
cc: Felipe Balbi, "Paul E. McKenney", Ingo Molnar, USB list,
    Kernel development list, Will Deacon
Subject: Re: Memory barrier needed with wake_up_process()?
In-Reply-To: <20160906122037.GL10168@twins.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 6 Sep 2016, Peter Zijlstra wrote:

> On Tue, Sep 06, 2016 at 01:49:37PM +0200, Peter Zijlstra wrote:
> > On Tue, Sep 06, 2016 at 02:43:39PM +0300, Felipe Balbi wrote:
> >
> > > My fear now, however, is that changing smp_[rw]mb() to smp_mb() just
> > > adds extra overhead which makes the problem much, much less likely to
> > > happen. Does that sound plausible to you?
> >
> > I did consider that, but I've not sufficiently grokked the code to rule
> > out actual fail. So let me stare at this a bit more.
>
> OK, so I'm really not seeing it, we've got:
>
> while (bh->state != FULL) {
> 	for (;;) {
> 		set_current_state(INTERRUPTIBLE); /* MB after */
> 		if (signal_pending(current))
> 			return -EINTR;
> 		if (common->thread_wakeup_needed)
> 			break;
> 		schedule(); /* MB */
> 	}
> 	__set_current_state(RUNNING);
> 	common->thread_wakeup_needed = 0;
> 	smp_rmb(); /* NOP */
> }
>
>
> VS.
>
>
> spin_lock(&common->lock); /* MB */
> bh->state = FULL;
> smp_wmb(); /* NOP */
> common->thread_wakeup_needed = 1;
> wake_up_process(common->thread_task); /* MB before */
> spin_unlock(&common->lock);
>
>
>
> (the MB annotations specific to x86, not true in general)
>
>
> If we observe thread_wakeup_needed, we must also observe bh->state.
>
> And the sleep/wakeup ordering is also correct, we either see
> thread_wakeup_needed and continue, or we see task->state == RUNNING
> (from the wakeup) and NO-OP schedule(). The MB from set_current_state()
> then matches with the MB from wake_up_process() to ensure we must see
> thread_wakeup_needed.
>
> Or, we go to sleep, and get woken up, at which point the same happens.
> Since the waking CPU gets the task back on its RQ the happens-before
> chain includes the waking CPU's state along with the state of the task
> itself before it went to sleep.
>
> At which point we're back where we started, once we see
> thread_wakeup_needed we must then also see bh->state (and all state
> prior to that on the waking CPU).
>
>
>
> There's enough cruft in the while-sleep loop to force reload bh->state.
>
> Load/store tearing cannot be a problem because all values are single
> bytes (the variables are multi bytes, but all values used only affect
> the LSB).
>
> Colour me puzzled.

Felipe, can you please try this patch on an unmodified tree?  If the
problem still occurs, what shows up in the kernel log?

Alan Stern



Index: usb-4.x/drivers/usb/gadget/function/f_mass_storage.c
===================================================================
--- usb-4.x.orig/drivers/usb/gadget/function/f_mass_storage.c
+++ usb-4.x/drivers/usb/gadget/function/f_mass_storage.c
@@ -485,6 +485,8 @@ static void bulk_out_complete(struct usb
 	spin_lock(&common->lock);
 	bh->outreq_busy = 0;
 	bh->state = BUF_STATE_FULL;
+	if (bh->bulk_out_intended_length == US_BULK_CB_WRAP_LEN)
+		INFO(common, "compl: bh %p state %d\n", bh, bh->state);
 	wakeup_thread(common);
 	spin_unlock(&common->lock);
 }
@@ -2207,6 +2209,7 @@ static int get_next_command(struct fsg_c
 		rc = sleep_thread(common, true);
 		if (rc)
 			return rc;
+		INFO(common, "next: bh %p state %d\n", bh, bh->state);
 	}
 	smp_rmb();
 	rc = fsg_is_set(common) ? received_cbw(common->fsg, bh) : -EIO;
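
For readers following the barrier argument rather than the debug patch,
here is a rough userspace sketch of the "write the data, then set the
flag" pairing discussed in the quoted analysis. It is not kernel code and
not taken from f_mass_storage.c: it assumes C11 atomics and pthreads as
stand-ins, with a release store playing roughly the role of smp_wmb()
before the flag write and an acquire load playing roughly the role of
smp_rmb() after the flag read. The names struct handoff, producer(),
consumer(), run_handoff(), BUF_EMPTY and BUF_FULL are made up for
illustration. The sleep/wakeup side (set_current_state(), schedule(),
wake_up_process()) and the clearing of the flag are not modelled; the
consumer simply spins.

```c
/* Userspace sketch only: C11 atomics stand in for the kernel barriers. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

enum { BUF_EMPTY = 0, BUF_FULL = 1 };	/* hypothetical stand-ins */

struct handoff {
	int state;			/* plain data, like bh->state */
	atomic_int wakeup_needed;	/* flag, like thread_wakeup_needed */
};

/* Completion side: publish the data first, then raise the flag. */
static void *producer(void *arg)
{
	struct handoff *h = arg;

	h->state = BUF_FULL;
	/* Release store: the write to state cannot be reordered after it. */
	atomic_store_explicit(&h->wakeup_needed, 1, memory_order_release);
	return NULL;
}

/* Thread side: wait for the flag, and only then read the data. */
static void *consumer(void *arg)
{
	struct handoff *h = arg;

	/* Acquire load: the read of state cannot be reordered before it. */
	while (!atomic_load_explicit(&h->wakeup_needed, memory_order_acquire))
		;	/* busy-wait; the real loop sleeps in schedule() */
	return (void *)(intptr_t)h->state;
}

/* Run one handoff and return the state the consumer observed. */
static int run_handoff(void)
{
	struct handoff h = { .state = BUF_EMPTY };
	pthread_t prod, cons;
	void *seen;

	atomic_init(&h.wakeup_needed, 0);
	pthread_create(&cons, NULL, consumer, &h);
	pthread_create(&prod, NULL, producer, &h);
	pthread_join(prod, NULL);
	pthread_join(cons, &seen);
	return (int)(intptr_t)seen;
}
```

Under these assumptions, once the consumer sees the flag set it must also
see state == BUF_FULL, which is the same "if we observe
thread_wakeup_needed, we must also observe bh->state" property argued in
the quoted text. Build with -pthread if your toolchain needs it.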