Hi, Alan Stern writes: > On Tue, 6 Sep 2016, Peter Zijlstra wrote: > >> On Tue, Sep 06, 2016 at 01:49:37PM +0200, Peter Zijlstra wrote: >> > On Tue, Sep 06, 2016 at 02:43:39PM +0300, Felipe Balbi wrote: >> >> > > My fear now, however, is that changing smp_[rw]mb() to smp_mb() just >> > > adds extra overhead which makes the problem much, much less likely to >> > > happen. Does that sound plausible to you? >> > >> > I did consider that, but I've not sufficiently grokked the code to rule >> > out actual fail. So let me stare at this a bit more. >> >> OK, so I'm really not seeing it, we've got: >> >> while (bh->state != FULL) { >> for (;;) { >> set_current_state(INTERRUPTIBLE); /* MB after */ >> if (signal_pending(current)) >> return -EINTR; >> if (common->thread_wakeup_needed) >> break; >> schedule(); /* MB */ >> } >> __set_current_state(RUNNING); >> common->thread_wakeup_needed = 0; >> smp_rmb(); /* NOP */ >> } >> >> >> VS. >> >> >> spin_lock(&common->lock); /* MB */ >> bh->state = FULL; >> smp_wmb(); /* NOP */ >> common->thread_wakeup_needed = 1; >> wake_up_process(common->thread_task); /* MB before */ >> spin_unlock(&common->lock); >> >> >> >> (the MB annotations specific to x86, not true in general) >> >> >> If we observe thread_wakeup_needed, we must also observe bh->state. >> >> And the sleep/wakeup ordering is also correct, we either see >> thread_wakeup_needed and continue, or we see task->state == RUNNING >> (from the wakeup) and NO-OP schedule(). The MB from set_current_statE() >> then matches with the MB from wake_up_process() to ensure we must see >> thead_wakeup_needed. >> >> Or, we go sleep, and get woken up, at which point the same happens. >> Since the waking CPU gets the task back on its RQ the happens-before >> chain includes the waking CPUs state along with the state of the task >> itself before it went to sleep. >> >> At which point we're back where we started, once we see >> thread_wakeup_needed we must then also see bh->state (and all state >> prior to that on the waking CPU). >> >> >> >> There's enough cruft in the while-sleep loop to force reload bh->state. >> >> Load/store tearing cannot be a problem because all values are single >> bytes (the variables are multi bytes, but all values used only affect >> the LSB). >> >> Colour me puzzled. > > Felipe, can you please try this patch on an unmodified tree? If the > problem still occurs, what shows up in the kernel log? > > Alan Stern > > > > Index: usb-4.x/drivers/usb/gadget/function/f_mass_storage.c > =================================================================== > --- usb-4.x.orig/drivers/usb/gadget/function/f_mass_storage.c > +++ usb-4.x/drivers/usb/gadget/function/f_mass_storage.c > @@ -485,6 +485,8 @@ static void bulk_out_complete(struct usb > spin_lock(&common->lock); > bh->outreq_busy = 0; > bh->state = BUF_STATE_FULL; > + if (bh->bulk_out_intended_length == US_BULK_CB_WRAP_LEN) > + INFO(common, "compl: bh %p state %d\n", bh, bh->state); > wakeup_thread(common); > spin_unlock(&common->lock); > } > @@ -2207,6 +2209,7 @@ static int get_next_command(struct fsg_c > rc = sleep_thread(common, true); > if (rc) > return rc; > + INFO(common, "next: bh %p state %d\n", bh, bh->state); > } > smp_rmb(); > rc = fsg_is_set(common) ? received_cbw(common->fsg, bh) : -EIO; I've replace INFO() with trace_printk() (which is what I have been using anyway): diff --git a/drivers/usb/gadget/function/f_mass_storage.c b/drivers/usb/gadget/function/f_mass_storage.c index 2505117e88e8..dbc6a380b38b 100644 --- a/drivers/usb/gadget/function/f_mass_storage.c +++ b/drivers/usb/gadget/function/f_mass_storage.c @@ -485,6 +485,8 @@ static void bulk_out_complete(struct usb_ep *ep, struct usb_request *req) spin_lock(&common->lock); bh->outreq_busy = 0; bh->state = BUF_STATE_FULL; + if (bh->bulk_out_intended_length == US_BULK_CB_WRAP_LEN) + trace_printk("compl: bh %p state %d\n", bh, bh->state); wakeup_thread(common); spin_unlock(&common->lock); } @@ -2207,6 +2209,7 @@ static int get_next_command(struct fsg_common *common) rc = sleep_thread(common, true); if (rc) return rc; + trace_printk("next: bh %p state %d\n", bh, bh->state); } smp_rmb(); rc = fsg_is_set(common) ? received_cbw(common->fsg, bh) : -EIO; But I can't reproduce as reliably as before. I'll keep the thing running an infinite loop which will stop only when interrupts in UDC (dwc3 in this case) stop increasing. -- balbi