From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756475AbcIGKNY (ORCPT ); Wed, 7 Sep 2016 06:13:24 -0400 Received: from mga03.intel.com ([134.134.136.65]:14513 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756053AbcIGKNR (ORCPT ); Wed, 7 Sep 2016 06:13:17 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.30,296,1470726000"; d="asc'?scan'208";a="875943597" From: Felipe Balbi To: Alan Stern , Peter Zijlstra Cc: "Paul E. McKenney" , Ingo Molnar , USB list , Kernel development list , Will Deacon Subject: Re: Memory barrier needed with wake_up_process()? In-Reply-To: References: User-Agent: Notmuch/0.22.1+63~g994277e (https://notmuchmail.org) Emacs/25.1.3 (x86_64-pc-linux-gnu) Date: Wed, 07 Sep 2016 13:12:40 +0300 Message-ID: <87oa40koh3.fsf@linux.intel.com> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Hi, Alan Stern writes: > On Tue, 6 Sep 2016, Peter Zijlstra wrote: > >> On Tue, Sep 06, 2016 at 01:49:37PM +0200, Peter Zijlstra wrote: >> > On Tue, Sep 06, 2016 at 02:43:39PM +0300, Felipe Balbi wrote: >>=20 >> > > My fear now, however, is that changing smp_[rw]mb() to smp_mb() just >> > > adds extra overhead which makes the problem much, much less likely to >> > > happen. Does that sound plausible to you? >> >=20 >> > I did consider that, but I've not sufficiently grokked the code to rule >> > out actual fail. So let me stare at this a bit more. >>=20 >> OK, so I'm really not seeing it, we've got: >>=20 >> while (bh->state !=3D FULL) { >> for (;;) { >> set_current_state(INTERRUPTIBLE); /* MB after */ >> if (signal_pending(current)) >> return -EINTR; >> if (common->thread_wakeup_needed) >> break; >> schedule(); /* MB */ >> } >> __set_current_state(RUNNING); >> common->thread_wakeup_needed =3D 0; >> smp_rmb(); /* NOP */ >> } >>=20 >>=20 >> VS. >>=20 >>=20 >> spin_lock(&common->lock); /* MB */ >> bh->state =3D FULL; >> smp_wmb(); /* NOP */ >> common->thread_wakeup_needed =3D 1; >> wake_up_process(common->thread_task); /* MB before */ >> spin_unlock(&common->lock); >>=20 >>=20 >>=20 >> (the MB annotations specific to x86, not true in general) >>=20 >>=20 >> If we observe thread_wakeup_needed, we must also observe bh->state. >>=20 >> And the sleep/wakeup ordering is also correct, we either see >> thread_wakeup_needed and continue, or we see task->state =3D=3D RUNNING >> (from the wakeup) and NO-OP schedule(). The MB from set_current_statE() >> then matches with the MB from wake_up_process() to ensure we must see >> thead_wakeup_needed. >>=20 >> Or, we go sleep, and get woken up, at which point the same happens. >> Since the waking CPU gets the task back on its RQ the happens-before >> chain includes the waking CPUs state along with the state of the task >> itself before it went to sleep. >>=20 >> At which point we're back where we started, once we see >> thread_wakeup_needed we must then also see bh->state (and all state >> prior to that on the waking CPU). >>=20 >>=20 >>=20 >> There's enough cruft in the while-sleep loop to force reload bh->state. >>=20 >> Load/store tearing cannot be a problem because all values are single >> bytes (the variables are multi bytes, but all values used only affect >> the LSB). >>=20 >> Colour me puzzled. > > Felipe, can you please try this patch on an unmodified tree? If the=20 > problem still occurs, what shows up in the kernel log? > > Alan Stern > > > > Index: usb-4.x/drivers/usb/gadget/function/f_mass_storage.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- usb-4.x.orig/drivers/usb/gadget/function/f_mass_storage.c > +++ usb-4.x/drivers/usb/gadget/function/f_mass_storage.c > @@ -485,6 +485,8 @@ static void bulk_out_complete(struct usb > spin_lock(&common->lock); > bh->outreq_busy =3D 0; > bh->state =3D BUF_STATE_FULL; > + if (bh->bulk_out_intended_length =3D=3D US_BULK_CB_WRAP_LEN) > + INFO(common, "compl: bh %p state %d\n", bh, bh->state); > wakeup_thread(common); > spin_unlock(&common->lock); > } > @@ -2207,6 +2209,7 @@ static int get_next_command(struct fsg_c > rc =3D sleep_thread(common, true); > if (rc) > return rc; > + INFO(common, "next: bh %p state %d\n", bh, bh->state); > } > smp_rmb(); > rc =3D fsg_is_set(common) ? received_cbw(common->fsg, bh) : -EIO; I've replace INFO() with trace_printk() (which is what I have been using anyway): diff --git a/drivers/usb/gadget/function/f_mass_storage.c b/drivers/usb/gad= get/function/f_mass_storage.c index 2505117e88e8..dbc6a380b38b 100644 =2D-- a/drivers/usb/gadget/function/f_mass_storage.c +++ b/drivers/usb/gadget/function/f_mass_storage.c @@ -485,6 +485,8 @@ static void bulk_out_complete(struct usb_ep *ep, struct= usb_request *req) spin_lock(&common->lock); bh->outreq_busy =3D 0; bh->state =3D BUF_STATE_FULL; + if (bh->bulk_out_intended_length =3D=3D US_BULK_CB_WRAP_LEN) + trace_printk("compl: bh %p state %d\n", bh, bh->state); wakeup_thread(common); spin_unlock(&common->lock); } @@ -2207,6 +2209,7 @@ static int get_next_command(struct fsg_common *common) rc =3D sleep_thread(common, true); if (rc) return rc; + trace_printk("next: bh %p state %d\n", bh, bh->state); } smp_rmb(); rc =3D fsg_is_set(common) ? received_cbw(common->fsg, bh) : -EIO; But I can't reproduce as reliably as before. I'll keep the thing running an infinite loop which will stop only when interrupts in UDC (dwc3 in this case) stop increasing. =2D-=20 balbi --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIcBAEBCAAGBQJXz+gYAAoJEMy+uJnhGpkGUEkP/inR7Yg0KtoAeJa8luEn1UrS FMIQeYLZtIQvkE2LUySisMC8jGCPyLfm0fslukNHG8rCGF+qTgLcLWy8s4lLrDiU y131mvrKPsjFPVOyVQlMR201dxoBQz+s5UhI5c1RWI2qZuii4mm7Jw6Pb6KhA+f3 9oyFjQAw4O7YeCVwbdkKOx+xSSktAYxfTm4A/oprpjyIf5HMjdYgEeHmStWe1xQx roiGxJdQJJaRlo+X9Bdd2Wzw3RceZaJFkUCiFmXgaohWaMeCriGKZAshggTnR9px HO/4LGQZrIw+wtgRo3gLJvY/XjxHpU5C/kLqb1dO8r/1BTA0xn0QlxDc1wZVh1QA ARSxdlOpjlT7q4RVwA7QBMT3MA/lsFIitZCcSKul1iGJU8n8/AL30aEx7LUAsQlD KJAFcFpofdsVdTiv+PdECCukFzxu3890ZSvXhuNjOyn0Dh/jrPyHHn+kxC6JUZQd pucrsNfKyU8/xJMNfX/aZn8+7uFv5oDBpg6T8tQieTdTqpSxkpplyqgRZbx6sYd0 vbjt00WMml6fMB480SEkGruZzNuA+R4yxWGTqbDrOTtxgyvYVx1oe6ZivKtQlbhi APUYGJBb19WaA8d+7AVRQ26yYr7E3hNYNh2GhHv9K2uL2Z24wwAaRvKXvmVwv99a HlrAc1v042aOMVJhn0O2 =Vh2z -----END PGP SIGNATURE----- --=-=-=--