From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Zijlstra
Subject: Re: [RFC][PATCH] wake_up_var() memory ordering
Date: Tue, 25 Jun 2019 10:11:03 +0200
Message-ID: <20190625081103.GU3436@hirez.programming.kicks-ass.net>
References: <20190624165012.GH3436@hirez.programming.kicks-ass.net>
 <32379.1561449061@warthog.procyon.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <32379.1561449061-S6HVgzuS8uM4Awkfq6JHfwNdhmdF6hFW@public.gmane.org>
Sender: "Linux-mediatek"
Errors-To: linux-mediatek-bounces+glpam-linux-mediatek=m.gmane.org-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
To: David Howells
Cc: Martin Brandenburg, Mike Snitzer,
 linux-aio-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, David Airlie,
 samba-technical-w/Ol4Ecudpl8XjKLYN78aQ@public.gmane.org, Joonas Lahtinen,
 Will Deacon, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
 "J. Bruce Fields", Chris Mason,
 dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
 keyrings-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Ingo Molnar,
 linux-afs-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
 Alasdair Kergon, Mike Marshall,
 linux-cifs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 rds-devel-N0ozoZBvEnrZJqsBc5GL+g@public.gmane.org, Andreas Gruenbacher,
 linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, James Morris,
 cluster-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org, Antti Palosaari,
 Matthias Brugger, Paul McKenney,
 intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org, devel@lists.o
List-Id: linux-rdma@vger.kernel.org

(sorry for cross-posting to moderated lists btw, I've since acquired a
patch to get_maintainers.pl that will exclude them in the future)

On Tue, Jun 25, 2019 at 08:51:01AM +0100, David Howells wrote:
> Peter Zijlstra wrote:
>
> > I tried using wake_up_var() today and accidentally noticed that it
> > didn't imply an smp_mb() and specifically requires it through
> > wake_up_bit() / waitqueue_active().
>
> Thinking about it again, I'm not sure why you need to add the barrier when
> wake_up() (which this is a wrapper around) is required to impose a barrier at
> the front if there's anything to wake up (ie. the wait queue isn't empty).
>
> If this is insufficient, does it make sense just to have wake_up*() functions
> do an unconditional release or full barrier right at the front, rather than it
> being conditional on something being woken up?

The culprit is __wake_up_bit()'s usage of waitqueue_active(); it is this
latter (see its comment) that requires the smp_mb(). wake_up_bit() and
wake_up_var() are wrappers around __wake_up_bit().

Without this barrier it is possible for the waitqueue_active() load to
be hoisted over the cond=true store. The remote end can then miss the
store while we miss its enqueue, so the wakeup is lost and we all get
stuck.
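
The ordering requirement described above can be sketched in userspace
with C11 atomics. This is only a model, not kernel code: struct
model_wq, model_wake() and model_prepare_to_wait() are hypothetical
names, a waiter counter stands in for the wait-queue list, and a
seq_cst fence stands in for smp_mb(). The waker's fence is the one that
keeps the "any waiters?" load from being hoisted above the cond=true
store; the waiter's fence models the barrier that comes with setting
the task state before re-checking the condition.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the waker/waiter pairing. */
struct model_wq {
	atomic_bool cond;       /* the condition the waiter sleeps on */
	atomic_int  nr_waiters; /* stands in for the wait-queue list */
};

/* Waker side: returns true if a wakeup has to be issued. */
static bool model_wake(struct model_wq *wq)
{
	/* cond = true */
	atomic_store_explicit(&wq->cond, true, memory_order_relaxed);

	/* Models smp_mb(): order the store above before the load below. */
	atomic_thread_fence(memory_order_seq_cst);

	/* Models waitqueue_active(): a plain load of the queue state. */
	return atomic_load_explicit(&wq->nr_waiters,
				    memory_order_relaxed) > 0;
}

/* Waiter side: returns true if the caller still has to sleep. */
static bool model_prepare_to_wait(struct model_wq *wq)
{
	/* Enqueue on the (modelled) wait queue. */
	atomic_fetch_add_explicit(&wq->nr_waiters, 1, memory_order_relaxed);

	/* Models the full barrier implied before re-checking cond. */
	atomic_thread_fence(memory_order_seq_cst);

	/* Re-check the condition after enqueueing. */
	return !atomic_load_explicit(&wq->cond, memory_order_relaxed);
}
```

With both fences in place, at least one side observes the other's
store: either the waker sees the waiter enqueued, or the waiter sees
cond already set. Dropping the waker's fence in this model allows the
outcome where neither does, which is the lost wakeup.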

Adding an smp_mb() (or using wq_has_sleeper()) in __wake_up_bit() would
be nice, but I fear some people will complain about overhead, esp. since
about half the sites don't need the barrier due to being behind
test_and_clear_bit() and the other half use smp_mb__after_atomic()
after some clear_bit*() variant.

There are a few sites that seem to open-code
wait_var_event()/wake_up_var() and those actually need the full
smp_mb(), but then maybe they should be converted to var instead of bit
anyway.

> > @@ -619,9 +614,7 @@ static int dvb_usb_fe_sleep(struct dvb_frontend *fe)
> >  err:
> >  	if (!adap->suspend_resume_active) {
> >  		adap->active_fe = -1;
>
> I'm wondering if there's a missing barrier here. Should the clear_bit() on
> the next line be clear_bit_unlock() or clear_bit_release()?

That looks reasonable, but I'd like to hear from the DVB folks on that.

> > -		clear_bit(ADAP_SLEEP, &adap->state_bits);
> > -		smp_mb__after_atomic();
> > -		wake_up_bit(&adap->state_bits, ADAP_SLEEP);
> > +		clear_and_wake_up_bit(ADAP_SLEEP, &adap->state_bits);
> >  	}
> >
> >  	dev_dbg(&d->udev->dev, "%s: ret=%d\n", __func__, ret);
> > diff --git a/fs/afs/fs_probe.c b/fs/afs/fs_probe.c
> > index cfe62b154f68..377ee07d5f76 100644
> > --- a/fs/afs/fs_probe.c
> > +++ b/fs/afs/fs_probe.c
> > @@ -18,6 +18,7 @@ static bool afs_fs_probe_done(struct afs_server *server)
> >
> >  	wake_up_var(&server->probe_outstanding);
> >  	clear_bit_unlock(AFS_SERVER_FL_PROBING, &server->flags);
> > +	smp_mb__after_atomic();
> >  	wake_up_bit(&server->flags, AFS_SERVER_FL_PROBING);
> >  	return true;
> >  }
>
> Looking at this and the dvb one, does it make sense to stick the release
> semantics of clear_bit_unlock() into clear_and_wake_up_bit()?

I was thinking of adding another helper, maybe unlock_and_wake_up_bit(),
that included that extra barrier, but maybe making it unconditional
isn't the worst idea.
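
For illustration, the shape of such a helper can be modelled in
userspace with C11 atomics. Everything here is an assumption made for
the sketch: model_unlock_and_wake_up_bit(), model_wake_fn and
model_count_wakeup() are hypothetical names, the release
read-modify-write stands in for clear_bit_unlock(), the seq_cst fence
stands in for smp_mb__after_atomic(), and a callback stands in for
wake_up_bit(). It is not the kernel implementation.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the wake_up_bit() call. */
typedef void (*model_wake_fn)(void *key);

/* Example callback: counts how many wakeups were issued. */
static void model_count_wakeup(void *key)
{
	++*(int *)key;
}

/* Clear bit nr with release ordering, fence, then wake. */
static void model_unlock_and_wake_up_bit(int nr, atomic_ulong *word,
					 model_wake_fn wake, void *key)
{
	/* Release: writes made before the clear are visible to anyone
	 * who observes the bit as cleared (models clear_bit_unlock()). */
	atomic_fetch_and_explicit(word, ~(1UL << nr), memory_order_release);

	/* Full fence between the clear and the waiter check
	 * (models smp_mb__after_atomic()). */
	atomic_thread_fence(memory_order_seq_cst);

	/* Models wake_up_bit(): look for sleepers and wake them. */
	wake(key);
}
```

Folding both orderings into one helper means a call site cannot forget
the fence between the clear and the wakeup, at the cost of paying for
it even where an earlier atomic already provides the ordering.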

> Also, should clear_bit_unlock() be renamed to clear_bit_release() (and
> similarly test_and_set_bit_lock() -> test_and_set_bit_acquire()) if we seem to
> be trying to standardise on that terminology.

That definitely makes sense to me; there are only 157 clear_bit_unlock()
and 76 test_and_set_bit_lock() users (note the asymmetry of that).