From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751411AbdISOFy (ORCPT ); Tue, 19 Sep 2017 10:05:54 -0400 Received: from mail-wr0-f194.google.com ([209.85.128.194]:38259 "EHLO mail-wr0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750972AbdISOFw (ORCPT ); Tue, 19 Sep 2017 10:05:52 -0400 X-Google-Smtp-Source: AOwi7QBSOG7owq002oLDPVY0QWwSJZ2IndVuR0WhUYGWtopyi510Hk72XWfTQLyaRZjaPCW9kVwHXQ== Date: Tue, 19 Sep 2017 16:05:39 +0200 From: Andrea Parri To: Prateek Sood , peterz@infradead.org, mingo@kernel.org Cc: longman@redhat.com, dave@stgolabs.net, linux-kernel@vger.kernel.org, sramana@codeaurora.org Subject: Re: [PATCH] rwsem: fix missed wakeup due to reordering of load Message-ID: <20170919140539.GA23698@andrea> References: <1504794658-15397-1-git-send-email-prsood@codeaurora.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1504794658-15397-1-git-send-email-prsood@codeaurora.org> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Sep 07, 2017 at 08:00:58PM +0530, Prateek Sood wrote: > If a spinner is present, there is a chance that the load of > rwsem_has_spinner() in rwsem_wake() can be reordered with > respect to decrement of rwsem count in __up_write() leading > to wakeup being missed. > > spinning writer up_write caller > --------------- ----------------------- > [S] osq_unlock() [L] osq > spin_lock(wait_lock) > sem->count=0xFFFFFFFF00000001 > +0xFFFFFFFF00000000 > count=sem->count > MB > sem->count=0xFFFFFFFE00000001 > -0xFFFFFFFF00000001 > spin_trylock(wait_lock) > return > rwsem_try_write_lock(count) > spin_unlock(wait_lock) > schedule() > > Reordering of atomic_long_sub_return_release() in __up_write() > and rwsem_has_spinner() in rwsem_wake() can cause missing of > wakeup in up_write() context. In spinning writer, sem->count > and local variable count is 0XFFFFFFFE00000001. It would result > in rwsem_try_write_lock() failing to acquire rwsem and spinning > writer going to sleep in rwsem_down_write_failed(). > > The smp_rmb() will make sure that the spinner state is > consulted after sem->count is updated in up_write context. > > Signed-off-by: Prateek Sood Reviewed-by: Andrea Parri I understand that the merge window and LPC made this stalls for a while... what am I missing? are there other changes that need to be considered for this patch? Andrea > --- > kernel/locking/rwsem-xadd.c | 27 +++++++++++++++++++++++++++ > 1 file changed, 27 insertions(+) > > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c > index 02f6606..1fefe6d 100644 > --- a/kernel/locking/rwsem-xadd.c > +++ b/kernel/locking/rwsem-xadd.c > @@ -613,6 +613,33 @@ struct rw_semaphore *rwsem_wake(struct rw_semaphore *sem) > DEFINE_WAKE_Q(wake_q); > > /* > + * __rwsem_down_write_failed_common(sem) > + * rwsem_optimistic_spin(sem) > + * osq_unlock(sem->osq) > + * ... > + * atomic_long_add_return(&sem->count) > + * > + * - VS - > + * > + * __up_write() > + * if (atomic_long_sub_return_release(&sem->count) < 0) > + * rwsem_wake(sem) > + * osq_is_locked(&sem->osq) > + * > + * And __up_write() must observe !osq_is_locked() when it observes the > + * atomic_long_add_return() in order to not miss a wakeup. > + * > + * This boils down to: > + * > + * [S.rel] X = 1 [RmW] r0 = (Y += 0) > + * MB RMB > + * [RmW] Y += 1 [L] r1 = X > + * > + * exists (r0=1 /\ r1=0) > + */ > + smp_rmb(); > + > + /* > * If a spinner is present, it is not necessary to do the wakeup. > * Try to do wakeup only if the trylock succeeds to minimize > * spinlock contention which may introduce too much delay in the > -- > Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., > is a member of Code Aurora Forum, a Linux Foundation Collaborative Project. >