From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752574AbcEKI26 (ORCPT ); Wed, 11 May 2016 04:28:58 -0400 Received: from mail-wm0-f66.google.com ([74.125.82.66]:33351 "EHLO mail-wm0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751455AbcEKI24 (ORCPT ); Wed, 11 May 2016 04:28:56 -0400 Date: Wed, 11 May 2016 10:28:53 +0200 From: Michal Hocko To: Peter Zijlstra Cc: Tetsuo Handa , LKML , Ingo Molnar , Thomas Gleixner , "H. Peter Anvin" , "David S. Miller" , Tony Luck , Andrew Morton , Chris Zankel , Max Filippov , Davidlohr Bueso , Waiman Long Subject: Re: [PATCH 03/11] locking, rwsem: introduce basis for down_write_killable Message-ID: <20160511082853.GF16677@dhcp22.suse.cz> References: <1459508695-14915-1-git-send-email-mhocko@kernel.org> <1459508695-14915-4-git-send-email-mhocko@kernel.org> <8bd03bdc-0373-a3bb-da12-045322efb797@I-love.SAKURA.ne.jp> <20160510115320.GJ23576@dhcp22.suse.cz> <20160510123806.GB3193@twins.programming.kicks-ass.net> <20160511072357.GC16677@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160511072357.GC16677@dhcp22.suse.cz> User-Agent: Mutt/1.6.0 (2016-04-01) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed 11-05-16 09:23:57, Michal Hocko wrote: > On Tue 10-05-16 14:38:06, Peter Zijlstra wrote: [...] > > However, at the time I was thinking that if we have: > > > > reader (owner) > > writer (pending) > > reader (blocked on writer) > > > > and writer would get cancelled, the up_read() would do a wakeup and kick > > the blocked reader. > > > > But yes, immediately kicking further pending waiters might be better. > > OK, that makes sense. We shouldn't be waiting for the first reader to > do up_read. I am still trying to wrap my head around the wake up logic (I managed to forget all the juicy details). I still do not see the correctness issue with the current code so the following should be "just an optimization". But as I've said I might be missing something here. Does the following look correct/reasonable? This is absolutely untested and more for a discussion: --- diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c index df4dcb883b50..726620c97366 100644 --- a/kernel/locking/rwsem-xadd.c +++ b/kernel/locking/rwsem-xadd.c @@ -516,7 +516,27 @@ EXPORT_SYMBOL(rwsem_down_write_failed); __visible struct rw_semaphore * __sched rwsem_down_write_failed_killable(struct rw_semaphore *sem) { - return __rwsem_down_write_failed_common(sem, TASK_KILLABLE); + struct rw_semaphore * ret; + + ret = __rwsem_down_write_failed_common(sem, TASK_KILLABLE); + if (IS_ERR(ret)) { + long count; + + raw_spin_lock_irq(&sem->wait_lock); + count = READ_ONCE(sem->count); + + /* + * If there were already threads queued before us and there are + * no active writers, the lock must be read owned; so we try to + * wake any read locks that were queued ahead of us so they + * do not have to wait for up_read to wake up. + */ + if (count > RWSEM_WAITING_BIAS) + sem = __rwsem_do_wake(sem, RWSEM_WAKE_READERS); + raw_spin_unlock_irq(&sem->wait_lock); + } + + return ret; } EXPORT_SYMBOL(rwsem_down_write_failed_killable); -- Michal Hocko SUSE Labs