Message-ID: <1433824902.3165.61.camel@stgolabs.net>
Subject: Re: [PATCH -rfc 4/4] locking/rtmutex: Support spin on owner (osq)
From: Davidlohr Bueso
To: Thomas Gleixner
Cc: Peter Zijlstra, Ingo Molnar, Steven Rostedt, Mike Galbraith,
	"Paul E. McKenney", Sebastian Andrzej Siewior,
	linux-kernel@vger.kernel.org
Date: Mon, 08 Jun 2015 21:41:42 -0700
References: <1432056298-18738-1-git-send-email-dave@stgolabs.net>
	<1432056298-18738-5-git-send-email-dave@stgolabs.net>

On Fri, 2015-06-05 at 15:59 +0200, Thomas Gleixner wrote:
> On Tue, 19 May 2015, Davidlohr Bueso wrote:
>
> > +/*
> > + * Lockless alternative to rt_mutex_has_waiters() as we do not need the
> > + * wait_lock to check if we are in, for instance, a transitional state
> > + * after calling mark_rt_mutex_waiters().
>
> Before I get into a state of brain melt, could you please explain that
> in an understandable way?

With that I meant that we can check the owner field for the
RT_MUTEX_HAS_WAITERS bit without taking the wait_lock, including the
transitional state where the bit is set but the lock has no owner.

> rt_mutex_has_waiters() looks at the root pointer of the rbtree head
> whether that's empty. You can do a lockless check of that as well,
> right? So what's the FAST part of that function and how is that
> related to a point after we called mark_rt_mutex_waiters()?

You're right, we could use rt_mutex_has_waiters(). When I thought of
this originally, I was considering something like:

	if (rt_mutex_has_waiters(lock)) {
		if (current->prio >= rt_mutex_top_waiter(lock)->prio)
			...

which obviously requires the wait_lock, and I had not considered just
using the tree locklessly. However, the consequence I see in doing
that is that we would miss the scenarios where mark_rt_mutex_waiters()
has been called (with a nil owner, for example) but no waiter is
enqueued yet, so we would force tasks to block only when there are
truly waiters on the tree.
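For reference, mark_rt_mutex_waiters() sets that bit with a cmpxchg
loop on the owner word. Quoting roughly from memory of mainline
kernel/locking/rtmutex.c (so take this as a sketch rather than the
exact source), note how it sets the bit on whatever owner value is
there, including NULL, which is precisely the transitional state the
fast check wants to catch:

	static void mark_rt_mutex_waiters(struct rt_mutex *lock)
	{
		unsigned long owner, *p = (unsigned long *) &lock->owner;

		/*
		 * OR RT_MUTEX_HAS_WAITERS into the current owner value,
		 * retrying on concurrent changes. With a NULL owner this
		 * leaves the "waiters bit set, no owner" state behind.
		 */
		do {
			owner = *p;
		} while (cmpxchg(p, owner, owner | RT_MUTEX_HAS_WAITERS) != owner);
	}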
> > + */
> > +static inline bool rt_mutex_has_waiters_fast(struct rt_mutex *lock)
> > +{
> > +	unsigned long val = (unsigned long)lock->owner;
> > +
> > +	if (!val)
> > +		return false;
> > +	return val & RT_MUTEX_HAS_WAITERS;
> > +}
> > +
> > +/*
> > + * Initial check for entering the mutex spinning loop
> > + */
> > +static inline bool rt_mutex_can_spin_on_owner(struct rt_mutex *lock)
> > +{
> > +	struct task_struct *owner;
> > +	/* default return to spin: if no owner, the lock is free */
>
> Rather than having a comment in the middle of the variable declaration
> section, I'd prefer a comment explaining the whole logic of this
> function.

Ok.

> > +	int ret = true;
>
> > +static bool rt_mutex_optimistic_spin(struct rt_mutex *lock)
> > +{
> > +	bool taken = false;
> > +
> > +	preempt_disable();
> > +
> > +	if (!rt_mutex_can_spin_on_owner(lock))
> > +		goto done;
> > +	/*
> > +	 * In order to avoid a stampede of mutex spinners trying to
> > +	 * acquire the mutex all at once, the spinners need to take a
> > +	 * MCS (queued) lock first before spinning on the owner field.
> > +	 */
> > +	if (!osq_lock(&lock->osq))
> > +		goto done;
>
> Hmm. The queue lock is serializing potential spinners, right?

Yes.

> So that's going to lead to a potential priority ordering problem
> because if a lower prio task wins the race to the osq_lock queue,
> then the higher prio waiter will be queued behind and blocked from
> taking the lock first.

Hmm yes, osq is a fair (FIFO) lock. However, I believe this is
mitigated by (a) the conservative spinning approach, and (b)
osq_lock()'s need_resched() check, so at least a spinner will abort
if a higher prio task comes in; see the sketch of that loop below.
But of course, this only deals with spinners, and we cannot account
for a lower prio owner task. So if this is not acceptable, I guess
we'll have to do without the MCS-like properties.

Thanks,
Davidlohr
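For completeness, the osq_lock() spin-wait I am referring to in (b)
looks roughly like this in mainline kernel/locking/osq_lock.c (again
from memory, so a sketch rather than the exact source):

	/*
	 * Spin until our MCS node is granted the lock, but bail out
	 * and unqueue as soon as this CPU needs to reschedule, e.g.
	 * because a higher priority task has become runnable here.
	 */
	while (!READ_ONCE(node->locked)) {
		if (need_resched())
			goto unqueue;

		cpu_relax_lowlatency();
	}

Note this only helps when the higher prio task preempts a spinner's
CPU; it does not reorder the queue itself.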