From: Waiman Long <waiman.long@hp.com>
To: Davidlohr Bueso <davidlohr@hp.com>
Cc: Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, Jason Low <jason.low2@hp.com>,
	Scott J Norton <scott.norton@hp.com>,
	aswin@hp.com
Subject: Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
Date: Fri, 08 Aug 2014 14:30:02 -0400	[thread overview]
Message-ID: <53E5172A.7090508@hp.com> (raw)
In-Reply-To: <1407476387.2513.39.camel@buesod1.americas.hpqcorp.net>

On 08/08/2014 01:39 AM, Davidlohr Bueso wrote:
> On Thu, 2014-08-07 at 17:45 -0700, Davidlohr Bueso wrote:
>> On Thu, 2014-08-07 at 18:26 -0400, Waiman Long wrote:
>>> On a highly contended rwsem, spinlock contention due to the slow
>>> rwsem_wake() call can be a significant portion of the total CPU cycles
>>> used. With writer lock stealing and writer optimistic spinning, there
>>> is also a pretty good chance that the lock may have been stolen
>>> before the waker wakes up the waiters. The woken tasks, if any,
>>> will have to go back to sleep again.
>> Good catch! And this applies to mutexes as well. How about something
>> like this:
>>
>> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
>> index dadbf88..e037588 100644
>> --- a/kernel/locking/mutex.c
>> +++ b/kernel/locking/mutex.c
>> @@ -707,6 +707,20 @@ EXPORT_SYMBOL_GPL(__ww_mutex_lock_interruptible);
>>
>>   #endif
>>
>> +#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER)
> If DEBUG, we don't clear the owner when unlocking. This can just be
>
> +#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
>
>> +static inline bool mutex_has_owner(struct mutex *lock)
>> +{
>> +	struct task_struct *owner = ACCESS_ONCE(lock->owner);
>> +
>> +	return owner != NULL;
>> +}
>> +#else
>> +static inline bool mutex_has_owner(struct mutex *lock)
>> +{
>> +	return false;
>> +}
>> +#endif
>> +
>>   /*
>>    * Release the lock, slowpath:
>>    */
>> @@ -734,6 +748,15 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
>>   	mutex_release(&lock->dep_map, nested, _RET_IP_);
>>   	debug_mutex_unlock(lock);
>>
>> +	/*
>> +	 * Abort the wakeup operation if there is an active writer as the
>> +	 * lock was stolen. mutex_unlock() should have cleared the owner field
>> +	 * before calling this function. If that field is now set, there must
>> +	 * be an active writer present.
>> +	 */
>> +	if (mutex_has_owner(lock))
>> +		goto done;
> Err, so we actually deadlock here because we do the check with
> lock->wait_lock held while, at the same time, another task comes into the
> slowpath of a mutex_lock() call and also tries to take the wait_lock,
> ending up with hung tasks. Here's a more thoroughly tested patch against
> peterz-queue; it survives aim7 and kernel builds on an 80-core box. Thanks.

I couldn't figure out why there would be hung tasks. The logic looks OK
to me.
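
For reference, here is a minimal user-space sketch of the pattern under
discussion (purely illustrative, not kernel code; toy_mutex,
toy_mutex_has_owner() and toy_unlock_slowpath() are made-up names, with a
pthread mutex standing in for the wait_lock and a bare counter standing in
for the FIFO wait list):

/* toy_unlock.c: build with gcc -pthread toy_unlock.c */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the mutex: 'owner' plays the role of lock->owner,
 * 'wait_lock' the role of the internal spinlock, and 'waiters' the
 * role of the FIFO wait list. */
struct toy_mutex {
	_Atomic(void *) owner;		/* NULL when nobody owns the lock */
	pthread_mutex_t wait_lock;
	int waiters;
};

static bool toy_mutex_has_owner(struct toy_mutex *lock)
{
	return atomic_load_explicit(&lock->owner, memory_order_relaxed) != NULL;
}

/* Unlock slowpath: the fast path has already cleared 'owner'.  If it is
 * non-NULL again, somebody stole the lock in the meantime and waking a
 * waiter would be pointless, so bail out before touching the wait_lock. */
static void toy_unlock_slowpath(struct toy_mutex *lock)
{
	if (toy_mutex_has_owner(lock))
		return;				/* lock stolen: skip the wakeup */

	pthread_mutex_lock(&lock->wait_lock);
	if (lock->waiters > 0) {
		lock->waiters--;		/* pretend we woke the first waiter */
		puts("woke one waiter");
	}
	pthread_mutex_unlock(&lock->wait_lock);
}

int main(void)
{
	static int thief;			/* any non-NULL address will do */
	static struct toy_mutex lock = {
		.owner = NULL,
		.wait_lock = PTHREAD_MUTEX_INITIALIZER,
		.waiters = 1,
	};

	toy_unlock_slowpath(&lock);		/* no owner: the waiter is woken */

	lock.waiters = 1;
	atomic_store(&lock.owner, (void *)&thief);  /* simulate a lock thief */
	toy_unlock_slowpath(&lock);		/* owner set: wakeup is skipped */
	return 0;
}

The check itself is cheap and lockless; the interesting part is where it
sits relative to making the lock available and to taking the wait_lock.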

>
> 8<---------------------------------------------------------------
> From: Davidlohr Bueso<davidlohr@hp.com>
> Subject: [PATCH] locking/mutex: Do not falsely wake-up tasks
>
> The mutex lock-stealing functionality allows another task to
> skip its turn in the wait-queue and atomically acquire the lock.
> This is fine and a nice optimization; however, when releasing
> the mutex, we always wake up the next task in FIFO order. When
> the lock has been stolen, this wastefully wakes up a task just
> so it can immediately see that it cannot acquire the lock and
> go back to sleep. This is especially true on highly contended
> mutexes that stress the wait_lock.
>
> Signed-off-by: Davidlohr Bueso<davidlohr@hp.com>
> ---
>   kernel/locking/mutex.c | 32 +++++++++++++++++++++++++++++++-
>   1 file changed, 31 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index dadbf88..52e1136 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -383,12 +383,26 @@ done:
>
>   	return false;
>   }
> +
> +static inline bool mutex_has_owner(struct mutex *lock)
> +{
> +	struct task_struct *owner = ACCESS_ONCE(lock->owner);
> +
> +	return owner != NULL;
> +}
> +
>   #else
> +
>   static bool mutex_optimistic_spin(struct mutex *lock,
>   				  struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
>   {
>   	return false;
>   }
> +
> +static inline bool mutex_has_owner(struct mutex *lock)
> +{
> +	return false;
> +}
>   #endif
>
>   __visible __used noinline
> @@ -730,6 +744,23 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
>   	if (__mutex_slowpath_needs_to_unlock())
>   		atomic_set(&lock->count, 1);
>
> +/*
> + * Skipping the mutex_has_owner() check when DEBUG lets us avoid taking
> + * the wait_lock here, as we need not call mutex_release() and
> + * debug_mutex_unlock() when !DEBUG. Checking with the wait_lock held
> + * can otherwise deadlock when another task enters mutex_lock()'s slowpath.
> + */
> +#ifndef CONFIG_DEBUG_MUTEXES
> +	/*
> +	 * Abort the wakeup operation if there is another mutex owner, as the
> +	 * lock was stolen. mutex_unlock() should have cleared the owner field
> +	 * before calling this function. If that field is now set, another task
> +	 * must have acquired the mutex.
> +	 */
> +	if (mutex_has_owner(lock))
> +		return;
> +#endif
> +
>   	spin_lock_mutex(&lock->wait_lock, flags);
>   	mutex_release(&lock->dep_map, nested, _RET_IP_);
>   	debug_mutex_unlock(lock);
> @@ -744,7 +775,6 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
>
>   		wake_up_process(waiter->task);
>   	}
> -
>   	spin_unlock_mutex(&lock->wait_lock, flags);
>   }
>

I have two issues with this. First of all, the timing window between
atomic_set() and the mutex_has_owner() check is really small, so I doubt
it will be that effective. Secondly, I think you may need to call
mutex_release() and debug_mutex_unlock() to make the debugging code
work, but they seem to need to be called under the wait_lock. So I think
there is more work that needs to be done before this patch is ready.
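
Just to get a rough feel for how narrow that window is, here is a small,
purely illustrative user-space experiment (window_demo.c is a made-up
example, not kernel code): one thread repeatedly makes a lock word
available and immediately checks an owner field, while a second thread
tries to steal the lock in that gap. Whatever ratio it prints is
machine-dependent and only indicative.

/* window_demo.c: build with gcc -O2 -pthread window_demo.c */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define ITERS 1000000

static atomic_int lock_word = 0;	/* 1 = free, 0 = held (like lock->count) */
static atomic_int owner     = 0;	/* nonzero = somebody owns the lock */
static atomic_int stop      = 0;

/* The thief: grab the lock the instant it looks free, mark itself as
 * owner for a moment, then release it again. */
static void *stealer(void *arg)
{
	(void)arg;
	while (!atomic_load(&stop)) {
		int expected = 1;
		if (atomic_compare_exchange_strong(&lock_word, &expected, 0)) {
			atomic_store(&owner, 1);
			atomic_store(&owner, 0);
			atomic_store(&lock_word, 1);
		}
	}
	return NULL;
}

int main(void)
{
	pthread_t t;
	long skipped = 0;

	pthread_create(&t, NULL, stealer, NULL);

	for (long i = 0; i < ITERS; i++) {
		/* "Unlock slowpath": make the lock available ... */
		atomic_store(&lock_word, 1);
		/* ... and immediately check whether a thief already owns it. */
		if (atomic_load(&owner))
			skipped++;
		/* Re-acquire for the next round (may have to wait out the thief). */
		int expected = 1;
		while (!atomic_compare_exchange_weak(&lock_word, &expected, 0))
			expected = 1;
	}

	atomic_store(&stop, 1);
	pthread_join(t, NULL);
	printf("owner observed in the window: %ld of %d releases\n",
	       skipped, ITERS);
	return 0;
}

The analogous window in the real unlock path is between
atomic_set(&lock->count, 1) and the mutex_has_owner() check.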

-Longman

