From: Thomas Gleixner <tglx@linutronix.de>
To: Davidlohr Bueso <dave@stgolabs.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mike Galbraith <umgwanakikbuti@gmail.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH -rfc 4/4] locking/rtmutex: Support spin on owner (osq)
Date: Tue, 9 Jun 2015 11:29:59 +0200 (CEST)
Message-ID: <alpine.DEB.2.11.1506091028500.4133@nanos>
In-Reply-To: <1433824902.3165.61.camel@stgolabs.net>

On Mon, 8 Jun 2015, Davidlohr Bueso wrote:
> On Fri, 2015-06-05 at 15:59 +0200, Thomas Gleixner wrote:
> > rt_mutex_has_waiters() looks at the root pointer of the rbtree head
> > whether that's empty. You can do a lockless check of that as well,
> > right? So what's the FAST part of that function and how is that
> > related to a point after we called mark_rt_mutex_waiters()?
> 
> You're right, we could use rt_mutex_has_waiters(). When I thought of
> this originally, I was considering something like:
> 
> if (rt_mutex_has_waiters(lock)) {
> 	if (current->prio >= rt_mutex_top_waiter(lock)->prio)
> 	...
> 
> Which obviously requires the wait_lock, but I did not consider just
> using the tree. However, the consequence I see in doing this is that we
> would miss scenarios where mark_rt_mutex_waiters() is called (under nil
> owner, for example), so we would force tasks to block only when there
> are truly waiters.

Fair enough. But we really want a proper comment explaining it.

> > > +static bool rt_mutex_optimistic_spin(struct rt_mutex *lock)
> > > +{
> > > +	bool taken = false;
> > > +
> > > +	preempt_disable();
> > > +
> > > +	if (!rt_mutex_can_spin_on_owner(lock))
> > > +		goto done;
> > > +	/*
> > > +	 * In order to avoid a stampede of mutex spinners trying to
> > > +	 * acquire the mutex all at once, the spinners need to take a
> > > +	 * MCS (queued) lock first before spinning on the owner field.
> > > +	 */
> > > +	if (!osq_lock(&lock->osq))
> > > +		goto done;
> > 
> > Hmm. The queue lock is serializing potential spinners, right?
> 
> Yes.
> 
> > 
> > So that's going to lead to a potential priority ordering problem
> > because if a lower prio task wins the race to the osq_lock queue,
> > then the higher prio waiter will be queued behind and blocked from
> > taking the lock first.
> 
> Hmm yes, osq is a fair lock. However I believe this is mitigated by (a)
> the conservative spinning approach, and (b) by osq_lock's need_resched()
> check, so at least a spinner will abort if a higher prio task comes in.
> But of course, this only deals with spinners, and we cannot account for
> a lower prio owner task.
> 
> So if this is not acceptable, I guess we'll have to do without the
> MCS-like properties.

I'd say it counts as priority inversion.

If you look at the RT code, then you'll notice that in the slow lock
path we queue the incoming waiter (including the PI dance) and then
spin only if the waiter is the top waiter on the lock.
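
A minimal userspace sketch of that ordering (enqueue first, then spin
only as top waiter), not the actual RT code. All names here
(model_lock, model_enqueue, model_top_waiter, model_should_spin) are
hypothetical, the PI chain walk is left out entirely, and a larger
prio value is assumed to mean higher priority, as in the diagram
further down:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 8

struct task {
	int prio;	/* assumption: larger value == higher priority */
	bool on_cpu;
};

/* Toy stand-in for an rtmutex: an owner plus an unsorted waiter array. */
struct model_lock {
	struct task *owner;
	struct task *waiters[MAX_WAITERS];
	size_t nr_waiters;
};

/* Enqueue first, so the waiter is visible to every priority decision. */
static void model_enqueue(struct model_lock *lock, struct task *t)
{
	assert(lock->nr_waiters < MAX_WAITERS);
	lock->waiters[lock->nr_waiters++] = t;
}

/* Linear scan for the highest priority waiter; NULL if there is none. */
static struct task *model_top_waiter(const struct model_lock *lock)
{
	struct task *top = NULL;

	for (size_t i = 0; i < lock->nr_waiters; i++)
		if (!top || lock->waiters[i]->prio > top->prio)
			top = lock->waiters[i];
	return top;
}

/* Spin only when queued as top waiter and the owner is running. */
static bool model_should_spin(const struct model_lock *lock,
			      const struct task *t)
{
	return lock->owner && lock->owner->on_cpu &&
	       model_top_waiter(lock) == t;
}
```

Because the waiter is queued before the spin decision is made, a
lower prio task can never spin ahead of a higher prio one: it stops
being the top waiter the moment the higher prio task is enqueued.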

Surely it would be nice to avoid the whole PI dance, but OTOH it can
lead to the following issue (and some others):

CPU0            CPU1

T0 prio=10      T1 prio=20
lock(RTM);
                lock(RTM);
                spin()
->preempt()
T2 prio=15         if (!owner->on_cpu)
                       break;
                block_on_rtmutex();
                        prio_boost(T0);
->preempt()
T0 prio=20
unlock(RTM);

IIRC, the initial RT attempt was to spin w/o the PI dance but we gave
up on it due to latency and correctness issues.
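
The window in the scenario above can be replayed with a toy model.
This is only a sketch under simplifying assumptions: sched_task,
pick_next and pi_boost are made-up names, CPU0 just runs whichever of
two runnable tasks has the larger prio value, and the boost is a plain
assignment rather than a PI chain walk:

```c
#include <assert.h>

struct sched_task {
	const char *name;
	int prio;	/* assumption: larger value == higher priority */
};

/* CPU0 runs whichever of the two runnable tasks has the higher prio. */
static const struct sched_task *pick_next(const struct sched_task *a,
					  const struct sched_task *b)
{
	assert(a && b);
	return (b->prio > a->prio) ? b : a;
}

/* Priority inheritance: the owner inherits a higher waiter priority. */
static void pi_boost(struct sched_task *owner,
		     const struct sched_task *waiter)
{
	if (waiter->prio > owner->prio)
		owner->prio = waiter->prio;
}
```

With T0 at prio 10 and T2 at prio 15, pick_next() selects T2 while T1
(prio 20) is still spinning unqueued, so T1 is effectively delayed by
a lower prio task. Only after T1 blocks and pi_boost(T0, T1) has run
does pick_next() select T0 again.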

I wrapped my head around doing the following:

  1) Mark the owner as boosted in a lightweight way, w/o actually
     boosting it, to keep it on the cpu

  2) Have a priority check in the spin to drop out when a higher prio
     waiter comes in.

But #1 is a nightmare in the scheduler to do, so I gave up on it. If
you have a better idea, you're welcome.
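
For reference, #2 on its own could look roughly like the userspace
sketch below, using C11 atomics. Nothing here is kernel API:
spin_lock_model, model_spin, the result codes and the bounded spin
budget are all assumptions made for illustration, and a larger prio
value is again assumed to mean higher priority. It does not address
#1 at all.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_WAITER_PRIO INT_MIN	/* hypothetical "no waiter queued" marker */

enum spin_result {
	SPIN_ACQUIRED,
	SPIN_HIGHER_PRIO_WAITER,
	SPIN_OWNER_OFF_CPU,
	SPIN_BUDGET_EXHAUSTED,
};

struct spin_lock_model {
	atomic_bool locked;
	atomic_bool owner_on_cpu;
	atomic_int top_waiter_prio;	/* NO_WAITER_PRIO if nobody waits */
};

static enum spin_result model_spin(struct spin_lock_model *lock,
				   int my_prio, unsigned int budget)
{
	assert(lock);

	while (budget--) {
		bool expected = false;

		/* Drop out first, so we never steal from a higher prio waiter. */
		if (atomic_load(&lock->top_waiter_prio) > my_prio)
			return SPIN_HIGHER_PRIO_WAITER;

		/* Try to take the lock if it has been released. */
		if (atomic_compare_exchange_strong(&lock->locked,
						   &expected, true))
			return SPIN_ACQUIRED;

		/* Spinning is pointless once the owner is scheduled out. */
		if (!atomic_load(&lock->owner_on_cpu))
			return SPIN_OWNER_OFF_CPU;
	}
	return SPIN_BUDGET_EXHAUSTED;
}
```

On any result other than SPIN_ACQUIRED the caller would fall back to
the blocking slow path.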

Thanks,

	tglx
