linux-kernel.vger.kernel.org archive mirror
From: David Laight <David.Laight@ACULAB.COM>
To: 'Mathieu Desnoyers' <mathieu.desnoyers@efficios.com>,
	'Peter Zijlstra' <peterz@infradead.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"H . Peter Anvin" <hpa@zytor.com>, "Paul Turner" <pjt@google.com>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	"Christian Brauner" <brauner@kernel.org>,
	"Florian Weimer" <fw@deneb.enyo.de>,
	"carlos@redhat.com" <carlos@redhat.com>,
	"Peter Oskolkov" <posk@posk.io>,
	"Alexander Mikhalitsyn" <alexander@mihalicyn.com>,
	"Chris Kennelly" <ckennelly@google.com>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Darren Hart" <dvhart@infradead.org>,
	"Davidlohr Bueso" <dave@stgolabs.net>,
	"André Almeida" <andrealmeid@igalia.com>,
	"libc-alpha@sourceware.org" <libc-alpha@sourceware.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Noah Goldstein" <goldstein.w.n@gmail.com>,
	"Daniel Colascione" <dancol@google.com>,
	"longman@redhat.com" <longman@redhat.com>,
	"Florian Weimer" <fweimer@redhat.com>
Subject: RE: [RFC PATCH v2 1/4] rseq: Add sched_state field to struct rseq
Date: Thu, 28 Sep 2023 14:33:19 +0000
Message-ID: <e2957e5bc071480889f4e1aa32b9cdea@AcuMS.aculab.com>
In-Reply-To: <34ddb730-8893-19a8-00fe-84c4e281eef1@efficios.com>

From: Mathieu Desnoyers
> Sent: 28 September 2023 14:21
> 
> On 9/28/23 07:22, David Laight wrote:
> > From: Peter Zijlstra
> >> Sent: 28 September 2023 11:39
> >>
> >> On Mon, May 29, 2023 at 03:14:13PM -0400, Mathieu Desnoyers wrote:
> >>> Expose the "on-cpu" state for each thread through struct rseq to allow
> >>> adaptive mutexes to decide more accurately between busy-waiting and
> >>> calling sys_futex() to release the CPU, based on the on-cpu state of the
> >>> mutex owner.
> >
> > Are you trying to avoid spinning when the owning process is sleeping?
> 
> Yes, this is my main intent.
> 
> > Or trying to avoid the system call when it will find that the futex
> > is no longer held?
> >
> > The latter is really horribly detrimental.
> 
> That's a good question. What should we do in these three situations
> when trying to grab the lock:
> 
> 1) Lock has no owner
> 
> We probably want to simply grab the lock with an atomic instruction. But
> then if other threads are queued on sys_futex and did not manage to grab
> the lock yet, this would be detrimental to fairness.
> 
> 2) Lock owner is running:
> 
> The lock owner is certainly running on another cpu (I'm using the term
> "cpu" here as logical cpu).
> 
> I guess we could either decide to bypass sys_futex entirely and try to
> grab the lock with an atomic, or we go through sys_futex nevertheless to
> allow futex to guarantee some fairness across threads.

I'd not worry about 'fairness'.
If the mutex is that contended you've already lost!

I had a big problem trying to avoid the existing 'fairness' code.
Consider 30 RT threads blocked in cv_wait() on the same condvar.
Something does cv_broadcast() and you want them all to wake up.
They'll all release the mutex pretty quickly - it doesn't matter if they spin.
But what actually happens is one thread is woken up.
Once it has been scheduled (after the cpu has come out of a sleep state
and/or any hardware interrupts have completed, etc.) the next thread is woken.
If you are lucky it'll 'only' take a few ms to get them all running.
Not good when you are trying to process audio every 10ms.
I had to use a separate cv for each thread and get the woken threads
to help with the wakeups. God knows what happens with 256 threads!
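
A minimal sketch of that kind of workaround, under stated assumptions: each
waiter gets its own condvar (and, as an assumption here, its own mutex), and
every woken thread signals two more waiters before doing its own work. The
binary-tree fan-out, NWAITERS, and all function names are hypothetical and
chosen for illustration; the message above only says that woken threads
helped with the wakeups.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NWAITERS 8              /* hypothetical; small for illustration */

/* One mutex/condvar pair per waiter, so waiters never share a lock. */
struct waiter {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    bool            go;         /* predicate guarded by 'lock' */
};

static struct waiter waiters[NWAITERS];
static atomic_int woken_count;

/* Signal waiter i, if it exists. Safe even if it is not waiting yet. */
static void wake_one(int i)
{
    if (i >= NWAITERS)
        return;
    pthread_mutex_lock(&waiters[i].lock);
    waiters[i].go = true;
    pthread_cond_signal(&waiters[i].cv);
    pthread_mutex_unlock(&waiters[i].lock);
}

static void *waiter_main(void *arg)
{
    int i = (int)(intptr_t)arg;
    struct waiter *w = &waiters[i];

    pthread_mutex_lock(&w->lock);
    while (!w->go)
        pthread_cond_wait(&w->cv, &w->lock);
    pthread_mutex_unlock(&w->lock);

    /* Help with the wakeups first: fan out to two children. */
    wake_one(2 * i + 1);
    wake_one(2 * i + 2);

    atomic_fetch_add(&woken_count, 1);
    return NULL;
}

/* Start all waiters, wake only the root, return how many threads ran. */
static int run_chained_wakeup(void)
{
    pthread_t tid[NWAITERS];

    atomic_store(&woken_count, 0);
    for (int i = 0; i < NWAITERS; i++) {
        pthread_mutex_init(&waiters[i].lock, NULL);
        pthread_cond_init(&waiters[i].cv, NULL);
        waiters[i].go = false;
    }
    for (int i = 0; i < NWAITERS; i++) {
        int rc = pthread_create(&tid[i], NULL, waiter_main,
                                (void *)(intptr_t)i);
        assert(rc == 0);
        (void)rc;
    }
    wake_one(0);                /* the "broadcast" is a single signal */
    for (int i = 0; i < NWAITERS; i++)
        pthread_join(tid[i], NULL);
    return atomic_load(&woken_count);
}
```

With this shape the waker issues one signal and the remaining wakeups proceed
in parallel, so the depth of the wakeup chain grows with log2 of the thread
count rather than linearly.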

> 3) Lock owner is sleeping:
> 
> The lock owner may be either tied to the same cpu as the requester, or a
> different cpu. Here calling FUTEX_WAIT and friends is pretty much required.

You'd need the 'holding process is sleeping' test to be significantly
faster than the 'optimistic spin hoping the mutex will be released'.
And for the 'spin' to be longer than the syscall time for futex.
Otherwise you are optimising an already slow path.
If the thread is going to have to sleep until the thread that owns
a mutex wakes up then I can't imagine performance mattering.

OTOH it is much more usual for the owning thread to be running and
release the mutex quickly.

I wouldn't have thought it was really worth optimising for the
'lock owner is sleeping' case.
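
To make the three cases above concrete, here is a rough spin-then-futex
sketch. Everything in it is an assumption for illustration: owner_on_cpu() is
a hypothetical stand-in for whatever the proposed rseq sched_state query would
provide (here it always returns true), SPIN_LIMIT is an arbitrary bound, and
the 0/1/2 state encoding is the well-known three-state futex mutex rather than
anything taken from the patch series.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

#define SPIN_LIMIT   100        /* hypothetical spin bound */
#define DEMO_THREADS 4
#define DEMO_ITERS   10000

/* state: 0 = unlocked, 1 = locked, 2 = locked with possible sleepers */
struct adaptive_mutex { atomic_int state; };

/* Hypothetical: would report whether the lock owner is on a cpu. */
static bool owner_on_cpu(const struct adaptive_mutex *m)
{
    (void)m;
    return true;
}

static long futex(atomic_int *uaddr, int op, int val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static void adaptive_lock(struct adaptive_mutex *m)
{
    int c = 0;

    /* Case 1: no owner, take the lock with one atomic. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;

    /* Case 2: owner running, spin briefly without a system call. */
    for (int i = 0; i < SPIN_LIMIT && owner_on_cpu(m); i++) {
        c = 0;
        if (atomic_load_explicit(&m->state, memory_order_relaxed) == 0 &&
            atomic_compare_exchange_strong(&m->state, &c, 1))
            return;
    }

    /* Case 3 (or spin budget exhausted): mark contended and sleep. */
    while (atomic_exchange(&m->state, 2) != 0)
        futex(&m->state, FUTEX_WAIT_PRIVATE, 2);
}

static void adaptive_unlock(struct adaptive_mutex *m)
{
    /* Only pay for the wake syscall if someone may be sleeping. */
    if (atomic_exchange(&m->state, 0) == 2)
        futex(&m->state, FUTEX_WAKE_PRIVATE, 1);
}

static struct adaptive_mutex demo_lock;
static long demo_counter;       /* protected by demo_lock */

static void *demo_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < DEMO_ITERS; i++) {
        adaptive_lock(&demo_lock);
        demo_counter++;
        adaptive_unlock(&demo_lock);
    }
    return NULL;
}

/* Hammer the lock from several threads and return the final count. */
static long run_demo(void)
{
    pthread_t tid[DEMO_THREADS];

    atomic_store(&demo_lock.state, 0);
    demo_counter = 0;
    for (int i = 0; i < DEMO_THREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, demo_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < DEMO_THREADS; i++)
        pthread_join(tid[i], NULL);
    return demo_counter;
}
```

The on-cpu check only affects whether the spin loop is entered at all, which
is why it has to be cheaper than the spin itself to be worth having.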

	David
