linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Noah Goldstein <goldstein.w.n@gmail.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>,
	linux-kernel@vger.kernel.org,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"H . Peter Anvin" <hpa@zytor.com>, "Paul Turner" <pjt@google.com>,
	linux-api@vger.kernel.org,
	"Christian Brauner" <brauner@kernel.org>,
	"Florian Weimer" <fw@deneb.enyo.de>,
	David.Laight@aculab.com, carlos@redhat.com,
	"Peter Oskolkov" <posk@posk.io>,
	"Alexander Mikhalitsyn" <alexander@mihalicyn.com>,
	"Chris Kennelly" <ckennelly@google.com>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Darren Hart" <dvhart@infradead.org>,
	"Davidlohr Bueso" <dave@stgolabs.net>,
	"André Almeida" <andrealmeid@igalia.com>,
	libc-alpha@sourceware.org, "Steven Rostedt" <rostedt@goodmis.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Florian Weimer" <fweimer@redhat.com>
Subject: Re: [RFC PATCH 1/4] rseq: Add sched_state field to struct rseq
Date: Tue, 23 May 2023 11:32:40 -0500	[thread overview]
Message-ID: <CAFUsyf+L6JF=pZ6QstQhdGGPVM7e7ML2a5LEbzmP6sTs3cwJng@mail.gmail.com> (raw)
In-Reply-To: <cdac8821-a298-aced-8084-8da3ba64a1be@efficios.com>

On Tue, May 23, 2023 at 7:49 AM Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
>
> On 2023-05-19 16:51, Noah Goldstein wrote:
> > On Wed, May 17, 2023 at 10:28 AM Mathieu Desnoyers via Libc-alpha
> > <libc-alpha@sourceware.org> wrote:
> >>
> >> Expose the "on-cpu" state for each thread through struct rseq to allow
> >> adaptative mutexes to decide more accurately between busy-waiting and
> >> calling sys_futex() to release the CPU, based on the on-cpu state of the
> >> mutex owner.
> >>
> >> It is only provided as an optimization hint, because there is no
> >> guarantee that the page containing this field is in the page cache, and
> >> therefore the scheduler may very well fail to clear the on-cpu state on
> >> preemption. This is expected to be rare though, and is resolved as soon
> >> as the task returns to user-space.
> >>
> >> The goal is to improve use-cases where the duration of the critical
> >> sections for a given lock follows a multi-modal distribution, preventing
> >> statistical guesses from doing a good job at choosing between busy-wait
> >> and futex wait behavior.
> >>
> >> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> >> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> >> Cc: Jonathan Corbet <corbet@lwn.net>
> >> Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
> >> Cc: Carlos O'Donell <carlos@redhat.com>
> >> Cc: Florian Weimer <fweimer@redhat.com>
> >> Cc: libc-alpha@sourceware.org
> >> ---
> >>   include/linux/sched.h     | 12 ++++++++++++
> >>   include/uapi/linux/rseq.h | 17 +++++++++++++++++
> >>   kernel/rseq.c             | 14 ++++++++++++++
> >>   3 files changed, 43 insertions(+)
> >>
> >> diff --git a/include/linux/sched.h b/include/linux/sched.h
> >> index eed5d65b8d1f..c7e9248134c1 100644
> >> --- a/include/linux/sched.h
> >> +++ b/include/linux/sched.h
> >> @@ -2351,11 +2351,20 @@ static inline void rseq_signal_deliver(struct ksignal *ksig,
> >>          rseq_handle_notify_resume(ksig, regs);
> >>   }
> >>
> >> +void __rseq_set_sched_state(struct task_struct *t, unsigned int state);
> >> +
> >> +static inline void rseq_set_sched_state(struct task_struct *t, unsigned int state)
> >> +{
> >> +       if (t->rseq)
> >> +               __rseq_set_sched_state(t, state);
> >> +}
> >> +
> >>   /* rseq_preempt() requires preemption to be disabled. */
> >>   static inline void rseq_preempt(struct task_struct *t)
> >>   {
> >>          __set_bit(RSEQ_EVENT_PREEMPT_BIT, &t->rseq_event_mask);
> >>          rseq_set_notify_resume(t);
> >> +       rseq_set_sched_state(t, 0);
> >
> > Should rseq_migrate also be made to update the cpu_id of the new core?
> > I imagine the usage of this will be something along the lines of:
> >
> > if(!on_cpu(mutex->owner_rseq_struct) &&
> >     cpu(mutex->owner_rseq_struct) == this_threads_cpu)
> >     // goto futex
> >
> > So I would think updating on migrate would be useful as well.
>
> I don't think we want to act differently based on the cpu on which the
> owner is queued.
>
> If the mutex owner is not on-cpu, and queued on the same cpu as the
> current thread, we indeed want to call sys_futex WAIT.
>
> If the mutex owner is not on-cpu, but queued on a different cpu than the
> current thread, we *still* want to call sys_futex WAIT, because
> busy-waiting for a thread which is queued but not currently running is
> wasteful.
>
I think this is less clear. In some cases sure but not always. Going
to the futex
has more latency that userland waits, and if the system is not busy (other than
the one process) most likely less latency that yield. Also going to the futex
requires a syscall on unlock.

For example if the critical section is expected to be very small, it
would be easy
to imagine the lock be better implemented with:
while(is_locked)
  if (owner->on_cpu || owner->cpu != my_cpu)
    exponential backoff
  else
    yield

Its not that "just go to futex" doesn't ever make sense, but I don't
think its fair
to say that *always* the case.

Looking at the kernel code, it doesn't seem to be a particularly high cost to
keep the CPU field updated during migration so seems like a why not
kind of question.
> Or am I missing something ?
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> https://www.efficios.com
>

  reply	other threads:[~2023-05-23 16:33 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-17 15:26 [RFC PATCH 0/4] Extend rseq with sched_state field Mathieu Desnoyers
2023-05-17 15:26 ` [RFC PATCH 1/4] rseq: Add sched_state field to struct rseq Mathieu Desnoyers
2023-05-17 16:03   ` Davidlohr Bueso
2023-05-18 21:49   ` Boqun Feng
2023-05-19 14:15     ` Mathieu Desnoyers
2023-05-19 17:18       ` Boqun Feng
2023-05-23 14:10         ` Mathieu Desnoyers
2023-05-19 20:51   ` Noah Goldstein
2023-05-23 12:49     ` Mathieu Desnoyers
2023-05-23 16:32       ` Noah Goldstein [this message]
2023-05-23 17:30         ` Mathieu Desnoyers
2023-05-23 20:10           ` Noah Goldstein
2023-05-17 15:26 ` [RFC PATCH 2/4] selftests/rseq: Add sched_state rseq field and getter Mathieu Desnoyers
2023-05-28 14:04   ` kernel test robot
2023-05-17 15:26 ` [RFC PATCH 3/4] selftests/rseq: Implement sched state test program Mathieu Desnoyers
2023-05-17 15:26 ` [RFC PATCH 4/4] selftests/rseq: Implement rseq_mutex " Mathieu Desnoyers
2023-05-17 16:07 ` [RFC PATCH 0/4] Extend rseq with sched_state field Davidlohr Bueso
2023-05-17 18:36 ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFUsyf+L6JF=pZ6QstQhdGGPVM7e7ML2a5LEbzmP6sTs3cwJng@mail.gmail.com' \
    --to=goldstein.w.n@gmail.com \
    --cc=David.Laight@aculab.com \
    --cc=alexander@mihalicyn.com \
    --cc=andrealmeid@igalia.com \
    --cc=boqun.feng@gmail.com \
    --cc=brauner@kernel.org \
    --cc=carlos@redhat.com \
    --cc=ckennelly@google.com \
    --cc=corbet@lwn.net \
    --cc=dave@stgolabs.net \
    --cc=dvhart@infradead.org \
    --cc=fw@deneb.enyo.de \
    --cc=fweimer@redhat.com \
    --cc=hpa@zytor.com \
    --cc=libc-alpha@sourceware.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=posk@posk.io \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).