From: Peter Zijlstra <peterz@infradead.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: Peter Oskolkov <posk@posk.io>, Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	juri.lelli@redhat.com,
	Vincent Guittot <vincent.guittot@linaro.org>,
	dietmar.eggemann@arm.com, Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>,
	mgorman@suse.de, bristot@redhat.com,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	linux-api@vger.kernel.org, x86@kernel.org,
	Paul Turner <pjt@google.com>, Peter Oskolkov <posk@google.com>,
	Andrei Vagin <avagin@google.com>, Jann Horn <jannh@google.com>,
	Thierry Delisle <tdelisle@uwaterloo.ca>
Subject: Re: [RFC][PATCH 0/3] sched: User Managed Concurrency Groups
Date: Wed, 15 Dec 2021 18:54:44 +0100
Message-ID: <Ybor5FvS9i560Db4@hirez.programming.kicks-ass.net>
In-Reply-To: <YbnyaCT4wh0II/Ew@casper.infradead.org>

On Wed, Dec 15, 2021 at 01:49:28PM +0000, Matthew Wilcox wrote:
> On Wed, Dec 15, 2021 at 11:44:49AM +0100, Peter Zijlstra wrote:
> > On Tue, Dec 14, 2021 at 07:46:25PM -0800, Peter Oskolkov wrote:
> > 
> > > Anyway, I'll test your patchset over the next week or so and let you
> > > know if anything really needed is missing (other than waking an idle
> > > server if there is one on a worker wakeup; this piece is definitely
> > > needed).
> > 
> > Right, so the problem I'm having is that a single idle server ptr like
> > before can trivially miss waking another idle server.
> > 
> > Suppose:
> > 
> > 	umcg::idle_server_tid_ptr
> > 
> > Then the enqueue_and_wake() thing from the last patch would:
> > 
> > 	idle_server_tid = xchg((pid_t __user *)self->idle_server_tid_ptr, 0);
> > 
> > to consume the tid, and then use that to enqueue and wake. But what if a
> > second wakeup happens right after that? There might be a second idle
> > server, but we'll never find it, because userspace hasn't had time to
> > update the field again.
> > 
> > Alternatively, we do a linked list of servers, but then every such
> > wakeup needs to iterate the whole list, looking for one that has
> > UMCG_TF_IDLE set, or something like that, but that lookup is bad for
> > performance.
> > 
> > So I'm really not sure what way to go yet.
> 
> 1. Linked lists are fugly and bad for the CPU.

Absolutely, although a stack might work, except for the ABA issue (and
contention).
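
For illustration only, here is a rough userspace sketch of such a stack,
using a generation tag in the head word as one common way to sidestep ABA.
Nothing here is from the patch set: the names (idle_stack, idle_push,
idle_pop), the fixed NR_SERVERS, and the use of server indices instead of
tids are all assumptions. It also does nothing about contention, since
every push and pop still hits the same head word.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NR_SERVERS 64u          /* hypothetical upper bound on servers */
#define NIL        UINT32_MAX   /* "empty stack" / "no next" marker */

/*
 * Hypothetical idle-server stack. head packs {generation:32, index:32};
 * the generation changes on every successful push or pop, so a head that
 * was popped and pushed back in between no longer compares equal.
 */
struct idle_stack {
	_Atomic uint64_t head;
	_Atomic uint32_t next[NR_SERVERS];
};

static uint64_t pack(uint32_t gen, uint32_t idx)
{
	return ((uint64_t)gen << 32) | idx;
}

static void idle_stack_init(struct idle_stack *s)
{
	atomic_store(&s->head, pack(0, NIL));
}

/* Push server 'idx'; the caller must not push an index already on the stack. */
static void idle_push(struct idle_stack *s, uint32_t idx)
{
	uint64_t old = atomic_load(&s->head);

	do {
		/* Link to the current top before trying to publish ourselves. */
		atomic_store(&s->next[idx], (uint32_t)old);
	} while (!atomic_compare_exchange_weak(&s->head, &old,
			pack((uint32_t)(old >> 32) + 1, idx)));
}

/* Pop one idle server index, or NIL if the stack is empty. */
static uint32_t idle_pop(struct idle_stack *s)
{
	uint64_t old = atomic_load(&s->head);

	for (;;) {
		uint32_t idx = (uint32_t)old;
		uint32_t nxt;

		if (idx == NIL)
			return NIL;

		nxt = atomic_load(&s->next[idx]);
		/* Fails (and reloads 'old') if anything touched head meanwhile. */
		if (atomic_compare_exchange_weak(&s->head, &old,
				pack((uint32_t)(old >> 32) + 1, nxt)))
			return idx;
	}
}
```

Without the generation half, a pop could read next[idx], get delayed while
idx is popped and pushed back with a different successor, and then succeed
with the stale successor; that is the ABA problem. The tag only narrows the
window to a 32-bit wraparound, it does not remove it in principle.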

> 2. I'm not sure how big the 'N' in 'M:N' is supposed to be.  Might be
> one per hardware thread?  So it could be hundreds-to-thousands,
> depending on the scale of system.

Typically yes, one server task per hardware thread. Now, I'm also fairly
sure you don't want excessive cross-node traffic for this stuff, so that
puts a limit on things as well.

> 3. The interface between user-kernel could be an array of idle tids,
> maybe 16 entries long (16 * 4 = 64 bytes, just one cacheline).  As a
> server finishes work, it looks for a 0 tid in the batch and stores
> its tid in the slot (cmpxchg, I guess, since the array will be shared
> between processes).  If there are no free slots in the array, then we
> definitely have 16 threads already waiting for work, so it can park itself
> in whatever data structure userspace wants to use to manage idle servers.
> It's up to userspace to decide when to repopulate the array of available
> servers from its data structure of idle servers.

Right, a tid array might work. Could even have userspace specify the
length, then it can do the trade-offs all on its own. Either a fixed
location for each server and a larger array, or clever things, whatever
they want.

I suppose I'll code up the variable length array, we have space for
that.
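
To make the idea concrete, a minimal userspace sketch of such an array
follows, using C11 atomics in place of the kernel's user-access helpers.
It is not the eventual implementation: struct idle_tids, idle_publish()
and idle_claim() are hypothetical names, and the only conventions taken
from the discussion are that a slot holding 0 is free, that a server
stores its tid with cmpxchg, and that the waker consumes a tid with xchg.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: an array of idle server tids whose length is chosen
 * by userspace. A slot holding 0 is free.
 */
struct idle_tids {
	size_t nr;               /* number of slots */
	_Atomic int32_t *slot;   /* nr entries */
};

/*
 * Server side: advertise 'tid' in the first free slot.
 * Returns the slot index, or -1 if every slot is already taken, in which
 * case the server parks itself in whatever structure userspace prefers.
 */
static int idle_publish(struct idle_tids *a, int32_t tid)
{
	for (size_t i = 0; i < a->nr; i++) {
		int32_t expected = 0;

		if (atomic_compare_exchange_strong(&a->slot[i], &expected, tid))
			return (int)i;
	}
	return -1;
}

/*
 * Waker side: claim any advertised tid, clearing its slot.
 * Returns 0 if no idle server is currently advertised.
 */
static int32_t idle_claim(struct idle_tids *a)
{
	for (size_t i = 0; i < a->nr; i++) {
		int32_t tid;

		/* Cheap read first, so empty slots are not written to. */
		if (atomic_load_explicit(&a->slot[i], memory_order_relaxed) == 0)
			continue;

		tid = atomic_exchange(&a->slot[i], 0);
		if (tid)
			return tid;
	}
	return 0;
}
```

With more than one slot, two wakeups in quick succession can each claim a
different idle server without waiting for userspace to refill anything,
which is exactly what the single idle_server_tid_ptr could not do. The
"fixed location for each server" variant would size the array to the number
of servers and let each server write only its own slot.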
