linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thierry Delisle <tdelisle@uwaterloo.ca>
To: <posk@posk.io>
Cc: <avagin@google.com>, <bsegall@google.com>, <jannh@google.com>,
	<jnewsome@torproject.org>, <joel@joelfernandes.org>,
	<linux-api@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<mingo@redhat.com>, <peterz@infradead.org>, <pjt@google.com>,
	<posk@google.com>, <tglx@linutronix.de>,
	Peter Buhr <pabuhr@uwaterloo.ca>,
	Martin Karsten <mkarsten@uwaterloo.ca>
Subject: Re: [RFC PATCH 3/3 v0.2] sched/umcg: RFC: implement UMCG syscalls
Date: Sun, 11 Jul 2021 14:29:39 -0400	[thread overview]
Message-ID: <bb30216c-4339-2703-9d87-9326af86a7b0@uwaterloo.ca> (raw)
In-Reply-To: <20210708194638.128950-4-posk@google.com>

 > Let's move the discussion to the new thread.

I'm happy to start a new thread. I'm re-responding to my last post 
because many
of my questions are still unanswered.

 > + * State transitions:
 > + *
 > + * RUNNING => IDLE:   the current RUNNING task becomes IDLE by calling
 > + *                    sys_umcg_wait();
 >
 > [...]
 >
 > +/**
 > + * enum umcg_wait_flag - flags to pass to sys_umcg_wait
 > + * @UMCG_WAIT_WAKE_ONLY: wake @self->next_tid, don't put @self to sleep;
 > + * @UMCG_WF_CURRENT_CPU: wake @self->next_tid on the current CPU
 > + *                       (use WF_CURRENT_CPU); @UMCG_WAIT_WAKE_ONLY 
must be set.
 > + */
 > +enum umcg_wait_flag {
 > +    UMCG_WAIT_WAKE_ONLY = 1,
 > +    UMCG_WF_CURRENT_CPU = 2,
 > +};

What is the purpose of using sys_umcg_wait without next_tid or with
UMCG_WAIT_WAKE_ONLY? It looks like Java's park/unpark semantics to me, 
that is
worker threads can use this for synchronization and mutual exclusion. In 
this
case, how do these compare to using FUTEX_WAIT/FUTEX_WAKE?


 > +struct umcg_task {
 > [...]
 > +    /**
 > +     * @server_tid: the TID of the server UMCG task that should be
 > +     *              woken when this WORKER becomes BLOCKED. Can be zero.
 > +     *
 > +     *              If this is a UMCG server, @server_tid should
 > +     *              contain the TID of @self - it will be used to find
 > +     *              the task_struct to wake when pulled from
 > +     *              @idle_servers.
 > +     *
 > +     * Read-only for the kernel, read/write for the userspace.
 > +     */
 > +    uint32_t    server_tid;        /* r   */
 > [...]
 > +    /**
 > +     * @idle_servers_ptr: a single-linked list pointing to the list
 > +     *                    of idle servers. Can be NULL.
 > +     *
 > +     * Readable/writable by both the kernel and the userspace: the
 > +     * userspace adds items to the list, the kernel removes them.
 > +     *
 > +     * TODO: describe how the list works.
 > +     */
 > +    uint64_t    idle_servers_ptr;    /* r/w */
 > [...]
 > +} __attribute__((packed, aligned(8 * sizeof(__u64))));

 From the comments and by elimination, I'm guessing that idle_servers_ptr is
somehow used by servers to block until some worker threads become idle. 
However,
I do not understand how the userspace is expected to use it. I also do not
understand if these link fields form a stack or a queue and where is the 
head.


 > +/**
 > + * sys_umcg_ctl: (un)register a task as a UMCG task.
 > + * @flags:       ORed values from enum umcg_ctl_flag; see below;
 > + * @self:        a pointer to struct umcg_task that describes this
 > + *               task and governs the behavior of sys_umcg_wait if
 > + *               registering; must be NULL if unregistering.
 > + *
 > + * @flags & UMCG_CTL_REGISTER: register a UMCG task:
 > + *         UMCG workers:
 > + *              - self->state must be UMCG_TASK_IDLE
 > + *              - @flags & UMCG_CTL_WORKER
 > + *
 > + *         If the conditions above are met, sys_umcg_ctl() 
immediately returns
 > + *         if the registered task is a RUNNING server or basic task; 
an IDLE
 > + *         worker will be added to idle_workers_ptr, and the worker 
put to
 > + *         sleep; an idle server from idle_servers_ptr will be 
woken, if any.

This approach to creating UMCG workers concerns me a little. My 
understanding
is that in general, the number of servers controls the amount of parallelism
in the program. But in the case of creating new UMCG workers, the new 
threads
only respect the M:N threading model after sys_umcg_ctl has blocked. 
What does
this mean for applications that create thousands of short lived tasks? Are
users expcted to create pools of reusable UMCG workers?


I would suggest adding at least one uint64_t field to the struct 
umcg_task that
is left as-is by the kernel. This allows implementers of user-space
schedulers to add scheduler specific data structures to the threads without
needing some kind of table on the side.


  parent reply	other threads:[~2021-07-11 18:36 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-08 19:46 [RFC PATCH 0/3 v0.2] RFC: sched/UMCG Peter Oskolkov
2021-07-08 19:46 ` [RFC PATCH 1/3 v0.2] sched: add WF_CURRENT_CPU and externise ttwu Peter Oskolkov
2021-07-08 19:46 ` [RFC PATCH 2/3 v0.2] sched/umcg: RFC: add userspace atomic helpers Peter Oskolkov
2021-07-08 21:12   ` Jann Horn
2021-07-09  4:01     ` Peter Oskolkov
2021-07-09  8:01   ` Peter Zijlstra
2021-07-09 16:57     ` Peter Oskolkov
2021-07-09 17:33       ` Peter Oskolkov
2021-07-13 16:10       ` Peter Zijlstra
2021-07-13 17:14         ` Peter Oskolkov
2021-07-08 19:46 ` [RFC PATCH 3/3 v0.2] sched/umcg: RFC: implement UMCG syscalls Peter Oskolkov
2021-07-11 16:35   ` Peter Oskolkov
2021-07-11 18:29   ` Thierry Delisle [this message]
2021-07-12 15:40     ` Peter Oskolkov
2021-07-12 21:44       ` Thierry Delisle
2021-07-12 23:31         ` Peter Oskolkov
2021-07-13 14:02           ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bb30216c-4339-2703-9d87-9326af86a7b0@uwaterloo.ca \
    --to=tdelisle@uwaterloo.ca \
    --cc=avagin@google.com \
    --cc=bsegall@google.com \
    --cc=jannh@google.com \
    --cc=jnewsome@torproject.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mkarsten@uwaterloo.ca \
    --cc=pabuhr@uwaterloo.ca \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=posk@google.com \
    --cc=posk@posk.io \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).