From: "André Almeida" <andrealmeid@igalia.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Darren Hart <dvhart@infradead.org>,
	linux-kernel@vger.kernel.org
Cc: linux-api@vger.kernel.org, fweimer@redhat.com,
	libc-alpha@sourceware.org,
	Andrey Semashev <andrey.semashev@gmail.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Steven Rostedt <rostedt@goodmis.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: [RFC] futex2: add NUMA awareness
Date: Thu, 14 Jul 2022 00:18:51 -0300
Message-ID: <36a8f60a-69b2-4586-434e-29820a64cd88@igalia.com>

Hi,

futex2 is an ongoing project with the goal of creating a new interface
for futex that solves long-standing issues with the current syscall.

One of these problems is the lack of NUMA awareness for futex operations.
This RFC aims to gather feedback on a NUMA interface proposal.

 * The problem

futex has a single, global hash table that stores information about
current waiters, to be queried by wakers. On non-uniform machines, this
hash table is stored on a single node. This means that a process running
on another node incurs some overhead when using futex, given that it
needs to access the table on a different node.

 * A solution

For NUMA machines, one table would be allocated per node. Processes
would then be able to use the local table and avoid sharing data with
other nodes.
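
A minimal userspace sketch of the per-node table idea, not the kernel
implementation. NR_NODES, BUCKETS_PER_NODE, struct futex_bucket,
hash_uaddr() and bucket_for() are all hypothetical names chosen for
illustration. A static array is used for brevity; a real implementation
would need each table to live in its own node's memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for illustration. */
#define NR_NODES         4
#define BUCKETS_PER_NODE 256	/* must be a power of two */

struct futex_bucket {
	unsigned int nr_waiters;	/* stand-in for a real waiter list */
};

/* One table per node instead of a single global table. */
static struct futex_bucket tables[NR_NODES][BUCKETS_PER_NODE];

/* Toy hash of the futex address, reduced to a bucket index. */
static size_t hash_uaddr(const void *uaddr)
{
	uintptr_t x = (uintptr_t)uaddr;

	x ^= x >> 16;
	x *= 0x9E3779B1u;
	x ^= x >> 13;
	return (size_t)(x & (BUCKETS_PER_NODE - 1));
}

/* Waiters and wakers must agree on the node to find the same bucket. */
static struct futex_bucket *bucket_for(int node, const void *uaddr)
{
	assert(node >= 0 && node < NR_NODES);
	return &tables[node][hash_uaddr(uaddr)];
}
```

The same address maps to a different bucket on each node, which is why
the waiter and the waker need to agree on the node ID.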

 * The interface

Userspace needs to specify which node it would like to use to store/query
the futex table. The common case would be to operate on the current
node, but some cases could require operating on another one.

Before getting to the NUMA part, a quick recap of the syscall interface
of futex2:

futex_wait(void *uaddr, unsigned int val, unsigned int flags,
           struct timespec *timo)

futex_wake(void *uaddr, unsigned long nr_wake, unsigned int flags)

struct futex_requeue {
	void *uaddr;
	unsigned int flags;
};

futex_requeue(struct futex_requeue *rq1, struct futex_requeue *rq2,
	      unsigned int nr_wake, unsigned int nr_requeue,
	      u64 cmpval, unsigned int flags)


As requeue already has 6 arguments, we can't add an argument for the
node ID; we need to pack it in a struct. So then we have:

struct futexX_numa {
        __uX value;
        __sX hint;
};

Where X can be 8, 16, 32 or 64 (futex2 supports variable sized futexes).
`value` is the futex value and `hint` can be -1 for the current node, or
[0, MAX_NUMA_NODES) to specify a node. Example:

struct futex32_numa f = {.value = 0, .hint = -1};

...

futex_wait(&f, 0, FUTEX_NUMA | FUTEX_32, NULL);

Then &f would be used as the futex address, as expected, and the table
of the current node would be used. If an app expects calls from
different nodes, then it should name a specific node, for instance:

struct futex32_numa f = {.value = 0, .hint = 2};
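
To make the layout concrete, here is a sketch of the X = 32 variant,
assuming uint32_t/int32_t as stand-ins for __u32/__s32. The helper
futex32_numa_hint() is hypothetical; it only illustrates that the hint
can be found from the futex address alone (a kernel would have to copy
it from userspace rather than dereference the pointer directly).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* futexX_numa with X = 32. */
struct futex32_numa {
	uint32_t value;	/* the futex word */
	int32_t  hint;	/* -1 = current node, otherwise a node ID */
};

/* The futex word sits at offset 0, so &f doubles as the futex address. */
static_assert(offsetof(struct futex32_numa, value) == 0,
	      "value must be the first member");
/* The hint follows the value immediately, with no padding in between. */
static_assert(offsetof(struct futex32_numa, hint) == sizeof(uint32_t),
	      "hint must directly follow value");
static_assert(sizeof(struct futex32_numa) == 2 * sizeof(uint32_t),
	      "no trailing padding expected");

/* Hypothetical helper: read the node hint given only the futex address. */
static int32_t futex32_numa_hint(const void *uaddr)
{
	const struct futex32_numa *f = uaddr;

	return f->hint;
}
```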

For non-NUMA apps, a call without the FUTEX_NUMA flag would just use the
first node as default.
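
The node selection rules above could be sketched as follows. This is
only an illustration: the numeric values of FUTEX_NUMA and
MAX_NUMA_NODES are placeholders, futex_resolve_node() is a hypothetical
name, and returning -1 for an out-of-range hint is an assumption (the
proposal does not say how invalid hints are handled).

```c
#include <assert.h>
#include <stdint.h>

#define FUTEX_NUMA     0x08	/* placeholder value, not from the proposal */
#define MAX_NUMA_NODES 8	/* placeholder value */

/*
 * Pick the node whose table should be used for a futex operation.
 * Returns a node ID in [0, MAX_NUMA_NODES), or -1 for an invalid hint.
 */
static int futex_resolve_node(unsigned int flags, int32_t hint,
			      int current_node)
{
	assert(current_node >= 0 && current_node < MAX_NUMA_NODES);

	/* Non-NUMA call: always use the first node. */
	if (!(flags & FUTEX_NUMA))
		return 0;

	/* hint == -1 means "whatever node the caller is running on". */
	if (hint == -1)
		return current_node;

	/* Otherwise the hint must name a valid node. */
	if (hint < 0 || hint >= MAX_NUMA_NODES)
		return -1;

	return hint;
}
```

Note that with hint == -1 the result depends on where the caller runs,
so a waiter and a waker on different nodes would end up looking at
different tables; an explicit node ID avoids that.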

Feedback? Who else should I CC?

Thanks,
	André

Thread overview: 5+ messages
2022-07-14  3:18 André Almeida [this message]
2022-07-14 11:01 ` [RFC] futex2: add NUMA awareness Andrey Semashev
2022-07-14 15:00   ` André Almeida
2022-07-22 16:42     ` Andrey Semashev
2022-07-27 17:19       ` André Almeida
