* [RFC] futex2: add NUMA awareness
From: André Almeida @ 2022-07-14  3:18 UTC
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-kernel
  Cc: linux-api, fweimer, libc-alpha, Andrey Semashev, Davidlohr Bueso,
	Steven Rostedt, Sebastian Andrzej Siewior


futex2 is an ongoing project to create a new futex interface that
solves existing issues with the current syscall.

One of these problems is the lack of NUMA awareness for futex operations.
This RFC aims to gather feedback on a NUMA interface proposal.

 * The problem

futex has a single, global hash table that stores information about
current waiters, to be queried by wakers. On non-uniform machines, this
hash table is stored in a single node. This means that a process running
on another node incurs some overhead when using futex, given that it
needs to access the table on a different node.

 * A solution

For NUMA machines, one table would be allocated per node. Processes
would then be able to use the local table and avoid sharing data with
other nodes.
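
As a rough userspace illustration of the per-node idea (not kernel code):
the sketch below keeps one table per node and picks a bucket from a node
ID plus the futex address. NR_NODES, NR_BUCKETS, the struct names,
hash_uaddr() and futex_bucket_for() are all hypothetical names made up
for this example, and the hash is a toy, not the one the kernel uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for this illustration. */
#define NR_NODES   4
#define NR_BUCKETS 256

struct futex_bucket {
	unsigned int nr_waiters; /* stand-in for a real waiter queue */
};

/* One table per node instead of a single global table. */
struct futex_table {
	struct futex_bucket buckets[NR_BUCKETS];
};

static struct futex_table node_tables[NR_NODES];

/* Toy hash of the futex address (an assumption, not the kernel's hash). */
static size_t hash_uaddr(const void *uaddr)
{
	uintptr_t x = (uintptr_t)uaddr;

	x ^= x >> 12; /* fold some higher bits into the low ones */
	return (size_t)(x % NR_BUCKETS);
}

/* Pick the bucket for uaddr inside the table that belongs to node. */
static struct futex_bucket *futex_bucket_for(int node, const void *uaddr)
{
	assert(node >= 0 && node < NR_NODES);
	return &node_tables[node].buckets[hash_uaddr(uaddr)];
}
```

With this layout, the same address maps to the same bucket index in
every table, but a waiter and a waker only meet if they pass the same
node, which is why the interface has to let userspace name the node.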

 * The interface

Userspace needs to specify which node it would like to use to store/query
the futex table. The common case would be to operate on the current
node, but some cases could require operating on another one.

Before getting to the NUMA part, a quick recap of the syscalls interface
of futex2:

futex_wait(void *uaddr, unsigned int val, unsigned int flags,
           struct timespec *timo)

futex_wake(void *uaddr, unsigned long nr_wake, unsigned int flags)

struct futex_requeue {
	void *uaddr;
	unsigned int flags;
};

futex_requeue(struct futex_requeue *rq1, struct futex_requeue *rq2,
	      unsigned int nr_wake, unsigned int nr_requeue,
	      u64 cmpval, unsigned int flags)

As requeue already has 6 arguments, we can't add an argument for the
node ID; we need to pack it in a struct. So then we have:

struct futexX_numa {
        __uX value;
        __sX hint;
};

Where X can be 8, 16, 32 or 64 (futex2 supports variable-sized futexes).
`value` is the futex value and `hint` can be -1 for the current node, or
[0, MAX_NUMA_NODES) to specify a node. Example:

struct futex32_numa f = {.value = 0, .hint = -1};


futex_wait(&f, 0, FUTEX_NUMA | FUTEX_32, NULL);

Then &f would be used as the futex address, as expected, and the
operation would use the current node. If an app expects calls on the
same futex from different nodes, it should name an explicit node so that
all callers use the same table, for instance:

struct futex32_numa f = {.value = 0, .hint = 2};
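
To make the layout concrete, here is a sketch of the X = 32 case using
<stdint.h> types in place of __u32/__s32. The compile-time checks and the
futex32_word() helper are assumptions added for illustration; they only
spell out that `value` sits at offset 0, so the address of the struct is
also the address of the futex word, with `hint` right after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative instance of futexX_numa for X = 32. */
struct futex32_numa {
	uint32_t value; /* the futex word */
	int32_t  hint;  /* -1 for the current node, or a node ID */
};

/* value must come first so that &f is also the futex word address. */
static_assert(offsetof(struct futex32_numa, value) == 0,
	      "value must be the first member");
/* Assumption for this sketch: hint follows value with no padding. */
static_assert(offsetof(struct futex32_numa, hint) == sizeof(uint32_t),
	      "hint must directly follow value");

/* Hypothetical helper: the futex word that a wait would compare against. */
static uint32_t *futex32_word(struct futex32_numa *f)
{
	return &f->value;
}
```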

For non-NUMA apps, a call without the FUTEX_NUMA flag would just use the
first node by default.
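
Putting the three cases together, node selection could look roughly like
the sketch below. This is only my reading of the rules above, not code
from a patch: the FUTEX_NUMA and MAX_NUMA_NODES values are placeholders,
futex_resolve_node() is a hypothetical name, and returning -EINVAL for an
out-of-range hint is an assumption the proposal does not state.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder values for illustration only. */
#define FUTEX_NUMA     0x4u
#define MAX_NUMA_NODES 8

/*
 * Decide which node's table an operation should use.
 * current_node is the node the caller is running on.
 */
static int futex_resolve_node(unsigned int flags, int32_t hint,
			      int current_node)
{
	assert(current_node >= 0 && current_node < MAX_NUMA_NODES);

	if (!(flags & FUTEX_NUMA))
		return 0;            /* non-NUMA call: first node */
	if (hint == -1)
		return current_node; /* -1 means "the current node" */
	if (hint >= 0 && hint < MAX_NUMA_NODES)
		return hint;         /* explicit node ID */
	return -EINVAL;              /* assumed error for a bad hint */
}
```

For example, futex_resolve_node(FUTEX_NUMA, 2, 5) picks node 2 no matter
where the caller runs, while passing -1 as the hint picks node 5.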

Feedback? Who else should I CC?

