linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/4] cpumask: improve on cpumask_local_spread() locality
@ 2022-11-12 19:09 Yury Norov
From: Yury Norov @ 2022-11-12 19:09 UTC
  To: linux-kernel, David S. Miller, Andy Shevchenko, Barry Song,
	Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann,
	Gal Pressman, Greg Kroah-Hartman, Heiko Carstens, Ingo Molnar,
	Jakub Kicinski, Jason Gunthorpe, Jesse Brandeburg,
	Jonathan Cameron, Juri Lelli, Leon Romanovsky, Mel Gorman,
	Peter Zijlstra, Rasmus Villemoes, Saeed Mahameed, Steven Rostedt,
	Tariq Toukan, Tariq Toukan, Tony Luck, Valentin Schneider,
	Vincent Guittot
  Cc: Yury Norov, linux-crypto, netdev, linux-rdma

cpumask_local_spread() currently checks the local node for the i'th CPU
and, if it finds nothing, falls back to a flat search among all non-local
CPUs. We can do better by searching CPUs hop by hop, in order of
increasing NUMA distance.
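
To make the current behavior concrete, here is a minimal userspace model
in Python. It is a sketch, not the kernel code: local_spread_flat() and
NODE_CPUS are hypothetical names, and the node layout is borrowed from
the test VM described later in this letter.

```python
# Userspace model of the current (pre-series) behavior: local CPUs first,
# then every remote CPU in plain CPU-number order, ignoring NUMA distance.
# NODE_CPUS and local_spread_flat() are illustrative names, not kernel API.
NODE_CPUS = {
    0: [0, 1, 2, 3],
    1: [4, 5],
    2: [6, 7],
    3: [8, 9, 10, 11, 12, 13, 14, 15],
}


def local_spread_flat(i, node, node_cpus):
    """Return the i'th CPU for `node` using the flat fallback search."""
    all_cpus = sorted(c for cpus in node_cpus.values() for c in cpus)
    local = sorted(node_cpus[node])
    i %= len(all_cpus)          # wrap: the caller always wants a CPU
    if i < len(local):
        return local[i]         # i'th CPU on the local node
    remote = [c for c in all_cpus if c not in local]
    return remote[i - len(local)]   # flat search among non-local CPUs


if __name__ == "__main__":
    print([local_spread_flat(i, 1, NODE_CPUS) for i in range(16)])
```

For node 1 this model walks CPUs 4 and 5 and then jumps to CPUs 0-3 on
node 0, although node 3 is the closer neighbor according to the distance
table of that VM.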

This series is inspired by Tariq Toukan and Valentin Schneider's "net/mlx5e:
Improve remote NUMA preferences used for the IRQ affinity hints":

https://patchwork.kernel.org/project/netdevbpf/patch/20220728191203.4055-3-tariqt@nvidia.com/

According to their measurements, for mlx5e:

        Bottleneck in RX side is released, reached linerate (~1.8x speedup).
        ~30% less cpu util on TX.

This series makes cpumask_local_spread() traverse CPUs based on NUMA
distance as well, and I expect a comparable improvement for its users,
as in the case of mlx5e.

I tested the new behavior on my VM with the following NUMA configuration:

root@debian:~# numactl -H
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3
node 0 size: 3869 MB
node 0 free: 3740 MB
node 1 cpus: 4 5
node 1 size: 1969 MB
node 1 free: 1937 MB
node 2 cpus: 6 7
node 2 size: 1967 MB
node 2 free: 1873 MB
node 3 cpus: 8 9 10 11 12 13 14 15
node 3 size: 7842 MB
node 3 free: 7723 MB
node distances:
node   0   1   2   3
  0:  10  50  30  70
  1:  50  10  70  30
  2:  30  70  10  50
  3:  70  30  50  10

With the series applied, cpumask_local_spread() returns CPUs in the
following order for each node (one row per node, offsets 0 to 15 from
left to right):

node 0:   0   1   2   3   6   7   4   5   8   9  10  11  12  13  14  15
node 1:   4   5   8   9  10  11  12  13  14  15   0   1   2   3   6   7
node 2:   6   7   0   1   2   3   8   9  10  11  12  13  14  15   4   5
node 3:   8   9  10  11  12  13  14  15   4   5   6   7   0   1   2   3
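
The ordering above can be reproduced from the numactl output with a small
userspace model. This is a sketch under assumptions, not the kernel
implementation: hop_order() and local_spread_by_hops() are hypothetical
names, and the model simply groups CPUs by increasing node distance and
sorts them by CPU number within each hop.

```python
# Userspace model of the hop-based traversal. The tables are copied from
# the numactl -H output in this letter; the function names are illustrative.
NODE_CPUS = {
    0: [0, 1, 2, 3],
    1: [4, 5],
    2: [6, 7],
    3: [8, 9, 10, 11, 12, 13, 14, 15],
}

DISTANCE = [
    [10, 50, 30, 70],
    [50, 10, 70, 30],
    [30, 70, 10, 50],
    [70, 30, 50, 10],
]


def hop_order(node):
    """List all CPUs, nearest hop first, CPU-number order within a hop."""
    order = []
    seen = set()
    for level in sorted(set(DISTANCE[node])):
        # Cumulative set: every CPU on a node within `level` of `node`.
        within = {c for n, cpus in NODE_CPUS.items()
                  if DISTANCE[node][n] <= level for c in cpus}
        # Only the CPUs that this hop adds on top of the previous one.
        order.extend(sorted(within - seen))
        seen = within
    return order


def local_spread_by_hops(i, node):
    """Return the i'th CPU for `node`, wrapping around like the original."""
    order = hop_order(node)
    return order[i % len(order)]


if __name__ == "__main__":
    for node in NODE_CPUS:
        print(f"node {node}:", " ".join(f"{c:3d}" for c in hop_order(node)))
```

The "cumulative set minus previous set" step mirrors the and/andnot
pattern suggested by the new helper names in this series, but the model
makes no claim about how the kernel stores its per-hop masks.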

v1: https://lore.kernel.org/lkml/20221111040027.621646-5-yury.norov@gmail.com/T/
v2: 
 - use bsearch() in sched_numa_find_nth_cpu();
 - fix missing 'static inline' in 3rd patch.
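
As an illustration of why a binary search fits here: the number of CPUs
reachable within each hop grows monotonically, so the hop that holds the
i'th CPU can be found by bisecting the cumulative counts. The Python
sketch below only models that idea; nth_cpu_bsearch() and the list-of-hops
layout are assumptions and do not describe the actual C code in patch 3.

```python
from bisect import bisect_right

# Hops as seen from node 0 of the test VM, nearest first (illustrative data).
NODE0_HOPS = [[0, 1, 2, 3], [6, 7], [4, 5], list(range(8, 16))]


def nth_cpu_bsearch(i, hops):
    """Find the i'th CPU by bisecting cumulative per-hop CPU counts."""
    cumulative = []
    total = 0
    for cpus in hops:
        total += len(cpus)
        cumulative.append(total)    # CPUs reachable within this hop
    i %= total                      # wrap, as cpumask_local_spread() does
    # First hop whose cumulative count is strictly greater than i.
    hop = bisect_right(cumulative, i)
    before = cumulative[hop - 1] if hop else 0
    return hops[hop][i - before]    # offset inside the hop found above


if __name__ == "__main__":
    print([nth_cpu_bsearch(i, NODE0_HOPS) for i in range(16)])
```

Compared with walking the hops one by one, the search takes a logarithmic
number of steps in the number of hop levels.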

Yury Norov (4):
  lib/find: introduce find_nth_and_andnot_bit
  cpumask: introduce cpumask_nth_and_andnot
  sched: add sched_numa_find_nth_cpu()
  cpumask: improve on cpumask_local_spread() locality

 include/linux/cpumask.h  | 20 +++++++++++++++
 include/linux/find.h     | 33 ++++++++++++++++++++++++
 include/linux/topology.h |  8 ++++++
 kernel/sched/topology.c  | 55 ++++++++++++++++++++++++++++++++++++++++
 lib/cpumask.c            | 12 ++-------
 lib/find_bit.c           |  9 +++++++
 6 files changed, 127 insertions(+), 10 deletions(-)

-- 
2.34.1



Thread overview: 16+ messages
2022-11-12 19:09 [PATCH v2 0/4] cpumask: improve on cpumask_local_spread() locality Yury Norov
2022-11-12 19:09 ` [PATCH v2 1/4] lib/find: introduce find_nth_and_andnot_bit Yury Norov
2022-11-12 19:09 ` [PATCH v2 2/4] cpumask: introduce cpumask_nth_and_andnot Yury Norov
2022-11-12 19:09 ` [PATCH v2 3/4] sched: add sched_numa_find_nth_cpu() Yury Norov
2022-11-14 14:32   ` Andy Shevchenko
2022-11-14 15:02     ` Andy Shevchenko
2022-12-08  2:55     ` Yury Norov
2022-11-15 17:25   ` Valentin Schneider
2022-11-12 19:09 ` [PATCH v2 4/4] cpumask: improve on cpumask_local_spread() locality Yury Norov
2022-11-15 17:24 ` [PATCH v2 0/4] " Valentin Schneider
2022-11-15 18:32   ` Yury Norov
2022-11-17 12:23     ` Valentin Schneider
2022-11-28  6:39       ` Tariq Toukan
2022-11-30  1:47         ` Yury Norov
2022-12-07 12:53           ` Tariq Toukan
2022-12-07 20:45             ` Yury Norov
