* [RFC] hwloc: Add support for exporting latency, bandwidth topology through calibration
@ 2021-12-01  9:45 Chengchang Tang
From: Chengchang Tang @ 2021-12-01  9:45 UTC
  To: Brice.Goglin
  Cc: hwloc-devel, linux-kernel, song.bao.hua, linuxarm, shenyang (M),
	Jonathan Cameron, yangyicong

Currently, hwloc can export hardware and network locality so that
applications can query and set their affinity. However, in many
scenarios the information provided by the topology is not enough. For
example, it cannot reflect the actual memory latency and bandwidth
between different scheduling domains. We hope to provide more detailed
and precise information about HW capabilities in hwloc by adding several
new calibration tools, so that applications can be designed in a more
refined way, reach higher performance and fully use the capabilities of
the HW.

We mainly focus on exposing memory/bus bandwidth, cache coherence/bus
communication latency, etc. to users. This topology information has
neither a standard ACPI nor a DTS interface to export it, but it can be
beneficial to user applications. Some examples:
1. the memory bandwidth while we spread tasks between multiple clusters 
vs. gather them in one cluster
2. the memory bandwidth while we spread tasks between multiple NUMA
nodes vs. gather them in one NUMA node
3. the cache synchronization latency while we spread tasks between 
multiple clusters vs. gather them in one cluster
4. the cache synchronization latency while we spread tasks between 
multiple NUMA nodes vs. gather them in one NUMA node
5. bus bandwidth and congestion in a complex topology, for example, for
the topology below
node1 - node0 - node2 - node3
the bus between node0 and node2 might become a bottleneck, as the
communication between node1 and node3 also depends on it.
NUMA distance cannot describe this kind of complex bus topology at all.
6. I/O bandwidth and latency while we access I/O devices such as
accelerators, network and storage devices from the NUMA node the devices
belong to vs. from a different NUMA node.
...
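
To make example 5 concrete, the sketch below counts how many node pairs
have to cross each link of the chain above. It assumes traffic simply
follows the chain; the helper name chain_link_load is made up for this
illustration and is not part of hwloc.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of hwloc: for a chain of n nodes
 * (positions 0..n-1 along the chain), count how many unordered node
 * pairs must cross each of the n-1 links. load[i] describes the link
 * between chain positions i and i+1.
 */
static void chain_link_load(size_t n, unsigned load[])
{
    assert(n >= 2);

    for (size_t i = 0; i + 1 < n; i++)
        load[i] = 0;

    for (size_t a = 0; a < n; a++)
        for (size_t b = a + 1; b < n; b++)
            for (size_t link = a; link < b; link++)
                load[link]++;   /* pair (a, b) crosses links a..b-1 */
}
```

For the four-node chain, the middle link (node0 - node2) is crossed by 4
of the 6 node pairs, against 3 for each outer link. A distance matrix
records one value per pair and carries no information about this
sharing.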

If possible, we can also export more, such as IPC bandwidth and latency
(for example, for pipes), spinlock/mutex latency, etc. The calibration
tools will provide this data for different entities at certain topology
levels, so that applications can choose how to spread or gather their
threads according to it.

The design of the calibration tool will be similar to netloc. Three 
steps are required to use the calibration tool.

The first step is to collect data about system bandwidth, latency, etc.
by running benchmark tests, since a standard operating system does not
provide this information. The raw data will be saved in files. This step
may need to be performed by a privileged user.

The second step is to convert the raw files generated in the previous
step into files in a readable format, using the calibration tool.
No privileges are required for this step.
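
The conversion could be as simple as folding repeated raw samples into
one value per pair. The sketch below assumes a raw format of one sample
per line, "<src> <dst> <MB/s>"; that format, the CALIB_MAX_NODES limit
and the function name calib_average_samples are all assumptions made for
illustration, since the actual file formats are not defined here.

```c
#include <stdio.h>
#include <string.h>

#define CALIB_MAX_NODES 8   /* arbitrary limit for this sketch */

/*
 * Hypothetical converter: parse raw text with one sample per line,
 * "<src> <dst> <MB/s>", and average repeated samples per (src, dst)
 * pair. Malformed or out-of-range lines are skipped. Returns the
 * number of samples accepted.
 */
static int calib_average_samples(const char *raw,
                                 double avg[CALIB_MAX_NODES][CALIB_MAX_NODES])
{
    unsigned count[CALIB_MAX_NODES][CALIB_MAX_NODES] = {{0}};
    int accepted = 0;

    for (int i = 0; i < CALIB_MAX_NODES; i++)
        for (int j = 0; j < CALIB_MAX_NODES; j++)
            avg[i][j] = 0.0;

    while (*raw) {
        const char *nl = strchr(raw, '\n');
        size_t len = nl ? (size_t)(nl - raw) : strlen(raw);
        char line[128];

        if (len < sizeof(line)) {
            unsigned src, dst;
            double mbps;

            /* Parse a private copy so sscanf cannot run past the line. */
            memcpy(line, raw, len);
            line[len] = '\0';
            if (sscanf(line, "%u %u %lf", &src, &dst, &mbps) == 3 &&
                src < CALIB_MAX_NODES && dst < CALIB_MAX_NODES) {
                avg[src][dst] += mbps;
                count[src][dst]++;
                accepted++;
            }
        }
        if (!nl)
            break;
        raw = nl + 1;
    }

    for (int i = 0; i < CALIB_MAX_NODES; i++)
        for (int j = 0; j < CALIB_MAX_NODES; j++)
            if (count[i][j])
                avg[i][j] /= count[i][j];

    return accepted;
}
```

A mean is used only to keep the sketch short; a median or a minimum may
be a better summary for noisy latency data.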

In the third step, applications can obtain the calibration information
of the system through a C API exposed by the calibration tool, and
hwloc commands can also be extended to show this new information.
The source of the calibration data is the readable file generated in the
second step. E.g. hwloc_get_mem_bandwidth(hwloc_topology_t topology,
unsigned idx1, unsigned idx2) could be used to get the memory bandwidth
between objects idx1 and idx2 of some topology type.
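
A lookup in the spirit of the proposed hwloc_get_mem_bandwidth() might
look like the sketch below. Since that function does not exist yet, the
sketch uses a stand-in struct calib_table instead of hwloc_topology_t
and a differently named function; the names, the MB/s unit and the
error convention are assumptions, not a proposed final interface.

```c
#include <stddef.h>

/* Stand-in for calibration data loaded from the step-two file. */
struct calib_table {
    unsigned nr;            /* number of objects at the chosen level */
    const double *mbps;     /* row-major nr x nr matrix, in MB/s */
};

/*
 * Hypothetical lookup: store the calibrated memory bandwidth between
 * objects idx1 and idx2 in *out. Returns 0 on success and -1 if an
 * argument is missing or an index is out of range.
 */
static int calib_get_mem_bandwidth(const struct calib_table *t,
                                   unsigned idx1, unsigned idx2,
                                   double *out)
{
    if (!t || !t->mbps || !out || idx1 >= t->nr || idx2 >= t->nr)
        return -1;

    *out = t->mbps[(size_t)idx1 * t->nr + idx2];
    return 0;
}
```

Returning a status code and writing the value through a pointer keeps
"no data for this pair" distinguishable from a legitimately small
bandwidth value.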

