I’m sure someone else can help. I at least tried to repro your results as another data point, but even after following the directions on

https://github.com/spdk/spdk/tree/master/examples/ioat/kperf I get:

 

peluse@pels-64:~/spdk/examples/ioat/kperf$ ./ioat_kperf -n 8

Cannot set dma channels

 

-Paul

 

From: SPDK [mailto:spdk-bounces@lists.01.org] On Behalf Of huangqingxin@ruijie.com.cn
Sent: Monday, December 4, 2017 6:38 AM
To: spdk@lists.01.org
Subject: [SPDK] ioat performance questions

 

Hi,

 

When I run the ioat_kperf tool provided by SPDK, I get this result.

 

[root@localhost kperf]# ./ioat_kperf -n 8
Total 8 Channels, Queue_Depth 256, Transfer Size 4096 Bytes, Total Transfer Size 4 GB
Running I/O . . . . . . . .
Channel 0 Bandwidth 661 MiB/s
Channel 1 Bandwidth 660 MiB/s
Channel 2 Bandwidth 661 MiB/s
Channel 3 Bandwidth 661 MiB/s
Channel 4 Bandwidth 661 MiB/s
Channel 5 Bandwidth 661 MiB/s
Channel 6 Bandwidth 661 MiB/s
Channel 7 Bandwidth 661 MiB/s
Total Channel Bandwidth: 5544 MiB/s
Average Bandwidth Per Channel: 660 MiB/s
[root@localhost kperf]# ./ioat_kperf -n 4
Total 4 Channels, Queue_Depth 256, Transfer Size 4096 Bytes, Total Transfer Size 4 GB
Running I/O . . . . .
Channel 0 Bandwidth 1319 MiB/s
Channel 1 Bandwidth 1322 MiB/s
Channel 2 Bandwidth 1319 MiB/s
Channel 3 Bandwidth 1318 MiB/s
Total Channel Bandwidth: 5530 MiB/s
Average Bandwidth Per Channel: 1318 MiB/s
[root@localhost kperf]#

 

[root@localhost kperf]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Stepping:              2
CPU MHz:               1200.000
CPU max MHz:           2400.0000
CPU min MHz:           1200.0000
BogoMIPS:              4799.90
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              15360K
NUMA node0 CPU(s):     0-5,12-17
NUMA node1 CPU(s):     6-11,18-23

 

I found that the `Total Channel Bandwidth` does not increase with more channels: it stays at roughly 5500 MiB/s with either 4 or 8 channels. What is the limiting factor? Is the IOAT DMA engine on the E5 v3 limited to around 5 GB/s?
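
To make the scaling question concrete, here is a minimal Python sketch that only redoes the arithmetic on the per-channel numbers printed above. It is not part of ioat_kperf; the variable and function names are made up for illustration. The plain sums it computes differ slightly from the "Total Channel Bandwidth" lines the tool prints (5544 and 5530 MiB/s), and how the tool derives its own total is not shown in this output.

```python
# Per-channel bandwidths in MiB/s, copied from the two ioat_kperf runs above.
# The names below are illustrative only; nothing here comes from SPDK itself.
run_8ch = [661, 660, 661, 661, 661, 661, 661, 661]
run_4ch = [1319, 1322, 1319, 1318]


def summarize(per_channel):
    """Return (sum of per-channel bandwidth, mean per-channel bandwidth)."""
    total = sum(per_channel)
    return total, total / len(per_channel)


total_8, avg_8 = summarize(run_8ch)
total_4, avg_4 = summarize(run_4ch)

print(f"8 channels: sum {total_8} MiB/s, mean {avg_8:.1f} MiB/s per channel")
print(f"4 channels: sum {total_4} MiB/s, mean {avg_4:.1f} MiB/s per channel")

# If channels scaled independently, doubling the channel count would roughly
# double the sum. A ratio near 1.0 means the aggregate is flat instead.
print(f"aggregate ratio (8ch / 4ch): {total_8 / total_4:.3f}")
print(f"per-channel ratio (4ch / 8ch): {avg_4 / avg_8:.3f}")
```

Doubling the channel count leaves the aggregate essentially unchanged while the per-channel figure halves, which is what I would expect if some shared resource is the bottleneck rather than the individual channels.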

 

Any help will be appreciated!