From: Hans-Peter Lehmann <hans-peter.lehmann@kit.edu>
To: Erwan Velu <e.velu@criteo.com>, fio@vger.kernel.org
Subject: Re: Question: t/io_uring performance
Date: Thu, 26 Aug 2021 17:57:16 +0200 [thread overview]
Message-ID: <5b58a227-c376-1f3e-7a10-1aa5483bdc0d@kit.edu> (raw)
In-Reply-To: <867506cc-642e-1047-08c6-aae60e7294c5@criteo.com>
Thank you very much for your reply.
> You didn't mention the size of your P4510
Sorry, the P4510 SSDs each have 2 TB.
> Did you checked how your NVMEs are connected via their PCI lanes? It's obvious here that you need multiple PCI-GEN3 lanes to reach 1.6M IOPS (I'd say two).
If I understand the lspci output (listed below) correctly, the SSDs are connected directly to the same PCIe root complex, each getting its full x4 link. Given that I can saturate the SSDs when using two t/io_uring instances, I don't think the hardware-side connection is the limitation - or am I missing something?
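For reference, here is the back-of-the-envelope bandwidth check I did (a sketch: it assumes 4 KiB reads and only accounts for the 128b/130b line encoding, not PCIe protocol overhead):

```shell
# PCIe Gen3 is 8 GT/s per lane; with 128b/130b encoding that is
# roughly 985 MB/s of raw payload bandwidth per lane.
lane_MBps=985
link_MBps=$((4 * lane_MBps))          # each P4510 has an x4 link

bs=4096                               # 4 KiB block size assumed
need_measured=$((580000 * bs / 1000000))   # MB/s at my 580k IOPS
need_target=$((1600000 * bs / 1000000))    # MB/s at Jens' 1.6M IOPS

echo "x4 link:          ${link_MBps} MB/s"   # ~3940 MB/s
echo "580k IOPS needs:  ${need_measured} MB/s"
echo "1.6M IOPS needs:  ${need_target} MB/s"
```

So 580k IOPS is well below what a single x4 link carries, and even 1.6M IOPS fits comfortably across the two drives' links.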
> Then considering the EPYC processor, what's your current Numa configuration?
The processor was configured to use a single NUMA node (NPS=1). I just tried switching to NPS=4 and ran the benchmark on a core belonging to the SSDs' NUMA node (using numactl). That brought the IOPS from 580k to 590k, which is still nowhere near the values that Jens got.
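For the record, the pinning was roughly along these lines (a sketch: the device name and node number are illustrative, not necessarily the exact ones from my run):

```shell
# Find which NUMA node the NVMe controller sits on (the lspci
# output below reports node 0 for both drives).
cat /sys/block/nvme0n1/device/device/numa_node

# Run the benchmark bound to that node's CPUs and local memory.
numactl --cpunodebind=0 --membind=0 ./t/io_uring /dev/nvme0n1
```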
> If you want to run a single core benchmark, you should also ensure how the IRQs are pinned over the Cores and NUMA domains (even if it's a single socket CPU).
Is IRQ pinning the "big thing" that will double the IOPS? To me, it sounds like there must be something else that is wrong. I will definitely try it, though.
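In case it helps others reading along, I would try it roughly like this (a sketch: the IRQ and CPU numbers are just examples, and irqbalance must not be running or it will rewrite the affinity):

```shell
# See where the NVMe interrupts currently land.
grep nvme /proc/interrupts

# Pin one IRQ (e.g. IRQ 65, per the lspci output below) to CPU 2,
# then verify the kernel actually applied it.
echo 2 > /proc/irq/65/smp_affinity_list
cat /proc/irq/65/effective_affinity_list
```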
= Details =
# lspci -tv
-+-[0000:c0]-+-00.0 Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
| +- [...]
+-[0000:80]-+-00.0 Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
| +- [...]
+-[0000:40]-+-00.0 Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
| +- [...]
\-[0000:00]-+-00.0 Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
+- [...]
+-03.1-[01]----00.0 Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller]
+-03.2-[02]----00.0 Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller]
# lspci -vv
01:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller] (prog-if 02 [NVM Express])
Subsystem: Intel Corporation NVMe Datacenter SSD [3DNAND] SE 2.5" U.2 (P4510)
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 65
NUMA node: 0
[...]
Capabilities: [60] Express (v2) Endpoint, MSI 00
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s, Exit Latency L0s <64ns
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (ok), Width x4 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
[...]
02:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller] (prog-if 02 [NVM Express])
Subsystem: Intel Corporation NVMe Datacenter SSD [3DNAND] SE 2.5" U.2 (P4510)
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 67
NUMA node: 0
[...]
Capabilities: [60] Express (v2) Endpoint, MSI 00
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s, Exit Latency L0s <64ns
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (ok), Width x4 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
[...]