From: Hans-Peter Lehmann <hans-peter.lehmann@kit.edu>
To: Erwan Velu <e.velu@criteo.com>, fio@vger.kernel.org
Subject: Re: Question: t/io_uring performance
Date: Thu, 26 Aug 2021 17:57:16 +0200
Message-ID: <5b58a227-c376-1f3e-7a10-1aa5483bdc0d@kit.edu>
In-Reply-To: <867506cc-642e-1047-08c6-aae60e7294c5@criteo.com>


Thank you very much for your reply.

> You didn't mention the size of your P4510

Sorry, the P4510 SSDs each have 2 TB.

> Did you check how your NVMes are connected via their PCIe lanes? It's obvious here that you need multiple PCIe Gen3 lanes to reach 1.6M IOPS (I'd say two).

If I understand the lspci output (listed below) correctly, the SSDs are connected directly to the same PCIe root complex, and each gets its full x4 link. Given that I can saturate the SSDs when using 2 t/io_uring instances, I don't think the hardware-side connection is the limitation - or am I missing something?
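
For completeness, the two-instance run that saturates the drives looks roughly like this - one t/io_uring instance per SSD, each pinned to its own core. The flags shown are illustrative, not the exact ones from my runs: -b is the block size, -d the queue depth, -s/-c the submit/complete batch sizes, and -p0 selects interrupt-driven instead of polled I/O.

# taskset -c 0 ./t/io_uring -b512 -d128 -s32 -c32 -p0 /dev/nvme0n1 &
# taskset -c 1 ./t/io_uring -b512 -d128 -s32 -c32 -p0 /dev/nvme1n1 &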

> Then considering the EPYC processor, what's your current NUMA configuration?

The processor was configured to use a single NUMA node (NPS=1). I just tried switching to NPS=4 and ran the benchmark on a core belonging to the SSDs' NUMA node (using numactl). It brought the IOPS from 580k to 590k. That's still nowhere near the values that Jens got.
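
In case it helps with reproducing: I looked up the node via sysfs and then bound both the CPU and the memory allocation to it, roughly as follows (node 0 and the device names are just examples):

# cat /sys/class/nvme/nvme0/device/numa_node
# numactl --cpunodebind=0 --membind=0 ./t/io_uring /dev/nvme0n1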

> If you want to run a single-core benchmark, you should also check how the IRQs are pinned across the cores and NUMA domains (even if it's a single-socket CPU).

Is IRQ pinning the "big thing" that will double the IOPS? To me, it sounds like there must be something else that is wrong. I will definitely try it, though.
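
As a first step I would only inspect the current IRQ placement, roughly like this (the IRQ number is an example, taken from the lspci output below):

# grep -i nvme /proc/interrupts
# cat /proc/irq/65/effective_affinity_list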


= Details =

# lspci -tv
-+-[0000:c0]-+-00.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
  |           +- [...]
  +-[0000:80]-+-00.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
  |           +- [...]
  +-[0000:40]-+-00.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
  |           +- [...]
  \-[0000:00]-+-00.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
              +- [...]
              +-03.1-[01]----00.0  Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller]
              +-03.2-[02]----00.0  Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller]

# lspci -vv
01:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller] (prog-if 02 [NVM Express])
         Subsystem: Intel Corporation NVMe Datacenter SSD [3DNAND] SE 2.5" U.2 (P4510)
         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 64 bytes
         Interrupt: pin A routed to IRQ 65
         NUMA node: 0
         [...]
         Capabilities: [60] Express (v2) Endpoint, MSI 00
                 LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s, Exit Latency L0s <64ns
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 8GT/s (ok), Width x4 (ok)
                         TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                 [...]
02:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller] (prog-if 02 [NVM Express])
         Subsystem: Intel Corporation NVMe Datacenter SSD [3DNAND] SE 2.5" U.2 (P4510)
         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 64 bytes
         Interrupt: pin A routed to IRQ 67
         NUMA node: 0
         [...]
         Capabilities: [60] Express (v2) Endpoint, MSI 00
                 LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s, Exit Latency L0s <64ns
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 8GT/s (ok), Width x4 (ok)
                         TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                 [...]

