* [Bug 1086] Significant TX packet drops with Mellanox NIC (mlx5 PMD)
@ 2022-09-28 13:41 bugzilla
From: bugzilla @ 2022-09-28 13:41 UTC
To: dev
https://bugs.dpdk.org/show_bug.cgi?id=1086
Bug ID: 1086
Summary: Significant TX packet drops with Mellanox NIC (mlx5 PMD)
Product: DPDK
Version: 21.11
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: critical
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: anton@vaa.su
Target Milestone: ---
Created attachment 222
--> https://bugs.dpdk.org/attachment.cgi?id=222&action=edit
testpmd-fec28ca0e3.log.txt
Given two servers, each with a 2-port 25G Mellanox NIC:
# dpdk-devbind.py -s
Network devices using kernel driver
===================================
0000:3b:00.0 'MT27710 Family [ConnectX-4 Lx] 1015' if=ens1f0np0 drv=mlx5_core unused=vfio-pci
0000:3b:00.1 'MT27710 Family [ConnectX-4 Lx] 1015' if=ens1f1np1 drv=mlx5_core unused=vfio-pci
Servers are connected directly.
The first server is used as a packet generator, running TRex v2.99 in stateless
mode:
./t-rex-64 -c 16 -i
./trex-console
trex>start -f stl/udp_1pkt_range_clients.py -m 17mpps
The second one runs dpdk-testpmd:
OS: Debian GNU/Linux 10 (buster)
uname -r: 4.19.0-21-amd64
ofed_info: MLNX_OFED_LINUX-5.7-1.0.2.0
gcc version 8.3.0 (Debian 8.3.0-6)
When DPDK v21.08 is compiled and testpmd is run this way:
dpdk-testpmd -l 1-17 -n 4 --log-level=debug -- --nb-ports=2 --nb-cores=16 --portmask=0x3 --rxq=8 --txq=8
it handles roughly 17 Mpps per port:
trex>start -f stl/udp_1pkt_range_clients.py -m 17mpps
TRex Port Statistics
port | 0 | 1 | total
-----------+-------------------+-------------------+------------------
owner | root | root |
link | UP | UP |
state | TRANSMITTING | TRANSMITTING |
speed | 25 Gb/s | 25 Gb/s |
CPU util. | 27.76% | 27.76% |
-- | | |
Tx bps L2 | 8.7 Gbps | 8.73 Gbps | 17.43 Gbps
Tx bps L1 | 11.42 Gbps | 11.46 Gbps | 22.88 Gbps
Tx pps | 17 Mpps | 17.05 Mpps | 34.05 Mpps
Line Util. | 45.7 % | 45.83 % |
--- | | |
Rx bps | 8.7 Gbps | 8.73 Gbps | 17.43 Gbps
Rx pps | 17 Mpps | 17.05 Mpps | 34.05 Mpps
---- | | |
opackets | 290928398 | 291050836 | 581979234
ipackets | 290885740 | 291093159 | 581978899
obytes | 18619417472 | 18627254464 | 37246671936
ibytes | 18616688080 | 18629962836 | 37246650916
tx-pkts | 290.93 Mpkts | 291.05 Mpkts | 581.98 Mpkts
rx-pkts | 290.89 Mpkts | 291.09 Mpkts | 581.98 Mpkts
tx-bytes | 18.62 GB | 18.63 GB | 37.25 GB
rx-bytes | 18.62 GB | 18.63 GB | 37.25 GB
----- | | |
oerrors | 0 | 0 | 0
ierrors | 0 | 0 | 0
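As a sanity check on the table above, the L2 and L1 rates and the line utilization follow from the packet rate and the frame size. The sketch below is only illustrative: the function names are hypothetical, and the 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter, inter-frame gap) is the standard Ethernet figure, assumed here rather than stated in the report. The frame size is derived from the port 0 obytes/opackets counters.

```python
# Assumed standard Ethernet per-frame overhead on the wire:
# preamble (7) + start-of-frame delimiter (1) + inter-frame gap (12).
ETH_OVERHEAD_BYTES = 20


def avg_frame_bytes(obytes, opackets):
    """Average frame size as counted by the generator."""
    return obytes / opackets


def l2_bps(pps, frame_bytes):
    """Layer 2 bit rate: frame bytes only."""
    return pps * frame_bytes * 8


def l1_bps(pps, frame_bytes):
    """Layer 1 bit rate: frame bytes plus per-frame wire overhead."""
    return pps * (frame_bytes + ETH_OVERHEAD_BYTES) * 8


def line_util_percent(l1_rate_bps, link_bps):
    """Share of the link capacity consumed, in percent."""
    return 100 * l1_rate_bps / link_bps


# Port 0 values copied from the v21.08 table.
frame = avg_frame_bytes(18619417472, 290928398)
l2 = l2_bps(17e6, frame)
l1 = l1_bps(17e6, frame)
util = line_util_percent(l1, 25e9)

print(f"frame={frame:.0f} B  L2={l2 / 1e9:.2f} Gbps  "
      f"L1={l1 / 1e9:.2f} Gbps  util={util:.1f} %")
```

These values line up with the "Tx bps L2", "Tx bps L1" and "Line Util." rows for port 0, so the generator is sending minimum-size frames at well under line rate.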
But after switching to DPDK v21.11, throughput becomes much worse:
TRex Port Statistics
port | 0 | 1 | total
-----------+-------------------+-------------------+------------------
owner | root | root |
link | UP | UP |
state | TRANSMITTING | TRANSMITTING |
speed | 25 Gb/s | 25 Gb/s |
CPU util. | 26.06% | 26.06% |
-- | | |
Tx bps L2 | 8.7 Gbps | 8.72 Gbps | 17.42 Gbps
Tx bps L1 | 11.42 Gbps | 11.45 Gbps | 22.86 Gbps
Tx pps | 16.99 Mpps | 17.04 Mpps | 34.02 Mpps
Line Util. | 45.66 % | 45.79 % |
--- | | |
Rx bps | 3.75 Gbps | 3.76 Gbps | 7.5 Gbps
Rx pps | 7.32 Mpps | 7.34 Mpps | 14.66 Mpps
---- | | |
opackets | 190538147 | 190707494 | 381245641
ipackets | 82174700 | 82260152 | 164434852
obytes | 12194441408 | 12205280936 | 24399722344
ibytes | 5259181520 | 5264649728 | 10523831248
tx-pkts | 190.54 Mpkts | 190.71 Mpkts | 381.25 Mpkts
rx-pkts | 82.17 Mpkts | 82.26 Mpkts | 164.43 Mpkts
tx-bytes | 12.19 GB | 12.21 GB | 24.4 GB
rx-bytes | 5.26 GB | 5.26 GB | 10.52 GB
----- | | |
oerrors | 0 | 0 | 0
ierrors | 0 | 0 | 0
It handles only ~7.3 Mpps per port instead of ~17 Mpps. testpmd reports a large
number of TX drops:
---------------------- Forward statistics for port 0 ----------------------
RX-packets: 1101378001 RX-dropped: 0 RX-total: 1101378001
TX-packets: 1016776861 TX-dropped: 84576754 TX-total: 1101353615
----------------------------------------------------------------------------
---------------------- Forward statistics for port 1 ----------------------
RX-packets: 1101353615 RX-dropped: 0 RX-total: 1101353615
TX-packets: 1016804108 TX-dropped: 84573893 TX-total: 1101378001
----------------------------------------------------------------------------
+++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
RX-packets: 2202731616 RX-dropped: 0 RX-total: 2202731616
TX-packets: 2033580969 TX-dropped: 169150647 TX-total: 2202731616
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
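The drop fractions implied by these counters can be worked out with a few lines of Python. This is a minimal sketch, not part of any tool: `parse_counters` and `drop_fraction` are hypothetical helper names, the testpmd line is the accumulated TX line copied from the statistics above, and the TRex numbers are the "total" opackets and ipackets from the v21.11 table.

```python
import re


def parse_counters(line):
    """Parse 'NAME: value' pairs from one testpmd forward-statistics line."""
    pairs = re.findall(r"([A-Za-z-]+):\s*(\d+)", line)
    return {name: int(value) for name, value in pairs}


def drop_fraction(dropped, total):
    """Fraction of packets dropped; 0.0 when nothing was counted."""
    return dropped / total if total else 0.0


# Accumulated TX line from the testpmd output (v21.11 run).
tx = parse_counters(
    "TX-packets: 2033580969 TX-dropped: 169150647 TX-total: 2202731616"
)
tx_drop = drop_fraction(tx["TX-dropped"], tx["TX-total"])

# TRex totals from the v21.11 table: packets sent vs. packets received back.
trex_sent = 381245641
trex_received = 164434852
trex_loss = 1 - trex_received / trex_sent

print(f"testpmd TX drop fraction: {tx_drop:.2%}")
print(f"TRex end-to-end loss:     {trex_loss:.2%}")
```

The two tools apparently count over different intervals (the testpmd totals are several times larger than the TRex ones), so the two fractions are not expected to match.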
Using git bisect between 21.08 and 21.11, I found the commit that introduced
this regression:
https://github.com/DPDK/dpdk/commit/fec28ca0e3a93143829f3b41a28a8da933f28499
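For readers unfamiliar with the approach, the search that git bisect performs can be sketched as a binary search over the commit history. The sketch below is purely illustrative: `first_bad` is a hypothetical helper, the history list is made up (only the two short hashes and the two release tags come from this report, and their relative positions are assumed), and in a real run the `is_bad` check would mean rebuilding DPDK at that commit and repeating the TRex test.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical, shortened history; "c1" and "c4" are placeholders.
history = ["v21.08", "c1", "690b2a88c2", "fec28ca0e3", "c4", "v21.11"]
bad_from = history.index("fec28ca0e3")

# Stand-in for "build this commit and check whether TX drops appear".
culprit = first_bad(history, lambda c: history.index(c) >= bad_from)
print(culprit)
```

Each step halves the remaining range, so even a few thousand commits between two releases need only about a dozen build-and-test rounds.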
I also profiled it with Intel VTune 2021.3.0 (-collect hotspots and
-collect memory-access), comparing two revisions:
1. 690b2a88c2 (GOOD)
2. fec28ca0e3 (BAD)
I can try to share the corresponding profiling results if that helps.
Unfortunately, I cannot attach them here (the VTune data is too large).