From: Federico Parola <fede.parola@hotmail.it>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>,
brouer@redhat.com, xdp-newbies@vger.kernel.org
Subject: Re: Multi-core scalability problems
Date: Wed, 14 Oct 2020 08:56:43 +0200 [thread overview]
Message-ID: <VI1PR04MB3104B4EA129004982325E2389E050@VI1PR04MB3104.eurprd04.prod.outlook.com> (raw)
In-Reply-To: <87r1q29ita.fsf@toke.dk>
Thanks for your help!
On 13/10/20 18:44, Toke Høiland-Jørgensen wrote:
> Federico Parola <fede.parola@hotmail.it> writes:
>
>> Hello,
>> I'm testing the performance of XDP when dropping packets using multiple
>> cores and I'm getting unexpected results.
>> My machine is equipped with a dual port Intel XL710 40 GbE and an Intel
>> Xeon Gold 5120 CPU @ 2.20GHz with 14 cores (HyperThreading disabled),
>> running Ubuntu server 18.04 with kernel 5.8.12.
>> I'm using the xdp_rxq_info program from the kernel tree samples to drop
>> packets.
>> I generate 64-byte UDP packets with MoonGen for a total of 42 Mpps.
>> Packets are uniformly distributed across different flows (different
>> src ports), and I use flow director rules on the rx NIC to steer
>> these flows to different queues/cores.
>> Here are my results:
>>
>> 1 FLOW:
>> Running XDP on dev:enp101s0f0 (ifindex:3) action:XDP_DROP options:no_touch
>> XDP stats CPU pps issue-pps
>> XDP-RX CPU 0 17784270 0
>> XDP-RX CPU total 17784270
>>
>> RXQ stats RXQ:CPU pps issue-pps
>> rx_queue_index 0:0 17784270 0
>> rx_queue_index 0:sum 17784270
>> ---
>>
>> 2 FLOWS:
>> Running XDP on dev:enp101s0f0 (ifindex:3) action:XDP_DROP options:no_touch
>> XDP stats CPU pps issue-pps
>> XDP-RX CPU 0 7016363 0
>> XDP-RX CPU 1 7017291 0
>> XDP-RX CPU total 14033655
>>
>> RXQ stats RXQ:CPU pps issue-pps
>> rx_queue_index 0:0 7016366 0
>> rx_queue_index 0:sum 7016366
>> rx_queue_index 1:1 7017294 0
>> rx_queue_index 1:sum 7017294
>> ---
>>
>> 4 FLOWS:
>> Running XDP on dev:enp101s0f0 (ifindex:3) action:XDP_DROP options:no_touch
>> XDP stats CPU pps issue-pps
>> XDP-RX CPU 0 2359478 0
>> XDP-RX CPU 1 2358508 0
>> XDP-RX CPU 2 2357042 0
>> XDP-RX CPU 3 2355396 0
>> XDP-RX CPU total 9430425
>>
>> RXQ stats RXQ:CPU pps issue-pps
>> rx_queue_index 0:0 2359474 0
>> rx_queue_index 0:sum 2359474
>> rx_queue_index 1:1 2358504 0
>> rx_queue_index 1:sum 2358504
>> rx_queue_index 2:2 2357040 0
>> rx_queue_index 2:sum 2357040
>> rx_queue_index 3:3 2355392 0
>> rx_queue_index 3:sum 2355392
>>
>> I don't understand why the overall performance decreases as the
>> number of cores grows; according to [1] I would expect it to increase
>> until reaching a maximum value. Is there any parameter I should tune
>> to overcome the problem?
> Yeah, this does look a bit odd. My immediate thought is that maybe your
> RXQs are not pinned to the cores correctly? There is nothing in
> xdp_rxq_info that ensures this, you have to configure the IRQ affinity
> manually. If you don't do this, I suppose the processing could be
> bouncing around on different CPUs leading to cache line contention when
> updating the stats map.
>
> You can try to look at what the actual CPU load is on each core -
> 'mpstat -P ALL -n 1' is my go-to for this.
>
> -Toke
>
I forgot to mention that I have manually configured the IRQ affinity to
map every queue to a different core, and running your command confirms
that one core per queue/flow is used.
On 13/10/20 18:41, Jesper Dangaard Brouer wrote:
> This is what I see with i40e:
>
> Running XDP on dev:i40e2 (ifindex:6) action:XDP_DROP options:no_touch
> XDP stats CPU pps issue-pps
> XDP-RX CPU 1 8,411,547 0
> XDP-RX CPU 2 2,804,016 0
> XDP-RX CPU 3 2,803,600 0
> XDP-RX CPU 4 5,608,380 0
> XDP-RX CPU 5 13,999,125 0
> XDP-RX CPU total 33,626,671
>
> RXQ stats RXQ:CPU pps issue-pps
> rx_queue_index 0:3 2,803,600 0
> rx_queue_index 0:sum 2,803,600
> rx_queue_index 1:1 8,411,540 0
> rx_queue_index 1:sum 8,411,540
> rx_queue_index 2:2 2,804,015 0
> rx_queue_index 2:sum 2,804,015
> rx_queue_index 3:5 8,399,326 0
> rx_queue_index 3:sum 8,399,326
> rx_queue_index 4:4 5,608,372 0
> rx_queue_index 4:sum 5,608,372
> rx_queue_index 5:5 5,599,809 0
> rx_queue_index 5:sum 5,599,809
>
> That is strange, as my results above show that it does scale in my
> testlab on the same NIC, i40e (Intel Corporation Ethernet Controller
> XL710 for 40GbE QSFP+ (rev 02)).
>
> Can you try to use this[2] tool:
> ethtool_stats.pl --dev enp101s0f0
>
> And notice if there are any strange counters.
>
>
> [2] https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
>
> My best guess is that you have Ethernet flow-control enabled.
> Some ethtool counter might show if that is the case.
>
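Flow control can be checked (and disabled) with the standard ethtool
pause options; a sketch, since the output and the defaults depend on
the driver:

```shell
# Query the current flow-control (pause frame) settings
ethtool -a enp101s0f0

# If rx/tx pause is reported "on", it can be disabled with:
ethtool -A enp101s0f0 autoneg off rx off tx off
```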
Here are the results of the tool:
1 FLOW:
Show adapter(s) (enp101s0f0) statistics (ONLY that changed!)
Ethtool(enp101s0f0) stat: 35458700 ( 35,458,700) <= port.fdir_sb_match /sec
Ethtool(enp101s0f0) stat: 2729223958 ( 2,729,223,958) <= port.rx_bytes /sec
Ethtool(enp101s0f0) stat: 7185397 ( 7,185,397) <= port.rx_dropped /sec
Ethtool(enp101s0f0) stat: 42644155 ( 42,644,155) <= port.rx_size_64 /sec
Ethtool(enp101s0f0) stat: 42644140 ( 42,644,140) <= port.rx_unicast /sec
Ethtool(enp101s0f0) stat: 1062159456 ( 1,062,159,456) <= rx-0.bytes /sec
Ethtool(enp101s0f0) stat: 17702658 ( 17,702,658) <= rx-0.packets /sec
Ethtool(enp101s0f0) stat: 1062155639 ( 1,062,155,639) <= rx_bytes /sec
Ethtool(enp101s0f0) stat: 17756128 ( 17,756,128) <= rx_dropped /sec
Ethtool(enp101s0f0) stat: 17702594 ( 17,702,594) <= rx_packets /sec
Ethtool(enp101s0f0) stat: 35458743 ( 35,458,743) <= rx_unicast /sec
---
4 FLOWS:
Show adapter(s) (enp101s0f0) statistics (ONLY that changed!)
Ethtool(enp101s0f0) stat: 9351001 ( 9,351,001) <= port.fdir_sb_match /sec
Ethtool(enp101s0f0) stat: 2559136358 ( 2,559,136,358) <= port.rx_bytes /sec
Ethtool(enp101s0f0) stat: 30635346 ( 30,635,346) <= port.rx_dropped /sec
Ethtool(enp101s0f0) stat: 39986386 ( 39,986,386) <= port.rx_size_64 /sec
Ethtool(enp101s0f0) stat: 39986799 ( 39,986,799) <= port.rx_unicast /sec
Ethtool(enp101s0f0) stat: 140177834 ( 140,177,834) <= rx-0.bytes /sec
Ethtool(enp101s0f0) stat: 2336297 ( 2,336,297) <= rx-0.packets /sec
Ethtool(enp101s0f0) stat: 140260002 ( 140,260,002) <= rx-1.bytes /sec
Ethtool(enp101s0f0) stat: 2337667 ( 2,337,667) <= rx-1.packets /sec
Ethtool(enp101s0f0) stat: 140261431 ( 140,261,431) <= rx-2.bytes /sec
Ethtool(enp101s0f0) stat: 2337691 ( 2,337,691) <= rx-2.packets /sec
Ethtool(enp101s0f0) stat: 140175690 ( 140,175,690) <= rx-3.bytes /sec
Ethtool(enp101s0f0) stat: 2336262 ( 2,336,262) <= rx-3.packets /sec
Ethtool(enp101s0f0) stat: 560877338 ( 560,877,338) <= rx_bytes /sec
Ethtool(enp101s0f0) stat: 3354 ( 3,354) <= rx_dropped /sec
Ethtool(enp101s0f0) stat: 9347956 ( 9,347,956) <= rx_packets /sec
Ethtool(enp101s0f0) stat: 9351183 ( 9,351,183) <= rx_unicast /sec
So if I understand correctly, port.rx_dropped counts packets dropped by
the NIC itself for lack of buffers, while rx_dropped counts packets
dropped because the upper layers couldn't process them. Am I right?
It seems that the problem is in the NIC.
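As a quick sanity check on the 4-flow numbers above (integer shell
arithmetic, values copied from the ethtool_stats.pl output):

```shell
# 4-flow run: where do the ~40 Mpps go?
rx_size_64=39986386      # 64-byte packets received on the port
port_dropped=30635346    # dropped by the NIC itself (port.rx_dropped)
host_packets=9347956     # packets that reached the host (rx_packets)

# NIC drops as a share of packets received on the port (~76%)
echo "NIC drop share: $((100 * port_dropped / rx_size_64))%"
# delivered + NIC-dropped accounts for almost all received packets
echo "accounted: $((host_packets + port_dropped)) of $rx_size_64"
```

So roughly three quarters of the offered load never makes it past the
NIC, which matches the per-core rates the XDP counters report.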
Federico
Thread overview: 14+ messages
2020-10-13 13:49 Multi-core scalability problems Federico Parola
2020-10-13 16:41 ` Jesper Dangaard Brouer
2020-10-13 16:44 ` Toke Høiland-Jørgensen
2020-10-14 6:56 ` Federico Parola [this message]
2020-10-14 9:15 ` Jesper Dangaard Brouer
2020-10-14 12:17 ` Federico Parola
2020-10-14 14:26 ` Jesper Dangaard Brouer
2020-10-15 12:04 ` Federico Parola
2020-10-15 13:22 ` Jesper Dangaard Brouer
2020-10-19 15:23 ` Federico Parola
2020-10-19 18:26 ` Jesper Dangaard Brouer
2020-10-24 13:57 ` Federico Parola
2020-10-26 8:14 ` Jesper Dangaard Brouer
[not found] <VI1PR04MB3104C1D86BDC113F4AC0CF4A9E050@VI1PR04MB3104.eurprd04.prod.outlook.com>
2020-10-14 8:35 ` Federico Parola