xdp-newbies.vger.kernel.org archive mirror
* How to benchmark packet throughput for submitting patches?
@ 2022-08-01 21:11 Zvi Effron
  2022-08-08  8:04 ` Jesper Dangaard Brouer
From: Zvi Effron @ 2022-08-01 21:11 UTC
  To: Xdp

I see that many XDP patchset submissions to the bpf mailing list
include benchmark numbers for packet throughput to show how much the
change improves (or worsens) performance. They frequently show numbers
for a single core test.

I was wondering what methodology people are using to generate these
benchmark results?

Thanks!
--Zvi

* Re: How to benchmark packet throughput for submitting patches?
  2022-08-01 21:11 How to benchmark packet throughput for submitting patches? Zvi Effron
@ 2022-08-08  8:04 ` Jesper Dangaard Brouer
From: Jesper Dangaard Brouer @ 2022-08-08  8:04 UTC
  To: Zvi Effron, Xdp; +Cc: brouer


On 01/08/2022 23.11, Zvi Effron wrote:
> I see that many XDP patchset submissions to the bpf mailing list
> include benchmark numbers for packet throughput to show how much the
> change improves (or worsens) performance. 

It is very important to show the *change* in performance.
That means baseline numbers for comparison are more important than the
absolute performance numbers.
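
As a rough sketch of how such a change could be reported (the numbers
below are made up for illustration, not measurements, and converting
packets-per-second to nanoseconds per packet is an assumption about
what is convenient, not a stated requirement):

```python
def ns_per_packet(pps: float) -> float:
    """Average time spent per packet, in nanoseconds, at a given packet rate."""
    return 1e9 / pps


def compare(baseline_pps: float, patched_pps: float) -> dict:
    """Summarise the change between a baseline run and a patched run."""
    return {
        "pps_change_pct": (patched_pps - baseline_pps) / baseline_pps * 100.0,
        "ns_saved_per_packet": ns_per_packet(baseline_pps) - ns_per_packet(patched_pps),
    }


# Hypothetical numbers, for illustration only (not measured results).
result = compare(baseline_pps=10_000_000, patched_pps=12_500_000)
print(f"{result['pps_change_pct']:+.1f}% pps, "
      f"{result['ns_saved_per_packet']:+.1f} ns saved per packet")
```

Reporting both the baseline and the patched number lets a reader check
the delta, rather than having to trust a single absolute figure.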

> They frequently show numbers for a single core test.
> 

The single-core test, or really the single RX-queue test, is important
for XDP, for reasons that might surprise you.

The intuitive reason is that it's easier to reason about and to do
calculations on, since we know the CPU is kept 100% busy.
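
One calculation this enables, sketched under the assumption that the
core runs at a fixed, known clock frequency (the 3.0 GHz and 20 Mpps
values are hypothetical):

```python
def cycles_per_packet(cpu_hz: float, pps: float) -> float:
    """CPU cycles spent per packet when one core is 100% busy at `pps`."""
    return cpu_hz / pps


# Hypothetical: a 3.0 GHz core handling 20 Mpps from a single RX-queue.
budget = cycles_per_packet(cpu_hz=3.0e9, pps=20e6)
print(f"{budget:.0f} cycles per packet")
```

This only holds while the core is fully busy; once the CPU has idle
cycles, the packet rate no longer tells you the per-packet cost.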

The non-intuitive reason is that when scaling up with more CPUs, XDP is
so fast that the hardware becomes the bottleneck and the CPUs start to
have idle cycles.  This is MUCH harder to reason about and understand,
and is often misinterpreted.  The xdp-paper benchmarks[2] document
examples where the HW is the bottleneck and how we identify the relevant
counters via ethtool_stats.pl [3].
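
The general idea of watching counters can be sketched as follows. This
is not a port of ethtool_stats.pl; it is a minimal illustration that
assumes the usual "name: value" layout printed by `ethtool -S <dev>`,
and the counter names in the samples are invented (real names differ
per driver):

```python
def parse_ethtool_stats(text: str) -> dict:
    """Parse `ethtool -S <dev>` style output ("name: value" lines) into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def rates(before: dict, after: dict, interval_s: float) -> dict:
    """Per-second rate of every counter that changed between two samples."""
    return {
        name: (after[name] - before[name]) / interval_s
        for name in after
        if name in before and after[name] != before[name]
    }


# Illustrative samples taken 2 seconds apart; counter names are assumed.
sample_t0 = """NIC statistics:
     rx_packets: 1000
     rx_dropped_hw: 50
     tx_packets: 7
"""
sample_t1 = """NIC statistics:
     rx_packets: 3000
     rx_dropped_hw: 450
     tx_packets: 7
"""
changed = rates(parse_ethtool_stats(sample_t0),
                parse_ethtool_stats(sample_t1), interval_s=2.0)
for name, per_sec in sorted(changed.items()):
    print(f"{name}: {per_sec:.0f}/sec")
```

A hardware-side drop counter that keeps growing while the CPUs have
idle cycles is the kind of signal that points at the HW as bottleneck.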



> I was wondering what methodology people are using to generate these
> benchmark results?

On the packet *generator*, I usually use the kernel's pktgen via the
scripts in the kernel tree under samples/pktgen/ [1].

  [1] https://github.com/torvalds/linux/tree/master/samples/pktgen

  Example command:
   $ ./samples/pktgen/pktgen_sample03_burst_single_flow.sh -vi mlx5p2 \
       -d 10.40.40.2 -m 3c:fd:fe:b3:31:49 -t 12

As the script name "pktgen_sample03_burst_single_flow" indicates, this
generates a single flow, which will cause the RSS-hash in the NIC to hit
a single RX-queue.  The '-t 12' means 12 CPU cores will be generating
this traffic.
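
Why a single flow lands on one RX-queue can be illustrated with a toy
model. Note the assumptions: crc32 is only a stand-in for the NIC's
actual RSS hash, the queue count is arbitrary, and the source address
and ports are made up:

```python
import struct
import zlib


def rx_queue(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
             n_queues: int) -> int:
    """Pick an RX-queue from a flow tuple.

    crc32 is a stand-in here, NOT the hash a real NIC uses for RSS.
    """
    key = f"{src_ip}>{dst_ip}".encode() + struct.pack("!HH", src_port, dst_port)
    return zlib.crc32(key) % n_queues


# A single flow: every packet carries the same tuple (values are made up),
# so the hash, and therefore the selected queue, never changes.
single_flow = {
    rx_queue("198.18.0.1", "10.40.40.2", 9, 9, n_queues=8)
    for _ in range(1000)
}
print("queues hit by single flow:", len(single_flow))
```

However many generator threads send it, an identical tuple hashes to
the same value, so the receive side stays on one RX-queue.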

Our xdp-paper has detailed records of the benchmarking we did:
  [2] https://github.com/xdp-project/xdp-paper/tree/master/benchmarks


On the Device Under Test (DUT) I usually run the sample "xdp_rxq_info",
which reports stats on a per-RX-queue and per-CPU basis.


I'm interested in hearing what others do.

--Jesper

[3] https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl

