From: Andrew Werner <awerner32@gmail.com>
To: bpf@vger.kernel.org
Cc: kernel-team@dataexmachina.dev, alexei.starovoitov@gmail.com,
Andrew Werner <awerner32@gmail.com>
Subject: [PATCH bpf-next v2] selftests/bpf: improve ringbuf benchmark output
Date: Wed, 19 Jul 2023 16:15:34 -0400
Message-ID: <20230719201533.176702-1-awerner32@gmail.com>
The ringbuf benchmarks print headers for each section of benchmarks.
The naming conventions of those headers can confuse users of the benchmarks.
This change is a cosmetic update to the output of that benchmark; no
changes were made to what the script actually executes.
The back-to-back explorations of sample rates for Perfbuf and Ringbuf
have been combined into a single section.
Some of the variables in the script were renamed for clarity; b is
always a benchmark name, r is a sampling rate, n is a number of
producers. Before the change, b was the only variable.
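As a rough illustration of that convention (not part of the patch), the
sketch below uses a hypothetical stand-in for summarize that only prints
the label, and a shortened list of rates and producer counts. The loop
variables follow the diff: b for the benchmark name, r for the sample
rate, n for the number of producers.

```shell
#!/bin/bash
set -eu

# Hypothetical stand-in for the real summarize helper: it only prints
# the label it is given, so no benchmark binary is needed.
summarize() {
	printf '%s\n' "$1"
}

labels() {
	# b is a benchmark name, r is a sample rate.
	for b in rb-custom pb-custom; do
		for r in 1 5 10; do
			summarize "$b-$r"
		done
	done

	# n is a number of producers.
	for n in 1 2 4; do
		summarize "rb-libbpf nr_prod $n"
	done
}

labels
```

With one letter per role, the nested loop can build labels such as
rb-custom-5 without reusing b for two different things.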
After:
```
Parallel producer
=================
rb-libbpf 43.072 ± 0.165M/s (drops 0.940 ± 0.016M/s)
rb-custom 20.274 ± 0.442M/s (drops 0.000 ± 0.000M/s)
pb-libbpf 1.480 ± 0.015M/s (drops 0.000 ± 0.000M/s)
pb-custom 1.492 ± 0.023M/s (drops 0.000 ± 0.000M/s)
Parallel producer, sampled notifications
========================================
rb-libbpf 41.132 ± 0.113M/s (drops 0.000 ± 0.000M/s)
rb-custom 33.228 ± 0.086M/s (drops 0.000 ± 0.000M/s)
pb-libbpf 22.498 ± 0.142M/s (drops 0.052 ± 0.171M/s)
pb-custom 22.399 ± 0.060M/s (drops 0.030 ± 0.100M/s)
Back-to-back producer
=====================
rb-libbpf 59.951 ± 0.712M/s (drops 0.000 ± 0.000M/s)
rb-libbpf-sampled 57.751 ± 4.694M/s (drops 0.000 ± 0.000M/s)
rb-custom 71.568 ± 12.584M/s (drops 0.000 ± 0.000M/s)
rb-custom-sampled 71.919 ± 7.540M/s (drops 0.000 ± 0.000M/s)
pb-libbpf 1.961 ± 0.013M/s (drops 0.000 ± 0.000M/s)
pb-libbpf-sampled 22.339 ± 0.129M/s (drops 0.000 ± 0.000M/s)
pb-custom 1.972 ± 0.009M/s (drops 0.000 ± 0.000M/s)
pb-custom-sampled 22.802 ± 0.374M/s (drops 0.000 ± 0.000M/s)
Back-to-back producer, varying sample rate
==========================================
rb-custom-1 1.529 ± 0.008M/s (drops 0.000 ± 0.000M/s)
rb-custom-5 5.817 ± 1.945M/s (drops 0.000 ± 0.000M/s)
rb-custom-10 12.884 ± 0.032M/s (drops 0.000 ± 0.000M/s)
rb-custom-25 25.634 ± 0.031M/s (drops 0.000 ± 0.000M/s)
rb-custom-50 39.970 ± 0.309M/s (drops 0.000 ± 0.000M/s)
rb-custom-100 51.868 ± 0.210M/s (drops 0.000 ± 0.000M/s)
rb-custom-250 69.466 ± 0.039M/s (drops 0.000 ± 0.000M/s)
rb-custom-500 76.370 ± 0.181M/s (drops 0.000 ± 0.000M/s)
rb-custom-1000 79.778 ± 0.248M/s (drops 0.000 ± 0.000M/s)
rb-custom-2000 82.952 ± 0.198M/s (drops 0.000 ± 0.000M/s)
rb-custom-3000 82.314 ± 0.155M/s (drops 0.000 ± 0.000M/s)
pb-custom-1 1.418 ± 0.004M/s (drops 0.000 ± 0.000M/s)
pb-custom-5 5.655 ± 0.066M/s (drops 0.000 ± 0.000M/s)
pb-custom-10 9.091 ± 0.109M/s (drops 0.000 ± 0.000M/s)
pb-custom-25 14.338 ± 0.144M/s (drops 0.000 ± 0.000M/s)
pb-custom-50 17.841 ± 0.318M/s (drops 0.000 ± 0.000M/s)
pb-custom-100 20.491 ± 0.099M/s (drops 0.000 ± 0.000M/s)
pb-custom-250 22.047 ± 0.270M/s (drops 0.000 ± 0.000M/s)
pb-custom-500 22.475 ± 0.676M/s (drops 0.000 ± 0.000M/s)
pb-custom-1000 23.013 ± 0.786M/s (drops 0.000 ± 0.000M/s)
pb-custom-2000 23.305 ± 0.182M/s (drops 0.000 ± 0.000M/s)
pb-custom-3000 23.855 ± 0.071M/s (drops 0.000 ± 0.000M/s)
Back-to-back producer, rb-custom reserve+commit vs output
=========================================================
reserve 76.244 ± 0.469M/s (drops 0.000 ± 0.000M/s)
output 64.707 ± 5.618M/s (drops 0.000 ± 0.000M/s)
Parallel producer, rb-custom reserve+commit vs output, sampled notifications
============================================================================
reserve-sampled 33.560 ± 0.024M/s (drops 0.000 ± 0.000M/s)
output-sampled 30.348 ± 0.313M/s (drops 0.000 ± 0.000M/s)
Concurrent producer (same CPU as consumer), low batch count
===========================================================
rb-libbpf 0.563 ± 0.007M/s (drops 0.000 ± 0.000M/s)
rb-custom 0.571 ± 0.001M/s (drops 0.000 ± 0.000M/s)
pb-libbpf 0.523 ± 0.001M/s (drops 0.000 ± 0.000M/s)
pb-custom 0.530 ± 0.004M/s (drops 0.000 ± 0.000M/s)
Multiple parallel producers (contention)
========================================
rb-libbpf nr_prod 1 44.711 ± 0.058M/s (drops 0.183 ± 0.012M/s)
rb-libbpf nr_prod 2 23.534 ± 0.069M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3 14.011 ± 0.023M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4 14.858 ± 0.021M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8 6.184 ± 0.031M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12 4.719 ± 0.058M/s (drops 0.006 ± 0.021M/s)
rb-libbpf nr_prod 16 4.607 ± 0.055M/s (drops 0.010 ± 0.028M/s)
rb-libbpf nr_prod 20 5.001 ± 0.052M/s (drops 0.010 ± 0.025M/s)
rb-libbpf nr_prod 24 5.234 ± 0.114M/s (drops 0.006 ± 0.021M/s)
rb-libbpf nr_prod 28 5.021 ± 0.020M/s (drops 0.007 ± 0.014M/s)
rb-libbpf nr_prod 32 4.316 ± 0.142M/s (drops 0.614 ± 0.121M/s)
rb-libbpf nr_prod 36 4.353 ± 0.157M/s (drops 0.708 ± 0.126M/s)
rb-libbpf nr_prod 40 4.230 ± 0.058M/s (drops 0.775 ± 0.120M/s)
rb-libbpf nr_prod 44 4.212 ± 0.050M/s (drops 0.736 ± 0.084M/s)
rb-libbpf nr_prod 48 4.276 ± 0.057M/s (drops 0.784 ± 0.095M/s)
rb-libbpf nr_prod 52 4.222 ± 0.141M/s (drops 0.777 ± 0.172M/s)
```
Before:
```
Single-producer, parallel producer
==================================
rb-libbpf 43.366 ± 0.277M/s (drops 0.848 ± 0.027M/s)
rb-custom 17.831 ± 0.391M/s (drops 0.065 ± 0.216M/s)
pb-libbpf 1.494 ± 0.012M/s (drops 0.000 ± 0.000M/s)
pb-custom 1.521 ± 0.002M/s (drops 0.000 ± 0.000M/s)
Single-producer, parallel producer, sampled notification
========================================================
rb-libbpf 41.163 ± 0.031M/s (drops 0.000 ± 0.000M/s)
rb-custom 33.364 ± 0.347M/s (drops 0.025 ± 0.082M/s)
pb-libbpf 21.039 ± 3.350M/s (drops 0.014 ± 0.036M/s)
pb-custom 22.570 ± 0.267M/s (drops 0.136 ± 0.319M/s)
Single-producer, back-to-back mode
==================================
rb-libbpf 60.671 ± 0.274M/s (drops 0.000 ± 0.000M/s)
rb-libbpf-sampled 59.229 ± 0.422M/s (drops 0.000 ± 0.000M/s)
rb-custom 77.296 ± 0.156M/s (drops 0.000 ± 0.000M/s)
rb-custom-sampled 71.147 ± 0.281M/s (drops 0.000 ± 0.000M/s)
pb-libbpf 1.960 ± 0.007M/s (drops 0.000 ± 0.000M/s)
pb-libbpf-sampled 22.230 ± 0.115M/s (drops 0.000 ± 0.000M/s)
pb-custom 1.969 ± 0.005M/s (drops 0.000 ± 0.000M/s)
pb-custom-sampled 22.883 ± 0.122M/s (drops 0.000 ± 0.000M/s)
Ringbuf back-to-back, effect of sample rate
===========================================
rb-sampled-1 1.507 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-sampled-5 7.095 ± 0.016M/s (drops 0.000 ± 0.000M/s)
rb-sampled-10 13.091 ± 0.046M/s (drops 0.000 ± 0.000M/s)
rb-sampled-25 26.259 ± 0.061M/s (drops 0.000 ± 0.000M/s)
rb-sampled-50 39.831 ± 0.122M/s (drops 0.000 ± 0.000M/s)
rb-sampled-100 51.536 ± 2.984M/s (drops 0.000 ± 0.000M/s)
rb-sampled-250 67.850 ± 1.267M/s (drops 0.000 ± 0.000M/s)
rb-sampled-500 75.257 ± 0.438M/s (drops 0.000 ± 0.000M/s)
rb-sampled-1000 74.939 ± 0.295M/s (drops 0.000 ± 0.000M/s)
rb-sampled-2000 81.481 ± 0.769M/s (drops 0.000 ± 0.000M/s)
rb-sampled-3000 82.637 ± 0.448M/s (drops 0.000 ± 0.000M/s)
Perfbuf back-to-back, effect of sample rate
===========================================
pb-sampled-1 1.408 ± 0.003M/s (drops 0.000 ± 0.000M/s)
pb-sampled-5 5.667 ± 0.012M/s (drops 0.000 ± 0.000M/s)
pb-sampled-10 9.162 ± 0.026M/s (drops 0.000 ± 0.000M/s)
pb-sampled-25 14.389 ± 0.033M/s (drops 0.000 ± 0.000M/s)
pb-sampled-50 17.977 ± 0.049M/s (drops 0.000 ± 0.000M/s)
pb-sampled-100 20.541 ± 0.079M/s (drops 0.000 ± 0.000M/s)
pb-sampled-250 22.176 ± 0.523M/s (drops 0.000 ± 0.000M/s)
pb-sampled-500 23.121 ± 0.124M/s (drops 0.000 ± 0.000M/s)
pb-sampled-1000 22.415 ± 1.860M/s (drops 0.000 ± 0.000M/s)
pb-sampled-2000 23.333 ± 0.679M/s (drops 0.000 ± 0.000M/s)
pb-sampled-3000 23.032 ± 0.649M/s (drops 0.000 ± 0.000M/s)
Ringbuf back-to-back, reserve+commit vs output
==============================================
reserve 77.180 ± 0.304M/s (drops 0.000 ± 0.000M/s)
output 60.890 ± 7.685M/s (drops 0.000 ± 0.000M/s)
Ringbuf sampled, reserve+commit vs output
=========================================
reserve-sampled 30.724 ± 0.166M/s (drops 0.000 ± 0.000M/s)
output-sampled 30.261 ± 0.454M/s (drops 0.000 ± 0.000M/s)
Single-producer, consumer/producer competing on the same CPU, low batch count
=============================================================================
rb-libbpf 0.570 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-custom 0.569 ± 0.003M/s (drops 0.000 ± 0.000M/s)
pb-libbpf 0.539 ± 0.002M/s (drops 0.000 ± 0.000M/s)
pb-custom 0.549 ± 0.003M/s (drops 0.000 ± 0.000M/s)
Ringbuf, multi-producer contention
==================================
rb-libbpf nr_prod 1 44.359 ± 0.319M/s (drops 0.091 ± 0.027M/s)
rb-libbpf nr_prod 2 23.722 ± 0.024M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3 14.128 ± 0.011M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4 14.896 ± 0.020M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8 6.056 ± 0.061M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12 4.612 ± 0.042M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 16 4.684 ± 0.040M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 20 5.007 ± 0.046M/s (drops 0.001 ± 0.004M/s)
rb-libbpf nr_prod 24 5.207 ± 0.093M/s (drops 0.006 ± 0.013M/s)
rb-libbpf nr_prod 28 4.951 ± 0.073M/s (drops 0.030 ± 0.069M/s)
rb-libbpf nr_prod 32 4.509 ± 0.069M/s (drops 0.582 ± 0.057M/s)
rb-libbpf nr_prod 36 4.361 ± 0.064M/s (drops 0.733 ± 0.126M/s)
rb-libbpf nr_prod 40 4.261 ± 0.049M/s (drops 0.713 ± 0.116M/s)
rb-libbpf nr_prod 44 4.150 ± 0.207M/s (drops 0.841 ± 0.191M/s)
rb-libbpf nr_prod 48 4.033 ± 0.064M/s (drops 1.009 ± 0.082M/s)
rb-libbpf nr_prod 52 4.025 ± 0.049M/s (drops 1.012 ± 0.069M/s)
```
Signed-off-by: Andrew Werner <awerner32@gmail.com>
---
v1->v2:
- Improved commit message
- Added SOB
- Reworked all section headers for uniformity
v1: https://lore.kernel.org/bpf/20230719014744.3480131-1-awerner32@gmail.com/
---
.../bpf/benchs/run_bench_ringbufs.sh | 30 +++++++++----------
1 file changed, 14 insertions(+), 16 deletions(-)
diff --git a/tools/testing/selftests/bpf/benchs/run_bench_ringbufs.sh b/tools/testing/selftests/bpf/benchs/run_bench_ringbufs.sh
index 91e3567962ff..c495013c1d88 100755
--- a/tools/testing/selftests/bpf/benchs/run_bench_ringbufs.sh
+++ b/tools/testing/selftests/bpf/benchs/run_bench_ringbufs.sh
@@ -6,46 +6,44 @@ set -eufo pipefail
RUN_RB_BENCH="$RUN_BENCH -c1"
-header "Single-producer, parallel producer"
+header "Parallel producer"
for b in rb-libbpf rb-custom pb-libbpf pb-custom; do
summarize $b "$($RUN_RB_BENCH $b)"
done
-header "Single-producer, parallel producer, sampled notification"
+header "Parallel producer, sampled notifications"
for b in rb-libbpf rb-custom pb-libbpf pb-custom; do
summarize $b "$($RUN_RB_BENCH --rb-sampled $b)"
done
-header "Single-producer, back-to-back mode"
+header "Back-to-back producer"
for b in rb-libbpf rb-custom pb-libbpf pb-custom; do
summarize $b "$($RUN_RB_BENCH --rb-b2b $b)"
summarize $b-sampled "$($RUN_RB_BENCH --rb-sampled --rb-b2b $b)"
done
-header "Ringbuf back-to-back, effect of sample rate"
-for b in 1 5 10 25 50 100 250 500 1000 2000 3000; do
- summarize "rb-sampled-$b" "$($RUN_RB_BENCH --rb-b2b --rb-batch-cnt $b --rb-sampled --rb-sample-rate $b rb-custom)"
-done
-header "Perfbuf back-to-back, effect of sample rate"
-for b in 1 5 10 25 50 100 250 500 1000 2000 3000; do
- summarize "pb-sampled-$b" "$($RUN_RB_BENCH --rb-b2b --rb-batch-cnt $b --rb-sampled --rb-sample-rate $b pb-custom)"
+header "Back-to-back producer, varying sample rate"
+for b in rb-custom pb-custom; do
+ for r in 1 5 10 25 50 100 250 500 1000 2000 3000; do
+ summarize "$b-$r" "$($RUN_RB_BENCH --rb-b2b --rb-batch-cnt $r --rb-sampled --rb-sample-rate $r $b)"
+ done
done
-header "Ringbuf back-to-back, reserve+commit vs output"
+header "Back-to-back producer, rb-custom reserve+commit vs output"
summarize "reserve" "$($RUN_RB_BENCH --rb-b2b rb-custom)"
summarize "output" "$($RUN_RB_BENCH --rb-b2b --rb-use-output rb-custom)"
-header "Ringbuf sampled, reserve+commit vs output"
+header "Parallel producer, rb-custom reserve+commit vs output, sampled notifications"
summarize "reserve-sampled" "$($RUN_RB_BENCH --rb-sampled rb-custom)"
summarize "output-sampled" "$($RUN_RB_BENCH --rb-sampled --rb-use-output rb-custom)"
-header "Single-producer, consumer/producer competing on the same CPU, low batch count"
+header "Concurrent producer (same CPU as consumer), low batch count"
for b in rb-libbpf rb-custom pb-libbpf pb-custom; do
summarize $b "$($RUN_RB_BENCH --rb-batch-cnt 1 --rb-sample-rate 1 --prod-affinity 0 --cons-affinity 0 $b)"
done
-header "Ringbuf, multi-producer contention"
-for b in 1 2 3 4 8 12 16 20 24 28 32 36 40 44 48 52; do
- summarize "rb-libbpf nr_prod $b" "$($RUN_RB_BENCH -p$b --rb-batch-cnt 50 rb-libbpf)"
+header "Multiple parallel producers (contention)"
+for n in 1 2 3 4 8 12 16 20 24 28 32 36 40 44 48 52; do
+ summarize "rb-libbpf nr_prod $n" "$($RUN_RB_BENCH -p$n --rb-batch-cnt 50 rb-libbpf)"
done
--
2.39.2