From: Magnus Karlsson <magnus.karlsson@intel.com>
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org,
daniel@iogearbox.net, netdev@vger.kernel.org,
jonathan.lemon@gmail.com
Cc: bpf@vger.kernel.org, saeedm@mellanox.com,
jeffrey.t.kirsher@intel.com, maciej.fijalkowski@intel.com,
maciejromanfijalkowski@gmail.com
Subject: [PATCH bpf-next 00/12] xsk: clean up ring access functions
Date: Mon, 9 Dec 2019 08:56:17 +0100
Message-ID: <1575878189-31860-1-git-send-email-magnus.karlsson@intel.com>
This patch set cleans up the ring access functions of AF_XDP in the hope
that they will now be easier to understand and maintain. I used to get a
headache every time I tried to really understand this code, but now I
think it is a lot less painful.
The code has been simplified a lot and as a bonus we get better
performance. On my 2.0 GHz Broadwell machine with a standard default
config plus AF_XDP support and CONFIG_PREEMPT on, I get the following
performance increases, in percent, with this patch set compared
to without it:
Zero-copy (-N):
          rxdrop    txpush    l2fwd
1 core:     4%        5%        4%
2 cores:    1%        0%        2%
Zero-copy with poll() (-N -p):
          rxdrop    txpush    l2fwd
1 core:     1%        3%        3%
2 cores:   22%        0%        5%
Skb mode (-S):
Shows a 0% to 1% performance improvement across the same benchmarks as
above.
Here 1 core means that we are running the driver processing and the
application on the same core, while 2 cores means that they execute on
separate cores. The applications are from the xdpsock sample app.
When a result says 22% better, as in the case of poll mode with 2
cores and rxdrop, my first reaction is that it must be a
bug. Everything else shows between 0% and 5% performance
improvement. What is giving rise to 22%? A quick bisect indicates that
patches 2, 3, 4, 5, and 6 account for most of this
improvement. So not one patch in particular, but something around 4%
improvement from each one of them. Note that exactly this benchmark
has previously had an extraordinary slowdown compared to when running
without poll syscalls. For all the other poll tests above, the
slowdown has always been around 4% for using poll syscalls. But with
the badly performing test in question, it was above 25%. Interestingly,
after this clean up, the slowdown is 4%, just like all the other poll
tests. Please take an extra look at this in case I have messed
something up.
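As a quick sanity check on the arithmetic above: five patches that each contribute roughly 4% compound to a little under 22%, which is consistent with the measured figure. The sketch below is only that calculation; the function name is made up for illustration and the 4% per patch is the rough estimate from the bisect, not a separate measurement.

```c
#include <assert.h>

/*
 * Illustrative helper (hypothetical name): total gain in percent when
 * n independent improvements of step_pct percent each are compounded.
 */
double compound_gain_pct(double step_pct, int n)
{
	double factor = 1.0;
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		factor *= 1.0 + step_pct / 100.0;

	return (factor - 1.0) * 100.0;
}
```

Calling compound_gain_pct(4.0, 5) gives about 21.7, so five steps of around 4% each are enough to explain a total of about 22% without any single patch standing out.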
The 0% for txpush with two cores is due to the test bottlenecking on
a non-CPU HW resource. If I eliminated that bottleneck on my system,
I would expect to see an increase there too.
This patch set has been applied against commit e7096c131e51 ("net: WireGuard secure network tunnel")
Structure of the patch set:
Patch 1: Eliminate the lazy update threshold used when preallocating
         entries in the completion ring
Patch 2: Consolidate the two local producer pointers into one
Patch 3: Standardize the naming of the producer ring access functions
Patch 4: Simplify the detection of empty and full rings
Patch 5: Eliminate the Rx batch size used for the fill ring
Patch 6: Simplify the functions xskq_nb_avail and xskq_nb_free
Patch 7: Simplify and standardize the naming of the consumer ring
         access functions
Patch 8: Rename the validation functions to improve readability and
         change the return value of these functions
Patch 9: Change the name of xsk_umem_discard_addr() to
         xsk_umem_release_addr() to better reflect the new
         names. Requires a name change in the drivers that support AF_XDP
         zero-copy.
Patch 10: Remove unnecessary READ_ONCE of data in the ring
Patch 11: Add overall function naming comment and reorder the functions
          for easier reference
Patch 12: Use the struct_size helper function when allocating rings
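To make the ideas behind patches 2, 4, 6 and 12 a bit more concrete, here is a minimal userspace sketch of a ring with one cached producer pointer, a lazily refreshed consumer snapshot, and an overflow-checked allocation of a struct with a flexible array member. All names here (struct ring, ring_alloc(), and so on) are hypothetical and chosen for illustration; this is not the code in net/xdp/xsk_queue.h. It is single-threaded on purpose: a real producer/consumer pair needs memory barriers and READ_ONCE()/WRITE_ONCE()-style accesses on the shared pointers, which are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical ring layout, for illustration only. */
struct ring {
	uint32_t producer;    /* shared: entries published by the producer */
	uint32_t consumer;    /* shared: entries released by the consumer */
	uint32_t cached_prod; /* producer-private working copy */
	uint32_t cached_cons; /* producer's snapshot of consumer */
	uint32_t nentries;    /* must be a power of two */
	uint64_t desc[];      /* flexible array member */
};

/* Allocate a ring; the size check stands in for what a struct_size()
 * style helper does, written out by hand here. */
struct ring *ring_alloc(uint32_t nentries)
{
	struct ring *r;

	if (nentries == 0 || (nentries & (nentries - 1)))
		return NULL;
	if (nentries > (SIZE_MAX - sizeof(*r)) / sizeof(r->desc[0]))
		return NULL;

	r = calloc(1, sizeof(*r) + (size_t)nentries * sizeof(r->desc[0]));
	if (r)
		r->nentries = nentries;
	return r;
}

/* Full check: only touch the shared consumer pointer when the cached
 * snapshot says there is no room. Unsigned wraparound keeps the
 * subtraction correct. */
int ring_prod_is_full(struct ring *r)
{
	uint32_t free_entries = r->nentries - (r->cached_prod - r->cached_cons);

	if (free_entries)
		return 0;

	r->cached_cons = r->consumer; /* refresh the snapshot */
	free_entries = r->nentries - (r->cached_prod - r->cached_cons);
	assert(free_entries <= r->nentries);
	return !free_entries;
}

/* Reserve one entry using only the cached producer pointer. */
int ring_prod_reserve(struct ring *r, uint64_t val)
{
	if (ring_prod_is_full(r))
		return -1;

	r->desc[r->cached_prod++ & (r->nentries - 1)] = val;
	return 0;
}

/* Publish everything reserved so far to the consumer. */
void ring_prod_submit(struct ring *r)
{
	r->producer = r->cached_prod;
}

/* Number of published entries the consumer has not yet read. */
uint32_t ring_cons_nb_avail(const struct ring *r)
{
	return r->producer - r->consumer;
}

/* Read and release one entry; returns -1 when the ring is empty. */
int ring_cons_read(struct ring *r, uint64_t *val)
{
	if (r->producer == r->consumer)
		return -1;

	*val = r->desc[r->consumer & (r->nentries - 1)];
	r->consumer++;
	return 0;
}
```

The point of the sketch is that the producer works against one private pointer (cached_prod) until it submits, and that "full" and "empty" fall out of plain unsigned subtraction on the pointers, with no batch size or threshold involved.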
Thanks: Magnus
Magnus Karlsson (12):
  xsk: eliminate the lazy update threshold
  xsk: consolidate to one single cached producer pointer
  xsk: standardize naming of producer ring access functions
  xsk: simplify detection of empty and full rings
  xsk: eliminate the RX batch size
  xsk: simplify xskq_nb_avail and xskq_nb_free
  xsk: simplify the consumer ring access functions
  xsk: change names of validation functions
  xsk: ixgbe: i40e: ice: mlx5: xsk_umem_discard_addr to
    xsk_umem_release_addr
  xsk: remove unnecessary READ_ONCE of data
  xsk: add function naming comments and reorder functions
  xsk: use struct_size() helper

 drivers/net/ethernet/intel/i40e/i40e_xsk.c         |   4 +-
 drivers/net/ethernet/intel/ice/ice_xsk.c           |   4 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c       |   4 +-
 .../net/ethernet/mellanox/mlx5/core/en/xsk/rx.c    |   2 +-
 include/net/xdp_sock.h                             |  14 +-
 net/xdp/xsk.c                                      |  62 ++--
 net/xdp/xsk_queue.c                                |  15 +-
 net/xdp/xsk_queue.h                                | 370 +++++++++++----------
 8 files changed, 245 insertions(+), 230 deletions(-)
--
2.7.4